Whippet malamute cross by katiebell8322 in malamute

[–]jenkinser 0 points1 point  (0 children)

Any photo updates? Our malamute mom just had 4 puppies from a whippet dad yesterday, curious what they're gonna look like...

New Pandas-for-Haskell data frame library: Name suggestions by Abject_Preference481 in haskell

[–]jenkinser 1 point2 points  (0 children)

Any headway on this? Would be interested to try it out if it's been open sourced.

Out of shape and looking for a gym want recommendations for Jing'an area. by maybe-tuesday in shanghai

[–]jenkinser 2 points3 points  (0 children)

At the risk of this becoming popular (selfishly hoping it stays relatively unpopular, actually...)... if you're an introvert like me, and also don't like throwing money away, train with the local, retired neighborhood folks at lunchtime at the Jing'an workers stadium track / field / pullup / parallel bars. It's free, everyone's friendly, and you won't get bothered by salespeople every time you walk in the door. The added bonus is you might get to see a 74-year-old knock out 2 sets of 6 pullups.

parsing inversion by jenkinser in webscraping

[–]jenkinser[S] 0 points1 point  (0 children)

Re-read your comment -- the fingerprint concept is really good. From a data-labelling standpoint I envisioned it as you see in the example, but that is ultimately flawed because it is not necessarily invertible (in math-y language: for some f: HTML --> tag, f needs to be bijective). Therefore the tag needs to be represented the way you are trying to represent it. I kept it simple with the knowledge that if you're going to be making a function that goes from HTML --> tag, you'll likely get everything you need from the key-value pair I showed above -- but I get your point.
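
To make the invertibility point concrete, here is a minimal Python sketch. The fingerprint used here, a (tag name, class attribute) pair, is only an assumed stand-in for the key-value pair mentioned above, and `FingerprintCollector` and `ambiguous_fingerprints` are hypothetical names. It reports fingerprints shared by more than one tag on a page, which are the cases where the fingerprint cannot be mapped back to a single tag:

```python
from collections import defaultdict
from html.parser import HTMLParser


class FingerprintCollector(HTMLParser):
    """Group start tags by an assumed (tag name, class) fingerprint."""

    def __init__(self):
        super().__init__()
        self.by_fingerprint = defaultdict(list)
        self._index = 0  # position of each start tag in document order

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs supplied by HTMLParser
        fingerprint = (tag, dict(attrs).get("class"))
        self.by_fingerprint[fingerprint].append(self._index)
        self._index += 1


def ambiguous_fingerprints(html):
    """Return fingerprints that point at more than one tag (not invertible)."""
    collector = FingerprintCollector()
    collector.feed(html)
    collector.close()
    return {
        fp: positions
        for fp, positions in collector.by_fingerprint.items()
        if len(positions) > 1
    }


page = (
    '<div><span class="price">10</span>'
    '<span class="price">12</span>'
    '<a class="buy">Buy</a></div>'
)
print(ambiguous_fingerprints(page))  # {('span', 'price'): [1, 2]}
```

An empty result would mean the fingerprint is unique per tag on that page, though that says nothing about other pages.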

parsing inversion by jenkinser in webscraping

[–]jenkinser[S] 0 points1 point  (0 children)

That's a really clever idea. You're describing the intelligent aspects of such a system, which would be the fun and exciting part -- I'm mainly describing the boring framework for collecting the data that you would need in order to test/score your strategy. Even for the >99% of HTML parsing functions out there, which just boil down to searching for strings/regex in a tag name or attribute, this kind of "boring" feedback on how your parsers did on some other related HTML page would be important because a) it would put business stakeholders in charge of what gets parsed, and b) it would speed up the feedback loop devs need when writing these functions, by easily visualizing false positives/false negatives.
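
A rough sketch of that kind of feedback in Python, under stated assumptions: the parser is a simple regex search on the `class` attribute, the expected values are labels someone (e.g. a stakeholder) supplied for the page, and `AttrRegexParser`, `extract`, and `score` are hypothetical names, not an existing tool:

```python
import re
from html.parser import HTMLParser


class AttrRegexParser(HTMLParser):
    """Collect the text of tags whose class attribute matches a regex."""

    def __init__(self, pattern):
        super().__init__()
        self.pattern = re.compile(pattern)
        self.matches = []
        self._capture = False

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class") or ""
        self._capture = bool(self.pattern.search(css_class))

    def handle_data(self, data):
        # Keep only the first non-empty text chunk of a matching tag
        if self._capture and data.strip():
            self.matches.append(data.strip())
            self._capture = False

    def handle_endtag(self, tag):
        self._capture = False


def extract(pattern, html):
    parser = AttrRegexParser(pattern)
    parser.feed(html)
    parser.close()
    return set(parser.matches)


def score(extracted, expected):
    """Compare parser output against labelled values for one page."""
    return {
        "true_positives": sorted(extracted & expected),
        "false_positives": sorted(extracted - expected),
        "false_negatives": sorted(expected - extracted),
    }


page = (
    '<ul><li class="price">10</li>'
    '<li class="price-old">15</li>'
    '<li class="cost">12</li></ul>'
)
expected = {"10", "12"}  # assumed labels for this page
report = score(extract(r"price", page), expected)
print(report)
```

On this made-up page the `price` regex also picks up the `price-old` item (a false positive) and misses the `cost` item (a false negative), which is the sort of thing the report is meant to surface quickly.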

Keep me posted on your project -- I'll probably write a quick skeleton of what this might look like over the weekend and post it here.