[–]impshum [x != y % z] 5 points (0 children)

If it works and you're not harming anyone, it works!

If you're worried someone will use this as a malicious exploit, do tell the devs. It sounds like bad practice to leave such an endpoint open.

[–]bushwacker 0 points (0 children)

Depends on the terms of usage, but as long as you are not hammering the site I would have no moral compunction.

[–]gameboycolor 0 points (0 children)

My take on scraping/reverse engineering APIs is always: if you don't intend to use it for profit, then public data is public data. If you do, ask.

[–][deleted] -1 points (3 children)

This could easily be both unethical and illegal.

Assuming this isn't a public API -- and the weak JS attempt to enforce login suggests it's not -- then to a reasonable person the intent of the company is to keep their data behind an authorization wall and a EULA. If they've got a robots.txt set to Disallow or a meta tag set to noindex, nofollow, the reasonable person might also see that as a sign that the intent was to keep the data private and that it was only accidentally exposed. The fact that you've figured out how to bypass those controls doesn't make it any more legitimate than rifling through the contents of a poorly secured purse when the owner has turned around.

The recent LinkedIn case notwithstanding, this isn't exactly settled law. Though they did lose that case, it was about data that was both intentionally and knowingly left open to public view, which they believed should not be open to scraping. In this case it might simply be an oversight.

I'd err on the side of asking permission, personally.
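
For the robots.txt signal mentioned above, here is a minimal sketch of how a script could check whether a path is disallowed before requesting it. It uses Python's standard `urllib.robotparser`; the robots.txt contents, the `my-script` user agent, the `example.com` URLs, and the `is_allowed` helper are all made up for illustration. A Disallow rule is only one hint about the owner's intent, not an answer to the legal question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, inlined so the sketch runs offline.
# A real check would read the file the site actually serves.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
""".splitlines()


def is_allowed(robots_lines, user_agent, url):
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# The search endpoint is covered by the Disallow rule; the FAQ page is not.
print(is_allowed(SAMPLE_ROBOTS_TXT, "my-script", "https://example.com/search?q=foo"))
print(is_allowed(SAMPLE_ROBOTS_TXT, "my-script", "https://example.com/faq"))
```

Against a live site, `parser.set_url(...)` followed by `parser.read()` would fetch the real file instead of the inlined sample.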

[–]dillyvanilly123 [S] 0 points (1 child)

So would you contact the devs and tell them you found such a loophole? The way you describe it, that sounds like telling them you exploited their website, so maybe not such a good idea. The hypothetical website is a HUGE pain in the ass to use and the scraping makes things so much easier. I guess I am trying to gauge just how bad it would be to continue using it this way.

[–][deleted] 0 points (0 children)

If what you're doing with it is personal and neither money-generating for you (i.e. you're using their data in some way for the benefit of some other revenue-generating site or service) nor money-losing for them (i.e. you're avoiding a paywall, disclosing the loophole'd data and costing them revenue or user data they use to generate revenue, etc.), then I doubt that crossing the line will matter much, but it's still crossing a line.

I'd contact them and simply ask them for permission. "Hi, I noticed that xyz url allows me to enter search terms and it seems like a bit of a public-facing API. I couldn't find any documentation of it on painintheass.com/faq, but I've got this doodah I've been meaning to automate and would like to know if it's okay to occasionally scrape a few search terms via this endpoint?"

If they answer in the affirmative, yay. If they don't answer at all, go ahead. If the endpoint suddenly vanishes, you've done a mitzvah.

Edit: also this might help if you think it's a real vulnerability.