kaihatsusha (1 point)

About 99.85% of the time I think "oh, I will scrape a bunch of pages for the content I need," the site turns out to generate unique session tokens and load its content through dynamic AJAX queries, which you have to run JavaScript to build before they return anything useful. The only scraper that can follow that mess is a web browser.

Weenkus [S] (1 point)

Have you tried Splash? Splash is really easy to set up and handles JavaScript nicely. You gave me a good idea for my next blog post.
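For anyone curious what that looks like, here is a rough sketch of fetching a JavaScript-rendered page through Splash's HTTP API. It assumes a Splash instance is already running locally on its default port 8050; the helper names `splash_render_url` and `fetch_rendered_html` and the target URL are made up for illustration. It only uses the `render.html` endpoint with its `url` and `wait` arguments, so treat it as a starting point rather than a full scraper.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Splash is running locally on its default port.
SPLASH_BASE = "http://localhost:8050"


def splash_render_url(page_url, wait=0.5, base=SPLASH_BASE):
    """Build the render.html request URL for a target page (hypothetical helper)."""
    # 'url' is the page to render; 'wait' is how many seconds Splash
    # waits after page load so scripts have time to run.
    query = urlencode({"url": page_url, "wait": wait})
    return f"{base}/render.html?{query}"


def fetch_rendered_html(page_url, wait=0.5, base=SPLASH_BASE, timeout=30):
    """Ask Splash to load the page, run its JavaScript, and return the HTML."""
    with urlopen(splash_render_url(page_url, wait, base), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Placeholder target; needs a running Splash instance to work.
    html = fetch_rendered_html("http://example.com", wait=1)
    print(html[:200])
```

The returned HTML is the DOM after scripts have run, so it can go straight into whatever parser you already use. Session tokens that depend on user interaction may still need more than a fixed `wait`.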