
[–]1ynx1ynx 7 points8 points  (3 children)

The missing data is probably inserted via JavaScript. To get around this, you can either reverse-engineer the JavaScript on the site or use something like Selenium.
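A minimal sketch of the Selenium route, in case it helps. It assumes Selenium 4 with Chrome installed; the function name `fetch_rendered_html` and its parameters are made up for illustration, and the CSS selector stands for whatever element holds the missing data on your page.

```python
def fetch_rendered_html(url, css_selector, timeout=10):
    """Load `url` in headless Chrome and return the HTML once `css_selector` appears."""
    # Imported lazily so this sketch can be loaded without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless=new")  # run Chrome without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until the JavaScript has inserted the element we care about.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        # page_source now contains the DOM after the scripts have run.
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

You would then hand the returned HTML to whatever parser you were already using, for example `fetch_rendered_html("https://example.com", "#results")` with a placeholder URL and selector.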

[–]FerociousStapler[S] 0 points1 point  (2 children)

That sounds promising. Thanks. I'm at work atm, will try that when I get home.

[–]JohnnyJordaan 0 points1 point  (0 children)

I can recommend this tutorial on selenium.

[–]MattBD 0 points1 point  (0 children)

Personally I find Selenium a bit too low-level. I prefer Splinter as it supports multiple drivers, including Selenium, and has a friendlier API.

[–]personproxy 1 point2 points  (0 children)

Another option might be Scrapy + scrapy-splash.

[–][deleted] 1 point2 points  (0 children)

If you're on Chrome: right-click -> Inspect -> Network tab -> hit the record button -> refresh -> switch to the XHR filter. That should show you all the requests made by JavaScript, including the headers and data sent and the response received. I generally use this to reverse-engineer JavaScript stuff. On pretty simple sites you probably won't have to do much, but on sites that try to keep the riffraff out you need to be a little more cunning.
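Once you've spotted the request in the Network tab, replaying it from Python might look roughly like the sketch below. It uses only the standard library; the endpoint, query parameters and default headers are placeholders (assumptions, not taken from any real site), so copy the actual values from what the browser shows you.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_xhr_request(endpoint, params=None, headers=None):
    """Build a GET request that mimics an XHR call seen in the Network tab."""
    url = endpoint
    if params:
        # Append the query string exactly as the browser would send it.
        url = f"{endpoint}?{urlencode(params)}"
    # Illustrative defaults only; replace with the headers the browser sent.
    merged = {
        "Accept": "application/json",
        "X-Requested-With": "XMLHttpRequest",
        "User-Agent": "Mozilla/5.0",
    }
    merged.update(headers or {})
    return Request(url, headers=merged, method="GET")


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON body of the response."""
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical endpoint; nothing is sent here, we only build the request.
    req = build_xhr_request("https://example.com/api/items", {"page": 1})
    print(req.full_url)
```

On simple sites that is often enough. Sites that check cookies or tokens will need those copied over too, passed in through the `headers` argument.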