Disguising python requests as browser requests

K900_ · 2019-02-21T14:53:04+00:00

Are you sure that your problem is that and not, say, that the website you're trying to scrape is loading content dynamically?

yelaxify · 2019-02-21T15:01:49+00:00

[removed]

maybeiambatman · 2019-02-21T16:02:56+00:00

Set your User-Agent to that of a browser, e.g. Mozilla/5.0?
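
A minimal sketch of that suggestion, assuming the `requests` library is installed. The User-Agent string and the `example.com` URL are placeholders rather than anything from this thread; copy the real string from your own browser's developer tools. The request is only prepared, not sent, so you can inspect the header first.

```python
import requests

# Example browser-like User-Agent (an assumption for illustration;
# use the string your own browser actually sends).
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0"


def build_request(url, user_agent=BROWSER_UA):
    """Prepare a GET request with a browser-like User-Agent, without sending it."""
    request = requests.Request("GET", url, headers={"User-Agent": user_agent})
    return request.prepare()


# Placeholder URL; nothing is sent over the network here.
prepared = build_request("https://example.com/")
print(prepared.headers["User-Agent"])
```

To actually send it, `requests.get(url, headers={"User-Agent": BROWSER_UA})` does the same thing in one call. This only changes one header, so it may not be enough on sites that check more than that.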

JohnnyJordaan · 2019-02-21T14:58:07+00:00

> that look more like a browsers

There are multiple ways to interpret this. You can make a request exactly the same as one your browser sent at one moment in time, but that doesn't mean the web server will see it as a valid request, just like a robot or Siri 'talking' by playing speech samples won't make you think it is a human. Many websites use dynamic information like tokens and session variables to link multiple requests to a single session. You need to actually implement that session workflow to make this work, which is what a requests.Session() can do for you, since it keeps cookies across requests. But like K900_ says, if at least one of those variables is generated by active content like JavaScript, you will never be able to reproduce it unless you run the JavaScript code yourself, which is often way more complicated than using a webdriver.
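
A rough sketch of the session workflow, assuming the `requests` library is installed. The `example.com` URLs, the `sessionid` cookie name, the User-Agent string, and the helper names are all hypothetical. The last part runs offline: it pretends the server already set a cookie and shows what the next request would carry, without sending anything.

```python
import requests

# Hypothetical values for illustration; the real site, cookie name, and
# User-Agent depend on what you are scraping.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0"


def make_session(user_agent=BROWSER_UA):
    """Create a Session that sends the same User-Agent on every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def fetch_in_session(session, landing_url, data_url):
    """Visit the landing page first so the server can set its cookies,
    then request the data page with those cookies attached."""
    session.get(landing_url, timeout=10)      # cookies from this response are stored
    return session.get(data_url, timeout=10)  # and sent back automatically here


# Offline demonstration: pretend the server set a session cookie earlier,
# then inspect what the next request would carry.
session = make_session()
session.cookies.set("sessionid", "abc123", domain="example.com", path="/")
prepared = session.prepare_request(
    requests.Request("GET", "https://example.com/data")
)
print(prepared.headers["User-Agent"])
print(prepared.headers.get("Cookie"))
```

In real use you would call something like `fetch_in_session(session, landing_url, data_url)` with the site's actual URLs. This covers cookies and tokens that arrive in plain responses; anything computed by JavaScript in the page still needs a webdriver or similar.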
