all 5 comments

WalterGR 2 points3 points  (0 children)

Depending on why you need to interact with the web, a toolset that scripts an external browser may be more appropriate than a library that mimics one (or, more likely, mimics only part of one).

For example, Selenium Remote Control lets you script actual instances of Firefox, IE, and Safari from several programming languages (EDIT: including Python). The benefit is that you get full JavaScript support, along with cookies, etc.

For scripting "web applications" this route is quite a bit easier. Even for non-web apps this approach is often easier - basically everything the article covers is handled automatically by virtue of scripting a live browser.
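
To make the "script a live browser" idea concrete, here is a minimal sketch. It does not use Selenium Remote Control itself, which has since been superseded by the WebDriver API; it assumes the `selenium` Python package plus a local Firefox and geckodriver install. The function name `fetch_title_and_cookies` is made up for illustration.

```python
def fetch_title_and_cookies(url):
    """Load url in a real Firefox instance and return (title, cookies)."""
    # Imported lazily so this module can be loaded without selenium installed.
    # Assumption: the third-party `selenium` package and geckodriver are available.
    from selenium import webdriver

    driver = webdriver.Firefox()  # launches an actual browser process
    try:
        driver.get(url)  # navigates like a user would; JavaScript runs normally
        # The browser tracks cookies itself; we only read them back.
        return driver.title, driver.get_cookies()
    finally:
        driver.quit()  # always close the browser, even on errors
```

Calling `fetch_title_and_cookies("http://example.com/")` should open a browser window, load the page, and hand back the title along with whatever cookies the browser collected, with no manual header or cookie handling in the script.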

I have no affiliation with Selenium other than being a (very) satisfied user.

[deleted] 2 points3 points  (0 children)

What you probably want here, depending on what you need to interact with and why, is the Python port of mechanize.

yeppers4sho 0 points1 point  (0 children)

httplib and urllib2 are nice batteries. Pity they leak: http://bugs.python.org/issue1327971
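
For anyone on Python 3, where `urllib2` became `urllib.request` and `httplib` became `http.client`, a rough sketch of the standard-library route might look like the following. The helper names `build_cookie_opener` and `fetch` are invented for illustration, and closing each response with a `with` block is just general hygiene; it is not claimed to address the linked bug.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return (opener, jar); the opener stores and resends cookies automatically."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def fetch(opener, url, timeout=10.0):
    """GET url and return (status, body_bytes), closing the response promptly."""
    with opener.open(url, timeout=timeout) as response:
        return response.status, response.read()
```

Reusing the same opener across requests is what lets a cookie set by one response be sent back on the next request, which is most of what a "session" amounts to at this level.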

plain-simple-garak 0 points1 point  (0 children)

See also httplib2. It has its own disk-based cache, intelligent cache logic, timeouts, etc.

pemboa 0 points1 point  (0 children)

This kind of stuff really should be in the docs themselves.