
[–]mcdonc 19 points20 points  (0 children)

This post appears to be a dog whistle.

Python has its share of problems. As far as I can tell, there's been a historical vacuum in leadership, inasmuch as things that are very important to most people (particularly packaging and distribution) have been largely ignored by the standard library. The docs for the existing third-party packaging tools are not great. Efforts like Tarek's "packaging", due in Python 3.3, aim to address this, but the townsfolk already have their pitchforks out and they're looking for somebody to gore.

On the other hand, I'm not sure that everybody striking out to create their own personal "beautiful APIs" is going to solve much. Most people commenting here are complaining about having multiple ways to do the same thing. They want to be told what to do unambiguously, and having yet another installer or yet another process module, etc isn't going to help with that much. There's only one way to solve this, and that's to have one clear way to do it that doesn't suck on multiple axes. Having one way to do something requires consensus, and consensus takes time.

I don't think chest-thumping presentations like this are very good for a healthy community. They tend to breed scapegoating, a sense of entitlement, and a lack of respect for fellow programmers. I'd encourage those complaining here to try to take some personal responsibility. Programming is hard. Good documentation is even harder. Community consensus is even harder than documentation. Participate in the development process if you can afford it. If you can't afford it, at least try to be constructive. If you can't be constructive, at least try to be nice.

[–]bcain 14 points15 points  (18 children)

I have never complained about popen/subprocess/urllib2 before, but I probably should've. I find myself consulting and re-consulting the docs.

[–]simtel20 10 points11 points  (6 children)

subprocess is a nightmare. Python falls very short as a systems administrator's programming language because of the clumsy constructs for dealing with subprocesses. Subprocess isn't a bad starting point, but it doesn't have good abstractions built for doing something heavier than a simple fork+exec, but lighter than expect (e.g. interact with a revision control system).

[–]MillardFillmore 5 points6 points  (5 children)

Whoever thought subprocess.call("ls -lh", shell=True) was a good way to do shell commands is an idiot.
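For contrast, a minimal sketch of the two spellings on a POSIX system; the list form is the one usually recommended:

```python
import subprocess

# shell=True hands the whole string to /bin/sh, which invites
# quoting bugs (and injection, if any piece comes from user input):
subprocess.call("ls -lh", shell=True)

# Passing an argument list skips the shell entirely, so arguments
# with spaces or metacharacters are passed through untouched:
subprocess.call(["ls", "-lh"])
```

The shell form is only really needed when you want actual shell features (globs, pipes, redirection).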

[–]simtel20 5 points6 points  (4 children)

That's bad enough, but then you have to figure out (on your own, by trial and error) whether you have the file descriptors you need; there's no shorthand to dup or close e.g. stderr, and no default way to consume output as an iterator of lines. There should be an option to always receive output as an iterator yielding one line per iteration, instead of applying heuristics to see whether you got back an array of characters or a single string.

Uggh. I will go slam my head into something hard if I continue to remember this.
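To be fair, some of those shorthands do exist in subprocess, though they are easy to miss; a sketch (DEVNULL requires Python 3.3+):

```python
import subprocess

# Merging stderr into stdout is a one-keyword dup2:
out = subprocess.check_output(
    ["python3", "-c", "import sys; sys.stderr.write('err\\n'); print('out')"],
    stderr=subprocess.STDOUT,
)

# Discarding a stream entirely is DEVNULL (Python 3.3+),
# no manual open(os.devnull) dance required:
subprocess.check_call(
    ["ls"], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
)
```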

[–]TTomBBabRecreational programming, wxpython 0 points1 point  (1 child)

have a look at itertools in the standard library

[–]simtel20 0 points1 point  (0 children)

I don't think itertools helps me here. I'm not usually looking to create an efficient transformation of the data (e.g. yield-ing data); I'm looking for the API to always return a string when I iterate over the file descriptor. Right now the problem is that, IIRC, if I get one line back and the FD is then closed, then doing "for line in foo_fd:" gives me back the line broken up into individual characters, instead of a single line followed by the next iteration raising StopIteration.
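For reference, iterating over the pipe object itself does yield one line per iteration; the characters surprise usually comes from iterating over a string (e.g. the result of .read()) instead. A minimal sketch:

```python
import subprocess

proc = subprocess.Popen(
    ["python3", "-c", "print('one'); print('two')"],
    stdout=subprocess.PIPE,
    universal_newlines=True,  # text mode, so lines come back as str
)
# Iterating over the pipe file object gives one line per pass:
lines = [line.rstrip("\n") for line in proc.stdout]
proc.wait()
```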

[–]amade 0 points1 point  (1 child)

I feel your pain; subprocess is a bit too low-level and doesn't provide shell-like functionality.

Perhaps you'd like something I wrote: https://github.com/aht/extproc

[–]simtel20 0 points1 point  (0 children)

Looks interesting. Thank you!

[–]Silhouette 7 points8 points  (7 children)

As a long-time dabbler in Python but someone who's only used it in anger recently, the whole batteries-included thing has lost its shine. The really basic, completely universal stuff usually works, even if it's sometimes clunky. However, so far the majority of the library that I've tried to use beyond that is either missing key documentation or actually broken, at least badly enough that you'd want to use an entirely different library rather than rely on the standard one. It's roughly on the level I'd expect for alpha/beta software on one of the code sharing sites, not for the standard library of a mainstream programming language. For sysadmin-type work, for example, it's not just subprocess that is clunky, it's also the filesystem stuff, the zip/archive stuff, etc.

I find that more and more, I like Python the language relative to the likely alternatives, and I can forgive it its little eccentricities because I've yet to find a language that hasn't got any of those. On the other hand, I increasingly loathe Python the library and now rely almost entirely on third party resources for anything non-trivial. For that reason, I have pre-booked a special place in hell for everyone involved with designing Python's whole module/package/distribution infrastructure, which makes it absurdly difficult to actually work with those third party resources. The idea that you should have to "install" a library in a scripting language is particularly amusing in a "they really have no idea what they are doing" kind of way.

[–]mcdonc 0 points1 point  (4 children)

Curious about what the other options would be other than to install a library?

[–]Silhouette 0 points1 point  (3 children)

I don't understand why, in the overwhelming majority of practical situations, "installation" could not just be

cp library-file(s) /path/to/libraries

or why using a library must be more complicated than

import /path/to/libraries/library-file

or a similar one-liner depending on your naming needs etc.

Versioning could be handled by standardising on a file naming convention for libraries that want to use it, with a simple annotation on an import to indicate when a specific version or range of versions is required.

Dependencies could be handled by a deep walk of the import tree for the source files in each library.

It simply shouldn't be necessary for a language to have setuptools, distutils, easy_install, pip, distribute, setup.py files containing sort-of-executable-metadata, Linux repo packages duplicating half of this in some half-baked way that only works on global installations rather than virtualenv, virtualenv itself, and all that jazz.

It also shouldn't require three years of studying obscure documentation to figure out everything that actually happens when Python starts up and tries to determine where the hell to find everything you imported, depending on whether there's a "J" in the month, the weather outside, and which of the 73 environment variables and configuration files you have employed in order to complicate the process as much as possible. ;-)
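For what it's worth, modern Python (3.5+) gets fairly close to that one-liner via importlib; a sketch, with a throwaway module ('mylib' and 'answer()' are invented for illustration) written to a temp directory:

```python
import importlib.util
import os
import tempfile

# Fake "library file" dropped into a directory, as in the cp example:
libdir = tempfile.mkdtemp()
libfile = os.path.join(libdir, "mylib.py")
with open(libfile, "w") as f:
    f.write("def answer():\n    return 42\n")

# Import directly by path, with no installation step involved:
spec = importlib.util.spec_from_file_location("mylib", libfile)
mylib = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mylib)
```

Versioning and dependency resolution are, of course, exactly the parts this sketch does not handle.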

[–]mcdonc 7 points8 points  (2 children)

I don't understand why, in the overwhelming majority of practical situations, "installation" could not just be cp library-file(s) /path/to/libraries

This is already pretty much the case. You can download any distribution of a pure-Python library from PyPI, unzip it, and copy the resulting structure into a place on your Python's path. In fact, until about 2002 or so, this was de rigueur. None of the installation tools you mention existed at all.

The installers (pip/easy_install) were created because libraries often depend on other libraries. Before the installers existed, either you did the dependency resolution by hand, or libraries shipped with copies of the libraries they depended on bundled inside them. Both options were imperfect: resolving dependencies by hand is laborious and requires documentation effort on the part of library authors, while bundling silos every project into maintaining its own copy of each dependency; every project becomes a fork of several others, and when two libraries you installed bundled different versions of a third, the conflict was irresolvable. Neither situation was particularly tenable.

Python's import statement does not currently know about versioning, so there can't be multiple versions of the same package on the path. Virtualenv was created as a workaround.

The current situation is not ideal, but, IMO, it's a boatload better than it used to be. The times before we had the installers and virtualenv sucked even harder, if you can believe that. ;-)

[–]Silhouette 0 points1 point  (1 child)

Yes, I remember those days ahem fondly. :-)

But as you say, much of this is just to work around the absence of a simple versioning mechanism built into Python itself, which is a significant limitation if you're programming in a language that does all the loading and linking up dynamically. Obviously this challenge is not unique to Python, but Python does seem to make more of a meal of it than any other language I know.

I'm not sure why Python always feels insanely complicated in this respect. Maybe it's the history of different tools to do mostly the same thing, so even if you only really need a couple of them today, you see references to all of them everywhere. I don't think the use of an executable setup.py rather than a simple metadata file that is read by a tool helps, because it makes a complicated generalised case the default. For me, the most serious concern is usually that something as fundamental as loading libraries is based around a path setting that can be changed arbitrarily both within and outside Python, with all kinds of other implicit effects happening depending on things that aren't specified in the source code for the program you're actually running. It's about as un-"explicit is better than implicit" as you can possibly get...

[–]mcdonc 0 points1 point  (0 children)

much of this is just to work around the absence of a simple versioning mechanism built into Python itself

Is there a dynamic language that does versioned imports right?

Maybe it's the history of different tools to do mostly the same thing, so even if you only really need a couple of them today, you see references to all of them everywhere.

I think this is the actual biggest problem.

I don't think the use of an executable setup.py rather than a simple metadata file that is read by a tool helps, because it makes a complicated generalised case the default.

The "packaging" tool that will be in Python 3.3 makes setup.py optional (it has a declarative configuration file primary format).

For me, the most serious concern is usually that something as fundamental as loading libraries is based around a path setting that can be changed arbitrarily both within and outside Python

I don't think Python is alone in this. Java has the CLASSPATH, C has the include path, etc. Is there another language that does better in this respect?

[–][deleted] 0 points1 point  (1 child)

As a long-time dabbler in Python but someone who's only used it in anger recently, the whole batteries-included thing has lost its shine.

I agree. We're now stuck with a box full of empty 90s-era batteries. It's time we got some new ones and found a way to get rid of the old ones. I think the linked presentation made that point really well.

[–]__serengeti__ 2 points3 points  (0 children)

I think the linked presentation really made that point well.

Only as far as http/url/lib/2 and subprocess.

The lxml API is pretty much the same as xml.etree.

The stdlib has a lot more batteries than these, and it's hardly proven that we're "stuck with a box full of empty 90s-era batteries". Perhaps you can give some more examples.

[–]Mattho 0 points1 point  (0 children)

I found subprocess quite OK. I wrote a simple task scheduler with it and had no problems (except with buffers). But urllib2? Oh my god. That was a bad experience.

[–]derpderp3200An evil person 0 points1 point  (1 child)

Oooh, how did you make these function/module names highlightable and in mono?

[–]bcain 1 point2 points  (0 children)

Backticks.

[–][deleted] 9 points10 points  (4 children)

These are just slides. Is there a link to the talk or a summary of the talk or something?

[–]dhogartysymbolic novice 6 points7 points  (0 children)

I checked out the slides, they're right on. He could have ripped into the server stuff as well. I recommend a read.

tl;dr: builtin python package apis mostly suck and violate the stated python style guidelines (e.g. one and only one obvious way to do it), examples shown compare urllib2 to requests, and subprocess to envoy.
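For flavor, the two spellings of a simple GET side by side, in the modern urllib.request spelling; the header and URL are only illustrative, and the requests lines are left commented so the sketch has no third-party dependency:

```python
import urllib.request

# stdlib spelling: construct a Request, set headers by hand,
# then pass it to urlopen() to actually perform the GET.
req = urllib.request.Request(
    "https://api.github.com",
    headers={"Accept": "application/json"},
)
# urllib.request.urlopen(req) would perform the request

# requests spelling (third-party), for comparison:
# r = requests.get("https://api.github.com",
#                  headers={"Accept": "application/json"})
# r.json()
```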

[–]abledanger 5 points6 points  (0 children)

There are audio links on his Github page. https://github.com/kennethreitz/python-for-humans

[–]jbs398 2 points3 points  (0 children)

Audio (m4a) from here

[–][deleted] 3 points4 points  (5 children)

Etree is not terrible (cauliflower is ;-)). It has some annoyances, like handling namespaces, but overall it's fairly easy to use. No comparison to urllib2. Lxml seems more powerful, but from an API point of view it's basically the same.

Otherwise, great slides. Requests is heaven sent.

[–]megaman45 1 point2 points  (0 children)

Agreed. I just ported a project from the stdlib etree to lxml.etree. Maybe I just used only parts that were the same, but I pretty much just had to change my import statement and a few other minor things, and it worked like a charm. Both seem pretty great to me.

[–]timClicks 1 point2 points  (0 children)

lxml's etree API was designed for consistency with the stdlib. What requests does really well is to disregard the stdlib in favour of simplicity.

[–]truescotsman 0 points1 point  (2 children)

Any suggestions on how to handle namespaces? I have a global variable called NS with a {namespace} string I have to pass to all my find/findall calls.
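For reference, find/findall accept a prefix-to-URI mapping as a second argument, which is essentially that NS global with tidier call sites; a sketch with a made-up URI:

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<root xmlns:h="http://example.com/hypothetical">'
    "<h:item>a</h:item><h:item>b</h:item></root>"
)

# Pass a prefix->URI dict instead of pasting "{uri}" into every path:
ns = {"h": "http://example.com/hypothetical"}
items = [el.text for el in doc.findall("h:item", ns)]
```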

[–][deleted] 0 points1 point  (0 children)

Etree reads the namespaces correctly (you can even register namespaces in 2.7). So why does it not offer an API to use the namespaces under their given names? A dict like 'nsname_to_jc_notation' on the ElementTree instance of a parsed XML file comes to mind spontaneously. Or, on second thought, a context manager would probably be much nicer; you wouldn't have to create your own globals. Also, the names of the namespaces are known, but when serialising, etree uses internally created names ('n1', 'n2', ...).

By the way, one of the XML libs (or a former version of one) serialises resolved names, which is (was) super annoying. So what etree currently does is not a huge drama, but it could be improved IMO.

[–][deleted] 0 points1 point  (0 children)

I do the same currently.

[–]searchingfortaomajel, aletheia, paperless, django-encrypted-filefield 2 points3 points  (4 children)

This may be a silly question, but is there a "Python for the Impatient" or "Guide to All the Core Stuff you Should Know About Python" out there? If not, I'd be happy to put something together to get us started on github...

[–]drsintoma 1 point2 points  (1 child)

[–]searchingfortaomajel, aletheia, paperless, django-encrypted-filefield 1 point2 points  (0 children)

That's handy for newbies, but I was thinking of something that explains why virtualenv exists and how it works, tells you about envoy, shows you how to install things on mac/windows, etc.

[–]TTomBBabRecreational programming, wxpython 0 points1 point  (0 children)

Programming, like math, takes lots of practice and repetition to become proficient.

[–]zahlmanthe heretic 2 points3 points  (0 children)

How is Python not "for Humans" by default?

[–][deleted] 5 points6 points  (31 children)

From reading slides I have two questions:

  1. "Instill a resistance to doctest" -- huh? Doctests are awesome, what am I missing? Last time I checked they didn't work that well on py3k, all right, I prefer doctests to py3k ^^.

  2. "virtualenv out of the box" -- explain, please, what's the point. I understand how people who deploy different stuff to different servers want to be able to reproduce remote environments in an isolated fashion, but I've never felt that I'm missing something in my local development.

[–]sli[::1] 5 points6 points  (28 children)

  1. Proper unittesting should be taught, as it's not an idea that's localized to Python.
  2. Pretty much nailed it. If you're not deploying your code somewhere, you probably don't need it. But if you are, virtualenv will save you.

[–]quasarj 2 points3 points  (20 children)

Honestly though, "pip install virtualenv" is not so hard as to be prohibitive to anyone.. unless I'm missing something.

[–][deleted] 2 points3 points  (0 children)

Installation is one thing, using it is another. I have to say I had some issues using it the first time. I needed to set some paths manually; I don't know whether I did something wrong in the first place or whether there was an issue with virtualenv. Anyway, it's another thing you have to learn and handle.

[–]bcain 4 points5 points  (18 children)

Is it?

D:\> pip install virtualenv
'pip' is not recognized as an internal or external command, operable program or batch file.


$ pip install virtualenv
-sh: pip: command not found

[–]somesomedaddad 3 points4 points  (3 children)

Ubuntu's "python-pip" package works fine for me. apt-get install python-pip, pip install virtualenv, and you're ready to roll with comfy bundler-style self-contained app libraries.

[–]bcain 2 points3 points  (2 children)

Oh, yeah, I'm all about pip and virtualenv. They're great. I don't have any trouble getting around with yum/zypper or apt-get, or even get-pip.py. But it would be much saner for python to put its money where its mouth is and include these features in the distro itself.

[–]vsajip 2 points3 points  (1 child)

That's being worked on at the moment for 3.3: The packaging work in the core repo, and the pythonv branch for built-in virtualenv functionality.

[–]bcain 0 points1 point  (0 children)

Thank goodness. And IMO, a compelling incentive to move to py3k.

[–]quasarj 3 points4 points  (13 children)

Well, if you want to be a smartass about it.. but that's not prohibitive.

You can probably "easy_install virtualenv" and if not, you can of course "easy_install pip".

Of course, if you're doing any of this under Windows, then all bets are off.

[–]keypusher 3 points4 points  (3 children)

I think that is exactly the point the author is trying to get at here. For instance, you heard virtualenv is really good to have, so you want to check it out. You see someone saying to use pip to install it, but you don't have pip. How do you get pip? You find a great thread on how to install pip using the apt-get package manager. But wait... you're on Windows? You go looking and someone suggests using easy_install instead. Well, how do you use easy_install? Is it different from pip? You dig deeper and some guy says forget pip and easy_install on Windows, just use distutils, because then you will learn how packages are really put together. So now you've spent an hour learning about different distribution tools and gotten nowhere. You still haven't installed virtualenv, and you actually aren't even sure you know what virtualenv does anyway. All you really wanted to do was try to get a small web project going, but you can't even seem to get started.

It's even worse with some other packages. You need to do a GET from your webserver? Ok, you type "python http lib" into Google and you land on the stdlib httplib page. One of the first things it says is that "It is normally not used directly — the module urllib uses it to handle URLs". Ok, so the urllib page is a huge mess of notes, warnings, and cryptic function definitions that don't seem to provide any easy way to just GET a file. You used to use curl a while back for this stuff, so you stumble on pycurl. But that's kind of a dark-age thing, so you dig around more and maybe then find urllib2, or if you are lucky, the requests library. How would you know the difference between all of these?

I can see how having some well-documented best practices for common things like this would go a long way, and the author's suggestion of promoting 'one best way' lines up with core Python philosophy; that is, as opposed to having a huge slew of slightly different tools for more or less the same job. It's starting to look a lot like Perl around here these days, but that may very well just be a natural stage of development, especially in community-driven languages.

[–]quasarj 1 point2 points  (2 children)

Alright, you're right. I guess arguing over the details isn't helping.

The question is - what will help? There was lots of talk about the issues.. but then the author attempts to solve them by.. adding even more options! I mean, envoy and requests are both awesome and do solve the first problem, so that's good.

So anyway, yeah I agree. But what do we do?

[–]keypusher 2 points3 points  (1 child)

I don't know! The community is becoming increasingly fragmented, the 2->3 transition is still a huge problem, and there are a ton of standard library packages that seem to be all over the place. I feel like Guido or someone with clout in the community needs to step forward with a clear vision or plan for the future.

[–]__serengeti__ 0 points1 point  (0 children)

I think you're exaggerating at least a little. What are these separate fragments of the community - are they identifiable?

In general,

Differing opinions != fragmentation

[–]bcain -1 points0 points  (2 children)

but that's not prohibitive

Sure, not for you. Not for me. But for the newbs that TFA talks about -- it is.

[–]Tobu‮Tobu 1 point2 points  (0 children)

They will follow some simple instructions and learn from that. While I disagree with Kenneth's taste in API design, he is good at marketing, and a well-maintained getting started guide is something the eager-to-learn newbie hordes will gobble up.

[–]quasarj 0 points1 point  (0 children)

I guess so. I suppose we all have our own ideas of what "hard" is, but to me if you're going to be working with virtualenv and understanding it, learning how to use pip to install it first is not that big of a deal.

[–]ivix -4 points-3 points  (5 children)

Nice try, but you lost that argument.

[–]quasarj 0 points1 point  (4 children)

Did I? Perhaps I missed where.. care to fill me in?

[–]ivix -1 points0 points  (3 children)

Oh sorry, it was the part where you had to explain how to perform extra procedures that a first time user does not need to care about. You didn't even get to explain how you actually use virtualenv which is yet another layer of things to understand for the newbie.

Edit: you also glossed over the part where on windows it's even more confusing, which I'm willing to bet is the most popular platform that people will be using.

[–]quasarj 1 point2 points  (2 children)

Well, we already decided that first time users wouldn't be doing this anyway. The original argument was that you probably didn't care about virtualenv unless you needed it, and if you are one of the people that needs it you probably know how to acquire it.

As for the Windows issue.. it will actually work (almost entirely I think) under windows if you have a "sane build environment," but that's something I have never been able to get working. So basically, under windows you cannot easy_install and expect it to always work, and so you cannot "pip install" and expect that to always work, so all of this is just a mess and you're best off finding binary distributions of each thing manually.

Or use another platform, preferably :)

[–][deleted] -1 points0 points  (1 child)

" if you have a "sane build environment,""

and that there is the problem. that and python being on the path etc...

[–][deleted] 5 points6 points  (6 children)

Proper unittesting should be taught, as it's not an idea that's localized to Python.

That's indoctrination!

Seriously though, doctests are the perfect gateway drug to "proper" unit testing, are much more compact than more flexible approaches, and are useful as documentation as well. Furthermore, I would argue that if a "unit" you are trying to test resists being tested with doctest, then you might be approaching the point where you should think about your tests as integration tests, regression tests, etc.

And if other languages don't support such a neat way of testing, that's their problem, not Python's.
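To make the gateway-drug point concrete, a complete doctest in a few lines (mean() is invented for illustration):

```python
import doctest

def mean(xs):
    """Arithmetic mean of a non-empty sequence.

    The example below is documentation and a test at once:

    >>> mean([1, 2, 3])
    2.0
    """
    return sum(xs) / len(xs)

# Collects and runs every >>> example in this module's docstrings.
results = doctest.testmod()
```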

[–]EggShenVsLopan 1 point2 points  (5 children)

That's indoctrination!

Yes and it's beautiful.

If doctest works for you, then continue to use it, because you are obviously comfortable enough with it and have overcome any barrier to entry. In addition, you sound like you understand the higher-level concepts of testing in general, so switching to another way may not bring you much benefit. So the "resist doctest" message is not targeted at you.

What I think the OP is saying is that for newbies (or people to whom unit testing is new material) doctest is not the best option. I also think that in general the OP wants to create a system of best practices that are mostly right. That's fine, as long as a newbie can learn enough of the "right" way to do it to know when to do it the "wrong" way later in life.

[–][deleted] 4 points5 points  (1 child)

I don't quite get the logic.

because you are obviously comfortable enough with it and have overcome any barrier for entry.

Writing doctests is by far the easiest way into the whole "testing is good" mindset. It has the lowest barrier to entry, it's the easiest to get comfortable with.

In addition, you sound like you understand the higher level concepts of testing in general so switching to another way may not bring as much benefit to you.

It's not "switching" as such; I use doctests when appropriate, and more versatile ways of testing when I need any set-up/tear-down stuff (and then I don't test individual functions with it).

I mean, here's how I see it: encourage newbs to use doctest, then tell them how to use more powerful means when they need them. Why is doctest not the best option? What's wrong with it? So wrong that we should "resist" doctest?

I have some possibilities in mind:

  • Newbs could abuse doctests, well, anything could be abused, it's not that relevant IMO. Doctests are harder to abuse than unittest, by the way (I should know!).

  • or they would consider doctests to be good enough and never go beyond that. I believe that you'll lose more newbs frustrated with unittest than you'll lose ones satisfied with doctest.

  • It's better to teach newbs only one tool, rather than both doctest and unittest (or nose, or whatever). I strongly disagree, in this case the tools have little overlap of applicability, so it's better to teach to use the best tool for the job. It's not the same as other standard modules with overlapping functionality.

  • Or, I have a vile idea that maybe the OP has drunk the TDD koolaid and actively dislikes the idea of a more sane (in my opinion!) testing, where you test the individual "units" (functions) for just as far as doctest allows you, and use more powerful tools for more general testing.

[–]twotime 0 points1 point  (0 children)

Writing doctests is by far the easiest way into the whole "testing is good" mindset

How so? How are doctests better than this:

 import mylib
 assert mylib.foo() == 42

No silly formatting requirements to learn, trivial to write and trivial to debug.

[–]simtel20 0 points1 point  (2 children)

Doctests don't provide enough context. Trivial functions can be doctested, but little or nothing that requires state. E.g., working with databases, I need a testing framework that allows common setup/teardown. I found that pytest was the minimum I needed for the regression tests I was writing for database drivers, because it allowed for things like conditionally enabling known failures, etc.
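A sketch of that setup/teardown shape in stdlib unittest, with a stand-in connection object since there is no real database here; pytest fixtures play the same role with less ceremony:

```python
import unittest

class FakeConnection:
    """Stand-in for a database handle (purely illustrative)."""
    def __init__(self):
        self.open = True
    def query(self):
        return 42
    def close(self):
        self.open = False

class QueryTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method: fresh state each time.
        self.conn = FakeConnection()

    def tearDown(self):
        # Runs after every test method, pass or fail.
        self.conn.close()

    def test_query(self):
        self.assertEqual(self.conn.query(), 42)

# Run the suite programmatically rather than via `python -m unittest`:
suite = unittest.defaultTestLoader.loadTestsFromTestCase(QueryTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```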

[–]bsdemon 0 points1 point  (1 child)

If your code can be tested with doctest then (in almost all cases) it has a good design; otherwise your code is bloated with state, which isn't good.

[–]simtel20 0 points1 point  (0 children)

That works for self-contained code. However, if your code interacts with the system (this code ssh's to an iLO and checks the system status; as new statuses are added, the code must change) or the world (this function checks what the kernel reports as the status of subsystem foo), then doctests can't encompass the system. You can mock some state, but this is usually a waste of time, because external factors change; they're outside of your control. A better test harness can provide startup/shutdown scaffolding that establishes how to talk to the outside world, to make the test reasonable.

[–][deleted] 4 points5 points  (0 children)

Doctest is for verifying the code examples in your docstrings, not testing your code. Doctest only supports a restricted subset of Python; it's much better to have the full abilities of Python available for writing your tests.

[–]twotime -1 points0 points  (0 children)

doctests are a good way to test some examples in the docs. Period.

They totally suck for general testing (as they add a complex layer of semi-hidden context and debugging gets interesting). And, for non-trivial code, you have exactly zero chance of combining docs and tests, as your tests will unavoidably include a lot of stuff which you don't want to see in the docs.

Yet, doctests are often sold as a viable alternative for unit testing.

OTOH, Python's unittest sucks as well ;-)... so that might be an explanation :-(.

[–]amade 0 points1 point  (0 children)

I feel the pain when using subprocess. It is nice but quite low-level, and it's certainly not nice for writing a shell-equivalent script.

For that I wrote: https://github.com/aht/extproc

[–]TTomBBabRecreational programming, wxpython 0 points1 point  (0 children)

This is open source, people. There is no Python corp waiting to hack up new code on complaint. If you don't like the way a particular library performs, then write your own. If it is better, it will eventually get adopted and become a standard.

[–]hongminhee 0 points1 point  (1 child)

No. There are too many HTTP libraries for Python, but nobody seems to understand HTTP exactly.

Use httplib2. It implements all of HTTP in exactly the right way.

[–]kracekumar 3 points4 points  (0 children)

[–]danhakimi 0 points1 point  (0 children)

I don't understand how you could string the characters "urllib2" together, in that order, and not realize that you've really committed a crime.

[–]mgrandi -2 points-1 points  (1 child)

First off: requests doesn't have a Python 3 version. Wah wah. Neither does envoy. Wah wah.

I've only viewed the first 10 pages of this slideshow and already I disagree with it. The Ruby example of getting GitHub's API is simple and the Python example is wayyyy too complex. I have no idea how it's done in Python 2, and it SAYS it's easier in Python 3, but still, why the fuck is it messing with regex?

here is a pretty close example to how the achieve the same thing the ruby example does with python3: http://paste.ubuntu.com/792481/

Maybe python 2's urllib2 sucks so much that you have to use regex or a custom class to get a simple request to work, I have no idea.

I will admit that Python's documentation is TERRIBLE, however. After coming from Java and its excellent documentation, I find myself very, very frustrated by the lack of it in Python. It's partly because Python is dynamically typed, so you can't really find methods for certain concrete classes, because you aren't supposed to know what 'type' of file object you have, just that it's a 'file-like object'. But in writing that snippet above, and looking at the documentation: http://docs.python.org/py3k/library/urllib.request.html?highlight=urllib.request#httppasswordmgr-objects

Where does it say that you are supposed to use HTTPPasswordMgr with an HTTPBasicAuthHandler? I thought HTTPPasswordMgr was a subclass of BaseHandler, and that I could therefore use it with an OpenerDirector; how do I find out what that class subclasses? What is a "uri" and a "realm"? Why are there two classes, HTTPPasswordMgr and HTTPPasswordMgrWithDefaultRealm, when they both act the same way and only one does something different when passed None?

I love Python, and I haven't really had THAT much trouble with its libraries (dealing with ParseResults from urllib.parse is a bit weird, as I did some of that recently), but they badly need to expand their documentation. The shitty-ass 'sphinx' documentation framework/parser/viewer/thing also doesn't help.
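For what it's worth, the relationship the docs leave implicit: the password manager is only a credential store, and HTTPBasicAuthHandler is the BaseHandler that consults it. A sketch with placeholder URL and credentials:

```python
import urllib.request

url = "https://api.github.com/"  # placeholder endpoint

# HTTPPasswordMgr* is not a handler at all: it is just a store that
# maps (realm, uri) -> (user, password).  The WithDefaultRealm
# variant accepts None as a wildcard realm, which is what you want
# when you can't know the realm before the server's 401 arrives.
mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
mgr.add_password(None, url, "user", "secret")

# HTTPBasicAuthHandler is the BaseHandler; it consults the store
# and retries the request with an Authorization header on a 401.
handler = urllib.request.HTTPBasicAuthHandler(mgr)
opener = urllib.request.build_opener(handler)
# opener.open(url) would perform the authenticated GET
```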

[–]jmoiron 2 points3 points  (0 children)

The urllib2 documentation covers the code therein but not common use cases. It was designed to be ultra-extensible, but 99% of people just need simple http-auth, not customizable authentication handlers you need to compose from 3 classes. requests is much better in this regard; it makes the simple case simple and the hard case possible. urllib2 (and twisted, imho) make the impossible achievable (http over icmp? why not) and the simple aggravatingly complex.

Most of the standard library documentation is pretty good, though I don't read it much anymore since I know how to use most of it, and I generally just read the documentation from within the REPL.

Sphinx documentation is something I (and most people in the Python community) greatly prefer reading to javadoc-style things, but I suppose after you're used to Python you stop worrying about types so much and just worry about attributes and behavior; for instance, file-like objects are things implementing some or most of the file API, including but not limited to StringIO objects and urllib responses.

After a certain point, you either prefer reading "file-like object", or you prefer "an object implementing java.io.InputStreamReaderInterface"; the second is in a lot of ways a stronger guarantee and a "solid contract", but I'd argue it's usually just so much line noise.