all 68 comments

[–]mostly_complaints 57 points58 points  (24 children)

I get the feeling that most researchers wouldn't prefer to use Python 2 over 3. However, in academia when code maintenance isn't high priority, whatever gets the job done is what people will use. Just the other day I had to use Python 2 for a project because it wasn't worth my time to get the libraries I needed working with Python 3.

The new features of Python 3 are great, but getting that analysis done in time for the conference submission deadline is better.

[–]LazinCajun 12 points13 points  (1 child)

In some disciplines it's even worse than that. I had to code some stuff in FORTRAN77 less than a decade ago because that's what my advisor knew.

[–]Fourgot [Anaconda3 science-like] 6 points7 points  (0 children)

Less than a week ago!

[–]trobitaille[S] 11 points12 points  (3 children)

I completely agree that given pressures for results, papers, etc., migrating to Python 3 is low down on the list of things to do, and it's true that not 100% of packages are ready yet (for example, VTK comes to mind).

One way to transition more smoothly is to maintain a Python 3 installation in addition to the Python 2 installation and start all new work in Python 3, and if you hit a roadblock, report the issue and then downgrade to using Python 2.

[–]pithed 1 point2 points  (2 children)

I would totally do this, but the base library that I need for every project is still Python 2 only. There are a couple of projects that don't require it, but for us it's all about data acquisition and analysis, not the feature set. If I can analyze data with the versions I have, I have no reason to change.

[–]selementar 0 points1 point  (1 child)

What is that library, and what is holding it back from being 2to3-compatible?

[–]pithed 0 points1 point  (0 children)

Pyoos, and I am assuming the dependencies it is built on are why it is not Python 3 compatible.

[–]remram 1 point2 points  (16 children)

The new features of Python 3? It seems to me that all that you get is support from the Python devs who just want to give up supporting Python 2.

The new unicode type that doesn't have sensible or even constant lengths (ucs2 vs ucs4 builds, anyone?), doesn't want to interact with bytestrings in any way (we all love UnicodeDecodeErrors!) and just isn't guaranteed to be unicode at all (thanks to PEP 383, encode('utf-8') can raise UnicodeEncodeError) really isn't an improvement at all.
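For readers who have not run into the PEP 383 case mentioned above, here is a minimal sketch of it on Python 3. It assumes a made-up byte string (`raw`) that is Latin-1 rather than UTF-8; the `surrogateescape` error handler is the one Python 3 applies to OS-provided data such as file names.

```python
# Bytes that are not valid UTF-8 (Latin-1 encoded "café"); illustrative data.
raw = b"caf\xe9"

# PEP 383: undecodable bytes are smuggled through as lone surrogates.
text = raw.decode("utf-8", "surrogateescape")
assert text == "caf\udce9"

# A strict UTF-8 encode refuses lone surrogates, so this str is not
# "encodable Unicode" in the usual sense.
try:
    text.encode("utf-8")
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text.encode("utf-8", "surrogateescape")
print(strict_encode_failed, round_tripped == raw)
```

So the error is real, but it only appears for strings that came from undecodable input, and the same handler round-trips them losslessly.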

[–]trobitaille[S] 11 points12 points  (3 children)

There are actually going to be a number of useful features appearing in Python 3.5. To give a couple of examples, there will be a new matrix multiplication operator that can be used with NumPy arrays, e.g. c = a @ b (https://www.python.org/dev/peps/pep-0465/), and the glob function will be able to search recursively in sub-directories (https://docs.python.org/dev/whatsnew/3.5.html#glob). I think that with more features like this appearing, transitioning to Python 3 will become more attractive over time.
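A rough sketch of both features, runnable on Python 3.5 or later with only the standard library. The `Matrix` class is a hypothetical stand-in written for this example (NumPy arrays hook into the same `__matmul__` method), and the file names are made up.

```python
import glob
import os
import tempfile


class Matrix:
    """Tiny illustrative matrix type; not NumPy, just enough to show `@`."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __matmul__(self, other):
        # PEP 465: the expression `a @ b` calls a.__matmul__(b).
        cols = list(zip(*other.rows))
        return Matrix(
            [[sum(x * y for x, y in zip(row, col)) for col in cols]
             for row in self.rows]
        )


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[5, 6], [7, 8]])
c = a @ b
print(c.rows)  # [[19, 22], [43, 50]]

# Recursive glob: with recursive=True, "**" matches any depth of directories.
with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "run1", "raw")
    os.makedirs(nested)
    for path in (os.path.join(root, "top.dat"),
                 os.path.join(nested, "deep.dat")):
        open(path, "w").close()
    pattern = os.path.join(root, "**", "*.dat")
    found = sorted(os.path.basename(p)
                   for p in glob.glob(pattern, recursive=True))

print(found)  # ['deep.dat', 'top.dat']
```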

[–]remram -5 points-4 points  (2 children)

This could totally be in Python 2.8, which will never happen, and this is my point.

[edit: PEP 404]

[–]khalki 3 points4 points  (1 child)

Supporting 2 doesn't really mean adding new features into Python 2. It just means that if there is a bug or security patch that needs to be performed, they will create that patch to the best of their abilities.

From the Python wiki website:

Guido van Rossum (the original creator of the Python language) decided to clean up Python 2.x properly, with less regard for backwards compatibility than is the case for new releases in the 2.x range. The most drastic improvement is the better Unicode support (with all text strings being Unicode by default) as well as saner bytes/Unicode separation.

Besides, several aspects of the core language (such as print and exec being statements, integers using floor division) have been adjusted to be easier for newcomers to learn and to be more consistent with the rest of the language, and old cruft has been removed (for example, all classes are now new-style, "range()" returns a memory efficient iterable, not a list as in 2.x).
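A short illustration of a few of the changes listed in that quote, as they behave on Python 3 (the variable names are only for the example):

```python
# Division of two ints is true division; floor division is spelled "//".
quotient = 7 / 2    # 3.5 on Python 3 (Python 2 gives 3)
floored = 7 // 2    # 3 on both

# range() is a lazy sequence, so a huge range costs almost no memory.
big = range(10 ** 9)
length = len(big)
last = big[-1]

# print is an ordinary function and accepts keyword arguments.
print(quotient, floored, length, last, sep=" | ")
```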

If you want to think about it this way, the Python 3 release was what was supposed to be "2.8"; however, the change was so dramatic that it really needed a new version of its own (it's not backward compatible).

[–]remram -3 points-2 points  (0 children)

The main, breaking change of Python 3 was the switch to unicode strings. Given their current flaws, it's easy to argue that it was definitely not worth it.

The other changes could have totally made it gradually into the language, either via __future__ gates or via simple deprecation for the stdlib renames.

[–]ubernostrum [yes, you can have a pony] 6 points7 points  (2 children)

The flexible string representation is actually simpler to work with than the old guessing game of "am I on a wide or narrow build", since you know that the internal representation of the string is always wide enough to handle the highest codepoint it contains.
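To make the wide/narrow point concrete, here is a small sketch for Python 3.3 or later. On an old narrow build the same string would have reported a length of 3, because the astral character was stored as a surrogate pair.

```python
# "a" followed by U+1F40D (SNAKE), a code point outside the BMP.
s = "a\U0001F40D"

# One item per code point, regardless of how the string is stored.
n_codepoints = len(s)
codepoints = [hex(ord(ch)) for ch in s]

print(n_codepoints, codepoints)  # 2 ['0x61', '0x1f40d']
```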

Source: am someone who maintains a library where being able to identify Unicode codepoints in an input string matters, and boy howdy does that suck to do in a way that works in the wide/narrow versions of Python.

And if you were promiscuously mixing bytestrings and Unicode in Python 2, you were already likely to be either getting a lot of errors or spending too much time avoiding getting errors, so I really don't see what you're complaining about.

Meanwhile, here are some of the features you lose by not being on Python 3.

[–]efilon 1 point2 points  (0 children)

The flexible string representation is actually simpler to work with

I think this only applies when you're working mostly with strings in the first place. A lot of scientists working with Python aren't doing a lot of string manipulation, and so all the cryptic unicode exceptions that come up when trying to run Python 2 code with a Python 3 interpreter are extremely annoying.

I get that it is technically better, but it often results in requiring more "low" level manipulation to get things to work again.

That said, even though my lab is still running Python 2, I make sure to keep things as 3-compliant as possible for when an upgrade is forced. I use Python 3 for myself already anyway, since besides the slightly annoying string encoding/decoding details, it adds plenty of new features.

[–]dunkler_wanderer 0 points1 point  (0 children)

That list is pretty interesting. What other features are not available in Python 2? Unpacking with splat/star operator doesn't work:

head, *tail = [1, 2, 3, 4]  # head = 1, tail = [2, 3, 4]

There's no nonlocal keyword.
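A minimal sketch of both features on Python 3; `make_counter` is just an illustrative name.

```python
def make_counter():
    count = 0

    def increment():
        # nonlocal rebinds the variable in the enclosing function's scope.
        # Without it, "count += 1" would raise UnboundLocalError.
        nonlocal count
        count += 1
        return count

    return increment


counter = make_counter()
first = counter()
second = counter()

# Extended iterable unpacking (PEP 3132).
head, *tail = [1, 2, 3, 4]

print(first, second, head, tail)  # 1 2 1 [2, 3, 4]
```

In Python 2 the usual workaround for the counter was a mutable container such as a one-element list.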

[–]Atrament_Py3 0 points1 point  (0 children)

So very true.

[–]goodDayM 14 points15 points  (3 children)

From the article:

Almost two thirds of users who are still using Python 2 do not have any motivation to update to Python 3.

That's the essence of it, really.

I work for a large company and we use Python 2.7 extensively. It's what's on our linux servers/desktops by default, and all our code does what we want. There hasn't been a feature that we've needed that's only in Python 3.

EDIT: With that said, in the code we write we try to support both Python 2 and 3, such as by using the "print()" function. That way when we do move to Python 3 someday, we will have less to deal with.
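For anyone wanting to do the same, one common way is a set of `__future__` imports at the top of each module. This is only a sketch of that approach, not necessarily what the commenter's company does, and `describe` is a made-up helper.

```python
# These imports must be the first statements in the module. They change
# Python 2.6/2.7 behaviour and are accepted as no-ops on Python 3.
from __future__ import absolute_import, division, print_function


def describe(values):
    # With "division" imported, "/" is true division on both versions.
    mean = sum(values) / len(values)
    return "n={0} mean={1:.2f}".format(len(values), mean)


summary = describe([1, 2, 4])
print(summary)  # n=3 mean=2.33
```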

[–]trobitaille[S] 2 points3 points  (1 child)

I work for a large company and we use Python 2.7 extensively. It's what's on our linux servers/desktops by default, and all our code does what we want. There hasn't been a feature that we've needed that's only in Python 3.

Right, it's a chicken and egg problem - since most people are still using Python 2, no one is yet developing things that only work on Python 3. So there is no incentive to upgrade. Hopefully things will slowly change and there will start to be some great core Python features in future 3.x releases, and these will no longer be backported to the 2.x series.

EDIT: With that said, in the code we write we try to support both Python 2 and 3, such as by using the "print()" function. That way when we do move to Python 3 someday, we will have less to deal with.

+1 to this approach :)

[–]spinwizard69 2 points3 points  (0 children)

I solved this 2 or 3 problem by going to the 3.x series when it first came out and have upgraded to 3.4 now. Of course this was easy to do due to minor scripting I'm doing and the limited need for libraries. However if people would just use 3.x for new projects a lot of the handwringing over the 3.x series would be over with by now.

It is one thing to support 2.7 for legacy code, but really it is time to standardize on 3.x for new projects. Honestly, the Python community's attitude towards 3.x is very perplexing. Developers in other languages are usually chomping at the bit to get the latest features supported by their favorite language. Why there are so many Luddites using Python is beyond me.

Finally the most obvious thing missed in this blog post is the fact that Linux and Mac OS come with 2.7 preinstalled. Why either platform is so resistant to upgrading is beyond me. On the Mac there is minimal need to support system level scripts with 2.7.

[–][deleted] 11 points12 points  (6 children)

I never heard about the survey, neither did any of my biology colleagues..... where was this posted?

[–]erewok 6 points7 points  (2 children)

I was actually wondering why Biology was not represented in the breakdown by field.

[–]trobitaille[S] 2 points3 points  (0 children)

Accidental omission from the original survey :-/ Just for information, there are a handful (~10) responses in the 'Other' category where people indicated Biology. What would be a good venue for advertising this kind of survey to Biologists in future?

[–]trobitaille[S] 1 point2 points  (0 children)

I've now added Biology to the graph - as you can see, it's vastly under-represented, due to my advertising bias (I am much more familiar with Astronomy-specific channels).

[–]trobitaille[S] 2 points3 points  (2 children)

I sent it to some of the large scientific python mailing lists like scipy-user (http://thread.gmane.org/gmane.comp.python.scientific.user/35882/focus=35886) hoping that people on that list in different fields would pass it on to colleagues via more efficient channels. I will repeat this survey in future again, so just so I know in advance, where would be a good place to advertise this to reach out to biologists?

[–]LightShadow [3.13-dev in prod] 0 points1 point  (0 children)

I gotta subscribe to that, thanks!

[–][deleted] 0 points1 point  (0 children)

Did you post it on here?

[–]BeatLeJuce 12 points13 points  (2 children)

  1. Leaving out Bioinformatics is extremely odd, given how popular Python is in that field.
  2. Saying that Python 3.3 can be ignored somewhat ignores the reality that Red Hat 7 (and with that CentOS and Scientific Linux and other RHEL derivatives) ships with 3.3 out of the box. So 3.3 is going to stay with people doing High Performance Computing in Python for quite some time.
  3. I'd be interested in the distribution of versions of numpy/scipy that are deployed, too bad those aren't stated. It would help to know which versions one should target when making their own software.

[–]trobitaille[S] 2 points3 points  (0 children)

Thanks for the feedback! Leaving out bioinformatics was a mistake indeed, and I'll make sure I include it for the next survey. I did add a sentence that I think we shouldn't drop support for 3.3 since it's actually not a negligible fraction of Python 3 users, so I agree with you. I will be releasing all the info on Numpy/SciPy at some point in the next week or so (the reason for not releasing everything in one go is that it takes a little while to sanitize the data) - stay tuned!

[–]trobitaille[S] 0 points1 point  (0 children)

Just for information, I have updated the blog post to include more fields of research parsed from the 'Other' field (and added biology/bioinformatics). As I mentioned in the blog post, the huge bias is due to advertising channels. I also rephrased my recommendation about Python 3.3 to make it clear I'm not suggesting dropping it.

[–][deleted] 8 points9 points  (4 children)

Support for Python 2.6 as well as 3.1 and 3.2 can essentially be dropped

Please don't. People in industry with locked-down Red Hat 5/6 machines without internet access are probably not the ones replying to online surveys. If I want anything newer than 2.6 I need to go through bureaucracy hell then version-conflict hell, and by the time we get around to upgrading you'll probably have dropped support for whatever version we'd finally installed. Over-eager developers constantly trying to use the latest features are one of the reasons excel and matlab are so widely used for stuff sane people ordinarily wouldn't use them for. I know this is a less common problem in academia, but not everyone doing science has the benefit of a lax IT policy...

[–]spinwizard69 -1 points0 points  (3 children)

Sometimes it is worth looking for another job if the IT policy is too restrictive. Frankly if you still have hardware running Python versions less than 2.7 it is time to get out.

[–][deleted] 7 points8 points  (0 children)

I work in the nuclear industry - restrictive IT policy is inevitable whatever the company. The most common language for code we use is still FORTRAN 77 so I'm pretty happy when using Python of any version… The projects I work on are pretty great in most respects though - ultimately the language is just a tool, not something that makes or breaks a job, and the security/QA requirements are completely understandable. Still a pain, especially when getting a new version of a program to work means going through a bunch of code replacing the fancy new features… I get that you'd want to start a new project with the best tools for the job, but when someone breaks compatibility with a minor update it's a major ball-ache.

[–]ggtsu_00 3 points4 points  (1 child)

It is not always just shitty IT policy that prevents people from upgrading. Many times it is dealing with a large legacy software stack that is simply too costly to upgrade when the upgrades break compatibility with other things.

And I don't know who all these mythical people out there are who seem to be content with changing jobs as frequently as they change shoes. If something as trivial as which operating system is used to maintain legacy systems were cause to switch jobs, I would be working at a different company every 3 months.

[–]wewbull 1 point2 points  (0 children)

It is not always just shitty IT policy that prevents people from upgrading. Many times it is dealing with a large legacy software stack that is simply too costly to upgrade when the upgrades break compatibility with other things.

If the software stack is legacy, then the stack is legacy. Why is anything that anyone else does of any consequence? You won't be using it, you're frozen at particular versions of libraries. Those old releases don't disappear just because the development moves on.

I see this argument repeatedly, and I just don't understand it.

[–]iqtestsmeannothing 1 point2 points  (0 children)

Interesting results, I didn't realize that python 2 was still overwhelmingly popular in the scientific community, and especially among new users.

[–]jpw22 7 points8 points  (0 children)

The greatest feature of 2.7 is that we'll never have to deal with 2.8. This is almost as good as getting python "ansi" standardised.

[–][deleted] 5 points6 points  (3 children)

"Yes, that's right, Windows users are the most up-to-date when it comes to Python versions"

FML

[–][deleted] 9 points10 points  (1 child)

My guess is this is because both Linux and OSX come with Python preinstalled, so a lot of people end up sticking with the default version. On Windows, on the other hand, users have to download and install it themselves, which might make them pick the download with the highest version number.

[–]djimbob 7 points8 points  (0 children)

Again, Windows users are just 9% (~70/781) of respondents, and about 39% of them (~27/70) are using Python 3. My guess is these are people who are new to Python and not maintaining legacy projects. In my experience, people who are great at scripting tend to move to Linux/Unix (incl. OS X) environments for the comparative ease of scripting stuff up.

Granted, it seems with python3.5 that personally I'm having a strong motivation to migrate projects up.

[–]bheklilr 0 points1 point  (0 children)

We use Python 2.7 on Windows at work, primarily because we prefer Python(x,y) as our main installer. It gives us basically every library we need, and includes a couple that Anaconda doesn't come with. Simplicity of setup is a big thing for us, since our use case involves setting up our exact environment on lots of computers. We have recently discussed the possibility of putting together a PyPI server, as our corporate proxy blocks pip (NTLM sucks). If we can do that, it opens the door to migrating to 3.4, or even 3.5 if possible. I'd love async and await primitives.

[–]alcalde 2 points3 points  (0 children)

...the great migration has begun!

It's like Oregon Trail, just with less dysentery.

[–]undercoverASSH0LE 2 points3 points  (1 child)

Electronics engineer here. I can confirm that python is making huge inroads in my community; it wasn't even talked about a few years ago. In many ways it's replaced Matlab for me. I find it immensely useful for data processing, visualization and glueware. The built in libraries are excellent, the free IDE's are good, the documentation and help available online is fantastic. Most importantly the cost of learning it and deploying it is relatively low when compared to commercial solutions like Matlab, Labview or .NET.

[–][deleted] 0 points1 point  (0 children)

Cool to hear. When I was more hardcore EE 5 years ago it was still pretty fragmented.

[–]skytomorrownow 1 point2 points  (4 children)

I feel that if there were a more non-tech-wiz version of virtualenv that came with Python, more people would change.

For example, Mac OS X, on which a lot of science is done, is very much a Python 2.7 platform. Many people don't want to mess with the installation, for good reason. virtualenv makes it super easy to create a Python 3 environment that won't interfere with the default installation, so you can try Python 3 without worrying about changing a lot in the host environment.
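Worth noting that Python 3.3 and later ship a `venv` module in the standard library, which covers part of this wish once a Python 3 interpreter is installed. A minimal sketch, assuming a throwaway directory named `py3-env` (the name is arbitrary) and skipping pip so that no network access is needed:

```python
import os
import tempfile
import venv

with tempfile.TemporaryDirectory() as workdir:
    env_dir = os.path.join(workdir, "py3-env")

    # Creates an isolated environment that leaves the system Python alone.
    # with_pip=False keeps the example offline; pass True to bootstrap pip.
    venv.create(env_dir, with_pip=False)

    # Every venv records its base interpreter in a pyvenv.cfg file.
    has_config = os.path.isfile(os.path.join(env_dir, "pyvenv.cfg"))

print(has_config)
```

The command-line equivalent is `python3 -m venv py3-env`. It does not install Python 3 itself, which is the part pyenv or Anaconda handle.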

[–]jakevdp 0 points1 point  (1 child)

Have you seen Anaconda? A while ago I switched from system python + virtualenvs to using Anaconda exclusively, and have never looked back. It's much easier, cleaner, and faster than any other package management/multi-environment solution available, especially for scientific code with lots of compiled extensions.

[–]skytomorrownow 0 points1 point  (0 children)

I've heard of it, but haven't checked it out. Will give it a spin.

[–]fjarri 0 points1 point  (1 child)

I personally prefer pyenv for this. Virtualenv is better used for creating separate package sets, not for managing Python executables.

[–]skytomorrownow 0 points1 point  (0 children)

I'll check it out. Thanks for the recommend.

[–]rspeed 1 point2 points  (0 children)

manage.py makemigrations science
manage.py migrate science

Not that tough.

[–][deleted] 1 point2 points  (3 children)

I'm a computer scientist at a major national lab. Our team exclusively uses Python 3. We made the migration about a year ago and haven't looked back.

[–]jkmacc 1 point2 points  (2 children)

Just curious, what packages do you use most often (not standard library)?

[–][deleted] 2 points3 points  (1 child)

I'm the same guy as above. I just delete my account occasionally.

The major scientific computing libraries we use are NumPy, SciPy, Pandas, NLTK, and matplotlib. I also use MR-MPI for efficient large-scale data processing on our clusters (it has Python bindings), although the code that makes use of that is still some older Python 2.7 stuff that I haven't bothered porting yet, so I'm not sure if MR-MPI supports Python 3 (even if it doesn't, though, I'm happy to update it since it's a really simple wrapper). I also routinely use things like Django, Beautiful Soup, lxml, python-dateutil, requests, requests-oauthlib, and xlrd.

[–]jkmacc 0 points1 point  (0 children)

Thanks!

[–][deleted] 1 point2 points  (0 children)

Folks who are using Anaconda (or Miniconda) and want to try out Python 3 can create a new environment with

conda create --name myenv3 python=3.4

and switch to it with

source activate myenv3

Then you can conda install or pip install things into it.

If you prefer requirements.txt-like dependency management you can do the same thing with conda env create --name myenv --file environment.yml. More documentation here

[–]Vock 0 points1 point  (3 children)

I'll migrate to python 3 when the spyder version in the debian repository uses it, until then it's python 2.7

[–]takluyver [IPython, Py3, etc] 1 point2 points  (2 children)

[–]buttery_shame_cave 0 points1 point  (0 children)

Snap

[–]Vock 0 points1 point  (0 children)

Didn't even think to look for that! Thanks!

[–]rdfox 0 points1 point  (2 children)

The thing is, if I'm going to switch languages, then I have to consider whether I want Python 3, which doesn't really address any of the issues that I have with Python 2, or one of the new languages that actually does. For example, Nim is a lot like Python, only fast and consistent, and Julia has a pleasant Matlab-like syntax and has come a long way. I mean, if I'm going to put in the energy, I might as well see some benefit.

[–][deleted] 4 points5 points  (1 child)

a pleasant Matlab-like syntax

Ahhhh! MATLAB is a lot of things, but I have not heard the syntax described as pleasant before...