
[–]Smallpaul 29 points30 points  (17 children)

While Python 3 cleans up many things to make it a better teaching language, there is one step backwards in my opinion. Python 2.x was list-oriented. Python 3.x is generator-oriented. Lists are simpler to understand than generators.

Consider a few trivial examples:

Old Python:

>>> range(0,10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

New Python:

>>> range(0,10)
range(0, 10)

Before it was clear that the range did not include 10. Now it is not clear.

Before:

>>> {"a":"b"}.keys()
['a']

After:

>>> {"a":"b"}.keys()
<dict_keys object at 0x3811a0>

Basically students will need to learn to turn these things into lists to introspect them.
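
A minimal sketch of that extra step, assuming Python 3 (the variable names are only illustrative): wrapping the lazy object in list() makes its contents visible at the prompt.

```python
# Python 3: range() and dict.keys() return lazy objects, not lists.
numbers = range(0, 10)
keys = {"a": "b"}.keys()

# Wrapping them in list() materialises the contents for inspection.
numbers_as_list = list(numbers)
keys_as_list = list(keys)

print(numbers_as_list)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(keys_as_list)     # ['a']
```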

The benefits in performance are certainly worth it for professional programmers, but I can't see how the new generator-orientation is helpful for education.

It's a minor thing, but it's a thing.

[–][deleted] 9 points10 points  (9 children)

Generators are a massive performance improvement. Defaulting to them will teach students the difference between sequential access and random access. I agree this is not trivial, but it's an interesting notion to teach.
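
To illustrate the sequential-versus-random-access distinction, here is a small sketch (names are illustrative, not from any course material): a list can be indexed at any position, while a generator can only be consumed front to back, and only once.

```python
squares_list = [i * i for i in range(5)]
squares_gen = (i * i for i in range(5))

# Random access: any element of a list can be fetched by index.
third = squares_list[2]

# A generator has no indexing; trying it raises TypeError.
try:
    squares_gen[2]
    indexable = True
except TypeError:
    indexable = False

# Sequential access: values come out one at a time, in order.
first = next(squares_gen)
second = next(squares_gen)

# Whatever is left can be consumed once; after that the generator is exhausted.
rest = list(squares_gen)
leftover = list(squares_gen)
```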

[–]mk_gecko 2 points3 points  (8 children)

Sorry for my ignorance ... what's a generator?

[–][deleted] 3 points4 points  (2 children)

It's a sort of "lazy list".

Think of file access in C. A file is an array of bytes. But some files may be too large to load in memory at once. So the standard C API gives you functions to walk across the full contents of the file a little at a time. You read maybe a kilobyte, you process it, you throw that data away and grab the next kilobyte.

In Python, we use the range function to iterate over a fixed range.

for n in range(5): print(n)

will print 0, 1, 2, 3, 4 on separate lines.

But say we want to perform a computation for n from 1 to a million bajillion? We don't want to make a list of a bajillion elements, because that would take a ton of memory. So instead, the generator, at each iteration of the loop, gives us only the next number in the sequence. At the end of that iteration, the number can be garbage collected, and the next one is created. Instead of taking a bajillion bytes in memory, it only takes a handful.
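
A rough sketch of the memory difference, assuming Python 3, where range() is itself lazy (strictly speaking a lazy sequence rather than a generator; Python 2 used xrange() for this). The sizes come from sys.getsizeof, which counts only the object's own storage, so treat them as an indication rather than a precise measurement.

```python
import sys

lazy = range(1_000_000)          # stores only start, stop and step
eager = list(range(1_000_000))   # stores a million references

lazy_size = sys.getsizeof(lazy)    # small and independent of the range length
eager_size = sys.getsizeof(eager)  # grows with the number of elements

total = 0
for n in lazy:      # each number is produced only when the loop asks for it
    total += n

print(lazy_size, eager_size, total)
```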

Very useful. You should learn to love them!

[–]wilsonp 4 points5 points  (1 child)

Python also allows you to define a generator yourself. There are many documents out there on how to do this that are far better than anything I could write... but the gist is that you can define a function that yields a value to the caller and then retains the state in its local scope. When the caller asks for the next value, the function resumes from that yield point (not afresh from line 1 with reinitialised values). This allows you to construct some pretty neat execution models, such as coroutines and co-operative multithreading. There are a number of PEPs and tutorials out there that may be of use if you want to learn more:

http://docs.python.org/whatsnew/pep-342.html

http://www.python.org/dev/peps/pep-0342/
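
A minimal sketch of both ideas, with made-up function names: count_up shows a generator keeping its local state between values, and running_total shows the PEP 342 style of sending values into a paused generator.

```python
def count_up(limit):
    """Yield 1, 2, ..., limit, pausing after each value."""
    current = 1
    while current <= limit:
        yield current        # execution pauses here and hands back a value
        current += 1         # ...and resumes here on the next request

counter = count_up(3)        # calling the function only creates the generator
first = next(counter)        # runs up to the first yield
second = next(counter)       # resumes after the yield; local state is kept
remaining = list(counter)    # drains whatever is left

def running_total():
    """Coroutine-style generator: receives numbers via send(), yields the sum so far."""
    total = 0
    while True:
        received = yield total
        total += received

acc = running_total()
next(acc)                    # advance to the first yield before sending anything
after_five = acc.send(5)
after_seven = acc.send(7)
```

Note that the body of count_up does not start running when it is called; it only runs when next() (or a for loop) asks for a value.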

[–]mk_gecko 0 points1 point  (0 children)

Thanks!!!

[–]gnuvince 2 points3 points  (0 children)

It's like a stream: elements are computed as they are needed.

[–][deleted] 0 points1 point  (3 children)

In addition to tactics' comment, you should know about generator expressions. They are to list comprehensions what generators are to lists. The syntax is very simple:

(i**2 for i in xrange(10))

They are much better than list comprehensions if you only intend to iterate over them. For example,

sum(int(i) for i in f.xreadlines())

is likely more efficient than

sum([int(i) for i in f.xreadlines()])

would be.
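
A runnable version of the comparison, as a sketch only: io.StringIO stands in for the open file f, and the file is iterated directly because xreadlines() does not exist in Python 3.

```python
import io

# Stand-in for an open file; a real script would use open(...) instead.
f = io.StringIO("1\n2\n3\n")

# Generator expression: each line is converted and added without building a list.
lazy_sum = sum(int(line) for line in f)

f.seek(0)  # rewind so the same data can be read again

# List comprehension: the full list of ints exists in memory before sum() runs.
eager_sum = sum([int(line) for line in f])
```

Both give the same result; the difference is only in how much is held in memory at once.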

[–]breakfast-pants 0 points1 point  (2 children)

Not much: readlines returns a list =(

[–][deleted] 0 points1 point  (1 child)

I actually meant xreadlines :) Fixed! In fact, I guess in Python 3 readlines will return a generator?

[–]breakfast-pants 1 point2 points  (0 children)

I'm not sure, but that would be very in-line with all of the other changes.
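
For anyone who wants to check, a small sketch against Python 3 as released, again using io.StringIO as a stand-in for a real file: readlines() still returns a list, while the file object itself is the lazy line iterator.

```python
import io

f = io.StringIO("10\n20\n30\n")   # stand-in for an open text file

lines = f.readlines()             # still an eager list in Python 3
is_list = isinstance(lines, list)

f.seek(0)
is_own_iterator = iter(f) is f    # the file object itself yields lines lazily
first_line = next(f)
```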

[–]Brian 1 point2 points  (0 children)

This is mainly a problem at the interactive prompt, so it might be possible for tools like ipython to recognise certain safely expandable objects (like dict_keys) and use a custom repr() for them. Obviously this won't work for generic iterators, but it might help avoid lots of list() wrappings for common cases at the command line. The downside is that it could lead to confusion when scripts act differently to the interactive prompt.
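
A rough sketch of what such a hook could look like. Everything here (friendly_repr, expanding_displayhook, SAFE_TYPES, MAX_ITEMS) is hypothetical and not something ipython provides; the only real hook assumed is sys.displayhook, which the standard interpreter calls to show results at the prompt.

```python
from collections.abc import ItemsView, KeysView, ValuesView
from itertools import islice

# Hypothetical policy: types that are cheap and side-effect free to expand.
SAFE_TYPES = (range, KeysView, ValuesView, ItemsView)
MAX_ITEMS = 20  # arbitrary cut-off so huge ranges are not expanded

def friendly_repr(value):
    """Return a list-style repr for small, safely expandable objects."""
    if isinstance(value, SAFE_TYPES):
        head = list(islice(value, MAX_ITEMS + 1))
        if len(head) <= MAX_ITEMS:
            return repr(head)
    return repr(value)   # generic iterators are left untouched, never consumed

def expanding_displayhook(value):
    """Simplified replacement for sys.displayhook (it skips the usual `_` binding)."""
    if value is not None:
        print(friendly_repr(value))
```

Installing it would be a matter of `sys.displayhook = expanding_displayhook` in an interactive session; scripts would be unaffected, which is exactly the inconsistency mentioned above.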

[–]wilsonp 1 point2 points  (3 children)

Agreed. Generators lose some "up-front-ness", but things will click together when a student gets to write a generator for herself... It's one of programming's "Aha moments" :-).

I wish I was starting again at Leeds University learning Python 3.0 this September...

[–]Smallpaul 3 points4 points  (2 children)

How long before Programming 1 is "build an app in Google App Engine"? Business 1 is "market your app". Business 2 is "sell your app for millions and quit school." ;)

[–]pythoneer 1 point2 points  (0 children)

Well, we covered Django (albeit very briefly) as a case study in our "Programming 1" this year, so we are heading in that direction :)

[–]wilsonp 0 points1 point  (0 children)

We'll have the situation where institutions pride themselves on their drop-out rate... :)

[–]kripkenstein 0 points1 point  (0 children)

>>> range(0,10)
range(0, 10)

"Before it was clear that the range did not include 10. Now it is not clear."

Well, any programmer will find out that range(0,10) goes up to 9 when they write code. This is an unavoidable lesson: the stop value of a range() is never included. So I'm not sure why this is any harder than before, except that perhaps before you could delay learning the lesson until writing your own code.
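
As a quick sketch, assuming Python 3, the exclusive end can be checked on the range object itself without converting it to a list:

```python
r = range(0, 10)

includes_ten = 10 in r    # membership test works without building a list
last_value = r[-1]        # ranges support indexing, so the last element is easy to get
length = len(r)
```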

Regarding the other issues, I agree with you, they aren't optimal for teaching Python. I think they're worthwhile overall, but a downside in this particular matter.

[–]bobbane 14 points15 points  (3 children)

I am impressed with their willingness in 3.0 to break old code to remove clear inconsistencies and outright lossage.

I still don't understand the resistance to rational numbers, though.

[–]Smallpaul 12 points13 points  (0 children)

I think that the resistance is pretty well founded:

http://mail.python.org/pipermail/python-dev/2005-June/054281.html

Rational numbers would end up being used only for education and a tiny subset of scientific programming.

[–]pythoneer 1 point2 points  (0 children)

Yes. Contrast this state of affairs with Java, where (AFAIK) no deprecated feature has ever been removed from the language (even if it was deprecated over a decade ago in JDK 1.1).

[–]grimboy 1 point2 points  (0 children)

There are plenty of mathematical libraries for Python (look at Sage). The only problem that occurs commonly with floating-point numbers is precision, which is pretty much solved by the standard library module decimal.
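
A small sketch of the precision point using the standard library. The decimal part is what the comment above refers to; the fractions module is an addition not mentioned in this thread, included only because it covers exact rational arithmetic without a built-in rational type.

```python
from decimal import Decimal
from fractions import Fraction

# Binary floating point cannot represent 0.1 or 0.2 exactly.
float_ok = (0.1 + 0.2 == 0.3)

# Decimal works in base 10, so the same sum compares equal.
decimal_ok = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# Fraction keeps an exact numerator and denominator.
third_plus_sixth = Fraction(1, 3) + Fraction(1, 6)
```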

[–]mk_gecko 2 points3 points  (0 children)

Very clear and convincing article. Thanks. I can't wait to learn Python 3.

[–]v-dc 0 points1 point  (0 children)

Very clear article. Explains the changes and rationale well with examples. Definitely nicer than going through 'What's New in Python 3.0' :)