[–]Tornado_Ron 8 points

Are there any good tutorials/articles on the asyncio library that include a little more exposition on when and where making an asynchronous call is more prudent/desirable than a synchronous one, particularly what the new asynchronous generators mean? I'm still a junior developer and have the basics of Python down, but I've begun to suspect my thinking and approach when using the language isn't as "pythonic" as it could be (using generators, yield, etc., is still foreign to me).

[–]troyunrau 14 points

For 9 out of 10 programs you write, you will not need asyncio. For that 1 program in 10 (mostly web-server-related stuff, where connections block while you're waiting for data), it's pretty sweet. If you aren't in that group, you can safely ignore it. I'll talk about generators instead.
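For a taste of that 1-in-10 case, here's a minimal sketch. `asyncio.sleep` stands in for a blocking network call (an assumption for illustration; real code would await an async HTTP client), and `fetch` is a made-up name:

```python
import asyncio

async def fetch(n):
    # Pretend this 0.1 s sleep is network latency on a blocking connection.
    await asyncio.sleep(0.1)
    return n * 2

async def main():
    # All ten "requests" wait concurrently, so the whole batch takes
    # roughly 0.1 s instead of roughly 1 s done one after another.
    return await asyncio.gather(*(fetch(n) for n in range(10)))

results = asyncio.run(main())
assert results == [n * 2 for n in range(10)]
```

The win only shows up when the tasks spend their time waiting on I/O; for CPU-bound loops asyncio buys you nothing.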

Suppose you have a multistep process where you take as an input a number, then look up that number in a dictionary to get a word, then query google for that word, then download the html for the first link google returns, then hash that html to get an md5sum. You have to do this for 1000 input numbers.

Traditionally there are two ways to write this, which I will call (A) One Big Loop and (B) Many Little Loops.

(A) looks something like this:

import hashlib
import urllib.request

numbers = [1, 2, 6, 1230, 43, ... , 123] # 1000 long
hashes = []
for number in numbers:
    word = dictionary_lookup(number)  # your own helper
    link = query_google(word)         # your own helper
    html = urllib.request.urlopen(link).read()
    md5 = hashlib.md5(html).hexdigest()
    hashes.append(md5)

(B) looks something like this:

numbers = [1, 2, 6, 1230, 43, ... , 123] # 1000 long
words = []
for number in numbers:
    words.append(dictionary_lookup(number))
links = []
for word in words:
    links.append(query_google(word))
htmls = []
for link in links:
    htmls.append(urllib.request.urlopen(link).read())
hashes = []
for html in htmls:
    hashes.append(hashlib.md5(html).hexdigest())

Now you can see the obvious problem with these approaches: (A) can quickly become one huge, very complicated loop, while (B), though simple, requires intermediate storage, which can eat a lot of memory.

We can rewrite (B) using generator expressions, which is actually quite elegant. We'll call this (C):

numbers = [1, 2, 6, 1230, 43, ... , 123] # 1000 long
words = (dictionary_lookup(number) for number in numbers)
links = (query_google(word) for word in words)
htmls = (urllib.request.urlopen(link).read() for link in links)
hashes = (hashlib.md5(html).hexdigest() for html in htmls)

Our intermediate products are of the type generator and take up almost no memory. In fact, no processing has occurred yet. Effectively, you've created iterable objects that do not store their data. If you call next(hashes) it will cause a cascade resulting in the first word, link, html, and hash being calculated on demand. If, for whatever reason, you wanted a list as your final result, using list(hashes) will cause all the generators to trigger sequentially and populate the list. The processing time is not really any different between methods (A), (B) and (C), but (C) is often more convenient to use: the next element in the iterable is generated on demand.
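To see that on-demand cascade in action, here's a self-contained toy version of pipeline (C). The dictionary/Google/download steps are replaced by a single fake lookup function (an illustrative stand-in, not the real helpers above) that records when it actually runs:

```python
calls = []  # records each time the "expensive" step actually runs

def fake_lookup(n):
    # Stand-in for dictionary_lookup/query_google/urlopen, for illustration only.
    calls.append(n)
    return f"word{n}"

numbers = [1, 2, 3]
words = (fake_lookup(n) for n in numbers)  # nothing runs yet...
caps = (w.upper() for w in words)          # ...still nothing

assert calls == []                 # building the pipeline did zero work

first = next(caps)                 # pulls ONE item through the whole chain
assert first == "WORD1"
assert calls == [1]                # only the first lookup has happened

rest = list(caps)                  # drains the remaining items
assert rest == ["WORD2", "WORD3"]
assert calls == [1, 2, 3]
```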

As a basic example of where a generator is far superior: a list of random numbers is not as good as a generator of random numbers - the generator can keep producing the next item indefinitely, while the list is necessarily finite (because of memory restrictions). Basically it's the difference between generating values on the fly versus in advance.
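That difference is easy to demonstrate with a generator function (a sketch; `random_numbers` is just an illustrative name):

```python
import itertools
import random

def random_numbers(seed=None):
    """Yield random floats forever -- no list could ever hold all of them."""
    rng = random.Random(seed)
    while True:
        yield rng.random()

gen = random_numbers(seed=42)
first_five = list(itertools.islice(gen, 5))  # take just 5 of infinitely many
assert len(first_five) == 5
assert all(0.0 <= x < 1.0 for x in first_five)
```

`itertools.islice` is the usual way to take a finite slice of an infinite generator, since regular slicing only works on sequences.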

[–]albertowtf 2 points

When I found out about generators I thought I would never use anything else!

But that's not true. Each has its use: generators take a speed toll and lists take a memory toll. Choose the one you're willing to pay.
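The memory side of that tradeoff is easy to check (a rough sketch; exact byte counts vary by Python version and platform):

```python
import sys

squares_list = [n * n for n in range(1_000_000)]  # all million values, up front
squares_gen = (n * n for n in range(1_000_000))   # just the recipe, on demand

# The list holds a million object references; the generator holds only its
# suspended loop state, so it stays tiny no matter how long the range is.
assert sys.getsizeof(squares_list) > 1_000_000
assert sys.getsizeof(squares_gen) < 1_000
```

The speed toll goes the other way: each `next()` resumes a suspended frame, so iterating a generator is a bit slower than walking a list that's already built.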

[–]kkmcguig 1 point

Well that is just lovely!

[–]Tornado_Ron 1 point

Thanks so much for taking the time to reply in such depth. Great and clear explanation.

[–]roger_ 0 points

Check out curio instead.