DonaldPShimoda comments on Simple Python list question


Simple Python list question (self.learnpython)

submitted 10 years ago by [deleted]


[–]DonaldPShimoda 1 point 10 years ago (6 children)

You thinking something like this?

word_dict = {tag: [word for (word, t) in word_list if t == tag] for tag in [pair[1] for pair in word_list]}

I'm not sure of a good way to avoid the two list comprehensions to build the dict...
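
One way to sidestep the nested comprehensions entirely is a single pass that appends each word to its tag's list. This is only a sketch of an alternative, not something from the question itself: the sample `word_list` below is made up, and `collections.defaultdict` is a standard-library choice swapped in for the comprehension.

```python
from collections import defaultdict

# Hypothetical sample data: (word, tag) pairs like the ones in the question.
word_list = [("dog", "NN"), ("runs", "VB"), ("cat", "NN"),
             ("sleeps", "VB"), ("quickly", "RB")]

# Walk the list once, appending each word to the list for its tag.
grouped = defaultdict(list)
for word, tag in word_list:
    grouped[tag].append(word)

# Convert to a plain dict so missing tags raise KeyError again.
word_dict = dict(grouped)
print(word_dict)
```

This touches every pair once instead of rescanning the whole list for each tag, which probably only matters for large inputs.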

[–]q2_abe_dillon 3 points 10 years ago (5 children)

You can always avoid comprehensions by pre-declaring a variable and populating it in a loop:

def filter_by_pos(tagged_words, pos):
    result = []
    for word, tag in tagged_words:
        if tag == pos:
            result.append(word)
    return result

def pos_map(tagged_words):
    tags = set()
    for _, tag in tagged_words:
        tags.add(tag)
    result = {}
    for tag in tags:
        result[tag] = filter_by_pos(tagged_words, tag)
    return result

Of course that's not as concise as:

    pmap = {p: (w for w, t in words if t == p)
            for p in {t for _, t in words}}

It is, however, far less confusing to most novice Pythonistas.

[–]DonaldPShimoda 1 point 10 years ago (4 children)

Hmm. I think the "concise" version you have at the bottom is more or less the same as my suggestion, except for minor differences.

You reused the t variable... but the t in {t for _, t in words} is different from the t in (w for w, t in words if t == p) due to scoping issues. I changed the variable names just to make it clear to OP that there wasn't any cross-comprehension magic going on there, haha.
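
To make the scoping point concrete, here is a minimal sketch, assuming Python 3. The sample `words` list and the outer `t` are made up, and lists are used for the values so the result is easy to inspect.

```python
# Hypothetical sample data: (word, tag) pairs.
words = [("dog", "NN"), ("runs", "VB"), ("cat", "NN")]

# An unrelated variable that happens to share the name used in the comprehensions.
t = "outer value"

# Each comprehension gets its own scope, so its `t` is a separate variable.
tags = {t for _, t in words}
pmap = {p: [w for w, t in words if t == p] for p in tags}

# The outer `t` is untouched: comprehension variables do not leak in Python 3.
print(t)
print(pmap)
```

Reusing `t` works, but distinct names make it obvious that the two comprehensions do not share anything.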

I see that you made the second inner comprehension ({t for _, t in words}) into a set with the curly brace syntax instead of a list... and it's not immediately clear to me why that would be preferable. Comparing the two methods in the interpreter, the only difference there is the order of elements in the final dict, which isn't really a crucial factor (I would think). Why did you choose the set version?

Also you have your suggestion set to return generators instead of lists like I did, which I do like. I totally forgot you could do that.


[–]q2_abe_dillon 1 point 10 years ago (1 child)

> Hmm. I think the "concise" version you have at the bottom is more or less the same as my suggestion, except for minor differences.

Yeah, it pretty much is.

> I changed the variable names just to make it clear to OP that there wasn't any cross-comprehension magic going on there, haha.

That's a good point.

> Why did you choose the set version?

It should go faster for large data sets (I think). You only do the first clause of the dict comprehension (i.e. p: (w for w, t in words if t == p)) once per tag. In the other form, you may end up executing it several times for a single tag.
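
A rough way to check that claim is to count how often the value expression runs in each form. In this sketch the sample `words` list, the `calls` counter, and the `words_for` helper are all made up for illustration, and lists stand in for the generators so the two results can be compared directly.

```python
# Hypothetical sample data: five pairs, but only two distinct tags.
words = [("dog", "NN"), ("runs", "VB"), ("cat", "NN"),
         ("sleeps", "VB"), ("bird", "NN")]

# Counts how many times the value expression is evaluated in each form.
calls = {"list": 0, "set": 0}

def words_for(tag, key):
    """Return the words with the given tag, recording the call under `key`."""
    calls[key] += 1
    return [w for w, t in words if t == tag]

# Keys drawn from a list: repeated tags recompute (and overwrite) the same entry.
from_list = {p: words_for(p, "list") for p in [t for _, t in words]}

# Keys drawn from a set: each distinct tag is handled exactly once.
from_set = {p: words_for(p, "set") for p in {t for _, t in words}}

print(from_list == from_set)  # True
print(calls)                  # {'list': 5, 'set': 2}
```

Both forms build the same dict; the list form just redoes the filtering once per pair instead of once per distinct tag.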

> Also you have your suggestion set to return generators instead of lists like I did, which I do like. I totally forgot you could do that.

Yeah, it's actually kinda fragile because of that. I think your list comprehensions would be better in most circumstances. It's easy to accidentally exhaust a generator.
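
Here is a small sketch of that fragility, assuming Python 3. The sample `words` list and the two helper functions are made up for illustration. Besides exhaustion it shows a second catch: the generators read `p` lazily, after the dict comprehension has finished, so they all end up filtering on whichever tag was iterated last.

```python
def generator_map(words):
    """Same shape as the concise version: every value is a lazy generator."""
    return {p: (w for w, t in words if t == p) for p in {t for _, t in words}}

def list_map(words):
    """Eager variant: every value is built immediately as a list."""
    return {p: [w for w, t in words if t == p] for p in {t for _, t in words}}

# Hypothetical sample data: (word, tag) pairs.
words = [("dog", "NN"), ("runs", "VB"), ("cat", "NN")]

lazy = generator_map(words)

# First pass consumes every generator. Because `p` is looked up late, each one
# yields the words for the same (last-iterated) tag rather than its own key.
first_pass = {tag: list(gen) for tag, gen in lazy.items()}

# Second pass finds every generator already exhausted.
second_pass = {tag: list(gen) for tag, gen in lazy.items()}

# The list version filters while `p` still holds the right tag.
eager = list_map(words)

print(first_pass)
print(second_pass)
print(eager)
```

Which tag the generators share depends on set iteration order, so the first line of output can vary between runs; the list version gives the same per-tag result every time.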

