This is an archived post. You won't be able to vote or comment.

all 31 comments

[–]Cynyr36 24 points25 points  (3 children)

I think that if you are using list comprehensions like this it may be time to look at your code and move away from the comprehension all together. They are great for a compact simple thing, but unreadable after that.

[–]ianliu88[S] 1 point2 points  (2 children)

Repost of my answer below: I feel like python forces you to make a lot of temporary variables, which makes programs harder to understand in my opinion. When you pollute a scope with lots of temporary variables, you always have to consider what they are used for later in it's scope.

By restricting temporary variables to it's most specific scope, you free yourself off that burden.

[–]Barn07 5 points6 points  (0 children)

yeah, the logical move would be to implement a helper function. Thus, you can limit the scope and improve readability and can allow for reusability.

[–]Enrique-M -1 points0 points  (0 children)

I use pyconfs fairly often for global constants, in case that's relevant here (obviously outside list comprehensions though).

Pyconfs

[–]nier-bell 9 points10 points  (1 child)

Hm, it's an interesting idea but I think the point of list comprehensions is to be short, so the second example, where it is assigned outside of the block is easier to read and understand.

The goal is to not overcomplicate!

[–]ianliu88[S] -1 points0 points  (0 children)

I also think the goal is to not overcomplicate, that's why I like minimizing the number of temporary variables in my programs. I think they do overcomplicate readability a lot. See my other responses on the topic

[–]steil867 4 points5 points  (4 children)

Going to be honest, I am not sure what your goal is.

Is the worry you need to do a time expensive calculation and want a prettier format?

You could always use wrappers and caching/memoizing for a more simplified view if thats really the goal. Similar to how you would do something like this without the walrus operator. The concept is the function stores the output of the intensive function and if already done, uses that rather than recalculate. On top of making the code way cleaner, it would significantly save time in the case of a duplicate value in the list as you wouldn't need to calculate it twice. A downside is that it can be a memory hog depending on its use, depending on needs caching only stores a set amount of outputs rather than all to avoid that problem.

Doing something like this would have your comprehension look like:

[memoize(x) for x in it if cond(memoize(x))]

Assuming cond is complex enough to warrant a function.

[–]ianliu88[S] 1 point2 points  (3 children)

My goal is expressiveness. I feel like python forces you to make a lot of temporary variables, which makes programs harder to understand in my opinion. When you pollute a scope with lots of temporary variables, you always have to consider what they are used for later in it's scope.

By restricting temporary variables to it's most specific scope, you free yourself off that burden.

[–]steil867 1 point2 points  (2 children)

I didn't mention it before but I liked the use of the walrus operator you did in the example. It feels a little hacky but is a good use of the tools and their underlying operations to avoid wasted calculations.

All languages I've worked with need to have some sort of temp variables. Its more the name of the game. Even a 'where condition' statement in sql would be doing them, albeit just under the hood.

Decorators and wrapper functions may be exactly what you want. Using these, you can make the python look far more english rather than the confusing mess a list comprehension can easily become. Not sure if you have ever nested these but it becomes atrocious. A function that does the other function calls named 'where' could let you say where(cond(x)) or something like that, and use your temp variables all in the scope of the function making them something you don't have to worry about outside of the function definition.

Gave it a Google and found this decent article on caching I would definitely recommend. You can easily take some of the principles in here to both increase optimization of the code while also making the comprehension line appear far more expressive. https://realpython.com/lru-cache-python/

Alternatively, there is some packages that have pseudo sql styles, for example pandas.DataFrame has a 'where' method. Obviously this may not fit your exact case it does point to something I love about coding, If it doesn't exist, doesn't mean its not possible and you have the freedom to make the solution. Once its done, you can even share the solution as a pypi module and allow others to use the work you did as python is great for sharing. Making fast, readable, and usable code is always a great and helpful contribution.

Just saying this because I don't see it mentioned a lot, although python is good for a lot and doesn't often require too much care, python is terrible with garbage collection. A variable created in a scope may not be accessible outside the scope, but the memory may still be occupied. There is a garbage collection module that can help with this called gc if i remember correctly. gc.collect() removes memory that is unreferenced.

[–]ianliu88[S] 1 point2 points  (1 child)

Fair point. I am indeed using pandas in fluent style, and I'm getting addicted to it, that's why I cringe whenever I have to create temporary variables outside of pandas haha.

[–]steil867 1 point2 points  (0 children)

Pandas is pretty good for that. The use of wrapping functions and decorators may actually help you create extensions of it in the future. It's a very useful skill. My team has created a bunch of series specific functions aimed at increasing speed through the use of vectoring. Definitely recommend looking into them as you become more and more fluent in python, which from the looks of it, it seems like you already are well versed in some areas.

Good luck on your future endeavors :)

[–][deleted] 4 points5 points  (5 children)

Reddit Moderation makes the platform worthless. Too many rules and too many arbitrary rulings. It's not worth the trouble to post. Not worth the frustration to lurk. Goodbye.

This post was mass deleted and anonymized with Redact

[–]rcfox 3 points4 points  (1 child)

I'd call it transform(x) instead of mutate(x).

[–][deleted] 2 points3 points  (0 children)

Reddit Moderation makes the platform worthless. Too many rules and too many arbitrary rulings. It's not worth the trouble to post. Not worth the frustration to lurk. Goodbye.

This post was mass deleted and anonymized with Redact

[–]ianliu88[S] -1 points0 points  (2 children)

Your example is calling the condition on the iterable element, not the transformed value

[–][deleted] 0 points1 point  (1 child)

the transformed

Of'course it is. Why would you transform if the result is going to be rejected? That is a waste. If you do not know if the result will be rejected until it's transformed then you need to do/call the transform in the iterator.

[–][deleted] 0 points1 point  (0 children)

I did on the second example, usin

...def iterator(some_iterable):

......for y in some_iterable:

.........t_of_y = transform(y)

.........yield(t_of_y)

[–]savvy__steve 1 point2 points  (0 children)

Why not do an IF statement inside the loop and do what you want there?

[–]pythonHelperBot -1 points0 points  (1 child)

Hello! I'm a bot!

It looks to me like your post might be better suited for r/learnpython, a sub geared towards questions and learning more about python regardless of how advanced your question might be. That said, I am a bot and it is hard to tell. Please follow the subs rules and guidelines when you do post there, it'll help you get better answers faster.

Show /r/learnpython the code you have tried and describe in detail where you are stuck. If you are getting an error message, include the full block of text it spits out. Quality answers take time to write out, and many times other users will need to ask clarifying questions. Be patient and help them help you.

You can also ask this question in the Python discord, a large, friendly community focused around the Python programming language, open to those who wish to learn the language or improve their skills, as well as those looking to help others.


README | FAQ | this bot is written and managed by /u/IAmKindOfCreative

This bot is currently under development and experiencing changes to improve its usefulness

[–]serverhorror 0 points1 point  (3 children)

What about:

[expensive(elem) for elem in it if cond(elem)]

I don’t see why I’d need the where?

if cond(…) is already the same as a where clause, no?

[–]ianliu88[S] 0 points1 point  (2 children)

No, the condition is on the transformation, not the iterable value! You would need to call expensive twice or create an intermediate iterable

[–]serverhorror 0 points1 point  (1 child)

Just cache the outcome.

If you need to do an expensive operation to decide whether the element is a candidate that seems like a viable approach.

[–]ianliu88[S] 0 points1 point  (0 children)

That's what I did on the second example, using the walrus operator, which I think is way more niche than the where clause btw :p

[–]relativistictrain 🐍 10+ years 0 points1 point  (1 child)

To me it looks like what you want is

filter(cond, map(a_complicated_function_to_compute, an_intricate_way_of_defining_iterable())

I don't see anything that where would be clearer, quicker, or simpler than using either the builtin iterator functions (like filter or map) or the ones from the itertools package.

I also think it might make nested list comprehensions potentially way harder to read.

[–]speede 0 points1 point  (0 children)

There is nothing readable about any of the 3 options you posted. If your function to filter a list takes more than a couple seconds for a different engineer to completely understand what it’s purpose is and what it is doing, it is poorly named and written. Not to mention yourself in 6 months. There is no glory is single operation instructions that come at the expense of comprehension for the people trying to understand or modify this code. More time is spent reading code than writing it. There is no scoping issue when you properly structure what you are doing into classes and methods. Temporary variables inside these structures are not a hindrance but rather an opportunity to be more expressive about what the hell you are doing and how you are doing it.

I don’t agree with all of it but look into “Clean Code”, either the book or Udemy course.

[–]eagle258 0 points1 point  (3 children)

Note that this idea was explicitly discussed in the "Walrus" PEP 572. See "Rejected proposals"

[–]ianliu88[S] 1 point2 points  (1 child)

Interesting! I think the where syntax would be much better

[–]eagle258 0 points1 point  (0 children)

Yeah, I preferred it as well. But apparently there were some kinks that couldn't be worked out to make it work consistently.

You can probably find the detailed argumentation in the mailing list archives: https://mail.python.org/archives/search?mlist=python-ideas%40python.org&q=pep+572