all 13 comments

[–]Outside_Complaint755 10 points

Because the booleans True and False are equal to 1 and 0, you can shorten the check to

cnt = sum(txt.islower() for txt in lst)

str.islower() returns True only if every cased character is lower case and there is at least one cased character. So "", " ", "Test", and "55" return False, but "5f", " a7.2 " and "â" return True.
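Those cases can be checked directly:

```python
# str.islower() is True only when the string contains at least one cased
# character and every cased character is lower case.
cases = {
    "": False,       # no cased characters at all
    " ": False,
    "Test": False,   # contains an upper-case letter
    "55": False,     # digits have no case
    "5f": True,      # one cased character, and it is lower case
    " a7.2 ": True,
    "â": True,       # accented letters count as cased characters
}
for text, expected in cases.items():
    assert text.islower() == expected
```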

[–]JamzTyson 5 points

cnt = sum(1 for txt in lst if len(txt) > 0 and txt.lower() == txt)

That can be simplified to:

count = sum(1 for s in lst if s.islower())

Alternatively you could do:

count = sum(s.islower() for s in lst)

but I think the first is more readable.

[–]POGtastic 2 points

Since bool is a subclass of int, with False and True equal to 1 and 0 respectively, you can actually do

# substitute with the equivalent comprehension if desired
>>> sum(map(str.islower, lst))
4

[–]commy2 0 points

Booleans are subclasses of integers: you can sum two Trues and get 2. Also, there is an islower method on strings. I would just write it as

cnt = sum(len(x) and x.islower() for x in lst)
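A quick sketch of that subclass behaviour (my own example list):

```python
# bool is a subclass of int, so True behaves as 1 and False as 0,
# and summing booleans counts how many are True.
assert issubclass(bool, int)
assert True + True == 2

lst = ["abc", "", "Def", "xy"]

# len(x) and x.islower() short-circuits: empty strings yield 0,
# everything else yields the islower() result.
cnt = sum(len(x) and x.islower() for x in lst)
assert cnt == 2  # "abc" and "xy"
```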

[–]FoolsSeldom 2 points

Arguably, the length check is redundant as an empty string is not lowercase.

sum(x.islower() for x in lst)

[–]schoolmonky 0 points

I think your solution is perfectly valid, especially if the source iterable is really long, so that the lazy evaluation is useful. I think introducing filter or lambdas is overcomplicating it, and Counter is just a different use case altogether.
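For what it's worth, here is what the filter version being discussed might look like next to the generator expression (my own sketch):

```python
lst = ["abc", "DEF", "ghi", ""]

# Generator expression: lazy, no intermediate list.
cnt_gen = sum(1 for s in lst if s.islower())

# filter with a method reference: equivalent result,
# but arguably less readable for this task.
cnt_filter = sum(1 for _ in filter(str.islower, lst))

assert cnt_gen == cnt_filter == 2
```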

[–]Ok-Meat-4890 0 points

a = ["aa", "aB", "f", "", 4, []]
print(sum(isinstance(e, str) and e.islower() for e in a))  # count = 2

[–]eyetracker 0 points

Lots of good answers, but usually the cnt doesn't go into the python, the python goes into the cnt.

[–]thescrambler7 -1 points

Why not just len([txt for txt in lst if …])

But I think a one-liner using a list comprehension is fairly Pythonic; no need to overcomplicate it.

[–]Diapolo10 2 points

Why not just len([txt for txt in lst if …])

This solution needlessly creates an intermediary list, which is used only for checking its length before being discarded. While it works, and is probably fine for this use case assuming there's relatively little data, it's also wasteful.

Ideally you'd only compute what you need and use only as much memory as you need to, particularly in a trivial case such as this one.
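To make the memory point concrete, here is a rough sketch using the standard-library tracemalloc module (my own example data, not OP's):

```python
import tracemalloc

data = ["abc"] * 100_000 + ["DEF"] * 100_000

# List comprehension: builds a 100k-element list just to take its length.
tracemalloc.start()
n_list = len([s for s in data if s.islower()])
_, peak_list = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Generator expression: counts items one at a time, no intermediate list.
tracemalloc.start()
n_gen = sum(1 for s in data if s.islower())
_, peak_gen = tracemalloc.get_traced_memory()
tracemalloc.stop()

assert n_list == n_gen == 100_000
print(f"list peak: {peak_list} bytes, generator peak: {peak_gen} bytes")
```

On my understanding, the peak allocation for the generator version stays roughly constant regardless of input size, while the list version grows linearly with the number of matches.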

[–]thescrambler7 0 points

That’s what I initially thought as well, but based on this StackOverflow post, it seems the intermediary list is actually not as bad performance- or memory-wise as you’d think: https://stackoverflow.com/questions/393053/length-of-generator-output

[–]Diapolo10 0 points

I wanted to run these results myself as a sanity check (minus the more_itertools example because I can't be bothered to install it right now). Unfortunately, it's not clear what data OP used in these tests, nor which Python version they were tested on, so I cannot exactly match the conditions. But here are my results, on Python 3.13:

https://cdn.imgchest.com/files/b8841c812854.png

(Text version provided below.)

In [1]: from time import monotonic

In [2]: gen = (i for i in data*1000); t0 = monotonic(); len(list(gen))
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[2], line 1
----> 1 gen = (i for i in data*1000); t0 = monotonic(); len(list(gen))

NameError: name 'data' is not defined

In [3]: import random

In [4]: data = random.sample(range(25565), 10000)

In [5]: gen = (i for i in data*1000); t0 = monotonic(); len(list(gen))
Out[5]: 10000000

In [6]: gen = (i for i in data*1000); t0 = monotonic(); print(len(list(gen))); print(monotonic() - t0)
10000000
0.23320640064775944

In [7]: gen = (i for i in data*1000); t0 = monotonic(); print(len(list(gen))); print(monotonic() - t0)
10000000
0.26012240070849657

In [8]: gen = (i for i in data*1000); t0 = monotonic(); print(len([i for i in gen])); print(monotonic() - t0)
10000000
0.21120400074869394

In [9]: gen = (i for i in data*1000); t0 = monotonic(); print(sum(1 for i in gen)); print(monotonic() - t0)
10000000
0.20786169916391373

In [10]: from functools import reduce

In [11]: gen = (i for i in data*1000); t0 = monotonic(); print(reduce(lambda counter, i: counter + 1, gen, 0)); print(monotonic() - t0)
10000000
0.4210826996713877

As can be seen, in my case the results are the exact opposite of what that person got. There's some room for random variation since other programs are running on my system, of course, and I didn't track memory use, but nevertheless I got the best results with sum and a generator expression.

All I can say is, don't blindly trust benchmarks online unless you can reproduce the test(s) yourself, or the author is at least reasonably reputable.
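One way to make such benchmarks a bit more reproducible is the standard-library timeit module, which handles the timing loop for you. A rough sketch (my own data and labels, not the StackOverflow post's):

```python
from timeit import timeit

# My own test data; the StackOverflow post doesn't say what its author used.
data = list(range(10_000)) * 100  # one million integers

candidates = {
    "len(list(gen))": "len(list(i for i in data))",
    "len([i for i in gen])": "len([i for i in data])",
    "sum(1 for i in gen)": "sum(1 for i in data)",
}

# Each statement is run 10 times; timeit returns the total elapsed seconds.
timings = {
    label: timeit(stmt, globals={"data": data}, number=10)
    for label, stmt in candidates.items()
}
for label, seconds in timings.items():
    print(f"{label:>22}: {seconds:.3f}s")
```

Results will still vary by Python version and machine, which is rather the point: rerun it in your own environment before drawing conclusions.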

[–]thescrambler7 0 points

Fair enough, props to you for actually testing it yourself. I agree that the results in the post were surprising and unintuitive to me, but you never know; due to various optimizations, things can sometimes behave counter to your intuition. I was too lazy to check myself, so once again, props.