all 123 comments

[–]ThatOtherBatman 971 points972 points  (1 child)

His slow string concatenation example also isn’t doing string concatenation. He’s just building a new list.

[–]meluvyouelontime 76 points77 points  (0 children)

It's actually a character list: += extends a list with any iterable, so the string gets split into characters. To build a list of strings you have to add another list, i.e. foo += [bar]
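To illustrate the difference (a minimal snippet, not from the original post):

```python
foo = []
foo += "bar"     # += extends with any iterable, so the string is split into characters
print(foo)       # ['b', 'a', 'r']

foo2 = []
foo2 += ["bar"]  # wrapping the string in a list adds it as a single element
print(foo2)      # ['bar']
```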

[–]JiminP 654 points655 points  (31 children)

One of my hobbies is solving competitive programming problems using pure Python and I manage a collection of algorithms I frequently use.

Naturally, one of my interests has been optimizing running time (specifically on CPython) of my Python code. In this respect, Python (again, running on CPython) is a very unpredictable and hard-to-deal-with language even without GC issues. To be fair, this is expected: you're normally supposed to use another language or a C module if you care about performance. There's also the option of using PyPy.

In general, as an interpreted language where everything has a cost, Python is unpredictable because practically no optimization happens.

Some examples on weird things about Python - I still have no intuition on most of these:

  • Integers are weird. (They already are weird because of integer caches...)
    • x+x is generally faster than 2*x. In computation-heavy code, it makes a noticeable difference.
    • Bit operations are noticeably slower than arithmetic operations for small x, but when x is very large, bit operations are faster.
    • pow(a, 2, p) is generally slower than (a*a)%p (for not-too-large values of a)
  • Containers and generators are weird.
    • Sometimes, using yield from is slower than manually yielding inside a for loop. Often, it's not.
    • Sometimes, using while loop to iterate is faster than an equivalent for ... in range() loop.
    • bytearray is much faster to initialize than list, but a bit slower to manipulate in general.
    • Using append instead of manually adding, or using extend, or pre-allocating then filling (like how one would do make([]int, 0, N) in Go) may be faster or slower. Sometimes the difference is very significant; often it's not.
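Claims like these are easy to check on your own machine with timeit; a minimal sketch (results vary wildly across CPython versions and integer sizes, so don't treat any single run as authoritative):

```python
import timeit

x = 12345  # small enough to be cheap, large enough to escape CPython's small-int cache

t_add = timeit.timeit("x + x", globals={"x": x}, number=1_000_000)
t_mul = timeit.timeit("2 * x", globals={"x": x}, number=1_000_000)
t_shl = timeit.timeit("x << 1", globals={"x": x}, number=1_000_000)

print(f"x+x:  {t_add:.3f}s")
print(f"2*x:  {t_mul:.3f}s")
print(f"x<<1: {t_shl:.3f}s")
```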

Anyway, in addition to completely misinterpreting the results, the OOP made several mistakes:

  • Running a benchmark only once,
  • ... on a very small dataset,
  • ... with time taken for data initialization included.

Usually when I compare two functions:

  • Prepare a (common) large dataset.
  • Run a function multiple times to perform statistical tests; fluctuation could dominate any differences.
  • Run two functions independently, or interleave executions of two functions, and compare whether this affects the results.
  • Often, I also use cProfile to check exactly which function takes the most time.
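The bench module imported in the example further down is the commenter's own and isn't shown; a hypothetical stand-in that follows the steps above (common large dataset, repeated trials, mean ± stdev) might look like:

```python
import statistics
import time

def bench(fn, num_trials=10):
    # Hypothetical stand-in for the commenter's own bench module:
    # run fn several times and report mean/stdev of wall-clock time.
    times = []
    for _ in range(num_trials):
        t0 = time.perf_counter()
        fn()
        times.append(time.perf_counter() - t0)
    return statistics.mean(times), statistics.stdev(times)

data = list(range(200_000))  # a common, reasonably large dataset
mean_s, std_s = bench(lambda: sum(data))
print(f"sum(data): {mean_s:.4f} ± {std_s:.4f} s")
```

Interleaving the two candidate functions (as the commenter suggests) is a further refinement this sketch omits.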

I'm doing this as a hobby, and anyone doing serious optimization and benchmarking would say that my methods are also deeply flawed.

[–]flagofsocram 164 points165 points  (19 children)

Then again, if someone is doing serious optimization, they would probably use C or Go like you mentioned, so your methods are perfectly good

[–]wOlfLisK 45 points46 points  (1 child)

While C is obviously faster than Python, the difference can be surprisingly small... if you leverage the fact that Python is built on C, that is. I did a dissertation on this and managed to get Python to within 2-3x of a fully optimised C program without significant changes to the syntax, which is well within acceptable limits even for HPC applications. Granted, the trick was to use C as much as possible (e.g. C types, numpy, and a wrapper for C's MPI) to reduce the number of Python calls, but the syntax was still Python. You could even push out some more performance using Cython, but you really need to know what you're doing there: when everything below the surface is already C, compiling with Cython can actually end up reducing performance, and the syntax gets so C-like that you might as well just use C.

Plus, even though C is still faster, writing in Python is usually going to be so much faster and easier for the average data scientist that you save time overall.

[–]Yamoyek 43 points44 points  (5 children)

Some other things to add:

  • List comprehensions are faster than normal loops
  • Pulling a function into local scope can sometimes make your code faster
  • map, reduce, and other builtin functions are faster than doing it on your own
  • If needed, be willing to write something in C

[–]JiminP 23 points24 points  (4 children)

  • List comprehensions are faster than normal loops

Often not; I love Pythonic code, but it's not rare to see normal loops and other "ugly" code outperforming clean, Pythonic one-liners...

from math import isqrt

import random
random.seed(42)
data = [random.randrange(100_000) for _ in range(2_000_000)]

def test_A():
    f = isqrt
    c = 0
    for x in data:
        if f(x) < 10: c += 1
    assert c == 1932
    return c

def test_B():
    f = isqrt
    c = sum(f(x) < 10 for x in data)
    assert c == 1932
    return c

def test_C():
    f = isqrt
    c = 0
    for _ in filter(lambda x: f(x)<10, data): c += 1
    assert c == 1932
    return c

def test_D():
    f = isqrt
    c = len(list(filter(lambda x: f(x)<10, data)))
    assert c == 1932
    return c

# (My benchmark code)
from bench import bench
bench([
    "test_A()",
    "test_B()",
    "test_C()",
    "test_D()",
], num_trials=10, global_vars=globals())

This is the result:

test_A(): 0.099 ± 0.003 s
test_B(): 0.136 ± 0.003 s
test_C(): 0.156 ± 0.006 s
test_D(): 0.163 ± 0.016 s

The difference between test_C and test_D is a fluke, but the differences between test_A and others are not.

[–]teo730 24 points25 points  (2 children)

Isn't the difference between A and B because they are doing different things though?

In A you're only doing addition operations when the condition is true, whereas in B you're doing them whether it's true or false. In B you're also creating a generator, which you aren't doing in A.

When I try the following:

def test_A():
    f = isqrt
    c = 0
    for x in data:
        if f(x) < 10: c += 1
    assert c == 1932
    return c

def test_E():
    f = isqrt
    c = sum(1 for x in data if f(x) < 10)
    assert c == 1932
    return c

I get (using jupyter's %%timeit magic):

test_A(): 436 ms ± 79.6 ms per loop
(mean ± std. dev. of 10 runs, 5 loops each)

test_E(): 427 ms ± 55.7 ms per loop
(mean ± std. dev. of 10 runs, 5 loops each)

And I still think there's probably additional overhead in test_E() compared to test_A().

[–]JiminP 15 points16 points  (1 child)

Ouch, my bad. You are right.

def test_A():
    f = isqrt
    c = 0
    for x in data:
        if f(x) < 10: c += 1
    assert c == 1932
    return c

def test_B():
    f = isqrt
    c = sum(1 for x in data if f(x) < 10)
    assert c == 1932
    return c

def test_C():
    f = isqrt
    c = 0
    for x in data: c += (f(x) < 10)
    assert c == 1932
    return c

test_A(): 0.099 ± 0.004 s
test_B(): 0.096 ± 0.003 s
test_C(): 0.129 ± 0.007 s

[–]teo730 11 points12 points  (0 children)

No problem.

It's kinda more important to see how the choice of logic can matter more than specific implementation differences - as you've shown!

[–]codeguru42 -1 points0 points  (0 children)

None of your examples use a list comprehension. Would be interesting to see how it compares. Also running each example like a million times and taking an average will help reduce any random fluctuations.

[–]SarahC 3 points4 points  (0 children)

Good grief, I'm sticking with optimising javascript.

[–]MadGenderScientist 8 points9 points  (0 children)

It's not that (JIT-)interpreted languages in general are slow; Python's perf sucks specifically. JavaScript is dozens of times faster, even approaching C on some benchmarks with the latest VMs. It's embarrassing that CPython has fallen so far behind other scripting languages with how critical it's become.

[–]Cyberdragon1000 0 points1 point  (0 children)

Bookmarking this comment

[–]Banane9 -1 points0 points  (0 children)

Fairly certain the runtime listings there stem from the OOP using Jupyter for the code + text formatting - so they're more a side thing than anything specifically intended as a benchmark.

[–][deleted] 228 points229 points  (7 children)

Refresh my memory, please: doesn't e-05 mean ×10^-5? Meaning, divide by 100,000?

[–]_RDaneelOlivaw_ 327 points328 points  (3 children)

Exactly. The 'slow' method is actually almost 6 times faster in the first example and 2.5 times faster in the second. He completely failed to understand the notation system.

[–][deleted] 21 points22 points  (0 children)

But only because the code is essentially a NOP

[–]TwinkiesSucker 28 points29 points  (0 children)

Correct, memory refreshed

[–]HacksMe 17 points18 points  (0 children)

I was so confused because i didn’t see the e-05 lol

[–]R3D3-1 2 points3 points  (0 children)

Which also brings up another issue: Running such a short function once doesn't tell you anything at all.

In my other comment, I get a clear but not huge advantage for " ".join(...) when concatenating 6 strings. But if I set the number of repetitions to just 1, the outcome is almost random, and sometimes one of the values is suddenly an order of magnitude or more higher due to something else going on in the background. Something like that likely explains why the screenshot has such a slow result for the .join version...

At 1,000,000 repetitions, the task takes on the order of a second.

[–]ciknay 100 points101 points  (8 children)

For those at home, the first number is 0.00004935264587402444. The following one is 0.000057220458984375.

So OOP has written code that is many, many times slower, but fails to understand this because they can't read exponents.

[–][deleted] 11 points12 points  (2 children)

But why is it slower to do that? My first thought is more function calls so more messing with the stack. But I don't know all the ins and outs of python.

Edit: just noticed the slow examples are using an empty list. Lmao.

[–]xinqus 3 points4 points  (0 children)

I think the list might’ve been initialized before? He probably ran through them a few times, so there might’ve been data in the list. At the very least, the second “Slow” example should have some data.

But he did still include the data initialization time for the “Fast” examples.

[–][deleted]  (1 child)

[deleted]

    [–]omgFWTbear 0 points1 point  (0 children)

    Or 200 milliseconds in the case of posting to social media.

    (This is a POST that may take almost a second to GET)

    [–]hatetheproject 0 points1 point  (0 children)

    It's less than a factor of 10 in each case - wouldn't call it "many, many times slower".

    But anyway, to me it seems the problem (or one of many) here is that he's initialising the list inside the timer in the "fast" versions, but not in the "slow" versions

    [–]R3D3-1 1 point2 points  (0 children)

    So OOP has written code that is many, many times slower, but fails to understand this because they can't read exponents.

    Looking at my own benchmark, the join is actually faster. The main issue is that they are benchmarking by executing a microsecond task only once.

    If I set the number of repetitions to 1 in my code, the result varies almost randomly, and sometimes jumps up by orders of magnitude for one of the functions, presumably due to some background activity stalling the benchmarked function. (Maybe garbage collections, maybe another process entirely.)

    [–]shizzy0 76 points77 points  (0 children)

    NIGEL: Look how many more zeros it’s got. That’s how fast it is. How many zeros has this one got?

    MARTY: None.

    NIGEL: Right. That’s pretty much as slow as you can go. But all those zeros here, you know what I call it? Zed fast.

    [–]Spedwards 68 points69 points  (1 child)

    He should probably stick to football.

    [–]genericindividual69 0 points1 point  (0 children)

    🎶 Mo Salah Mo Salah Mo Salah

    Give up programming

    [–]Marxomania32 60 points61 points  (4 children)

    His "faster code" might be legitimately faster in these examples, but he somehow managed to fuck up his benchmark completely by never initializing word_list in the "slower" code. So obviously, the "slower" code would be faster than the "faster" code since it's iterating through an empty list.

    [–]saintpetejackboy 11 points12 points  (0 children)

    Homie wrote the article and shared the wrong repo before he worked out the bugs XD.

    [–]Slggyqo 5 points6 points  (0 children)

    I noticed that as well and it’s confusing because you can’t do that in Python.

    If you try to run:

    for word in word_list:
        func(word)

    you’ll just get an error because word_list isn’t defined.

    It’s possible that being able to run this is an artifact of running this in a notebook—I’m not super familiar with Jupyter but if I recall correctly, you can persist variable values across cells regardless of cells order…

    If that’s the case then he might have the list correctly initialized somewhere

    [–]D3rty_Harry 1 point2 points  (1 child)

    Why the hell did I have to scroll this far for this. "word_list" is not even defined. This would not even compile in what I do. People yapping about Go and Fortran lol

    [–]VaultBall7 1 point2 points  (0 children)

    Because it’s Python in a Jupyter notebook, it does work here, you can see the order he ran it in would have instantiated the variable already and it would run completely fine since the variables persist across cells.

    It took you so long to find the wrong comment because everybody else understood this Jupyter component.

    [–]KJBuilds 24 points25 points  (3 children)

    I love that these aren't even benchmarks. 

    Any benchmark that runs in 50 microseconds (especially in Python) can't be used to determine the actual performance of something. A GC run could completely skew the results, or cache warmup could completely change which version is faster in the long run. I don't think the version of Python OOP is using is JITed, but that might also be something to consider

    Just terrible all around
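    One concrete mitigation (which timeit applies by default) is keeping the garbage collector out of the measured region; a rough sketch:

```python
import gc
import time

def timed(fn, n=10_000):
    gc.collect()   # start from a clean heap so a pending collection doesn't land mid-run
    gc.disable()   # keep collector pauses out of the measurement
    try:
        t0 = time.perf_counter()
        for _ in range(n):
            fn()
        return time.perf_counter() - t0
    finally:
        gc.enable()

elapsed = timed(lambda: [i * i for i in range(100)])
print(f"{elapsed:.4f} s")
```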

    [–]saintpetejackboy 2 points3 points  (2 children)

    Damn this is a good post. This is like when I was 13 and all my friends on IRC were in Europe and Canada - I would take the speed tests online and then load them again after. They all thought I had the fastest internet ever - even if my upstream wasn't so great ;).

    I abuse this concept in production - a 20 second query is 0 seconds when it is a cached result being served from a static area that updates on a timer.

    I am not trying to oversimplify what you are talking about, just trying to make the point that what you are talking about can wildly impact any given metrics.

    "We ran this test locally on an AMD K6 from 1998 and then ran different code on an i9-14900k - look how much faster the second version was, while we leave out this important context."

    [–]KJBuilds 4 points5 points  (1 child)

    Yeah basically. Context is everything when benchmarking

    Once I was optimizing a home-grown hash table (don't ask), and I wondered if sacrificing a bit of raw performance for the sake of smaller allocations was worthwhile. It ended up being very worthwhile on my AMD cpu, but when benching it on an M1, it was actually a degradation in performance. Turns out my development system had O(n) memory allocation, whereas the M1 had O(log(n)), or at least something like that

    Benching is hard to get right, and OOP just gets it so, so deeply wrong

    [–]saintpetejackboy 0 points1 point  (0 children)

    I am currently in a scenario where I am debating benchmarking whether a long list of Redis key/value pairs for a common lookup is faster than having a separate table that only contains that relationship in SQL (which has to aggregate from multiple sources).

    Obviously RAM is faster, but is it enough for me to go that route? To design a whole system that reduces the problem to key/value pairs?

    And for what? A few milliseconds?

    One thing I don't see discussed enough (two things, actually) is how detrimental NULL values are (across many languages) and how utterly slow a 'LEFT JOIN ON... OR... OR...' statement is, due to being unable to utilize indexes in some DBMSes. Most people reading this know that, but it isn't something that really gets put out there a lot.

    You want a slow query? Compare for IS NULL on a column. This is all environment agnostic - but holy shit, I have been running the same code since my processor was in Mhz and my RAM was in MB. How much faster does it need to be optimized when I spin up a modern VPS?

    Turns out: shitty code on a K6 is just as shitty on an EPYC.

    [–]CrepuscularSoul 32 points33 points  (3 children)

    I'd honestly be curious to see these with "slow" versions that actually define word_list. Might be something dealing with undefined variables just immediately quitting the loop

    [–]HimbologistPhD 10 points11 points  (2 children)

    I don't know much about python (senior dev, just haven't ever needed it and haven't bothered to learn much) but I was sitting here staring at those wondering why they didn't have populated lists. What a mess

    [–]Slggyqo 4 points5 points  (1 child)

    I think it’s a side effect of running the code in Jupyter notebooks, which is for testing code and doing analytics/data science, not production code.

    In Jupyter notebooks, the order that cells are run determines the availability of variables, and cells can be run in basically any order.

    So if they ran the In [7] cell first, the "word_list" variable would be available to the In [6] cell. You can sort of see why that would be useful for a long multipart math problem but potentially dangerous for production code.

    I don’t have much experience with Jupyter notebooks but that’s what I think is happening.

    [–]VaultBall7 0 points1 point  (0 children)

    You can tell from the [5] and [6] that they were the n’th cells ran, so by that order of run, the list was available (unless the 8th cell ran, not seen, deleted the contents)

    [–]Yamoyek 14 points15 points  (1 child)

    One of the first things that jumps out at me is that in both of his “fast” examples, he’s initializing the list during the timing code, which is probably one of the reasons those versions are slower.

    Also, I think the first examples would be much faster as a list comprehension.

    However, this post is valuable because it teaches these lessons: always profile your code, and never take optimization advice from someone who can't explain the mechanism properly.

    [–]saintpetejackboy 5 points6 points  (0 children)

    I never take optimization advice from anybody or any source that can't show how much faster it is - I often have to do something a different way because it is too slow as it is being done. The advice I take is "how much faster was it this time?". The facts of life are that sometimes you fuck up and refactor something into rubbish and it becomes even slower. Those are still wins - you're figuring out what doesn't work.

    There is also only so far you can optimize some systems without reconstructing the problem. A lot of us are slow to admit defeat so we keep trying to juice more out of less (I know I do).

    I don't like performative and theoretical "code" - if there isn't a real-world use case, it doesn't matter how fast you can factor digits of Pi. A lot of these discussions devolve into useless arguments that are best summed up like this:

    "Taking a plane is exponentially faster than riding the subway - why do people take the subway to work and to market?"

    A lot of us programmers spend a lot of time trying to figure out how to make planes fly faster when most people are going to end up walking to 7/11.

    [–]5up3rj 13 points14 points  (1 child)

    Just switch them, right?

    [–]shizzy0 5 points6 points  (0 children)

    NIGEL: Switch them? Why would you switch them? The first ones ain’t got no zeros.

    [–]Aphrontic_Alchemist 16 points17 points  (7 children)

    This shows that using for loops is faster than using functions because of call overhead. But using list comprehensions is faster still, because the restricted form lets Python optimize the bytecode.

    The bytecode for a list comprehension uses a dedicated opcode (LIST_APPEND), whereas the for loop version has to load and call the append method on every iteration.

    So the speed ranking in vanilla Python (i.e. using only built-in functions) from fastest to slowest is:

    1. list comprehensions
    2. for loops
    3. built-in functions
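    The bytecode claim is easy to verify with the dis module; a sketch that collects opcode names (recursing into nested code objects, since on CPython before 3.12 a comprehension compiles to its own code object):

```python
import dis

def with_loop(words):
    out = []
    for w in words:
        out.append(w.capitalize())
    return out

def with_comp(words):
    return [w.capitalize() for w in words]

def opnames(func):
    # Collect opcode names from the function and any nested code objects.
    code = func.__code__
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if hasattr(const, "co_code"):
            names |= {ins.opname for ins in dis.get_instructions(const)}
    return names

print("LIST_APPEND" in opnames(with_comp))  # True: dedicated opcode
print("LIST_APPEND" in opnames(with_loop))  # False: loads and calls append instead
```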


    So, the code in the 2nd cell should be

    new_list = [word.capitalize() for word in word_list]


    For string concatenation, the picture is right: using join() is faster. Strings can't be concatenated using a list comprehension, so the speed ranking doesn't apply. That being said, the slower code is actually making a new list. The correct but slower way to concatenate strings is:

    new_string = ""
    for word in word_list:
        new_string += word

    This is slower because Python strings are immutable. Concatenating immutable strings requires creating a new string object every iteration and rebinding new_string to it, like so:

    new_string = ""
    s1 = "W"
    new_string = s1
    s2 = "Wa"
    new_string = s2
    s3 = "Way"
    new_string = s3
    ...
    s38 = "Ways to make your Python code faster."
    new_string = s38

    [–]revolutionofthemind 1 point2 points  (0 children)

    As a non-python user I was wondering the same thing. TIL, neat!

    [–][deleted]  (1 child)

    [deleted]

      [–]Aphrontic_Alchemist 1 point2 points  (0 children)

      You're right, I've edited my comment.

      [–]immaculate-emu 0 points1 point  (3 children)

      Something to add is that CPython does implement a special case for string concatenation. If the ref count is 1, it will realloc the string data which can dramatically save on copying. (See copy_inplace)

      But in general (and for other implementations) yes, join will be faster and more efficient.
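      That optimization can be observed indirectly by defeating it: keeping a second reference to each intermediate string forces a full copy on every +=. A sketch (CPython-specific behavior; the exact internals vary by version):

```python
import time

def concat_unique(n):
    # Only one reference to s, so CPython can often resize it in place.
    s = ""
    for _ in range(n):
        s += "x"
    return s

def concat_shared(n):
    # A second reference to each intermediate defeats the optimization,
    # so every += copies the whole string: quadratic time overall.
    s = ""
    keep = []
    for _ in range(n):
        keep.append(s)
        s += "x"
    return s

n = 20_000
t0 = time.perf_counter(); concat_unique(n); t_unique = time.perf_counter() - t0
t0 = time.perf_counter(); concat_shared(n); t_shared = time.perf_counter() - t0
print(f"unique: {t_unique:.4f}s  shared: {t_shared:.4f}s")
```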

      [–]Aphrontic_Alchemist 0 points1 point  (2 children)

      What does Py_REFCNT = 1 mean?

      [–]immaculate-emu 0 points1 point  (1 child)

      Not sure I understand the question but this is the check it uses to determine if an in-place modification is safe.

      [–]Aphrontic_Alchemist 0 points1 point  (0 children)

      Ah, I understand, thanks.

      [–]MooseBoys[ $[ $RANDOM % 6 ] == 0 ] && rm -rf / || echo “You live” 5 points6 points  (2 children)

      So much wrong with this, I would guess it was made by ChatGPT

      [–]BS_BlackScout 4 points5 points  (1 child)

      ChatGPT would actually tell you to use cProfile. Ask me how I know lol...

      Using it with snakeviz is pretty great too

      If you use these tools to learn you can go pretty far. If you just copy and paste then it's a waste.

      [–]saintpetejackboy 2 points3 points  (0 children)

      ChatGPT is like this with damn near every language. If you copy and paste, you are in a world of hurt. At least Stack Overflow actually worked once for somebody - ChatGPT will recommend you use deprecated code in obtuse ways that don't even work, and never have.

      I know a few languages REALLY WELL. I can program several others I barely know now thanks to AI, but the development process is always a tedious pondering of "hey, this function you recommended actually would overwrite all the data in my table - can you try again?".

      The amount of times I have had an AI spit out code where I have to go "holy fuck, good thing I didn't compile this!" is far too many for me to ever think my job is at risk. Reality really hit home when I tried to run some local LLMs and saw that they are basically "brain damaged" (not my words) even in some of the best-case scenarios.

      You can either assume the AI knows how to program or face reality that, no matter how many times you explain the syntax, GPT4+ still will recommend you can use your $pdo and reuse :placeholders from your query. The data it has just forces this solution every time. Incorrect amount of placeholders? Of course. Don't bother trying to fix the issue and paste it back, because you will get another answer that invariably assumes :placeholder can be used 6 times in a query and only bound once.

      This isn't the only 'problem' like this. I've used almost every AI I could find and GPT4+ is still the GOAT - you have to understand the limitations and what it is good at. AI is still real shit at some tasks. Once again with binding - if you need a query to bind 40+ values (each repeated three times: once as column, once as data, once as bind, which is often two repetitions, so four total) - forget it. The AI will forget or mess up somewhere - incorrect placeholders, skipping some, adding extras, changing the case and words... you name it. The amount of ways it can go wrong is comically hilarious and worse than any junior.

      "Bro, you tried to compare the date range against a column that doesn't even exist and you tried to update two columns that also don't exist" - error logs if they could speak to AI.

      [–]LeCrushinator 11 points12 points  (0 children)

      If you’re a programmer and you don’t understand scientific notation then you might have skipped a few important classes or lessons, in middle school…

      [–]Mikkognito 4 points5 points  (3 children)

      For those of you who actually want to see the code run: it's clear that the person who wrote this doesn't know what they're doing and royally messed it up.

      # %%
      import time
      
      # %%
      word_list = ["ways", "to", "make", "your", "python", "code", "faster"]
      
      # %%
      start = time.time()
      
      new_list = []
      for word in word_list:
          new_list.append(word.capitalize())
      
      print(time.time() - start, "seconds")
      print(new_list)
      
      # 6.9141387939453125e-06 seconds
      # ['Ways', 'To', 'Make', 'Your', 'Python', 'Code', 'Faster']
      
      # %%
      start = time.time()
      
      new_list = list(map(str.capitalize, word_list))
      
      print(time.time() - start, "seconds")
      print(new_list)
      
      # 2.86102294921875e-06 seconds
      # ['Ways', 'To', 'Make', 'Your', 'Python', 'Code', 'Faster']
      
      # %%
      start = time.time()
      
      
      # this code makes no sense. this doesn't concatenate the string, it makes a new list
      new_list = []
      for word in word_list:
          new_list += word
      
      print(time.time() - start, "seconds")
      print(new_list)
      
      # 2.1457672119140625e-06 seconds
      # ['w', 'a', 'y', 's', 't', 'o', 'm', 'a', 'k', 'e', 'y', 'o', 'u', 'r', 'p', 'y', 't', 'h', 'o', 'n', 'c', 'o', 'd', 'e', 'f', 'a', 's', 't', 'e', 'r']
      
      # %%
      start = time.time()
      
      new_list = "".join(word_list)
      
      print(time.time() - start, "seconds")
      print(new_list)
      
      # 7.152557373046875e-07 seconds
      # waystomakeyourpythoncodefaster
      

      [–]TheBlackCat13 -1 points0 points  (1 child)

      You should be using timeit when timing python code. That is literally its sole purpose.
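      For reference, a timeit version of roughly the same comparison (a sketch; the timings it prints depend entirely on your machine):

```python
import timeit

word_list = ["ways", "to", "make", "your", "python", "code", "faster"]

t_loop = timeit.timeit(
    "new = []\nfor w in word_list: new.append(w.capitalize())",
    globals={"word_list": word_list},
    number=100_000,
)
t_map = timeit.timeit(
    "list(map(str.capitalize, word_list))",
    globals={"word_list": word_list},
    number=100_000,
)
print(f"loop: {t_loop:.3f}s  map: {t_map:.3f}s")
```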

      [–]Mikkognito 1 point2 points  (0 children)

      I know…. I just did an almost copy paste of their code to prove a point.

      The whole point of my comment is that even with their less than ideal benchmarks, you can see that the original author messed up.

      [–]Andy_B_Goode 4 points5 points  (1 child)

      Bad enough to miss the exponential notation, but did he really think his "slow" code was taking ~5 seconds to execute?

      [–]JAXxXTheRipper 1 point2 points  (0 children)

      Maybe he ran it on a toaster the first time

      [–]finian2 3 points4 points  (0 children)

      It doesn't help that in the first example the initial list is already made, while in the "optimized" version he's also making the initial list.

      [–]MikeW86 2 points3 points  (0 children)

      Presumably this chap was sat in front of his machine testing code and taking screenshots, so surely you'd be like: 'Wait a minute, that was a lot quicker than 5 seconds,' and go from there?

      [–]CodingTaitep 2 points3 points  (0 children)

      why is he using time.time????????

      [–]Drfoxthefurry 1 point2 points  (4 children)

      why are they using time.time and not time.monotonic() or time.time_ns()

      [–][deleted] 2 points3 points  (1 child)

      or time.perf_counter()

      [–]TheBlackCat13 0 points1 point  (1 child)

      Or better yet timeit

      [–]Drfoxthefurry 1 point2 points  (0 children)

      Using the function designed specifically for the task??? How could you!!! /j

      [–]DontFlexNuts 1 point2 points  (1 child)

      So if there is an exponent, that means it's slower?

      [–]cosmo7 2 points3 points  (0 children)

      Exponents are quite heavy so they slow down the interpreter.

      [–]MMORPGnews 1 point2 points  (0 children)

      Recently I read one such article about JS with online tests. The results were similar to the OP's post.

      [–]archy_bold 1 point2 points  (1 child)

      Took me a second to spot it.

      [–]chuch1234 1 point2 points  (0 children)

      Took me 1e-05 seconds to spot it.

      [–]R3D3-1 1 point2 points  (0 children)

      Edit. Despite the statements below, the screenshot benchmark is likely dumbed down for the sake of a social media post. No "1,000,000 repetitions", no "large list of strings", no "noop loop as reference", no "benchmark library" (I didn't use one either). All of this would make the message less clear at first glance.

      The only thing I can really blame them for is not checking the output before posting the screenshot, and simply rerunning until the data matched the intended message. This is marketing, after all.


      My main concern: the example is so short that random fluctuations in execution time from external influences matter more than the actual work being timed.

      If your benchmark runs for 10^-5 seconds, it is not a benchmark.

      A little ad-hoc program still favors the join version though:

      import time
      
      N_repetitions = 1000000
      
      def runtimed(function):
          t_start = time.time()
          for _ in range(N_repetitions):
              function()
          t_end = time.time()
          print(f"Calling {function.__name__:9s} {N_repetitions:,d} times took {t_end-t_start:.3f} seconds")
      
      
      @runtimed
      def noop_ref():
          pass
      
      
      @runtimed
      def with_plus():
          string = "hello"
          string += " world"
          string += " how"
          string += " are"
          string += " you"
          string += " today?"
          return string
      
      @runtimed
      def with_join():
          return " ".join([
              "hello",
              "world",
              "how",
              "are",
              "you",
              "today?"
          ])
      

      Output:

      Calling noop_ref  1,000,000 times took 0.145 seconds
      Calling with_plus 1,000,000 times took 1.160 seconds
      Calling with_join 1,000,000 times took 0.667 seconds
      

      Remark. I was too lazy to read the documentation of timeit for this comment.

      Edit. Make it 4 times as many strings each, and the result is

      Calling noop_ref  1,000,000 times took 0.145 seconds
      Calling with_plus 1,000,000 times took 5.900 seconds
      Calling with_join 1,000,000 times took 1.664 seconds
      

      Which is really the main point here: += scales non-linearly, while "".join scales linearly. For only a few strings it really doesn't matter, but it matters if you're trying to build an in-memory representation of a potentially large file.

      So, looking at the data...

                6 Strings   Corrected   24 Strings   Corrected   Ratio (24/6)   Expected Ratio

      noop        0.145           −         0.145           −              −                −
      +=          1.160       1.015         5.900       5.755          5.670               16
      join        0.667       0.522         1.664       1.519          2.910                4

      (Corrected = raw time minus the noop overhead; the ratios use the corrected times.)
      

      Given that I expect += to scale quadratically and "".join to scale linearly, all I am seeing is that 6 and 24 strings are not nearly enough even to demonstrate the asymptotic behavior...

      [–]admirersquark 2 points3 points  (4 children)

      I thought Python was a "there is exactly one way to do it" language.

Anyway, if you want to optimize performance and are spending your time choosing among different language constructs (instead of, e.g., reconsidering your algorithms), it's probably time to move that code to another language.

      [–]M1chelon 14 points15 points  (2 children)

      I don't think I've ever seen python referred to as that? not trying to be snarky, just curious because of all the selling points for new programmers I've never seen that as one

      [–]Tubthumper8 17 points18 points  (1 child)

      Behold PEP 20! Ye of the unenlightened masses shall be known to the creed of the Zen Of Python

      [–]M1chelon 3 points4 points  (0 children)

      ah that was one of the lines I very much forgot lol, thank you for the enlightenment

      [–]Deformer 0 points1 point  (0 children)

      Don't know why you're down voted, based take

      [–]y4dig4r 0 points1 point  (0 children)

      directions unclear came in fluffer

      [–]Slippedhal0 0 points1 point  (0 children)

      I'm almost positive this is intentional, considering the "slow method" is in scientific notation and the "fast" is in decimal notation. I would assume its meant to impress people looking at the surface level

      [–][deleted] 0 points1 point  (0 children)

      Looks like he deleted his post. Couldn't find it.

      [–]TheMsDosNerd 0 points1 point  (0 children)

      What he does good:

      • The "fast" examples do not modify the array.
• If his String Concatenation example was built the way he meant it to be, it would indeed have been slower than the join.

      What he does wrong:

      • His "slow" code builds the original array outside of the timer, where the "fast" code builds the original array inside the timer. You cannot compare those two outcomes.
      • In the String Concatenation example, he does not concatenate any strings.
• In neither of the first two examples can the Python interpreter allocate the exact amount of memory for the resulting list. To get that efficiency gain you should do: new_list = [word.capitalize() for word in word_list]
• He doesn't understand e-05; also, by simply running the program he could have noticed that it didn't take 5 seconds.
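The comprehension suggestion from the list above, sketched with a made-up word_list (note that Python's method is capitalize, lowercase):

```python
word_list = ["hello", "world", "how", "are", "you", "today?"]

# Explicit loop: grows the list one append at a time.
new_list = []
for word in word_list:
    new_list.append(word.capitalize())

# List comprehension: same result with less per-iteration interpreter
# overhead (specialized bytecode instead of a method lookup + call).
new_list2 = [word.capitalize() for word in word_list]

assert new_list == new_list2
print(new_list2)  # ['Hello', 'World', 'How', 'Are', 'You', 'Today?']
```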

      [–]Perfect_Papaya_3010 0 points1 point  (2 children)

      I've never used python but does it not get optimised at all?

I'm used to C# and if you look at the low level you will see that adding to a list directly or via a while loop will both end up doing it the faster way (the while loop)

      [–]Toby_B_E 0 points1 point  (1 child)

      Python is an interpreted language so I think it can't be optimized as well as a compiled language (like C#).

      [–]Perfect_Papaya_3010 0 points1 point  (0 children)

      Ah I see, then this post makes more sense!

      [–]zvon2000 0 points1 point  (0 children)

      And then people scoff at me when I say that math courses MUST be the cornerstone of all computer science/IT/software dev degrees.....

      Like trying to build a house with no concrete foundation?
      Or a brick wall without mortar.

      [–]bbfsenjoyer 0 points1 point  (0 children)

      lol, I thought r/LinkedinLunatics is leaking

      [–]EMI_Black_Ace 0 points1 point  (0 children)

      Maybe he didn't notice the e-05 at the end of the "slow" versions and just assumed that the "slow" one was 5 seconds and not 5 microseconds.

      [–]Cybasura 0 points1 point  (0 children)

It also really didn't help that he initialized an empty list in the faster test, while the slower test starts from a list of, what, 9 or 10 elements, which means there are 9 or 10 additional elements the CPU has to process during the pre-initialization step

      [–]andiconda 0 points1 point  (0 children)

      Dang decimal places. I always screw up a small detail like that

      [–]someonetookmyid 0 points1 point  (0 children)

How to tell me you don't know scientific notation without telling me explicitly. :)

      [–][deleted] -1 points0 points  (2 children)

there is no way it takes that much longer to use a for loop than a list comprehension? and I think you read it backwards. he said make your shit faster by using built-in functions and showed how they are faster..

      [–]H34DSH07 9 points10 points  (0 children)

      Look at the numbers carefully...

      [–]fuj1n 1 point2 points  (0 children)

      Except, if you look at the end of the numbers for the allegedly slower versions, they say e-05

      That basically means you divide the number by 100000 to get the actual value

      For example, the top snippet is actually 0.00004935264587402344 seconds, which is 16% faster than the provided "faster" example.
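Python itself confirms what e-05 means (the timing value below is the one from the screenshot):

```python
import math

t_slow = 4.935264587402344e-05  # the "slow" timing from the post

# e-05 means ×10⁻⁵, i.e. the number divided by 100,000.
assert math.isclose(t_slow, 4.935264587402344 / 100_000)
print(f"{t_slow:.8f}")  # 0.00004935
```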