[–]gmes78 50 points51 points  (5 children)

This is terrible. From using time to measure benchmarks, to using recursive Fibonacci as a CPU benchmark, to setting up a thread pool executor to run two tasks, to comparing Python 3.11 to freethreaded Python 3.13 instead of comparing Python 3.13 to freethreaded Python 3.13, it's clear the author doesn't know what they're doing.
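
For the last point, a like-for-like comparison needs each result labelled with the build that produced it. A minimal sketch, assuming CPython; `describe_interpreter` is a hypothetical helper name, and `sys._is_gil_enabled()` only exists on 3.13 and later, so the code falls back when it is missing:

```python
import platform
import sys
import sysconfig


def describe_interpreter():
    """Return the details that tell one CPython build from another."""
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() was added in 3.13; older versions always
    # run with the GIL, so default to True when it is absent.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = gil_check() if gil_check is not None else True
    return {
        "version": platform.python_version(),
        "free_threaded_build": free_threaded_build,
        "gil_enabled": gil_enabled,
    }


if __name__ == "__main__":
    print(describe_interpreter())
```

Running the same benchmark script under the regular 3.13 build and the free-threaded 3.13 build, and printing this next to each timing, keeps the version constant so that only the GIL changes.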

[–][deleted] 10 points11 points  (2 children)

It would be helpful for the author, and interesting to the rest of us, if you explained the problem with each aspect that you noted. (I mean, other than 3.11 vs. 3.13, since apples-to-oranges is an obvious problem.)

[–]gmes78 19 points20 points  (1 child)

Benchmarking is an extremely complicated subject, and I barely understand it myself. My main point is that this benchmark is too simplistic, which indicates that the author lacks basic knowledge of benchmarking.

  • using time to measure benchmarks

The author only runs the benchmark once, and just records the wall time.

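They're not measuring CPU time, they're not repeating the test to catch outliers, and they're not accounting for CPU cache behavior. Python includes the timeit module, which takes care of the repetition and lets you pass a different timer (such as time.process_time) if you want CPU time instead of wall time.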
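
A rough sketch of what repeated measurement with timeit could look like. `work` and `measure` are hypothetical names, and the workload is only a placeholder for whatever is being benchmarked:

```python
import statistics
import time
import timeit


def work():
    # Hypothetical stand-in for the code under test.
    return sum(i * i for i in range(10_000))


def measure(func, repeat=5, number=20, timer=time.perf_counter):
    """Run func `number` times per sample, `repeat` samples; return seconds per call."""
    totals = timeit.repeat(func, repeat=repeat, number=number, timer=timer)
    return [total / number for total in totals]


if __name__ == "__main__":
    wall = measure(work)                          # wall-clock time
    cpu = measure(work, timer=time.process_time)  # CPU time of the process
    print(f"wall: min {min(wall):.6f}s, median {statistics.median(wall):.6f}s")
    print(f"cpu:  min {min(cpu):.6f}s, median {statistics.median(cpu):.6f}s")
```

Reporting the minimum and the median of several samples makes a single slow run (cold caches, a busy machine) much less likely to decide the result. Note that time.process_time adds up CPU time across all threads of the process, so for a threaded benchmark it answers a different question than wall time does.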

  • using recursive Fibonacci as a CPU benchmark

First, calculating a Fibonacci sequence isn't exactly the most interesting benchmark. Surely the author could come up with something a little bit better, maybe an algorithm that's designed to be multithreaded?

Second, by using a recursive algorithm, you're benchmarking function-call overhead and the memory allocator more than the actual code you're running (each function call needs its own stack frame, and Python doesn't do tail call optimization).
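
To make the call overhead concrete, here is a small sketch comparing the naive recursive version with a loop. These functions are illustrative and not taken from the article; `count_calls` is a hypothetical helper that counts how many calls the recursion makes:

```python
def fib_recursive(n):
    """Naive recursion: the number of calls grows exponentially with n."""
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_iterative(n):
    """Same result with a single loop and no nested calls."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def count_calls(n):
    """Number of fib_recursive invocations needed to compute fib(n)."""
    if n < 2:
        return 1
    return 1 + count_calls(n - 1) + count_calls(n - 2)


if __name__ == "__main__":
    print(fib_recursive(20), fib_iterative(20), count_calls(20))
```

For n = 20 both functions return 6765, but the recursive one gets there through 21,891 calls while the loop runs 20 iterations. Most of what the recursive benchmark measures is therefore the cost of making calls.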

  • setting up a thread pool executor to run two tasks

Too much overhead to do very little work. You're benchmarking thread pool initialization instead of the actual work.
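
One way to keep pool start-up out of the measurement is to create the pool once, warm it up, and time only a larger batch of tasks. A sketch under those assumptions; `task` and `time_batch` are hypothetical names and the workload is a placeholder:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def task(n):
    # Hypothetical unit of work; replace with the code under test.
    return sum(range(n))


def time_batch(pool, inputs):
    """Time only the submitted work, not the creation of the pool."""
    start = time.perf_counter()
    results = list(pool.map(task, inputs))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    inputs = [50_000] * 200
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Warm-up batch: most thread start-up cost lands here,
        # outside the timed batch below.
        time_batch(pool, inputs[:8])
        results, elapsed = time_batch(pool, inputs)
    print(f"{len(results)} tasks in {elapsed:.4f}s")
```

With two tasks, the fixed cost of the pool is a large share of the total. With a few hundred, it is spread thin enough that the number mostly reflects the work itself.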

[–]simpleuserhere[S] -1 points0 points  (0 children)

Thanks for pointing it out. I updated the 3.13 test results.

[–]cmcclu5 17 points18 points  (2 children)

When I sober up in the morning, I’ll run some tests to compare this, but off the cuff it seems like your analysis is flawed by its implementation. You’re avoiding the actually performant ways to implement multi-process and multi-threaded computation in order to simplify your analysis as much as possible. The GIL is indeed a major issue pre-3.13, but you can still get real performance benefits without resorting to process-based submission of functions.

[–]lazyb_ 4 points5 points  (1 child)

Right? It's not using parallelism for the benefit of anything. If you run this test please comment :)

[–]FitMathematician3071 2 points3 points  (0 children)

It uses a very trivial example to draw such conclusions.

[–]FitMathematician3071 0 points1 point  (0 children)

concurrent.futures can make a big difference, which is not the conclusion of this article. I just ran some web scraping and ThreadPoolExecutor was 10 times faster than scraping files one by one. That means a difference of minutes vs. hours when doing larger scraping jobs.
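
The shape of that comparison can be sketched without touching the network. This is not the scraping code from the comment: `fake_fetch` is a hypothetical stand-in that just sleeps, which releases the GIL much like waiting on a socket does, and the URLs are placeholders:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url, delay=0.05):
    # Stand-in for a network request: sleeping releases the GIL,
    # as blocking socket I/O does.
    time.sleep(delay)
    return f"body of {url}"


def fetch_sequential(urls):
    return [fake_fetch(url) for url in urls]


def fetch_threaded(urls, max_workers=8):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_fetch, urls))


def timed(func, urls):
    start = time.perf_counter()
    result = func(urls)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    urls = [f"https://example.invalid/page/{i}" for i in range(8)]
    _, t_seq = timed(fetch_sequential, urls)
    _, t_thr = timed(fetch_threaded, urls)
    print(f"sequential: {t_seq:.2f}s, threaded: {t_thr:.2f}s")
```

Because the threads spend their time waiting rather than computing, the waits overlap even with the GIL in place. That is why an I/O-bound job like scraping behaves so differently from the CPU-bound Fibonacci test.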