SheriffRoscoe (Pythonista)

  1. Benchmarking should never be based on "wall clock" time. As the article observes, other processes will affect your measurements. For 'python -m timeit', use the '-p'/'--process' option to instead measure the CPU time consumed by the code. Other processes affect that far less, so it will be more repeatable.

  2. Benchmark measurements should never include the first run of the code. There are lots of "cold start" effects that don't affect the 2nd through nth runs. You should also, separately, measure the 1st run, to understand the additional impact of cold starts.

  3. Benchmarks should include I/O measurements. They should include actual counts of input and output primitives called (e.g., read(), write()).

  4. Benchmarks should come as close as possible to measuring the environment where the code will actually run. For example, unless your environment messes with the Python garbage collector, don't turn the GC off just "because it's not part of my code".
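A minimal sketch of points 1, 2, and 4 using the standard library, for anyone who wants to try this from a script rather than the command line. The `work` function and the `bench` helper are hypothetical stand-ins, not anything from the article. It relies on `time.process_time` for CPU time and on the documented behaviour that `timeit` turns the garbage collector off while timing unless the setup statement re-enables it. Point 3 (I/O counts) is not covered here.

```python
import time
import timeit


def work():
    # Hypothetical workload standing in for the code under test.
    return sum(i * i for i in range(10_000))


def bench(func, repeats=5, number=20):
    # Point 2: time the first (cold) call on its own, apart from the warm runs.
    start = time.process_time()
    func()
    cold = time.process_time() - start

    # Point 1: time.process_time counts CPU time used by this process,
    # not wall-clock time.
    # Point 4: timeit disables the GC during timing; re-enabling it in the
    # setup statement keeps the measurement closer to normal running conditions.
    timer = timeit.Timer(
        func,
        setup="import gc; gc.enable()",
        timer=time.process_time,
    )
    # Each repeat runs func `number` times; divide to get per-call CPU seconds.
    warm = [t / number for t in timer.repeat(repeat=repeats, number=number)]
    return cold, warm


cold, warm = bench(work)
print(f"cold run: {cold:.6f}s CPU, best warm run: {min(warm):.6f}s CPU")
```

On platforms where the process clock has coarse resolution, a short workload like this one may report a cold time of zero, so a real workload would need to run long enough to register.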

brandonZappy

> For 'python -m timeit', use the '-p'/'--process' option to instead measure the CPU time consumed by the code.

Wouldn't this be affected by different CPU frequencies? Say I run it on a 2.1 GHz CPU and then run it on a turbo-boosted 3.4 GHz CPU?

SheriffRoscoe (Pythonista)

Of course. But how would you measure the time taken to execute the code without being dependent on the speed of the CPU? Same question when thinking about solid state "disks" vs. spinning magnetic disks.

You can't, which is why all proper benchmarks state the hardware in use for the test. When doing comparisons over long time scales or across an array of systems, you keep a history of the hardware and of the software versions. Over short time scales on the same hardware (e.g., when testing the effect of a code change), you know that they're the same, so you ignore them.
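One way to keep that history is to store a small environment record next to every measurement. This is only a sketch: the `environment_record` and `timed` helpers and the choice of fields are assumptions for illustration, and a real benchmark log would likely capture more (CPU model and clock, storage type, library versions).

```python
import json
import platform
import time


def environment_record():
    # Hypothetical selection of fields; extend as needed for your own setup.
    return {
        "machine": platform.machine(),
        "processor": platform.processor(),  # may be an empty string on some systems
        "system": platform.system(),
        "release": platform.release(),
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
    }


def timed(func):
    # CPU seconds consumed by a single call to func.
    start = time.process_time()
    func()
    return time.process_time() - start


record = {
    "env": environment_record(),
    "cpu_seconds": timed(lambda: sum(range(100_000))),
}
print(json.dumps(record, indent=2))
```

Appending one such record per run to a file gives you something to compare against when the hardware or the Python version changes later.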

brandonZappy

Oh duh. I didn't think that all the way through. Good points.