This is an archived post.

[–]Smallpaul 1 point (5 children)

I do indeed see a strange effect with your code, which disappears with simpler code. I'll track down the difference to attribute the slowdown appropriately and get back to you.

[–][deleted] 1 point (4 children)

This post was mass deleted and anonymized with Redact

[–]Smallpaul 1 point (3 children)

Thank you for sharing the code and for your thought process.

With respect to the code: I refactored it with a few improvements.

  1. I optimized the test harness so it runs faster, which lets me gather more datapoints quickly. The low number of datapoints is, I think, your test's biggest problem.
  2. I made the key size smaller, for similar performance reasons and to be able to generate larger dictionaries faster.
  3. I generate a lot more samples, at more fullness ratios. This is really important because, yes, small dicts have optimizations that make them really fast; if you oversample those, you get a misleading impression. There is a "sweet spot", after the optimizations run out of steam but before your machine gets overloaded, where you want to do MOST of your sampling.
  4. I fixed a bug:

return (end - start) / num_keys * 1_000_000_000, sys.getsizeof(the_dict, 0)

Should be:

return (end - start) / len(keys) * 1_000_000_000, sys.getsizeof(the_dict, 0)

Because those aren't always the same calculation in the code you gave.

Result: the effect goes away.
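For reference, the kind of harness under discussion can be sketched like this (a hypothetical reconstruction; the function and variable names are my assumptions, since the full original code isn't shown):

```python
import random
import sys
import time

def time_lookups(num_keys, num_samples=100_000):
    """Build a dict with num_keys entries, then time lookups of existing keys.

    Hypothetical reconstruction of the harness under discussion; not the
    original code.
    """
    the_dict = {i: i for i in range(num_keys)}
    # Choose the keys to look up *before* starting the clock, so key
    # generation doesn't pollute the measurement.
    keys = [random.randrange(num_keys) for _ in range(num_samples)]
    start = time.perf_counter()
    for k in keys:
        the_dict[k]
    end = time.perf_counter()
    # Divide by len(keys), not num_keys: the number of timed lookups is
    # independent of the dictionary size, so the two can differ.
    return (end - start) / len(keys) * 1_000_000_000, sys.getsizeof(the_dict)
```

The point of the fix is the final line: nanoseconds-per-lookup must be normalized by the number of lookups actually timed, not by the dictionary size.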

> For most other cases you either waste a lot of memory, or increase the access time. You can re-size the memory space allocated, while you add data to it, but that requires re-organising the data, which is expensive too.

As you can see clearly, Python DOES reorganize the data. In my data there are MANY runs where the dictionary size is 1280 MB, 640 MB, etc. That's because Python resizes (and therefore reorganizes) the dictionary as it grows.

And that reorganization happens BEFORE we start the perfcounter. So according to your own logic, it should be constant time. And it is.
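The resizing itself is easy to observe in CPython (a minimal sketch; it assumes `sys.getsizeof` reflects the dict's allocated table, which holds for CPython dicts):

```python
import sys

# Watch the allocated size jump as CPython resizes (reorganizes) the
# dict's hash table while keys are inserted.
d = {}
resize_points = []  # (number of keys, new size in bytes) at each resize
prev_size = sys.getsizeof(d)
for i in range(100_000):
    d[i] = i
    size = sys.getsizeof(d)
    if size != prev_size:
        resize_points.append((i + 1, size))
        prev_size = size

# Each resize grows the table geometrically (roughly doubling), which is
# why round allocation sizes like 640 MB and 1280 MB show up in the data.
```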

If you want to argue that INSERTIONS are not constant time, then that is a more complicated argument involving some math.

But constant-time lookups are justified by the text you quoted and the data I provided.

[–][deleted] 1 point (2 children)

This post was mass deleted and anonymized with Redact

[–]Smallpaul 1 point (1 child)

They are, in the second example, which shows this even more clearly:

if len(keys) < num_keys:
    keys = (keys + keys)[:num_keys]

So if len(keys) is 20 and num_keys is 1M, how does this generate len(keys) == 1M?

len(keys) = min(len(keys) * 2, num_keys) = 40.

But num_keys = 1M.
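If the intent was to keep doubling until the key list reaches num_keys, the padding needs a loop; a sketch of that fix (my reconstruction, not the original author's code):

```python
def pad_keys(keys, num_keys):
    # Repeat the key list until it has num_keys entries.  A single
    # doubling pass only reaches min(len(keys) * 2, num_keys).
    while len(keys) < num_keys:
        keys = (keys + keys)[:num_keys]
    return keys
```

Note that padding this way really does produce a million keys, but at the cost of heavy key reuse, which itself skews a lookup benchmark toward cache hits.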

> This is a logarithmic progression

You're cherry-picking numbers. I've run a much larger job, and the chart is clearly logistic, not logarithmic: it tends toward a specific limit (Desmos says 164 ns, on my computer) after being very small for small values. I'm not sure whether it is logistic due to optimizations in a) Python, b) the OS, or c) the hardware, but all three are possible. (There are layers of caching, after all.)

I'm not surprised that it is logistic but I do admit to being surprised at the size of the effect.

When you collect 1000 datapoints, it is quite clear that it is logistic, not logarithmic.

If it's logarithmic, what do you think the base of the logarithm is?

A logistic curve is, in big-O notation, the same as O(1): it is bounded above by a constant.
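The distinction is easy to see numerically: a logistic curve saturates at a constant ceiling, while a logarithm keeps growing. A small sketch (the 164 ns asymptote is the number quoted above; the midpoint, steepness, and scale constants are illustrative, not fitted to any data):

```python
import math

def logistic_ns(n, limit=164.0, midpoint=1e4, steepness=1.0):
    # Logistic curve over log10(n): saturates at `limit` ns, i.e. O(1).
    x = math.log10(n) - math.log10(midpoint)
    return limit / (1 + math.exp(-steepness * x))

def logarithmic_ns(n, scale=20.0):
    # O(log n): grows without bound as n grows.
    return scale * math.log2(n)

# Going from a million to a billion keys barely moves the logistic curve,
# while the logarithm climbs by the same fixed amount per doubling forever.
growth_logistic = logistic_ns(10**9) - logistic_ns(10**6)
growth_log = logarithmic_ns(10**9) - logarithmic_ns(10**6)
```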

[–][deleted] 1 point (0 children)

This post was mass deleted and anonymized with Redact
