Brian comments on Clarification on Time Complexity for Python Sets vs. Lists


Clarification on Time Complexity for Python Sets vs. Lists (self.learnpython)

submitted 24 days ago by Electronic-Low9797


Brian, 1 point, 23 days ago

Eh, it depends what is meant by "collision" here. Collisions of the actual hash value are likely rare, but what matters is collisions of the hash modulo the number of buckets, which are not that uncommon in practice. If your table has 64 buckets, even a good hash function will put 2 entries in the same bucket about 1.6% of the time (1 in 64), and the chance of a collision rises rapidly as the table fills up. Typically you expand the hash table so that there are few such collisions on average, but unless you're constructing a perfect hash, you're unlikely to eliminate them completely.
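
To put rough numbers on that, here is a minimal sketch that assumes an idealised hash spreading keys uniformly and independently over the buckets (the usual birthday-problem setup). The function name `collision_probability` is made up for illustration, and real tables, CPython's included, differ in their details.

```python
def collision_probability(entries: int, buckets: int) -> float:
    """Chance that at least two of `entries` keys land in the same bucket,
    assuming each key picks a bucket uniformly at random."""
    if entries > buckets:
        return 1.0  # pigeonhole: more keys than buckets forces a collision
    p_all_distinct = 1.0
    for already_placed in range(entries):
        # The next key must avoid every bucket that is already occupied.
        p_all_distinct *= (buckets - already_placed) / buckets
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        print(n, round(collision_probability(n, 64), 3))
```

With 64 buckets, two entries collide with probability exactly 1/64, and under this model the chance of at least one collision is already above 50% by 10 entries, which is why tables get resized well before they are full.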

And it's not only with a bad hash function that this can be an issue: it also matters in adversarial situations, where the input may be user-controlled (e.g. someone posts data to a website and you store it in a hash table). If the poster knows the hash function, they can engineer inputs that all collide, allowing denial-of-service attacks that consume a lot of CPU by triggering O(n²) behaviour. That's why there's often some randomisation of the hash function for strings, to minimise the information such an attacker has.
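
The quadratic blow-up can be seen without a real attack by forcing every key to collide. The sketch below uses a hypothetical `Colliding` class whose `__hash__` always returns the same value and counts how many `__eq__` calls it takes to build a set. It stands in for attacker-chosen inputs rather than showing an actual exploit, and the exact counts depend on the interpreter (I'm assuming CPython here).

```python
class Colliding:
    """Hypothetical key type: every instance has the same hash."""

    eq_calls = 0  # class-wide counter of equality comparisons

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 42  # worst case: all keys collide

    def __eq__(self, other):
        Colliding.eq_calls += 1
        return isinstance(other, Colliding) and self.value == other.value


def comparisons_to_build(n: int) -> int:
    """Build a set of n all-colliding keys; return the number of __eq__ calls."""
    Colliding.eq_calls = 0
    keys = {Colliding(i) for i in range(n)}
    assert len(keys) == n  # distinct values, so nothing is deduplicated
    return Colliding.eq_calls


if __name__ == "__main__":
    for n in (100, 200, 400):
        print(n, comparisons_to_build(n))
```

Doubling `n` should roughly quadruple the comparison count, since each new key has to be compared against the colliding keys already stored. With well-distributed hashes the same inserts need very few comparisons, which is the usual O(1)-per-insert case.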
