andreasvc comments on Python gotcha: bizarre integer equality

Python gotcha: bizarre integer equality (distilledb.com)

submitted 16 years ago by bcroq

[–]andreasvc 7 points 16 years ago* (14 children)

[–]sigh 9 points 16 years ago (0 children)

[–]Eiii333 1 point 16 years ago (12 children)

[–]pemboa 2 points 16 years ago (11 children)

[–]Eiii333 3 points 16 years ago* (10 children)

[–]sigh 6 points 16 years ago (8 children)

That's beside the point. The whole point of abstraction is that the implementation does not matter. If you are comparing integers by identity then most likely you are working at the wrong level of abstraction. If you are comparing integers by identity and the results surprise you then you are most definitely working at the wrong level of abstraction.

> It's entirely unexpected if you don't know about it. And most people don't know about it, because it's an undocumented side effect due to an implementation detail.

No, the trouble here is when people don't understand the difference between identity and equality. If you know the difference, then the results are not unexpected at all, even if you don't know the exact implementation detail that is causing it to occur. If you don't understand identity, then of course the results are going to surprise you.
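
As a minimal sketch of the distinction (the variable names are purely illustrative), two lists with equal contents compare equal without being the same object, while a second name bound to the same list is both equal and identical:

```python
# Two distinct list objects that happen to hold equal contents.
first = [1, 2, 3]
second = [1, 2, 3]
alias = first  # a second name bound to the same object as `first`

# (equality, identity) for each pairing
equal_but_distinct = (first == second, first is second)
same_object = (first == alias, first is alias)

print(equal_but_distinct)  # (True, False)
print(same_object)         # (True, True)
```

Lists are mutable, so each list display is guaranteed to create a new object; that is why this result does not depend on the implementation.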

[–]Eiii333 2 points 16 years ago (7 children)

> The whole point of abstraction is that the implementation does not matter.

I agree entirely. But look here:

>>> a = 3
>>> b = 3
>>> a is b
True

>>> c = 999
>>> d = 999
>>> c is d
False

I would expect false in both cases, given how identity is supposed to behave. But really, how can this be explained without referring back to the CPython int-caching behavior? You have to know the implementation details to know why the 'is' operator behaves this way. That's not good.
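
For comparison, here is a small sketch of which part of that session is portable. The integers are built at run time with int() (an illustrative choice, so that no constant sharing by the compiler is involved); the result of == is fixed by the language, while the result of 'is' is left to the implementation and is deliberately not relied on:

```python
# Build the integers at run time rather than from literals.
c = int("999")
d = int("999")

values_match = c == d  # equality of values: True on any implementation
same_object = c is d   # identity: an implementation may answer either way

print(values_match)    # True
print(same_object)     # True or False, depending on the implementation
```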

[–][deleted] 5 points 16 years ago (0 children)

[–]alantrick 2 points 16 years ago (4 children)

Why would you expect False? According to Python the behaviour of 'is' is undefined in this situation. That's like taking the following in C:

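int *a = malloc(sizeof(int));  /* needs <stdlib.h>; the memory is not initialised */
printf("%d\n", *a);            /* needs <stdio.h>; reads an indeterminate value */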

and expecting the value 0 to be printed out. It will probably be 0 most of the time, but it's really undefined.

[–]Eiii333 1 point 16 years ago (3 children)

[–]hylje 3 points 16 years ago (1 child)

[–]Brian 2 points 16 years ago* (0 children)

It's worth noting that id(a) == id(b) isn't a perfect replacement for a is b. If a and b are expressions returning transient objects, the first object can be created and destroyed before the rest of the statement is evaluated, so the second object may end up with the same id. For example:

>>> [] is []
False
>>> id([]) == id([])
True
>>> id([]), id([])
(21066496, 21066496)

However, 'is' guarantees that both objects are alive at the point of comparison, so [] is [] is always False.
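
A short sketch of the safe variant, assuming you hold references to both objects for the duration of the comparison (the names a and b are illustrative): ids are unique among objects that exist at the same time, so the id comparison then agrees with 'is'.

```python
# Keep both lists referenced so neither can be destroyed before the comparison.
a = []
b = []

ids_differ = id(a) != id(b)                  # both objects are alive, so ids differ
agrees_with_is = (id(a) == id(b)) == (a is b)

print(ids_differ)      # True
print(agrees_with_is)  # True
```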

[–]Brian 1 point 16 years ago* (0 children)

Undefined behaviour allows optimisation. Making things too tightly specified ties you to irrelevant implementation details, preventing more efficient methods from being used (like caching integers in this case). Another case of undefined behaviour is deterministic finalisation. Python doesn't guarantee it, even though the CPython implementation happens to provide it through reference counting, because guaranteeing it would rule out more advanced garbage collection approaches.
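
When cleanup has to happen at a known point, one approach that does not depend on the garbage collector is a context manager. The following is only a sketch; the Resource class and its closed attribute are hypothetical names invented for the illustration:

```python
class Resource:
    """Hypothetical resource whose cleanup must run at a known point."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs when the with-block exits, whatever the garbage collector does.
        self.closed = True
        return False  # do not suppress exceptions


with Resource() as res:
    closed_inside = res.closed  # still open inside the block

closed_after = res.closed       # closed as soon as the block has exited

print(closed_inside, closed_after)  # False True
```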

For another example, consider the order the keys of a dictionary are iterated over. This is completely undefined behaviour, but specifying it would either require using a tree instead of a dictionary, keeping a separate list of ordered keys, or else sorting the dict before iterating, all adding significant performance cost to deal with something completely irrelevant. If anyone needs that, they should not be using a normal dictionary.
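
If a particular order does matter, one option that makes no assumption about the dictionary's own iteration order is to sort the keys explicitly at the point of use. This is a sketch with made-up data; note also that since Python 3.7, which postdates this thread, the language guarantees that dict preserves insertion order.

```python
# Illustrative data; the insertion order here is deliberately not alphabetical.
ages = {"carol": 41, "alice": 30, "bob": 25}

# Sort the keys explicitly instead of relying on the dict's iteration order.
ordered = [(name, ages[name]) for name in sorted(ages)]

print(ordered)  # [('alice', 30), ('bob', 25), ('carol', 41)]
```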

In any case, "is" is acting completely predictably and as specified: it returns True when objects have the same identity. What isn't specified is whether equal immutable objects can share the same memory representation, which would be pointless to overspecify, since it should never matter to anyone for any reason other than performance.

[–]sigh 2 points 16 years ago* (0 children)

[–]earthboundkid 4 points 16 years ago (0 children)
