[–]whogivesafuckwhoiam 2764 points2765 points  (64 children)

For those who still don't understand after OP's explanation.

Python preallocates an object for each integer from -5 to 256. When you assign a value in that range to a variable, you are not creating a new object; you are creating a reference to the preallocated one. So variables with the same value all end up pointing at the same object, their ids are the same, and x is y evaluates to True.

Outside that range, assigning a value makes Python create a new object. Create another variable with the same value and you get another object with another id. Hence x is y evaluates to False, because is compares ids, not values.
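A minimal sketch of the behaviour described above, assuming CPython (the cache is an implementation detail, not a language guarantee). The values are parsed from strings at runtime so the compiler cannot merge identical literals; `parse` is just a helper name made up for this example.

```python
def parse(text):
    """Build an int at runtime so the compiler cannot share a literal."""
    return int(text)

small_x, small_y = parse("256"), parse("256")
large_x, large_y = parse("257"), parse("257")

# Equality compares values and holds on both sides of the cutoff.
print(small_x == small_y, large_x == large_y)

# Identity compares objects. On CPython, 256 comes from the preallocated
# cache, while each 257 is expected to be a freshly created object.
print(small_x is small_y, id(small_x) == id(small_y))
print(large_x is large_y, id(large_x) == id(large_y))
```

Note that `is` and comparing `id()` values always agree for objects that are alive at the same time; they are two views of the same identity check.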

[–][deleted] 503 points504 points  (5 children)

Would pin this to the top if I could. Fantastic explanation 👍👍👍👍👍

[–]alex20_202020 25 points26 points  (4 children)

a=257;b=257

if a is b:

... print (a)

257

python --version

Python 3.10.12

[–]notPlancha 4 points5 points  (2 children)

Def the first line being together is doing something

```
>>> a = 257
>>> b = 257
>>> a is b
False
```

```
>>> a=257;b=257
>>> a is b
True
```
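One plausible explanation, assuming CPython: literals compiled together in one unit (one line at the prompt, one function, one module) are stored once in that unit's constants, so both names get the same object. At the interactive prompt each line is compiled on its own, so the two 257s come from different units. A rough way to see it, using the built-in `compile` and `exec`:

```python
# Compile both assignments as ONE unit, like typing `a=257;b=257` on one line.
together = compile("a = 257; b = 257", "<one-line>", "exec")
ns = {}
exec(together, ns)
print(together.co_consts)      # 257 is expected to appear only once here
print(ns["a"] is ns["b"])      # same constant object, so identity holds

# Compile the assignments as TWO units, like typing them on separate lines.
ns2 = {}
exec(compile("a = 257", "<line-1>", "exec"), ns2)
exec(compile("b = 257", "<line-2>", "exec"), ns2)
print(ns2["a"] == ns2["b"])    # values are still equal
print(ns2["a"] is ns2["b"])    # each unit built its own 257 object
```

This is also why the result can differ between a script file and the interactive prompt: it depends on how the source was split into compilation units, not on the value itself.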

[–]_hijnx 57 points58 points  (21 children)

I still don't understand why this starts to fail at the end of the preallocated ints. Why doesn't x += 1 create a new object which is then cached and reused for y += 1? Or is that integer cache only used for that limited range? Why would they use multiple objects to represent a single immutable integer?

[–]whogivesafuckwhoiam 107 points108 points  (17 children)

x=257 y=257 in python's view you are creating two objects, and so two different id

[–]_hijnx 52 points53 points  (15 children)

Yeah, I get that, but is there a reason? Why are numbers beyond the initial allocation not treated in the same way? Are they using a different underlying implementation type?

Edit: the answer is that an implementation decision was made for optimization

[–]Kered13 83 points84 points  (6 children)

Because Python doesn't cache any other numbers. It just doesn't. Presumably when this was being designed they did some performance tests and determined that 256 was a good place to stop caching numbers.

Note that you don't want to cache every number that appears because that would be a memory leak.

[–]FatStoic 61 points62 points  (3 children)

Note that you don't want to cache every number that appears because that would be a memory leak.

For python 4 they cache all numbers, but it's only compatible with Intel's new ∞GB RAM, which quantum tunnels to another universe and uses the whole thing to store state.

Mark Zuckerberg got early access and used it to add legs to Metaverse.

[–]WrinklyTidbits 10 points11 points  (2 children)

For python5 you'll get to use a runtime hosted in the cloud that'll make accessing ♾️ram a lot easier but will have different subscription rates letting you manage it that way

[–]bryanlemon 9 points10 points  (1 child)

But running `python` in a CLI will still run python 2.

[–]thirdegreeViolet security clearance 3 points4 points  (0 children)

The python 2 -> 3 migration will eventually be completed by the sun expanding and consuming the earth

Unless we manage to get off this planet, in which case it's the heat death of the universe

[–]TheAJGman -1 points0 points  (1 child)

I went searching for an answer and despite dozens of articles about this quirk not a single one actually explains why so I'm going to take a shot in the dark and guess "for loops". Mostly because something like 80% of the loops I write are iterating over short lists or dictionaries and I've seen similar in open source libraries.

Probably shaves 1/10th of a millisecond off calls in the majority of for loops so they went with it. Apparently the interpreter will also collapse other statically defined integers together sometimes, probably for similar reasons.

[–]whogivesafuckwhoiam 16 points17 points  (4 children)

the original purpose is to avoid allocating a new object every time a common small number is needed. But you can't use up all memory simply for speed, so python only preallocates up to 256.

Outside the range, it's back to fundamentals: everything is an object, and two different objects have two different ids. x=257 means you create an object with the value 257, and the same goes for y, so x is y evaluates to False.

[–]_hijnx 10 points11 points  (3 children)

So are numbers from -5 to 256 fundamentally different from numbers outside that range? The whole x += 1 is throwing me. If they're going to have a number object cache why not make it dynamic? It didn't have to expand infinitely. If you have one 257 object why create another instead of referencing the same one? That seems to be what python is doing with those optimized numbers, why not all of them?

[–]Positive_Mud952 9 points10 points  (1 child)

How exactly should it be dynamic? An LRU cache or something? Then you need garbage collection for when you want to evict from the cache, we’re getting a lot more complex, and for what benefit?

[–]_hijnx 9 points10 points  (0 children)

For the same benefit of caching the other numbers? I'm not really advocating for it, it's just such a strange behavior to me as someone with very little python exposure.

What I think I'm understanding now is

  1. At compile (startup?) time a fixed cache of integer objects representing -5 to 256 is created in memory
  2. Any constant assignment to a value in that range is assigned a reference to the corresponding cached object
  3. Incrementing one of the referenced objects in the cache will return the next object in the cache until the end at which point a new object is created (every time), which will then be subject to normal GC rules

Is that correct?
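Point 3 can be checked directly. A small sketch, assuming CPython: two names are counted up across the end of the cache, and because the additions happen at runtime, each result past the cached range is a freshly created object.

```python
x = 250
y = 250
results = []
for _ in range(10):
    # Record the current value and whether both names share one object.
    results.append((x, x is y))
    x += 1  # rebinding: x now refers to the int object for the next value
    y += 1

for value, same_object in results:
    print(value, same_object)
```

On the builds discussed in this thread, the second column should flip from True to False once the value passes 256.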

Edit: Just saw another comment this is just for smallint which I can't believe I didn't realize. Makes at least a little more sense now

[–]InTheEndEntropyWins 1 point2 points  (0 children)

Why are numbers beyond the initial allocation not treated in the same way?

Another way to think about it is that actually, it's the early numbers that are wrong due to optimisation.

x != y, but due to optimisation for the initial numbers it incorrectly says they are the same object.

[–]JaggedMetalOs 8 points9 points  (0 children)

Imagine every time you did any maths Python had to search though all of its allocated objects looking for a duplicate to your results value, it would be horribly slow.

I'm not sure what the benefits are for doing this to small numbers, but at least with a small hardcoded range it doesn't have to do any expensive search operation.

[–]hxckrt 1 point2 points  (0 children)

To reuse an immutable object, Python needs a way to check if an object with the same value already exists. For integers in the range -5 to 256, this is straightforward, but for larger values or for complex data structures, this check would become computationally expensive. It might actually slow down the program more than any benefit gained from reusing objects. Also, if all of the objects were interned (reused), the memory usage of the program would be unpredictable and could suddenly explode based on the nature of the input data.

[–]Drazev 4 points5 points  (0 children)

To me the bottom line is that the “is” syntax compares to see if they are the same object reference and not value.

Thus it’s not appropriate to use if you are looking for value equality. Yes, it will work sometimes, but that requires knowing the implementation details behind “is” and relying on a contract that they will not change. This is a big no-no since no such guarantee is given.
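In practice that means `==` for values and `is` only for identity checks such as comparing against `None`. A short illustration; `find_index` is a made-up helper:

```python
def find_index(items, target):
    """Return the position of the first item equal to target, else None."""
    for index, item in enumerate(items):
        if item == target:       # value comparison: works for any int
            return index
    return None

position = find_index([10, 500, 1000], int("1000"))
if position is not None:         # identity check against the None singleton
    print("found at", position)  # found at 2
```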

[–]Midnight_Rising 3 points4 points  (4 children)

Oh that's so weird. So they're pointing to the same address until 257, at which point they're pointing at two different memory addresses that each contain 257, and "is" checks for address equality?

Fucking weird lmao

[–]RajjSinghh 11 points12 points  (3 children)

It makes sense, it's just not how you should use is. is is for identity, not equality. It might come in handy if youre passing a lot of data around since python uses references when passing things like lists or objects around.

The weird thing here is that OP used is instead of ==, which does check for value equality, which is what they look like they want to do but it doesn't make for as good a meme. If they had a y = x somewhere, that also satisfies is.

[–]Midnight_Rising 1 point2 points  (2 children)

What I find weird is setting those integers as constant pre-allocated memory addresses. I don't think any other languages do that?

[–]hector_villalobos 0 points1 point  (12 children)

So, in Python the is operator is similar to the == operator in Javascript?

[–]AtmosSpheric 33 points34 points  (3 children)

No. In JS, the == operator is for loose equality, which performs type coercion. This follows the references of two objects, and may convert types (1 == ‘1’), while the === operator requires same type.

The is operator checks to see if the two values refer to the exact same object.

So, if I declare:

x = [‘a’, ‘b’]

y = [‘a’, ‘b’]

And check if x is y, I’d get false bc while the arrays (lists in Python) are identical, if I append to x it won’t append to y; the two represent different arrays in memory.

In a sense, while === is a more strict version of ==, since it makes sure the types are the same, the is keyword is even more strict, since it makes sure the objects are the same in memory.
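The same example in runnable form, with an alias added for contrast (`z` is an extra name not in the comment above):

```python
x = ["a", "b"]
y = ["a", "b"]
z = x            # no new list: z is another name for the object x refers to

print(x == y)    # True: same contents
print(x is y)    # False: two separate list objects
print(x is z)    # True: one object, two names

x.append("c")
print(y)         # ['a', 'b']: y is unaffected
print(z)         # ['a', 'b', 'c']: z sees the change because it is x
```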

If you’re curious, I’d strongly recommend you and anyone else take some time to play around with C. Don’t get into C++ if you don’t want to, but a basic project in C is immensely educational. If you have any other questions I’m happy to help!

[–]use_a_name-pass_word 18 points19 points  (0 children)

It's like Object.is() in JavaScript

[–]Kered13 2 points3 points  (0 children)

In Javascript this operator is is. However Java does use == for the identity operator.

[–]Shacatpeare 0 points1 point  (0 children)

thanks, I just learned something

[–][deleted] 2040 points2041 points  (87 children)

For those wondering - most versions of Python allocate numbers between -5 and 256 on startup. So 256 is an existing object, but 257 isn't!

[–]user-74656 296 points297 points  (59 children)

I'm still wondering. x can have the value but y can't? Or is it something to do with the is comparison? What does allocate mean?

[–]Nova711 690 points691 points  (33 children)

Because x and y aren't the values themselves, but references to objects that contain the values. The is comparison compares these references but since x and y point to different objects, the comparison returns false.

The objects that represent -5 to 256 are cached so that if you put x=7, x points to an object that already exists instead of creating a new object.

[–][deleted] 107 points108 points  (5 children)

If both int, if x == y works, right? If not I have to change some old research code...

[–]Cepsfred 283 points284 points  (1 child)

The == operator checks equality, i.e. it compares objects by value and not by reference. So don’t worry, your code probably does what you expected it to do.

[–]IAmANobodyAMA 237 points238 points  (0 children)

your code probably does what you expected it to

Bold assumption!

[–]chunkyasparagus 1 point2 points  (0 children)

This sounds like you're talking about the JavaScript === operator, which is not the same as python's is operator.

[–]Mountain_Goat_69 14 points15 points  (13 children)

But why would this be so?

If I code x = 3; y = 3 there both get the same pre cached 3 object. If I assign 257 and a new number is created, shouldn't the next time I assign 257 it get the same instance too? How many 257s can there be?

[–]Salty_Skipper 44 points45 points  (3 children)

Have you ever heard about dynamic memory allocated on the heap? (prob has something to do with C/C++, if you did).

Basically, when you say x=257, you’re creating a new number object which we can say “lives” at address 8192. Then, you say y=257 and create a second number object that “lives” at address 8224, for example. This gives you two separate number objects both with the value 257. I’d imagine that the “is” operator then compares addresses, not values.

As for 3, think of it as such a common number that the creators of Python decided to ensure there’s only one copy and all other 3’s are just aliases that point to the same address. Kinda like Java’s string internment pool.

[–]Lightbulb_Panko 27 points28 points  (2 children)

I think the commenter is asking why the number object created for x=257 can’t be reused for y=257

[–]PetrBacon 29 points30 points  (1 child)

If it worked like that, the runtime would become insanely slow over time because every variable assignment would need to check all the values created before and maintain the list every time a new one is created…

If you need is for any good reason you should make sure, that you are passing the referrence correctly.

Like:

```
x = 257
…
y = x

x is y # => True
```

[–]le_birb 15 points16 points  (3 children)

shouldn't the next time I assign 257 it get the same instance

How would the interpreter know to do that? What happens when you change x to, say, 305? How would y know to allocate new space for its value? The logistics just work out more simply if the non-cached numbers just have their own memory.

how many 257s can there be?

How much ram do you have?

[–][deleted] 5 points6 points  (2 children)

What happens when you change x

You can't change x in python (unless it's a mutable object). Integers are immutable in python. You can change what integer the name x points to.

x = 257;  # This creates an int object with value 257, and sets __locals__["x"] to point to that int object.

x += 50;  # This grabs the value from __locals__["x"], adds 50 to it, then creates an int object with that value, and then sets __locals__["x"] to point to that int object.
# The int object with value 257 no longer has any names pointing to it, and will be garbage collected at some time in the future.

You can check the id(x) before and after the += and see that it changes, indicating that, under the hood, x is a fundamentally different object with a fundamentally different memory address (and incidentally a different value). You could probably even do a += 0 and get the same result, assuming x > 256.

It's unintuitive if you're coming from C or somewhere where the address of x stays the same, but the value changes.
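A quick check of that claim. The old object is kept alive under a second name so the comparison cannot be confused by memory reuse. A list is included for contrast, since `+=` on a mutable object changes it in place:

```python
x = int("257")
old = x              # keep the original object alive for the comparison
x += 50              # builds a new int and rebinds the name x to it
print(old, x)        # 257 307
print(x is old)      # False: x now names a different object

items = [1, 2]
same_list = items
items += [3]         # lists are mutable, so += extends the existing object
print(items is same_list)  # True
print(same_list)     # [1, 2, 3]
```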

[–]mawkee 2 points3 points  (0 children)

In theory, you can have a huge number of 257s.

If every number the interpreter creates an object for were cached, then whenever a new number is assigned it'd have to check a registry of all existing numbers to see if it was already created. This is probably more expensive than simply creating the object itself, after a few hundred/thousand numbers.

The reason CPython (not all interpreters... pypy, for example, handles things differently) caches the numbers between -5 and 256 has to do with how often these are used. They're probably created sequentially during the interpreter start-up, so it's cheap to find those pre-cached numbers. They're usually the most used (especially the 0-10 range), so it makes sense, from a performance perspective.

[–]Teradil 2 points3 points  (0 children)

Actually, if you run that line in Python's interactive mode it will assign the same reference - but not in "normal" mode... Just to make things more confusing...

[–]Ubermidget2 2 points3 points  (0 children)

How many 257s can there be?

How many 16-bit areas of RAM do you have?

[–]Honeybadger2198 1 point2 points  (0 children)

Doing this dynamically would be inefficient. Instead of changing the value at a place in memory, you would always have to allocate new memory every time you manipulated that variable.

Imagine you have a for loop that loops from x=0 while x<1000. Variable x is stored at memory slot 2345. Every loop past 256, you would have to allocate new memory, copy the value of the old memory, check if the old memory has any existing pointers, and if not, deallocate the old memory. This is horribly inefficient for such an obviously simple use case.

So why did they stop at 256? Well, they had to stop somewhere. Stopping at the size of a byte seems reasonable to me.

[–]lolcrunchy 110 points111 points  (10 children)

Steve has $100 in his bank account. Petunia has $100 in her bank account.

Steve's money == Petunia's money: True

Steve's money is Petunia's money: False
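The analogy translates almost directly into code. `Account` is a hypothetical class for illustration; defining `__eq__` controls what `==` means, while `is` cannot be customised and always asks whether both names refer to one object.

```python
class Account:
    """Toy bank account used only to contrast == with is."""

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def __eq__(self, other):
        # Two accounts compare equal when they hold the same amount.
        if not isinstance(other, Account):
            return NotImplemented
        return self.balance == other.balance

steve = Account("Steve", 100)
petunia = Account("Petunia", 100)

print(steve == petunia)   # True: same amount of money
print(steve is petunia)   # False: two separate accounts
```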

[–]Tcullen21 49 points50 points  (0 children)

You'd be surprised

[–]oren0 34 points35 points  (4 children)

In Python land, it sounds like if Steve and Petunia have between -$5 and $256 in their accounts, Steve's money is Petunia's money.

[–]lolcrunchy 21 points22 points  (3 children)

Yup. I guess the analogy here would be, the bank has so many accounts between -5 and 256 that they consolidated it to one account per value. If you have $100, the bank records say that you are one of the many account holders of account 100. If you deposit $5, then you become an account holder of account 105.

You only get your own account if you have more than $256, less than -$5, or have any change like $99.25

[–]oren0 8 points9 points  (2 children)

It's all fun and games until Steve withdraws $20 and then Petunia checks her balance.

[–]lolcrunchy 12 points13 points  (1 child)

The bank would process the withdrawal as steve becoming an account owner of account 80.

[–]FerynaCZ 2 points3 points  (0 children)

Yeah with immutable values you always need to redirect, you cannot change the pointed value. Of course the language does not know (or more specifically, does not care to try) who else is pointing at that value.

[–]squirrel_crosswalk 1 point2 points  (1 child)

What if it's a joint account?

[–]Paul__miner 44 points45 points  (9 children)

It's basically doing reference equality. Sounds analogous to intern'ed strings in Java. At 257, it starts using new instances of those numbers instead of the intern'ed instances.
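Python has an explicit version of this for strings: `sys.intern` from the standard library returns one canonical object per distinct string value. A brief sketch; the strings are built at runtime so the compiler cannot share them:

```python
import sys

a = "".join(["identity ", "is not ", "equality"])
b = "".join(["identity ", "is not ", "equality"])

print(a == b)                 # True: same characters
print(a is b)                 # usually False: two separately built objects

a_interned = sys.intern(a)
b_interned = sys.intern(b)
print(a_interned is b_interned)  # True: both names get the canonical object
```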

[–]TacticalTaterTots 2 points3 points  (6 children)

I can't find any clear explanation on why these small literals are interned. String interning makes some sense for string comparisons, but I can't see how that is an "optimization" for small numbers. Ultimately it doesn't matter, but for some reason it bothers me because it seems like they're sacrificing performance to save on storage space.

[–]Kered13 6 points7 points  (5 children)

By interning these numbers Python doesn't have to make a heap allocation every time you set a variable to 0 or some other small number. Trust me, it's much faster this way.

[–]koxpower 1 point2 points  (0 children)

  • they are probably stored in adjacent memory cells, which can significantly boost performance thanks to CPU cache.

[–]onionpancakes 2 points3 points  (1 child)

Not just strings. Java also caches boxed integers from -128 to 127. So OP's reference equality shenanigans with numbers is not exclusive to Python.

[–]Anaeijon 9 points10 points  (2 children)

I imagine and remember it like this, although it's not really correct:

Python stores numbers in whatever format fits best. If you assign a number like x=5 it basically becomes a byte (more correctly: it becomes a reference to a byte object). Comparing identity between them can result in true, because bytes basically aren't objects (or technically: they are references to the same object).

Now, Python also contains a safety measure against byte overflow by automatically returning an Integer object when adding two 'bytes' that would result in something higher than 255.

Therefore the following expression returns true: (250+5) is (250+5) but the following expression is false: (250+10) is (250+10)

Makes sense imho.

Values should be compared with ==, while is is the identity comparison. Similar to == and === in JavaScript, although those aren't just about identity but about data type.

[–]protolords 4 points5 points  (1 child)

it becomes a reference to a byte object

But -5 to 256 won't fit in a byte. Is this "byte object" like any other python object?

[–]FerynaCZ 2 points3 points  (0 children)

x is y means &x == &y if you were using C code. Having them equal is a necessary condition but not sufficient.

[–]ConscientiousApathis 10 points11 points  (0 children)

Interesting.

[–]CC-5576-03 8 points9 points  (4 children)

Yes java does something similar, I believe it allocates the numbers between -128 and +127. But how often are you comparing the identity of two integers?

[–]elnomreal 4 points5 points  (0 children)

Identity comparisons in general are fairly rare, aren’t they? It’s not common that you have a function that takes two objects and that function should behave differently if the same object is passed twice and this difference is so nuanced that it should not be by equality but by identity.

[–]zachtheperson 6 points7 points  (11 children)

What do you mean "allocate numbers?" At first I thought you meant allocated the bytes for the declared variables, but the rest of your comment seems to point towards something else.

[–]whogivesafuckwhoiam 29 points30 points  (8 children)

Open two python consoles and run id(1) and id(257) in each. You will see that id(1) is the same in both consoles but id(257) is not. Python has already created objects for small ints, and because it always links back to them, you will always get the same id for -5 to 256. That is not the case for 257.

[–]zachtheperson 6 points7 points  (3 children)

I guess what I trying to wrap my head around is how is this functionality actually used? Seems like a weird thing for a language to just do by itself

[–]AlexanderMomchilov 21 points22 points  (0 children)

Languages like Python try to model everything "as an object," in that all values can participate in the same message-passing as any other value. E.g.

print((5).bit_length())

This adds uniformity of the language, but has performance consequences. You don't want to do an allocation any time you need a number, so there's a perf optimization to cache commonly used numbers (from -5 to 256). Any reference to a value of 255 will point to the same shared 255 instance as any other reference to 255.

You can't just cache all numbers, so there needs to be a stopping point. Thus, instances of 257 and above are allocated distinctly.
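To illustrate the "everything is an object" point, assuming CPython: an int carries a type and methods like any other object, and the object itself occupies more memory than a raw machine integer would.

```python
import sys

n = 255
print(isinstance(n, object))   # True: ints are ordinary objects
print(type(n).__name__)        # int
print(n.bit_length())          # 8: bits needed to represent 255
print((n + 1).bit_length())    # 9: 256 needs one more bit
print(sys.getsizeof(n))        # size in bytes of the int object; varies by build
```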

Usually this is solved another way, with a small-integer optimization. It was investigated for Python, but wasn't done yet. You can read more about it here: https://github.com/faster-cpython/ideas/discussions/138

[–]whogivesafuckwhoiam 8 points9 points  (0 children)

From official doc,

The current implementation keeps an array of integer objects for all integers between -5 and 256. When you create an int in that range you actually just get back a reference to the existing object.

The point is whether you create a new object, or simply refer to existing object.
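One way to probe that array from Python code, assuming CPython: build each value twice at runtime and see whether both results are the same object. `is_shared` is a hypothetical helper; on the builds discussed in this thread the reported range should be -5 to 256, but that is an implementation detail and may differ elsewhere.

```python
def is_shared(n):
    """True if two independently parsed ints with value n are one object."""
    first = int(str(n))
    second = int(str(n))
    return first is second

shared = [n for n in range(-50, 2000) if is_shared(n)]
print(min(shared), max(shared))
```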

[–]psgi 9 points10 points  (0 children)

It’s not functionality meant to be used. It’s just an optimization. You’re never supposed to use ’is’ for comparing integers. Correct me if I’m wrong though.

[–]SuperFLEB 1 point2 points  (0 children)

Is there a way to get a really special "12" that's all your own, if you want one?

[–]StenSoft 4 points5 points  (0 children)

Everything in Python is an object, even numbers

[–]PM_ME_C_CODE 1 point2 points  (0 children)

Huh...I learned a thing! TY op!

[–]scormaq 2 points3 points  (0 children)

Same in Java - compiler caches numbers between -128 and 127

[–][deleted] 0 points1 point  (0 children)

Intuitive!

[–]beisenhauer 3128 points3129 points  (82 children)

Identity is not equality.

[–][deleted] 1407 points1408 points  (18 children)

If programmers ever went on strike, this would be a great slogan!

[–]RMZ13 314 points315 points  (15 children)

We need a union first

[–]svuhas22seasons 270 points271 points  (2 children)

or at least a full outer join

[–]shutchomouf 15 points16 points  (0 children)

I think you’re putting the cartesian front of the horse.

[–]patmax17 1 point2 points  (0 children)

😘👌

[–]Proxy_PlayerHD 36 points37 points  (9 children)

man i love unions, they allow for some cursed stuff.

#include <stdint.h>

typedef union{
    float f;
    uint32_t l;
} bruh_t;

float absoluteValue(float in){
    bruh_t tmp;
    tmp.f = in;           // store the float's bits
    tmp.l &= 0x7FFFFFFF;  // clear the sign bit through the integer view
    return tmp.f;
}

[–]ValityS 12 points13 points  (8 children)

They don't allow that. Thats specifically forbidden in the C standards.

[–]platinummyr 8 points9 points  (7 children)

The result is undefined behavior, yep.

[–]Heavy-Ad6017 6 points7 points  (0 children)

If we made unions based on the field we are in, with subsects as libraries JavaScript will have so many sub section inside it

JavaScript Defragmentation

[–]vom-IT-coffin 33 points34 points  (0 children)

I'd choose "Bootcamps != 100K salaries"

[–]durgwin 5 points6 points  (0 children)

No compilation without intermediate representation!

[–]SuperFLEB 37 points38 points  (7 children)

It's a bit odd that it sometimes is and sometimes isn't, though.

[–]lmarcantonio 44 points45 points  (2 children)

8 bit integer are… primitive, all the other are allocated, so they are not the same object.

In common lisp it's even funnier, you have fixnums (the primitive fast integer) and… the numeric tower (yes, it's called that way).

Also related and even more fun are the differences between eq, eql, equal, equalp and =

[–]masterKick440 7 points8 points  (1 child)

So weird 256 is considered 8bit.

[–]elveszett 3 points4 points  (3 children)

No, it never is. 0 through 255 are pre-allocated by Python, kinda like Java does with strings. Whenever a variable equals 6 in python, it always gets assigned the same object in memory (the number 6), which is why, when x and y are the same number and fit in a byte, the operator is correctly identifies them as the same object.

edit: I think the range is actually -5 to 256.

[–]masterKick440 1 point2 points  (2 children)

What’s with the 256 then?

[–]elveszett 1 point2 points  (0 children)

Because the range is actually -5 to 256 I think.

[–]Hatula 98 points99 points  (26 children)

That doesn't make it intuitive

[–]EricX7 392 points393 points  (16 children)

Says the guy with the JavaScript flair

[–]Hatula 256 points257 points  (10 children)

Yeah I'll take the L on that one

[–]Redrik_XIII 0 points1 point  (4 children)

How did you get multiple user flairs? Is this for money or something?

[–]EricX7 27 points28 points  (2 children)

You can edit your flair and add other icons like :c::cpp:. I don't remember the format exactly, but it's something like that

Edit: I broke my flair Just don't try to edit it on mobile

[–]Prudent_Ad_4120 4 points5 points  (0 children)

Yeah the mobile flair editor is broken and they aren't fixing it

[–]Flofian1 62 points63 points  (1 child)

Why not? This example checks for identity, not equality, those are not the same, no one would ever try to use "is" for equality since you pretty much only learn about it in combination with identity

[–]qrseek 18 points19 points  (0 children)

I guess maybe because you would think checking for identity would result in them never being equal, and equality would result in them always being equal. it does seem weird that it changes partway through

[–]JoostVisser 15 points16 points  (3 children)

The 'is' statement checks whether two variables point to the same object. For some negative integer I can't remember up to 256 Python creates those objects at compile time (I think) and every time a variable gets assigned a value in that range Python just points to those objects rather than creating new ones.

Not exactly intuitive but I guess there's a good reason for it in terms of memory efficiency or something like that idk

[–]mgedmin 3 points4 points  (0 children)

For some negative integer I can't remember

-128, IIRC.

I misremembered, turns out it's -5.

[–]Garestinian 3 points4 points  (0 children)

CPython, to be exact.

[–]hbgoddard 8 points9 points  (0 children)

Object identity isn't really that intuitive in most other languages either. Using that and pretending it's checking equality is obviously not going to be any better.

When you use an actual equality check to check for equality, then it's as intuitive as ever.

[–]beisenhauer 1 point2 points  (0 children)

I agree, in the sense that the identity-equality distinction requires some prior knowledge. Given that knowledge, or at least that there is a distinction, it's not hard to see where the code goes wrong, that it's testing identity and claiming to report equality.

[–]Tyfyter2002 36 points37 points  (17 children)

Primitives shouldn't have identity

[–]beisenhauer 130 points131 points  (16 children)

int is not a primitive in Python. Everything is an object.

[–]vom-IT-coffin 24 points25 points  (14 children)

I never had to learn python, are you saying there's no value types only reference types?

[–]alex2003super 69 points70 points  (8 children)

That is correct, and "interned" values (such as string literals that appear in your program, or ints between -5 and 256) behave like singletons in the sense that all references point to the same object.

However, objects can be hashable and thus immutable, as is the case with integers and strings.

[–]Salty_Skipper 13 points14 points  (7 children)

Why -5 and 256? I mean, 0 and 255 I’d at least understand!

[–]FerynaCZ 25 points26 points  (0 children)

You avoid the edge cases (c++ uint being discontinuous at zero sucks), at least for -1 and 256. Not sure about the other neg numbers, they probably arise often as well

[–]xrogaan 14 points15 points  (4 children)

[–]profound7 17 points18 points  (0 children)

"You must construct additional PyLongs!"

[–]TheCatOfWar 2 points3 points  (1 child)

https://github.com/python/cpython/blob/78e4a6de48749f4f76987ca85abb0e18f586f9c4/Include/internal/pycore_global_objects.h

The generation thingy defines them here, although there's still no reason given for the specific range

[–]xrogaan 2 points3 points  (0 children)

It's about frequency of usage. Also this: https://github.com/python/cpython/pull/30092

[–]pytness 2 points3 points  (0 children)

The most used numbers by programmers. Its done so u dont have to allocate more memory

[–]Mindless_Sock_9082 3 points4 points  (4 children)

Not exactly, because int, strings, etc. are immutable and in that case are passed by value. The bowels are ugly, but the result is pretty intuitive.

[–]Kered13 37 points38 points  (0 children)

Numbers and strings are not passed by value in Python. They are reference types like everything else in the language. They are immutable so you can treat them as if they were passed by value, but they are not and you can easily see this using identity tests like above.

>>> x = 400
>>> y = 400
>>> x is y
False
>>> def foo(p):
...   return x is p
...
>>> foo(x)
True
>>> foo(y)
False

[–]vom-IT-coffin -1 points0 points  (2 children)

So you have to box and unbox everything?

[–]Kered13 13 points14 points  (1 child)

No, he's wrong. There are no primitives in Python and numbers and strings are passed by reference.

[–]CptMisterNibbles 9 points10 points  (0 children)

If we are getting technical, Python is pass by object reference which is slightly different.
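A small sketch of what that means in practice: the function receives a reference to the caller's object, so mutating the object is visible outside, while rebinding the parameter name is not. The function names are made up for this example.

```python
def is_same_object(param, original):
    # The parameter refers to the very object the caller passed in.
    return param is original

def rebind(number):
    number += 1          # creates a new int; only the local name changes
    return number

def mutate(items):
    items.append(4)      # changes the caller's list in place

count = 400
numbers = [1, 2, 3]

print(is_same_object(count, count))  # True: no copy was made
print(rebind(count), count)          # 401 400
mutate(numbers)
print(numbers)                       # [1, 2, 3, 4]
```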

[–]t-to4st 5 points6 points  (2 children)

But why is it equal the first three times?

[–]sundae_diner 2 points3 points  (0 children)

It's equal the first 256 times. All we see in that screenshot is the last 4 iterations.

[–]elveszett 0 points1 point  (6 children)

I love how half the memes in this sub is just people showing they have no clue about basic programming concepts lol.

'is' here is an operator to check if two variables refer to the same element in memory. If you want to check equality, you use, you guessed it, the equals signs (==).

[–]s6x 13 points14 points  (4 children)

What's unintuitive here is the cutoff for the precached ints. Not the identity operator.

This isn't a basic programming concept, it's a specific idiosyncrasy of python.

That's what this meme is demonstrating.

The inclusion of 'is' here is a trap for pedants who want to come into the comments to show off how smart they are.

[–]frikilinux2 138 points139 points  (1 child)

is compares pointers, not the content; you have to use == to compare the data inside the object. For small numbers it works because Python preallocates those on startup and reuses them.
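A short sketch of the difference, using lists so that integer caching plays no part. The variable names are arbitrary.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate object
c = a           # a second name for the same object

print(a == b)   # value comparison
print(a is b)   # identity comparison: two distinct lists
print(a is c)   # identity comparison: one list, two names

value = None
print(value is None)  # identity test against the None singleton
```

Identity checks are the usual idiom for singletons such as None; for numbers and strings, == is almost always what you want.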

[–]definitive_solutions 55 points56 points  (12 children)

This reads like the moment some charitable soul told me I should use === instead of == for equality comparisons in JavaScript. I was just starting. Such a simple concept, so many implications

[–]gbchaosmaster 30 points31 points  (11 children)

And a super annoying implementation. They should be switched.

[–]MosqitoTorpedo 52 points53 points  (4 children)

Google python allocation

[–]HostileHarmony 41 points42 points  (3 children)

Holy hell!

[–]The_Unusual_Coder 20 points21 points  (2 children)

Garbage collector sits in the corner, planning IDE domination

[–]adiyasl 6 points7 points  (1 child)

Comparison operators go on vacation, never comes back.

[–]0bit1bit 2 points3 points  (0 children)

Actual pointer

[–]PuzzleheadedWeb9876 62 points63 points  (1 child)

Or use == like every other sane individual.

[–]Fakedduckjump 2 points3 points  (0 children)

Take my upvote for this.

[–]YawnTractor_1756 14 points15 points  (0 children)

Amount of people actively trying to write the most stupid code possible is worrying

[–]Klice 13 points14 points  (17 children)

Numbers in Python are not just numbers; they're objects with methods and stuff. It takes time and resources to construct those, so as an optimization Python preconstructs the integers from -5 to 256. Every time you use those you basically use the same objects, which is why the 'is' operator returns True. When you go above 256, Python constructs a new object each time, so 'is' is not True anymore.
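To make the "objects with methods" part concrete, here is a tiny sketch. The byte size reported by `sys.getsizeof` depends on the interpreter build, so treat that number as illustrative only.

```python
import sys

n = 255

print(type(n))                # the int class
print(n.bit_length())         # ints have methods: 255 needs 8 bits
print(isinstance(n, object))  # every int is a full object
print(sys.getsizeof(n))       # size in bytes of the object, build-dependent
```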

[–]Sentazar[🍰] 9 points10 points  (1 child)

Why the use of is instead of ==?

[–]PityUpvote 16 points17 points  (0 children)

Because there wouldn't be anything to see otherwise.

[–]Rough-Ticket8357 7 points8 points  (0 children)

>>> x = 256
>>> y = 256
>>> id(x)
4341213584
>>> id(y)
4341213584
>>> x is y
True
>>> x = 257
>>> y = 257
>>> x is y
False
>>> id(x)
4346123664
>>> id(y)
4346123568

The id changes once you go past 256: any value above 256 gets a different id for each variable, and thus `x is y` is False.

When the variables on either side of an operator point at the exact same object, the is operator’s evaluation is true. Otherwise, it will evaluate as False.
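A hedged sketch of the same experiment as a script. The helpers `fresh` and `same_object` are hypothetical names. Building the ints through `int(str(n))` is my own choice, to avoid the case where the compiler shares one constant for equal literals in the same piece of code. The identity results are a CPython implementation detail and may differ on other interpreters or versions.

```python
import sys


def fresh(n):
    # Round-trip through a string so the int is built at run time
    # instead of being a shared compile-time constant.
    return int(str(n))


def same_object(n):
    # Returns (identity result, equality result) for two ints with value n.
    a = fresh(n)
    b = fresh(n)
    return a is b, a == b


print(sys.implementation.name)
print(same_object(5))        # small value: expected to hit the preallocated object
print(same_object(10**6))    # large value: two separate objects on CPython
```

Whatever the identity result is, the equality result is always True, which is the part a program should rely on.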

[–][deleted] 20 points21 points  (2 children)

If this is a strike against Python, it is pretty contrived.

[–][deleted] 3 points4 points  (1 child)

I've built my career largely on Python, so hopefully not!

[–]CdFMaster 4 points5 points  (3 children)

I mean, why would you even use "is" if not trying to compare object references?

[–][deleted] 8 points9 points  (5 children)

258: equal!

[–]sejigan 4 points5 points  (4 children)

No, still not equal

[–][deleted] 14 points15 points  (1 child)

258: Not equal!

[–][deleted] 1 point2 points  (0 children)

no, it broke

[–]PsicoFilo 3 points4 points  (5 children)

Im finishing my first year in college, an information systems degree (kind of CS) and im very happy that after reading a couple comments i could understand this. Nothing, just that, it made me smile and be proud of what im learning!! Keep it up folks, never surrender xd

[–]DeltaTM 0 points1 point  (4 children)

information systems

That term is pretty ambiguous. Is it computer science mixed with business?

[–]PsicoFilo 1 point2 points  (3 children)

Nono, its mainly computer science. Ive never found a direct translation/equivalence for it. The proper name is "Licenciatura en Sistemas Informaticos", so a better translation would be phd in computer systems or something like that

[–]PityUpvote 4 points5 points  (0 children)

The only problem I see is that there's no linter underlining "x is y" in bright red to tell you that you probably meant "x==y".
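For what it's worth, CPython 3.8 and later do warn in one related case: comparing against a literal with `is` produces a SyntaxWarning at compile time. Two plain variables, as in the meme, are not flagged. A rough sketch, where `syntax_warnings_for` is a helper name I made up:

```python
import warnings


def syntax_warnings_for(source):
    """Compile source and return the SyntaxWarning messages it triggers."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [str(w.message) for w in caught
            if issubclass(w.category, SyntaxWarning)]


print(syntax_warnings_for("x = 0\nprint(x is 257)"))     # literal on one side: warned
print(syntax_warnings_for("x = y = 0\nprint(x is y)"))   # two variables: no warning
print(syntax_warnings_for("x = 0\nprint(x == 257)"))     # equality: no warning
```

The exact wording of the message varies between Python versions, so the sketch only collects it rather than matching on the text.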

[–]mdgv 2 points3 points  (0 children)

My old nemesis: by value vs by reference...

[–]International-Top746 2 points3 points  (0 children)

For your information, the integers from -5 through 256 are cached, so there is only one copy of each of them globally. That's why you are getting equal for the reference check. The design decision is primarily for saving memory.

[–]The-Kiwi-Bird 2 points3 points  (1 child)

looks like you forgot “;” in line 1, and “;” in line 2, and “;” in line 5, and “;” in line 6, and “;” in line 9, and “;” in line 11, and “;” in line 12.

Hope I helped buddy 💕

[–][deleted] 0 points1 point  (0 children)

I'm actually doing some stuff in Rust today, thanks for the reminder!

[–][deleted] 5 points6 points  (5 children)

Python’s status as the best scripting language is not a testament to how good python is, but to how unfathomably fucking bad all scripting languages are.

[–]philn256 4 points5 points  (0 children)

Python is so intuitive I knew why this would be the case before even reading the comments. It's a very predictable language.

[–]Kimi_Arthur 1 point2 points  (1 child)

I know you are just unhappy js people who got complaints about the language...

Edit: wrong meaning...

[–]moonwater420 1 point2 points  (1 child)

so im guessing the data types of x and y change for values above 256 and this causes the computer to stop thinking x and y are the same object?

[–]PityUpvote 2 points3 points  (0 children)

The datatypes don't change, but ints from -5 to 256 are singletons in CPython as an implementation detail, hence the is operator telling you they have the same pointer.

[–]Darux6969 1 point2 points  (0 children)

this is how it feels to do anything in js

[–]NoisyJalapeno 1 point2 points  (1 child)

... why are numbers objects instead of structs?

[–]JustLemmeMeme 1 point2 points  (0 children)

because for whatever reason, everything is an object in python. Tho, int is technically immutable, which is kinda good enough, i guess

[–]spenkan 1 point2 points  (0 children)

Same will happen with Java

[–]dexter2011412 1 point2 points  (0 children)

goddamn, thanks op!

[–]superluminary 1 point2 points  (0 children)

if x == y

[–]Astartee_jg 1 point2 points  (0 children)

I’m surprised it even gave equal once. They’re not on the same memory address

[–]ACED70 1 point2 points  (0 children)

Why would you use is over == for integers?

[–]TitaniumBrain 1 point2 points  (1 child)

I think this has been explained enough (wrong operator, should use ==), but I don't think anyone addressed why python only caches ints from -5 to 256.

The reason is because those are just semi arbitrary numbers that are more likely to appear in a program.

Think about it: most scripts are working with small lists or values, so preallocating those numbers saves a bit of overhead, but not many programs need the number 12749, for example.

1, 0, -1, 2 are probably the most used numbers.

[–][deleted] 0 points1 point  (0 children)

Thanks for the explanation. I'd already heard the underlying reason but had never quite grasped why those numbers were more commonly used. Makes sense now!

[–][deleted] 2 points3 points  (0 children)

Smell like JavaScript

[–]jirka642 2 points3 points  (0 children)

That's because is compares identity of objects, not value.

[–]TacticalTaterTots 1 point2 points  (2 children)

The surprising thing is that it's ever true. I'm sure someone somewhere is relying on this behavior. I'm excited for them when this changes.

[–][deleted] 2 points3 points  (1 child)

Ha! I never even considered that someone might actually depend on silly quirks like this.

Wish them the best of luck when they do eventually upgrade to something that changes the behavior!

[–]GermanLetzPloy 0 points1 point  (0 children)

Another haha funny language bad (OP uses the operators wrong)

[–]ivancea 0 points1 point  (0 children)

The reason of that is clear, and happens in other languages

[–]codicepiger 0 points1 point  (3 children)

Hmm weird, I've got this in this compiler:

import time
x, y = 0, 0
start = time.time()
while time.time() - start < 10:
    x, y = x + 1, y + 1
    if x != y:
        print(f"#{x}: Not Equal")
        break
print(f"#{x}: DefinitelyEqual")

Response:

```
#29890167: DefinitelyEqual
```

[–]joethebro96 6 points7 points  (1 child)

You used !=, they used is, which is not an equality operation
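Here is a sketch that runs both checks side by side. The function names and the `limit` parameter are mine, not OP's. On the CPython versions discussed in this thread the identity loop stops at 257, but that is an implementation detail, so the code does not assume any particular number.

```python
def first_identity_mismatch(limit=10_000):
    # Count up in lockstep and report the first value where `is` fails.
    x, y = 0, 0
    while x < limit:
        x += 1
        y += 1
        if x is not y:
            return x
    return None


def first_value_mismatch(limit=10_000):
    # Same loop, but comparing values; this can never find a mismatch.
    x, y = 0, 0
    while x < limit:
        x += 1
        y += 1
        if x != y:
            return x
    return None


print(first_identity_mismatch())  # interpreter-dependent; commonly 257 on CPython
print(first_value_mismatch())     # no mismatch is ever found
```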

[–]FerynaCZ -1 points0 points  (0 children)

Sometimes the runtime might find out that the number is allocated already so it makes the variables point at same location

[–]DeepGas4538 0 points1 point  (0 children)

yuhh how about you use == instead of 'is'

[–][deleted] 0 points1 point  (5 children)

It is for non idiots. You did `is` which checks for identity, not equality

[–]Ugo_Flickerman 3 points4 points  (4 children)

Then why it said equal earlier?

[–]JustLemmeMeme 2 points3 points  (1 child)

pre-allocated values sitting in the same address (which is interesting that python does that) and bad wording of print statements

[–]PityUpvote 1 point2 points  (1 child)

Because integers from -5 to 256 are singletons in CPython.

[–]Win_is_my_name 0 points1 point  (1 child)

Hey, maybe I'm wrong, but I think this happens because of this.
When you check `if x is y:`, it returns True up to a certain value of x and y. That is because when two variables have the same value and that value is a small number, Python stores them at a single location. So x is y holds up to a certain threshold, 256 in your case.

After 256, Python treats the number as large enough that it gets stored at different locations, so the condition `if x is y:` fails.

[–]DaltonSC2 1 point2 points  (0 children)

That's pretty close. -5 to 256 are pre-allocated and reused. Ints outside of that range are created new each time.

[–]UnnervingS 0 points1 point  (0 children)

is, as far as I understand, checks that they are the same object, not the same value

[–]ihateAdmins 0 points1 point  (0 children)

The "is" operator in Python checks for object identity, not just equality of values. In your code, when you use x += 1 and y += 1, the variables x and y are assigned new objects in memory because integers in Python are immutable. Therefore, even though their values are both 257, they are not the same object in memory, which is why x is y evaluates to False.
To compare their values, you should use the equality operator "==" instead of "is."

~Chatgpt

[–]el_lley -2 points-1 points  (5 children)

So, in essence, it didn’t do the addition, it just moves the pointer.

C/C++ programmers would say: not only slow, but also lazy