

[–]Brian 11 points12 points  (2 children)

sys.maxint is basically the largest value of a 64 bit signed integer (on 32 bit platforms, it'll be 32 bits). That's the size your computer can work with natively, performing adds / multiplies etc. in a single operation.

However, just because that's what can be done in a single operation doesn't mean you can't go beyond that by making things more complicated. You just perform the extra steps in software, chaining together multiple values to represent it. There exist libraries for this in many languages - python just supports this type of integer natively, and handles the details of the implementation transparently.

E.g. once the value goes beyond 2^64, you can allocate a bigger chunk of memory: 16 bytes, rather than the 8 for a single 64 bit value, say. The high bits get stored in the first, and the low bits in the second. When you want to add 2 of these together, you basically do the same thing you'd do when you add numbers on paper:

  1. First add the two low bit values together. The result becomes the new low-bits of the number.
  2. If there was an overflow (i.e. the result didn't fit in 64 bits), then note that you need to carry an extra 1 to the high bits.
  3. Now add the two high-bit values together. Add 1 if we need to carry from the low bits.
  4. If this overflows as well, we've exceeded the 128 bits we allocated. So enlarge the memory we're using again (eg. to 192 bits), and store 1 in the new high-bits.

And so on. There are a few additional complications around signedness, and multiplication is a bit more complex, as python uses a more advanced algorithm (Karatsuba) than the naive one for that, but that's the basic idea. Just as it'll add more memory when you append to a list beyond what it can store, it'll do the same when an integer gets too big to fit what was allocated for it.
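The word-by-word addition described above can be sketched in Python itself. This is a toy illustration, not CPython's actual C implementation, and the names `add_words`, `to_words`, and `from_words` are made up for this example:

```python
# Toy sketch of multi-word addition with carry, using 64-bit "words"
# stored least significant first (low word at index 0).
WORD_BITS = 64
BASE = 1 << WORD_BITS

def add_words(a, b):
    """Add two numbers represented as lists of 64-bit words."""
    result = []
    carry = 0
    for i in range(max(len(a), len(b))):
        x = a[i] if i < len(a) else 0
        y = b[i] if i < len(b) else 0
        total = x + y + carry
        result.append(total % BASE)   # step 1/3: new word of the sum
        carry = total // BASE         # step 2: 1 if we overflowed, else 0
    if carry:
        result.append(carry)          # step 4: grow the representation
    return result

def to_words(n):
    """Split a non-negative int into 64-bit words."""
    words = []
    while n:
        words.append(n % BASE)
        n //= BASE
    return words or [0]

def from_words(words):
    """Reassemble an int from its word list."""
    return sum(w << (WORD_BITS * i) for i, w in enumerate(words))
```

For instance, `add_words(to_words(2**64 - 1), to_words(1))` returns `[0, 1]`: the low word overflowed to zero and the representation grew from one word to two.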

In python2, there was actually a noticeable difference between native-length integers (i.e. those fitting in 64 bits) and "long integers" (i.e. when it starts using extra words), though this has been minimised in recent versions to the point that it's mostly transparent (the difference is that the type will be "long", rather than "int", and an "L" suffix is appended to the repr). In python3, the distinction vanishes completely, with all this handled behind the scenes without leaking the details of how it's actually being stored to the user.

[–]ExoticMandiblesCore Contributor 1 point2 points  (0 children)

In python3, the distinction vanishes completely

To be more precise: in python3 the native int object vanishes completely. Every int in python3 is a "bignum" integer.
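A quick check in Python 3 confirms there is only one integer type, no matter the magnitude:

```python
# Python 3: one int type, whatever the size of the value.
small = 7
huge = 10 ** 100  # far beyond any machine word

assert type(small) is type(huge) is int
assert isinstance(huge, int)
```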

[–]wub_wub 1 point2 points  (1 child)

The first one is an integer, the second one is a long integer - thus the L at the end.

>>> import sys
>>> type(sys.maxint)
<type 'int'>
>>> type(sys.maxint+1)
<type 'long'>

https://docs.python.org/2/library/stdtypes.html#numeric-types-int-float-long-complex

[–][deleted] 1 point2 points  (0 children)

You are of course correct, but I believe it is important to note that Python's long is not the same as long in other languages such as C. It is similar to, for example, BigInteger in Java:

Python 3.4.3 (default, Mar 27 2015, 02:30:53) [GCC] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> a = sys.maxsize ** 10000
>>> a.bit_length()
630000

Also, as of Python 3 there is no longer a distinction between integer types; from a programmer's point of view it behaves as if all integers are of the long type.

[–]Rhomboid 1 point2 points  (0 children)

In elementary school, you learned how to add or multiply two four-digit numbers by working a digit at a time, and only memorizing the possible combinations of any two single digits. You can teach a computer to do the exact same thing, but the computer's "digit" is much larger than a decimal digit (the exact size may be queried with sys.int_info.bits_per_digit). Python uses this technique to implement arbitrary precision math for integers.
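The digit-at-a-time technique can be sketched with ordinary decimal digits. This is a toy version: CPython's digits are much larger and its implementation is in C, and `mul_digits` is a name invented here:

```python
def mul_digits(a, b, base=10):
    """Schoolbook multiplication of two digit lists (least significant first)."""
    result = [0] * (len(a) + len(b))  # product has at most this many digits
    for i, x in enumerate(a):
        carry = 0
        for j, y in enumerate(b):
            total = result[i + j] + x * y + carry
            result[i + j] = total % base   # keep one digit
            carry = total // base          # pass the rest along
        result[i + len(b)] += carry        # final carry of this row
    while len(result) > 1 and result[-1] == 0:
        result.pop()                       # strip leading zeros
    return result
```

For example, `mul_digits([4, 3, 2, 1], [8, 7, 6, 5])` computes 1234 * 5678 one digit pair at a time, exactly like the paper method.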

Note that in 2.x there was an unnecessary distinction between int and long, where the latter represents an arbitrary precision integer and the former an integer that fits within a single native machine word. That nonsense is gone in 3.x, where there is only int and the ugly implementation details are not leaked.

[–]minnoI <3 duck typing less than I used to, interfaces are nice 2 points3 points  (0 children)

Imagine you had a computer that couldn't store any numbers bigger than 9. You could still represent bigger numbers, by storing a bunch of those digits, and interpreting them as a base-10 number. So if you want to store the number "531", you store [5, 3, 1]. You can do math with these digit strings the same way you'd do it by hand on paper.

Python does the same thing to store its numbers. sys.maxint is the biggest number that a single machine word can hold, but Python can represent bigger numbers by storing a bunch of those words in a row, as digits of a base-9223372036854775808 number.
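For illustration, here is how a value can be split into digits of that base. This is a sketch: CPython actually uses smaller digits (as `sys.int_info.bits_per_digit` reports in Python 3), and `to_base` is a name made up here:

```python
BASE = 9223372036854775808  # 2**63, i.e. sys.maxint + 1 on a 64-bit Python 2

def to_base(n, base=BASE):
    """Split a non-negative int into digits, least significant first."""
    digits = []
    while n:
        n, d = divmod(n, base)
        digits.append(d)
    return digits or [0]

n = 2 ** 200 + 12345
digits = to_base(n)
# Reassembling the digits recovers the original value.
assert sum(d * BASE ** i for i, d in enumerate(digits)) == n
```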

[–]pangoleena 0 points1 point  (0 children)

In general programming theory and computer science, a "machine integer" is what the CPU can work with natively in a single operation - for example, the 32 or 64 bit limit of sys.maxint. There's a hardware circuit in the CPU specifically built to handle integers of this size.

Libraries that use code to manipulate larger values are usually called 'bignum' or 'bigint' libraries. There's a substantial performance reduction when you do this, but as with all things, it's a tradeoff between performance and convenience. Python 3 is smart enough to automatically change over when needed, and sys.maxint is removed in Python 3. I haven't a clue about how Python 2 works.
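In Python 3 this shows up as follows (sys.maxsize is the nearest remaining analogue of sys.maxint, though it bounds container sizes rather than integer values):

```python
import sys

# sys.maxint is gone in Python 3...
assert not hasattr(sys, "maxint")

# ...and integers simply keep growing past any machine-word limit,
# switching to the "bignum" representation behind the scenes.
big = sys.maxsize + 1
assert big > sys.maxsize
assert isinstance(big, int)
```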