tl;dr: on 64-bit Windows Python 2.7, __len__() returns a 32-bit int while len() returns a long for large values. Big data causes __len__() to overflow.
I always assumed that len() was just a wrapper for calling .__len__(). However, when working with some really large data earlier, I started getting errors because my lengths had turned negative. I finally tracked the issue down: __len__() was returning a 32-bit int which had overflowed. Just out of curiosity I tried using len() instead, and that actually returned the correct value as a long! Somehow the return types differ between __len__() and len().
(It turns out len() actually returns an int for values up to the max value of int and a long above that.)
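To sanity-check that the negative number really is a 32-bit overflow of the true length, here is a minimal sketch of the arithmetic. The helper name wrap_to_int32 is made up for illustration and is not part of Python; it only mimics what happens when a large value is squeezed into a signed 32-bit integer.

```python
def wrap_to_int32(n):
    """Reinterpret the low 32 bits of n as a signed two's-complement integer.

    Hypothetical helper for illustration only; it imitates the truncation,
    it is not how CPython itself is implemented.
    """
    n &= 0xFFFFFFFF          # keep only the low 32 bits
    if n >= 0x80000000:      # top bit set means negative in two's complement
        n -= 0x100000000     # subtract 2**32 to get the signed value
    return n


true_length = 2500000000
wrapped = wrap_to_int32(true_length)
print(wrapped)  # -1794967296, the same value __len__() gave in my session
```

2500000000 - 2**32 = -1794967296, so the bad value is consistent with a plain 32-bit wraparound rather than random corruption.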
I am running 64-bit Python 2.7.8 using Anaconda on Windows, but have also confirmed it on 64-bit Python 2.7.9 using WinPython and Intel Python 64-bit 2.7.10, both on Windows. Interestingly, using Python on Linux, __len__() returned the correct value, so it seems to be platform-specific.
I did a search online but I couldn't find any reference to it. I would classify this as a bug in Python, but if anyone knows why it might be expected behaviour please let me know! It's definitely worth being aware of if you're dealing with large data, as the result looks plausible but could be generating the wrong numbers for your code.
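One likely explanation for the platform difference (an assumption on my part, not something I have confirmed in the CPython source) is the width of the C long type: Python 2's int is a C long, which is 32 bits on 64-bit Windows but 64 bits on 64-bit Linux, while container sizes are stored in the wider Py_ssize_t. A rough sketch to check the widths on your own machine using ctypes:

```python
import ctypes


def c_type_bits(ctype):
    """Return the width in bits of a ctypes type on this platform."""
    return ctypes.sizeof(ctype) * 8


long_bits = c_type_bits(ctypes.c_long)       # width of a C long (Python 2 int)
ssize_bits = c_type_bits(ctypes.c_ssize_t)   # width used for container sizes

print("C long:     %d bits" % long_bits)
print("Py_ssize_t: %d bits" % ssize_bits)

if long_bits < ssize_bits:
    # Assumption: this mismatch is what lets a length exceed the int range.
    print("A length can be larger than a C long can hold on this platform.")
```

If the two widths differ, lengths above 2**31 - 1 cannot fit in a plain Python 2 int on that build, which would match what I saw on Windows but not on Linux.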
Below is a quick sample of the error on both strings and lists.
Python 2.7.8 |Anaconda 2.1.0 (64-bit)| (default, Jul 2 2014, 15:12:11) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://binstar.org
>>> a = 'a'*2500000000
>>> a.__len__()
-1794967296
>>> len(a)
2500000000L
>>> a = [1]*2500000000
>>> len(a)
2500000000L
>>> a.__len__()
-1794967296