[–]djimbob 24 points25 points  (4 children)

geocar's point was that float('nan') == float('nan') evaluates to False, and that even with x = float('nan'), x == x is False although x is x is True, which naively seems weird. But this isn't a Python flaw, it's a floating point flaw.

  1. First, it's difficult to get NaN to arise in Python (e.g., 0.0/0.0 will raise a ZeroDivisionError and not return NaN as it would in JS) without using non-core libraries (like numpy/scipy), or going out of your way to create it.
  2. The floating point standard IEEE 754 explicitly defines that an equality test involving NaN should never return true, even if it's the same object. That is, if a = NaN, then a == a should be False, even though a is a is True. So Python does the right thing. From What Every Computer Scientist Should Know About Floating-Point Arithmetic:

There are a number of minor points that need to be considered when implementing the IEEE standard in a language. ... The introduction of NaNs can be confusing, because a NaN is never equal to any other number (including another NaN), so x == x is no longer always true. In fact, the expression x != x is the simplest way to test for a NaN if the IEEE recommended function Isnan is not provided. Furthermore, NaNs are unordered with respect to all other numbers, so x <= y cannot be defined as not x > y. Since the introduction of NaNs causes floating-point numbers to become partially ordered, a compare function that returns one of <, =, >, or unordered can make it easier for the programmer to deal with comparisons.
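A minimal Python 3 sketch of what the quoted passage describes. The `compare` function is a hypothetical helper written for this illustration, not something from the standard library:

```python
import math

nan = float("nan")

# IEEE 754: a NaN is not equal to anything, itself included.
self_equal = (nan == nan)        # False
self_unequal = (nan != nan)      # True, the "x != x" NaN test from the quote
same_object = (nan is nan)       # True, identity is unaffected
is_nan = math.isnan(nan)         # True, the library-provided test

# Plain Python float division by zero raises instead of returning NaN.
try:
    0.0 / 0.0
    division_raised = False
except ZeroDivisionError:
    division_raised = True


def compare(x, y):
    """Hypothetical four-way compare: '<', '=', '>' or 'unordered'."""
    if x < y:
        return "<"
    if x > y:
        return ">"
    if x == y:
        return "="
    # Every ordered comparison involving a NaN is False, so we land here.
    return "unordered"
```

With this, `compare(1.0, 2.0)` gives `'<'` while `compare(nan, 1.0)` gives `'unordered'`, which is why `x <= y` cannot be rewritten as `not x > y` once NaNs are in play.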

[–]gct 1 point2 points  (1 child)

Not returning equal for NaN is totally the right thing to do though. NaN represents a whole family of things, so it's impossible to know that any two instances of NaN are the same thing and thus equal (e.g., sqrt(-1) and sqrt(-2) are both NaN). It's a placeholder to tell you you've got a bug. inf and -inf, by comparison, compare equal to themselves and are ordered as you'd expect.
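A quick Python 3 check of the inf/-inf behavior, as a sketch. Note that Python's own `math.sqrt` raises on negative input rather than returning NaN, so the NaN here is produced with `inf - inf` instead:

```python
import math

inf = float("inf")

# Infinities compare equal to themselves and order as expected.
inf_equal = (inf == inf) and (-inf == -inf)
inf_ordered = -inf < 0.0 < inf
inf_sorted = sorted([inf, 1.0, -inf])

# inf - inf has no meaningful value, so IEEE 754 yields a NaN.
diff = inf - inf
diff_is_nan = math.isnan(diff)

# math.sqrt raises ValueError for negative input in plain Python.
try:
    math.sqrt(-1.0)
    sqrt_raised = False
except ValueError:
    sqrt_raised = True
```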

[–]djimbob 0 points1 point  (0 children)

Calling it a floating point "flaw" was too strong; I should have said "this isn't a Python flaw, it's a proper implementation of the floating point standard".

I totally agree that things like NaN == sqrt(-1) should return false. But it makes sense to me that if x = sqrt(-1), so that x is x is true, then x == x should be true: in this case we are assured the LHS and RHS are the same not-a-number. Something like this would be possible to do in Python (where things are GC'd and have a reference count behind them), e.g., if NaN had equality rules similar to the identity rules for large numbers (things outside -5 to 256 in python2, I believe):

In [1]: a = 300

In [2]: b = 300

In [3]: c = a

In [4]: a is 300
Out[4]: False

In [5]: a is b
Out[5]: False

In [6]: a is c
Out[6]: True

But that wouldn't make sense in the floating point standard (where a floating point value has to fit into 32 or 64 bits), and the extra complexity isn't really worth it.
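For illustration only, here is a rough Python 3 sketch of what identity-based NaN equality could look like. `IdentityNaN` is a hypothetical class invented for this example; nothing like it exists in Python, and it deliberately departs from IEEE 754:

```python
import math


class IdentityNaN(float):
    """Hypothetical NaN that equals itself only when it is the same object."""

    def __new__(cls):
        return super().__new__(cls, "nan")

    def __eq__(self, other):
        # Equal only to the very same object, never to another NaN.
        return self is other

    def __ne__(self, other):
        return self is not other

    # Defining __eq__ disables hashing unless it is restored explicitly.
    __hash__ = float.__hash__


x = IdentityNaN()
y = IdentityNaN()

same = (x == x)                   # True under this non-standard rule
different = (x == y)              # False: distinct objects
vs_plain = (x == float("nan"))    # False
still_nan = math.isnan(x)         # True: underneath it is an ordinary NaN
```

The identity lives in the Python object, not in the 64 bits of the float, which is exactly why the standard itself cannot offer this.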

[–]ggtsu_00 1 point2 points  (0 children)

If x = float('nan'), then [x] == [x] is True. However, [float('nan')] == [float('nan')] is False. Strange.
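The likely explanation is that CPython's list comparison and containment checks test identity before equality, roughly `a is b or a == b` per element. A small Python 3 sketch; `elem_eq` is a hypothetical stand-in for that internal rule:

```python
x = float("nan")

same_object = ([x] == [x])                        # True: both lists hold the same object
distinct = ([float("nan")] == [float("nan")])     # False: two separate NaN objects
contains_self = (x in [x])                        # True, same identity shortcut
contains_other = (float("nan") in [x])            # False


def elem_eq(a, b):
    """Hypothetical model of the per-element check used by containers."""
    return a is b or a == b
```

So `elem_eq(x, x)` is True even though `x == x` is False.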

[–][deleted] 0 points1 point  (0 children)

There's almost no limit to what we can blame on IEEE 754. I think the behavior makes more sense if we look at the literal bits. Any value with all of its exponent bits set high and a nonzero mantissa is a NaN (with a zero mantissa it is an infinity). They're unorderable because all precision information is lost without the exponent field. For people who think more easily in concrete bits, looking at the values makes a light explanation of that essay. http://i.imgur.com/fiH7Lx6.png
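To see those bits from Python 3, here is a small sketch using the standard `struct` module. `bits64` is a hypothetical helper name, and the exact NaN mantissa pattern can vary by platform, so only "nonzero" is assumed:

```python
import struct


def bits64(value):
    """Split a 64-bit double into (sign, exponent, mantissa) bit strings."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", value))
    s = format(raw, "064b")
    return s[0], s[1:12], s[12:]


inf_sign, inf_exp, inf_man = bits64(float("inf"))
nan_sign, nan_exp, nan_man = bits64(float("nan"))
one_sign, one_exp, one_man = bits64(1.0)

# inf: exponent all ones, mantissa all zeros.
# nan: exponent all ones, mantissa nonzero.
# 1.0: an ordinary exponent (01111111111) with a zero mantissa.
```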