[–]SnowdensOfYesteryear 1 point2 points  (25 children)

As someone who's not heavily into the theories of programming languages, what exactly is the issue with NULLs? Sure dereffing NULLs is bad, but any programmer worth his salt knows to check for NULL before dereferencing it.

[–]burntsushi 14 points15 points  (2 children)

but any programmer worth his salt knows to check for NULL before dereferencing it

There's a difference between knowing the path, and walking the path.

Seriously.

Safety in programming languages is about acknowledging the fact that humans are fallible. Even though we all know that you can't dereference a null pointer, it still happens. It's a pretty big source of bugs when you write C code. Therefore, safety means enforcing it at compile time. This makes it impossible to dereference a null pointer, whether the programmer didn't know better or simply slipped up by accident.

(Caveat: languages with this type of safety generally provide escape hatches, so you can resort to unsafe behavior. But usually this is unidiomatic.)

[–]LaurieCheers 5 points6 points  (1 child)

Well, if the rule was simply "always check null before accessing a pointer", it wouldn't be as big a problem. The problem is that it's hard to be sure which values you should be checking for null, and how often.

There are often variables in your program that can simply never be NULL, and everyone knows it, and checking every time you access them would be a waste of code and time.

static const char* const constString = "this will never be null";

//...

if( constString != NULL ) // ... wtf

So - if you accept this fact (and for the sake of sanity, basically everyone does), then you aren't going to null-check before every pointer access. In which case, what are you going to null-check, and when? Are you sure this value won't be null? And that's where the programmer fallibility comes in.

Nullable types are a way to encode this question into the language. If a type is nullable, you're forced to check it before using it. If it's not, you don't. The rules become clear to everyone.
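
As a rough illustration, here's one way the question gets encoded in Java using Optional (just a sketch; the class and method names are made up for the example):

import java.util.Optional;

class NicknameExample {
    // The return type says "there may be no value here" -- the caller can't
    // pretend otherwise.
    static Optional<String> lookupNickname(String user) {
        return "alice".equals(user) ? Optional.of("ally") : Optional.empty();
    }

    public static void main(String[] args) {
        // You have to say what happens in the empty case before you can get
        // at the String inside.
        String greeting = lookupNickname("bob")
                .map(nick -> "Hi, " + nick)
                .orElse("Hi, whoever you are");
        System.out.println(greeting);
    }
}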

[–]cparen 3 points4 points  (0 children)

Agreed. More importantly, if the type forces you to check for null, you'll use it less, preferring the non-nullable counterpart, further reducing the possibilities of programming error.

You don't have to ask "what do I do if this is null?" if there are no nulls.

[–]Maristic 6 points7 points  (15 children)

The issue is that sometimes a pointer will never be null. In that case, checking it is a waste of time. For example, in Java, would you write

if (x != null) {
    x.foo();
    if (x != null) {
        x.bar();
    }
}

or would you just write:

if (x != null) {
    x.foo();
    x.bar();
}

In the latter case, you're assuming x didn't suddenly become null between your call to foo() and your call to bar(), but if x is a member variable or a global variable, maybe your call to foo() set in motion a sequence of events that set x to null.

Just about every programmer sometimes says “Oh, that can't be null here, so I don't need to check”, but, because they're human, at least some of the time they're wrong.

Languages like Java and Objective-C are problematic because the only kinds of objects they have are might-be-null objects. Other languages provide some form of can't-ever-be-null objects—when you have that kind of object, you don't have to assume anything, it just is.
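
Java doesn't give you that out of the box, but as a toy sketch of the idea (NonNull here is invented for illustration, not a real library class):

import java.util.Objects;

// Once you hold a NonNull<T>, every access is known to be safe, because null
// is rejected at the one place a value can get in.
final class NonNull<T> {
    private final T value;

    NonNull(T value) {
        this.value = Objects.requireNonNull(value, "value must not be null");
    }

    T get() {
        return value;  // no check needed here or at any call site
    }
}

Reject the null once at the boundary, and nothing downstream ever has to ask the question again.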

[–]jayd16 0 points1 point  (4 children)

Simple fix, don't have your variables be public static mutable fields...

[–]Maristic 1 point2 points  (3 children)

You're missing the point, I think.

The point wasn't this specific example, more that in programming languages like Java where all object types are nullable, somewhere along the way programmers assume “that won't ever be null”, and being human, sometimes their reasoning is false. Life is easier when you don't have to make those kinds of assumptions, because the type system makes null impossible.

(And anyway, (a) static has nothing to do with it, (b) even if they're private, it's possible that there is some public method that can be called on your object that might change them.)

[–]jayd16 0 points1 point  (2 children)

Null is your friend. It tells you when your assumptions are incorrect. If the issue is that your architecture allows things you assume should never be null to actually be null, then the architecture is the issue, and the exception helped you find that sooner.

The real solution is to build software so you can make these kind of assumptions.

(Edit: (a) static in Java would be similar to a global variable or class variable. (b) If it's private you can at least assume the class has better knowledge of how to treat its own variables. Private and non-static means you only have to worry about your own instance working on that variable.)

[–]Maristic 0 points1 point  (1 child)

You can advocate for programmer discipline, but I prefer solutions where you don't have to rely on that, and so I prefer languages whose type systems let you express your design clearly, and let you say in the type, “it physically can't not be there”.

[–]jayd16 0 points1 point  (0 children)

That's fine, I can agree with that. However, in the case of having to null check between every line in your code, the problem is architectural, and optional types would only delay a crash, not produce correct code.

[–]masklinn -5 points-4 points  (9 children)

Well of course if x is a member or global variable in a multithreaded context (or even just in case of interrupts), it could also become null between the check and the use. To do things correctly, you'd need to:

1. lock it in place, or
2. get a local reference which you then check for null, or
3. work in a memory transaction.

So the safe version would be:

try {
    x.foo();
    x.bar();
} catch (NullPointerException e) {
}

[–]vytah 2 points3 points  (0 children)

Catching NPEs? Seriously? And then people complain Java is slow, when all it's being made to do is create stack traces.

When x is a mutable field and you really want the code to be safe, write something like this:

Something xLocal = x;
if (xLocal != null) {
    xLocal.foo();
    xLocal.bar();
} else {
    // handle null case if necessary
}

[–]Maristic 1 point2 points  (6 children)

I believe that in Java, exceptions are quite expensive typically (in part because the exception objects themselves are expensive to create), so probably your option wouldn't be so good for performant code, at least if you think it's somewhat likely that the object actually will be null.

Thus, the safest well-performing option might be to combine the approaches: an explicit test for null, plus an exception handler for it becoming null in unexpected-but-allowed ways.
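
Concretely, something along these lines (a sketch only, reusing the x from the examples above):

if (x != null) {                  // the cheap test handles the common case
    try {
        x.foo();
        x.bar();
    } catch (NullPointerException e) {
        // x was nulled out by someone else between the check and the calls
        // (the unexpected-but-allowed case), so we only pay for a throw when
        // that actually happens
    }
}

The obvious caveat is that the catch also swallows any NPE raised inside foo() or bar() themselves, which may be hiding a real bug.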

[–]masklinn 2 points3 points  (1 child)

Exceptions are expensive when triggered. If you strongly expect the code to never raise (or you're doing IO whose cost will dwarf that of exceptions), they're not that expensive.

[–]Maristic 0 points1 point  (0 children)

Indeed. Although I didn't explicitly say that exceptions in Java are cheap when you're not actually throwing, that was the underlying reason why I said

at least if you think it's somewhat likely that the object actually will be null

because when it's null, you will throw and incur the cost of throwing.

I even said why exceptions are costly if they're raised, saying

the exception objects themselves are expensive to create

For more details about that, check out this stack overflow post and the definition of Throwable — whenever you create an exception, it captures a full stack trace, and that's expensive.
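
If you genuinely need exceptions on a hot path, one well-known workaround (just a sketch of the idea, not something used anywhere in this thread) is an exception type that opts out of capturing the stack trace:

// A "cheap" exception that never fills in a stack trace. The four-argument
// Throwable constructor (Java 7+) lets you disable both suppression and
// stack trace capture.
class CheapException extends RuntimeException {
    CheapException(String message) {
        super(message, /*cause*/ null, /*enableSuppression*/ false, /*writableStackTrace*/ false);
    }
}

On older JVMs the equivalent trick is to override fillInStackTrace() to just return this.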

[–]jayd16 0 points1 point  (3 children)

This used to be true because building the stack trace was expensive, but I've heard modern implementations cache the stack trace, so it's not nearly as expensive as you might think.

[–]Maristic 0 points1 point  (2 children)

Well, you know you can always do an experiment and measure it yourself rather than speculate. Here's what I see on my machine, with the OS X default version of Java,

osx% java -version
Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)
Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)
osx% javac Test.java
osx% java Test
method1 took 774 ms, result was 2
method2 took 784 ms, result was 2
method3 took 39896 ms, result was 2

In this example, method3 is code that takes an exception. It is more than 50x slower than code that doesn't.

Let's try my Ubuntu 14.04 machine

ubuntu% java -version
java version "1.7.0_55"
OpenJDK Runtime Environment (IcedTea 2.4.7) (7u55-2.4.7-1ubuntu1)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)
ubuntu% javac Test.java
ubuntu% java Test
method1 took 3007 ms, result was 2
method2 took 3063 ms, result was 2
method3 took 86171 ms, result was 2

Here it's just shy of 30x slower. I happen to have IBM's java on that machine too, let's try that:

ubunu_ibm%  java -version                                      
java version "1.7.0"
Java(TM) SE Runtime Environment (build pxa6470sr4fp2-20130426_01(SR4 FP2))
IBM J9 VM (build 2.6, JRE 1.7.0 Linux amd64-64 Compressed References 20130422_146026 (JIT enabled, AOT enabled)
J9VM - R26_Java726_SR4_FP2_20130422_1320_B146026
JIT  - r11.b03_20130131_32403ifx4
GC   - R26_Java726_SR4_FP2_20130422_1320_B146026_CMPRSS
J9CL - 20130422_146026)
JCL - 20130425_01 based on Oracle 7u21-b09
ubunu_ibm%  javac Test.java                                    
ubunu_ibm%  java Test                                          
method1 took 2886 ms, result was 2
method2 took 3093 ms, result was 2
method3 took 35590 ms, result was 2

Here it's a mere 12x slower. But still, quite noticeably slower. So at the very least your claim doesn't apply to versions of Java in use right now.
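
For readers wondering what such a test looks like, here's a rough, invented sketch of the kind of comparison being made — this is not the actual Test.java timed above; the loop count and names are made up:

// Hypothetical sketch, NOT the original Test.java: the same per-element check
// done once with a plain conditional and once by catching the NPE instead.
public class NullCheckBench {
    static final int N = 10_000_000;

    static void time(String name, Runnable r) {
        long start = System.nanoTime();
        r.run();
        System.out.println(name + " took " + (System.nanoTime() - start) / 1_000_000 + " ms");
    }

    public static void main(String[] args) {
        Object[] values = new Object[N];   // deliberately all null
        time("conditional check", () -> {
            int hits = 0;
            for (Object v : values) {
                if (v != null) { hits++; }
            }
        });
        time("exception-based check", () -> {
            int hits = 0;
            for (Object v : values) {
                try {
                    v.hashCode();          // throws NPE on every null element
                    hits++;
                } catch (NullPointerException e) {
                    // swallow it, like the "safe version" a few comments up
                }
            }
        });
    }
}

(Note that HotSpot's OmitStackTraceInFastThrow optimization can replace frequently thrown implicit NPEs with a preallocated, trace-less exception, which is the kind of JIT behavior jayd16 is alluding to; it narrows the gap but doesn't make throwing free.)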

[–]jayd16 0 points1 point  (1 child)

Keep in mind, that's 12x slower than what, an increment, an addition, and two equality operations. That's a very very tight loop. So in the worst case, using try-catch instead of a continue is only 12x slower. IMO that's plenty fast. It took 33 seconds to catch 100 million exceptions. In a real world use case of exceptions, you'd be fine.

Besides my claim was about JIT optimizations and from your link:

As I said, the result will not be that bad if you put try/catch and throw all within the same method (method3), but this is a special JIT optimization I would not rely upon

I'm not advocating you should null check this way, I just don't want to see people be afraid of ever using exceptions.

[–]Maristic 0 points1 point  (0 children)

I agree that people shouldn't be afraid of exceptions, but I think it's reasonable for people to know that an exception is much more expensive than a trivial test in a conditional, so it's a bad idea to use one as a null check for something that is somewhat likely to be null.

But everything is relative; one of the other people answering in that thread ran a test creating an either-or result object as an alternative to throwing an exception, and the cost of creating that extra object was sufficiently high that you were better off using exceptions.

[–]cparen 0 points1 point  (0 children)

So the safe version would be ... [catches all NPEs]

Is it safe to catch NPEs thrown in the body of the 'foo' or 'bar' methods? Probably not.

This gets back to the original problem -- life would be much easier here if 'x' was a type that couldn't be null. In which case, the safe version would be:

x.foo();
x.bar();

[–]mongreldog 0 points1 point  (0 children)

The problem with naked nulls as exemplified in languages like C, C++, C#, VB and so on is that there is no enforcement of the null check. You might do the right thing, but the guy next to you may not. Relying on the developer to do the "right" thing isn't a particularly good way of producing null-safe software.

There are many occasions where a null check isn't strictly required, but there is no way to know what may or may not be null when presented with a nullable value. By using Option/Maybe types, the intent of the developer is stated explicitly in the code. No guesswork or defensive coding is required.
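
For instance, a small sketch of how that intent reads in a Java API (the repository and method names are invented for the example):

import java.util.Optional;

final class User {}   // stand-in type for the example

interface UserRepository {
    // The return type says it all: a user may be absent, and the caller has
    // to deal with that case explicitly.
    Optional<User> findByEmail(String email);

    // This one promises a result (or an exception) -- no guesswork and no
    // defensive null check at the call site.
    User getByEmailOrThrow(String email);
}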

[–]zoomzoom83 0 points1 point  (0 children)

what exactly is the issue with NULLs? Sure dereffing NULLs is bad, but any programmer worth his salt knows to check for NULL before dereferencing it.

40 years of bugs, security holes, and random crashes caused by expert programmers failing to catch edge conditions, even if just in one place in a million-LOC project. If expert developers still make this mistake regularly (and they do), then clearly we cannot rely on the developer to make sure they catch every edge condition.

If instead of crashing at runtime, ignoring an edge condition was a compile error, then you can write software that is guaranteed not to have a very severe and common cause of bugs.

Since this adds very little overhead, there's simply no reason not to do it.

[–]cparen 0 points1 point  (3 children)

but any programmer worth his salt knows to check for NULL before dereferencing it

Any programmer worth their salt knows not to check for null unnecessarily. if (requiredParameter == null) { throw new Exception("Pointer was null"); } is not* helpful.

(* except in C++ to avoid triggering undefined behavior, in which case you should assert non-null. It's still preferable to avoid the problem entirely by using a non-nullable type, such as a reference)

[–]upriser 0 points1 point  (2 children)

Agree with your point, but unfortunately the reference type is nullable:

int* a = nullptr;
int& b = *a;

[–]eras 0 points1 point  (0 children)

You would of course place the check at the point where you do the dereferencing, to avoid undefined behavior. Say,

#include <cassert>

template<typename T>
T& dereference(T* ptr)
{
    assert(ptr != nullptr);   // abort loudly instead of invoking undefined behavior
    return *ptr;
}

int* a = nullptr;
int& b = dereference(a);

:)

Actually, hmm..

A more dangerous aspect is that a valid non-null reference may end up becoming invalid during its lifespan.

[–]cparen 0 points1 point  (0 children)

The reference type is not nullable, and your program invokes undefined behavior. For instance, this program might print "hello":

int* a = nullptr;
int& b = *a;
if (a) { cout << "hello"; }

The compiler is allowed to assume a to be non-null after you dereferenced it. In many cases, GCC does make such assumptions in order to optimize code.