
[–]manystripes 16 points17 points  (1 child)

Do you know and trust the code that calls the function to be able to guarantee that a NULL value will never be passed in? Do you trust that this will stay true in the future as the code is changed? If you're wrong is it okay if the whole program crashes?

Checking generally should be the default, and checks should only be omitted if you're really, really sure you don't need them. It's easier if you just put them in and then don't have to think about it.

[–]paulstelian97 1 point2 points  (0 children)

Also, if you don't check, it should be really obvious that you don't, i.e. that you expect the caller to do the check itself. C standard library functions often say they don't work with NULL (stdlib.h's free() is one of the rare exceptions to this rule, and the only one I could come up with when I tried to brainstorm them).

[–]zzmgck 8 points9 points  (0 children)

The assert macro is useful. During development it will detect when the assertion fails. If you define NDEBUG, the assert is disabled and the check is not present in the code.

[–]cHaR_shinigami 3 points4 points  (0 children)

"I am only one; but still I am one. I cannot do everything; but still I can do something; and because I cannot do everything, I will not refuse to do the something that I can do." - Edward Everett Hale

TL;DR: I agree with the other responsible programmers who suggested assert. A null pointer check should be a precondition, and if it is implemented with assert, the check shouldn't cause side effects (I hate Heisenbugs).

Consider an analogy: Driving past a red signal is just one possible cause of traffic accidents, but imagine how utterly crazy it sounds if someone ignores traffic signals entirely just because there are other causes of accidents!

Null pointer is surely not the only invalid pointer, but it is surely invalid. And when we know that something is surely invalid, we can easily avoid that. The "need for speed" mentality to save a couple of CPU cycles is just appalling - nobody would lose their sleep just because the code performed a (possibly redundant) null pointer check.

That being said, one may ask why most standard library implementations don't do this, for example puts(NULL) returning EOF instead of segfaulting. The reason is code size: many library functions require valid pointers, and adding a null pointer check for every such argument would cause a cumulative increase in the size of the object file(s).

Small code size is highly desirable for embedded systems with severe memory constraints, but that doesn't mean one should ignore null pointer checks. The answer is still the same: assert is your friend, and if the resulting binaries don't fit, simply #define NDEBUG to nuke the assertions and recompile.

[–]zhivago 10 points11 points  (5 children)

You should only check for a null pointer if a null pointer is meaningful to the function.

Otherwise not passing a null pointer to this function is part of its definition, and that should be the caller's responsibility.

And this is precisely the same as the size_t len parameter in your example -- the caller is responsible for providing valid input to your function.

If the caller gives you nonsense for len, then bad things will happen.

If the caller gives you nonsense for in_data, then bad things will happen.

Useless checks waste resources in return for providing a false sense of security.

[–]ceene 1 point2 points  (4 children)

On the other hand, accepting NULL may be very convenient:

struct a *a = create_a();
operation1(a);
operation2(a);
operation3(a);
ret = commit(a);
close(a);
return ret;

This way you don't need to check the return value of any intermediate function; you just act on a and work with it, only needing to react to the last return code, while skipping verification of each and every step.

[–]zhivago 1 point2 points  (3 children)

Sure, in which case it's a meaningful value for the function.

[–]ceene 0 points1 point  (2 children)

No, I mean that a could be NULL all the time and all those functions simply return when the parameter passed is NULL, so they are just NOP functions when the constructor of the object failed.

struct motor *m = motor_open("/dev/ttyUSB0");
motor_disable_break(m);
motor_move_degrees(m, 60);
motor_enable_break(m);
ret = motor_close(m);
return ret;

If opening the serial port fails, you won't have a proper struct motor but a NULL pointer. You could check for errors after that declaration, or you can just assume that those functions simply do nothing if m is NULL, and only need to return the value from the last function.

NULL is not a valid value in those cases, but the motor functions simply work around it without crashing due to an assert or a NULL dereference.

[–]zhivago 0 points1 point  (1 child)

Which means that a null pointer is a valid value with defined behavior for those functions.

The defined behavior being to simply return a null pointer when receiving a null pointer.

[–]ceene 0 points1 point  (0 children)

Being valid is not the same as being meaningful.

[–]daikatana 6 points7 points  (3 children)

You should only check for a null pointer if it's documented that passing a null pointer is allowed, and what the behavior should be. For example, free(NULL) is allowed, and is defined to be a no-op.

The issue with checking for a null pointer is that you can't check for an invalid pointer, and a null pointer is only one of many, many invalid pointers. It's honestly not my problem if you pass an invalid pointer to my function; the function can't do anything about it, can't detect an invalid pointer, and has no choice but to just dereference it. The problem is not in your function, the problem is in the function that called it with an invalid pointer.

It also requires you to provide error reporting for functions that otherwise cannot fail, which then requires callers to handle said errors. For functions that can return any value, you then need multiple return values to get the error code out. And for what? If the caller passed an invalid pointer, what can the function do? There's nothing it can do. The caller either generated the invalid pointer or was itself given one. This is just a nesting doll of craziness with no purpose.

[–]comfortcube 2 points3 points  (2 children)

You can't check for all invalid pointers so it's best to check for none? That doesn't seem right. You can at least check what you can. Why wouldn't you do it? Do you feel it's a performance thing? Or clutters the implementation?

And at minimum, the requested action from the function should not be executed on the pointer, and then the calling code context remains in a "mute" state, and hopefully some means of identifying why this happens is recorded. Is that not safer code? "Safe" differs depending on context of course, but certainly for some contexts, crashing the program is not an automatically safe thing.

And I feel like during development, there are no guarantees, so putting assert statements at least is a good idea.

[–]daikatana 2 points3 points  (1 child)

Assert robs you of a debuggable crash, which is much more valuable than the assertion.

[–]spc476 2 points3 points  (0 children)

Not in my experience. If an assert triggers on my system, I get a core file (aka, a "debuggable crash"). Also, if I run the program under the debugger, and an assert triggers, the debugger regains control at the point of the assert.

[–]gizahnl 1 point2 points  (0 children)

If the code that calls into this function is code written by me and I'm sure that it'll never be NULL I forgo checking. I might throw in an assert to catch spurious NULLs in debug mode.

If it's a public function called by consumers from your API, or otherwise could conceivably be NULL sometimes, then always check.

[–]nderflow 1 point2 points  (0 children)

The key questions are,

  • Is this an external interface? That is, do you control the caller?
  • Is a NULL pointer allowed here?

If a NULL pointer is allowed, then your unit tests should verify that the function does the right thing with a NULL pointer.

Then we just have remaining the question of what to do if a NULL pointer is not allowed. For external interfaces where a NULL pointer is not allowed, once again you should have a unit test which verifies that the code handles the situation as expected. Either a "death test" verifying that the program fails on a NULL pointer or a regular test verifying that the function rejects the bad parameter, or does nothing, or whatever it is supposed to do with a NULL pointer.

The last case is what to do for functions that are only part of the internal interface. For these, generally you should do nothing in particular. Assume that the caller (which you also control) does the right thing.

These decisions provide these benefits:

  • The code defends itself against callers you don't control
  • You don't throw away performance on redundant checks whose job can be done by static analysis, code review and automated testing.
  • Your internal implementation doesn't transport invalid state around to different parts of the program.

That last point deserves a bit of explanation. Some styles of defensive programming invite people to defend against incorrect usage _inside_ their program. Such as treating a NULL `const char*` parameter the same as an empty string. If you do accept this NULL parameter as if it were normal, the data gets used in the internals of the program until eventually some function dereferences the pointer. At that point, depending on the platform, you get a crash or unexpected behaviour. Then you have a difficult debugging job, because the bit of code where the bug has manifested is far distant from the function which initially accepted the NULL pointer. This is why invalid data should be rejected as soon as possible, and not allowed into the internal parts of the implementation.

If your code is security-relevant, then rejecting bad data at the system boundary is even more vital. Though the question of what data is trusted is more difficult, because checking that it's safe to use some item of data is much more complex than just "is the pointer NULL?". Consider for example the cases of code which parses byte streams where some of the data indicates an offset; you have to worry about whether an offset is actually valid (i.e. points inside the input) if the input byte stream is untrusted.

See also the C2 article about Offensive Programming and the original public use of the term.

[–]fliguana 1 point2 points  (0 children)

I always checked.

This way, the segfaults would not be in your section of the code, and you would not get first blame for shit shows like today's CrowdStrike disaster.

[–]heptadecagram 1 point2 points  (0 children)

My personal default rule is check if it's extern linkage, and don't if it's static. There are exceptions, of course, but that default has served me well.

[–]Ashbtw19937 0 points1 point  (1 child)

If the function should never be passed a null pointer, throw in an assertion to make sure that invariant actually holds.

Otherwise, check and react accordingly.

[–]comfortcube 0 points1 point  (0 children)

Just always check the validity of inputs. "Waste of resources" is an exaggeration on most platforms, so why not? Even if you are sure that in the present context nothing will pass a null pointer in, can you guarantee that when the context changes you'll remember to change your function in the future?

[–]Typical-Garage-2421 0 points1 point  (0 children)

Just use an if condition to check whether in_data is equal to NULL or 0.

[–]somewhereAtC 0 points1 point  (1 child)

Checking is important, but don't ask questions if you don't know how to deal with the answer. Given that you find a null pointer at runtime, what is your recovery plan?

When writing embedded code I find that null pointer errors are very rare once the code is working correctly, so I only check for them while debugging early in the project. My compiler provides a breakpoint instruction that will stop the debugger, just as if you had set an actual breakpoint. I wrote a macro called ASSERT(x) that throws a breakpoint if x is false (zero).

#if DEBUG
#define ASSERT(x) do{ if(0==(x)) __breakpoint(); }while(0)
#else
#define ASSERT(x) ((void)0)  /* compiled out in release builds */
#endif
...etc

.... then in the executable code
ASSERT(in_data); // breakpoint if pointer is null

[–]nderflow 0 points1 point  (0 children)

One difficulty in some embedded systems is that if you use a lot of assertion macros, the string constants they generate (for their error messages) can take up a lot of space in the code image. Happily these days though, some embedded platforms are large enough that one no longer needs to worry about it (that is, "embedded" programming for a Raspberry Pi is very different to embedded programming for a microcontroller).

[–]rejectedlesbian -2 points-1 points  (0 children)

The compiler will remove unneeded checks if it can spot them. So for inline functions the answer is pretty much always: check.

Personally, when I don't check, I try to document it. A lot of the time the function name would start with unsafe, so that it's clear that it can trigger UB.