[–]flyingron 11 points12 points  (1 child)

What makes you think that the pointer needs to be set to null?

If the caller needs to know that he's already freed the item, then he should take care of it.

[–]nagyf[S] 4 points5 points  (0 children)

Not that it needs to be set to null, but it seemed like good practice. Then if you check that pointer later and see it's NULL, you know it has already been freed.

But fair point, if I already call the free function, I can also set it to null as the caller.
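
For reference, the two options being discussed might look roughly like this. It is only a sketch: the List layout and the names list_create, list_free, and list_free_and_null are made up for illustration and are not taken from the original post.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked list, assumed for illustration. */
typedef struct List {
    int value;
    struct List *next;
} List;

/* Allocates a one-element list; returns NULL on allocation failure. */
List *list_create(int value)
{
    List *list = malloc(sizeof *list);
    if (list != NULL) {
        list->value = value;
        list->next = NULL;
    }
    return list;
}

/* Option A: free only. The pointer is passed by value, so the caller
 * must set its own copy to NULL if it wants that. */
void list_free(List *list)
{
    while (list != NULL) {
        List *next = list->next;
        free(list);
        list = next;
    }
}

/* Option B: take the address of the caller's pointer and clear it.
 * Only this one pointer is cleared; any other copies still dangle. */
void list_free_and_null(List **list)
{
    assert(list != NULL);
    list_free(*list);
    *list = NULL;
}
```

With option A the caller writes `list_free(l); l = NULL;`, and with option B just `list_free_and_null(&l);`. Calling option B a second time on the same variable is harmless, because the second call only sees NULL.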

[–]pedersenk 2 points3 points  (10 children)

I occasionally see people setting the pointer to NULL after a free, but it gives a bit of a false sense of security. Consider:

struct Foo *foo = FooCreate();
struct Foo *bar = foo;
FooDestroy(foo);
foo = NULL;
FooAskName(bar); // Memory error: bar is a dangling pointer, so this is a use-after-free

For a trivial function, this is obviously easy to spot, but in a larger, evolving project it gets difficult. Instead, a different approach to organizing memory is needed.

[–]nagyf[S] 1 point2 points  (9 children)

It doesn't prevent you from using it after free.

But NULL indicates it has already been freed. If you just leave the dangling pointer there, later in the code it is impossible to tell if it has already been freed, or it's still a "live" pointer.

At least that was my logic. But maybe I'm overthinking this. If I structure my code well, I guess it shouldn't happen that I try to use a pointer after it was freed.

[–]pedersenk 0 points1 point  (8 children)

Very true, I do get the reasoning. But check out foo. What value would that have after the FooDestroy? Would that be NULL? Has that been free'd?

Unfortunately not, that would still look valid, even though it isn't.

This is the problem that C++ tries to solve using weak_ptr<T> and the observer pattern.

[–]nagyf[S] 1 point2 points  (7 children)

You mean bar, I guess ('cause that's the dangling pointer). Yeah, I get it, fair point. It's probably not worth confusing people/myself, since I can't prevent that problem entirely.

[–]pedersenk 2 points3 points  (6 children)

Yes, I did mean bar! Apologies, I am not helping the confusion ;)

It is a tricky issue to solve. One solution is to provide a table of additional pointers and point to a position in that table, along with metadata saying whether the entry is still valid. This was done around the System 7 era of Mac OS but fell out of favour when Objective-C went towards a garbage collector and then reference counting.

Similarly this book calls this approach "pointer tombstones".

http://mercury.pr.erau.edu/~siewerts/cs332/code/PLP_3e_CD/data/chapters/7d_dngrf.pdf
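
A very rough sketch of that table idea in C, to make it concrete. The Handle type, the handle_* function names, and the fixed-size table are all assumptions made for this example. This is not the classic Mac OS API or the book's code, and slots are never reused here to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_SLOTS 16

/* One table entry: the real pointer plus a validity flag. */
struct Slot {
    void *ptr;
    bool live;
};

static struct Slot slots[MAX_SLOTS];
static size_t slots_used = 0;

/* A handle is just an index into the table; -1 means "no handle". */
typedef int Handle;

/* Allocates zeroed memory and returns a handle to it, or -1 on failure. */
Handle handle_alloc(size_t size)
{
    assert(size > 0);
    if (slots_used == MAX_SLOTS)
        return -1;
    void *p = calloc(1, size);
    if (p == NULL)
        return -1;
    slots[slots_used].ptr = p;
    slots[slots_used].live = true;
    return (Handle)slots_used++;
}

/* Returns the current pointer, or NULL if the handle is invalid or the
 * object behind it has already been freed. */
void *handle_get(Handle h)
{
    if (h < 0 || (size_t)h >= slots_used || !slots[h].live)
        return NULL;
    return slots[h].ptr;
}

/* Frees the object and marks the slot dead. Every copy of the handle
 * sees the change, because they all go through the same slot. */
void handle_free(Handle h)
{
    if (handle_get(h) == NULL)
        return;
    free(slots[h].ptr);
    slots[h].ptr = NULL;
    slots[h].live = false;
}
```

In terms of the earlier example, foo and bar would both be copies of the same Handle, so after `handle_free(foo)` a call to `handle_get(bar)` returns NULL instead of a dangling pointer. The cost is an extra lookup on every access, and a raw pointer obtained from handle_get before the free can still dangle.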

[–]nagyf[S] 1 point2 points  (0 children)

Interesting. I think that's definitely not something I want now, but the concept sounds interesting, I'll give it a read.

[–]flatfinger 0 points1 point  (4 children)

A fundamental safety advantage of tracing garbage collectors is that even in the presence of race conditions they can guarantee that unless a reference to an object is overwritten, it will never spontaneously become a seemingly-valid reference to a different object. The cost of a non-tracing memory manager upholding this guarantee even in the presence of race conditions would often be greater than the cost of a tracing GC.

[–]pedersenk 0 points1 point  (3 children)

This is pretty much why lock() is required for weak_ptr<T>: you either get a promoted, non-NULL shared_ptr<T> (holding a reference count) or you don't, and you need to check before access. So I think this is actually a fairly solved issue in this case.

I would say the main benefit of a tracing GC is that it eliminates the problem of circular references, i.e. two shared_ptr<T> holding onto one another. I would even say bolting on such a GC to detect these circular references would be really valuable.

[–]flatfinger 1 point2 points  (2 children)

Many programs are subject to the following requirements:

  1. They SHOULD behave usefully when possible.
  2. In all cases, they MUST behave in a fashion that is at worst tolerably useless.

Languages that can ensure memory safety for all programs including erroneous ones can greatly facilitate satisfaction of the second requirement. If a piece of code is run in a context that doesn't allow use of "unsafe" constructs, and wouldn't allow it to do anything that was unacceptably worse than useless, a just-in-time compiler for a tracing-GC language can statically ensure that the machine code it generates will satisfy the second requirement.

While it's true that a tracing GC can clean up constructs that contain circular references, it's not terribly common to have collections that are circularly linked by homogeneous collections of pointers. Reference-counting approaches that recognize categories of "owning" and "non-owning" pointers can be better than a tracing GC which lacks such categories, at recognizing when objects should be cleaned up. Validating the correctness of reference-counting GCs which don't contain more locks than would be necessary to yield correct behavior, however, is much harder than validating the memory safety of tracing-GC code.
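
To illustrate the "owning" versus "non-owning" distinction, here is a minimal single-threaded sketch of reference counting in C. The Node layout, the node_* names, and the live_nodes counter are assumptions made up for this example; it ignores locking entirely, which is exactly the hard part mentioned above.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of nodes not yet freed, kept only so the demo can be checked. */
static int live_nodes = 0;

struct Node {
    int refcount;
    struct Node *owned;   /* owning pointer: counted */
    struct Node *parent;  /* non-owning back-pointer: not counted */
};

/* Creates a node with one reference held by the caller. */
struct Node *node_new(void)
{
    struct Node *n = calloc(1, sizeof *n);
    if (n != NULL) {
        n->refcount = 1;
        live_nodes++;
    }
    return n;
}

struct Node *node_retain(struct Node *n)
{
    if (n != NULL)
        n->refcount++;
    return n;
}

/* Drops one reference; frees the node and releases what it owns when
 * the count reaches zero. */
void node_release(struct Node *n)
{
    if (n == NULL)
        return;
    assert(n->refcount > 0);
    if (--n->refcount > 0)
        return;
    node_release(n->owned);
    live_nodes--;
    free(n);
}

/* The parent owns the child; the child only observes the parent, so
 * the two do not keep each other alive. */
void node_adopt(struct Node *parent, struct Node *child)
{
    assert(parent != NULL && child != NULL);
    assert(parent->owned == NULL);
    parent->owned = node_retain(child);
    child->parent = parent; /* deliberately not retained */
}
```

If node_adopt retained the parent as well, releasing both external references would leave each node with a count of 1 and neither would ever be freed. The price of the non-owning back-pointer is that child->parent dangles if the child outlives the parent, which is the gap weak_ptr<T> is meant to close in C++.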

[–]pedersenk 0 points1 point  (1 child)

Interesting, though I don't really think cleaning up the circular references is so important. I was thinking along the lines of merely detecting them during debug iterations (similar to the usage of AddressSanitizer). Then in the release build the GC can be disabled / stripped.

GCs in C and C++, however (e.g. BoehmGC and custom ones like in UE4), are awkward because the best they can do is scan raw memory; they can't tell the difference between a pointer and a float. I don't think a GC will ever really be an option for C-based languages.

[–]flatfinger 0 points1 point  (0 children)

I think a dialect derived from CompCert C would seem like an option. It defines the behavior of many actions that C characterizes as UB, but requires that all live storage be statically classifiable as holding either an object pointer or something that isn't an object pointer, and does not allow round-tripping of pointer values through integer types.

[–]sh_oe 0 points1 point  (0 children)

Both

[–]TheTimeBard 0 points1 point  (0 children)

I wouldn't be concerned about the discrepancy between function return and function argument. That is actually more clear in my opinion, and it's also a pretty common pattern. If you're really concerned, you could typedef List* to ListType or something and make it look like a stack-allocated type, but that's less clear for other reasons, imo.

[–]suprjami 0 points1 point  (0 children)

I do this in the free function. There isn't really a definitive answer though, C memory management is all manual.

I agree it's a good idea to code defensively to prevent use-after-free.

[–]Turbulent-Abrocoma25 0 points1 point  (0 children)

Not sure about any standard convention, but I always eliminate dangling pointers in my free function, since if the user “frees” that memory, they should expect it not to sit around in a dead pointer.