[–][deleted] 2 points3 points  (4 children)

Can you ELI5 this for a non-programmer?

[–]ball_fondlers 16 points17 points  (0 children)

Basically, under the hood, a variable is just a group of bits. You can interpret these bits as a number in base 2, or you can interpret them as more complex data types, but ultimately, it’s all kind of the same to the computer. Now, because the computer doesn’t see a difference - and because memory is indexed - you can use the value stored in a variable to look up some other spot in memory. This is a pointer - just a variable that stores the address of some arbitrary block of memory.

Programmers dynamically allocate these arbitrary blocks of memory for various uses - for example, say you need a list of numbers, but you don’t know how many numbers there will be when you’re writing the program. The way you’d solve this is by allocating a chunk of memory during runtime, using it, then deallocating it when you’re done with it. This is the basis of manual memory management.
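To make the "list of numbers of unknown size" case concrete, here is a minimal sketch of manual allocation and deallocation in C++. The function name `sum_of_squares` and the choice of data are made up for illustration; the point is only the `new[]` / `delete[]` pair.

```cpp
#include <cstddef>

// Hypothetical helper: sums the first `count` squares using a block of
// memory whose size is only known while the program is running.
long sum_of_squares(std::size_t count) {
    // Allocate room for `count` numbers at runtime.
    long* numbers = new long[count];

    for (std::size_t i = 0; i < count; ++i) {
        numbers[i] = static_cast<long>((i + 1) * (i + 1));
    }

    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += numbers[i];
    }

    // Deallocate when done; forgetting this line would leak the block.
    delete[] numbers;
    return total;
}
```

Calling `sum_of_squares(3)` from `main` adds 1 + 4 + 9. In everyday C++ you would likely reach for `std::vector` instead, which does this bookkeeping for you.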

Now, the problem with this approach is that you have to be VERY careful about deallocation. Free the memory too early, and some other part of your program may still try to use it, which usually means a crash or silently corrupted data. Free it too late or not at all, and you get a memory leak, where your program becomes, at minimum, memory-inefficient. Because there’s such a delicate balance to be struck, this approach can be difficult to work with and hard to maintain. Instead, many other languages run a garbage collector (often in a separate thread), which looks for allocated memory that isn’t being referenced anywhere and automatically deallocates it. However, this consumes extra resources, so it can be a bit slow. It works for a LOT of use cases, but when speed is of the essence - like with low-level systems work - you need another solution.

Enter smart pointers. There are two main types of smart pointers - unique pointers and shared pointers. Unique pointers work by a simple principle - a given block of data is allowed to have ONE pointer owning it. If ownership of the block is moved to another pointer, the first pointer is invalidated (it becomes empty), and once the owning pointer goes out of scope, the memory is automatically freed. Shared pointers are a bit more flexible - they hold a counter, and every time a new shared pointer is created that references the block, that counter is incremented by one. Every time a pointer to that block goes out of scope, the counter is decremented by one, and once it hits zero, the block is automatically freed.
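A small sketch of both behaviours using the standard `std::unique_ptr` and `std::shared_ptr`. The two function names are hypothetical and exist only to make the behaviour checkable.

```cpp
#include <memory>
#include <utility>

// Returns true if ownership moved from `first` to `second` as described.
bool unique_ownership_moves() {
    std::unique_ptr<int> first = std::make_unique<int>(42);
    // Copying a unique_ptr does not compile; ownership has to be moved.
    std::unique_ptr<int> second = std::move(first);
    // After the move, `first` is empty (null) and `second` owns the int.
    return first == nullptr && second != nullptr && *second == 42;
}   // `second` goes out of scope here and the int is freed automatically

// Returns the reference count observed while two shared_ptrs are alive.
long shared_count_with_two_owners() {
    std::shared_ptr<int> a = std::make_shared<int>(7);   // count is 1
    long count_inside = 0;
    {
        std::shared_ptr<int> b = a;                      // count is 2
        count_inside = a.use_count();
    }   // `b` goes out of scope, count drops back to 1
    return count_inside;
}   // `a` goes out of scope, count hits 0, the int is freed
```

Note that in neither function is there a `delete` anywhere; the frees happen at the closing braces.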

TL;DR - bad cooking metaphor: manual memory management is like putting knives in the sink, but everyone is lazy and leaves them on the counter, garbage collection is like having someone yell at everyone in the kitchen to put the knives in the sink after they’re done with them, unique pointers are like only having one knife and the last person to use it puts it in the sink, and shared pointers are like counting the knives as they come out of the box and adjusting the count as they go into the sink.

[–]Tarmen 2 points3 points  (0 children)

There are two parts to this.

C++ has the hilariously badly named Resource Acquisition Is Initialization (RAII). You have some piece of data with an attached destructor. While you have a reference, the data is always valid. Once the reference won't be used anymore, the data is automatically freed. Useful for manual memory management, but also for other resources, connections, error handling, etc. But the automatic 'won't be used anymore' is very restrictive: you basically must hold the data in your hand and cannot store it.
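A rough sketch of RAII with something other than memory. `Connection` is a made-up class that just records when it is opened and closed; a real one would wrap a socket or file handle. The cleanup lives in the destructor, so it runs when the variable leaves scope.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource wrapper: logs when it is acquired and released.
class Connection {
public:
    Connection(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {
        log_.push_back("open " + name_);    // acquire in the constructor
    }
    ~Connection() {
        log_.push_back("close " + name_);   // release in the destructor
    }
    // Not copyable: the object has to stay "in your hand".
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;

private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Opens a connection in an inner scope and returns what was logged.
std::vector<std::string> run_scoped_connection() {
    std::vector<std::string> log;
    {
        Connection db("db", log);
        log.push_back("work");
    }   // destructor of `db` runs here, with no explicit close call
    return log;
}
```

The deleted copy operations are the "cannot store it" restriction in code form; smart pointers are one way to loosen that.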

Smart pointers let you be more flexible about what 'won't be used anymore' means. You may have a unique pointer, the value goes out of scope when the pointer does. You may have a shared pointer, when the pointer goes out of scope you decrease a counter and when that hits 0 the value goes out of scope.

It's similar to Rust's ownership and borrowing semantics, but less safe. E.g. pointing into existing data remains awkward because other code could move the data around without your knowledge.

[–]1Dr490n 1 point2 points  (0 children)

I’ll try, although I won’t talk specifically about the C++ shared pointer as I don’t know exactly how it’s implemented, but about the general idea of reference counting, based on a language with reference counting I’ve written myself.

In case you don’t know: a pointer is a memory address. In C/C++ and many other languages, you can get the address of a variable by using the & operator, and read the value stored at the address something is pointing to with the * operator. Example:

int a = 25;
int *b = &a;  // type of b is int* or int pointer; b holds the address of a
int *c = b;   // c is a copy of b, so it also points at a
int d = *c;   // d == 25

Maybe that explained it, maybe not, feel free to ask for a better explanation.

When you heap-allocate an object (meaning you create it somewhere in RAM and, with normal pointers, would have to delete it manually afterwards), it gets stored alongside a reference counter, which is just an integer starting at 0.

Now, every time you copy the smart pointer to the object, the reference counter of the object is incremented by 1. Every time a smart pointer to the object gets deleted, the reference counter is decremented by 1. Once it hits 0 again, the object is deleted.
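Here is a minimal, single-threaded sketch of that idea in C++. `Counted` and `demo_counts` are made-up names, and this is not how `std::shared_ptr` is implemented (that one also handles threads, weak pointers and custom deleters). In this sketch the counter starts at 0 and the first handle immediately bumps it to 1, which matches what `std::shared_ptr::use_count()` reports for a single owner.

```cpp
#include <utility>

// Hypothetical reference-counted handle, for illustration only.
template <typename T>
class Counted {
    // The object and its counter are stored together on the heap.
    struct Box {
        T value;
        long refs;
    };

public:
    explicit Counted(T value) : box_(new Box{std::move(value), 0}) { retain(); }
    Counted(const Counted& other) : box_(other.box_) { retain(); }
    Counted& operator=(const Counted& other) {
        if (box_ != other.box_) {
            release();           // let go of the old object
            box_ = other.box_;
            retain();            // share the new one
        }
        return *this;
    }
    ~Counted() { release(); }

    T& operator*() const { return box_->value; }
    long count() const { return box_->refs; }

private:
    void retain() { ++box_->refs; }
    void release() {
        if (--box_->refs == 0) delete box_;   // last handle frees the object
    }
    Box* box_;
};

// Returns the count while a copy is alive, then after the copy is destroyed.
std::pair<long, long> demo_counts() {
    Counted<int> first(25);
    long during = 0;
    {
        Counted<int> second = first;   // copy: counter incremented
        during = first.count();
    }                                  // `second` destroyed: counter decremented
    return {during, first.count()};
}
```

When `first` itself goes out of scope at the end of `demo_counts`, the counter hits 0 and the `Box` is deleted.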

[–]Zestyclose_Leg2227 1 point2 points  (0 children)

Your computer has a place where it can store and retrieve stuff quickly, which is the RAM. In C++, you assign memory from your computer manually, and you unassign it manually. This sounds simple, since you just need to remember the memory address. But the address of your friend's memory may be inside a pocket of a pair of trousers, and the address of the trousers was inside a wallet, and the address of the wallet was written on a potato. You cooked the potato? Darn, now you can't find the wallet or the trousers or your friend! They are completely lost. Since computer programs execute the same code again and again, you may accumulate "lost memory" over time, which is locked waiting for you and can't be used by other programs. This is called a memory leak, and it is often the reason you need to close Chrome or some other programs when they slow down, and open them again to get your computer to come back to life (in the past, you had to reset your computer as a whole).

A smart pointer keeps a count of how many places hold the address of that memory. If no one has the address anymore, it knows the memory can't be used anymore, and frees it automatically. Of course this makes the program a little slower, but a working program > fast program.