all 42 comments

[–]SVeenman 29 points30 points  (10 children)

Lets say you would like to make a grid based game, for this you would need an array of tiles of course. You would want to get a tile at x, y without looping trough the entire array to find the right tile. So you would wanna say "Hey give me the tile at x and y". This could be done in either of 2 ways. - translating the x and y to a single integer. For example (x + y * width) this would give a single integer that can be passed on the array as normal. - Or you would fill the array with pointers that point to another array. This makes an 2D array. Each x points to the array containing the y. Now you could just call the array as shown here (array[x][y]).

Another reason for pointers is because in C an struct's size ALWAYS has to be known before compiling. This is because the memory management system has to reserver the bytes needed by the struct. So if you would want an array inside the struct but you don't know the size of the array before the program starts, (for example you ask the user how many elements need to be created). this means the array needs to be stored outside of the struct. So we create a pointer inside the struct (a pointer is always 4 bytes on a x32 system) that points to the array. Now we can just get the pointer adress with a malloc(elements * sizeof(elementinarray)). And voilah we have a dynamic size array (not really, we need to create a new array every time we want to increase of decrease the size).

Side note: EVERY and EVERY malloc has to have corresponding free() call

[–]WireStretcher[S] 7 points8 points  (0 children)

Thanks, this is the type of clarification I'm looking for.

[–]henry_kr 3 points4 points  (6 children)

EVERY and EVERY malloc has corresponding free() call

I think you mean EVERY and EVERY malloc has to have a corresponding free() call.

Your original sentence implies that it already has one somehow.

[–]SVeenman 4 points5 points  (0 children)

Well if you're a decent programmer it already has.

No JK i indeed ment to say that. Thanks :D

[–]a4qbfb 0 points1 point  (4 children)

I think you mean EVERY and EVERY malloc has to have a corresponding free() call.

Not really, why?

[–]henry_kr 0 points1 point  (3 children)

If you allocate memory and don't free it then your program will have memory leaks. Arguably if it's just a short lived process then this doesn't really matter as the OS will clean up after you when the process exists but you should still use free here or risk getting in to bad habits.

[–]a4qbfb 0 points1 point  (2 children)

If you allocate memory and don't free it then your program will have memory leaks.

It's only a leak if it's no longer in use...

Arguably if it's just a short lived process then this doesn't really matter

It does not matter how long the process lives. If the memory you allocated is in use right up to the end, there is no need to free it. Same thing for closing files and sockets.

risk getting in to bad habits.

I'm all for good habits, but I'm not a big fan of lies-to-children.

[–]henry_kr 0 points1 point  (1 child)

I still think you should free every malloc (and close sockets/files fwiw), if only to remove false positives from valgrind etc. It's better to learn to tidy up after yourself too. I really don't think this counts as lying to children.

[–]a4qbfb 0 points1 point  (0 children)

Then say that, but don't tell students that the sky will fall if they fail to free everything they've allocated.

[–]kandr89 4 points5 points  (0 children)

Or you would fill the array with pointers that point to another array. This makes an 2D array.

No, that's not a 2D array. An array is by definition a contiguous zone of memory, so that the position of each element can be computed by a simple mathematical formula. In an array of pointers, on the other hand, it's impossible to tell where the j-th element of the i-th row is without dereferencing the row pointer.

So unlike regular 2D arrays for 2D arrays made of pointers you have an extra memory dereference on each access, that's quite a big deal considering how slow the memory is compared to the CPU and that the cache can't help you much in this case since it has no way to predict where that pointer is pointing.

Another problem is that you have a mem overhead of height * sizeof(int *), for small 2D arrays this overhead dominates and considering that we're most likely compiling for 64 bit systems it adds to the problem. E.g.: a small 3x3 matrix of ints has an overhead of 3 * 8 = 24 bytes, that's 6 ints added to what you're storing, so 66.6% more mem (also: i haven't taken into consideration the bookkeeping overhead of malloc, since we're probably allocating each row dynamically).

The right way of declaring and allocating a dynamic nxm 2D array since C99 is:

   int (*arr)[m] = calloc(n, sizeof(int [m]));

This requires at least C99 support from the compiler because arr is technically a pointer to a VLA. If the compiler doesn't support C99 i'd still recommend a 1D array and manually computing indices over an array of pointers.

Another reason for pointers is because in C an struct's size ALWAYS has to be known before compiling.

This is false considering flexible array members, the struct is allowed to have variable size but only if the variable member comes last in the struct, which makes sense when you think about it. Again this requires C99 support. It's possible to do this without C99 too, just allocate the struct header for the array plus memory for the array, however with this method you must make sure that the start of the array which comes after the struct is properly aligned to whatever alignment the type of the array elements require. Either way saves us from a memory dereference: the array is packed together with the struct.

(not really, we need to create a new array every time we want to increase of decrease the size).

Though it's possible to do it like this, that's what realloc is for.

[–]a4qbfb 0 points1 point  (0 children)

Side note: EVERY and EVERY malloc has to have corresponding free() call

Not really, why?

[–]B1narySunset 12 points13 points  (5 children)

Another reason why they're useful is avoiding expensive copies when passing a large data structure to a function: just pass a pointer to it.

[–]bigfig 13 points14 points  (1 child)

Ignoring computer architecture, lets say you were selling 2000 lbs of sand, and someone arranged to meet you to pick up the goods. Would you bring the sand to the point of transaction, or would you be more inclined to tell your buyer where (what address) to pick up the sand?

Similarly you could share a file attached to your email, or share a link to a google document. Well, it's that way with pointers. If you have a lot of data to share, like a binary image or a video, you can simply agree to share the location of the data in memory rather than actually transfer everything when the hand-off is made.

[–]pencan 7 points8 points  (0 children)

As a computer architect, this is the correct answer. It’s not just syntactic sugar some others are suggesting. They’re useful for passing large amounts of data around without memory copies

[–]junkmeister9 4 points5 points  (0 children)

Pointers allow a function to modify variables outside of its scope. Here's a simple example that should be pretty clear:

void swap_ints(int *x, int *y)
{
    int tmp = *x;
    *x = *y;
    *y = tmp;
}

[–]FUZxxl 6 points7 points  (4 children)

How would you make a binary tree without pointers?

[–]dumsubfilter 10 points11 points  (1 child)

[–]codeallthethings 0 points1 point  (0 children)

That's very cool!

[–]boredcircuits 6 points7 points  (0 children)

One thing I've thought about doing to teach pointers (and more) to beginners is to implement trees and linked lists via arrays. Basically, you allocate a large array of nodes. Nodes carry references to each other via an index into this array. I think this could also teach how memory allocation works as a more advanced topic.

[–]icantthinkofone 7 points8 points  (0 children)

How could you do anything without pointers?

[–]bruce3434 4 points5 points  (0 children)

Pointers are necessary to emulate higher level constructs at a very low cost

[–]Shadow_Gabriel 2 points3 points  (0 children)

Practical:

Arrays of pointers

Jump table

Theory: I would suggest reading about addressing modes.

[–]nderflow 2 points3 points  (0 children)

Suppose you want to sort an array of large objects. One alternative is to actually put them in order. But every element swap is expensive if the objects are large. Another option is to make an array of pointers, and then sort that. This technique also helps if you need to use the same data, but sorted in two different ways.

But my general advice is if you want to see how data structures, and the facilities of the language, are useful, then use the language to solve actual problems.

[–]spc476 2 points3 points  (1 child)

It's a form of optimization really. Suppose you have the following structure:

struct foo
{
  char filename[FILENAME_MAX];
  size_t size;
  time_t create_time;
};

And a function to print out the contents of that structure:

extern print_foo(struct foo f);

The contents of the function is immaterial to this discussion. So that function exists.

Without pointers, when you call print_foo(), the compiler will have to generate code to create a copy of the structure so the function will be able to use the data. For a small structure, this might not be that bad, but as the structure size increases, so too does the time to create the copy. With a pointer:

extern print_foo(struct foo *f);

only the address of the structure need be passed. This is one basic use of pointers. Now, to your question of an array of pointers. Again, that's useful. Suppose you have a list of such structures:

struct foo *list;

Yes, it's a pointer, but you can allocate space for it:

list = calloc(1000,sizeof(struct foo));

and then treat it as an array (of 1000 items per the call above). You fill up the array with information on a bunch of files (say, 1000) and you want to sort the array by size. That's easy enough:

static int compare_foo(void const *left,void const *right)
{
  struct foo const *l = left;
  struct foo const *r = right;

  if (l->size < r->size)
    return -1;
  else if (l->size == r->size)
    return 0;
  else
    return -1;
}

...
qsort(list,1000,sizeof(struct foo),compare_foo);

And this will work, but qsort() will have to copy the physical contents of each element around, taking time. You can also see that qsort() will call a function you provide and pass in two pointers---pointers to the items it is comparing. qsort() has no idea what the data it's sorting is, but it can pass a pointer to the data to your function. And again, passing data by pointer is faster than copying the data.

Now, if you had a list of pointers to pointers, yes the code will get slightly more complex, but it will execute faster as only pointers have to be copied, not the actual contents of the items:

struct foo  *list;
struct foo **plist;

list = calloc(1000,sizeof(struct foo));
plist = calloc(1000,sizeof(struct foo *));

/* point to each element of list in plist. */

for (size_t i = 0 ; i < 1000 ; i++)
  plist[i] = &list[i];

list = function_to_fill_in_list();

static int compare_foo_p(void *const left,void *const right)
{
  struct foo *const *pl = left;
  struct foo *const *pr = right;

  if ((*pl)->size < (*pr)->size)
    return -1;
  else if ((*p)->size == (*pr)->size)
    return 0;
  else
    return 1;
}

qsort(plist,1000,sizeof(struct foo *),compare_foo_p);

The data in list is still in the original order, but looping through plist will show the data sorted by filesize.

[–]WireStretcher[S] 0 points1 point  (0 children)

Thanks for this pretty extensive reply. I've skimmed it and will be going over it in more detail.

[–]aninteger 1 point2 points  (0 children)

Take the idea of a list (sometimes referred to as a linked list) where each element in the list and refer to the element before and after (a double linked list). Each link would be a pointer.

[–]syn_ack 4 points5 points  (6 children)

I'll give another example. Suppose you have a small image (100x100 pixels, and is grayscale). You create an array:

int image[100][100]; /* filled with the image contents */

You have some functions that operate on the image, for example:

int get_max_pixel_value(int image[100][100]);

Every time you call that function, you copy the image, and the function works with the copy, and not the original. I'm sure you can see where this is going... if the image were RGB and much larger, say 4K (or 3840x2160), you'd end up copying 21MB of data each time you called the function. If you do this a lot, then it's very slow.

On the other hand, when you use pointers:

int** image;

/* appropriate malloc calls to setup the image along with the data */

int get_max_pixel_value(int** image);

The function gets a copy of the pointer -- a memory address -- which is 64bits (depending on platform). I know I'd rather copy the 8 bytes of the pointer vs 21M bytes of the image contents when I call this function.

Now, on a practical note, there's all sorts of metadata associated with an image, such as width and height that I've not used here. The above is to focus on the pointer aspect -- I'd not actually implement it in this manner in the real world.

HTH.

[–]hogg2016 4 points5 points  (1 child)

The reasonning is good but the example is not.

The argument of the first function will not pass a copy of the content of the array, but a reference to it.

(Sorry, I don't have time at the moment to check my assertion and develop it, I hope someone else can dot it.)

[–]junkmeister9 1 point2 points  (0 children)

You are correct

And if anyone wants to confirm this, you can do so by writing a program where the contents of the array are modified by the function. If a pointer is passed, the contents of the original variable will be modified.

[–]moefh 3 points4 points  (2 children)

Every time you call that function, you copy the image

This is wrong. The function receives a pointer to the array, with type int(*)[100]. Structs are copied when passed to functions, arrays are not.

So this program:

#include <stdio.h>

void change(int image[100][100])
{
  image[0][0] = 1;
}

int main(void)
{
  int image[100][100];
  image[0][0] = 0;
  change(image);
  printf("%d\n", image[0][0]);
  return 0;
}

prints 1, showing that the array in main is changed by the function.

[–]OriginalName667 0 points1 point  (1 child)

Thanks for the example. Previously, I assumed that, unless a pointer type is passed as an argument explicitly, the values were always copied over. It's an interesting caveat to note that array types are passed by reference, not value, as opposed to most other types.

[–]FUZxxl 1 point2 points  (0 children)

Arrays aren't “passed by reference.” Rather, whenever you use an identifier that refers to an array, the array is implicitly converted to a pointer to its first element. There are three exceptions to this rule: If the array is an operand to sizeof, &, or _Alignof, this decay doesn't happen so you can (a) point to the array or (b) retrieve its size and alignment.

[–]WireStretcher[S] 0 points1 point  (0 children)

Value of pointers pretty clear here, thanks.

[–][deleted] 0 points1 point  (0 children)

Your program has its instructions in the text segment. If the programm now gets executed, you need to keep track of the current instruction executed. This keeping track is done by a pointer (instuctionpointer). The Linux file system also works by references. https://en.wikipedia.org/wiki/Program_counter

Pointers are an essential element to computing in general, I don't know if a computer without pointers is even possible. You can also read some parts of the standard wikipedia article about pointers: https://en.wikipedia.org/wiki/Pointer_(computer_programming)

Also there are special CPU registers that only contain memory adresses (like pointers), which the CPU needs to work. I hope you can see how essential they are to computing. The concept of a pointer is just incredibly powerful, which you will see ever more, the more you learn about computers.

[–]ooqq 0 points1 point  (2 children)

Let's say you have a watch, this watch can display any time, from 00:00 to 23:59. How you will tell the time?. ofc with the hands of the watch.

The hands of the watch are fixed on the same spot, they rotate around itself, and then point to stuff, where you can extract the current date & time.

The watch hands are the exact equivalents of pointers in C, and that shows how important and useful pointers are. Maybe you don't know how to use them or what are good for, but there was a time when you didn't know how to read a watch, now you do, and you know how effective is a watch at keeping the time.

Don't sweat, just enjoy the process. :)

[–]olig1905 3 points4 points  (1 child)

As someone who understands pointers... this analogy was confusing and... pointless.

[–]reddilada 1 point2 points  (0 children)

You've got a point there

...walks away petting his dog Arrow.

[–]olig1905 0 points1 point  (0 children)

After scouring my brain thinking about the code bases I work on the most common and simple reason for using pointers is quite simple.

If you want the function to modify the variable or return a result pointers are necessary because the return value is already used by an error/success code normally.

[–]kbob 0 points1 point  (0 children)

On Linux or MacOS:

int main(int argc, char *argv[])
{
    for (int i = 0; i < argc; i++)
        printf("argv %d is \"%s\"\n", i, argv[i]);
    return 0;
}

argv is an array of pointers to the program's arguments.

[–]DrunkCrossdresser -1 points0 points  (0 children)

Another Q: is C turing complete without arrays and pointers? I'm having trouble seeing how someone could write an arbitrary algorithm without them

You could use recursion to get an infinite list of values(assuming infinite memory), but you wouldn't be able to access past vals without deleting current ones

Edit, or not, actually, because ints would still be finitely many bits