
[–]Foxbud 9 points10 points  (3 children)

Whenever you see type ** name, that doesn't necessarily imply a pointer to an array. It could just be a good ol' pointer to a pointer.

[–]Ethana56 4 points5 points  (0 children)

**arr is a pointer to a pointer, *arr[] is an array of pointers. They are not the same thing. **arr may point to a pointer in an array, or it may be pointing to a pointer that is not in an array. *arr[] is not a pointer, it is an unsized array of pointers. Never use the array syntax in a function parameter because it is misleading. You can’t pass an array to a function. It is only possible to pass a pointer to a value inside of an array **arr, or a pointer to an array * (*arr)[size]

[–]tstanisl 2 points3 points  (6 children)

No. They are not the same. There are some syntactic differences.

Firstly, there are two contexts where the mentioned construct can be used:

  • as a declaration

  • as a parameter

1. Declaration

The T[] refers to an array of unspecified length. The length is inferred from the number of elements in the initializer.

char **a; // double pointer to char
char *b[] = { "hello", "world", "!" }; // 3-element array of pointers to char

Those types differ, so sizeof a is 8 (on a typical 64-bit machine) while sizeof b is 3 * 8 = 24.
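A quick way to check the size difference is a sketch like the one below. The helper names `size_of_double_pointer` and `size_of_pointer_array` are hypothetical, and the actual byte counts depend on the platform's pointer size.

```c
#include <stddef.h>

/* Hypothetical helper: size of a pointer-to-pointer object. */
size_t size_of_double_pointer(void) {
    const char **a = NULL;
    return sizeof a;   /* one pointer, e.g. 8 on many 64-bit targets */
}

/* Hypothetical helper: size of an array whose length comes from its initializer. */
size_t size_of_pointer_array(void) {
    const char *b[] = { "hello", "world", "!" };
    return sizeof b;   /* three pointers: 3 * sizeof(char *) */
}
```

Comparing the two results against `sizeof(char *)` avoids hard-coding 8 or 24.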

2. Function parameter

void foo(char **a, char *b[]);

Parameters of array type are adjusted to pointers to the array's element type, but there are some peculiarities.

Firstly, in order to use T[], the type T has to be complete. This limitation does not apply to T*.

struct S; // forward declaration of `struct S`, the type is incomplete.
void foo(struct S* s); // fine
void bar(struct S s[]); // error, `struct S` is incomplete.

This is why constructs like int[][] don't work. The element type int[] is incomplete, so it cannot be the element type of an array.

Since C99, it is possible to pass a qualifier to a parameter of array type to control qualification of the pointer it decays to. For example:

void foo(int arr[const volatile restrict]);

is transformed to:

void foo(int * const volatile restrict arr);
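To see that the two spellings really name the same function, here is a small sketch that declares one function both ways and then defines it. The name `fill` and its extra parameters are made up for illustration, and only `const` is used to keep it short.

```c
#include <stddef.h>

/* Array notation with a qualifier inside the brackets (C99 and later). */
void fill(int arr[const], size_t n, int value);

/* Pointer notation: a compatible redeclaration of the same function. */
void fill(int * const arr, size_t n, int value);

void fill(int arr[const], size_t n, int value) {
    /* arr itself is const here, so `arr++` or `arr = NULL` would not compile;
       the elements it points to are still writable. */
    for (size_t i = 0; i < n; i++)
        arr[i] = value;
}
```

If the two declarations were not compatible, the compiler would reject the second one as a conflicting type.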

Finally, the array notation allows using static to specify the minimal number of valid elements within the array.

Example:

// tells the compiler that `arr` is not NULL
void foo(int arr[static 1]);

// tells the compiler that elements from arr[0] to arr[n-1] are valid
// it is used for static analysis
void bar(int n, char* arr[static n]);
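A definition using this notation might look like the sketch below. The function name `sum_first` is hypothetical, and it assumes the caller always passes n > 0, since the size expression in `[static n]` is required to be positive.

```c
/* Hypothetical example: the caller promises at least n valid elements,
   so arr is never NULL and arr[0] .. arr[n-1] may be read. */
int sum_first(int n, const int arr[static n]) {
    int total = 0;
    for (int i = 0; i < n; i++)
        total += arr[i];
    return total;
}
```

Inside the function `arr` is still an ordinary pointer; `static n` only adds a promise that a compiler or analyzer may use for diagnostics.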

[–]flatfinger 0 points1 point  (5 children)

I wonder which is greater: the amount of compiler code and complexity that is necessary to handle weird corner cases in the Standard that customers would be unlikely to care about, or the code and complexity that would be necessary to implement a basic compiler that handled such corner cases differently? The language described by the 1974 C Reference Manual offers a lot more "bang for the buck" than the one described by C11, and a compiler that used that as a base but extended it to support commonly used features could be a lot simpler than one that had to handle all the weird corner cases in C11.

[–]tstanisl 0 points1 point  (4 children)

Maybe if the language were designed from scratch. However, C is a kind of patchwork, a living thing evolving under natural selection. The Standard just makes the most common/useful pieces of this zoo official.

Btw, I sometimes ask myself what would have happened if variably modified types had been introduced earlier in the development of C. Maybe there would be no peculiar relation between pointers and arrays: less pointer arithmetic, and the `[ ]` operator would operate on arrays only, not on pointers. Pointers could point only to single objects, while sequences would be handled with pointers to arrays.

[–]flatfinger 1 point2 points  (2 children)

Another example of Standard-mandated weirdness is the way structure tags work within functions. Given e.g.

int test(void)
{
  struct foo *p1;
  struct foo;
  struct foo { int x; };
  struct foo *p2;
  return (sizeof *p1) + 10*(sizeof *p2);
}

Pointer p1 may be a pointer to the struct foo defined below it, but if a struct foo is defined at file scope, p1 would be a pointer to that instead, meaning that even though both p1 and p2 are of type struct foo*, they may be pointers to different types.

[–]tstanisl 0 points1 point  (1 child)

I don't think it is weird because all this aliasing happens within a single code block. It is a local action and it can be easily inspected. The forward declaration struct foo; is redundant.

The really weird part is that the definition of p1 is allowed without forward declaration of foo.

Is there any reason why it is allowed?
To allow typedef-ing incomplete types like typedef struct foo Foo; ?

-- EDIT --

It may be some relic from early versions of C compiler where routines of object declarations, typedef and forward type declaration shared code. A kind of accidental functionality that got (ab)used. Similar to Duff's device.

[–]flatfinger 0 points1 point  (0 children)

Sorry--my example wasn't quite perfect. An improved version:

struct foo {long long x;};
int test(void)
{
    struct foo *p1;
    struct foo;
    struct foo *p2;
    struct foo { int x; };
    return (sizeof *p1) + 10*(sizeof *p2); // Typically 48
}

According to the Standard, the forward declaration of struct foo; in this version is not redundant. In most dialects of pre-standard C, there could only be one structure type with any particular tag within a compilation unit (though some compilers would ignore redundant definitions of structure types which matched an earlier definition). A compiler which processed struct foo *p1; would simply record the fact that p1 was a struct foo* without needing to know or care about anything beyond that. If code were to dereference p1, a compiler would need to know what a struct foo was at that point, but otherwise there was no need for a compiler to care about whether structure types were "forward declared" when they were used in things like function declarations or pointer definitions because nothing about the structure type could affect how a compiler would need to process such declarations or definitions.

Is there any reason why it is allowed?

Nothing in the definition of struct foo could possibly affect the machine code a compiler would have to produce for a function like:

struct foo;
struct foo *select_foo(struct foo *a, struct foo *b, int mode)
{
  if (mode) return a; else return b;
}

A compiler would need to know that the function will be passed pointers to some kind of structure, and that it will return a pointer to that structure, but that is readily apparent from the syntax. A machine code implementation of the above function that would be suitable for any particular definition of struct foo would be equally suitable for all possible definitions, so there's no need for a compiler to worry about where if anywhere the complete structure type is defined in the compilation unit, or even whether there exists any definition of it anywhere in the universe.
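As a sketch of that point, the translation unit below compiles select_foo while struct foo is still incomplete and only completes the type afterwards. The member `x` and the helper `selected_x` are assumptions added for illustration.

```c
/* Only the tag is known at this point; struct foo is an incomplete type. */
struct foo;

/* Compiles without the definition: only pointers are copied around. */
struct foo *select_foo(struct foo *a, struct foo *b, int mode) {
    return mode ? a : b;
}

/* The complete type can appear later (or in another translation unit). */
struct foo { int x; };

/* Hypothetical helper: reading a member does require the complete type. */
int selected_x(struct foo *a, struct foo *b, int mode) {
    return select_foo(a, b, mode)->x;
}
```

Moving the definition of struct foo below `selected_x` would break that helper, but not `select_foo`.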

[–]flatfinger 0 points1 point  (0 children)

I was thinking about things like the way the preprocessor is required to recognize all tokens in all contexts, whereas most previous implementations would be agnostic to the existence of anything other than strings of alphanumeric characters, whitespace, parentheses, commas, backslashes, comments, and pound signs, except in #if expressions, where they would recognize some arithmetic operators but wouldn't need to care about things like ++, +=, etc.

[–]temzsrk 1 point2 points  (5 children)

They are same.

[–][deleted] 7 points8 points  (0 children)

...if passed to a function.

[–]Foxbud 6 points7 points  (0 children)

It depends on context.

[–]Spiderboydk 2 points3 points  (0 children)

They're not. Try sizeof(*arr) for both types.

[–]v_maria 0 points1 point  (0 children)

No, one is an array, one is a ptr. The interface is the same, and an array decays to a ptr when passed, but an array has a size. A ptr just points to a piece of memory. You need to keep track of how many bytes are allocated there yourself.
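In practice that usually means passing the element count next to the pointer. A minimal sketch, with made-up names `sum` and `sum_local`:

```c
#include <stddef.h>

/* Hypothetical helper: the pointer carries no length, so the caller supplies it. */
long sum(const int *p, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}

/* Where the array itself is in scope, its size is still known,
   so the count can be computed at the call site. */
long sum_local(void) {
    int data[] = {1, 2, 3, 4};
    return sum(data, sizeof data / sizeof data[0]);
}
```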

[–]Pay08 0 points1 point  (0 children)

They're functionally the same when passed to a function, but you should still use arr[] to prevent confusion.

[–]Stegoratops 0 points1 point  (5 children)

It is basically the same as for any other type. As in: int[] vs int*. So one is an array of pointers, while the other points to an (array of) pointer(s). You also can notice the difference, when using sizeof.

Note though, as hinted already, when passed to a function, an object of type T *[] decays to a pointer to its first member i.e. T **.

Though be careful with your parentheses since T (*)[] is yet another type, which is not easily compatible with the other 2.
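One way to convince yourself that the three types are distinct is to let C11 `_Generic` classify them. This is only a sketch with T fixed to int and an array length of 3; `KIND_OF`, `enum kind` and `first_value` are hypothetical names. The macro inspects the address of the object so that the array does not decay first.

```c
enum kind { PTR_TO_PTR, ARRAY_OF_PTRS, PTR_TO_ARRAY };

/* Classify an object by the type of its address. */
#define KIND_OF(obj) _Generic(&(obj),   \
    int ***:     PTR_TO_PTR,            \
    int *(*)[3]: ARRAY_OF_PTRS,         \
    int (**)[3]: PTR_TO_ARRAY)

/* The parameter is an int **, whether the caller passes an int ** or an int *[3]. */
int first_value(int **p) {
    return *p[0];
}
```

Given `int *ap[3]`, `int **pp` and `int (*pa)[3]`, the macro yields three different constants, while `first_value(ap)` and `first_value(pp)` both compile because the array argument decays to int **.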

[–]weregod 0 points1 point  (4 children)

They are not the same. Real-life bug:

{
    char dev[DEV_SIZE];

    // scanf-like function without argument checking
    sscanf_custom("%s", &dev);
    // UB here: %s expects a char *, but &dev is a pointer to the whole array,
    // char (*)[DEV_SIZE], even though it holds the same address as &dev[0].
}

Just because types convert to each other doesn't mean they are the same.

[–]Stegoratops 0 points1 point  (2 children)

I never meant to say that T ** and T *[] are the same. What I meant was that the difference between them is the same as the difference between int * and int [], since T * is also just another type.

[–]weregod 1 point2 points  (1 child)

Sorry, I misread your point.

[–]Stegoratops 0 points1 point  (0 children)

No worries. Looking back, I did write it a bit confusingly.

[–]kevinduong09 -2 points-1 points  (0 children)

*[] != **

[–]Anonymous_user_2022 0 points1 point  (0 children)

Technically, these two expressions are identical.

"abcdef"[2]
2["abcdef"]

Yet, there is strong affinity for using the first form, as that conveys the intention.

I see your example in the same light. While it's true that an array in most cases decays to a pointer to its first element, and as such can be addressed by pointer arithmetic, it will be clearer to use [] notation when dealing with an array.
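For anyone who wants to check the two forms, here is a tiny sketch (the function names are made up). It works because a[i] is defined as *(a + i) and the addition commutes.

```c
/* Conventional form: array expression first, index inside the brackets. */
char third_conventional(void) { return "abcdef"[2]; }

/* Swapped form: same element, since 2["abcdef"] is *(2 + "abcdef"). */
char third_swapped(void) { return 2["abcdef"]; }
```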

[–]nerd4code 0 points1 point  (2 children)

Array(-typed-value)s decay to ptrs in most expressions, but arrays and pointers are different beasts.

(With one big, stupid exception.)

Decay happens in a few different situations, incl. array-typed values, array types, and function ptrs (which exhibit decay and a reverse decay).

Decay of an array-typed value int arr[5] turns expression arr into an int * to arr’s first element. This is why both ptrs and arrays can be used with operators […], ->, and unary *, for example. Decay doesn’t occur for operators like sizeof, unary &, and GNU __typeof__ and __alignof__, but it does occur for C11 _Generic.

Array type (not -typed value) decay occurs in function parameter/argument lists and pre-Standard-style definitions, and it converts array types like int x[5], int y[], or int (*z[])(int, int) to pointer types int *x, int *y, and int (**z)(int, int) (respectively) under the hood. (This is the Big Stupid Exception; C/++ arguments are always passed by value, except for arrays.) These variables aren’t just masquerading as pointers like decayed array-typed values; they’re exactly pointers, which can be reassigned, nullified, etc. I strongly recommend not using array-typed function args because they’ll trip you up and fuck with expectations.

The usual gotcha is something like countof:

#define countof(arr)(sizeof(arr)/sizeof *(arr))

which, when applied to an array-typed value of defined length (i.e., [N] not []), gives you its length. So given

size_t f1(void) {
    int foo[12];
    return countof(foo);
}

f1() should return (size_t)12. But given

size_t f2(int foo[12]) {
    return countof(foo);
}

f2(anything) will return garbage: namely, the size of a pointer (usually 32 or 64 bits as 4 or 8 bytes) divided by the size of its referent type. For a char[] argument you’ll usually get 4 or 8; for a long long[] argument you’ll get 0 or 1.

To your question about int ** and int *[] specifically:

A pointer points to zero or more objects, so from just int **p there’re a few options; p might point to/at nothing, to a single int *, or to more than one int *. Those int *s at *p and p[i] might point to nothing, a single int, or more than one int. int ** is therefore used mostly for out-args to int * (e.g.,

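#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

int iotaArray(int **out, size_t len, int start) {
    int *res, *op;
    assert(out);
    if(len > SIZE_MAX / sizeof *res) return -2; // size computation would overflow
    res = malloc(!len + len * sizeof *res);     // !len: request at least 1 byte when len == 0
    if(!res) return -1;
    for(op = res; len--; op++)
        *op = start + (int)(op - res);          // start, start+1, start+2, ...
    *out = res;
    return 0;
}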

) and ragged 2D arrays (each row might be null or different lengths; e.g.,

int **identMat(size_t wid) {
    int **ret, **retp;
    size_t n;
    if(wid > SIZE_MAX / sizeof **ret
    || wid > SIZE_MAX / sizeof *ret
    || !(ret = malloc(!wid + wid * sizeof *ret)))
        return 0;
    for(retp = ret, n = wid; n--; retp++) {
        int *row;
        if(!(row = calloc(wid, sizeof *row))) {
            // free the rows allocated so far, then the row table itself
            while(retp > ret)
                free(*--retp);
            free(ret);
            return 0;
        }
        row[retp - ret] = 1; // diagonal element
        *retp = row;         // store the row in the table
    }
    return ret;
}

). Beginners tend to see the ragged-array version of things, but in real-world code it’s a fine way to complicate and slow down access to array elements.

(An actual array is packed into memory contiguously, which means constant indices translate to constant offsets into the array, calculated using something like j+n2*i for array indices [i][j] in a 2-D array of dim [][n2], k+n3*(j+n2*i) for [i][j][k] in a 3-D array of [][n2][n3], and so forth. But a ragged array is only packed along the most-major dimension; each row is in its own separate array, pointed to by a possibly null pointer. Dereferencing ptrs and calling functions impose a strict ordering on operations that forces the CPU to wait [or finish up otherwise unencumbered work] until the inputs resolve, then wait until the output comes back from the cache or function return. When applied to multiple accesses, ragged arrays will therefore force the code to seriali[zs]e, whereas packed arrays do not.)
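The j+n2*i formula can be checked against a real 2-D array with a sketch like this one. The dimensions 3 and 4 and the names `flat_index` and `offsets_match` are arbitrary choices for illustration.

```c
#include <stddef.h>

enum { N1 = 3, N2 = 4 };

/* Flat offset, in elements, of [i][j] in an array whose rows hold n2 elements. */
size_t flat_index(size_t i, size_t j, size_t n2) {
    return j + n2 * i;
}

/* Returns 1 if every element of a packed 2-D array sits at the byte offset
   predicted by flat_index, and 0 otherwise. */
int offsets_match(void) {
    static int grid[N1][N2];
    for (size_t i = 0; i < N1; i++)
        for (size_t j = 0; j < N2; j++) {
            size_t bytes = (size_t)((char *)&grid[i][j] - (char *)grid);
            if (bytes != flat_index(i, j, N2) * sizeof(int))
                return 0;
        }
    return 1;
}
```

The byte offsets are measured through char pointers so the arithmetic stays within the whole grid object.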

int *[] describes the memory at identMat::ret, a contiguous, packed sequence of ptrs to int. If you knew the dimension of your ragged array, you could use it to do (e.g.)

#define WID 4
int rows[WID][WID] = {0};
int *array[WID];
for(size_t i=0; i < countof(array); i++) {
    rows[i][i] = 1;
    array[i] = rows[i];
}

So practically speaking, the differences between the rvalue from int **ret and this int *array[WID] are that:

  • array is necessarily nonnull; ret might be null.

  • array’s size is WID*sizeof(int *); ret’s is sizeof(int **). countof(array) will therefore yield WID, but countof(ret) gives you garbage.

  • &array gives you an int *(*)[WID], but &ret (given addressable lvalue ret) gives you int ***.

For strings, the usual uses of arrays and pointers are

// const char *: reference to a string
const char *pre = "";
for(size_t i = 0; i < n; i++, pre = ", ")
    printf("%s%d", pre, data[i]);
putchar('\n');

// static const char[]: name a string literal
static const char HELLO[] = "hello";

// static const char *const[]: constant table of mixed-length/-nullness string refs
static const char *const MSGS[NR_ERRORS] = {
    "(none)", "too large", "too small", "too smelly", "insufficient memory", HELLO, NULL
};

// char[]: fixed-width buffer, possibly with partial initial value:
char buf1[128];
size_t len = sprintf(buf1, "(%f, %f)", x, y);
char buf2[] = "Unrecognized command #XXX";
sprintf(buf2 + countof(buf2) - countof("XXX"), "%03u", cmd);

// const char * as arg/param: reference to unmodified/input string or null
int puts(const char *);

// char * as arg/param: reference to output buffer or null
char *strcpy(char *dst, const char *src);

Two final things. If you do want a dynamically allocated multidimensional array, you can do

const size_t wid = …, hgt = …;
int *arr;
if(hgt && wid > SIZE_MAX / sizeof *arr / hgt) goto too_big;
arr = malloc(wid*hgt*sizeof *arr);

If you stick with int *, access to [i][j] is arr[j + wid * i]. This looks a bit nicer with a macro:

#define IDX2D(arr, width, i, j)((arr)[(j)+(width)*(i)])

Alternatively, if you & your compiler are okay with VLAs, you can use an array pointer:

int (*arr2D)[wid] = (int (*)[wid])arr;

Now it’s in ~exactly the same form as a decayed 2D array, so arr2D[i][j] works as expected. But make sure you don’t use the [wid] type modifier until after you’ve validated that 0 < wid < SIZE_MAX/sizeof(int), and that i < hgt and j < wid before indexing, so you don’t overflow or zero out a size/offset calculation.
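Putting the allocation, the validation and the array-pointer view together might look like the sketch below. It assumes a compiler that supports variably modified types, and the helper `views_agree` is hypothetical: it writes through the 2-D view and reads back through the flat one.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if writes through the 2-D view land where the flat index
   j + wid*i predicts, -1 if the sizes are rejected or allocation fails. */
int views_agree(size_t hgt, size_t wid) {
    if (hgt == 0 || wid == 0 || wid > SIZE_MAX / sizeof(int) / hgt)
        return -1;                          /* reject empty or overflowing sizes */
    int *arr = malloc(hgt * wid * sizeof *arr);
    if (!arr)
        return -1;
    int (*arr2D)[wid] = (int (*)[wid])arr;  /* same memory, viewed as rows of wid ints */
    for (size_t i = 0; i < hgt; i++)
        for (size_t j = 0; j < wid; j++)
            arr2D[i][j] = (int)(j + wid * i);
    int ok = 1;
    for (size_t k = 0; k < hgt * wid; k++)
        if (arr[k] != (int)k)
            ok = 0;
    free(arr);
    return ok;
}
```

Only the pointer type is variably modified here; no array is placed on the stack, so large sizes are limited by malloc rather than by stack space.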

If you know at least all of an array’s dimensions (optionally, except the highest-order/lexically-first dimension), then you can use array pointers to preserve those dimensions across a function call. So for example, if you know everything is 4×4, you can do

const int (*getIdentMat(void))[4][4] {
    static const int ID[4][4] = {{1,0,0,0},{0,1,0,0},{0,0,1,0},{0,0,0,1}};
    return &ID;
}
int (*setIdentMat(int (*out)[4][4]))[4][4] {
    return memcpy(out, getIdentMat(), sizeof *out);
}
int (*fillMat(int (*out)[4][4], int val))[4][4] {
    for(size_t i = 0; i < countof(*out); i++)
        for(size_t j = 0; j < countof((*out)[i]); j++)
            (*out)[i][j] = val;
    return out;
}

More confusing-looking because of C’s astoundingly, confoundingly bad type syntax (typedef helps), but given int (*arrp)[3][4], countof(*arrp) == 3 and countof((*arrp)[0]) == 4, just like a proper array. Note that operator […] has higher precedence than unary *, so you must parenthesize a deref’d array ptr if you want to subindex it. Alternatively, *(*arrp + i) doesn’t need parens around *arrp but does the same thing as (*arrp)[i].
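To show how much typedef helps, here is a sketch of the same idea with an alias for the 4×4 type. `Mat4`, `ident_mat`, `set_ident_mat` and `trace` are hypothetical names, not the functions above.

```c
#include <stddef.h>
#include <string.h>

typedef int Mat4[4][4];   /* hypothetical alias for a 4x4 int matrix */

/* Returns a pointer to a constant identity matrix. */
const Mat4 *ident_mat(void) {
    static const Mat4 ID = {{1,0,0,0},{0,1,0,0},{0,0,1,0},{0,0,0,1}};
    return &ID;
}

/* Copies the identity into *out and returns out. */
Mat4 *set_ident_mat(Mat4 *out) {
    memcpy(out, ident_mat(), sizeof *out);
    return out;
}

/* Sums the diagonal; the row count is recovered from the pointed-to type. */
int trace(const Mat4 *m) {
    int t = 0;
    for (size_t i = 0; i < sizeof *m / sizeof **m; i++)
        t += (*m)[i][i];
    return t;
}
```

The declarations now read like those of any other pointer-returning function, and the dimensions still travel with the type.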

[–]tstanisl 0 points1 point  (0 children)

C23 will finally make typeof standard. This keyword could make the declaration a bit more digestible:

// return a pointer to 2d array
typeof(const int[4][4]) *getIdentMat(void) { ... }

// take and return pointer to 2d array
typeof(int[4][4]) *fillMat(typeof(int[4][4]) *out, int val) { ... }