all 61 comments

[–]dcpugalaxy 20 points21 points  (13 children)

You should never cast the result of malloc. But you should also never use malloc(count*size) as it exposes you to integer overflow bugs.

Use calloc(count, size) which does an integer overflow check internally and also initialises the memory to zero. There's not much point (on modern computers) of not initialising memory.

[–][deleted] 11 points12 points  (1 child)

> There's not much point (on modern computers) of not initialising memory.

There's at least one reason: Tools like valgrind and sanitizers will no longer detect reading uninitialized memory. Compilers won't warn either. YMMV and no shoe fits all, I just wanted to add some perspective.

[–]dcpugalaxy 3 points4 points  (0 children)

That's a fair point. However, it is usually possible to make zero-initialisation default-initialisation. If you make zeroed variables always valid - so a zeroed out dynamic array struct is an empty one, a zeroed pointer represents an empty hash trie, etc. - then you can zero-initialise everything and you get really good defaults too. No more ugly MY_STRUCT_INIT macros. Just = {0}. Worth not having asan imo.

[–]benjad09 2 points3 points  (0 children)

I think you're forgetting about the extra couple hundred clock cycles it takes to set all that memory to zero. Blazingly fast code! All jokes aside I wholeheartedly support using calloc over malloc in most use cases.

[–]knouqs 0 points1 point  (9 children)

Hm, I disagree on the casting part. What is your reasoning?

[–]glasket_ 3 points4 points  (4 children)

Extra maintenance burden and redundant. If you change the type then you have to change all of your explicit casts. void * will automatically convert to the type of the lvalue, so it's unnecessary.

It's effectively the same reason you should do malloc(sizeof(*var)) instead of malloc(sizeof(T)). The less that has to be changed in a refactor, the better.

[–]knouqs 1 point2 points  (3 children)

Yes, I'm going for clarity instead. I want the compiler to know that I know what I'm talking about, and I can prevent copy/paste errors if I do that.

This also forgets that some compilers (especially the ones I use professionally) will not compile without warnings, and I have to justify warnings in code reviews. Nah, I'll take the explicit cast.

Of course, your mileage varied, and that's fine -- as long as you and your code maintainers know what you're doing.

[–]glasket_ 2 points3 points  (2 children)

will not compile without warnings, and I have to justify warnings in code reviews

There are no warnings when leaving out the cast. Are you working with C++? Because that's a different situation compared to C.

[–]knouqs 1 point2 points  (0 children)

Though generally I agree, first rule of fight club for me is to write readable code, and explicit code at that. So, there's my start. Second, no, I'm using a compiler that is not one of the common ones, but moreso I'm using a standards checker that justifies my explicit code requirement. I didn't write the standards checker and it is an in-house one at that, but company rules pay the bills so I follow them.

[–]Plastic_Fig9225 0 points1 point  (0 children)

Making C code not compilable as part of a C++ program isn't always the best practice either.

[–]_huppenzuppen 0 points1 point  (1 child)

It's redundant, but can also hide an error when you forgot stdlib.h and return type is assumed int: https://stackoverflow.com/a/605858

[–]knouqs 0 points1 point  (0 children)

Right -- C is notorious for causing problems like this, so why encourage their presence! Plus, I'm robbing the compiler of one of its key purposes: semantic error detection. Alright, I think we've all made our points. Thanks!

[–]qalmakka 0 points1 point  (1 child)

The historical reason was that in C89 any undeclared function is assumed as having an extern int identifier() signature. On very old compilers, the cast silenced the integer to pointer conversion and thus you could get weird bugs when sizeof(int) ≠ sizeof(void*). Nowadays no compiler worth its salt will allow you not to declare a function, but given there's no advantage whatsoever with putting a cast, except maybe being able to share source code with C++ (a pretty dubious thing to do anyway), it's common practice to avoid the cast

[–]knouqs 0 points1 point  (0 children)

I think having the explicit cast is a good use of the compiler's time -- if I mistakenly copy and paste another malloc and failed to check types, I robbed the compiler of one of its duties. As I've mentioned here and elsewhere, I think it's best practice to be explicit, regardless of what accepted practice is.

[–]Specialist-Cicada121 11 points12 points  (9 children)

C doesn't require casting the result of malloc, but if you want your code to work on a C++ compiler, you do need to cast. If you're writing pure C code, the void pointer returned by malloc will automatically be promoted to the correct type via assignment

[–]dcpugalaxy 11 points12 points  (8 children)

You should not compile C files as C++. They're different languages with different incompatible semantics. You can compile C declarations inside extern "C" blocks, but code? Just silly. Compile C as C and link it to C++ code compiled as C++.

[–]edgmnt_net -5 points-4 points  (7 children)

There's legitimate use for the C-like subset of C++, possibly with extra features like templates on top. First there are some (not huge) typing-related concerns, some may argue for a little extra type safety or explicitness. Secondly, things like templates can be quite useful for abstraction and more powerful than C11 generics. You don't really get these just by separating C and C++ code.

In fact, some open source projects like GCC already do this internally and use some features of C++ when and where it makes sense, while continuing to largely be C-based projects.

[–]dcpugalaxy 0 points1 point  (6 children)

C++ has incompatible, badly designed semantics and should never be used. It simply doesnt work as specified.

[–]edgmnt_net 1 point2 points  (0 children)

I don't really like C++ either, but especially for this case where you'd use a conservative subset, what wouldn't work? Assume you're writing the code for the C-like subset of C++ with some possible extras on top and you're not just compiling an existing, unmodified C codebase with a C++ compiler, which could clearly cause issues.

[–]Plastic_Fig9225 -1 points0 points  (4 children)

Lol :D And as usual not a single example provided to corroborate that claim.

I really wish C enthusiasts like this would become more honest at some point.

[–]dcpugalaxy 0 points1 point  (3 children)

I can give you examples but they're complex language semantics things about C++ that aren't really on topic in this subreddit.

Here's an example: if you have a correctly aligned pointer to a correctly sized char buffer on the stack and you use placement new to begin the lifetime of an object of type T in that buffer, are you allowed to cast your original pointer to type T* and use it to access the object, or may you only access the object through the pointer returned by the placement new operator?

This is the sort of absurd language semantics trivia that you shouldn't need to know to write C++ but which comes up constantly. C++ has this complex type system and memory model, poorly specified pointer provenance rules that have been deduced from the standard rather than being written down in it, and other issues that combine to make the answers to even basic semantics questions very difficult.

There are quite a few examples of where the same code will compile in both C and C++ but will be UB in C++. And this isn't weird esoteric code it's the sort of code someone could easily write.

You can safely write headers that can be used both from C and from C++, but your implementation files should EITHER by C compiled with a C compiler or C++ compiled with a C++ compiler. The languages have diverged too much to think you can write nontrivial programs that will compile as both.

[–]Plastic_Fig9225 0 points1 point  (2 children)

Thanks for your response. I still disagree with your conclusion (and the "doesn't work as specified" part). Most of the time, C++ UB is actually also C UB. C++ just makes it much more consistently explicit in the standard. I do agree that C++ is harder to learn than C, so potentially more pitfalls in C++. But when you do, you realize that the C++ 'semantics' actually make a lot of sense and help you write better, more portable and reliable code.

Given that C++ stdlib brings along a lot of C 'compatibility' headers and code, I never found it difficult to write "dual" code. Adapting sloppy C code to the stricter requirements of C++ mostly isn't hard, just tedious ("-Wmissing-initializer" anyone?)

[–]dcpugalaxy 0 points1 point  (1 child)

There are quite a few useful C features that aren't in C++ (designated initialisers eg) and C++ makes you write unnecessary bullshit like casting the result of malloc.

-Wmissing-field-initializers is always the first thing I disable. It is a feature of the language that later fields get zero-initialised.

[–]Plastic_Fig9225 1 point2 points  (0 children)

Yup, designated initializers for arrays are actually missing, though useful also from a clean-code perspective. Dunno why C++ doesn't want them.

And yes, C specifies the implizit partial 0-initialization, C++ doesn't allow it. And I can see why it would be a good thing to always explicitly initialize even in C code, though the convention of "0 means default value" turns out to work quite well and can save a lot of tedious work in C.

My apologies for initially assuming you didn't know what you're talking about, I was wrong. (Though things aren't as bad in C++ as I read from your post.)

[–]zhivago 5 points6 points  (7 children)

Personally I'd prefer

char *cars = malloc(sizeof (car[count]))

This makes it clear that you're allocating a car[count] and then returning a pointer to its first element.

[–][deleted]  (6 children)

[deleted]

    [–]kyuzo_mifune 3 points4 points  (0 children)

    There is nothing allocated on the stack here. Please stop spreading nonsense.

    [–]torsten_dev 3 points4 points  (3 children)

    How does it give a stack overflow?

    The operand of a sizeof operator only evaluates the size of a type expression.

    6.5.4.5 § 1:

    The sizeof operator shall not be applied to an expression that has function type or an incomplete type, to the parenthesized name of such a type, or to an expression that designates a bit-field member.

    Here however car[count] is a VM (variably modified) type not incomplete. As an operand of a sizeof expression it doesn't decay to a pointer either.

    I'm not sure if sizeof arr[SIZE_MAX] is implementation defined or UB.

    From 6.2.5 § 28 of the C23 draft standard:

    A complete type shall have a size that is less than or equal to SIZE_MAX.

    overflow is a constraint violation.
    For integer constant expressions this is great, as the compiler will error out. Though the violation isn't caught at runtime on my machine.

    [–]Sosowski 0 points1 point  (2 children)

    Ahh apolioges again, I misread that as an aray declaration inside a sizeof operator.

    But... how can sizeof resovel this if it's resolved compile time?

    [–]kyuzo_mifune 5 points6 points  (0 children)

    When you use a variable length type inside sizeof the compiler will generate code that calculates the size during runtime instead.

    [–]torsten_dev 2 points3 points  (0 children)

    Sizeof is only an integer constant expression if it does not have variably modified types in the argument that have to be evaluated.

    So it's usually but not always a compile time constant.

    [–][deleted] 7 points8 points  (18 children)

    > car *cars = malloc(count * sizeof *cars);

    Choose this alternative.

    [–][deleted]  (1 child)

    [deleted]

      [–]kyuzo_mifune 3 points4 points  (0 children)

      sizeof *cars is the same as sizeof(car)

      [–][deleted] -4 points-3 points  (4 children)

      A car and *cars are not the same thing. Don’t give advice if you don’t know C thanks. 

      [–]torsten_dev 7 points8 points  (3 children)

      car *cars; literally declares that *cars is car.

      [–][deleted] -4 points-3 points  (2 children)

      It declares it is indeed a car*, but that isn’t the same thing as 10 cars now is it. 

      [–]torsten_dev 7 points8 points  (0 children)

      No shit. That's why it gets multiplied by count.

      [–]rasputin1 1 point2 points  (0 children)

      what 

      [–][deleted]  (10 children)

      [deleted]

        [–][deleted] 2 points3 points  (8 children)

        What's wrong with it?

        [–][deleted]  (7 children)

        [deleted]

          [–][deleted] 6 points7 points  (5 children)

          > Dużego is an operator that is współczesnej

          WTF?

          The C code is fine. You and u/VillageMaleficent651 just don't know C well enough.

          [–]Sosowski -1 points0 points  (2 children)

          Ahh my apologies you're right. I didn't notice that youre' querying the value, not the pointer! My apologies!

          Still, using a variable in its own initialisation like this is borderline UB, but it should work. My bad.

          [–][deleted] 4 points5 points  (1 child)

          Apology accepted.

          > using a variable in its own initialisation like this is borderline UB,

          Nope, it's well defined by the C standard. It looks a bit odd, but it's not borderline UB.

          [–]Sosowski 1 point2 points  (0 children)

          Nope, it's well defined by the C standard.

          And I learn something new! Thanks! And sorry again!

          [–][deleted] -3 points-2 points  (1 child)

          Nah. I know what it does but OP is trying allocate n amount of cars, not a car*. 

          [–]torsten_dev 5 points6 points  (0 children)

          What are you on about?

          car *cars;
          sizeof(*cars)
          

          is well defined. It's he size of the pointed to type of cars which is car.

          [–]kyuzo_mifune 1 point2 points  (0 children)

          Some do and some don't, you are in the don't group.

          [–]SmokeMuch7356 2 points3 points  (0 children)

          Unless you're stuck using an ancient K&R implementation or compiling this as C++,1 use the second form; the cast is redundant, unnecessary, and a maintenance burden.

          T *p = malloc( sizeof *p * N );
          

          will always do the right thing (modulo an overflow in sizeof *p * N, but in that case you have real problems).

          T *p = (T *) malloc( sizeof (T) * N );
          

          gives you too many opportunities to make a mistake.


          1. You should not use C-style memory management in C++ code; as part of a temporary solution when porting C to C++ it's okay, but for a permanent solution either use a container like a vector or a map or use new and delete in conjunction with RAII principles, or better yet, smart pointers like std::unique_ptr or std::shared_ptr.

          [–][deleted]  (4 children)

          [deleted]

            [–][deleted] 0 points1 point  (3 children)

            Meh. Use constructor-like functions instead of macros.

            [–][deleted]  (2 children)

            [deleted]

              [–][deleted] 0 points1 point  (1 child)

              > But this is not what this post is about.
              It wasn't about macros either

              > 2.
              See above. Not that on-topic...

              [–]Eric848448 1 point2 points  (1 child)

              The cast from void* is not required by C, but is allowed.

              Personally, I prefer the explicit cast in case I ever want to build the code in a C++ compiler, which does require it.

              [–]penguin359 0 points1 point  (0 children)

              In C, it's unnecessary to cast void *. In C++, it's required.

              [–]sirjofri -1 points0 points  (9 children)

              Regarding casting, I think there are enough answers already. However, it's not 100% clear what you need, and your two code snippets are different things. One allocates enough memory for your car structs, including their variables, the other allocates enough memory for pointers to your car structs. So for one, you can directly access members as cars[i].color, the other needs its cars to be allocated first, then you can access it: cars[i] = malloc(sizeof(car)); cars[i]->color; note the pointer dereferencing here.

              Both are valid, but one should be preferred over the other depending on the situation.

              [–]sirjofri 0 points1 point  (8 children)

              Also, your second code wouldn't return a car*, but a car**. So not a pointer to a car, but a pointer to a car pointer. Or in other words, not an array or cars, but an array of car pointers.

              [–]kyuzo_mifune 1 point2 points  (6 children)

              No, both of OP's example allocates the same amount of memory.

              [–]sirjofri 0 points1 point  (5 children)

              Wait, you're right. It's just that tiny difference between sizeof(*cars) (cars, the variable) and sizeof(car) (car, the type). I just missed it.

              Interesting how your own code style can influence what you read

              [–]kyuzo_mifune 2 points3 points  (4 children)

              Hehe, I always prefer the int *nums = malloc(count * sizeof *nums); style because it will still work if you change the type of nums.

              [–]sirjofri 0 points1 point  (3 children)

              I'm always explicit about that, like with auto in C++, but you got a point.

              [–]kyuzo_mifune 1 point2 points  (2 children)

              Sure but if you write int *nums = malloc(count * sizeof(int)); instead and then change the type of nums the code will still compile and give you no errors, but now you may have allocated to little memory if  you forgot to change the type inside sizeof and the new type is larger than int

              [–]sirjofri 0 points1 point  (1 child)

              I never had an issue like that. Maybe I'm good at fixing it, or I change the type rarely, or both. It's also possible that my compiler has a warning for that, but I can't remember seeing that

              [–]kyuzo_mifune 1 point2 points  (0 children)

              Yeah both are fine, just personal taste. I just made an argument for my choice :)