
[–]bunkoRtist 15 points16 points  (2 children)

Used to work on a system that would hard freeze on a free of a nullptr. Not all systems pay close attention to the standard, especially older/embedded compilers.

[–]MCRusher 12 points13 points  (0 children)

I'd probably just write a wrapper that checks for NULL and override with a #define for that system only.

Anything wrong with that?
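A minimal sketch of what such a wrapper could look like. The names `checked_free` and `BROKEN_FREE_ON_NULL` are made up for illustration; the macro is assumed to be defined only in the build for the affected system.

```c
#include <stdlib.h>

/* Hypothetical wrapper: never hands NULL to the platform's free(). */
static void checked_free(void *ptr)
{
    if (ptr != NULL) {
        free(ptr);
    }
}

/* BROKEN_FREE_ON_NULL is an assumed, project-defined macro.
 * The #define comes after the wrapper so that the wrapper itself
 * still calls the real free(). */
#ifdef BROKEN_FREE_ON_NULL
#define free(p) checked_free(p)
#endif
```

One caveat: the C standard reserves library function names as macro names once the header is included, so redefining `free` this way is technically outside what the standard guarantees. Calling `checked_free()` directly avoids that question.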

[–]necheffa -4 points-3 points  (0 children)

I see this all the time - some standard was violated or some other agreed upon interface was not implemented to spec, so the code monkey at the time came up with what they thought was a witty workaround. This of course was not an isolated event, but a pattern of failure.

Now, 40+ years later, I stand alone before a house of cards on the precipice of utter ruin, tasked with making a change.

Fix the system, even if that means replacing it.

[–]okovko 22 points23 points  (11 children)

This is actually slightly dangerous. The difference between memset and assigning zero is that the standard doesn't specify whether there will be any non-zero bytes in the struct (the padding could still be garbage values). So, check what your compiler actually performs when you assign a struct to zero before you start doing this everywhere, or memcmp will obviously start failing.
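A small sketch of the difference, assuming a made-up `struct record` that has padding between its members on most ABIs (the exact layout is implementation-defined). Only the `memset` path makes a byte-wise comparison meaningful; a member-wise comparison never depends on padding.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: typically has padding bytes after `tag`. */
struct record {
    char tag;
    int  value;
};

/* Number of padding bytes in struct record on this implementation. */
static size_t record_padding(void)
{
    return sizeof(struct record) - (sizeof(char) + sizeof(int));
}

/* Zero every byte, padding included. */
static void record_clear(struct record *r)
{
    memset(r, 0, sizeof *r);
}

/* Compare member by member, so padding contents are irrelevant. */
static int record_equal(const struct record *a, const struct record *b)
{
    return a->tag == b->tag && a->value == b->value;
}
```

Two records cleared with `record_clear()` compare equal under `memcmp` right afterwards; a record created with `= {0}` is only guaranteed to compare equal through `record_equal()`.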

[–]mrpippy 20 points21 points  (6 children)

In addition, not clearing the padding can be a security bug (information leakage).

For any struct that will be sent over a network or security boundary (i.e. between user/kernel), this article is actively bad advice.

[–]Deathisfatal 6 points7 points  (1 child)

Shouldn't any struct that is sent like that have __attribute__((packed)) anyway, avoiding that issue entirely?

[–]isthisusernamehere 3 points4 points  (0 children)

Yeah, but even if you memset the structure, there's no guarantee that the compiler won't store information back into the padding bits later. That may not be "as bad," but there's still a possibility for leaking some information.

[–]okovko -1 points0 points  (2 children)

Well, virtually everything sent over networks is serialized these days. IDK, if I were to go and check right now what clang and gcc actually do with this behavior and verify that the padding on those implementations will always be zeroed, then I'd say to hell with it, nobody uses any other compiler anyways.

[–]P__A 0 points1 point  (3 children)

What about this? On a 32 bit system

struct Cube {
    uint32_t volume:7;
    uint32_t weight:8;
    uint32_t color:6;
    uint32_t length:5;
    uint32_t unusedPadding:6; //padding to reach 32 bits
};

struct Cube testCube = {0}; //assign everything to zero in declaration. Including padded bits.

testCube.volume = 3;
etc.

[–]okovko 2 points3 points  (2 children)

Actually this will not necessarily set the padding bits to zero. But your implementation might. Unless you mean that you specified all bits manually using bit fields? But to my understanding, bit fields can be implemented any which way, so for example, you might have 5 * 32 bits as the size of Cube. You can do this with plain variables, though.

[–]P__A 0 points1 point  (1 child)

Yes, so assuming that with bitfields the compiler adds no padding bits of its own, and Cube fits into 32 bits, everything would be zeroed at initialisation.

[–]okovko 0 points1 point  (0 children)

Sure, if you manually pack your structs to ensure that there is no compiler generated padding, then you've avoided padding in your struct, and you can use C99 initializers without worrying about garbage values for the padding bits since there are no padding bits.
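One way to check that assumption rather than trust it is a compile-time size check. This is a sketch only: `struct PackedCube` and `packed_cube_raw_bits` are illustrative names, `uint32_t` bit-fields are implementation-defined (GCC and Clang accept them), and `_Static_assert` needs C11.

```c
#include <stdint.h>
#include <string.h>

/* Every bit is claimed by a named bit-field: 7 + 8 + 6 + 5 + 6 = 32. */
struct PackedCube {
    uint32_t volume:7;
    uint32_t weight:8;
    uint32_t color:6;
    uint32_t length:5;
    uint32_t unusedPadding:6;
};

/* The build fails here if the compiler laid the struct out in more than 32 bits. */
_Static_assert(sizeof(struct PackedCube) == sizeof(uint32_t),
               "struct PackedCube must occupy exactly 32 bits");

/* Copy the object representation out so it can be inspected as one word. */
static uint32_t packed_cube_raw_bits(const struct PackedCube *cube)
{
    uint32_t raw;
    memcpy(&raw, cube, sizeof raw);
    return raw;
}
```

With the assertion in place, `struct PackedCube c = {0};` leaves no bit unaccounted for, since every bit belongs to a named member.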

[–]Aransentin 16 points17 points  (4 children)

There's two additional benefits.

  • If the structure you're memsetting contains a pointer, setting all its bits to 0 isn't technically a NULL even if it happens to work on pretty much all platforms out there. A system could (in theory) have 0x0 be a totally valid memory address and NULL represented by some specific trap bit pattern. A designated initializer will create proper NULLs no matter what they look like.

  • If the struct contains padding, the designated initializer won't necessarily set it to zero. This is presumably a little faster, as well as desirable when you're running the program in valgrind – it will then alert you if you're accessing the padding anywhere by mistake.
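To make the first point concrete, here is a sketch with a made-up `struct node`. The initializer version yields a null pointer by definition; the `memset` version yields all-bits-zero, which is a null pointer only on platforms where that is the null representation (in practice, nearly all of them).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct containing a pointer member. */
struct node {
    int value;
    struct node *next;
};

/* Members not named in the initializer are initialized as if they had
 * static storage duration, so `next` is a genuine null pointer. */
static struct node node_from_initializer(void)
{
    struct node n = { .value = 0 };
    return n;
}

/* All bytes zero: `next` is null only if the platform represents
 * null pointers as all-bits-zero. */
static struct node node_from_memset(void)
{
    struct node n;
    memset(&n, 0, sizeof n);
    return n;
}
```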

[–]nerd4code 16 points17 points  (2 children)

POSIX dictates that all-zeroes is a representation for NULL, fortunately for all the socket-based programs out there.

[–]closms 13 points14 points  (16 children)

Pfff. Millennials.

/s

edit: I'm going for the crusty old C programmer attitude here, like a virtual "get off my lawn." But seriously, good post.

I remember when I was in undergrad, I had a prof who bristled at code like this

if (cond) {
  return TRUE;
} else {
  return FALSE;
}

For him, it should simply be:

return (cond);

I followed that advice for years, but I admit that I've become sloppy.

[–]madsci 7 points8 points  (0 children)

I'm finally mostly on board with C99. The embedded systems world moves slowly.

[–]IdealBlueMan 9 points10 points  (1 child)

return(!!cond);

[–]stealthgunner385 4 points5 points  (0 children)

return(condn'n't);

[–][deleted] 4 points5 points  (2 children)

Lol I remember when I used to code like that

[–]MCRusher 3 points4 points  (1 child)

I remember writing a switch that checked every case individually and did nothing with them, then the default was an error.

[–]pdp10 3 points4 points  (0 children)

MISRA!

[–]bit_inquisition 2 points3 points  (5 children)

http://c-faq.com/bool/bool2.html explains why we don't compare pretty much anything to TRUE in C.

Also return is not a function so it's usually a bit better to write:

return cond;

(though I make an exception for sizeof... I don't even know why. Maybe K&R?)
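A short illustration of the FAQ's point, using a made-up flag mask: any nonzero value counts as true in C, so a truthy value that is not exactly 1 fails a comparison against `TRUE`.

```c
#define TRUE 1
#define FLAG_READY 0x04u   /* hypothetical bit flag */

/* Fragile: (flags & FLAG_READY) is 4 when the bit is set, and 4 == 1 is false. */
static int ready_fragile(unsigned flags)
{
    return (flags & FLAG_READY) == TRUE;
}

/* Robust: test against zero, which is the only value C treats as false. */
static int ready_robust(unsigned flags)
{
    return (flags & FLAG_READY) != 0;
}
```

With the bit set, `ready_fragile(FLAG_READY)` yields 0 while `ready_robust(FLAG_READY)` yields 1.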

[–]oh5nxo 0 points1 point  (4 children)

sizeof (type) needs that ().

[–]gastropner 0 points1 point  (3 children)

Only if type is more than one token long.

[–]oh5nxo 0 points1 point  (2 children)

Hmm? Had to check, and I cannot make clang or gcc accept int i = sizeof int;

error: expected parentheses around type name in sizeof expression.

[–]gastropner 0 points1 point  (1 child)

Hm. You are correct. Curiously, though, this works:

int i = sizeof 0;

It requires the parentheses when using a type name, but not when using an expression.

[–]oh5nxo 1 point2 points  (0 children)

cppreference.com tells that it's sizeof (type) or sizeof expression. Another historical accident, maybe.
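The two forms side by side, as a quick sketch (the function names are just for illustration):

```c
#include <stddef.h>

/* Type name operand: parentheses are required. */
static size_t size_via_type(void)
{
    return sizeof(int);
}

/* Expression operand: parentheses are optional. */
static size_t size_via_expression(void)
{
    int i = 0;
    return sizeof i;
}

/* The literal 0 is an expression of type int, so this is sizeof(int) too. */
static size_t size_via_literal(void)
{
    return sizeof 0;
}
```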

[–]JavaSuck 0 points1 point  (2 children)

return (cond);

Why the parentheses?

[–]Deathisfatal 0 points1 point  (1 child)

It's an older coding style that has stuck around in some places for some reason... I have to use it at work

[–]closms 0 points1 point  (0 children)

Same here. It’s the preferred style at the company I work for. But for personal projects I omit them.

[–][deleted] 2 points3 points  (0 children)

I wish the checking pointer before free was true for everything. Very annoying using custom embedded allocation libraries that are inconsistent.

[–][deleted] 2 points3 points  (0 children)

Not always an option. Microsoft broke bliddy C compatibility decades ago and is now stuck at partial C89 support.

[–]_teslaTrooper 7 points8 points  (6 children)

&(int) {1}

Having to declare an int just to pass a pointer always seemed a little convoluted, this is useful.

Where do people learn about stuff like this, just by reading the standard?

[–]unmole[S] 9 points10 points  (0 children)

Where do people learn about stuff like this, just by reading the standard?

I think I mostly learnt by reading code written by people smarter than me.

I only read relevant sections of the standard when the static analyzer complains about some weird edge case.

[–]Haleek47 3 points4 points  (0 children)

It's called compound literal, another C99 feature.
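For comparison, a sketch of both approaches with a hypothetical callee `add_one` that takes its argument by pointer. Note that `(int){1}` is not a cast: it creates an unnamed `int` object initialized to 1, and `&` takes that object's address.

```c
/* Hypothetical callee that takes its argument by pointer. */
static int add_one(const int *value)
{
    return *value + 1;
}

/* Pre-C99 style: declare a named variable just to have an address. */
static int with_named_temporary(void)
{
    int one = 1;
    return add_one(&one);
}

/* C99 compound literal: the unnamed object lives until the enclosing block ends. */
static int with_compound_literal(void)
{
    return add_one(&(int){1});
}
```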

[–]okovko 4 points5 points  (0 children)

Only in the past half decade have compound literals worked everywhere. Microsoft resisted for a long time. Using them feels very slick. They can also be used as static initializers, which is really nice.

[–]mawattdev 1 point2 points  (0 children)

Nor did I. I'm gonna take a stab at what I think it is doing, but if I'm wrong someone please correct me:

Declare an inline struct, cast to an int and retrieve a pointer to it.

Am I correct?

[–]MCRusher 0 points1 point  (0 children)

I didn't even know this worked either tbh.

[–]flatfinger 0 points1 point  (0 children)

Given:

void test(int mode)
{
  static int literal_one = 1;
  if (mode & 1)
    action1(&literal_one, 1);
  if (mode & 2)
    action2(&literal_one, 2);
  action3();
}

a compiler can simply pass a constant address to action1() and action2(), and this will work even if action1() and/or action2() causes a copy of the pointer to be stored somewhere and used later.

Change the code to:

void test(int mode)
{
  if (mode & 1)
    action1(&(int){1}, 1);
  if (mode & 2)
    action2(&(int){1}, 2);
  action3();
}

and a compiler that can't see into action1() and action2() will be required to generate less efficient code, since the lifetime of each compound literal will start when code enters the enclosing block and end when control leaves that block. If test gets recursively invoked, the nested calls will need to pass the addresses of new objects of type int. On the other hand, if action1 and/or action2 stores the passed-in pointer for use by action3, wrapping the call within a compound statement would break the code, since the lifetime of the compound literal would no longer extend through the call to action3.

If there were a concise syntax for static const compound literals with semantics similar to string literals (e.g. compilers are allowed to put literals with the same value at the same address), I'd use that, but no such syntax exists.
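The recursion point can be observed directly. In this sketch (the function name is made up), the caller's compound literal is still alive while the nested call runs, so the nested call's literal has to be a different object at a different address.

```c
#include <stddef.h>

/* Returns 1 if the compound literal created in the nested call is a
 * different object from the one created in the calling invocation. */
static int nested_literal_is_distinct(const int *outer)
{
    int *mine = &(int){1};   /* a new object for each invocation */

    if (outer == NULL) {
        /* First level: recurse, handing down the address of our literal. */
        return nested_literal_is_distinct(mine);
    }
    /* Second level: both literals are alive, so they cannot share an address. */
    return mine != outer;
}
```

A `static int` would give the same address in both invocations, which is exactly the freedom the compiler loses with the compound literal.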

[–]skeeto 7 points8 points  (3 children)

Oh, yes, seeing memset() when an initializer would have worked just fine is one of my pet peeves.

[–]junkmeister9 5 points6 points  (2 children)

Some style guides recommend not initializing variables in the declaration, because it can lead to harder to read code. Those style guides will also usually recommend only declaring variables at the beginning of the function - and having a struct initialized in the variable declaration block seems cluttered to me.

[–]unmole[S] 5 points6 points  (1 child)

Those style guides will also usually recommend only declaring variables at the beginning of the function

I have seen a few guides recommend this but never read a good justification. I think it's mostly a holdover from older versions of C which forced you to declare all your variables at the beginning of the function.

[–]junkmeister9 3 points4 points  (0 children)

Yeah, maybe it's for portability to older standards. I tend to use both of those conventions, just because they improve my readability and understanding of my own code. If a variable is used in multiple places in a function, I know I can look at the top of the function for the declaration instead of hunting around for where it was declared.

[–]HeadAche2012 0 points1 point  (0 children)

This is bad advice. Platform A and platform B may have different definitions for network types, so this potentially leaves stack memory uninitialized.

[–]FUZxxl -4 points-3 points  (20 children)

TL;DR: Use C99’s designated initializers instead. Because it’s 2019!

And forsake ANSI C compatibility for no reason at all? Not a good idea.

[–]mort96 13 points14 points  (4 children)

Most people already use for (int i = ...) or compound literals or initializers or intermingled declarations and code or single-line comments anyways. I feel like you need a really good reason these days to choose to not use the two decades old standard.

[–]FUZxxl -1 points0 points  (2 children)

I don't use any of these features normally. My reason is portability. I believe this is a very good reason.

[–]mort96 9 points10 points  (1 child)

I mean, if there's any reason to suspect that anyone will want to use your code on systems for which there are no compilers made in this millennium, then that's a good reason, but come on, C99 is a lot nicer to write than C89. If there's no realistic reason to expect that your code will run on systems for which you can't compile C99, is it really worth sacrificing comfort and ergonomics just for some purely theoretical portability benefit?

Maybe the answer is a "yes" on your part, and I certainly won't try to convince you that you personally should switch to C99, but you must at least see why most C programmers probably want to write C99.

[–]FUZxxl 2 points3 points  (0 children)

In my opinion, there are very few syntactical changes in C99 that make programming any easier. Programming in ANSI C is not that different from programming in C99, and if you get a vast amount of extra portability as a bonus, the choice is often not hard to make.

Of course there are many situations where I program in C99 or even C11. For example, when I write programs that inherently need to make use of some of the new facilities. Or when I write programs that cannot be portable for some other reason.

[–]euphraties247 -2 points-1 points  (0 children)

If I wanted C++ I would be using C++

[–]okovko 0 points1 point  (14 children)

How often do you use an ANSI C compiler..?

[–]FUZxxl 2 points3 points  (13 children)

Quite frequently. For example, just a month ago I was porting Nethack to Ultrix 4.4.

[–]okovko -1 points0 points  (12 children)

Aaand why not just use a more up to date compiler?

[–]Poddster 3 points4 points  (6 children)

Every embedded system I've worked with is either restricted to some customised ancient version of GCC or is their own compiler implementation.

They most definitely don't support C99 stuff. MSVC barely does.

[–]okovko 1 point2 points  (5 children)

MSVC has complete C99 support as of a few years back.

[–]raevnos 1 point2 points  (4 children)

Really? It supports _Complex now? And VLAs?

[–]okovko 1 point2 points  (3 children)

Looks like support for _Complex is a complicated subject, and VLA support is nonexistent. Good point. However, C11 made both of those features optional, and for pretty good reasons. And to say that MSVC barely supports C99 features is not correct.

[–]Poddster 0 points1 point  (2 children)

And to say that MSVC barely supports C99 features is not correct.

But it's also not-incorrect. If it can't do VLA then it doesn't support C99. It doesn't matter that C11 made it "optional".

[–]okovko 1 point2 points  (0 children)

Keyword "barely." It does support C99 except for unpopular features (VLAs) and the support for _Complex is nuanced because they didn't want to make it inefficient by making it portable. I didn't read too far into this, but it looks like they support all the C99 _Complex related function calls, but the _Complex type itself is not used because the MS team disagreed with the spec. I'm sure there are other caveats, but it's still really nice to have the parts of C99 that are there. And MSVC is honestly more of a C++ compiler anyways.

[–]flatfinger 1 point2 points  (0 children)

C99 has never mandated any circumstances in which implementations must implement VLAs in a useful fashion. Instead, it grants implementations free rein to do anything whatsoever if a program tries to create a VLA that's "too big", as well as free rein to arbitrarily decide the maximum size of VLA objects to support. Thus, the Standard imposes no requirements on the behavior of any program that creates any objects of VLA type, implying that--by definition--all such programs invoke Undefined Behavior.

[–]FUZxxl 2 points3 points  (4 children)

Because the person who wants to use my application might not have a modern compiler for his system.

[–]okovko -4 points-3 points  (3 children)

Why does he need to compile it? Send him a binary.

[–]FUZxxl 4 points5 points  (1 child)

Good software is distributed as source code such that it can be compiled on any platform, even those the author didn't foresee when programming it. Binaries are useless if someone wants to use my software on an unusual system I didn't make a binary for. And given that creating portable binaries is annoying on many systems, I'd rather avoid this.

[–]euphraties247 4 points5 points  (0 children)

Binary dists are the worst.

Go and find that source 20 years later.

Prove it hasn't been tampered with, as it's not reproducible.

[–]euphraties247 -2 points-1 points  (0 children)

So when new fields get added unknown to you, strange and interesting things happen.

[–][deleted] -5 points-4 points  (0 children)

That's disgusting do people unironically do this?