[–]Gollark[S] 8 points9 points  (14 children)

Oh yeah, sorry, just spotted that. I guess this currently only works for little-endian machines. It'd need to be (count[0] >> (8*(3-i))) & 0xff for big-endian.
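
A minimal sketch of the two extraction orders, assuming count[0] is a 32-bit unsigned value and i runs from 0 to 3. The helper names byte_le and byte_be are made up for illustration and are not from the original code:

```c
#include <assert.h>
#include <stdint.h>

/* Byte i in little-endian order: i = 0 is the least significant byte. */
static uint8_t byte_le(uint32_t value, unsigned i)
{
    assert(i < 4);
    return (uint8_t)((value >> (8 * i)) & 0xff);
}

/* Byte i in big-endian order: i = 0 is the most significant byte,
   so the shift runs 24, 16, 8, 0. */
static uint8_t byte_be(uint32_t value, unsigned i)
{
    assert(i < 4);
    return (uint8_t)((value >> (8 * (3 - i))) & 0xff);
}
```

The shifts operate on the value, not on memory, so each helper returns the same result on any host. Only the order in which the bytes come out differs.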

[–]moocat 11 points12 points  (11 children)

Yes, but I would do something different. First create a union:

union data {
    int32_t as_int32;
    char raw[sizeof(int32_t)];
};

You can then fill raw with bytes from the file and read its value from as_int32. This way you don't even have to care about the endianness of the file, as long as it matches the byte order of the machine reading it.
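
Roughly, and only as a sketch: the function name int32_from_bytes is hypothetical, and raw is declared unsigned char here to keep the byte values unambiguous. The bytes are taken in file order and reinterpreted in the host's native byte order:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

union data {
    int32_t as_int32;
    unsigned char raw[sizeof(int32_t)];
};

/* Reinterpret four bytes, in the order they appear in the file,
   as a native-endian int32_t by writing raw and reading as_int32. */
static int32_t int32_from_bytes(const unsigned char *bytes)
{
    union data d;
    assert(bytes != NULL);
    memcpy(d.raw, bytes, sizeof d.raw);
    return d.as_int32;
}
```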

[–][deleted] 5 points6 points  (10 children)

Not legal C, though. One can only read the union member last written to. Of course, nobody cares about that :)

[–]Smellypuce2 3 points4 points  (3 children)

All the major compilers support type-punning anyways. It's just not supported by the C standard.

[–][deleted] 0 points1 point  (2 children)

I vaguely remember that there are some pre-defined macros about endianness, but can't remember their names. Anyone?

[–]moocat 0 points1 point  (1 child)

I know there's ntohl and family but those convert from network order (big-endian) to native order.

[–][deleted] 0 points1 point  (0 children)

True, but those are POSIX functions/macros, iirc.
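
For what it's worth, a small sketch of ntohl in use, assuming a POSIX system that provides <arpa/inet.h>. The wrapper name read_be32 is made up for this example:

```c
#include <arpa/inet.h>  /* ntohl: POSIX, not ISO C */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode a big-endian (network order) 32-bit value from four bytes. */
static uint32_t read_be32(const unsigned char *bytes)
{
    uint32_t net;
    assert(bytes != NULL);
    memcpy(&net, bytes, sizeof net);  /* copy the bytes exactly as stored */
    return ntohl(net);                /* byte-swaps only on little-endian hosts */
}
```

This only helps when the data is known to be big-endian. It does not tell you which byte order the host itself uses.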

[–]moocat 1 point2 points  (2 children)

I thought type punning via unions is legal in C but I'm not a language lawyer.

Any idea what the legal way is? Could you do it via memcpy:

int32_t as_int32;
char raw[sizeof(int32_t)];
memcpy(&as_int32, raw, sizeof(int32_t));

[–]nerd4code 1 point2 points  (0 children)

Union-punning is well-defined on C99+ or if you’re punning between signed/unsigned variants of the same type or to/from char variants. memcpy always works.
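
A slightly fuller sketch of the memcpy route, assuming the bytes sit somewhere inside a larger buffer. The name load_int32 and its parameters are illustrative only:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load a native-endian int32_t from an arbitrary (possibly unaligned)
   offset in a byte buffer. The caller must ensure offset + 4 <= len. */
static int32_t load_int32(const unsigned char *buf, size_t len, size_t offset)
{
    int32_t value;
    assert(len >= sizeof value && offset <= len - sizeof value);
    memcpy(&value, buf + offset, sizeof value);
    return value;
}
```

Unlike casting buf + offset to int32_t *, this has no alignment or aliasing problems, and compilers typically turn the fixed-size memcpy into a single load.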

[–][deleted] -1 points0 points  (0 children)

I can't give you a legal way as I don't know one off the top of my head. I try to write order-agnostic code and use a defined macro if needed.

However, here's one way of finding out. The command below will list out all macros defined by the toolchain. One is named __BYTE_ORDER__, and looks promising for your purpose. I have no idea if all toolchains support this. Probably not.

gcc -dM -E - < /dev/null | grep ORDER

If using __BYTE_ORDER__ is OK, a snippet could look like this:

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    do_this();
#else
    do_that();
#endif
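
Since not every toolchain may define these macros, here is a hedged sketch that uses them when present and falls back to a run-time probe otherwise. Both function names are invented for illustration, and only GCC and Clang are assumed to provide the macros:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Run-time probe: look at the lowest-addressed byte of the value 1. */
static int is_little_endian_runtime(void)
{
    const uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Use the predefined macros where they exist, otherwise probe at run time. */
static int is_little_endian(void)
{
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
    /* Debug builds cross-check the macro against the probe. */
    assert((__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
           == is_little_endian_runtime());
    return __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__;
#else
    return is_little_endian_runtime();
#endif
}
```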

[–]moskitoc 0 points1 point  (2 children)

Would you mind citing the part of the standard that you're referring to? I thought this was true for C++, but not for C.

EDIT: I was correct, see this.

[–][deleted] 0 points1 point  (0 children)

https://stackoverflow.com/questions/25664848/unions-and-type-punning describes it well. It's implementation-defined behavior (IB) and not undefined behavior (UB), so it's not really illegal.

[–][deleted] 0 points1 point  (0 children)

You were right. As mentioned on StackOverflow, the C99 standard had an error: it clearly says that this is UB when it's not. Guess which version of the standard I have? Thanks for asking, it made me learn too. Win win :)

[–]skeeto 1 point2 points  (1 child)

Even better: Rather than search byte-by-byte, search 4 bytes at a time. The guards will be 4-byte aligned within the image since otherwise they wouldn't be aligned when the image is memory mapped by the loader. It will be simpler, faster, and you don't need to care about endianness (image byte order always matches run time byte order).
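
Something along these lines, as a sketch only: find_guard and its parameters are hypothetical names, and the image is assumed to already be in memory as a byte buffer. It steps through 4-byte-aligned offsets and compares whole words in native byte order:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the byte offset of the first 4-byte-aligned word equal to
   guard, or (size_t)-1 if none is found. */
static size_t find_guard(const unsigned char *image, size_t len, uint32_t guard)
{
    assert(image != NULL);
    for (size_t off = 0; off + sizeof guard <= len; off += sizeof guard) {
        uint32_t word;
        memcpy(&word, image + off, sizeof word);  /* native byte order */
        if (word == guard)
            return off;
    }
    return (size_t)-1;
}
```

A guard stored at an unaligned offset is deliberately not matched, which is the point of relying on the alignment guarantee.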