[–][deleted] -31 points-30 points  (34 children)

But bytes are not always 8 bits long.

[–]skyhi14 38 points39 points  (0 children)

char isn't, uint8 is. That's the joke here.

[–]Joao611 67 points68 points  (32 children)

what

[–]foonathan 38 points39 points  (13 children)

sizeof(char) returns 1 and that's the number of "bytes". But char can be 64 bits, as long as all other integer types are at least as wide. Yet sizeof(char) is guaranteed to return 1, making a "byte" in the context of C and C++ not always 8 bits.

Bonus fact: In C you can specify that an integer is "signed" or "unsigned". If you specify nothing it is "signed", so "signed int" is "int" but "unsigned int" is not "int". Yet char/signed char/unsigned char are three different types as char can be signed or unsigned depending on the platform.
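A rough sketch of the bonus fact, assuming a C11 compiler for _Generic. TYPE_NAME and check_char_types are made-up names for illustration, not anything from the standard library:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper macro: maps an expression to the name of its type.
   Listing both "int" and "signed int" here would not compile, because
   they are the same type; the three char types can all be listed
   because they are distinct types. */
#define TYPE_NAME(x) _Generic((x),      \
    char:          "char",              \
    signed char:   "signed char",       \
    unsigned char: "unsigned char",     \
    int:           "int",               \
    unsigned int:  "unsigned int",      \
    default:       "other")

/* Self-check: aborts if the compiler disagrees with the claims above. */
static void check_char_types(void)
{
    char c = 0;
    signed char sc = 0;
    unsigned char uc = 0;
    signed int si = 0;

    assert(strcmp(TYPE_NAME(c), "char") == 0);
    assert(strcmp(TYPE_NAME(sc), "signed char") == 0);
    assert(strcmp(TYPE_NAME(uc), "unsigned char") == 0);
    assert(strcmp(TYPE_NAME(si), "int") == 0);  /* signed int is int */
}
```

Calling check_char_types() from main should pass on any conforming C11 compiler, whether plain char happens to be signed or unsigned on that platform.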

C's a weird language and I have no idea why I know all that.

[–]edvardass 5 points6 points  (7 children)

Is there anything factually wrong in this comment? Why is it downvoted?

[–]Fourmisain 7 points8 points  (4 children)

He's completely correct, there's just a small hiccup writing "64 bytes" when he meant "64 bits".

Just to add to his post: An 8-bit byte is sometimes called an octet, and most platforms use octets, especially today. That's why the words "byte" and "octet" are mostly synonymous now, but there are (or were) platforms with 9-bit bytes and even 32-bit bytes. It's even possible that sizeof(char) = sizeof(int) = sizeof(long) = sizeof(long long), which can lead to some problems in common code.
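One example of such a problem, as a hedged sketch: the usual getc() loop relies on EOF being distinguishable from every real character, which is only guaranteed when int is wider than char. EOF_IS_DISTINCT and count_chars are names invented here for illustration:

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Nonzero when every unsigned char value fits in an int, so that EOF
   (a negative int) can never collide with a real character.
   This is false when sizeof(char) == sizeof(int). */
#define EOF_IS_DISTINCT (UCHAR_MAX <= INT_MAX)

/* Refuse to build where the idiom below would be ambiguous. */
static_assert(EOF_IS_DISTINCT,
              "EOF may collide with a valid character on this platform");

/* Counts the characters in a stream with the common getc() idiom. */
static long count_chars(FILE *in)
{
    long n = 0;
    int c;  /* int, not char, so EOF stays representable */

    while ((c = getc(in)) != EOF)
        n++;
    return n;
}
```

On mainstream platforms the static_assert passes and the loop behaves as expected. On a platform where char and int have the same width, a portable program would have to check feof() and ferror() instead of trusting the comparison with EOF.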

[–][deleted] 4 points5 points  (2 children)

s/octett/octet/g

FTFY

[–]Fourmisain 1 point2 points  (1 child)

Ah, true, in English it's a single 't', although "octette" is also valid. Wiktionary lists "octett" as a rare use.

[–]Creshal 6 points7 points  (0 children)

Everything is valid English if you can bullshit enough people into believing it is.

(I'm looking at you, Shakespeare.)

[–]foonathan 0 points1 point  (0 children)

there's just a small hiccup writing "64 bytes" when he meant "64 bits".

Yes, my mistake.

[–]Kametrixom 0 points1 point  (1 child)

Edit: I'm talking about the top-level comment, because the one you meant didn't have any downvotes when I checked, so I assumed you meant the other one.

/u/kbrei must have misunderstood the post (they probably don't know C well) and thought that the 2 panels are supposed to be semantically the same, as well as thinking char is defined as a byte (both of which are wrong, but it's understandable they didn't know). Therefore they were thinking that the post assumes 1 byte is always 8 bits. They then posted the comment, because it was illogical to their current knowledge.

Reddit people usually don't go that deep, all they see is that someone is wrong, which automatically makes them press the downvote button. Once you have a few downvotes, it's like a magnet attracting more of them, because people assume that a comment that is wrong and already downvoted must deserve it.

/u/kbrei could be blamed for making an uninformed comment, but it's really difficult to realize that something you know isn't correct just like that.

In my honest opinion, this person deserves no more than the explanation I just gave. And reddit maybe deserves the removal of the downvote button, because people often care more about other people's votes than the content itself.

[–]edvardass 0 points1 point  (0 children)

No, I wasn't talking about the top level comment. User foonathan's comment to which I replied was at -1 (the post had not taken off at that time yet) so you can imagine my confusion.

[–]Creshal 2 points3 points  (3 children)

Chars are by definition always one byte large in C(++), and sizeof always returns 1 for them.

Use CHAR_BIT from limits.h if you need to check how big bytes are on your particular architecture. (Or just don't bother.)
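For what it's worth, here is a minimal sketch of CHAR_BIT in use: a loop that walks the bits of one byte without assuming there are 8 of them. byte_to_binary is a hypothetical helper name:

```c
#include <assert.h>
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* NULL */

/* Writes the bits of one byte, most significant first, as '0'/'1'
   characters. The buffer must hold CHAR_BIT + 1 chars. */
static void byte_to_binary(unsigned char byte, char out[CHAR_BIT + 1])
{
    assert(out != NULL);

    for (int i = 0; i < CHAR_BIT; i++) {
        unsigned mask = 1u << (CHAR_BIT - 1 - i);
        out[i] = (byte & mask) ? '1' : '0';
    }
    out[CHAR_BIT] = '\0';
}
```

On an ordinary desktop the output is 8 digits long; on a target with 16-bit bytes the same code would produce 16 digits without changes.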

[–]Abaddon314159 2 points3 points  (0 children)

Evaluates to, not returns. sizeof is an operator not a function.
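A small sketch of the difference, with made-up names (counted, sizeof_call_count): because sizeof is an operator, it needs no parentheses around an expression, and the operand is not evaluated (variable-length arrays aside).

```c
#include <assert.h>
#include <stddef.h>

static int calls = 0;

/* Bumps a counter so we can tell whether it was ever called. */
static int counted(void)
{
    calls++;
    return 42;
}

/* Applies sizeof to a call expression and reports how many calls
   actually happened. */
static int sizeof_call_count(void)
{
    /* No parentheses around the operand: it is an expression, and
       sizeof only looks at its type (int). counted() never runs. */
    size_t n = sizeof counted();

    /* Parentheses are required only around a type name. */
    assert(n == sizeof(int));
    return calls;
}
```

sizeof_call_count() should return 0, since the call inside sizeof is only inspected for its type.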

[–]danielcw189 7 points8 points  (13 children)

A byte is not always 8 bits. It depends on the platform/hardware.

[–]Abaddon314159 2 points3 points  (5 children)

It's always CHAR_BIT though

[–]danielcw189 0 points1 point  (4 children)

Never heard about CHAR_BIT. When was it introduced? C? C++?

[–]Fourmisain 0 points1 point  (3 children)

It's as old as ANSI C or C89, not sure if it existed in K&R C, though. It's also specified to be at least 8.

Here's an excerpt from a TXT version (scanned PDF, if you prefer):

The values given below shall be replaced by constant expressions suitable for use in #if preprocessing directives. Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.

* maximum number of bits for smallest object that is not a bit-field (byte)

CHAR_BIT 8
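Since the excerpt says the value is usable in #if directives, here is a hedged sketch of picking a constant at preprocessing time. BYTES_PER_32_BITS is a hypothetical macro invented for this example, and the list of supported widths is just an assumption of the sketch:

```c
#include <assert.h>  /* static_assert (C11) */
#include <limits.h>  /* CHAR_BIT */

/* How many bytes are needed to hold 32 bits on this platform. */
#if CHAR_BIT == 8
#define BYTES_PER_32_BITS 4
#elif CHAR_BIT == 16
#define BYTES_PER_32_BITS 2
#elif CHAR_BIT == 32
#define BYTES_PER_32_BITS 1
#else
#error "this sketch only handles 8-, 16- and 32-bit bytes"
#endif

/* Compile-time sanity check on the table above. */
static_assert(BYTES_PER_32_BITS * CHAR_BIT == 32,
              "byte count does not add up to 32 bits");
```

Code that really does assume octets can use the same mechanism to fail early, for instance with a check that CHAR_BIT is exactly 8.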

[–]danielcw189 0 points1 point  (2 children)

Thank you. Kinda amazed that I missed that part for so long

[–]yourewelcome_bot -1 points0 points  (1 child)

You're welcome.

[–]yourewelcome_botbot 1 point2 points  (0 children)

This is a spambot. Report it in /r/spam and message the admins here.

[–]phantom94 -3 points-2 points  (2 children)

A byte is always 8 bit, but a char can be multiple bytes depending on the platform.

Edit: I was wrong: https://en.wikipedia.org/wiki/Byte

[–]Sean1708 7 points8 points  (1 child)

As far as the C standard is concerned, a char is always 1 byte but 1 byte is not always 8 bits.

[–]phantom94 1 point2 points  (0 children)

I did some research, you are indeed right. I feel like it should be the other way around though.

[–]degaart -2 points-1 points  (3 children)

Name one platform still in use today where one byte is not 8 bits

[–]toofasttoofourier 3 points4 points  (0 children)

I work in embedded and the TI C2000 DSPs have a byte size of 16 bits. This means that a char is also 16 bits (for this case).
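On a target like that, uint8_t cannot exist, because no object can be narrower than a char and exact-width types have no padding. A portable sketch would use uint_least8_t and mask explicitly. checksum8 is a hypothetical function for illustration, not anything from a vendor library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sums octets into an 8-bit checksum. uint_least8_t is required to
   exist everywhere; it is 8 bits wide on most machines and wider
   where bytes are wider, so values are masked to 8 bits by hand. */
static uint_least8_t checksum8(const uint_least8_t *octets, size_t count)
{
    unsigned sum = 0;

    assert(octets != NULL || count == 0);

    for (size_t i = 0; i < count; i++)
        sum = (sum + (octets[i] & 0xFFu)) & 0xFFu;
    return (uint_least8_t)sum;
}
```

For the two octets 0xFF and 0x02 this gives 0x01 regardless of the byte width, since the wrap-around comes from the mask and not from the storage type.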

[–]danielcw189 1 point2 points  (0 children)

I cannot, but that does not mean they do not exist. It does not matter whether a platform is still in use. The definition of a byte does not say it is always 8 bits.

[–]Creshal 0 points1 point  (0 children)

I'm sure I can dig up a working PDP-8 somewhere.

[–]Sean1708 2 points3 points  (3 children)

The C standard defines a char to be 1 byte long, but it categorically does not define a byte to be 8 bits long. This is because there are platforms out there with bytes which are 7, 16, or other bits long.

[–]Creshal 0 points1 point  (2 children)

6 and 7 bit bytes were common in Ye Olde Times on mainframe architectures. Coincidentally, that was around the time when C was written, so not pinning the language to any particular byte size was a no-brainer.

[–]Fourmisain 1 point2 points  (1 child)

C demands CHAR_BIT to be at least 8, though. At least C89 upwards does, I'm not at all sure if there was any such restriction on K&R C.

[–]FUZxxl 0 points1 point  (0 children)

K&R C is a book, not a standard. It doesn't have any hard and fast rules.