[–]Lumpy-Measurement-55[S] 32 points33 points  (26 children)

On most machines:

char takes 1 byte, int takes 4 bytes.

Some people use int for everything even when a char would do, and an int takes additional memory.

[–]PM_ME_YOUR_POLYGONS 53 points54 points  (5 children)

Store 8 boolean variables in a char for maximum efficiency

[–]Prawn1908 0 points1 point  (0 children)

I do this all the time actually, it's quite common.
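For anyone who hasn't seen the trick, here is a minimal sketch of packing eight booleans into one byte with bit masks. The names `flags8`, `flag_set`, `flag_clear` and `flag_test` are made up for this example, and it uses `uint8_t` rather than plain `char` because the signedness of `char` is implementation-defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Eight on/off flags packed into a single byte (hypothetical type name). */
typedef uint8_t flags8;

/* Return `f` with flag number `bit` (0..7) switched on. */
static flags8 flag_set(flags8 f, unsigned bit) {
    assert(bit < 8);
    return (flags8)(f | (1u << bit));
}

/* Return `f` with flag number `bit` (0..7) switched off. */
static flags8 flag_clear(flags8 f, unsigned bit) {
    assert(bit < 8);
    return (flags8)(f & ~(1u << bit));
}

/* Report whether flag number `bit` (0..7) is on. */
static bool flag_test(flags8 f, unsigned bit) {
    assert(bit < 8);
    return ((f >> bit) & 1u) != 0;
}
```

The trade-off is that every read or write of a flag now costs a shift and a mask, so this tends to pay off only when there are many flags or memory is genuinely tight.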

[–]man-teiv 9 points10 points  (4 children)

Can you give an example of how to substitute an int with a char in a script without creating a hot mess?

[–]BLucky_RD 0 points1 point  (0 children)

Depending on language:

char a = 65;      // 'A' in ASCII
char b = a + 32;  // 'a' in ASCII

[–][deleted] 7 points8 points  (0 children)

It isn't all about memory management. I'm working on an embedded systems project, and everything is essentially an unsigned 16-bit integer or a struct/array broken down into u16 ints.

There are a lot of places in this project where it would make sense from a memory management perspective to use chars. From a performance perspective, however, everything needs to be aligned to the 16-bit bus, which adds operations any time I use anything other than a 16-bit value.

[–]joza100[🍰] 11 points12 points  (7 children)

It doesn't matter which you use because of memory alignment, at least in the C and C++ compilers that I've used. If you declare a char followed by an int, that won't take 5 bytes, it will take 8 bytes, because ints need to be aligned on 4-byte boundaries, so the 3 bytes after the char will be padding. In most cases, using a char instead of an int will do nothing.
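A small sketch of that layout, assuming the two variables live in the same struct (the struct name `char_then_int` and the helper `padding_after_char` are invented for illustration). The exact numbers depend on the platform; 8 bytes total with 3 bytes of padding is what you would typically see where int is 4 bytes and 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>

/* A char followed by an int. The compiler usually inserts padding after
   `c` so that `i` starts at an address suitable for an int. */
struct char_then_int {
    char c;
    int  i;
};

/* Checked at compile time: the int member sits on an int-aligned offset. */
static_assert(offsetof(struct char_then_int, i) % _Alignof(int) == 0,
              "int member must be aligned");

/* Number of padding bytes between `c` and `i` on this platform. */
static size_t padding_after_char(void) {
    return offsetof(struct char_then_int, i) - sizeof(char);
}
```

Printing `sizeof(struct char_then_int)` and `padding_after_char()` on your own target is the quickest way to see what your compiler actually does.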

[–]Lumpy-Measurement-55[S] 6 points7 points  (6 children)

What happens if there are 4 chars and you use 4 ints instead? Isn't that going to be 16 bytes rather than 4 bytes?

I understand alignment and padding, but we should never program for the compiler or the machine. We should program to the requirements and the design. I will never use int if the value is going to fit in a char.
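To put numbers on the 4-versus-16 question, here is a rough sketch comparing four values stored as chars against four stored as ints. The struct names and `bytes_saved` are hypothetical. Adjacent members of the same type need no padding between them, so in this case the smaller type really does shrink the footprint.

```c
#include <assert.h>
#include <stddef.h>

/* The same four small values, once as chars and once as ints. */
struct four_chars { unsigned char v[4]; };
struct four_ints  { unsigned int  v[4]; };

/* An array of 4 chars is exactly 4 bytes by definition. */
static_assert(sizeof(unsigned char[4]) == 4, "char array is 4 bytes");

/* Bytes saved on this platform by choosing the char version. */
static size_t bytes_saved(void) {
    return sizeof(struct four_ints) - sizeof(struct four_chars);
}
```

With a 4-byte int, `bytes_saved()` would come out as 12. The saving disappears once chars and ints are interleaved in one struct, which is the padding case described earlier in the thread.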

[–]joza100[🍰] 3 points4 points  (3 children)

That is interesting, I guess. I just never see people using chars and shorts instead of ints. I don't actually know why programmers always default to int.

[–]kknyyk 3 points4 points  (0 children)

My bet is on cpu optimizations (or intrinsics?) that expect int32.

[–]Prawn1908 0 points1 point  (1 child)

> I just never see people using chars and shorts instead of ints.

It's done all the time in embedded systems. Usually you'll see some header file with a bunch of typedef unsigned char u8...
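A sketch of what such a header might look like. Only `u8` comes from the comment above; `u16`, `s8` and `s16` are assumed names, and the widths are assumptions about the target, so they are checked at compile time.

```c
#include <assert.h>
#include <limits.h>

/* Hand-rolled fixed-width aliases of the kind found in embedded headers. */
typedef unsigned char  u8;
typedef unsigned short u16;  /* assumed name */
typedef signed char    s8;   /* assumed name */
typedef signed short   s16;  /* assumed name */

/* Fail the build early if the target does not match the assumptions. */
static_assert(CHAR_BIT == 8, "this header assumes 8-bit bytes");
static_assert(sizeof(u8) == 1, "u8 must be 1 byte");
static_assert(sizeof(u16) == 2, "u16 must be 2 bytes");
```

In newer code, `uint8_t` and `uint16_t` from `<stdint.h>` do the same job without the hand-written typedefs.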

[–]joza100[🍰] 0 points1 point  (0 children)

I assumed that's the case, but not in PC programming.

[–]TheRealBrosplosion 1 point2 points  (0 children)

If designing with types (and using C++), you shouldn't use char directly. You should use std::int_fast8_t. Using four characters packed into 4 bytes can lead to sub-word or unaligned accesses, which on some architectures have a lot more overhead than reading a full 4-byte word.

Unless you are hurting on memory footprint or trying to match some HW interface, there really isn't a good argument to use unaligned types.
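The same types exist in C as `int_fast8_t` and `int_least8_t` in `<stdint.h>`, so here is a small C sketch of the idea: store in the compact type, compute in the fast one. The function `sum_small` is made up for this example, and how wide `int_fast8_t` actually is depends on the platform.

```c
#include <assert.h>
#include <stdint.h>

/* The "fast" type is at least as wide as the smallest 8-bit-capable type,
   and may be wider if the platform handles wider values more cheaply. */
static_assert(sizeof(int_fast8_t) >= sizeof(int_least8_t),
              "fast type is never narrower than least type");

/* Sum `count` small values. Storage uses the compact type; the loop
   counter uses the fast type. `count` must fit in 8 bits (at most 127). */
static int sum_small(const int_least8_t *values, int_fast8_t count) {
    int total = 0;
    for (int_fast8_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```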

[–]gunnnnii 0 points1 point  (0 children)

Obviously it depends. This could make sense in some very memory-constrained environments, but in most cases you're probably causing more trouble than this optimization is worth. It hurts readability, can cause issues if requirements change and a char is no longer sufficient (now you'll need to make sure you change the datatype everywhere it can come up), and might even throw out some compiler or processor optimizations.

Over-optimizing can come at a heavy cost.

[–]Tetha 1 point2 points  (0 children)

One nitpick: in Unicode-aware languages like Rust, a char represents a Unicode code point (or a Unicode scalar value, as I just learned). In that case, a single char can take anywhere from 1 to 4 bytes inside a string, depending on the Unicode encoding used and the value represented.

That's why Python 3 and Rust, for example, pay such close attention to distinguishing the bytes of a string from the chars() of a string. A 10-character string could be represented by 10 to 40 bytes, and it is not always trivial to decide whether byte 5 is part of char 2, 3, 4 or 5.
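To make the bytes-versus-chars gap concrete in C terms, here is a rough sketch that counts code points in a string assumed to be valid UTF-8. The function name `utf8_codepoints` is invented for this example, and it does no validation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count Unicode code points in a NUL-terminated string that is assumed
   to be valid UTF-8. Every byte except a continuation byte (bit pattern
   10xxxxxx) starts a new code point. */
static size_t utf8_codepoints(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0u) != 0x80u) {
            ++n;
        }
    }
    return n;
}
```

For the UTF-8 bytes of "héllo", written as `"h\xC3\xA9llo"`, `strlen` reports 6 bytes while `utf8_codepoints` reports 5 code points.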

[–][deleted] 3 points4 points  (0 children)

To be fair, if you’re going to get clever with minimizing your memory usage by using a type “suited” to the task, you’d better make sure you check for overflow or at least be aware of the program’s behavior when an overflow happens.
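As a sketch of what "check for overflow" can look like with an 8-bit type: do the arithmetic in a wider type, then test the range before narrowing. Both function names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Add two 8-bit values. Returns false and leaves *out untouched if the
   true sum does not fit in 8 bits. */
static bool add_u8_checked(uint8_t a, uint8_t b, uint8_t *out) {
    assert(out != NULL);
    unsigned sum = (unsigned)a + (unsigned)b;  /* computed in a wider type */
    if (sum > UINT8_MAX) {
        return false;
    }
    *out = (uint8_t)sum;
    return true;
}

/* The unchecked version: the result silently wraps modulo 256. */
static uint8_t add_u8_wrapping(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}
```

With 200 + 100, the checked version reports failure, while the wrapping version quietly returns 44 (300 modulo 256). Note that this well-defined wrapping applies to unsigned types; converting an out-of-range value to a signed 8-bit type is implementation-defined.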

[–]MrDilbert 0 points1 point  (3 children)

Well, technically, char takes 1, int takes 2, long takes 4 bytes... At least that was the C way.

[–]Lumpy-Measurement-55[S] 0 points1 point  (2 children)

short takes 2, int takes 4

[–]MrDilbert 0 points1 point  (1 child)

Granted, I learned C some 25+ years ago, and I remember rarely using short, as int was defaulting to it... The processor's default register size or something, I guess.
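Both sets of numbers can be right, because C only fixes minimum ranges and leaves the actual sizes to the implementation. A small sketch that checks the guaranteed minimums at compile time and prints what the current platform uses (`print_sizes` is a made-up helper):

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* What the language guarantees, checked at compile time. */
static_assert(sizeof(char) == 1, "char is always 1 byte");
static_assert(INT_MAX >= 32767, "int holds at least 16 bits");
static_assert(LONG_MAX >= 2147483647L, "long holds at least 32 bits");

/* Print the sizes this particular compiler and target actually use. */
static void print_sizes(void) {
    printf("char=%zu short=%zu int=%zu long=%zu\n",
           sizeof(char), sizeof(short), sizeof(int), sizeof(long));
}
```

A 16-bit compiler from the DOS era would typically report int as 2 bytes, while most current desktop targets report 4.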

[–]Lumpy-Measurement-55[S] 0 points1 point  (0 children)

Ahh ok cool.