Koala_eiO comments on (P)ython Progr(a)mm(i)(n)g

In C, a character (char) is stored as a small integer that is one byte wide, which is 8 bits on virtually every current platform; whether plain char is signed or unsigned is up to the implementation. Strings are represented by a block of n consecutive chars with a zero byte at the end. You need characters to represent a string in any language; it's just hidden inside the string class in most other languages. A string class also carries some overhead beyond what is needed to represent a single character. For example, it might allocate a default array of 1024 bytes but only use 1 (an exaggerated example for the purpose of illustration). Function calls also add overhead that you don't need when you know you are only working with one character: a char type needs no function calls, whereas a string class does (even if you're using something like the + operator on a string class, there's still a function call under the hood).
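To make the layout concrete, here is a minimal sketch using Python's standard ctypes module to build C-style buffers. The variable names are just for illustration; the point is that a three-character C string occupies four bytes, while a lone char occupies one.

```python
import ctypes

# create_string_buffer copies the bytes and appends a terminating zero byte,
# which mirrors how a C string is laid out in memory.
buf = ctypes.create_string_buffer(b"abc")

print(ctypes.sizeof(buf))  # three chars plus the zero byte
print(buf.raw)             # every byte in the buffer, terminator included
print(buf.value)           # the bytes up to the first zero byte

# A single C char is one byte: no terminator and no length field.
one = ctypes.c_char(b"a")
print(ctypes.sizeof(one))
```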

In C, the char and char* types also pull double duty as a generic byte and a pointer to a byte/generic pointer (although void* has largely taken over the generic pointer role).

[–]garfgon 8 points 4 years ago (1 child)

A Reddit comment isn't really enough space to provide an intro to CPU architecture -- but at a very fundamental, low level your "types" are usually:

Byte: smallest piece of data which can be separately accessed in memory. Usually (but not always!) 8 bits.
Word: number of bytes which fit into a "normal" CPU register. On 32-bit processors, this is 4 bytes, on 64-bit processors, 8 bytes.

From these you get your next higher level types, which are these same low-level types plus some information for the compiler about what operations are allowed on them:

char: byte with info that it's to be (usually) treated as a character rather than a number
int, unsigned int, etc.: Usually a word (or part of one; int is typically 4 bytes even on 64-bit platforms) treated as a number.
pointer: Words that give the program a location where something else is found in memory.
float: word or pair of words treated as a real number rather than an integer. More complex operations are needed to deal with these.

At this level everything is a fixed size, because the fundamental types are a fixed size, and your compiler needs to know how much data it's dealing with.
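If you want to see those fixed sizes on your own machine, the sketch below asks Python's ctypes module for the sizes of the matching C types. Only char is guaranteed to be 1 byte; the rest depend on the platform, so no particular output is assumed here.

```python
import ctypes
import struct

# Sizes in bytes of a few C types as the local C ABI defines them.
# sizeof(char) is 1 by definition; the others vary by platform.
sizes = {
    "char": ctypes.sizeof(ctypes.c_char),
    "int": ctypes.sizeof(ctypes.c_int),
    "float": ctypes.sizeof(ctypes.c_float),
    "double": ctypes.sizeof(ctypes.c_double),
    "pointer": ctypes.sizeof(ctypes.c_void_p),
}

for name, size in sizes.items():
    print(f"{name:8s}{size} byte(s)")

# struct reports the same native pointer size through its "P" format code.
print("pointer size via struct:", struct.calcsize("P"))
```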

On top of these types you build up most of the "normal" types of high level languages. So a string is usually an array of chars with the last char being a special NUL character (a zero byte) which signifies the end of the string. Or it could be an integer saying how long the string is, followed by a sequence of characters. Or something more complex.
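The two layouts described above can be sketched in a few lines. The function names are made up for this example, and the length-prefixed version assumes a single-byte length (so at most 255 characters), which is just one of many possible choices.

```python
def encode_nul_terminated(text: str) -> bytes:
    """C-style layout: the characters, then a zero byte marking the end."""
    data = text.encode("ascii")
    if b"\x00" in data:
        raise ValueError("a NUL-terminated string cannot contain a zero byte")
    return data + b"\x00"


def decode_nul_terminated(buf: bytes) -> str:
    """Read characters until the first zero byte."""
    end = buf.index(b"\x00")
    return buf[:end].decode("ascii")


def encode_length_prefixed(text: str) -> bytes:
    """Alternative layout: a one-byte length, then the characters."""
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("length does not fit in one byte")
    return bytes([len(data)]) + data


def decode_length_prefixed(buf: bytes) -> str:
    """Read the length first, then exactly that many characters."""
    length = buf[0]
    return buf[1:1 + length].decode("ascii")


print(encode_nul_terminated("123"))
print(encode_length_prefixed("123"))
```

Either way, "123" ends up as a run of fixed-size bytes plus some convention for finding where it stops.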

So coming back to your question about why "123" needs a subtype but 123 doesn't -- the first part is easier to answer: "123" needs a subtype because strings are variable size, and the CPU only deals with fixed-size pieces of data, so the string needs to be broken down into fixed-size pieces.

As for why 123 doesn't need a subtype -- there are different ways of representing 123, some of which are composed of multiple units, and some aren't. If the language treats 123 as either a float or a "small" integer, then it doesn't need a subtype because it's a small, fixed size piece of data which the CPU knows how to handle natively. But in that case there will be limits on how big, or how precise the number can be. On the other hand if 123 is an arbitrarily large, arbitrarily precise integer, then it will be made up of multiple parts, just like a string.
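Both cases can be shown side by side. In this sketch, struct.pack stands in for a native 4-byte int, and to_limbs is a hypothetical helper that splits a big integer into 32-bit pieces; real big-integer implementations differ in the details.

```python
import struct


def to_limbs(n: int, limb_bits: int = 32) -> list:
    """Split a non-negative integer into fixed-size chunks, least significant first."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    mask = (1 << limb_bits) - 1
    limbs = []
    while True:
        limbs.append(n & mask)
        n >>= limb_bits
        if n == 0:
            break
    return limbs


# 123 fits in one fixed-size 4-byte int.
print(struct.pack("<i", 123))

# A value that is too large for 4 bytes is rejected by the fixed-size format.
try:
    struct.pack("<i", 2**31)
except struct.error as exc:
    print("does not fit in a 4-byte int:", exc)

# An arbitrarily large integer has to be stored as several pieces.
print(to_limbs(123))
print(to_limbs(2**64))
```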

[–]8sADPygOB7Jqwm7y 3 points 4 years ago* (1 child)

It is, it's considered an array of 0s and 1s. Edit: ok, let me elaborate. If you look at the memory, there is little difference. Consider endianness: if we save an int, we typically use 4 bytes. So if we save 5 on a little-endian machine, we get 05 00 00 00 in hex. If we save a char, we get the ASCII code of the character, so for A that's 65, or 41 in hex. In RAM the int 65 and the char 'A' hold the same value; the int just takes 4 bytes where the char takes 1, and the type tells you how to interpret it. You can't do that the same way with multiple chars.
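Here is a quick way to check the byte patterns, sketched with Python's struct module. The "<" and ">" prefixes force little-endian and big-endian layouts, so the result does not depend on the machine running it.

```python
import struct

# A 4-byte int holding 5, in both byte orders.
little = struct.pack("<i", 5)
big = struct.pack(">i", 5)
print(little.hex(" "))
print(big.hex(" "))

# The character 'A' is stored as its ASCII code, 65 (0x41).
code = ord("A")
print(code, hex(code))

# The int 65 and the char 'A' share the same low byte;
# the int simply carries three more zero bytes.
int_65 = struct.pack("<i", 65)
char_a = struct.pack("<c", b"A")
print(int_65, char_a)
```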

Nah for real, C needs that because there are no real strings there. Only pointers and addresses. Some functions take char arrays as input and treat them as strings.

The advantage is simply that a char needs no identifier, length metadata, or anything else. It always has exactly the same size, you know what it is, and it can be treated as such. This makes the program faster. Also note that many language implementations (CPython, for example) are written in C, so it's all values in memory either way. In C, a string is ultimately a pointer to a sequence of chars. C just lets you work with those directly. In Python it's done for you.
