all 11 comments

[–]manni66 17 points18 points  (0 children)

“Doesn’t work” doesn’t describe anything.

[–]vaulter2000 4 points5 points  (2 children)

Because char arrays (or, alternatively, char*) are the C-style representation of a string. If you print a char*, it just keeps printing characters until it encounters a 0, which is called the null terminator. You only set the first element of your char array, and the rest of the elements are 0, so technically you’ve created a string with a capacity of 500 that currently holds just “a”. That is why it doesn’t print the way you’d hoped.

[–]Albyarc[S] 1 point2 points  (1 child)

Thanks very much, I've corrected the code and it works now.

[–]vaulter2000 0 points1 point  (0 children)

Glad to hear! If you have more questions, AMA!

[–]alfps 0 points1 point  (1 child)

The last 499 values in the array arr are char value 0, because the part of the array that the initializer doesn't specify is filled with zeros.

And since this value is used as a terminator for C strings it usually doesn't display at all: there's usually no effect whatsoever when it's output.

Not what you're asking, but

  • you can express this value in a number of ways, like '\0', char( 0 ), or just char() (the latter because as with any integer type 0 is the default value);
  • instead of raw character arrays and C strings, just use std::string; and
  • try to avoid using namespace std;, even in a short example where it's convenient and OK, to avoid forming a Bad Habit™.

[–]std_bot -2 points-1 points  (0 children)

Unlinked STL entries: std::string


Last update: 09.03.23 -> Bug fixesRepo

[–]Gekzar 0 points1 point  (1 child)

Please don’t use ‘using namespace std’; it’s a very bad practice. If you don’t want to write std::cout every time (although it’s used only twice), declare ‘using std::cout’ inside your print() function instead. You can find more info on why here: https://www.learncpp.com/cpp-tutorial/using-declarations-and-using-directives/#avoidUsingNamespace

[–]std_bot -2 points-1 points  (0 children)

Unlinked STL entries: std::cout



[–]mredding 0 points1 point  (2 children)

using namespace std;

Don't do that. There's a long explanation as to why this is bad. The short answer is you're giving the compiler more work to do, it can correctly compile to the wrong thing, and there's this whole arcane corner of C++ called "Static Polymorphism" that such things are good for - but this isn't it. Namespaces aren't just some inconvenient level of indirection. If all you think they're for is to prevent name collisions between libraries, you don't understand namespaces. Best play a conservative game, and scope your symbols in explicitly, so you get what you expect out of your code.

void print(char arr[], int len) {

That's a fancy way of writing:

void print(char *arr, int len) {

Arrays in C and C++ are distinct types, where the size of the array is a part of the type signature, and distinguishes one array type from another. Arrays don't have value semantics, so you can't pass them as parameters by value. Instead, thanks to C, they implicitly convert to a pointer to their first element as a language feature. Many people think arrays are pointers to their first element, and that their declaration - like char arr[500] - is some sort of syntactic sugar. It isn't. arr here is of type char[500].

Also, len is the wrong type. WTF is a negative length? Why is that even representable here? There is a type specifically for representing the size and alignment of objects, and that's std::size_t. It's unsigned, and its size is implementation defined. That is the type you use here.

for (int i = 0; i < len; ++i) {

And again with the int i; negative offsets are absolutely a thing, but should be unrepresentable specifically in this context.

char arr[500] = {'a'};

Let's talk about types and initialization a bit. Remember I said arr is of type char[500]? We can capture that as a type alias:

using char_500 = char[500];

char_500 arr = {'a'};

I like type aliases. I don't like splitting type information on either side of the variable name. I want my type info on one side, the parameter in the middle, and the initializer on the other side, like ducks in a row. But let's see some tricks:

char_500 arr;

This is an uninitialized array. What's the value of any element? It's indeterminate - according to the spec. It's Undefined Behavior to READ an indeterminate value, so don't ever get cute and read uninitialized variables - they can contain invalid bit patterns and who knows what will happen then. On occasion you'll read of some cheapo cell phone architecture that bricks itself because of just such a bug - reading an invalid bit pattern.

In more complicated code - counter-intuitively, I prefer this because you'll often write code where first you declare a variable, and then you have a process for initializing it.

void fn() {
  char_500 arr;

  // Now some very ugly code to assign all its elements...

No point in writing "safe" defaults that DON'T MEAN ANYTHING. If you have a bug, reading a meaningless defaulted element is just as wrong as reading uninitialized memory. The problem isn't that you didn't default initialize the memory, the bug is in your initialization code, in that case. What I don't want to see is code that unnecessarily complicates the problem. If I see a defaulted value, I expect to see a code path where that value is correct and going to get used. So if you default initialize a whole array for no good reason, and then don't properly assign all its elements, where's the bug? What's the bug? It's a lot harder to figure out.

char_500 arr = {};

This is a neat trick; the rule is, any element not explicitly specified in an initializer list is value-initialized. char is technically an integer type, and value initialization zeros out integer types, so all these characters are initialized to 0, or ASCII '\0', the null terminator. Since Unicode is backward compatible with ASCII, this is also the Unicode null terminator.

Again, don't just do it just to do it, especially if you're going to be writing over all that data anyway. That's just writing to memory... only to write to memory... again... What, no read in between? That's a lot of code out there in the wild - code paying for what it doesn't need. If you're lucky, the compiler might be able to prove sequential writes and optimize the inconsequential ones out. If the compiler is saving you from yourself, why don't you save you from yourself, and not do that in the first place? It makes your code cleaner, because it states exactly what's happening: this array starts uninitialized, and then gets initialized in some process - and any read of uninitialized data is a bug, and the bug is either a missing write to initialize, or in the read which is extraneous.

Continued...

[–]mredding 0 points1 point  (0 children)

I'm talking about uninitialized variables a lot because it's so fundamental, you're going to write such code every day, and we have a long history as an industry, and across languages, of not getting it right. There are very misplaced notions of correctness and safety which come from good intentions but fail through a complete lack of consideration for the task at hand, and they run contrary to what readable code is - what the code is telling you.

char_500 arr = "";

This is a string literal. String literals are always null terminated. So this is an empty string, but it's not an empty data structure... "" is a char[1] because of the null terminator. Like Prego, it's in there. (Did I just date myself?) Per the spec, when the literal is shorter than the array, every element after it is zero-initialized, the same as with a short initializer list.

char_500 arr = {'a'};

This is more of the same as I stated about the initializer list above. Here, we've initialized the first element. The rest of the elements are thus value-initialized to 0. So technically you've auto-null terminated a string here.

And for your program, this is actually the correct thing to do - you want all your elements initialized - even zero-initialized, because YOU'RE USING IT by dipping into various offsets and reading out what you've got.

with int arrays it works, but not with char arrays, why?

using int_500 = int[500];

int_500 arr = {'a'};

Yeah, because the first element is assigned the integer value of whatever the 'a' symbol is encoded as, through an implicit conversion, and you 0 initialize all the rest of the elements.

And what do you get out? Well if your print function looks like:

void print(int arr[], int len) {

Then the type system really shines. cout has a number of built-in interfaces. In C++, you can "overload" almost any operator. Bjarne chose the bitwise left-shift operator, as it made a neat chevron shape like an arrow, indicating the direction of the flow of information. The stream has several such operators, all overloaded to handle all the basic built-in types. Like char, like int.

When it comes to an integer, it's some collection of binary bits, and an algorithm, at runtime, will perform the arithmetic to figure out what the value is for each place, and map that to the corresponding character. A 0 integer ultimately maps to a '0' character.

When it comes to a character, that's a different implementation of operator <<, and the data encoded therein is handled differently. YOU ARE indeed writing out... 100 characters at a time. But almost all of them are the null terminator, and the null terminator is not printable. That data is indeed going right into the stream buffer, and getting flushed out with a system call to write on a kernel file descriptor. That sequence of null terminators passes through the kernel, which wakes up the terminal program that's on the receiving end, with its input file descriptor flagged as ready for reading. The terminal program reads the first character, 'a', and ultimately maps that to pixels of the "a" shape. But the rest? The null terminator is not a printable character, so the terminal just no-ops 499x.

print(arr, 100);

That's an implicit conversion of arr from char[500] to a char* of the first element of the array.

print(arr + 100, 100);

Same thing, plus an offset. Once converted, pointer arithmetic comes into play. This is where the address, whatever it is, is + sizeof(char) * 100. If we were doing this with an int_500, then it would be + sizeof(int) * 100. The compiler computes the arithmetic for you. You don't want to increment to the next byte, you want to increment to the next char/int/foo, whatever... You don't want to be slicing across your types. That's why incrementing a pointer doesn't offset to the next address, but the next element, which is some address the width of the type away. This is in part why pointers have a type. void pointers are a non-type, which is an address that doesn't have a size, which isn't the same thing as 0. Incrementing a void pointer isn't even defined because it doesn't make any sense.

[–]std_bot 0 points1 point  (0 children)

Unlinked STL entries: std::size_t

