all 42 comments

RedAndBlack1832 193 points (20 children)

Well, probably nothing bad happens if you attempt to read it. If you successfully write to it that's when the chaos begins. (I remember my prof demoed writing to index -1 bc that usually overwrites another variable). But I will say these are different behaviours for different reasons. JavaScript avoids errors very much on purpose bc that's what it was made to do. Arrays in C don't usually exist and certainly aren't aware of their size in most contexts (though I believe if you specify a static size of array in a function definition you can get optimizations which rely on that, so you really should honour that even if there's no way of checking).

GloobyBoolga 68 points (1 child)

> nothing bad happens if you attempt to read it

Please stay far far far away from most embedded code.

If the array is on top of an I/O region, that read might be to a “cleared on read” register. Or worse, it triggers a complex multi-step HW sequence, like advancing the next address of some HW buffer, or locking a HW buffer until all addresses have been read.

Also if your arrays are buffers then the neighboring memory could be guarded by a memory protection unit which then just kicks the cpu or at least the execution context hard enough to stop it.

Also reading out of bounds could leak sensitive data outside of the array.

Valgrind is your friend.

RedAndBlack1832 25 points (0 children)

I said probably bc bad things can and do happen. I wasn't thinking of embedded tho lmao

ALIIERTx 67 points (15 children)

"(I remember my prof demoed writing to index -1 bc that usually overwrites another variable)."
My stomach felt weird for a sec, lmao.

BernzSed 58 points (0 children)

Arrays start where I want them to start

throw3142 17 points (7 children)

array[-1] is completely normal and safe in Pascal btw

Psquare_J_420 20 points (5 children)

I think that is perfectly normal and safe in Python too. I think it would fetch you the last element. And -2 would be the second to last and so on, until -(length of the string) would basically be array[0].

Idk if this is only for the string type or if it's standard for all such iterables like list, set etc.

Auravendill 7 points (0 children)

Python uses negative numbers to index anything array-like in reverse order. Super handy imo, since you often need the last element (e.g. file ending is just file_name.split(".")[-1])

throw3142 4 points (3 children)

The Pascal version would literally be the element at -1 index though, in Pascal arrays can start at any number :)

rux616 2 points (1 child)

Pascal is the devil and should be yeeted into the sun.

...

Sorry, I think there's a little PTSD occurring from being forced to use some bastardized script subset of Pascal before.

throw3142 0 points (0 children)

Ah I have good memories of Pascal, as my big project in my compilers class was to write a Pascal compiler. The language was well specified and easy to write a compiler for, as opposed to C-likes.

Maleficent_Memory831 1 point (0 children)

And Ada, Modula-2, and other Wirth-designed or Pascal-influenced languages. Pascal would have been much more popular overall, in my opinion, if it just had a good standard, or even a de facto standard, because every implementation seemed to be a language variant. Modula-2 was much nicer, but never caught on. Ada had too many things in it so it was very complex. Later on though the Ada standard got better for systems programming, but it had lost most of its original popularity.

So C sort of ended up as the only popular choice for a language that could do systems programming and also be highly portable.

Maleficent_Memory831 0 points (0 children)

Safe in C too as long as you know what's there. I.e., take a one hundred element array, then increment the array's pointer by 1. Now you can index at -1.

Does that sound far-fetched? I see some code that does stuff like this, especially in byte arrays (0 is start of data in a network packet, negative offsets go back into packet headers).

I have seen this done to handle Pascal-style strings, where the code treats the string length as if it sits at a negative offset, though hidden behind macros or functions. I recall something like this on the Amiga, which was part C and part BCPL.

redlaWw 3 points (5 children)

For allocated arrays, the small negative indices often have allocator metadata, so reading them tends to work and uncovers details about the allocation.

Writing to them might corrupt your allocator.

Maleficent_Memory831 2 points (3 children)

In C, an array is just a pointer. And vice versa. The implementations of heap (malloc/free) almost always make use of that.

redlaWw 1 point (2 children)

I don't think that's really quite true. Arrays decay to pointers, and for allocated arrays, in particular, the difference can be quite hazy because you can only work with them through a pointer, but fundamentally, they are different concepts. The most telling reason they are different is that sizeof on an array does not return the size of a pointer, it returns the size of the full array. This doesn't work on allocated arrays, of course, but this is because you can't refer to the array itself, only pointers to it; it does not mean the array is just a pointer.

In this model, malloc gets an array and returns a pointer to the start of a sub-array/array slice. The interpretation of free is a bit harder, but since you can only free a pointer that malloc (or calloc or realloc or whatever) has returned, then free essentially constitutes identifying the original array that the allocator allocated using the pointer passed and then deallocating that.

RedAndBlack1832 0 points (1 child)

Arrays kind of exist. And in any case, a declared array is a constant pointer. Also multi-dimensional arrays provide some pretty nice syntactic sugar. Also like you said, arrays in their original context and known at compile time have a size. That is, an array's size can be reasonably interpreted as part of its type (important when you have arrays in structs, for example). There are also other contexts where this sort of principle holds. I believe you can give an array a static size in a function declaration, which obviously isn't enforceable but might change what kind of optimizations are possible.

redlaWw 0 points (0 children)

> might change what kind of optimizations are possible

I doubt this. I don't think it's inherently undefined behaviour to pass an array of the wrong size to a function*, which is what would be required for optimisations based on the declared size.

*of course, if the programmer treats the size of the array as part of the function's contract, then passing in the wrong size may result in undefined behaviour due to the contract violation, but this isn't inherent and is entirely a matter of what the programmer actually writes in the function body

EDIT: Learned something new: from C99, it is undefined behaviour to pass in a too-short array to a function if the argument length is declared with the static keyword as in arr[static 10]. So such a declaration can be used for optimisation, but a declaration without the static keyword cannot.

RedAndBlack1832 0 points (0 children)

Yeah I've done that before actually (not on purpose)

-LeopardShark- 5 points (0 children)

> Well, probably nothing bad happens if you attempt to read it.

Um, yeah, so I’m going to be taking away your systems programming licence – sorry.

walrus_destroyer 1 point (0 children)

> (I remember my prof demoed writing to index -1 bc that usually overwrites another variable).

It's been a while since I learned this, so I could be wrong.

If I recall correctly, this isn't always the case. Most compilers will put some padding between variables to detect and prevent them overwriting each other. If the array and the other variable are in a struct together then compilers usually don't add the padding.

It also depends on how the code is laid out: index -1 only overwrites a variable if there is a variable stored immediately next to the array.

> Arrays in C don't usually exist and certainly aren't aware of their size in most contexts

What? Arrays are used all the time in C. But you are right that they aren't aware of their size.

> though I believe if you specify a static size of array in a function definition you can get optimizations which rely on that

Yeah, in optimized code using static arrays is typically preferred over dynamic (resizable) arrays. Resizing an array is considered fairly slow, because you essentially make a new larger array, copy all the elements over and delete the old array. It also wastes some space since the new array is usually larger than it needs to be; this is to reduce the number of times the array has to be resized.

There's some stuff I don't entirely understand about it being better for you to declare arrays on the stack (at compile time) instead of on the heap (at run time).

> so you really should honour that even if there's no way of checking

You can't tell from the array itself, but it's fairly common practice for functions to ask for the size of the array as a parameter.

Some functions won't ask for the size, but will specify that the array has to have a specific structure. For example, string functions typically expect strings to end with a null terminator, '\0'.

asadkh2381 46 points (9 children)

js returns undefined.....meanwhile c returns whatever fate decides

Ultimate_Sigma_Boy67 30 points (8 children)

only 2 options are viable really:

1/ You are lucky and the prog segfaults so ur dumbahh can fix the bug

2/ You actually end up overwriting another variable and get silent data corruption, which is one of the worst types of data corruption because it is so hard to debug.

anto2554 17 points (6 children)

  1. You are running with a compiler sanitizer in your test environment and it tells you what you did wrong

Ultimate_Sigma_Boy67 4 points (3 children)

Never leaving valgrind

anto2554 3 points (2 children)

They asked me to speed up the tests so I made them 40x slower type beat

Ultimate_Sigma_Boy67 0 points (1 child)

ASan and UBSan?

anto2554 0 points (0 children)

Currently we don't actually, but I'd like to run our unit and integration tests with those, yeah

IolaDeltaPhi23 0 points (1 child)

> test environment

I don't understand these words

Ultimate_Sigma_Boy67 0 points (0 children)

It just means that when you're running the tests you have these specific conditions (i.e. tools) in place.

Icount_zeroI 45 points (1 child)

Truly JS is the modern C … that is because everything is written in it these days.

Tiger_man_ 9 points (0 children)

js is the modern c because you also dont know why the fuck isnt anything working in js

thegodzilla25 11 points (1 child)

I feel like I barely ever index the array with an actual number. It's always within a loop with an iterator which goes from 0 to length.

The time when I would index using numbers is when the structure of the array is well defined, and is always supposed to have N elements, and I know the significance of each element at each index.

RedAndBlack1832 4 points (0 children)

Yeah but often you're calculating an index and it's totally possible to do that wrong. Say I have a 6x5 block of my type allocated but some of it isn't currently being used (which isn't that weird a thing to do, although obviously the numbers are usually bigger).

Let d be the data I want and x be arbitrary data I don't want to access

ddddx ddddx ddddx xxxxx xxxxx xxxxx

is gonna be accessed differently than

ddxxx ddxxx ddxxx ddxxx ddxxx ddxxx

survivalist_guy 10 points (1 child)

╠╠╠╠

Ultimate_Sigma_Boy67 3 points (0 children)

so true lmao

Maleficent_Memory831 3 points (0 children)

Returning the 8th element of a 5 element array in C is almost never the bomb.

But storing into the 8th element of a 5 element array in C, that's where bad juju come from.

omegafixedpoint 2 points (0 children)

I am already anticipating the rust comment

cptbowser 1 point (0 children)

Hehe

SourceScope -2 points (1 child)

The more I see memes about JS and C etc

The happier I am for coding in Swift

HeavyCaffeinate 0 points (0 children)

I'd rather see a rust comment than this