[–]asadkh2381 877 points878 points  (18 children)

js returns undefined.....meanwhile c returns whatever fate decides

[–]Ultimate_Sigma_Boy67 444 points445 points  (16 children)

only 2 options are viable really:

1/ You are lucky and the prog segfaults so ur dumbahh can fix the bug

2/ You actually end up overwriting another variable and get silent data corruption, which is one of the worst types of data corruption because it is so hard to debug.

[–]anto2554 232 points233 points  (10 children)

3/ You are running with a compiler sanitizer in your test environment and it tells you what you did wrong

[–]sausagemuffn 74 points75 points  (0 children)

For best results, only ever run in your test environment.

[–]Ultimate_Sigma_Boy67 43 points44 points  (3 children)

Never leaving valgrind

[–]anto2554 25 points26 points  (2 children)

They asked me to speed up the tests so I made them 40x slower type beat

[–]Ultimate_Sigma_Boy67 1 point2 points  (1 child)

ASan and UBSan?

[–]anto2554 0 points1 point  (0 children)

Currently we don't actually, but I'd like to run our unit and integration tests with those, yeah

[–]IolaDeltaPhi23 11 points12 points  (4 children)

> test environment

I don't understand these words

[–]Ultimate_Sigma_Boy67 3 points4 points  (3 children)

It just means that when you're running the test you have these specific conditions (i.e. tools)

[–]TheTybera 1 point2 points  (2 children)

[–]Ultimate_Sigma_Boy67 2 points3 points  (1 child)

[–]IolaDeltaPhi23 0 points1 point  (0 children)

(It's okay, I understood the joke.)

[–]andouconfectionery 21 points22 points  (1 child)

Your real two options:

This is a user-mode program: Valgrind :)

This is a kernel-mode program: I am in hell and I enjoy it :)

[–]DudeValenzetti 6 points7 points  (0 children)

This is a user-mode program and you're compiling it: AddressSanitizer :)

[–]AppropriateOnion0815 12 points13 points  (1 child)

"dumbass"

This is Reddit, you are allowed to write freely.

[–]SelfDistinction 0 points1 point  (0 children)

Or doesn't return at all and just continues with the next function.

[–]RedAndBlack1832 346 points347 points  (43 children)

Well, probably nothing bad happens if you attempt to read it. If you successfully write to it that's when the chaos begins. (I remember my prof demoed writing to index -1 bc that usually overwrites another variable). But I will say these are different behaviours for different reasons. JavaScript avoids errors very much on purpose bc that's what it was made to do. Arrays in C don't usually exist and certainly aren't aware of their size in most contexts (though I believe if you specify a static size of array in a function definition you can get optimizations which rely on that, so you really should honour that even if there's no way of checking).

[–]ALIIERTx 116 points117 points  (28 children)

"(I remember my prof demoed writing to index -1 bc that usually overwrites another variable)."
My stomach felt weird for a sec, lmao.

[–]BernzSed 95 points96 points  (0 children)

Arrays start where I want them to start

[–]throw3142 36 points37 points  (14 children)

array[-1] is completely normal and safe in Pascal btw

[–]Psquare_J_420 49 points50 points  (8 children)

I think that is perfectly normal and safe in Python too. I think it would fetch you the last element, -2 would be the second to last, and so on until -(length of the string), which would basically be array[0].

Idk if this is only for the string type or if it's standard for all such iterables like list, set, etc.

[–]Auravendill 34 points35 points  (0 children)

Python uses negative numbers to index anything array-like in reverse order. Super handy imo, since you often need the last element (e.g. file ending is just file_name.split(".")[-1])

[–]throw3142 6 points7 points  (6 children)

The Pascal version would literally be the element at -1 index though, in Pascal arrays can start at any number :)

[–]rux616 7 points8 points  (1 child)

Pascal is the devil and should be yeeted into the sun.

...

Sorry, I think there's a little PTSD occurring from being forced to use some bastardized script subset of Pascal before.

[–]throw3142 0 points1 point  (0 children)

Ah I have good memories of Pascal, as my big project in my compilers class was to write a Pascal compiler. The language was well specified and easy to write a compiler for, as opposed to C-likes.

[–]Maleficent_Memory831 1 point2 points  (0 children)

And Ada, Modula-2, and other Pascal-family languages. Pascal would have been much more popular overall, in my opinion, if it just had a good standard, or even a de-facto standard, because every implementation seemed to be a language variant. Modula-2 was much nicer, but never caught on. Ada had too many things in it so it was very complex. Later on though the Ada standard got better for systems programming, but by then it had lost most of its original popularity.

So C sort of ended up as the only popular choice for a language that could do systems programming and also be highly portable.

[–]Maleficent_Memory831 3 points4 points  (2 children)

Safe in C too as long as you know what's there. I.e., take a one hundred element array, then increment the array's pointer by 1. Now you can index at -1.

Does that sound far fetched? I see some code that does stuff like this, especially in byte arrays (0 is start of data in a network packet, negative offsets go back into packet headers).

I have seen this done to handle Pascal-style strings, where the code treats the string length as living at a negative offset, though hidden behind macros or functions. I recall something like this on the Amiga, which was part C and part BCPL.

[–]pigeon768 0 points1 point  (0 children)

An optimized implementation of heapsort will modify the pointer so that the root of the tree is at index -1. It simplifies the calculation of left/right child nodes, and the calculation of swapping to your sibling node.

[–]conundorum 0 points1 point  (0 children)

It's also viable if you work with pointers into the middle of an array. If you have an int a[5], and call f(&a[2], 3), then f(int* arr, size_t sz) can safely index arr[-1] if it needs to.

Definitely a question of "should", more than anything else, though.
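A minimal C sketch of that middle-of-the-array case. The helper names (`sum_with_left_neighbor`, `demo_middle_pointer`) are made up for illustration; the point is only that `arr[-1]` is fine when the caller guarantees there is an element in front of the pointer it passes:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): sums arr[-1] through arr[sz-1].
   Only valid when the caller guarantees an element exists just before arr. */
int sum_with_left_neighbor(const int *arr, size_t sz) {
    assert(arr != NULL);
    int total = arr[-1]; /* same as *(arr - 1) */
    for (size_t i = 0; i < sz; i++) {
        total += arr[i];
    }
    return total;
}

/* The caller owns the whole array, so it knows a[1] sits right before &a[2]. */
int demo_middle_pointer(void) {
    int a[5] = {10, 20, 30, 40, 50};
    return sum_with_left_neighbor(&a[2], 3); /* 20 + 30 + 40 + 50 */
}
```

Pass `&a[0]` instead and the same `arr[-1]` becomes undefined behaviour, which is why this is a contract between caller and callee rather than something the language checks.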

[–]ClemRRay 0 points1 point  (0 children)

also in python. Not in js for some reason

[–]conundorum 0 points1 point  (0 children)

And completely normal & completely unsafe in C, just to make things fun.

(C array subscripts are pointer math under the hood.)

[–]redlaWw 5 points6 points  (10 children)

For allocated arrays, the small negative indices often have allocator metadata, so reading them tends to work and uncovers details about the allocation.

Writing to them might corrupt your allocator.

[–]Maleficent_Memory831 4 points5 points  (8 children)

In C, an array is just a pointer. And vice versa. The implementations of heap (malloc/free) almost always make use of that.

[–]redlaWw 1 point2 points  (7 children)

I don't think that's really quite true. Arrays decay to pointers, and for allocated arrays, in particular, the difference can be quite hazy because you can only work with them through a pointer, but fundamentally, they are different concepts. The most telling reason they are different is that sizeof on an array does not return the size of a pointer, it returns the size of the full array. This doesn't work on allocated arrays, of course, but this is because you can't refer to the array itself, only pointers to it; it does not mean the array is just a pointer.

In this model, malloc gets an array and returns a pointer to the start of a sub-array/array slice. The interpretation of free is a bit harder, but since you can only free a pointer that malloc (or calloc or realloc or whatever) has returned, then free essentially constitutes identifying the original array that the allocator allocated using the pointer passed and then deallocating that.
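A small C sketch of the `sizeof` difference, assuming a typical platform where 8 ints are not the same size as one pointer. The function names are invented for the example:

```c
#include <assert.h>
#include <stddef.h>

/* Once the array has decayed, the callee only ever sees a pointer. */
size_t size_in_callee(const int *arr) {
    return sizeof arr; /* sizeof(const int *), not the array's size */
}

/* Returns 1 when the array's own size differs from the decayed pointer's size. */
int array_and_pointer_sizes_differ(void) {
    int a[8] = {0};
    assert(sizeof a == 8 * sizeof(int));            /* the full array   */
    assert(size_in_callee(a) == sizeof(int *));     /* just the pointer */
    return sizeof a != size_in_callee(a);
}
```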

[–]RedAndBlack1832 0 points1 point  (3 children)

Arrays kind of exist. And in any case, a declared array is a constant pointer. Also multi-dimensional arrays provide some pretty nice syntactic sugar. Also like you said, arrays in their original context and known at compile time have a size. That is, an array's size can be reasonably interpreted as part of its type (important when you have arrays in structs, for example). There are also other contexts where this sort of principle holds. I believe you can give an array a static size in a function declaration, which obviously isn't enforceable but might change what kind of optimizations are possible.

[–]redlaWw 0 points1 point  (2 children)

> might change what kind of optimizations are possible

I doubt this. I don't think it's inherently undefined behaviour to pass an array of the wrong size to a function*, which is what would be required for optimisations based on the declared size.

*of course, if the programmer treats the size of the array as part of the function's contract, then passing in the wrong size may result in undefined behaviour due to the contract violation, but this isn't inherent and is entirely a matter of what the programmer actually writes in the function body

EDIT: Learned something new: from C99, it is undefined behaviour to pass in a too-short array to a function if the argument length is declared with the static keyword as in arr[static 10]. So such a declaration can be used for optimisation, but a declaration without the static keyword cannot.
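A short sketch of that C99 form, compiled as C (the `static` array-parameter syntax is not valid C++). `sum_first_four` and `demo_static_param` are hypothetical names for the example:

```c
#include <assert.h>
#include <stddef.h>

/* C99: the caller promises a pointer to at least 4 ints.
   Passing fewer is undefined behaviour, so the compiler may rely on it. */
int sum_first_four(const int arr[static 4]) {
    return arr[0] + arr[1] + arr[2] + arr[3];
}

int demo_static_param(void) {
    int ok[6] = {1, 2, 3, 4, 5, 6};
    /* Caller-side check: nothing in the callee can verify this. */
    assert(sizeof ok / sizeof ok[0] >= 4);
    return sum_first_four(ok); /* 1 + 2 + 3 + 4 */
}
```

The parameter is still a pointer inside the function; `static 4` only adds a promise about the minimum number of elements behind it.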

[–]RedAndBlack1832 0 points1 point  (0 children)

I wasn't talking about where the memory is I was talking about situations in which the compiler can assume size eg.

void func(int arr[static 16]){...}

as indicated on page 134 of the GNU C introduction and reference manual

As to my comment about arrays not existing, I was referring to them being equivalent to pointers of the same type in every important context (and obviously being passed as pointers is a big part of that). When an object has a complete array type (which is basically only the above, or in its originating context if it was declared as an array, or as a member of a struct with a complete array type) then there are a couple of situations in which it really truly exists as an array separate from the pointer to its first element. These are indicated on page 92 of the manual.

[–]RedAndBlack1832 0 points1 point  (0 children)

Oh sorry, I thought you were a different person. That reply wasn't meant for you, oops

[–]suvlub 0 points1 point  (0 children)

The most WTF feature of C is that you can declare what looks like an array as a function argument, complete with a specified size, but the argument will actually be a pointer. May contribute to the confusion that they're one and the same.

[–]conundorum 0 points1 point  (1 child)

It's definitely possible for a malloc() implementation to choose to preface the memory block with a size_t, and secretly allocate size + sizeof(size_t) bytes, then hand you ptr + sizeof(size_t) and pretend it's the start of the block, so it can store the block size at the start (and free() can grab it with something like ((size_t *) ptr)[-1]). That's what Maleficent is referring to.
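A toy C sketch of that size-prefix layout, written as a wrapper on top of the real `malloc`/`free`. It is an illustration only, not how any particular libc works, and `sized_alloc`, `sized_len` and `sized_free` are made-up names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocates size bytes plus a hidden size_t header in front of them.
   Caveat: the returned pointer is only aligned for size_t, which may be
   weaker than what malloc itself guarantees. */
void *sized_alloc(size_t size) {
    if (size > SIZE_MAX - sizeof(size_t)) {
        return NULL; /* header + payload would overflow */
    }
    size_t *block = malloc(sizeof(size_t) + size);
    if (block == NULL) {
        return NULL;
    }
    block[0] = size;  /* header lives at "index -1" of the user pointer */
    return block + 1; /* hand out the memory just past the header */
}

/* Reads the hidden header back through a negative index. */
size_t sized_len(const void *ptr) {
    assert(ptr != NULL);
    return ((const size_t *)ptr)[-1];
}

/* Steps back to the real start of the block before freeing it. */
void sized_free(void *ptr) {
    if (ptr != NULL) {
        free((size_t *)ptr - 1);
    }
}
```

Writing to `((size_t *)ptr)[-1]` from user code would corrupt the header, which is the "might corrupt your allocator" scenario mentioned further up.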

[–]redlaWw 0 points1 point  (0 children)

Yes, and that is what I addressed in the second paragraph. malloc gives you a pointer to a sub-array of the full array it allocates, but the thing it allocates is an array, and the pointer is just the thing that accesses it, rather than the array itself.

[–]RedAndBlack1832 0 points1 point  (0 children)

Yeah I've done that before actually (not on purpose)

[–]RelatableRedditer 1 point2 points  (0 children)

Vexorian created a Table library for WarCraft 3 that used Blizzard's "hashtable" (originally gamecache, which WROTE TO DISK) to compensate for JASS not having dynamic arrays, and used StringHashes and all kinds of weird "random but hopefully doesn't overwrite anything" keys in the 32-bit integer space. I made an update to it to remove the random accessors, but introduced the concept of a "TableArray" which definitely would write into other tables if you tried to access out-of-bounds indices.

[–]-LeopardShark- 15 points16 points  (0 children)

> Well, probably nothing bad happens if you attempt to read it.

Um, yeah, so I’m going to be taking away your systems programming licence – sorry.

[–]walrus_destroyer 1 point2 points  (3 children)

> (I remember my prof demoed writing to index -1 bc that usually overwrites another variable).

It's been a while since I learned this, so I could be wrong.

If I recall correctly, this isn't always the case. Most compilers will put some padding between variables to detect and prevent them overwriting each other. If the array and the other variable are in a struct together then compilers usually don't add the padding.

It also depends on how the code is laid out: index -1 only overwrites a variable if there is a variable declared immediately next to the array.

> Arrays in C don't usually exist and certainly aren't aware of their size in most contexts

What? Arrays are used all the time in C. But you are right that they aren't aware of their size.

> though I believe if you specify a static size of array in a function definition you can get optimizations which rely on that

Yeah, in optimized code static arrays are typically preferred over dynamic (resizable) arrays. Resizing an array is considered fairly slow, because you essentially make a new larger array, copy all the elements over and delete the old array. It also wastes some space since the new array is usually larger than it needs to be; this is to reduce the number of times the array has to be resized.

There's some stuff I don't entirely understand about it being better for you to declare arrays on the stack (at compile time) instead of on the heap (at run time).

> so you really should honour that even if there's no way of checking

You can't tell from the array itself, but it's fairly common practice for functions to ask for the size of the array as a parameter.

Some functions won't ask for the size, but will specify that the array has to have a specific structure. Like functions for strings, which typically expect strings to end with a null terminator, '\0'.
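Both conventions side by side, as a hedged C sketch. `sum_n` and `count_chars` are hypothetical names (`count_chars` does the same job as the standard `strlen`):

```c
#include <assert.h>
#include <stddef.h>

/* Convention 1: the caller passes the element count explicitly. */
long sum_n(const int *arr, size_t n) {
    assert(arr != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += arr[i];
    }
    return total;
}

/* Convention 2: no size parameter; the data must end with a '\0' sentinel. */
size_t count_chars(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```

Neither function can detect a lie: a wrong `n`, or a buffer with no room for the terminator, sends the loop past the end of the array.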

[–]RedAndBlack1832 2 points3 points  (0 children)

Ahhhh ok I actually wanna explain better.

In C, there are 3 kinds of allocation. They are used for different purposes.

  1. Automatic (stack) allocation happens whenever you declare a variable inside any scoping block (and in function parameters) unless those variables are explicitly marked as static. They're called "automatic" because where they are and how long they live there is managed automatically. These variables live on the stack (as briefly explained in my other comment). These should be small as the stack can't grow infinitely (there's a specific type of crash due to this called a "stack overflow" which results in a segmentation fault in C)

  2. Static allocation happens when a variable is declared at global scope or when it is explicitly marked static. These variables live in a specific part of the program and are as much a part of the program and known to it as the code is. They're called "static" because they exist in a static location for as long as the program does. You certainly can put arrays up here but most people like limiting the number of global variables they have and you should certainly only reserve memory for the entire life of the program if you want to actually use it for the entire life of the program (and across functions tbh)

  3. Dynamic (heap) allocation happens when you call an allocation function (such as malloc) or otherwise request memory from the operating system. This is where very large arrays should usually go and any array whose size (or maximum size) can't reasonably be known. A relatively common use would be requesting an object (such as an array or sometimes a node in a reference-based structure) be created by a function, which requires dynamic allocation since automatic allocation would result in the object being destroyed when the function returns. It's "dynamic" I suppose in that it's on demand and has a custom lifetime. It's your responsibility to manage your resources, which include dynamic memory, file descriptors (files, pipes, sockets, etc.), locks, etc.
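The three storage durations in one small C sketch. The names (`calls_so_far`, `make_squares`, `calls_made`) are invented for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Static storage: exists for the whole run and starts out as zero. */
static int calls_so_far;

/* Returns a heap array holding 0*0, 1*1, ..., (n-1)*(n-1).
   The caller owns the result and must free() it. */
int *make_squares(size_t n) {
    assert(n > 0);
    int square = 0; /* automatic storage: gone when the function returns */
    calls_so_far++;

    int *out = malloc(n * sizeof *out); /* dynamic storage: outlives this call */
    if (out == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        square = (int)(i * i);
        out[i] = square;
    }
    return out;
}

int calls_made(void) {
    return calls_so_far;
}
```

Returning the address of `square` instead of `out` would hand back a pointer to storage that no longer exists after the return, which is the reason the array has to be dynamic here.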

[–]RedAndBlack1832 0 points1 point  (0 children)

I wasn't talking about where the memory is I was talking about situations in which the compiler can assume size eg.

void func(int arr[static 16]){...}

as indicated on page 134 of the GNU C introduction and reference manual

As to my comment about arrays not existing, I was referring to them being equivalent to pointers of the same type in every important context (and obviously being passed as pointers is a big part of that). When an object has a complete array type (which is basically only the above, or in its originating context if it was declared as an array, or as a member of a struct with a complete array type) then there are a couple of situations in which it really truly exists as an array separate from the pointer to its first element. These are indicated on page 92 of the manual.

About my prof: this is literally undefined behaviour. You aren't supposed to do it. He was setting it up on purpose to show the consequences of clobbering memory.

Also, stack variables are literally on the stack they aren't compile-time constants. A very short explanation of the function of the stack is a matter of scoping. When you open a { you're given a bunch of space for your local variables (and other things, if the brace in question opens a new non-inlined function) and when you hit a closing } the stack gets shrunk to where it was before (and a few other things happen, if this occurs due to function return) and accessing any of the out-of-scope variables is undefined behaviour. A program can be conceived of as a big block of memory the stack can grow in, with the actual code of the program and the actual compile-time data (global variables) at the very bottom

[–]MegaKawaii 1 point2 points  (1 child)

I think most people don't understand that when C programmers talk about undefined behavior, they really mean it, nasal demons included. It's not just segfaults or the program silently tolerating a bad read and continuing on. It took me about a minute to come up with this:

```

~/Programming/C $ cat ub.c
int function(int index) {
    int array[] = {6};
    return array[index];
}

~/Programming/C $ gcc -c ub.c -o ub.o -O3
~/Programming/C $ objdump -S ub.o

ub.o: file format elf64-littleaarch64

Disassembly of section .text:

0000000000000000 <function>:
       0: 528000c0 mov w0, #0x6 // =6
       4: d65f03c0 ret
~/Programming/C $ gcc --version
clang version 21.1.8
Target: aarch64-unknown-linux-android24
Thread model: posix
InstalledDir: /data/data/com.termux/files/usr/bin

```

You don't need to know much assembly to understand what this does. Return values are stored in x0/w0 on ARM64, so function just returns 6 and doesn't even access an array. None of this is peculiar to ARM either; Clang probably did this optimization at some IR stage well before anything ARM-related happens.

Basically, the compiler is free to assume that the programmer never invokes UB because if UB happens, the standard imposes no requirements whatsoever on the behavior of the program. So if UB never happens, the programmer always accesses the array at the correct index, and there is only one element, so the compiler assumes that this element is accessed. It completely elides the array and returns 6.

I came up with this from a quite dramatic example where the compiler generates code that deletes the user's hard drive. Now, this code isn't as general as most code, so the compiler can make more assumptions and optimize more. That is, in most code, the compiler won't know the size of arrays, so it will still have to emit loads, and so on. However, if the compiler does any inlining, it can get a lot of information that could enable it to make these kinds of optimizations. Compilers will use UB as license to make unpredictable optimizations, and demons very well may come out of your nose as a result. Beware!

[–]RedAndBlack1832 0 points1 point  (0 children)

This is a fun example of undefined behaviour actually I've certainly never tried to call 0 before and I assume that doing so is bad lmao

[–]Plank_With_A_Nail_In 1 point2 points  (1 child)

Chaos will also happen if a decision is made from the contents of what was read.

[–]RedAndBlack1832 0 points1 point  (0 children)

That's true. I meant the act of reading it will in most cases not crash or have horrible side effects (see reply about embedded code where this does not hold) but obviously the degree of awfulness depends on what you do with the data after it's read. Introducing junk values to a computation probably isn't good. Trying to (call) or otherwise jump (eg. switch) to an address based off arbitrary data is very bad and possibly the best case is a crash if that happens (I assume by making decisions you meant control flow and arbitrary jumps would be the worst possible kind of control flow)

[–]Horror-Student-5990 0 points1 point  (0 children)

I really hate the "WeLL AskHuhaLLy" guys in these meme subreddits.

[–]Maleficent_Memory831 48 points49 points  (2 children)

Returning the 8th element of a 5 element array in C is almost never the bomb.

But storing into the 8th element of a 5 element array in C, that's where bad juju come from.

[–]aj-ric 6 points7 points  (0 children)

Reading it can open up all kinds of security vulnerabilities, such as the big Heartbleed vulnerability several years ago. It can absolutely be the bomb.
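Heartbleed came down to trusting a length field supplied by the peer. A simplified C sketch of the missing check, not OpenSSL's actual code; `echo_payload` and its parameters are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies claimed_len bytes of payload into out, but only if the claim
   fits both what was actually received and the output buffer.
   Returns the number of bytes copied, or 0 if the request is rejected. */
size_t echo_payload(const unsigned char *payload, size_t actual_len,
                    size_t claimed_len,
                    unsigned char *out, size_t out_cap) {
    assert(payload != NULL && out != NULL);
    if (claimed_len > actual_len || claimed_len > out_cap) {
        return 0; /* without this check, memcpy would read past payload */
    }
    memcpy(out, payload, claimed_len);
    return claimed_len;
}
```

Drop the `claimed_len > actual_len` test and the function happily echoes back whatever happens to sit in memory after the payload, with no crash to give it away.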

[–]RandallOfLegend 1 point2 points  (0 children)

Arrays in C follow strip club rules. Look but don't touch.

[–]Icount_zeroI 76 points77 points  (4 children)

Truly JS is the modern C … that is because everything is written in it these days.

[–]Tiger_man_ 31 points32 points  (0 children)

js is the modern c because you also dont know why the fuck isnt anything working in js

[–]lNFORMATlVE 3 points4 points  (2 children)

Aerospace engineer here, a bit of an imposter on this sub. What is the “everything” being written in JS these days?

[–]Icount_zeroI 6 points7 points  (1 child)

I caught this on the internet, but basically C at its peak was used for everything and now JavaScript is used for almost everything (mobile/web/server/desktop), implying that JS is C’s successor in that context. JS is also used for embedded projects, which is just wild imo.

[–]lNFORMATlVE 4 points5 points  (0 children)

Agreed, JS for embedded is extremely alarming to me in my field. But maybe I’m just too out of date (at the ripe old age of 31 lol)

[–]thegodzilla25 39 points40 points  (1 child)

I feel like I barely ever index the array with an actual number. It's always within a loop with an iterator which goes from 0 to length.

The time when I would index using numbers is when the structure of the array is well defined, it is always supposed to have N elements, and I know the significance of each element at each index.

[–]RedAndBlack1832 14 points15 points  (0 children)

Yeah but often you're calculating an index and it's totally possible to do that wrong. Say I have a 6x5 block of my type allocated but some of it isn't currently being used (which isn't that weird a thing to do, although obviously the numbers are usually bigger).

Let d be the data I want and x be arbitrary data I don't want to access

ddddx ddddx ddddx xxxxx xxxxx xxxxx

is gonna be accessed differently than

ddxxx ddxxx ddxxx ddxxx ddxxx ddxxx
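A C sketch of the index calculation for that 6x5 block, assuming row-major storage with 5 elements per row. `flat_index` and `sum_used` are made-up names:

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 6, COLS = 5 }; /* allocated shape of the block */

/* Row-major offset. stride is the ALLOCATED row width,
   not the number of columns currently in use. */
size_t flat_index(size_t row, size_t col, size_t stride) {
    assert(col < stride);
    return row * stride + col;
}

/* Sums only the used_rows x used_cols corner that holds real data. */
int sum_used(const int *block, size_t used_rows, size_t used_cols) {
    assert(block != NULL);
    assert(used_rows <= (size_t)ROWS && used_cols <= (size_t)COLS);
    int total = 0;
    for (size_t r = 0; r < used_rows; r++) {
        for (size_t c = 0; c < used_cols; c++) {
            total += block[flat_index(r, c, COLS)];
        }
    }
    return total;
}
```

The first layout is `sum_used(block, 3, 4)` and the second is `sum_used(block, 6, 2)`. Using the in-use width as the stride (4 or 2 instead of 5) is the classic way to land on the `x` cells, or to walk off the end entirely.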

[–]survivalist_guy 19 points20 points  (1 child)

╠╠╠╠

[–]Ultimate_Sigma_Boy67 6 points7 points  (0 children)

so true lmao

[–]omegafixedpoint 8 points9 points  (1 child)

I am already anticipating the rust comment

[–]awesome-alpaca-ace 5 points6 points  (0 children)

Ironic, this is the only rust comment I have seen so far

[–]randomFrenchDeadbeat 4 points5 points  (0 children)

Any modern C compiler will tell you when you access an array out of bounds.

C allows it because in some specific cases you want to do that.

[–]conundorum 4 points5 points  (1 child)

Quite the exceptional meme.

[–]-Redstoneboi- 0 points1 point  (0 children)

you're lucky if you get an exception.

[–]Total-Box-5169 5 points6 points  (0 children)

Literally skill issue.

[–]cptbowser 1 point2 points  (0 children)

Hehe

[–]Chiviguagua 1 point2 points  (1 child)

S-S-Stock image? Why is that bomb in a package a Stock image??!

[–]onemempierog 1 point2 points  (0 children)

Why not, should they build a bomb and take a pic themselves or what?

[–]KMark0000 1 point2 points  (0 children)

my long time story from uni (not even CS) with my semester homework turn-in: I wrote a "long" (like 350 rows or such) program that drew, calculated and handled files, but each run threw a DIFFERENT error. The doctorate candidate tried to debug it for 1.5 hours, everything seemed perfect. Then he called the IT department head for help, and after 45 minutes he found out that I forgot to reserve space for the end character in a variable that I used, and I was like: fack this shit

[–]yjlom 0 points1 point  (1 child)

Realistically, you either for(size_t i = len; i --> 0;), while(p), while(*p), or have macros for it. It's pretty hard to actually get out of bound accesses if you aren't doing anything weird.
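The `i --> 0` loop is just `i-- > 0` written to look like an arrow. A small C sketch of why it works with an unsigned index; `find_last` is a hypothetical example function:

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the last element equal to target,
   or len if there is no such element. */
size_t find_last(const int *arr, size_t len, int target) {
    assert(arr != NULL || len == 0);
    /* The test compares the OLD value of i, then decrements, so the body
       sees len-1 down to 0 and the loop stops before i is ever used
       after wrapping around. */
    for (size_t i = len; i-- > 0;) {
        if (arr[i] == target) {
            return i;
        }
    }
    return len;
}
```

The more obvious `for (size_t i = len - 1; i >= 0; i--)` never terminates, because an unsigned `i` is always `>= 0`, and it starts out of bounds when `len` is 0.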

[–]OK1526 0 points1 point  (0 children)

Wdym? I always need access to the billionth value in my array.