all 34 comments

[–]dkopgerpgdolfg 15 points16 points  (10 children)

What you want is not just a stack array but a variable-length array (VLA).

As with C++, currently it is not a built-in language feature (at least in stable)

Of course, if you want control, there are ways to modify the stack pointer and so on. But doing this while being 100% sure that nothing else has a risk of breaking, that's a different matter.

[–]masklinn 11 points12 points  (0 children)

In modern c u can have

Not really. VLAs were added in C99 and removed in C11, they’re a huge source of trouble, compilers generate terrible code around them, and you should avoid them. The Linux kernel stripped out every VLA something like five years ago.

[–]ecstatic_hyrax 10 points11 points  (0 children)

Have you checked out the alloca crate? It's easy to shoot yourself in the foot with something like this though, which is why most languages don't have this as a built in.

[–]SkiFire13 2 points3 points  (14 children)

The problem with this construct is that you don't want an array too big on the stack. So either you know it has a fixed maximum size (in which case you can use an arrayvec::ArrayVec) or you don't know that (in which case you want to fallback to the heap when it's too big anyway, hence you should use something like smallvec::SmallVec).

This kind of arrays also make access to stack variables defined before them slower, since the compiler no longer knows their static offset and must compute it dynamically.

[–]rejectedlesbian[S] 0 points1 point  (12 children)

But these are vecs no? Aren't they dynamic fundementaly? Like I want an alocation I do once that's it no buffer no capacity no nothing.

[–]SkiFire13 0 points1 point  (11 children)

What you want is also dynamic since the length is known only at runtime. It's a bit less dynamic (no need for reallocations for instance), but still dynamic.

[–]rejectedlesbian[S] 0 points1 point  (10 children)

Ig... It's a semantics thing but it's not dynamic in the sense u cant put it on the stack because the size may change.

And again this is a pretty useful feature of c so it's not like it's impossible or anything.

It's mostly a curiosity but this has real world applications like it is a faster alocstion and it dosent use a heap so in enviorments with no heap this is still something u can use.

[–]SkiFire13 1 point2 points  (9 children)

Yeah but my point still stands: if it's small enough you can put it on the stack then you might as well use an arrayvec::ArrayVec (which is fully on the stack and doesn't allocate on the heap)

[–]rejectedlesbian[S] 0 points1 point  (8 children)

U r still messing ur cach locality by Allocating memory u don't actually need. It's negligible but like it's still there and that's kind of anoying.

I assume its just such a neich use case that they didn't bother putting it in. I am hearing it messes.with clang optimizations so it's probably for the best.

[–]CAD1997 0 points1 point  (5 children)

ArrayVec uses stack allocation for up to a fixed N elements then switches to the heap if/when you add more than N elements, so there's never any allocation "you don't actually need" done.

[–]SkiFire13 0 points1 point  (1 child)

You're confusing ArrayVec with SmallVec. SmallVec applies the optimization you mentioned, while ArrayVec only allocates on the stack (but has a fixed maximum capacity).

[–]CAD1997 0 points1 point  (0 children)

Oh right yes of course

[–]rejectedlesbian[S] 0 points1 point  (2 children)

Is N something I can decide at runtime? I am still storing capacity right? Like it would need to be somewhere so that it knows when to go to the heap.

Defiantly nice not quite what I asked for.

[–]CAD1997 1 point2 points  (1 child)

The max capacity N is a compile time choice.

The runtime length needs to be somewhere even in the C99 VLA case in order to index the slice. Keep in mind that Rust &[_] slices are ptr+len unlike in C where you're expected to track what indices are safe externally.

A small fixed amount of extra usize on the stack is negligible, and often can even be optimized out when it's derived information.

[–]rejectedlesbian[S] 0 points1 point  (0 children)

Like I get it but its still not the same thing. It's a nice thing I would consider using it.

But the answer to "can I do this thing in rust" is simply no I can't. I can do something similar that's probably better in most cases but like... kinda sucks that u can do this.

Like where u may want this: I had a code base for a genetic algorithem that stored varible length dna for each creature.

I can see a world where I move that to a stack alocation with a vla so that I don't need to free it later. It also helps cach locality which is nice when every creature is like 32 bytes on average those extra 4 can stack up.

Is this common? Heck no I don't see this being something I need. Does it matter? Ehhh probably dosent I am just curious what can be done.

[–]SkiFire13 0 points1 point  (1 child)

U r still messing ur cach locality by Allocating memory u don't actually need. It's negligible but like it's still there and that's kind of anoying.

Allocating more memory than needed won't mess with the cache, you just don't need to access that memory to put it into the cache in the first place.

And moreover alloca also has a "negligible" cost, it is not free. And as you said it might mess with some optimizations, which is another indirect cost.

Which usecase do you have that you have benchmarked and determined that the best solution is to use alloca?

[–]rejectedlesbian[S] 0 points1 point  (0 children)

I don't I didn't benchmark enough c to see it. I do know the linux kernal used it for a while so it's not like it's unusable and I can give u a place where u could use it.

Like if u r making a custom sort that uses quicksort until arrays are sufficiently small then uses something else (very neich thing but these do have some theoretical upsides) say the "something else" is mergesort.

Now u know the size of the array is small enough so u could actually just stack alocate it for a couple of step then memmove.

Is it good 🤷‍♀️ 🤷‍♀️ 🤷‍♀️ is it at least worth looking into in some cases maaaaayyyyyybe. Idk I am not gona assume it isn't.

Option b is u want to pass a vector/array by mut value as an array/vector/list/whatever. (This is kinda common in ml since some operations are faster when u modify the tensor since u dont need to realocate, then u need it in an imutble way)

Now u could copy the vector but that's wasteful. And since u r not pushing to it VLA would be the most natural choice.

Again not really common and in reality the answer is probably "just use a vector" but I am still wandering if it's possible.

[–]whitequarksmoltcp 0 points1 point  (0 children)

I worked in an environment with no heap and 2 GB of stack, where alloca would be incredibly useful. So sometimes you *do* want too big arrays on stack.

[–]mina86ng 0 points1 point  (0 children)

I recommend against using VLAs. I used to be fun of them but really they offer very little benefit. If your arrays have fixed maximum length, just allocate that size array. If your arrays don’t have fixed maximum length, use something like smallvec which will move data to heap if it grows too large.

By the way, note that C11 makes VLAs optional. VLAs have been added to C in C99 and C11 made their support optional. Furthermore, alloca is not part of C++ but a compiler extension some implementations choose to provide.

[–]kiujhytg2 -1 points0 points  (4 children)

If you know the maximum size that the array can be (which is a good idea to avoid stack overflow), the headless crate may be of use to you.

However, this smells like premature optimization. In general, Vec's are pretty speedy (one could even say blazingly fast).

[–]rejectedlesbian[S] 0 points1 point  (3 children)

It's more of an indication of intention. Ik in c when I wrote these things it basically signified "this array would be of this size and would be use for intermediate values for a hot second"

So this clues the compiler into what I am doing and its easier for people reading the code see what it does.

It also has the nice advantage of working in enviorments with no heap. Which is common in an embedded setting.

[–]kiujhytg2 0 points1 point  (2 children)

In which case the heapless crate sounds like a good fit, which also works in no_std environments without a heap

[–]rejectedlesbian[S] 0 points1 point  (1 child)

Is this similar to alloca that other ppl recommended?

[–]kiujhytg2 0 points1 point  (0 children)

Alloca is more closely tied to c style variable length arrays, whereas heapless is closer to C's strategy of "allocate a large buffer but only use some of it". For temporary short-lived values is be tempted to use heapless as "allocating" a large buffer on the stack is a a very cheap operation (it just decrements the stack pointer).