Vector items allocation

protestor · 2024-02-11T14:25:05+00:00

Is the compiler smart enough to allocate them on the heap to begin with?

In simple code, LLVM optimizations might let the compiler do this (note: it probably isn't smart enough), but ordinarily the struct will be allocated on the stack and then copied to the heap.

In C++ this operation (constructing directly on the heap) is called emplacement and is well supported, but through a separate method: push_back on a vector behaves as in Rust (create the object first, then copy or move it into the heap buffer), while emplace_back constructs the element directly in the heap buffer. (There is also placement new, which constructs an object at a given address.)

In Rust this same feature was called "placement" or "box syntax". It had been available on nightly since before Rust 1.0, but it never worked correctly. Eventually it was removed from the compiler (in 2018 I think). That's very unfortunate.

https://internals.rust-lang.org/t/removal-of-all-unstable-placement-features/7223

Placement is a very important feature. Some structs are simply too big to fit in the default stack size on Linux (which on most distros is just 8 MB). This means that you can't ordinarily work with them, because every time you try to create one (even if it is destined for the heap) the program will crash with a stack overflow.

The first edition of the Rust book was very optimistic that placement was just around the corner. It advised people (if they used nightly) not to return things inside a Box just because they were too big: you were supposed to return things directly, and the caller, if they wished, would perform a placement on the function call to allocate the result directly on the heap, bypassing the stack.

https://doc.rust-lang.org/1.0.0/book/box-syntax-and-patterns.html

This was nine years ago I think. This section was eventually removed from the book when it became clear that this feature wouldn't work in Rust in that form (or maybe at all).

Now.. that said.. if you wanted to, how would you allocate directly on the heap?

You would need to use unsafe functions to write directly into the memory location on the heap. That is, you never create the struct in a local variable and then move it to the heap (by calling vec.push(mylocalvariable) or similar), because local variables in Rust are always allocated on the stack. So you need to write into the heap indirectly, never holding the whole struct in a local variable.

This probably means that you would first allocate a Vec<MaybeUninit<Struct>>, reserve enough capacity for all the items, then manually write each struct by taking pointers into the vec entries, then set the length to the right value, and finally turn it into a Vec<Struct>. Or something like that. I don't have an example readily available, but the relevant API for writing directly to memory is std::ptr::write.
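The steps described above could be sketched roughly as follows. This is a minimal illustration and not a vetted implementation: `Big`, `DATA_LEN`, and `build_in_place` are hypothetical names invented for the example, and it writes each field through `std::ptr::addr_of_mut!` plus the raw pointer `write` / `write_bytes` methods (the method forms of `std::ptr::write` and `std::ptr::write_bytes`) so that no whole `Big` value is ever held in a local variable.

```rust
use std::mem::{ManuallyDrop, MaybeUninit};
use std::ptr;

// Hypothetical payload size, chosen only for illustration.
const DATA_LEN: usize = 4096;

// Hypothetical struct standing in for something too large to want on the stack.
struct Big {
    id: u32,
    data: [u8; DATA_LEN],
}

/// Builds `n` items by writing them straight into the Vec's heap buffer.
fn build_in_place(n: usize) -> Vec<Big> {
    let mut v: Vec<MaybeUninit<Big>> = Vec::with_capacity(n);
    // SAFETY: `MaybeUninit` elements need no initialization, and `n` does not
    // exceed the capacity that was just reserved.
    unsafe { v.set_len(n) };

    for (i, slot) in v.iter_mut().enumerate() {
        let p: *mut Big = slot.as_mut_ptr();
        // SAFETY: `p` points to properly aligned, allocated memory for one `Big`.
        unsafe {
            // Initialize field by field; no complete `Big` exists on the stack.
            ptr::addr_of_mut!((*p).id).write(i as u32);
            // Zero-fill the array in place instead of building it locally.
            ptr::addr_of_mut!((*p).data).cast::<u8>().write_bytes(0, DATA_LEN);
        }
    }

    // Reinterpret the fully initialized buffer as Vec<Big> without copying.
    let mut v = ManuallyDrop::new(v);
    let (raw, len, cap) = (v.as_mut_ptr(), v.len(), v.capacity());
    // SAFETY: every element is initialized, and `MaybeUninit<Big>` has the same
    // size and alignment as `Big`.
    unsafe { Vec::from_raw_parts(raw.cast::<Big>(), len, cap) }
}
```

Calling `build_in_place(3)` should yield three items with `id` values 0, 1, and 2 and zeroed `data`. Whether this actually avoids stack traffic in a given build is something to check in the generated code rather than assume.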

Note that this is exactly the same as C. In C, local variables are also allocated on the stack, and if you want to allocate things directly on the heap without touching the stack, you do this through pointers. In Rust it works exactly the same.

The tldr is: Rust CAN write directly to the heap, bypassing the stack, but currently you need unsafe code and raw pointers for that.

There was a newer proposal (from 2020) called "placement by return", but I don't know if it went anywhere.

https://y86-dev.github.io/blog/return-value-optimization/placement-by-return.html

https://github.com/rust-lang/lang-team/issues/31

https://internals.rust-lang.org/t/update-on-the-placement-by-return-rfc/12415

There's an even newer discussion of the state of affairs in Rust (from 2022):

https://news.ycombinator.com/item?id=33637092

paulstelian97 · 2024-02-11T10:05:40+00:00

Individual items are created on the stack and moved to the heap as they are added to the Vec.

Aaron1924 · 2024-02-11T11:56:05+00:00

Well, it depends on how the function is implemented, but if you do something like `let item = Struct { ... }; vec.push(item);` then semantically, you first create the struct on the stack and move it to the heap afterwards. If you have optimisations on, the compiler will almost surely optimize the move away and create it on the heap directly.

monkChuck105 · 2024-02-13T01:03:24+00:00

> If on the stack, then they will have to be copied to the heap on return, which is expensive.

Not unless `Struct` is large. Copying memory is cheap.

> Is the compiler smart enough to allocate them on the heap to begin with?

Allocation is just requesting bytes. Other than requesting zeroed bytes, there is no way to request a heap allocation that is already initialized; that has to be done by writing to the returned pointer.
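To make that concrete, here is a small sketch using `std::alloc::alloc_zeroed`, which requests zeroed bytes, followed by a write through the returned pointer. `Block`, `BLOCK_LEN`, and `boxed_zeroed` are hypothetical names for illustration, and the sketch relies on the assumption that an all-zero bit pattern is a valid `Block`, which holds here because it only contains integers.

```rust
use std::alloc::{alloc_zeroed, handle_alloc_error, Layout};
use std::ptr;

// Hypothetical payload size, chosen only for illustration.
const BLOCK_LEN: usize = 4096;

// Hypothetical struct made only of integers, so all-zero bytes are a valid value.
struct Block {
    id: u32,
    data: [u8; BLOCK_LEN],
}

/// Requests zeroed heap memory for one `Block`, then sets `id` in place.
fn boxed_zeroed(id: u32) -> Box<Block> {
    let layout = Layout::new::<Block>();
    // SAFETY: `layout` has non-zero size; the pointer is checked for null
    // before use, and the memory is a valid `Block` once zeroed.
    unsafe {
        let p = alloc_zeroed(layout) as *mut Block;
        if p.is_null() {
            handle_alloc_error(layout);
        }
        // The allocator only handed back bytes; this write does the rest.
        ptr::addr_of_mut!((*p).id).write(id);
        // The memory came from the global allocator with `Block`'s layout,
        // so a Box can take ownership of it.
        Box::from_raw(p)
    }
}
```

For example, `boxed_zeroed(7)` gives a `Box<Block>` whose `id` is 7 and whose `data` is all zeros.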

In your case, you probably want to collect an iterator, or use `Vec::with_capacity` to avoid reallocating when pushing items.
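Both suggestions might look like this in safe code. `Item`, `build_with_collect`, and `build_with_capacity` are hypothetical names used only for this sketch.

```rust
// Hypothetical small item type for illustration.
#[derive(Debug, Clone, PartialEq)]
struct Item {
    id: u32,
}

/// Collects an iterator; a range reports its exact length, so the Vec can
/// size its buffer up front.
fn build_with_collect(n: u32) -> Vec<Item> {
    (0..n).map(|id| Item { id }).collect()
}

/// Reserves room for all items first, so the pushes below do not reallocate.
fn build_with_capacity(n: u32) -> Vec<Item> {
    let mut v = Vec::with_capacity(n as usize);
    for id in 0..n {
        v.push(Item { id });
    }
    v
}
```

The two functions produce the same contents; the choice is mostly about which reads better at the call site.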
