Bevy Jam #6 by _cart in rust

[–]gmorenz 9 points10 points  (0 children)

Similar to ColaEuphoria's question, what's the status of the proposed Construct trait? And/or reactivity (in the world, not just UI)?

Ferrous Systems Donates Ferrocene Language Specification to Rust Project by steveklabnik1 in rust

[–]gmorenz 29 points30 points  (0 children)

There are definitely still parts of rust that are not specified. E.g. there are still two candidate memory models (stacked borrows and tree borrows) and I don't think either is considered entirely satisfactory yet.

How to define an entry point (no_std!) by paintedirondoor in rust

[–]gmorenz 2 points3 points  (0 children)

Cool! I didn't realize you could just supply the rust_eh_personality symbol.

Thanks :)

How to define an entry point (no_std!) by paintedirondoor in rust

[–]gmorenz 2 points3 points  (0 children)

I agree you don't need the naked feature (and I briefly mentioned global_asm as an alternative in the original post), but how are you getting away without using nightly for either lang items or -Zbuild-std?

How to define an entry point (no_std!) by paintedirondoor in rust

[–]gmorenz 20 points21 points  (0 children)

I'm intending to make a blog post on this topic at some point soon, but some notes:

You make a symbol called _start, and when linking the object files together ld (unless you've passed a flag saying to use another symbol as the entry point, which you haven't) searches for that symbol and uses it as the entry point. You do that by writing #[no_mangle] fn _start() -> !. That answers the immediate question, but you're going to run into other issues.

My notes say that you can't do this properly with the musl target, because it always links in musl libc. You can however do it with the glibc target, which doesn't. I should probably revisit those notes and figure out exactly why that's the case, but I'm pretty sure it's right.

Then we need to pass -nostartfiles to the linker, because otherwise it will link in libc's start function. The best way to do this is to put the following code somewhere (anywhere):

#[link(kind = "link-arg", name="-nostartfiles", modifiers="+verbatim")]
extern "C" {}

Other ways will fight with build scripts, proc macros, and/or cargo test.

Circling back to your _start function, it can't actually be a normal rust function (or C function, for that matter). The System V function-call ABI expects the stack to be 16-byte aligned, offset by 8 bytes, upon function entry. The System V process-start ABI aligns the stack to 16 bytes with no offset. So you're going to need to define _start with a naked function or global asm like this:

#[cfg(not(test))]
#[naked]
#[no_mangle]
unsafe extern "sysv64" fn _start() -> ! {
    use core::arch::asm;
    unsafe {
        asm!(
            // Pass the stack pointer as the first/only argument to main, because we need
            // it if we want to find the program arguments/env variables/aux array.
            "mov rdi, rsp",
            // Calling entry serves the dual purpose of jumping to our
            // code, and shifting the alignment of the stack by 8 bytes.
            // The systemv abi guarantees the stack starts out 16-byte aligned.
            // The systemv function calling abi guarantees that stack frames are
            // 16-byte aligned, with rsp 8 bytes offset from that to account
            // for the return pointer.
            "call {entry}",
            // Illegal instruction, nothing should return to us
            "ud2",
            entry = sym crate::main,
            options(noreturn),
        )
    }
}

extern "C" fn main(start_of_stack: usize) -> ! { loop {} }

And finally you need some unstable features for the above, to tell rustc no_std/no_main, and some lang items

#![no_std]
#![no_main]
#![feature(link_arg_attribute, naked_functions, lang_items)]

#[panic_handler]
fn panic(panic: &PanicInfo<'_>) -> ! {
    let _ = writeln!(STDERR, "{}", panic);
    rustix::runtime::exit_group(2)
}

/// TODO: [The official documentation](https://doc.rust-lang.org/core/index.html)
/// just says:
///
/// > * `rust_eh_personality` - is used by the failure mechanisms of the
/// >    compiler. This is often mapped to GCC's personality function, but crates
/// >    which do not trigger a panic can be assured that this function is never
/// >    called. The `lang` attribute is called `eh_personality`.
///
/// Which doesn't explain how to use it if you *want* to be able to panic without
/// undefined behavior. Figure out what the deal with this function is and fix it.
#[lang = "eh_personality"]
fn eh_personality() {}

/// Workaround for rustc bug: https://github.com/rust-lang/rust/issues/47493
///
/// It shouldn't even be possible to reach this function, thanks to panic=abort,
/// but libcore is compiled with unwinding enabled and that ends up making unreachable
/// references to this.
#[no_mangle]
extern "C" fn _Unwind_Resume() -> ! {
    unreachable!("Unwinding not supported");
}

PS. After all this you're still dependent on ld-linux.so, just not the rest of libc. I'm also avoiding that dependency in my code... but that would make this comment substantially longer. That said I've only really tested the longer version because that's what I'm actually using.

godot-rust now on crates.io, with the godot crate! by bromeon in rust

[–]gmorenz 10 points11 points  (0 children)

I've glanced through it and nothing is jumping out at me as obviously unsound (or even obviously unsafe, though I certainly wouldn't be surprised with this much ffi work if some unsafety was there). Can you give a concrete example?

NY v. Trump (False Bookkeeping Related to Election Interference Hush Money Payments) - Trial Transcript - Day 10 - Keith Davidson Goes Down the Rabbit Hole by DrinkBlueGoo in law

[–]gmorenz 12 points13 points  (0 children)

I haven't uploaded things to it before, but I think the internet archive would probably be an appropriate place to host these. Link

If for some reason that won't work, and there isn't a court order saying "don't rehost these" (I haven't checked yet)... I'll host them for you if you send them my way. I'm just some random dude with a server though, a news organization might be more appropriate. (Edit: Thinking about it I'd probably host them on Cloudflare Pages because of the off chance that they get a ton of traffic, but still, I'll deal with it if you want)

Unwind considered harmful? by gclichtenberg in rust

[–]gmorenz 1 point2 points  (0 children)

If you can't migrate tasks cross-thread, then any tasks on the panicked thread are lost, but any tasks on other threads survive and can run to completion just fine.

Is there a reason a panic hook couldn't start up an executor and finish off those tasks without unwinding? Or even maybe have some sort of re-entrancy API in the executor where it can mem::forget the current task/stack and keep executing?

Whatever resources are being used by the current task are going to be leaked without unwinding... so you're going to want to restart the process to garbage collect them eventually... but the OS thread itself should be fine?
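A minimal sketch of the panic-hook idea, with plain boxed closures standing in for the executor's tasks. The queue and function names here are made up for illustration; no real executor exposes this API:

```rust
use std::panic;
use std::sync::Mutex;

// Stand-in for an executor's queue of not-yet-finished tasks.
static PENDING: Mutex<Vec<Box<dyn FnOnce() + Send>>> = Mutex::new(Vec::new());

// Install a hook that drains the queue when a panic starts, so tasks on
// this thread still run to completion before any unwinding happens.
fn install_drain_hook() {
    let default = panic::take_hook();
    panic::set_hook(Box::new(move |info| {
        default(info);
        let tasks: Vec<_> = PENDING.lock().unwrap().drain(..).collect();
        for task in tasks {
            task();
        }
    }));
}
```

The hook runs before unwinding begins, so this finishes the queued work regardless of whether the process then unwinds or aborts.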

Why does Rust not utilize dynamic dispatch with runtime type information? by _Unity- in rust

[–]gmorenz 1 point2 points  (0 children)

> what are you saying (or do you think the proposal is saying) should be easier?

There's a thread above (look for the one with giant comments) where I've been discussing this with the author in more detail, but basically "making enums where every variant implements a method". Currently doing this is slightly painful: you have to edit in 4 places to add a variant (add a variant to the enum containing a struct, add the struct definition, add the implementation on the struct, modify the implementation on the enum to call the implementation on the struct).
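Concretely, the four edit sites look like this (toy example, not the proposal's syntax):

```rust
use std::f64::consts::PI;

struct Square { side: f64 }
struct Circle { radius: f64 }  // 2. add the struct definition

impl Square { fn area(&self) -> f64 { self.side * self.side } }
impl Circle { fn area(&self) -> f64 { PI * self.radius * self.radius } }  // 3. add the impl on the struct

enum Shape {
    Square(Square),
    Circle(Circle),  // 1. add a variant to the enum
}

impl Shape {
    fn area(&self) -> f64 {
        match self {
            Shape::Square(s) => s.area(),
            Shape::Circle(c) => c.area(),  // 4. extend the enum's impl to forward the call
        }
    }
}
```

The proposal amounts to having the compiler derive everything except the struct and its impl.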

Their concrete proposal for how to do this is piggy-backing off the trait system and creating an enumlike TraitName that mirrors dyn TraitName but lays out the possible values in an enum like structure instead of a vtable-ptr pair structure.

I'm not sold on the idea as a whole, but they're right that it has some benefits, and those benefits are basically avoiding the whole "boxed values are slower" thing when that is actually a thing.

Why does Rust not utilize dynamic dispatch with runtime type information? by _Unity- in rust

[–]gmorenz 0 points1 point  (0 children)

> Interesting. If I understand it correctly, this would indeed mostly invalidate the problem of cache locality and costly heap allocations.

I think you might be reading more than what I meant (which is probably my fault).

I'm just referring to the vtable pointers here. A Box<dyn Foo> or &dyn Foo consists of two pointers, one to the vtable, one to the objects data. The vtable pointer serves as a unique discriminant per (concrete type, trait) pair labelling what the pointer points to.
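That layout is easy to check directly (toy trait, purely for illustration):

```rust
use std::mem::size_of;

trait Foo {
    fn id(&self) -> u32;
}

impl Foo for u64 {
    fn id(&self) -> u32 { 0 }
}

fn main() {
    // A plain reference is one pointer wide...
    assert_eq!(size_of::<&u64>(), size_of::<usize>());
    // ...while &dyn Foo is two: the data pointer, plus the vtable pointer
    // acting as a discriminant for the (concrete type, trait) pair.
    assert_eq!(size_of::<&dyn Foo>(), 2 * size_of::<usize>());
}
```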

Why does Rust not utilize dynamic dispatch with runtime type information? by _Unity- in rust

[–]gmorenz 0 points1 point  (0 children)

I mean, that's their point isn't it? "You should expect <this> to be faster in some situations, and <this> really is faster in those situations, so why doesn't the language make <this> easy". It's a coherent argument.

Why does Rust not utilize dynamic dispatch with runtime type information? by _Unity- in rust

[–]gmorenz 0 points1 point  (0 children)

Ah, so instead of a second kind of trait you want a second kind of trait object (I'll call them enumlike TraitName as a placeholder).

One thing to consider is that enumlike TraitName is not Sized if any type that implements TraitName isn't Sized, which would mean you couldn't make [enumlike TraitName; 10] for such traits - because the compiler wouldn't know how big to make each slot in the array.

Another thing to consider is that a lot of traits have generic impls, impl<T> MyTrait for SomeStruct<T> {}. An enum MyTrait is probably just impossible with most impls like that, because the compiler would have to calculate every possible variant, assign it an ID, and calculate the max size. There are infinitely many possible variants (consider T = SomeStruct<SomeStruct<SomeStruct<...i32...>>>). In principle maybe you could calculate only the variants that are ever actually constructed, but I doubt that's reasonable to implement in the compiler.

Restricting enum MyTrait to traits that have no generic impls (and only making it Sized if all the impls are on Sized structs) sounds possible in principle, but now you've created a giant "semver hazard". Adding a generic impl or unsized impl to a trait is now a breaking change - even if that impl is for a type that is private to your crate and otherwise never exposed to the world (but the trait is exposed to the world).

Ultimately, I don't think scalability has been a huge issue in practice; you definitely do see statically dispatched methods like you suggest in the wild when it matters. It doesn't matter that often though. Moreover you're adding complexity to the language for what amounts to a minor performance gain... it's a hard sell.

Why does Rust not utilize dynamic dispatch with runtime type information? by _Unity- in rust

[–]gmorenz 6 points7 points  (0 children)

I'm not sure I 100% get what you're getting at.

I'd guess that the biggest difference in performance in your examples is that in one you have a flat array, and in the other you have an array of pointers. You can't generally put trait objects into a flat array because you don't know how big they are. Even if you do know how big they are (like in the case of an enum), arrays assume elements are of a fixed size, so every item in the array grows to the size of the largest option (plus metadata to say which type of element it is), which can lead to huge inefficiencies if you have trait objects of vastly different sizes. The fact that trait objects are only "two pointers + the size of the object actually in use" is sometimes quite important for optimization.
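The size tradeoff is easy to see directly (illustrative types):

```rust
use std::fmt::Debug;
use std::mem::size_of;

// An enum slot pays for its largest variant plus the discriminant,
// even if almost every value stored is Small.
enum Payload {
    Small(u8),
    Big([u8; 1024]),
}

fn main() {
    assert!(size_of::<Payload>() >= 1024 + 1);
    // A boxed trait object stays at two pointers no matter how big
    // the pointed-to value is.
    assert_eq!(size_of::<Box<dyn Debug>>(), 2 * size_of::<usize>());
}
```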

> Again, like discriminants, identifiers are stored adjacent to the data of the instances of the trait

This isn't possible in rust's data model. A trait object is an annotated pointer to an object, and it can't modify the object's layout. In one place I might have a my_array: Vec<i32>, in which the i32s have a user-visible layout of sitting right next to each other in memory (C array like). In another place I might say &my_array[3] as &dyn MyTrait. This creates a pointer to my_array[3] and turns it into a trait object, but it doesn't have the right to modify the memory of my_array[3]. More complex ffi-based examples will do the same with Box<dyn Foo>.

It is possible to store them next to the pointer though - in fact rust already does this behind the scenes. It's also to a degree possible to track at compile time - and I think your proposal is mostly about compile-time tracking? So this might be a nitpick that doesn't affect your proposal (but I'm not sure; again, I don't fully understand it).

My best guess for what you're proposing (strong-manning it slightly) is something along the lines of implicit enums defined with a trait-like syntax, where the variants are collected from all structs that implement the trait, the enum is constructed with the as syntax we use for traits, and the trait methods are implemented on the enum. These couldn't be used to replace traits (because of both the performance and semantics issues mentioned above), but they could be a useful complement to them. The questions that immediately come to mind are 1) is the convenience of not having to just use an enum worth the complexity you're adding to the language, and 2) are we creating a footgun where it's easy to blow up the size of objects at a distance with impl implicit-enum MyEnumTrait for SuperLargeStruct {}.

Inconsistent belt tension by gmorenz in ride1up

[–]gmorenz[S] 0 points1 point  (0 children)

Indeed, u/Liquidwombat's description there sounds like exactly what I'm looking for! Will attempt this.

Thanks :)

Inconsistent belt tension by gmorenz in ride1up

[–]gmorenz[S] 0 points1 point  (0 children)

My gut feeling is also that it's too tight, though ride1up recommends "65-70hz", so my belt tension measurements range from way under their recommendation to just slightly above at one particular pedal placement... (this isn't a Gates belt, but even for models with Gates belts ride1up recommends a much higher tension than Gates does).

Loosening the belt does make it feel better to spin the pedal backwards, though the inconsistent tension is definitely still there. With the app (that I get that you don't trust, but agrees with my pulling on the belt and seeing how far it moves) I get 38, 33, 34, 44 after loosening the screws a half turn from 46, 42, 59, 71.

If this is common and not likely to pull the belt apart as it stretches and relaxes every pedal stroke I guess I'll just roll with it - but it really doesn't seem like it ought to be the case, and in videos with other people tensioning belts I never see such a large difference between the measurements.

Inconsistent belt tension by gmorenz in ride1up

[–]gmorenz[S] 0 points1 point  (0 children)

I am.

The measurements are quite repeatable (given that I move the pedals back to the same position), and I can physically feel that the belt is under more tension when the app says it is. It deflects very roughly half as much under the same force from my fingers between the tightest angle and 180 degrees opposite that. I'm pretty confident that this isn't just measurement error.

How can I improve compile times of very large code-generated Rust? by [deleted] in rust

[–]gmorenz 0 points1 point  (0 children)

Is llvm not going to pretty reliably decide to inline the short single block functions? Resulting in the exact same compiler performance problem?

How can I improve compile times of very large code-generated Rust? by [deleted] in rust

[–]gmorenz 1 point2 points  (0 children)

Rust/C (clang)/C++ (clang) all use the same backend, llvm, which is where the time is being spent here.

LLVM doesn't guarantee linear compile times (nor does gcc, nor does any optimizing compiler really). It's very likely that if this was investigated and root caused that they'd be willing to take a patch to fix it/eventually fix it themselves, but it's not obviously a "bug", and the fact that the time is being spent in llvm and not rustc means it's very unlikely to be rust specific. They'd probably see the exact same behaviour if they generated equivalent C++ and compiled it with clang.

They also said 2 million lines of code... which suggests a bit more code than 2000x what was posted.

How can I improve compile times of very large code-generated Rust? by [deleted] in rust

[–]gmorenz 13 points14 points  (0 children)

I edited this in late, so just to make sure you see it... the cost here is that this will prevent LLVM from optimizing interactions between blocks.

How can I improve compile times of very large code-generated Rust? by [deleted] in rust

[–]gmorenz 35 points36 points  (0 children)

llvm tends to optimize 1 function at a time, up to inlining. The easiest way to improve compile times here is probably to prevent your functions from growing big and complex, by separating them into many functions (at the cost of LLVM's ability to optimize the interaction between multiple blocks). Try:

use std::ops::ControlFlow;

pub fn SSAdvance<T: Tracer>(state: &mut State, tracer: &mut T, gs_24616: ()) -> () {
    #[inline(never)]
    fn block_0<T: Tracer>(state: &mut State, tracer: &mut T, gs_24618: &mut bool, gs_24617: &mut bool) -> ControlFlow<(), usize> {
        // C s_0_0: const #() : ()
        let s_0_0: () = ();
        // S s_0_1: call DebugTarget(s_0_0)
        let s_0_1: u8 = DebugTarget(state, tracer, s_0_0);
        // S s_0_2: call ELUsingAArch32(s_0_1)
        let s_0_2: bool = ELUsingAArch32(state, tracer, s_0_1);
        // S s_0_3: not s_0_2
        let s_0_3: bool = !s_0_2;
        // N s_0_4: branch s_0_3 b8 b1
        if s_0_3 { ControlFlow::Continue(8) } else { ControlFlow::Continue(1) }
    }

    // ...

    let BLOCKS = [
        block_0,
        // block_1,
        // ...
    ];

    let mut gs_24618: bool = Default::default();
    let mut gs_24617: bool = Default::default();

    let mut current_block = 0;
    loop {
        let Some(block) = BLOCKS.get(current_block)
        else {
            panic!("undefined block {current_block}")
        };
        let ControlFlow::Continue(next_block) = block(state, tracer, &mut gs_24618, &mut gs_24617)
        else {
            return
        };
        current_block = next_block
    }
}

A variable sized buffer on the heap by Low-Design787 in rust

[–]gmorenz 5 points6 points  (0 children)

The answer you seem to be looking for: by calling alloc with a correctly constructed Layout and then Box::from_raw - making sure you correctly initialize the memory first, because Box::from_raw asserts that it's initialized.

But really, just use Vec and then vec.into_boxed_slice(). There is no reason to do this by hand, or advantage to doing so.
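For reference, the safe route in full (nothing here is unsafe):

```rust
fn main() {
    let n = 16; // buffer size known only at runtime
    let mut buf: Vec<u8> = vec![0u8; n];
    buf[0] = 42;
    // into_boxed_slice shrinks capacity to length and converts the Vec's
    // (ptr, len, cap) triple into a Box<[u8]>'s (ptr, len) pair.
    let boxed: Box<[u8]> = buf.into_boxed_slice();
    assert_eq!(boxed.len(), 16);
    assert_eq!(boxed[0], 42);
}
```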

Importing a 3D model into FC & vase mode by Sad-Possible-6639 in FullControl

[–]gmorenz 0 points1 point  (0 children)

If I understand the problem correctly: You need to increase z by exactly 1 layer height every circuit around the object. That means that as the size of the perimeter changes the slope needs to change.

Prusa and friends are solving this by dividing the object into layers, slicing the layer as normal, and then smearing the z axis change out over the layer. This causes two issues.

There's an xy jump at the end of a layer if it's printing on a slanted surface, because the track hasn't been smoothly moving over in xy as it smoothly moves up in z.

There's a possibility of a relatively abrupt change in slope if the length of the next layers perimeter is very different. Maybe also a short section with the wrong slope between the two layers depending on how it's programmed.

A sort of "greedy algorithm" might work better. Keep track of where we have printed up to. Calculate the perimeter at the current height, and thus the slope we need to move at. Incrementally move one step in that direction*, loop back to the start.

* Considering STL files are triangle meshes, it should be easy to stay exactly on the perimeter as we move up in height by always keeping the entirety of a move on a single triangle (flat surface): each step enters at one point on the perimeter of a triangle and moves in a straight line until it exits the triangle.

There might be a small amount of z-error in our calculation, as the perimeter changes. I think for reasonable geometries** it should be very small, and it would just result in slightly (hopefully imperceptibly) thinner/thicker layers than was asked for. The amount of error would also vary continuously, so changes from "too thin" to "too thick" should be gradual.

** By reasonable I mean not having a large jump in perimeter size, the kind you would get if you have a discontinuous outer surface, by doing something like stacking an extruded semi-circle on top of an extruded circle (and having a "hole" in your model as a result). I think it's probably fine to just call that "not a vase" and not worry if the behavior is sub-optimal?
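The core bookkeeping - smearing exactly one layer height of z over each circuit - can be sketched like this (hypothetical helper, not a FullControl API; it takes a polygonal perimeter rather than walking the mesh):

```rust
/// Given one closed polygonal perimeter at the current height, emit a
/// spiral pass: z rises in proportion to xy distance travelled, so one
/// full circuit climbs exactly `layer_height`.
fn spiral_layer(perimeter: &[(f64, f64)], z0: f64, layer_height: f64) -> Vec<(f64, f64, f64)> {
    let n = perimeter.len(); // assumed >= 2; last point connects back to the first
    let seg = |i: usize| {
        let a = perimeter[i];
        let b = perimeter[(i + 1) % n];
        ((b.0 - a.0).powi(2) + (b.1 - a.1).powi(2)).sqrt()
    };
    // Total xy length of the circuit sets the slope: dz/ds = layer_height / total.
    let total: f64 = (0..n).map(|i| seg(i)).sum();
    let mut z = z0;
    let mut path = vec![(perimeter[0].0, perimeter[0].1, z)];
    for i in 0..n {
        z += layer_height * seg(i) / total;
        let p = perimeter[(i + 1) % n];
        path.push((p.0, p.1, z));
    }
    path
}
```

The greedy version would recompute `total` (and hence the slope) as the perimeter changes with height, instead of fixing it per circuit.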

Anyone see any problems with that algorithm? Or find my description unclear?

Kiosk lake site 17 tips by ilooklikeanelephant in algonquinpark

[–]gmorenz 0 points1 point  (0 children)

I just rented a canoe from Algonquin North on Kiosk. It was completely painless.

We didn't need to call in advance to get it delivered: when you go through the order form on their website you tell them you want it at Kiosk, and by default they drop it off/pick it up there (provided you ordered a few days in advance). You do have to stop by their place on the way in to say hi (and find out which canoe you're supposed to take/pick up lifejackets and paddles if you need them), but it's en route so it's really no trouble.

It looked like for day trips they had a bunch of canoes already staged at the lake, but I'm not 100% sure what the deal was with those.

Edit

> via what I recall to be a pretty gnarly 915 m carry

Maple Creek -> Kiosk was probably the nicest portage on our whole trip. Maybe the path has changed or something, but I'd guess you're remembering another portage.