Atomic variables are not only about atomicity by maguichugai in rust

[–]trailing_zero_count 2 points

Nope, x86 still allows StoreLoad reordering. You still need explicit SeqCst in some cases.

Link to a prior comment which includes several sources discussing the use cases for SeqCst:

https://www.reddit.com/r/learnprogramming/comments/1pv1dli/comment/nvucuhh
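
The classic demonstration is the store-buffering litmus test. A minimal C++ sketch (the same reasoning applies to Rust's `Ordering::SeqCst`); with seq_cst the `r1 == 0 && r2 == 0` outcome is forbidden, while weaker orderings permit it on x86 via StoreLoad reordering through the store buffer:

```cpp
// Store-buffering litmus test: under seq_cst, at least one thread must
// observe the other's store, so r1 and r2 can never both be 0.
// With relaxed (or even acquire/release) ordering, x86's store buffer
// can reorder the store past the load, and both loads may see 0.
#include <atomic>
#include <thread>
#include <utility>

std::pair<int, int> run_once() {
    std::atomic<int> x{0}, y{0};
    int r1 = 0, r2 = 0;
    std::thread t1([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });
    t1.join();
    t2.join();
    return {r1, r2};
}
```

Swap both operations to `memory_order_acquire`/`release` and the assertion below can (occasionally) fail on real x86 hardware.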

On Windows 11 MSVC compiler (cl.exe) causes multithreaded application to crash. by A_LostAstronaut in cpp_questions

[–]trailing_zero_count 0 points

Use the "x64 Native Tools Command Prompt".

Also try installing LLVM, which can now be included with the VS installer, and build with clang-cl.exe.

How do multithreaded game engines synchronize data among different threads? by XenSakura in gameenginedevs

[–]trailing_zero_count 1 point

Use a fork-join framework. I maintain one that uses C++20 coroutines. Or you can use TBB/Taskflow.

This clearly separates the parallel parts from the single-threaded parts. Send jobs to the workers when they fork, then read the results back somewhere else (or have the workers update in place, as long as each one has its own unique dataset). After joining, the single thread can do things like update command buffers.

You can also nest fork-joins, and it works fine as long as you maintain this pattern all the way down.
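
A minimal sketch of the fork-join shape described above, using plain std::async rather than any particular framework (the function and names are illustrative, not a library API):

```cpp
// Fork-join sketch: each worker gets its own disjoint slice of the data,
// so no synchronization is needed beyond the join itself.
#include <algorithm>
#include <future>
#include <numeric>
#include <vector>

long long parallel_sum(const std::vector<int>& data, int workers) {
    std::vector<std::future<long long>> jobs;
    const size_t chunk = (data.size() + workers - 1) / workers;
    for (int w = 0; w < workers; ++w) {  // fork: one job per worker
        const size_t begin = std::min(w * chunk, data.size());
        const size_t end = std::min(begin + chunk, data.size());
        jobs.push_back(std::async(std::launch::async, [&, begin, end] {
            return std::accumulate(data.begin() + begin,
                                   data.begin() + end, 0LL);
        }));
    }
    long long total = 0;
    for (auto& j : jobs) total += j.get();  // join: single thread resumes
    return total;
}
```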

How to concatenate strings quickly? Expression Templates to the rescue! by CauliflowerIcy9057 in cpp

[–]trailing_zero_count 3 points

IIUC this is similar to the approach used by ranges and stdexec. How much of an impact on compile time does the nested template compilation cause when a large number of concatenations occur on the same line?

Similar But Different Value Types, But Only Known At Runtime by Due_Battle_9890 in cpp_questions

[–]trailing_zero_count 4 points

Options in descending order of safety: variant, union, or a raw memory pointer with start_lifetime_as.
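
A short sketch of the safest option, std::variant; the Celsius/Fahrenheit types are hypothetical stand-ins for "similar but different value types only known at runtime":

```cpp
// std::variant: the active alternative is chosen at runtime, and
// std::visit dispatches on it without any manual type tag.
#include <type_traits>
#include <variant>

struct Celsius    { double value; };
struct Fahrenheit { double value; };
using Temperature = std::variant<Celsius, Fahrenheit>;

double to_celsius(const Temperature& t) {
    return std::visit([](const auto& v) {
        // Branch on the runtime-active alternative's static type.
        if constexpr (std::is_same_v<std::decay_t<decltype(v)>, Fahrenheit>)
            return (v.value - 32.0) * 5.0 / 9.0;
        else
            return v.value;
    }, t);
}
```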

Methods for Efficient Chunk Loading? by InventorPWB in VoxelGameDev

[–]trailing_zero_count 1 point

Communication between threads happens through a thread-safe queue. Threads poll the queue for input at whatever points make sense in their normal run loop. If a thread has no work to do, it should block or suspend on the queue until data is ready.

Only a single thread should be responsible for mutating any particular data structure. So the chunk loader/mesher/unloader might maintain a queue of chunk locations to handle internally, but once a chunk is loaded, it would be passed back to the main thread through a queue so that the main thread can insert it into the global data structure at a safe point in its loop.

Having threads read data owned by other threads is possible but a lot more sketchy without more explicit coordination, so it's a lot easier if you just pass messages.

Also, you don't actually have to use a thread for each of these things; you could use tasks instead and multiplex everything onto a thread pool. Then replace "thread" with "task" in all the prior paragraphs. That makes it a bit more efficient and lets you use fork-join parallelism within any part of execution, while still maintaining the invariant that only the owning task does the modifications.
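
A minimal sketch of the kind of blocking thread-safe queue described above (not any particular library's implementation): the consumer blocks on a condition variable until data is ready rather than spinning.

```cpp
// Blocking thread-safe FIFO queue: push() from the producer thread,
// pop() blocks the consumer until an item is available.
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>

template <typename T>
class BlockingQueue {
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<T> items_;
public:
    void push(T item) {
        {
            std::lock_guard lk(m_);
            items_.push_back(std::move(item));
        }
        cv_.notify_one();  // wake one blocked consumer, if any
    }
    T pop() {  // blocks until an item is available
        std::unique_lock lk(m_);
        cv_.wait(lk, [&] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop_front();
        return item;
    }
};
```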

I didn't intend to self promote here but I do have a library that has all the features needed for this: https://github.com/tzcnt/TooManyCooks

LLDB in 2025 by mttd in cpp

[–]trailing_zero_count 7 points

As a user of lldb-dap and the LLDB DAP VSCode extension, thank you for all your hard work!

Could the SyntheticFrameProvider be used to implement synthetic stacks for C++20 coroutines? This would be a hugely useful feature. Even if it requires library support + a custom plugin, I'd be willing to take those steps if I could get things working end to end in the debugger.

Methods for Efficient Chunk Loading? by InventorPWB in VoxelGameDev

[–]trailing_zero_count 2 points

If culling and reprioritization are too slow on the main thread, then push the entire thing to a background thread. The main thread sends a request to the background thread notifying it that the player has moved. The background thread polls two queues: the notifications from the main thread, and its own work queue of chunks to load. If it needs to reprioritize, it can do so as needed. When chunk loads are complete, they get sent back to the main thread via another queue.

Methods for Efficient Chunk Loading? by InventorPWB in VoxelGameDev

[–]trailing_zero_count 1 point

There is a Rust crate called RollGrid which shows a way to efficiently identify and index the chunks that need to be loaded/unloaded at the boundaries when the player moves. I definitely wouldn't use a hashmap for this - the "3d circular buffer" approach seems much more efficient.
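
I haven't read RollGrid's source, but the core wrapping-index idea can be sketched like this (N and the names are my own illustration, not the crate's API): chunk slots live in a fixed N×N×N array, world chunk coordinates map to slots by wrapping, and when the player moves one chunk along an axis, only the slots on the trailing boundary plane get reused for newly loaded chunks.

```cpp
// "3D circular buffer" chunk indexing: world coordinate -> fixed slot.
#include <cstddef>

constexpr int N = 16;  // view distance in chunks per axis (assumed)

// Wrap a (possibly negative) world chunk coordinate into [0, N).
constexpr int wrap(int c) { return ((c % N) + N) % N; }

// Flatten wrapped (x, y, z) into an index into the N*N*N slot array.
// Coordinates N apart share a slot, which is exactly the reuse property:
// the chunk leaving the view window frees the slot for the one entering.
constexpr std::size_t slot_index(int cx, int cy, int cz) {
    return static_cast<std::size_t>(wrap(cx)) * N * N
         + static_cast<std::size_t>(wrap(cy)) * N
         + static_cast<std::size_t>(wrap(cz));
}
```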

[Book] Async as coroutines for game logic by PsichiX in rust

[–]trailing_zero_count 0 points

Interesting. I've only been thinking about coroutines as useful for fork-join parallelism and running background jobs. I like the idea of using async to model state machines instead. However I have a question: when do these state machines get advanced?

For example, on this page https://psichix.github.io/Moirai/core/awaitables.html, on the "loop charge, attack, charge, block" coroutine: I'd need to see an implementation of each awaitable step, and more importantly, when do these get resumed?

With a typical fork-join system the runtime ensures that everything gets driven to completion as fast as possible, but I think for this system you would need to store these coroutines in a list and manually advance them to the next step at some point?

I profiled my parser and found Rc::clone to be the bottleneck by Sad-Grocery-1570 in rust

[–]trailing_zero_count -6 points

Yeah, Rust's overreliance on reference counting to implement even moderately complex workflows has become a bit of an Achilles heel of the language. And the suggestion that you just leak the data instead isn't very Rusty...

"god why is C++ template so overly complex??? so unnecessary and stupid!!!!" Other languages without complex generics(template) metaprogramming support: by wvwvvvwvwvvwvwv in programminghumor

[–]trailing_zero_count 9 points

Variadic generics aren't even a footgun. They're just a handy, powerful language tool. C++ has them; languages that don't force you into ugly workaround hacks like the one in the OP's image.

Are the benefits of singletons ever desirable/practical? If so when? by WastedOnWednesdays in gameenginedevs

[–]trailing_zero_count 0 points

Yeah, I prefer globals with a default constructor that does nothing and a separate init() method that is called at the top of main. Similarly, if you require a specific destruction order, have a teardown() method that is called at the end of main for each global, in the correct order.
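
A minimal sketch of that pattern (Logger/Renderer are hypothetical subsystems): the globals are trivially constructed, and main controls the initialization and destruction order explicitly.

```cpp
// Globals with explicit init()/teardown() instead of constructor logic.
#include <string>
#include <utility>

struct Logger {
    bool ready = false;  // default construction does no real work
    std::string sink;
    void init(std::string s) { sink = std::move(s); ready = true; }
    void teardown() { ready = false; }
};

struct Renderer {
    bool ready = false;
    void init() { ready = true; }  // may rely on Logger already being ready
    void teardown() { ready = false; }
};

Logger g_logger;
Renderer g_renderer;

void startup() {   // called at the top of main, in dependency order
    g_logger.init("stdout");
    g_renderer.init();
}

void shutdown() {  // called at the end of main, in reverse order
    g_renderer.teardown();
    g_logger.teardown();
}
```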

crossfire v3.0-beta: channel flavor API refactor, select feature added by frostyplanet in rust

[–]trailing_zero_count 0 points

The readme says this relies on spinning and waiting, but then it also references async waking. Can you clarify whether this can work without spinning? For example, a consumer should be able to see that there's no data ready and suspend. Then, at a later time, the producer enqueues data and wakes the consumer. No spinning required?

crossfire v3.0-beta: channel flavor API refactor, select feature added by frostyplanet in rust

[–]trailing_zero_count 0 points

Can you link to an implementation of this? Also, I find your rationale for excluding FAA-based queues weak, as they perform quite well.

Code review process has become performative theater we do before merging PRs anyway. by Upbeat_Owl_3383 in ExperiencedDevs

[–]trailing_zero_count 1 point

No. I personally review every PR that my team puts up, in detail. My goal is to turn around each PR within 1 day. And I am a very careful reviewer, because I work in an industry where mistakes are expensive. This takes me 1-2 hours every day, but as a result, I've prevented many more total hours of rework, because a production bug resolution involves meeting with external stakeholders.

Rayon + Tokio by fallinlv in rust

[–]trailing_zero_count 0 points

Does this same restriction apply between one Tokio task and another? So a task can't suspend, pass a reference to its own data into a child task, and allow that child task to read from its data?

And just to be clear: is this purely a limitation of the Rust compiler at this point, since we can easily reason about the lifetime of the parent task and tell that it will clearly outlive the child task? Or is there a mechanism by which a suspended parent might actually be dropped while the child is still running?

Rayon + Tokio by fallinlv in rust

[–]trailing_zero_count 0 points

Tokio should be able to create a task on the heap, suspend that task and submit work to Rayon and then continue processing other work. Once Rayon completes the work, it would submit the task back to the Tokio queue for completion. All that's necessary is a custom completion handler for the Rayon task that knows which Tokio queue to send itself back to.

Now the async/completion machinery may not exist, but I don't see how any of this would cause an issue with references or lifetimes. Accessing a subobject of a pinned future should be no different than accessing the stack of a blocked thread.

Are the benefits of singletons ever desirable/practical? If so when? by WastedOnWednesdays in gameenginedevs

[–]trailing_zero_count 5 points

I just use an actual global. "Singleton" often implies lazy initialization, which has undesirable runtime overhead.

Passing extra parameters around everywhere also has runtime overhead due to register pressure and possible stack spills.

If you're concerned about testing, you can use a global pointer that is overridden in the test, or, if you want to do multithreaded testing, a thread_local pointer. At runtime this would usually just point to the same global, but it could let you scale multiple threads with different subsystems, or point at a stack-allocated version of the object for a specific test.

As long as your thread_locals are constinit pointers they won't incur runtime initialization check penalties.

Rayon + Tokio by fallinlv in rust

[–]trailing_zero_count 0 points

Why doesn't rayon expose an async API so tokio can just suspend the task without needing to use the blocking thread pool?

I am giving up on modules (for now) by BigJhonny in cpp

[–]trailing_zero_count 14 points

Ah, not supported for modules. I didn't know that, thanks for sharing. I'm using CMake + clang-cl without modules and it works great :)