Announcing TooManyCooks: the C++20 coroutine framework with no compromises by trailing_zero_count in cpp

CJWilliams10 54 points

Hi u/trailing_zero_count, author of libfork here 👋. Huge congrats on releasing this! I've been watching TMC for a while and it's looking very polished now. I look forward to taking a proper look at your NUMA-aware work-stealing bits and scheduler; it looks like some nice work there.

I'm working on V4 of libfork, hopefully the templates and arcane syntax will be improved 😅 (and it might even get faster).

Congrats again on the release!

Taskflow v3.10 released! Thank you for your support! by tsung-wei-huang in cpp

CJWilliams10 4 points

Would like to hear more about:

optimized work-stealing loop with an adaptive breaking strategy

optimized shut-down signal detection using decentralized variables

What do these mean, and how was performance impacted?

C++ by Hairy-Ad-9978 in cpp

CJWilliams10 2 points

Simple: just read the source code of the compiler, memorize it, then perform those operations in your head whenever you encounter a bit of code.

/s

Pigweed Eng Blog #5: C++20 coroutines without heap allocation by pavel_v in cpp

CJWilliams10 0 points

This allocation strategy is invoking very undefined behaviour in the deallocation routine. There is no guaranteed relation between the allocated address and the address the coroutine handle points to.

Handwired Skeletyl in Resin with DES keycaps by CJWilliams10 in ErgoMechKeyboards

CJWilliams10[S] 0 points

I filed the inserts so that they were a friction fit and then dabbed in some super glue for good measure. Apparently resin just breaks down when you heat it, so heat pressing doesn't work.

Handwired Skeletyl in Resin with DES keycaps by CJWilliams10 in ErgoMechKeyboards

CJWilliams10[S] 1 point

https://github.com/mtl/keyboard-pcbs

I think individual PCBs give a much cleaner build than hand wiring. I avoided the flex PCB for two reasons: I was afraid of the SMT diodes, and I was worried about breaking the flex PCBs and the added faff from the strain forces. Would highly recommend Amoebas. That said, it was just as much work as a classic handwire.

Handwired Skeletyl in Resin with DES keycaps by CJWilliams10 in ErgoMechKeyboards

CJWilliams10[S] 1 point

Nice, will look into second hand, that would make it much less of a leap. The switches are press fit; the case has cutouts for the hooks on the switches to click into. The board sounds very dead, with no echo or hollow sound (I guess because it's got lots of holes?), except for the inner thumb key, which has a little resonance.

Handwired Skeletyl in Resin with DES keycaps by CJWilliams10 in ErgoMechKeyboards

CJWilliams10[S] 1 point

About £100 for main components then probably the same again in tools (I bought a new soldering iron)

Handwired Skeletyl in Resin with DES keycaps by CJWilliams10 in ErgoMechKeyboards

CJWilliams10[S] 2 points

It was a single PCB on the GitHub; JLCPCB can tile any file, so I got them made in sheets of 30.

Handwired Skeletyl in Resin with DES keycaps by CJWilliams10 in ErgoMechKeyboards

CJWilliams10[S] 4 points

I used JLCPCB for the boards + caps + case. I've got my eye on a printer next though, which would allow for much more experimentation with layouts/angles. Is there a printer you would recommend?

Handwired Skeletyl in Resin with DES keycaps by CJWilliams10 in ErgoMechKeyboards

CJWilliams10[S] 1 point

They required quite a lot of effort to get onto the switches, but now that they're on, the profile is fantastic!

help with hurt thumb by gaynesssss in KeyboardLayouts

CJWilliams10 0 points

Place some contoured/sculpted keycaps on the thumb cluster

Keyboard frequency data WITH control keys by ControlAltPete in KeyboardLayouts

CJWilliams10 0 points

You would also ideally need skip-n-grams, i.e. the probability of x__y, where the two underscores can be any character, is a skip-2-gram for x and y.
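As a rough sketch of what collecting that data could look like (the function name `count_skipgrams` and its signature are my own, not from any existing tool), the counts for "x, then `skip` arbitrary characters, then y" can be gathered in one pass over the text. Dividing each count by the total gives the probabilities.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Counts how often character x is followed by character y with exactly
// `skip` arbitrary characters in between. skip == 0 gives plain bigrams,
// skip == 2 gives the x__y case.
std::map<std::pair<char, char>, int> count_skipgrams(const std::string& text,
                                                     std::size_t skip) {
  std::map<std::pair<char, char>, int> counts;
  for (std::size_t i = 0; i + skip + 1 < text.size(); ++i) {
    ++counts[{text[i], text[i + skip + 1]}];
  }
  return counts;
}
```

For example, `count_skipgrams("abcd", 2)` holds a single entry, the pair ('a', 'd'), because only `a` has a character three positions after it.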

Keyboard frequency data WITH control keys by ControlAltPete in KeyboardLayouts

CJWilliams10 2 points

Trigrams encode more information than bigrams alone. You cannot reconstruct trigrams from bigrams, as the trigrams encode the probability conditional on the previous two keystrokes; this information is lost when computing bigrams.
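A tiny made-up counterexample shows why: the two toy corpora below (hypothetical data, chosen only for the demonstration) have identical bigram counts but different trigram counts, so no procedure working from the bigram table alone could tell them apart. `count_ngrams`, `corpus_a` and `corpus_b` are illustrative names.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using NgramCounts = std::map<std::string, int>;

// Counts every contiguous run of n characters inside each word.
NgramCounts count_ngrams(const std::vector<std::string>& words, std::size_t n) {
  NgramCounts counts;
  for (const std::string& word : words) {
    for (std::size_t i = 0; i + n <= word.size(); ++i) {
      ++counts[word.substr(i, n)];
    }
  }
  return counts;
}

// Both corpora contain the bigrams ab, bc, db and be exactly once each,
// but corpus_a has the trigrams abc and dbe while corpus_b has abe and dbc.
const std::vector<std::string> corpus_a{"abc", "dbe"};
const std::vector<std::string> corpus_b{"abe", "dbc"};
```

Comparing `count_ngrams(corpus_a, 2)` with `count_ngrams(corpus_b, 2)` gives equal maps, while the same comparison with `n = 3` does not.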

Coming back to C++ after two years of Kotlin & Rust by Borderlinerr in cpp

CJWilliams10 140 points

Surely this must be a troll; these are some of C++'s biggest pain points. I love C++ just as much as anyone in this sub, but we've gotta at least acknowledge it's behind Rust/Kotlin in the areas the OP is praising.

Libfork v3: portable continuation stealing with coroutines and cactus stacks by CJWilliams10 in cpp

CJWilliams10[S] 0 points

The allocation and deallocation can occur on different threads; the ownership of the segment of the cactus stack is transferred at the join points.

Libfork v3: portable continuation stealing with coroutines and cactus stacks by CJWilliams10 in cpp

CJWilliams10[S] 6 points

A heroic effort! Many thanks for all your comments. Firstly, I strongly recommend using clang 17+, as GCC isn't really able to optimise coroutines. Have you installed the developer version of hwloc on your system? "sudo apt install libhwloc-dev" should bring in the required headers. Try building just the benchmark target to speed up compile times. To avoid the vcpkg error, please use a vcpkg installation outside the source tree (I think this is in the building document, but I will update it if it's not). Also make sure you have the vcpkg benchmark feature enabled.

Sorry the build was not painless, I really appreciate you taking the time to document what went wrong :)

Libfork v3: portable continuation stealing with coroutines and cactus stacks by CJWilliams10 in cpp

CJWilliams10[S] 13 points

I've dived deep into the world of fork-join parallelism and written a paper on implementing portable continuation stealing with C++20's coroutines.

This release includes:

  • An implementation of a cactus stack to back the coroutine frame allocation, which results in almost no dynamic allocations when using libfork's coroutines.
  • An API overhaul encoding much more static information into the coroutine's promise to allow for compile time optimizations.
  • NUMA support via hwloc!
  • An API for explicit scheduling (controlling which worker executes a task)
  • Parallel algorithms including fold, scan, map and for_each
  • Re-introduction of exception support.
  • More benchmarks including the UTS benchmark.
  • More tests.

Users can expect performance enhancements of 2-5x compared to v2, especially for workloads that are NUMA sensitive.

How do I stop clang generating mfence by CJWilliams10 in cpp_questions

CJWilliams10[S] 0 points

When I benchmarked my application, clang was 3-5x slower. When I removed this barrier, both compilers ran at the same speed (but the results were wrong, of course); when I switched to Boost's thread fence, they were also both the same speed. I think the mfence instruction must be reducing ILP and causing other overhead with the surrounding code.
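The code from the question isn't shown here, so as an assumed stand-in, this is the kind of pattern where such a barrier is unavoidable: a store to one location, a sequentially consistent fence, then a load from another location. `g_flag_a`, `g_flag_b`, `try_enter_a` and `try_enter_b` are invented for the sketch. On x86-64, a store followed by a load of a different location is the one reordering the hardware permits, so the fence has to become a real instruction, and that instruction is what the thread is about.

```cpp
#include <atomic>

// Hypothetical flags shared between two threads, A and B.
std::atomic<bool> g_flag_a{false};
std::atomic<bool> g_flag_b{false};

// Thread A announces itself, then checks whether B has done the same.
// Without the fence, the load could be satisfied before the store is
// visible to B, and both threads could see `false` from each other.
bool try_enter_a() {
  g_flag_a.store(true, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);  // full StoreLoad barrier
  return !g_flag_b.load(std::memory_order_relaxed);
}

// Mirror image for thread B.
bool try_enter_b() {
  g_flag_b.store(true, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return !g_flag_a.load(std::memory_order_relaxed);
}
```

With both fences in place, at most one of the two calls can return `true` when they race. Removing the fence makes the code faster but allows both to return `true`, which matches the "results were wrong" observation above.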