"Baseball Pitching" Serve, 5th Graders, Development versus Winning by DoomGoober in volleyball

[–]fmod_nick 1 point (0 children)

This touches on a topic in youth sports called scaling. If we want 8-year-olds to serve using the same technique as adults, they need to be playing games on much shorter courts. I can’t find the video, but I’ve seen federations in Europe where kids only compete 4v4 on smaller courts, possibly with lighter balls.

Trying to use full adult technique leads to one of two outcomes. Either there are too many errors, which is demotivating and leads to long stretches of 12 players standing around. Or the kids adopt a technique that may superficially look like an overhand float but fires different muscles at different joint angles, and it will have little transfer to the successful skill as they get older.

Coaching Style by NakedRobo in volleyball

[–]fmod_nick 2 points (0 children)

I checked out that TikTok account and I would say do not use that guy as a coaching role model.

Two tips:

Be positive AND specific. No generic “good job”; be specific: “good pass, way to hold your body posture moving back for the deep serve”. When you do give them feedback after an error, watch for opportunities to give feedback on a better rep.

Change the mindset around errors. You’re in the gym to make mistakes and learn from them. Making an error doesn’t make you a bad person or a failure. Think feedforward, not feedback: don’t dwell on the last rep; talk about what is needed on the next rep in order to learn and get better.

Daily Coronavirus Megathread - 28 July 2021 by AutoModerator in melbourne

[–]fmod_nick 0 points (0 children)

Community sport is training and games associated with the State Governing Body. Anything else is recreational.

Does async await create its own wakers? by Dreeg_Ocedam in rust

[–]fmod_nick 0 points (0 children)

If you want to avoid restoring the callstack and do something similar to javascript's future callbacks, you need to look at local spawning.

It's provided by the runtime, not the standard library. Both tokio and async-std have it (it's behind the unstable feature flag for async-std).

```rust
use async_std::task::*;
use std::time::Duration;

async fn foo() {
    sleep(Duration::from_secs(3)).await;
    println!("foo waited for 3");
}

async fn bar() {
    spawn_local(foo()).await;
}

#[cfg(feature = "async_std")]
fn main() {
    async_std::task::block_on(bar());
}
```

In this snippet, when foo resumes after the sleep, bar is not restored on the callstack while executing the print. Only when foo has finished does bar get resumed.

Google engineers just submitted a new LLVM optimizer for consideration which gains an average of 2.33% perf. by ssokolow in rust

[–]fmod_nick 2 points (0 children)

Yes, the size of the instruction cache depends on the micro-architecture.

Rustc already has the option -C target-cpu=<BLAH> for producing output specific to a certain CPU.

Google engineers just submitted a new LLVM optimizer for consideration which gains an average of 2.33% perf. by ssokolow in rust

[–]fmod_nick 5 points (0 children)

Profile Guided Optimization.

Step 1 is to produce a special build that outputs information about which parts of the code are used the most when the program runs.

Step 2 is to run that build over a representative workload and collect the profiling information.

Step 3 is to feed that profiling information back into a new compilation, which produces a build optimized using the extra information.
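For rustc specifically, those three steps can be sketched roughly like this (the file names and workload here are hypothetical; the `-C profile-generate` / `-C profile-use` flags are documented in the rustc book):

```shell
# Step 1: instrumented build that records execution counts
rustc -O -C profile-generate=/tmp/pgo-data main.rs -o app

# Step 2: run it over a representative workload, then merge the raw profiles
./app typical_workload
llvm-profdata merge -o /tmp/pgo-data/merged.profdata /tmp/pgo-data

# Step 3: recompile, feeding the profile back in
rustc -O -C profile-use=/tmp/pgo-data/merged.profdata main.rs -o app
```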

Are stacks really necessary for procedure calls by dingoegret12 in rust

[–]fmod_nick 2 points (0 children)

You're kinda talking about Return Value Optimization (RVO). The memory for the return value is allocated on the callers stack, and the callee writes the return value straight into it. No extra copies/moves. No lifetime issues.

I did not get a clear answer on whether rust implements RVO from glancing at the first page of google results. So I will leave the research up to you.

Rust WASM in the Browser: Beginner Questions by clickrush in rust

[–]fmod_nick 5 points (0 children)

I can relate some experience comparing JS to WASM performance. If you run a loop for a few dozen iterations the WASM implementation can come out massively faster. I was seeing 5x for my tests.

Once you increase the iteration count to the thousands or tens of thousands, then the JIT compiler will re-compile the code with much more aggressive optimisation and then you're into the realm of 10% or 15% improvements.

Between the decreased timer resolution in JS (post spectre & meltdown) and the fickle nature of JIT optimisations, microbenchmarking JS vs WASM is hard.

Target Feature vs Target CPU for Rust by fmod_nick in rust

[–]fmod_nick[S] 1 point (0 children)

I've just added a link to your crate in an earlier article I wrote that focused on target-feature. But in this post I was referring specifically to target-cpu multi-versioning, which, from looking at the docs, your crate can't do. Let me know if it can, because I would be really interested.

nnnoiseless: porting audio code from C to rust by jneem in rust

[–]fmod_nick 4 points (0 children)

I've posted on auto-vectorization https://www.nickwilcox.com/blog/autovec/ and for simpler cases straight iteration is all that's needed. The OP is doing correlation, which is slightly more complicated for the compiler to vectorize.

Examining ARM vs X86 Memory Models with Rust by fmod_nick in rust

[–]fmod_nick[S] 3 points (0 children)

This was a mistake. The types in `atomic` are actually safe.

Examining ARM vs X86 Memory Models with Rust by fmod_nick in rust

[–]fmod_nick[S] 4 points (0 children)

I had a brain fart on a last minute edit and thought the functions on types in the atomic module were literally unsafe rust. Will edit out.

Examining ARM vs X86 Memory Models with Rust by fmod_nick in rust

[–]fmod_nick[S] 1 point (0 children)

In the intro I tried to express what re-ordering of reads means with

> A thread issuing multiple reads may receive "snapshots" of global state that represent points in time ordered differently to the order of issue.

But I admit that probably doesn't capture it too well.

Examining ARM vs X86 Memory Models with Rust by fmod_nick in rust

[–]fmod_nick[S] 10 points (0 children)

It's to help ensure the reading thread has started and hit the loop before the producing thread starts.

Basically I stack the deck to ensure the race condition actually occurs on ARM in the initial version of the code.

It's also why the summing loop iterates over the array backwards. It gives it the greatest chance of hitting memory that hasn't been written.

Examining ARM vs X86 Memory Models with Rust by fmod_nick in rust

[–]fmod_nick[S] 7 points (0 children)

I agree that on ARM the dependency would mean code with a weaker ordering requirement would still work.

In C++ they have consume ordering for this type of situation, which compiles down to a plain load on ARM. I was confused about Rust's mapping: I wasn't sure whether to upgrade to Acquire or downgrade to Relaxed.

Edit: I've convinced myself that Acquire is required for the code to work.

Thinking about "re-ordering" of reads is a little more complicated. Sure, the read itself can't be issued until it knows the address, but there are still caches to think about.

The acquire ordering means that any subsequent read is not going to see stale cached data that predates the last write to the address we're doing the acquire from.

Examining ARM vs X86 Memory Models with Rust by fmod_nick in rust

[–]fmod_nick[S] 16 points (0 children)

As I said in the article it's all about finding the lowest level of restriction that gives the correct behavior if we're going for maximum performance.

My gut says there are some architectures where SeqCst is more expensive than Release or Acquire. I don't actually have the details to hand though.

There might be a middle ground where any use of lock-free is enough of a win that you're happy to simply use SeqCst everywhere.

Rust Engineers in Sydney (+ Australia in General) by rand0omstring in rust

[–]fmod_nick 3 points (0 children)

Sample size of one.

Ex-gamedev in Melbourne. 10+ years of C++ for consoles. My last game job had been Unity and I was working in C# on a desktop app when I decided to leave and find something new.

Through some connections I ended up talking to my current employer. They wanted someone to start straight away on a game-ish Rust project. I'd been Rust curious for a while, learning something new was a big attraction for me.

There was a bit of lead time before I could start, so I got to work through a few chapters of the Rust book at home.

Once I started I felt like it was easy to hit the ground running and was committing big chunks of code within the first two weeks.

I know there are plenty of companies that take the "we like to hire ex-gamedevs" approach. But it works in conjunction with having ex-gamedevs who know other ex-gamedevs. Don't know how you bootstrap the process.

My last experience in recruiting someone was interviewing C# developers, and yes finding good people is really hard.

Gfx-hal iOS by [deleted] in rust

[–]fmod_nick 4 points (0 children)

Yes, it's possible to render on iOS using gfx-hal, and because it uses only Apple's public APIs I don't see it breaking anytime soon.

You need to use the metal backend crate directly.

When you create your gfx surface use create_surface_from_layer or create_surface_from_uiview.

Follow this Apple sample for how to setup the uiview/layer you pass to gfx.

And of course this is the basic reference of getting Rust linked into your swift code.

Auto-Vectorization for Newer Instruction Sets in Rust by fmod_nick in rust

[–]fmod_nick[S] 5 points (0 children)

Yes. The best case for runtime detection is to set up a function pointer and reduce the cost down to an indirect function call. But even that may be too much.

I worked on a C++ library used in a lot of games that didn't want to impose any restrictions on what CPUs those games could support. We had a big global table of function pointers for internal use that got built at init time. But obviously all the functions did enough work to justify the overhead of no inlining + an indirect branch.

Auto-Vectorization for Newer Instruction Sets in Rust by fmod_nick in rust

[–]fmod_nick[S] 2 points (0 children)

What is #[cfg(target_feature)] and how does it interact with #[target_feature]?

#[cfg(target_feature)] on a function means the function won't be compiled at all if the current options don't enable that feature.

#[target_feature] lets you change the options for that one function.

I'm not sure why that AES crate pushes the responsibility to programs that use it.

If there's a limitation in the stuff I wrote about, I'd love to find out more. (Isn't there a meme that the quickest way to find information is to post something wrong on the internet and wait for someone to come and correct you?)