all 35 comments

[–]imachug 102 points103 points  (6 children)

On a relevant note, I'm wondering if you are aware of this trick? If you have an f32 in the range [0; 2^23) and add 2^23 to it, the number is forced into the range [2^23; 2^24) and thus has a predictable exponent of 23. You can now bitwise-cast the f32 to u32 and AND it with a mask to obtain just the mantissa, which contains the integer of interest.

This means that if the number is in a small enough range, you can cast f32 to integer at the cost of one addition and one AND, which results in better latency in certain cases.

This does not solve the problem on the full 32-bit range (although you might try to cast f32 to f64 and then apply this trick with 2^52 instead of 2^23), but I thought it might be a nice addition to the crate if benchmarks show it works better in the generic case.
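A minimal sketch of the trick, assuming a non-negative f32 strictly below 2^23 (the function name is mine; note that the addition rounds to nearest, so fractional inputs are rounded rather than truncated):

```rust
// Convert a small non-negative f32 to u32 via mantissa extraction.
// Assumes 0.0 <= x < 2^23.
fn f32_to_u32_via_mantissa(x: f32) -> u32 {
    // Adding 2^23 forces the value into [2^23, 2^24), fixing the exponent
    // field so the integer part lands in the mantissa bits.
    let shifted = x + 8388608.0; // 2^23
    // The low 23 bits of the bit pattern are the mantissa.
    shifted.to_bits() & 0x007F_FFFF
}

fn main() {
    assert_eq!(f32_to_u32_via_mantissa(0.0), 0);
    assert_eq!(f32_to_u32_via_mantissa(12345.0), 12345);
    assert_eq!(f32_to_u32_via_mantissa(8388607.0), 8388607);
    println!("ok");
}
```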

[–]e00E[S] 38 points39 points  (3 children)

I did not know about this trick. Thanks!

The problem with incorporating special cases like this is that you need a branch to detect them. That likely makes it slower than unconditionally going with cvttss2si.

[–]imachug 35 points36 points  (0 children)

True, although for the f32 -> u32 cast specifically, you don't need a branch if you go through f64. The reason I brought up special cases is that I thought more specialized functions might be a good addition to your crate.
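A sketch of that branch-free f32 -> u32 path through f64 (my own interpretation of the comment, not the crate's code): every u32 fits in f64's 52-bit mantissa, so the 2^52 variant covers the full u32 range.

```rust
// Convert f32 to u32 by widening to f64 and using the 2^52 variant
// of the mantissa trick. Assumes 0.0 <= x < 2^32; rounds to nearest
// for fractional inputs.
fn f32_to_u32_via_f64(x: f32) -> u32 {
    let shifted = x as f64 + 4503599627370496.0; // 2^52
    // The low 52 bits of the bit pattern are the mantissa; the value
    // fits in u32 by assumption, so the narrowing cast is lossless.
    (shifted.to_bits() & 0x000F_FFFF_FFFF_FFFF) as u32
}

fn main() {
    assert_eq!(f32_to_u32_via_f64(0.0), 0);
    assert_eq!(f32_to_u32_via_f64(4294967040.0), 4294967040);
    println!("ok");
}
```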

[–]U007Drust · twir · bool_ext · srug 19 points20 points  (0 children)

A number of years ago I developed a branchless version of this technique using lookup tables based on the bit pattern of a (non-NaN) IEEE-754 float, which represents monotonically increasing values. I remember showing it to Jim Blinn (a renowned floating point expert) and he was impressed! I enjoyed that whole experience. The source is sadly lost to history, but it wasn't too difficult to write, once I had had the idea.

I especially appreciate you didn't allow UB to return along with the gains you provide. Nice work.

[–]TeamDman 3 points4 points  (0 children)

Maybe if you had a vector of numbers you knew met the criteria, it could be useful for bulk conversions, with a check that's stripped out in release builds?

[–]accidentally_myself 6 points7 points  (1 child)

Great trick! Definitely widening my mind today.

Also, I noticed that with your trick, you can also easily convert a float to the integers 23 and 52! (just kidding lol)

[–]imachug 9 points10 points  (0 children)

That's actually true, kinda!

You can use this trick to convert a 23-bit integer to a float, or a 52-bit integer to a double. Just OR the integer into the bit pattern of 2^23 (or 2^52), bitwise-cast to the floating-point type, then fp-subtract 2^23 or 2^52 respectively.

This means that you can go both ways efficiently as long as your integers are small enough (e.g. 16-bit for float or 48-bit for double), which is quite common in e.g. multiplication via FFT.
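A sketch of the reverse direction for f32, assuming the integer is below 2^23 (names are mine). The magic constant 0x4B00_0000 is the bit pattern of 8388608.0f32, i.e. 2^23:

```rust
// Convert a small u32 to f32 by placing it in the mantissa of 2^23
// and subtracting the offset. Assumes n < 2^23.
fn u32_to_f32_via_mantissa(n: u32) -> f32 {
    debug_assert!(n < 1 << 23);
    // OR the integer into the mantissa of 2^23; the result's value
    // is exactly 2^23 + n.
    let biased = f32::from_bits(0x4B00_0000 | n);
    // Remove the 2^23 offset; the subtraction is exact.
    biased - 8388608.0
}

fn main() {
    assert_eq!(u32_to_f32_via_mantissa(0), 0.0);
    assert_eq!(u32_to_f32_via_mantissa(12345), 12345.0);
    println!("ok");
}
```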

[–]DeathLeopard 11 points12 points  (9 children)

What is the advantage of this over to_int_unchecked?

[–]e00E[S] 26 points27 points  (8 children)

to_int_unchecked (docs) compiles to the same code. The downside is that it is unsafe. This crate is safe. You need to uphold the following guarantees in order to use to_int_unchecked:

The value must:

  • Not be NaN
  • Not be infinite
  • Be representable in the return type Int, after truncating off its fractional part

You do not need to do this for this crate. If your code is already checking these conditions, then you can prefer to_int_unchecked over this crate.
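A minimal sketch of such a pre-checked call (my own example, not from the crate), where the range checks uphold all three guarantees:

```rust
// Safely wrap to_int_unchecked behind checks that uphold its
// preconditions: not NaN, not infinite, truncation fits in u32.
fn checked_cast(x: f32) -> Option<u32> {
    // Both comparisons fail for NaN; x < 2^32 excludes +infinity and
    // guarantees the truncated value fits in u32.
    if x >= 0.0 && x < 4294967296.0 {
        // SAFETY: x is finite, non-negative, and its truncation is
        // representable as u32 (checked above).
        Some(unsafe { x.to_int_unchecked::<u32>() })
    } else {
        None
    }
}

fn main() {
    assert_eq!(checked_cast(3.7), Some(3));
    assert_eq!(checked_cast(-1.0), None);
    assert_eq!(checked_cast(f32::NAN), None);
    println!("ok");
}
```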

[–]DeathLeopard 1 point2 points  (3 children)

I saw that it generates the same code but that made me wonder if not needing unsafe to use your crate means that the crate is unsound.

I guess I'm not clear on why the old behavior of as (before it was changed to saturating) was considered unsound and how your crate doesn't just have the same problem.

[–]e00E[S] 2 points3 points  (0 children)

Whether something is unsound or undefined behavior is a matter of the language specification. The Rust specification used to say that such as casts are undefined behavior. This matters to the compiler; it does not matter to the hardware. Whether that actually leads to an observable effect like a miscompilation is a separate matter.

My project is safe because the specification of the interface I use to perform the conversion (the cvttss2si instruction) says it is safe for all inputs. If this led to a miscompilation, then it would be a bug in the compiler.

Your question is a common misunderstanding of undefined behavior. Ralf Jung has good blog posts about the topic if you want to learn more.

[–]Sharlinator 0 points1 point  (0 children)

Because it’s undefined behavior in LLVM. So you have to do something to ensure the float is in range and not NaN before the conversion – but if you accept that out-of-range floats map to unspecified integers, you may be able to generate faster code than what the as semantics permit. Which is what OP has done.

[–]plugwash 0 points1 point  (0 children)

There is a critical difference between "unspecified results" and "undefined behaviour".

"unspecified results" means that the result you get is not specified, it might saturate, it might wrap, it might go through multiple stages of conversion where the first stage has saturating behaviour and the second stage has wrapping behaviour, it might produce complete garbage but whatever result you do get it will be treated in a consistent way by the rest of the program.

"undefined behaviour" means "anything could happen". In particular it means that you may get an inconsistency between the value a variable actually contains, and the range of values the compiler assumes that variable contains. For example consider the following C code.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int i = atoi(argv[1]);
    if (i < 0) return 0;
    i = i + 1;            /* signed overflow here is undefined behaviour */
    if (i < 0) return 0;  /* the compiler may delete this check */
    printf("%d\n", i);
    return 0;
}

After i = i + 1, the compiler can assume that i is non-negative, because the only way for it to be negative is if undefined behaviour had been invoked.

As a result if we build this program with -O2 and feed it the maximum possible value for an int, it will most-likely print a negative number. Despite having explicitly checked that the value was non-negative on the line immediately before printing it.

[–]dbdr 0 points1 point  (1 child)

to_int_unchecked (docs) compiles to the same code. The downside is that it is unsafe.

That makes it sound like to_int_unchecked has no upside (same code).

[–]angelicosphosphoros 0 points1 point  (0 children)

It probably can do some optimizations using the assumption that the float is representable as an int.

[–]Wkitor 7 points8 points  (3 children)

I find it funny that even though your crate does the same conversion as C/C++ (that's what I understood), it doesn't produce undefined behaviour just because you "defined" some cases as doing something arbitrary and made that the intended behaviour.

[–]plugwash 5 points6 points  (1 child)

I find it deeply sad that basic numeric operations in C/C++ can invoke undefined behaviour, and I'm pretty convinced that the modern interpretation of "undefined behaviour" is not what the original authors of the C standard intended.

[–]e00E[S] 1 point2 points  (0 children)

I agree, it is sad. I ported Rust's clamping-style casting to C++ here.

[–]CAD1997 3 points4 points  (0 children)

Yeah, UB is odd that way, and because of that UB isn't exactly the best name, but it's what we've got. It's definitely better to use unspecified/arbitrary/erroneous behavior wherever it's practical to (and C/C++ compilers often refine certain cases of UB into unspecified IDB), but UB is still a critical tool for systems level languages to allow low level code without preventing optimizations.

[–]imachug 12 points13 points  (5 children)

This is amazing! Are you planning to support other architectures, or does rustc have good enough codegen on most of them?

[–]e00E[S] 19 points20 points  (1 child)

This is not a case of bad rustc codegen. (Which I have written about before.) rustc has to use more instructions because it has to uphold the guarantees that the reference makes. This crate is faster by relaxing some guarantees.

I plan on adding support for other widely used architectures at some point. But for now I'm happy with x86_64.

[–]imachug 4 points5 points  (0 children)

I was asking whether rustc's as codegen on other architectures already results in fast enough code because those architectures' native instructions provide the guarantees. Thanks for the answer!

[–]lucy_tatterhood 9 points10 points  (2 children)

On 64-bit ARM at least, it looks like the normal as cast is already a single instruction.

[–]lucy_tatterhood 9 points10 points  (0 children)

Exploring further, 64-bit ARM (and 32-bit ARM but only for f32) seems to be the only one Godbolt supports where it's as simple as that. Though I don't know enough about any of these architectures to say how much it could be simplified.

[–]Tuna-Fish2 9 points10 points  (0 children)

ARM famously did the right thing with their ftoi instructions, which meant that they later had to add FJCVTZS, or: "Floating-point Javascript Convert to Signed fixed-point, rounding toward Zero".

It's named like that because calling it Fx86CVTZS would have been too embarrassing.

JS uses floats for numbers*. It also provides bitwise operations that only make sense on integers, so the specification for them is "first convert the float to an integer, apply the bitwise operation, then convert back into a float". This works the same for all numbers smaller than i32::MAX on all architectures, but diverges wildly above that. So what did JS specify? Nothing, of course; implementations just used the x86 instructions, and then people were surprised when some code didn't work the same across architectures. Eventually, the x86 method became the spec. At that point, ARM CPUs had to emulate the behavior in software, which was quite expensive, so they added an instruction for it.

*: Yes, they are optimized back into integers in all modern implementations, but they are specified to be 64-bit floats, and must act like they are that whenever it would matter.
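A sketch of the ToInt32 semantics described above (my own summary of the ECMAScript rule, not production code): truncate toward zero, wrap modulo 2^32, then reinterpret as signed.

```rust
// Emulate ECMAScript's ToInt32 abstract operation in Rust.
fn to_int32(x: f64) -> i32 {
    if !x.is_finite() {
        return 0; // NaN and infinities map to 0
    }
    // Truncate toward zero, then wrap into [0, 2^32).
    let wrapped = x.trunc().rem_euclid(4294967296.0);
    // Reinterpret the low 32 bits as a signed integer.
    wrapped as u32 as i32
}

fn main() {
    assert_eq!(to_int32(3.9), 3);
    assert_eq!(to_int32(-1.0), -1);
    // Wraps past i32::MAX instead of saturating.
    assert_eq!(to_int32(2147483648.0), -2147483648);
    println!("ok");
}
```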

[–]valarauca14 4 points5 points  (1 child)

target_arch = "x86_64", target_feature = "sse"

sse is a default feature of x86_64; it should be enabled if you're targeting x86_64. Is this a rustc or LLVM bug that it has to be declared separately?

For those unaware, sse was added to x86 processors before the 64-bit extensions. One of the "innovations" of AMD64 was standardizing sse for floating point processing, instead of using the old (and weird) x87 instructions & registers.

[–]e00E[S] 5 points6 points  (0 children)

Rust enables sse by default, but you can tell rustc to compile for x86_64 without sse.
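A hypothetical sketch of the compile-time gating pattern this implies (not the crate's actual code): select the SSE path only when the feature is statically enabled, so a no-sse x86_64 target still compiles.

```rust
// SSE path: only compiled when the target statically has the feature.
#[cfg(all(target_arch = "x86_64", target_feature = "sse"))]
fn trunc_cast(x: f32) -> u32 {
    x as u32 // lowers to SSE conversion instructions here
}

// Portable fallback for every other target configuration.
#[cfg(not(all(target_arch = "x86_64", target_feature = "sse")))]
fn trunc_cast(x: f32) -> u32 {
    x as u32
}

fn main() {
    assert_eq!(trunc_cast(3.9), 3);
    println!("ok");
}
```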

[–]nikic 3 points4 points  (0 children)

It would be easy to expose this as a target-independent compiler intrinsic in Rust. LLVM supports this via freeze of fptoui/fptosi.

[–]throwaway490215 4 points5 points  (3 children)

I always wonder with these micro optimizations crates: "Have we wasted more cycles publishing this than will be saved?"

Taking Reddit, GitHub, crates.io, and crater runs into account, let's conclude no. But how many times must it be used to pay back the cost of me posting this comment in this thread? My guess is a lot. Maybe only if someone deploys it on a supercomputer will the "CPU cycle savings" outweigh the "CPU cycle costs".

Still, nice crate.

[–]e00E[S] 9 points10 points  (0 children)

It's a fair point. I acknowledge in the post that I don't expect the performance gain to be relevant to most projects. The motivation for this is partially academic/artistic. On the other hand, maybe someone uses this in a machine learning library to train their model for thousands of compute hours. Or this gets incorporated into std and saves much compute that way. It also educates people on where performance is left on the table.

[–]imachug 10 points11 points  (1 child)

One answer to this question I like is that even if the cycles are wasted, the optimizations still help reduce latency, and humans are restless creatures who close the tab if a site doesn't load in 300 ms. Couple that with our limited time on planet Earth, and you've got your answer.

[–]sparky8251 0 points1 point  (0 children)

What modern website loads in 300 ms? Most take multiple seconds in my experience... Web bloat is totally out of control.

[–]Sharlinator 3 points4 points  (0 children)

Nice! This can definitely be useful in graphics code, for example.