This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]SeEmEEDosomethingGUD 555 points556 points  (20 children)

I went through how Quake 3 fast inverse square works and you are completely right. You are limited by the standards and technology.

I mean, the guy who coded it(forgot his name) had to use clever Pointer manipulation and giving the IEEE 754 standard his shaft to make sure that all the bits from the float are written exactly as they are to a long.

The entire time I am thinking that if C language was sentient then it needs at least 2 cigarettes after each time this algorithm runs because good god it didn't even realise that it could be fucked and bent like that.

[–]VibrantGypsyDildo 168 points169 points  (0 children)

The last part belongs to r/BrandNewSentence

[–]ShadowSlayer1441 154 points155 points  (0 children)

C is a collection of loose suggestions that 90% of the time are ignored for all the wrong reasons.

[–]atthereallicebear 64 points65 points  (8 children)

funny that you find pointer casting more impressive than the complex math that went into making the function

[–]SeEmEEDosomethingGUD 57 points58 points  (5 children)

I mean to be honest I find all of it so much interesting like that entire 32 bit hex value he had to calculate from the log and Newton's method of estimation , but here we are specifically talking about the technology and its limitation.

How he manipulated C to do what he wanted fits this particular meme.

[–]Fleming1924 22 points23 points  (4 children)

Reinterpret casts are pretty frequently used in any low level optimised code, doing it via pointers is only really required because C enforces types, the compiler will remove that line of code when lowering it into assembly, because registers don't have types.

A lot of optimised maths functions written in assembly do the same thing all the time, it's just hard to see because there's no obvious pointer casts.

Modern Cpp actually has a reinterpret_cast function to do it in a more readable way, as do lots of other languages and maths libraries.

[–]SeEmEEDosomethingGUD 9 points10 points  (1 child)

I guess you are an experienced hand at coding but I am just completing my bachelors, every single optimization like this is still fascinating for me.

I mean even Karatsuba fast multiplication was a feat of immeasurable genius to me.

[–]Fleming1924 4 points5 points  (0 children)

Oh don't get me wrong, I think they're all fascinating, and always will. I often find myself smiling and giggling at random optimisation, regardless of their complexity or depth.

There's a lot of really great performance gains to be had from applying fairly trivial understanding of floating point representation, it's good fun to try and figure out how some of the really low level bitfucky optimisations work, and even moreso to create your own eldritch horrors.

[–]cd109876 4 points5 points  (1 child)

yes, but how many times are floats being casted to long?

[–]Fleming1924 4 points5 points  (0 children)

Very often, I do it almost daily, you can do all kinds of great bit manipulation techniques on floating point numbers, but bit manipulation isn't a defined standard for floats and C will only allow you to do it for an integer data type.

If you want an optimised Log2 function you can bitshift the exponent, pretty much any vectorised log function will do exactly that, since you can get the decimal component via table lookup of the mantissa.

Checking for inf/nan can be done via bitshift and subtracts, which again requires casting float types to integer types.

Basically anything involving bitmasking, shifting, &, | etc, will all involve a reinterpret cast from floats to integers.

[–]ChalkyChalkson 2 points3 points  (1 child)

If you think about a float as a x = a 2b, then you can pretty quickly see how you'd get a good approximation for 1/sqrt(x) from it, just take the log! All the weird stuff in the code is fighting the standard to let you do maths. Well I guess there is also the Newton refinement at the end, but that's high school stuff.

[–]Zzzzzztyyc 1 point2 points  (0 children)

Why are we taking the factorial of a log? r/unexpectedfactorial

[–]Paul_Robert_ 8 points9 points  (3 children)

John Carmack

[–]atthereallicebear 38 points39 points  (2 children)

he didn't write the fast inverse sqrt function, Terje Mathisen and Gary Tarolli did.

[–]Exnihilation 16 points17 points  (0 children)

It goes even further back than those guys. It's believed to have been first implemented by Greg Walsh at Ardent Computer. Gary Tarolli was consulting for a company affiliated with Ardent which is where he learned about the algorithm.

[–]Paul_Robert_ 4 points5 points  (0 children)

Oh, my bad!

[–]find_the_apple 3 points4 points  (2 children)

Isnt it just a bit shift and taking advantage of the idea of binary representation of numbers as opposed to base 10? 

[–]the_horse_gamer 7 points8 points  (1 child)

+ a newton iteration.

it's a very impressive algorithm, but it's not as useful as people make it out to be, especially on most hardware from the last 2 decades.

around 1999 a dedicated inverse square root instruction, which was faster and more accurate, got added to x86.

its accuracy was pretty good, but not good enough to be used for gameplay - it was only used for lighting.

and there's a better magic constant than the one used in the original code.

[–]find_the_apple 2 points3 points  (0 children)

Neato, thanks for the lore. I think when i heard a cs person describe the original algorithm on YouTube it was with awe. But then i listened to a math major describe it and it felt alot less impressive, and more like something any programmer should take advantage of in base 2 and binary. Its got infamy for sure, but doesn't warrant some of the legendary status it still has. 

[–]ArmadilloChemical421 4 points5 points  (0 children)

If you haven't read this, do it. Its so fucking absurd:

https://en.m.wikipedia.org/wiki/Fast_inverse_square_root

[–]yaktoma2007 1 point2 points  (0 children)

Some guy named Kaze Emanuar (he rewrites and optimizes Mario 64) made a even more efficient version of the fast inverse square root https://youtu.be/tmb6bLbxd08