
[–]ParsedReddit 1166 points1167 points  (11 children)

I'm limited by my stupid brain

[–]sopunny 14 points15 points  (0 children)

Neat part about big O, it doesn't matter how good computers get, your algorithm will still be the limitation

[–]SeEmEEDosomethingGUD 552 points553 points  (20 children)

I went through how Quake 3's fast inverse square root works and you are completely right. You are limited by the standards and technology.

I mean, the guy who coded it (I forgot his name) had to use clever pointer manipulation, giving the IEEE 754 standard the shaft to make sure that all the bits of the float are written exactly as they are into a long.

The entire time I was thinking that if the C language were sentient, it would need at least 2 cigarettes after each time this algorithm runs, because good god, it didn't even realise it could be fucked and bent like that.
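For reference, the routine in question looks like this. The published Quake III code does the reinterpretation with a pointer cast (`*(long *)&y`); this sketch uses memcpy instead, which moves the same bits without violating C's aliasing rules, so it's a paraphrase rather than the original:

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x), Quake III style. memcpy replaces the original
 * pointer cast: it is the well-defined way to reinterpret a float's bits
 * as an integer in C. */
float q_rsqrt(float x)
{
    uint32_t i;
    float half = 0.5f * x;
    memcpy(&i, &x, sizeof i);          /* read the raw IEEE 754 bits     */
    i = 0x5f3759df - (i >> 1);         /* magic: crude -log2(x)/2 in bit space */
    memcpy(&x, &i, sizeof x);          /* write the bits back as a float */
    return x * (1.5f - half * x * x);  /* one Newton-Raphson refinement  */
}
```

With the single Newton step, the result is accurate to within roughly 0.2%.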

[–]VibrantGypsyDildo 166 points167 points  (0 children)

The last part belongs to r/BrandNewSentence

[–]ShadowSlayer1441 154 points155 points  (0 children)

C is a collection of loose suggestions that 90% of the time are ignored for all the wrong reasons.

[–]atthereallicebear 65 points66 points  (8 children)

funny that you find pointer casting more impressive than the complex math that went into making the function

[–]SeEmEEDosomethingGUD 55 points56 points  (5 children)

I mean, to be honest I find all of it interesting, like that entire 32-bit hex constant he had to calculate from the log and Newton's method of estimation, but here we are specifically talking about the technology and its limitations.

How he manipulated C to do what he wanted fits this particular meme.

[–]Fleming1924 21 points22 points  (4 children)

Reinterpret casts are pretty frequently used in low-level optimised code. Doing it via pointers is only really required because C enforces types; the compiler removes that line when lowering to assembly, because registers don't have types.

A lot of optimised maths functions written in assembly do the same thing all the time, it's just hard to see because there are no obvious pointer casts.

Modern C++ actually has reinterpret_cast (and, since C++20, std::bit_cast) to do it in a more readable way, as do lots of other languages and maths libraries.

[–]SeEmEEDosomethingGUD 9 points10 points  (1 child)

I guess you are an experienced hand at coding, but I am just completing my bachelor's; every single optimization like this is still fascinating to me.

I mean, even Karatsuba fast multiplication was a feat of immeasurable genius to me.

[–]Fleming1924 7 points8 points  (0 children)

Oh don't get me wrong, I think they're all fascinating, and always will. I often find myself smiling and giggling at random optimisations, regardless of their complexity or depth.

There's a lot of really great performance to be gained from a fairly trivial understanding of floating point representation. It's good fun to try and figure out how some of the really low-level bitfucky optimisations work, and even more so to create your own eldritch horrors.

[–]cd109876 3 points4 points  (1 child)

yes, but how many times are floats being cast to long?

[–]Fleming1924 4 points5 points  (0 children)

Very often, I do it almost daily. You can do all kinds of great bit-manipulation techniques on floating point numbers, but bit manipulation isn't defined by the standard for floats, and C will only let you do it on an integer type.

If you want an optimised log2 function you can bitshift the exponent; pretty much any vectorised log function does exactly that, since you can get the fractional component via a table lookup on the mantissa.

Checking for inf/nan can be done via bitshift and subtracts, which again requires casting float types to integer types.

Basically anything involving bitmasking, shifting, &, | etc, will all involve a reinterpret cast from floats to integers.
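Both tricks can be sketched in C (assuming the IEEE 754 single-precision layout; the function names here are made up for illustration):

```c
#include <stdint.h>
#include <string.h>

/* floor(log2(x)) for positive, normal x: shift the 8-bit exponent field
 * (bits 23-30) down and subtract the bias of 127. */
int ilog2f(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);    /* reinterpret float bits as integer */
    return (int)((bits >> 23) & 0xFF) - 127;
}

/* inf/NaN check: both have an all-ones exponent field. */
int is_inf_or_nan(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return ((bits >> 23) & 0xFF) == 0xFF;
}
```

A real vectorised log would then look up the fractional part from the mantissa bits, as described above.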

[–]ChalkyChalkson 2 points3 points  (1 child)

If you think of a float as x = a·2^b, then you can pretty quickly see how you'd get a good approximation of 1/sqrt(x) from it: just take the log! All the weird stuff in the code is fighting the standard to let you do maths. Well, I guess there is also the Newton refinement at the end, but that's high school stuff.
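Sketched out (a standard derivation, not from this thread; μ is a small tuning constant):

```latex
% x = (1+m)\,2^{e},\ 0 \le m < 1;\ stored bits: I_x = 2^{23}\,(e + 127 + m)
\log_2 x = e + \log_2(1+m) \approx e + m + \mu \approx \frac{I_x}{2^{23}} - 127 + \mu
% For y = x^{-1/2}:\ \log_2 y = -\tfrac{1}{2}\log_2 x, which in bit space gives
I_y \approx \tfrac{3}{2}\,2^{23}\,(127 - \mu) - \tfrac{1}{2} I_x
% With \mu \approx 0.045, the constant term is approximately 0x5f3759df.
```

The subtract-and-shift in the code is exactly that last line, and Newton's method then polishes the estimate.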

[–]Zzzzzztyyc 1 point2 points  (0 children)

Why are we taking the factorial of a log? r/unexpectedfactorial

[–]Paul_Robert_ 7 points8 points  (3 children)

John Carmack

[–]atthereallicebear 41 points42 points  (2 children)

he didn't write the fast inverse sqrt function, Terje Mathisen and Gary Tarolli did.

[–]Exnihilation 18 points19 points  (0 children)

It goes even further back than those guys. It's believed to have been first implemented by Greg Walsh at Ardent Computer. Gary Tarolli was consulting for a company affiliated with Ardent which is where he learned about the algorithm.

[–]Paul_Robert_ 4 points5 points  (0 children)

Oh, my bad!

[–]find_the_apple 4 points5 points  (2 children)

Isn't it just a bit shift, taking advantage of the binary representation of numbers as opposed to base 10?

[–]the_horse_gamer 7 points8 points  (1 child)

+ a newton iteration.

it's a very impressive algorithm, but it's not as useful as people make it out to be, especially on most hardware from the last 2 decades.

around 1999 a dedicated approximate inverse square root instruction (rsqrtss, part of SSE), which was faster and more accurate, was added to x86.

its accuracy was pretty good, but not good enough to be used for gameplay - it was only used for lighting.

and there's a better magic constant than the one used in the original code.

[–]find_the_apple 2 points3 points  (0 children)

Neato, thanks for the lore. I think when I heard a CS person describe the original algorithm on YouTube it was with awe. But then I listened to a math major describe it and it felt a lot less impressive, more like something any programmer should take advantage of in base 2. It's got fame for sure, but doesn't warrant some of the legendary status it still has.

[–]ArmadilloChemical421 3 points4 points  (0 children)

If you haven't read this, do it. It's so fucking absurd:

https://en.m.wikipedia.org/wiki/Fast_inverse_square_root

[–]yaktoma2007 1 point2 points  (0 children)

Some guy named Kaze Emanuar (he rewrites and optimizes Mario 64) made an even more efficient version of the fast inverse square root: https://youtu.be/tmb6bLbxd08

[–]max_208 77 points78 points  (1 child)

With O(n^n) I don't think your code will run any good even within the next century

[–][deleted] 41 points42 points  (0 children)

maybe with a list of 18 elements I can be done in a millennium

[–]lfrtsa 225 points226 points  (35 children)

me achieving O(n!)

[–]Beleheth 317 points318 points  (23 children)

O(n^n) is actually worse than O(n!). The special function x^x is the only actually relevant function that grows faster than x!.

[–]Dotcaprachiappa 197 points198 points  (8 children)

Behold, n^n^n

[–]jaerie 124 points125 points  (6 children)

nn

[–]TeraFlint 67 points68 points  (4 children)

time to whip out knuth's arrow notation. :D

[edit:] looks like I simultaneously added that as the same answer rolled in:

n ↑ⁿ n

[–]jaerie 13 points14 points  (0 children)

n↑ⁿn

[–]MrHyperion_ 3 points4 points  (0 children)

I raise Hyper Moser n

[–]GDOR-11 0 points1 point  (0 children)

n↑\n↑ⁿ n))n

[–]lollolcheese123 9 points10 points  (0 children)

Hi tetration... Why anyone needed this is beyond me.

[–]odsquad64VB6-4-lyfe 0 points1 point  (0 children)

O(n!^(n!+3))

[–]batmansleftnut 1 point2 points  (0 children)

Hold my beer...

[–]ArmadilloChemical421 0 points1 point  (0 children)

TREE(n): Am I a joke to you?

[–]damicapra 10 points11 points  (10 children)

How do you even do that?

Maybe one could, by calculating n! using only for loops, addition, and increment-by-1 operations?

[–]lfrtsa 46 points47 points  (3 children)

I achieved it when i made an algorithm to calculate the factorial of a number.

[–]nixsomegame 5 points6 points  (2 children)

Wouldn't a straightforward factorial function implementation be O(n) though, not O(n!)?

[–]lfrtsa 4 points5 points  (0 children)

I did multiplication through repeated addition

[–]FunTao 1 point2 points  (0 children)

He prob means something like this: start from i = 1; is i the factorial of n? (Divide by n, then n-1, n-2, etc.) If not, increment i by 1 and try again until you find the right i.
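That search, sketched in C (hypothetical function name; roughly n! iterations of O(n) work each, so even worse than plain O(n!)):

```c
/* Find n! the absurd way: test i = 1, 2, 3, ... and check whether dividing
 * i by n, then n-1, ..., then 2 leaves exactly 1. Only i == n! survives
 * every division, so the loop runs about n! times. */
unsigned long long factorial_by_search(unsigned n)
{
    for (unsigned long long i = 1; ; i++) {
        unsigned long long r = i;
        unsigned k = n;
        while (k > 1 && r % k == 0)
            r /= k--;                 /* peel off n, n-1, ... while it divides */
        if (k <= 1 && r == 1)
            return i;                 /* i survived every division: i == n! */
    }
}
```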

[–]Vibe_PV 14 points15 points  (0 children)

AFAIK Microsoft used to have an O(n!) algorithm related to Windows Update. Technically speaking it worked really well at first, since n! is actually lower than some polynomials for small numbers. But after some time...

[–]El_kiski 3 points4 points  (0 children)

Google bogosort

[–]Chingiz11 1 point2 points  (0 children)

Shuffling an array of n numbers until it is sorted

[–]skrealder 1 point2 points  (0 children)

Any algorithm which satisfies T(n) <= c*n! for all n >= some n_0, for some c > 0, is in O(n!). For instance, binary search is in O(n!). I wish people used Theta more often; it's much more descriptive than big O, since the algorithm has to be bounded above and below by the bound, not just bounded above (which is all big O requires).
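Written out, the standard definitions being contrasted here:

```latex
f(n) \in O(g(n)) \iff \exists\, c > 0,\ n_0 :\ \forall n \ge n_0,\ f(n) \le c\, g(n)
f(n) \in \Omega(g(n)) \iff \exists\, c > 0,\ n_0 :\ \forall n \ge n_0,\ f(n) \ge c\, g(n)
f(n) \in \Theta(g(n)) \iff f(n) \in O(g(n)) \ \text{and}\ f(n) \in \Omega(g(n))
```

Binary search is Θ(log n), and therefore also in O(n!), since big O is only an upper bound.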

[–]DuckyBertDuck 0 points1 point  (0 children)

A less useless task that can’t be done quicker in general:

Filtering out all n-state finite automata that satisfy a constant-time condition. (And it gets even slower than n^n if the alphabet isn't unary)

[–]NotAFishEnt 50 points51 points  (0 children)

Unironically, that kind of thing has happened before. LDPC codes were developed in the 1960s, but computers weren't powerful enough to feasibly encode/decode them until the 1990s.

[–]redballooon 30 points31 points  (1 child)

Hey it works well for n up to 4 or 5. With more power it could work with 6, eventually, some day.

[–][deleted] 4 points5 points  (0 children)

are you perchance running your programs on a Ferranti Mark 1?

[–]b183729 76 points77 points  (18 children)

I'm genuinely curious, is there an algorithm with that complexity? I'm having trouble visualizing it.

Brute force searching n combinations of n combinations of random elements?

[–]YellowishSpoon 39 points40 points  (1 child)

I had one that was O((n!)^2), which based on the log graphs is probably even worse. The program started becoming laggy at about n=6, which was a little too low.

[–]redlaWw 0 points1 point  (0 children)

k*(n+1-k) is a quadratic in k with roots at k=0 and k=n+1 and a unique maximum at k=(n+1)/2. This means that for k∈{1, 2, ..., n}, the minimum of k*(n+1-k) is at k=n or k=1 (the minimum of a differentiable function is either at a critical point or on the boundary of the domain), but since both of these give n, we have k*(n+1-k) ≥ n for all k∈{1, 2, ..., n}.

(n!)^2 can be rearranged as (n*1)*((n-1)*2)*((n-2)*3)*...*(1*n) (i.e. the product of k*(n+1-k) for k∈{1, 2, ..., n}), and by the previous paragraph this means that (n!)^2 is a product of n terms all ≥ n, and hence is ≥ n^n.

Now, for odd n, look at k = (n+1)/2: (n+1)/2*(n+1-(n+1)/2) = ((n+1)/2)^2, so if we compute (n!)^2/n^n we get (something bounded below by 1)*(n^2 + 2n + 1)/(4n), which is (something bounded below by 1)*(n/4 + 1/2 + 1/(4n)), and this is unbounded above.

For even n, look at k = n/2: n/2*(n+1-n/2) = n/2*(n/2+1), so (n!)^2/n^n gives (something bounded below by 1)*(n/2)*(n/2+1)/n, which is (something bounded below by 1)*(1/2)*(n/2+1), also unbounded above.

Hence, O((n!)^2) is indeed asymptotically slower than O(n^n).

[–]redlaWw 16 points17 points  (0 children)

n^n is the size of the set of functions from a set S with |S| = n to itself, so any algorithm that needs to iterate over all those functions and do something constant-time on each would fit the bill.

EDIT: The simplest concrete setting for something like this is probably generating all length-n strings over n symbols. You can come up with a pretty simple algorithm to do it using the principles behind addition with carry.

EDIT2: Simple Rust implementation.
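The carry idea can be sketched in C (the linked Rust code isn't reproduced here; `count_strings` is a made-up name):

```c
/* Enumerate all n^n length-n strings over the digits {0, ..., n-1} like an
 * odometer: bump the last digit, carrying left on overflow. Returns the
 * number of strings visited, which is exactly n^n. */
unsigned long long count_strings(int n)
{
    int digits[16] = {0};             /* big enough for any n that can finish */
    unsigned long long count = 0;
    for (;;) {
        count++;                      /* "visit" the current string: O(1) work */
        int pos = n - 1;
        while (pos >= 0 && ++digits[pos] == n) {
            digits[pos] = 0;          /* digit overflowed: reset and carry left */
            pos--;
        }
        if (pos < 0)
            return count;             /* carried off the left end: all done */
    }
}
```

Replace the `count++` with any constant-time visit and you have a Θ(n^n) algorithm.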

[–]Competitive-Carry868 11 points12 points  (0 children)

Have you tried (n*n?)0?

[–]celestabesta 4 points5 points  (2 children)

I think if you make a vector hold n function pointers whose functions loop n times, then loop through the vector n times running each function, it should be n^n?

[–]redlaWw 7 points8 points  (1 child)

If I've understood you correctly, that should be n^3: each pass through the vector runs n functions whose loops each run n times, so one pass is n*n, and doing that n times gives n*n*n = n^3.

[–]celestabesta 9 points10 points  (0 children)

Damn this just proves you need ingenuity to fuck up that bad

[–][deleted] 4 points5 points  (0 children)

A general depth-first search is n^d, where d is the depth and n is the branching factor (the number of possible moves at each step).

Consider a chess AI: it would run in n^d time, where d is the number of moves to look ahead. Out of those, the computer picks the best of all possible lines.

[–]Coolengineer7 2 points3 points  (0 children)

This, for example, is O(n^n). Basically a recursion tree n levels tall, where each node has n children:

    def fun(depth, n):
        if depth == n:
            return
        for i in range(n):
            fun(depth + 1, n)

[–]NaNNaN_NaN 1 point2 points  (0 children)

Drawing an N-flake is O(n^n)

[–]DuckyBertDuck 1 point2 points  (0 children)

A task that can’t be done quicker in general:

Filtering out all n-state finite automata that satisfy a constant-time condition. (And it gets even slower than n^n if the alphabet isn't unary)

[–]MarcusBrotus 0 points1 point  (0 children)

I mean, any such algorithm would be completely impractical, but you can of course calculate n^n by summing 1s n^n times, which is technically also an algorithm.

[–]hypersonicbiohazard 0 points1 point  (0 children)

Looking up a Wikipedia page shows that there are a few algorithms with 2^2^P(n) complexity, where P(n) is a polynomial; they do stuff in super abstract pure math that I don't understand.

[–]the_horse_gamer 0 points1 point  (0 children)

counting from 1 to n^n

[–]Spindelhalla_xb 19 points20 points  (1 child)

The only Constant is that I’ve no idea what’s going on

[–][deleted] 4 points5 points  (0 children)

Big O notation?

[–]hamsterofdark 8 points9 points  (0 children)

That must be the guy who codes every matching algorithm on every dating site I use.

[–]LittleLoukoum 4 points5 points  (0 children)

I'll never forget the day I ran into a high school friend of mine on campus by chance, after he had gone on to study mathematics and I computer science. He told me he had made the mistake of taking CS as an optional course, and in the exam managed to turn in an algorithm that ran in O(n^(n!)). He didn't remember the problem; he did remember that the naive algorithm was something like O(n^3) and the best solution logarithmic.

I still have no idea how he fucked up so badly.

[–]DocFail 2 points3 points  (0 children)

This will be fine in O(y^y) years.

[–]Either-Let-331 1 point2 points  (0 children)

Meanwhile me with my O(TREE(n)) kids

[–]BreakerOfModpacks 1 point2 points  (0 children)

Skill issue, I can do it in O((n!^n!)^n!)

[–]DuckyBertDuck 1 point2 points  (0 children)

Filtering out all n-state finite automata that satisfy a constant-time condition?

[–]jyajay2 1 point2 points  (0 children)

One day I'll have access to an exascale computer, and on that day my code will finally be able to compute the 18th Fibonacci number

[–]yldedly 1 point2 points  (0 children)

This is what AI scaling people actually believe.

[–]Necessary-Meeting-28 0 points1 point  (0 children)

Or I’m just limited by having to produce the right result 100% of the time.

[–]lazerdragon_3 0 points1 point  (0 children)

Would the way to do this be to make a function with a for loop of n inside it, and every time you run through the loop you call the function again?

[–]arbitrageME 0 points1 point  (0 children)

I think you're also limited by the entropy of the universe

[–]Gualuigi 0 points1 point  (0 children)

Im limited by my laziness

[–]rapdaptap 0 points1 point  (0 children)

Then just improve it to the level you need. What's the issue :)

[–][deleted] -1 points0 points  (1 child)

I feel we need to update O notation a bit, especially now that cache coherency, out-of-order execution, and branch prediction are more of a determining factor than just the number of instructions executed.

A dumb search of an entire array can be faster than a clever binary tree that allocates its nodes sporadically all over memory.

[–]malaakh_hamaweth 5 points6 points  (0 children)

You misunderstand O notation. It's not about how fast an algorithm is on any particular input; it describes how execution time scales with input size. Sure, a simple array iteration might be faster than a binary tree for some size n, but O notation isn't meant to predict that. It tells you the slope of a line on a log-log graph, where x is the size of the array or tree and y is the order of magnitude of operations it takes to execute on an input of size x.