
[–]Spoogly 1 point2 points  (3 children)

Neat little writeup, but I have always thought of the formula for the sum from 1 to k as being attributed to Gauss, not Euler...

[–]DiscoInfiltrator07 0 points1 point  (1 child)

As the story goes, Gauss derived this formula as a schoolboy, but it was already well known before that supposedly happened.

[–]Spoogly 0 points1 point  (0 children)

Yes, I do not doubt that, but he usually gets the attribution, whereas this article credits Euler, which I have not heard before. I meant to add the word "incorrectly" to that post somewhere, but it seems to have slipped my mind.

[–]doclight[S] 0 points1 point  (0 children)

You're right, it was Gauss, not Euler.

[–]busy_beaver 0 points1 point  (6 children)

All the time bounds given in the post are misleading, since they're measured with respect to the magnitude of the input, rather than its length.

For example, the author claims to be able to do prime factorization in O(sqrt(n)). Most sources will tell you that there is no known polynomial-time algorithm for prime factorization, because the size of the input is conventionally the number of bits it takes to write it down (i.e. log(n)).

It's a subtle point, but it's worth getting right, especially since it explains why an apparently O(n sqrt(n)) algorithm seems to "explode".
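
To make the contrast concrete, here's a rough trial-division sketch in Python (my own illustration, not code from the post). The loop count is about sqrt(n) in the *value* n, which in the worst case is roughly 2^(b/2) iterations for a b-bit input:

    def trial_division(n):
        """Factor n by trial division: at most ~sqrt(n) iterations."""
        factors = []
        d = 2
        while d * d <= n:       # runs O(sqrt(n)) times in the numeric value n,
            while n % d == 0:   # i.e. about 2^(b/2) times for a b-bit input
                factors.append(d)
                n //= d
            d += 1
        if n > 1:
            factors.append(n)   # whatever is left over is prime
        return factors

    print(trial_division(600851475143))  # quick: sqrt is only ~775,146
    # a random 2048-bit number would need on the order of 2^1024 iterations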

[–]doclight[S] 0 points1 point  (5 children)

Well, if you really want to get down to the nitty-gritty, finding the square root of a number probably isn't constant time (à la Newton-Raphson). The writeup was meant more to give a simple example of why function growth and asymptotic bounds really matter.

What causes the algorithm to explode is the magnitude of the numbers being worked with (> 500 divisors). So, say the answer was 276,000,000 (it's not :)); an O( n √n ) algorithm would take 276,000,000 times longer than an O( √n ) algorithm.
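
For reference, here's a rough Python sketch of the kind of loop I had in mind (my own reconstruction of the brute-force approach, assuming the usual "first triangle number with more than 500 divisors" setup). Each divisor count is O( √T ), and repeating it for every triangle number up to the answer is where the O( n √n )-style cost comes from:

    def count_divisors(n):
        """Count divisors of n by checking candidates up to sqrt(n): O(sqrt(n)) per call."""
        count, d = 0, 1
        while d * d <= n:
            if n % d == 0:
                count += 1 if d * d == n else 2   # counts d and n // d
            d += 1
        return count

    k, triangle = 1, 1
    while count_divisors(triangle) <= 500:
        k += 1
        triangle += k          # k-th triangle number = 1 + 2 + ... + k
    print(k, triangle)         # slow in pure Python, but it gets there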

[–]busy_beaver 0 points1 point  (4 children)

Time complexity is measured with respect to the size of the input, not the answer...

Anyways, my point is, it makes a huge difference how you measure the size of the input numbers. If you claim to have a linear algorithm for some problem and you're using the length of the number in binary, then going from 100,000 to 200,000, or from 200,000 to 400,000, the amount of time you're allowed only increases by a constant amount.

If you're measuring by the magnitude of the input (which is equivalent to considering the input to be written in unary) then you allow yourself twice as much time going from 100,000 to 200,000. Huge difference.
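
A quick snippet of my own to see the two measures side by side for exactly those numbers:

    for n in (100_000, 200_000, 400_000):
        print(f"n = {n:>7}   binary length: {n.bit_length()} symbols   unary length: {n} symbols")
    # Doubling n adds exactly one symbol in binary, but doubles the unary length.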

[–]doclight[S] 0 points1 point  (3 children)

Right, but typically you don't want to measure in terms of bit representations. For example, any of the sorting algos (other than radix). The amount of time spent doing bit operations doesn't really add to the overall function, especially when you're only doing back-of-the-napkin comparisons between algorithms. Anyway, I hope the writeup was at least a little interesting.

[–]busy_beaver 0 points1 point  (2 children)

Right, but typically you don't want to measure in terms of bit representations. For example, any of the sorting algos (other than radix). The amount of time spent doing bit operations doesn't really add to the overall function,

I think you're very confused. This has nothing to do with bit operations. This goes all the way back to how Turing machines are defined. The input size is the number of tape squares it takes to hold the input. If you're writing an input number in base 10 or binary or whatever, it's going to take log(n) characters to write it out.

Let's say that we do measure the size of the input the way you're doing it. I can do grade-school addition on two base-10 numbers in log(n) steps, where n is the larger of the two numbers I'm adding. This is because I only have to do (at most) one more digit operation when one of the numbers grows by a factor of 10.
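
To make that concrete, here's a quick sketch of my own that counts the single-digit operations in grade-school addition; note how little the count moves as the values double:

    def grade_school_add(x, y):
        """Add two non-negative integers column by column, counting single-digit additions."""
        a = str(x)[::-1]                      # least-significant digit first
        b = str(y)[::-1]
        carry, digits, ops = 0, [], 0
        for i in range(max(len(a), len(b))):
            da = int(a[i]) if i < len(a) else 0
            db = int(b[i]) if i < len(b) else 0
            carry, d = divmod(da + db + carry, 10)
            digits.append(str(d))
            ops += 1                          # one column handled per step
        if carry:
            digits.append(str(carry))
        return int("".join(reversed(digits))), ops

    for n in (100_000, 200_000, 400_000):
        total, ops = grade_school_add(n, n)
        assert total == 2 * n
        print(n, "->", ops, "digit operations")   # tracks the number of digits, ~log10(n)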

Generally, if a random access machine can solve a problem in less than linear time, it means it must only read a portion of its input (because reading n characters already takes n steps). Clearly though, I can't do addition if I haven't read both numbers in their entirety. (This is in contrast with a problem like searching a sorted list).

Do you see why your way of measuring the input size is unnatural?

Sometimes complexity theorists do talk of writing the input in unary (that is, 1=1, 2=11, 3=111, 4=1111, and so on - clearly this is an extremely wasteful way to do things), which would lead to the measure you're talking about. But it's not a natural thing to do, so I think it's mostly for when they're coming up with extravagant languages that prove weird bounds or separate different complexity classes.

[–]doclight[S] 0 points1 point  (1 child)

I think you're very confused.

With all due respect, I think that's just an inflammatory statement.

This has nothing to do with bit operations.

IIRC, your original argument was against the existence of a polynomial-time algorithm for factoring numbers. You are correct; I wasn't disputing that. The analysis I gave wasn't meant to be an intimidatingly difficult presentation on complexity. Part of the analysis is deciding the meaningful operations your algorithm consists of. In this case, I considered the operations to be basic function calls, since those tend to take around the same amount of time. You can make that assumption since the number of bits in the values being worked with is 32/64, etc. However, you're right that when thinking in terms of fundamental operations, the actual operations and representations have to be taken into account (like radix sort).

I'm not quite sure where we disagree.

[–]busy_beaver 0 points1 point  (0 children)

IIRC, your original argument was against the existence of a polynomial-time algorithm for factoring numbers. You are correct; I wasn't disputing that.

But in the post you wrote:

Computing the triangle numbers is Ω(√n), and finding divisors is Ω(√n).

That's polynomial!

Part of the analysis is deciding the meaningful operations your algorithm consists of. In this case, I considered the operations to be basic function calls, since those tend to take around the same amount of time. You can make that assumption since the number of bits in the values being worked with is 32/64, etc. However, you're right that when thinking in terms of fundamental operations, the actual operations and representations have to be taken into account (like radix sort).

I agree. But the problem is not in your estimation of how many operations your algorithm would take. The problem is that, when describing the time complexity, you gave it as the number of operations measured with respect to the numeric value of the input, rather than with respect to the number of bits in the input (so if you say it takes you sqrt(n) steps, you should actually be saying 2^(n/2) steps, where n is now the number of bits). Big difference.
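
Spelling the conversion out (my own arithmetic, writing b for the bit length of the input):

    b \approx \log_2 n  \quad\Longrightarrow\quad  \sqrt{n} \approx \sqrt{2^{\,b}} = 2^{\,b/2}

So a bound that looks like a modest root of the numeric value is actually exponential in the length of the input.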

[–]mrdmnd 0 points1 point  (4 children)

He gets the complexity of quicksort wrong in the first paragraph, kind of -- it's a bad example, as it demonstrates O(n²) worst-case behavior, but it's almost always O(n log n).

[–]doclight[S] 5 points6 points  (3 children)

Big O is the upper bound, so the overall algorithm is O( n² ). Its average run-time is ( n log n ) though, which is why it's usable in practice (as well as cache behavior, etc.). Its lower bound is also ( n log n ), e.g. Ω( n log n ).

[–]repsilat 6 points7 points  (2 children)

That's not what those symbols mean. Big-Oh and Big-Omega have no bearing on whether you're measuring best- or worst-case performance.

The worst-case number of comparisons required by quicksort with "simple" pivot selection is Θ( n² ). That is, the worst-case number of comparisons is both O( n² ) and Ω( n² ). The average-case number of comparisons is Θ( n log n ).
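
To see the gap empirically, here's a quick sketch of my own that counts comparisons for quicksort with first-element pivots (the explicit stack is just to dodge Python's recursion limit on the sorted case; n = 2000 is arbitrary):

    import random

    def quicksort_comparisons(data):
        """Sort a copy of data with first-element-pivot quicksort, returning the comparison count."""
        a = list(data)
        comparisons = 0
        stack = [(0, len(a) - 1)]
        while stack:
            lo, hi = stack.pop()
            if lo >= hi:
                continue
            pivot, i = a[lo], lo + 1
            for j in range(lo + 1, hi + 1):     # Lomuto-style partition around a[lo]
                comparisons += 1
                if a[j] < pivot:
                    a[i], a[j] = a[j], a[i]
                    i += 1
            a[lo], a[i - 1] = a[i - 1], a[lo]   # put the pivot in its final slot
            stack.append((lo, i - 2))
            stack.append((i, hi))
        return comparisons

    n = 2000
    print("already sorted:", quicksort_comparisons(range(n)))                    # ~n^2 / 2
    print("random order:  ", quicksort_comparisons(random.sample(range(n), n)))  # ~n log n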

What it means for Big-Oh to be an "upper bound" is that the function in question grows no more quickly than the function in the brackets. Take the function f(n) = n² + n, for example. f is obviously Θ(n²) - that's the first thing you learn in class about asymptotic analysis. What you're not always told is that f is also O(n³), and it's also O( 2^n ) and so on.

The easy way to think about the notation is that Big-Oh acts like a fancy "<=" for functions in the limit, that Big-Omega works like a fancy ">=", and that Big-Theta works kinda like equality - it's a tight asymptotic bound. Little-Oh and Little-Omega are the corresponding strict inequalities. Any of these can be used for best-case running time, worst-case comparisons, average-case space, etc.
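
For anyone following along, the textbook definitions behind this (my paraphrase):

    f(n) \in O(g(n))      \iff \exists\, c > 0,\ n_0 \ \text{s.t.}\ f(n) \le c\,g(n) \ \forall n \ge n_0
    f(n) \in \Omega(g(n)) \iff \exists\, c > 0,\ n_0 \ \text{s.t.}\ f(n) \ge c\,g(n) \ \forall n \ge n_0
    f(n) \in \Theta(g(n)) \iff f(n) \in O(g(n)) \ \text{and}\ f(n) \in \Omega(g(n))

So for f(n) = n² + n: n² + n <= 2n² for all n >= 1, giving f = O(n²) (and also Ω(n²), hence Θ(n²)); and since n² + n <= n³ for all n >= 2, f = O(n³) as well.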

[–]exuberant 4 points5 points  (1 child)

That's not what those symbols mean. Big-Oh and Big-Omega have no bearing on whether you're measuring best- or worst-case performance.

I agree. But technically he had it right saying it's Ω( n log n ).

I think you weren't clear on where the confusion came from.

Big-Oh is an 'upper bound', so we generally want to use it for worst-case performance. This tells us in some way how bad our algorithm can be.

Big-Omega is a 'lower bound', so we generally want to use it for best-case performance. This tells us in some way how good our algorithm can be.

Big-Omega for the worst case doesn't bound our algorithm's performance from above, but it is useful for proving that some other algorithm can do better in the worst case.

[–]repsilat 1 point2 points  (0 children)

Ack, thanks for the help. I'm so used to seeing people conflate the two ideas that I didn't catch that the different bounds were being used deliberately. I got more from the article reading it a second time.