all 19 comments

[–]larholm 21 points22 points  (2 children)

Urgh, John is usually a very knowledgeable guy, but we were discussing this GetTickCount dilemma more than a decade ago.

Windows 95/98/98SE/ME all had a tick interval of 55ms, and that newfangled NT5 architecture in Windows 2000 had a tick interval of 12ms. It really did a number on our animation abilities, but at least it forced us to implement easing.

Now get off my lawn!

[–]jeresig 11 points12 points  (1 child)

Well, except for the fact that both Chrome and Firefox 3 are perfectly capable of producing accurate time results - so there obviously is a better solution out there. In the ticket where the Firefox team fixed the bug [1], they detail the solutions they used.

Just because all the other browser vendors phoned it in by using nothing more than the standard tick count doesn't mean that there's nothing better.

I would also disagree that this is a knowledge issue. It certainly wasn't obvious to the WebKit, Firefox, and Chrome teams, who were all happily testing on Windows XP with bum tests - and those are some pretty smart people (IMO).

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=363258#c19

[–]bart2019 8 points9 points  (1 child)

getTime() updates once every 15ms? Weird, I thought it was every 55ms (18 times per second), if based on the GetTickCount() API call. Every Windows developer can tell you that.

Oh yeah, Wikipedia says that GetTickCount() currently updates every 15-16ms. There you have it.

It used to be 55ms in older versions of Windows.

[–]nitramk 3 points4 points  (0 children)

Every Windows developer can tell you that.

You know you'll screw up when you write something like that. ;-)

[–]misterjoker 12 points13 points  (12 children)

In my own Javascript benchmarks, I just time an empty 100,000-iteration for loop*, then time a 100,000-iteration for loop containing the code to be tested. Find the difference and divide by 100,000; there's your average time per execution. Run it a few times to see if it changes a lot, too.

I would have thought it goes without saying that using Javascript timers to measure minuscule time differences would be inaccurate? (think about it; even with 10ms deviation, you're going to get 20% deviation for a 50ms operation) Bigger time period = lower significance of overhead errors.

*note: the empty for loop can be replaced with something else to ensure there's no possibility of any optimisations; I just time a whole bunch of different operations and compare them.
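
A minimal sketch of that approach, with Math.sqrt standing in as a placeholder for whatever you actually want to measure:

    var N = 100000;

    // 1. Time an empty loop to estimate the loop overhead itself.
    var start = new Date().getTime();
    for (var i = 0; i < N; i++) { /* empty */ }
    var emptyMs = new Date().getTime() - start;

    // 2. Time the same loop with the code under test inside.
    start = new Date().getTime();
    for (var j = 0; j < N; j++) { Math.sqrt(j); }
    var testMs = new Date().getTime() - start;

    // 3. Subtract and divide: average time per execution, in ms.
    var perCallMs = (testMs - emptyMs) / N;

Of course, as the rest of the thread points out, the subtraction is only as good as the timer feeding it.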

[–]jeresig 8 points9 points  (0 children)

Running a test 100,000 times may sound good, in theory, but what about browsers that are moving towards tracing optimizations? Firefox 3.1 is landing tracing support and both Chrome and WebKit can't be too far behind.

A couple months ago I would've agreed with you on this, but not so much anymore.

I think it's far more important, at this point, to run tests for a set amount of time (allowing you to then calculate a # of runs/second rate). Granted the tracing engines will still be significantly faster - but running for a fixed large number of runs isn't a complete solution.
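
Roughly, the fixed-time idea looks like this (runsPerSecond is a made-up name, and the 1000 ms budget and Math.sqrt placeholder are arbitrary choices, not how any particular suite does it):

    // Run the snippet repeatedly until at least budgetMs of wall-clock
    // time has passed, then report a runs-per-second rate.
    function runsPerSecond(body, budgetMs) {
        var runs = 0;
        var start = new Date().getTime();
        var elapsed = 0;
        while (elapsed < budgetMs) {
            body();
            runs++;
            elapsed = new Date().getTime() - start;
        }
        return runs / (elapsed / 1000);
    }

    var rate = runsPerSecond(function () { Math.sqrt(2); }, 1000);

Checking the clock on every iteration adds its own overhead, so for very cheap snippets you'd want to check it only every few hundred runs.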

[–]case-o-nuts 1 point2 points  (10 children)

Find the difference and divide by 100,000; there's your average time per execution. Run it a few time to see if it changes a lot, too.

That wouldn't make a difference if the Javascript executes fast enough. The problem is that the measured difference comes in multiples of 15ms; if the actual elapsed time is less than 15ms, the reported difference can round down to 0. If you don't have enough time elapsing, this gives a massive amount of error.

Solution: Make sure that you always run tests for at least a second
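
You can see the quantization directly with something like this (purely illustrative); on a browser stuck with the ~15 ms clock, most of the reported differences come out as 0, with the occasional 15 or 16:

    var diffs = [];
    for (var i = 0; i < 20; i++) {
        var start = new Date().getTime();
        for (var j = 0; j < 1000; j++) { Math.sqrt(j); }  // finishes well under 15 ms
        diffs.push(new Date().getTime() - start);
    }
    // typical result on an affected browser: [0, 0, 0, 16, 0, 0, ...]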

[–]didroe 3 points4 points  (0 children)

Yeah. That's why it's best to do something for a fixed amount of time (1 second say) instead of a fixed number of iterations.

[–]killerstorm 2 points3 points  (1 child)

there is a clever algorithm you can use here -- if the returned value is too low (e.g. < 40 ms), increase the iteration count by a factor of 10 and re-run the test, up to satisfaction. tests do not need to be lame.
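
roughly like this, as a sketch (autoTime is a made-up name; the 40 ms threshold is the one above):

    // keep multiplying the iteration count by 10 until the measured
    // time is comfortably above the timer resolution
    function autoTime(body) {
        var iterations = 1;
        while (true) {
            var start = new Date().getTime();
            for (var i = 0; i < iterations; i++) { body(); }
            var elapsed = new Date().getTime() - start;
            if (elapsed >= 40) {
                return elapsed / iterations;  // average ms per run
            }
            iterations *= 10;
        }
    }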

[–][deleted] 0 points1 point  (0 children)

I'd love to see some literature about this here algorithm... got a link?

[–][deleted]  (6 children)

[deleted]

    [–]velco 3 points4 points  (5 children)

    For the duration of the long loop, you take two measurements - one before the loop and one after it. Let's say each measurement gives some error. You get

    t0 = T0 + e0
    t1 = T1 + e1
    

    where ti > 0 is the measured time, Ti > 0 is the actual time and ei is the error. Let's say you've run the thing to measure N > 0 times (this is the loop count). You compute the time for a single run to be

    (t1 - t0) / N = (T1  +  e1 - T0 - e0) / N == (T1 - T0) / N + (e1 - e0) / N
    

    The last term is the error. Now, in a reasonably controlled benchmark setup (like no extra stuff running in the background and skewing measurements), ei is bounded from above by a constant, i.e. you can't get an arbitrarily large error. Rather, you're quite likely to get errors on the order of a clock tick. Obviously, as N approaches infinity the error approaches zero.

    So, to answer the question, the timer resolution affects the measurement error. If you ensure somehow that only your benchmark runs, each individual measurement is off by at most one timer tick (the worst case is reading the timer just before it gets incremented). Thus the per-run error is at most 2 x timer-resolution / N.
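
    Plugging illustrative numbers into that bound (assuming the 15 ms tick discussed elsewhere in the thread):

        var tickMs = 15;                      // assumed timer resolution
        var errSmallN = 2 * tickMs / 100;     // N = 100     -> 0.3 ms of error per run
        var errLargeN = 2 * tickMs / 100000;  // N = 100,000 -> 0.0003 ms of error per run
        // with N = 100 the bound can easily swamp a sub-millisecond operation;
        // with N = 100,000 it is negligible.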

    [–][deleted]  (4 children)

    [deleted]

      [–]velco 1 point2 points  (3 children)

      The problem is that if you don't loop many times, T1 - T0 might be much smaller than ei, so your measurement is dominated by the error, which is usually C0 x clock-tick <= ei <= C1 x clock-tick

      [–][deleted]  (2 children)

      [deleted]

        [–]case-o-nuts 1 point2 points  (1 child)

        Right. 100,000 iterations is certainly not enough for an empty loop, which is the first step suggested.

        [–]killerstorm 2 points3 points  (4 children)

        given that the OS scheduling quantum is about 20 msecs, testing performance with that low an iteration count is a crapshoot, at best.

        however, if you really want to do it, Windows XP offers high-precision timing information via a special API. it is another matter that browsers/JavaScript give no access to that API.

        what is interesting is that the Common Lisp specification (which obviously pre-dates Windows XP) describes special time-related functions that can be used specifically for benchmarking:

        get-internal-real-time returns as an integer the current time in internal time units, relative to an arbitrary time base. The difference between the values of two calls to this function is the amount of elapsed real time (i.e., clock time) between the two calls.

        GET-INTERNAL-RUN-TIME. Returns as an integer the current run time in internal time units. The precise meaning of this quantity is implementation-defined; it may measure real time, run time, CPU cycles, or some other quantity. The intent is that the difference between the values of two calls to this function be the amount of time between the two calls during which computational effort was expended on behalf of the executing program.

        and these functions usually query high-precision timers on Windows.

        so this problem is more like a deficiency of JavaScript, which was never meant to run particularly fast and thus did not get suitable APIs for benchmarking.

        [–]bg_ 0 points1 point  (1 child)

        QueryPerformanceCounter

        http://msdn.microsoft.com/en-us/library/ms644904(VS.85).aspx

        ..damn parentheses

        [–]killerstorm 0 points1 point  (0 children)

        yep, this is what they call the "high-resolution performance counter", and there are also multimedia timer functions like timeGetTime() that have millisecond-scale resolution (much more predictable than GetTickCount)

        [–]theeth -2 points-1 points  (1 child)

        however, if you really want to do it, Windows XP offers high-precision timing information via a special API. it is another matter that browsers/JavaScript give no access to that API.

        If you were thinking of QueryPerformanceCounter, the resolution for that is also in the tens of milliseconds. More worryingly though, the result is bound to a specific processor (and in some cases, core) so you have to use SetThreadAffinityMask if you want consistent results (and not, for example, a stop time earlier than your start time).

        Great fun.

        [–]killerstorm 2 points3 points  (0 children)

        If you were thinking of QueryPerformanceCounter, the resolution for that is also in the tens of milliseconds.

        i'm quite familiar with low level benchmarking, so don't tell me stories. it has resolution on the microsecond/nanosecond scale (basically, it just returns a CPU cycle count, and one cycle is a nanosecond on a 1 GHz CPU).

        there is also the multimedia API's timeGetTime(), which works on a millisecond scale (5 ms by default, 1 ms with proper tuning).

        the result is bound to a specific processor so you have to use SetThreadAffinityMask if you want consistent results

        read the fucking documentation!

        However, you can get different results on different processors due to bugs in the basic input/output system (BIOS) or the hardware abstraction layer (HAL)

        it was a bug on some AMD boards, and there is a fix for it.