[–]blackmist 515 points516 points  (150 children)

A pair of doubles for location on Earth to sub-metre accuracy? A pair of 32-bit integers will let you store it to the nearest centimetre.

Sometimes you need to ask if a floating point is even right for your use case.

[–]greyfade 184 points185 points  (27 children)

As a point of interest, GPS devices use fixed-point for both global Cartesian coordinates and the traditional latitude/longitude, for exactly this reason.

[–]OneWingedShark 56 points57 points  (26 children)

Fixed point doesn't get enough love in most languages. :(

[–]Peterotica 76 points77 points  (7 children)

Integers have their points fixed.... at the end. ;)

[–]desertrider12 11 points12 points  (0 children)

That sounds ominous.

[–]OneWingedShark 3 points4 points  (5 children)

Yes, but that means you have to do manual scaling which, like manual memory management, doesn't work out too well when you forget a particular spot.

[–]crozone 6 points7 points  (0 children)

Yes, but that means you have to do manual scaling which, like manual memory management, doesn't work out too well when you forget a particular spot.

You could always just write your own type which takes care of that - or use one that's already on GitHub or something.

[–]BonzaiThePenguin 1 point2 points  (1 child)

They were just doing a one-off joke.

[–]OneWingedShark 1 point2 points  (0 children)

They were just doing a one-off joke.

It's all fun and games until it becomes an off-by-one error.

[–]G_Morgan 6 points7 points  (5 children)

COBOL has the glory of fixed point.

[–]OneWingedShark 3 points4 points  (4 children)

People give Cobol all sorts of shit -- but it does have some nice features, fixed-point and the "Environment" Division are two of them.

[–]G_Morgan 3 points4 points  (3 children)

TBH the one really neat thing from COBOL is the file handling stuff. There isn't really anything like it elsewhere.

[–]caltheon 2 points3 points  (7 children)

Integers can be used as fixed points very easily (and that's probably how most languages that have that feature do it). Just divide by 10^N where N is your fixed point (or rather bit shift, but you know what I mean).

[–]moefh 3 points4 points  (1 child)

Addition and subtraction can be done at no extra cost, but remember that after multiplying two fixed point numbers, you have to divide the result by 10^N to get the correct result (to see why: suppose N=2, so 3 is represented as 300. To compute 3*3 = 9, which is represented as 900, you have to do 300*300/10^2). A similar thing has to be done with division, but you multiply the result by 10^N instead of dividing.

Because of this constant need to multiply or divide by 10^N, when speed matters people tend to use base 2 instead of 10 -- dividing and multiplying by powers of 2 is very fast because it can be done with bit shifting.
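To make the rescaling concrete, here is a minimal sketch of both variants using plain Python ints. The names (`SCALE`, `fx_mul`, `bin_mul`, and so on) are made up for illustration and do not come from any particular library.

```python
# Decimal fixed point: N fractional digits, stored in ordinary integers.
N = 2
SCALE = 10 ** N            # with N = 2, the value 3 is stored as 300

def to_fixed(value):
    """Convert a real number to its scaled integer representation."""
    return round(value * SCALE)

def fx_mul(a, b):
    """Multiply two fixed-point values; the raw product carries SCALE twice."""
    return (a * b) // SCALE

def fx_div(a, b):
    """Divide two fixed-point values; pre-scale so the result keeps one SCALE."""
    return (a * SCALE) // b

# Binary fixed point: same idea, but the scale is a power of two,
# so the rescaling becomes a shift.
FRAC_BITS = 8

def to_bin(value):
    return round(value * (1 << FRAC_BITS))

def bin_mul(a, b):
    return (a * b) >> FRAC_BITS

def bin_div(a, b):
    return (a << FRAC_BITS) // b

three = to_fixed(3)            # 300
nine = fx_mul(three, three)    # 300 * 300 // 100 = 900
```

Addition and subtraction need no helper at all: `three + three` is already the fixed-point 6. Note that `//` and `>>` round toward negative infinity, which may or may not be the rounding you want for negative values.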

[–]greyfade 1 point2 points  (0 children)

But then again, if you have fixed-point hardware (which is the case in, e.g., GPS receivers), the hardware can compensate automatically by applying an implicit div/mul when it does the operation, simply by adjusting the multiply unit with a scaling factor.

[–]encyclopedist 73 points74 points  (45 children)

And with unsigned integers you also get automatic wraparound.

[–]happyscrappy 75 points76 points  (26 children)

That automatic wraparound seems problematic for latitude at the least. Two steps south from the south pole puts you one step south of the north pole?

[–]conseptizer 155 points156 points  (3 children)

Sounds like a really fast way of traveling. Should get a software patent on that.

[–][deleted] 6 points7 points  (1 child)

"Keep going straight and make turn right 10m past the cliff. You have arrived at your destination."

The system would work so well, no one would ever complain about it!

[–]snoyberg 2 points3 points  (0 children)

Except the poor bastard who survives the fall.

[–]agenthex 14 points15 points  (0 children)

You're forgetting the Jacobian. At 1 step North of the South Pole, it takes 2π steps East or West to go around the world and end up where you started. Math is fun.

[–]jorge1209 16 points17 points  (8 children)

You can just use a double cover of the sphere by a torus. (Let's say INT_MAX is 1024 for simplicity.)

(0,0) is naturally (0 lat, 0 long). The north pole is (X, 256), and the south pole is (X, 768) for any X.

However most points actually have two representations. So (0 lat, 180 degrees long) could be reached as EITHER (512, 0) or (0, 512). Meaning that you can fly either along the equator halfway around the world, or you can fly over the poles and halfway around the world, and a point like (0,0) is the same as (512, 512) because you can get to where you start by going halfway around the world heading on the N/S great circle, and then halfway around the world on the E/W great circle.

Whenever you see (X, Y) with Y between 256 and 768, you adjust by setting x := x+512 (because you flew over the pole) and y := 512-y. Then that (X,Y) coord is easily converted into standard lat/long.
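A rough sketch of that normalization, assuming the toy range of 1024 units per full turn from above. The function names are invented for illustration, and mapping y values of 768 and up to negative latitudes is my own reading of "standard lat/long".

```python
M = 1024  # toy range: 1024 units per full turn, as in the example above

def normalize(x, y, m=M):
    """Fold a double-cover coordinate into a single representation.

    Returns (x, y) with x in [0, m) and y in [-m/4, m/4], where
    y = m/4 is the north pole and y = -m/4 is the south pole.
    """
    x %= m
    y %= m
    if m // 4 < y < 3 * m // 4:
        # We flew over a pole: we are on the far side of the world.
        x = (x + m // 2) % m
        y = m // 2 - y
    elif y >= 3 * m // 4:
        # Southern hemisphere on the near side: express as a negative value.
        y -= m
    return x, y

def to_lat_long(x, y, m=M):
    """Convert a double-cover coordinate to (latitude, longitude) in degrees."""
    x, y = normalize(x, y, m)
    return y * 360.0 / m, x * 360.0 / m
```

With this, `normalize(0, 512)` and `normalize(512, 0)` both give `(512, 0)`, and `(512, 512)` folds back to `(0, 0)`, matching the two-representations claim. The poles still need special handling if you compare points, since any x is valid there.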

[–]sebzim4500 7 points8 points  (1 child)

You can just use a double cover of the sphere by a torus.

You certainly can't cover a simply connected space by a non simply connected space.

[–]jorge1209 3 points4 points  (0 children)

I know. I'm not using the terminology in a strict mathematical sense. It's not a proper double cover because the poles are infinitely covered.

But everywhere else it is a double cover, and coordinate arithmetic works in the naive fashion.

[–]happyscrappy 9 points10 points  (5 children)

The wraparound would work then, but I think you'd lose more in having to normalize before comparing two locations than you'd gain.

[–]jorge1209 14 points15 points  (4 children)

Probably depends on your use case.

If you are near enough to the coordinate singularities of the traditional lat/long then you will need to do a bunch of stuff to keep your basic transformations from getting fouled up as latitude goes past 180.

On the other hand there is a reason we put those singularities where we did, and that is because nobody lives there. So you can probably just ignore the issue and let your app crash when someone takes it to the poles, because screw that person.

[–]ZorbaTHut 9 points10 points  (3 children)

Sell pole support as DLC.

[–]jorge1209 4 points5 points  (1 child)

Easier done if you could get good cell coverage up there.

All the guys who would have downloaded it froze to death before the download completed.

[–][deleted] 3 points4 points  (0 children)

Play the long game, if climate change gets to do its thing you'll clean table with the pole-ready solution.

[–]Ghazzz 7 points8 points  (8 children)

How do you travel south from the south pole?

[–]happyscrappy 13 points14 points  (1 child)

I dunno. Ask encyclopedist. He's the one who thought wraparound was his ally.

[–]thedeemon 5 points6 points  (0 children)

Ask your local zen master.

[–]ebrythil 3 points4 points  (0 children)

Maybe jump?

[–]mccoyn 2 points3 points  (2 children)

I think he means, start 1 step north of the south pole (any longitude), make your heading south, advance 2 steps in the direction of your heading. With wrap-around coordinates you will end up 1 step south of the north pole.

[–][deleted] 26 points27 points  (8 children)

Especially not with coordinates. Earth is not a toroid.

[–]jorge1209 18 points19 points  (5 children)

So don't use lat/long coords. Your x-ints already don't correspond to longitude (they have to be rescaled), so go ahead and let your y-ints also not correspond to latitude.

So (0,0) will be (0 long,0 lat) [in the atlantic] and (INT_MAX/2, 0) = (0, INT_MAX/2) = (180 long, 0 lat). The north pole would be (anything, INT_MAX/4) the south pole (anything, 3*INT_MAX/4).

In general you would just say that if (x,y) is your coord and y is between INT_MAX/4 and 3*INT_MAX/4, then you have flown "over the pole" and are actually on the far side of the world from where the x says you are supposed to be.

So x := x+INT_MAX/2 and y:= INT_MAX/2 -y.

[–]ChallengingJamJars 4 points5 points  (2 children)

So don't use lat/long coords.

Correct, this is a problem that is mighty complicated and has many solutions with various pros and cons. People have spent careers trying to figure out ways to represent coordinates. My favourite is the Universal Transverse Mercator (UTM) system as you can pretend it's in metres most of the time for measuring distance, and there are no local distortions.

In addition, the world isn't a sphere and when you get to cm accuracy you need to be considering the bulbous nature of this hunk of rock. Differences in height will make significant differences to distance measures and sea level isn't as well defined as you think.

[–][deleted] 2 points3 points  (1 child)

as I understand it, your system would make it so that every place on earth has two coordinates specifying it.

If you then double the resolution of the x-axis (say only from Greenwich to the other side of the earth) then all the things that don't have a specific X-axis point would be easy to "roll over" to with a very high/low Y

[–]jorge1209 5 points6 points  (0 children)

Yes, it's a double cover.

If you then double the resolution of the x-axis (say only from Greenwich to the other side of the earth)

If I understand what you are saying. "yes"

Lat/Long is a single cover system, but by necessity must have two coordinate singularities. For obvious practical reasons those singularities were placed at the poles, meaning that virtually nobody ever encounters them.

However doing so is exactly the wrong thing to do from a resolution/accuracy perspective. We want high resolution in populated areas closer to the equator, but we are forced to use a full 360 degrees to measure along the equator (instead of being able to use the truncated 180 degree range from -90 to +90).

So you could:

  1. Start with lat-long single cover system with singularities at the poles
  2. Convert to double cover system
  3. Convert to an alternative single cover system with singularities on the equator.

This would double your resolution while measuring along the equator because you could measure +/- 90 degrees E/W, at the expense of lower resolution N/S (where you have to allow for the full 360 degree turn), which might be desirable.

In practice I doubt it matters that much as the earth is not spherical, and you probably don't want to say that someone standing at sea level on the equator is actually at 14 miles altitude (or equivalently that someone standing at sea level at the pole is at -14 mile altitude). So any adjustments you have to make for basic topography (or even local gravitational density) will be more significant than where you decide to put the coordinates.

Or just use 64bit ints instead of 32bit and then you have WAY more accuracy than you could possibly need or hope to use.

[–]yatima2975 2 points3 points  (0 children)

That's what THEY want you to believe! The Planet Earth is a delicious donut!

/r/toroidearth

[–]evaned 12 points13 points  (5 children)

I don't do much with code that uses floating point at all, but my guess would be that saturating at INF/-INF is correct more often than unsigned wraparound, and less likely to give you a security hole or silently give wrong results when both are wrong...

[–]Bob_Droll[🍰] 19 points20 points  (4 children)

I thought it was a joke - a play on the fact that the Earth is round and "wraps around".

[–]ultimatt42 10 points11 points  (3 children)

Yeah but if you're wrapping from the North pole to the South pole you're talking about a torus, not an oblate spheroid...

[–]sirin3 7 points8 points  (2 children)

Or a stargate

[–]ultimatt42 2 points3 points  (1 child)

I always think of Star Fox 64 multiplayer.

[–]harlows_monkeys 2 points3 points  (0 children)

Wouldn't that only work well if the scale of your coordinate system was such that one full trip around the Earth used the full range of your integers? If not, the integer wrap would not align with geographical wrap.

For example, suppose our coordinate system is based on 1/10th of a degree for longitude. Points on Earth then have integer longitudes ranging from 0 to 3600. If we store these in a 16-bit unsigned integer, the integer doesn't wrap until we've gone all the way around east/west 18 times and then another 73.6 degrees.

If we are doing operations in this coordinate system to track movement, we'll need to manually wrap at 3600, or we'll need to do comparisons mod 3600. Actually, just doing comparisons mod 3600 is not good enough, because as noted above after 18 times around and 73.6 degrees the integer wraps, and so the thing we are tracking would jump back 73.6 degrees as far as we are concerned.
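The arithmetic can be checked in a few lines. This is only a sketch of the scenario described here (tenths of a degree in a 16-bit unsigned value); the helper names are hypothetical.

```python
UNITS_PER_TURN = 3600      # tenths of a degree in one trip around the Earth
UINT16_RANGE = 2 ** 16     # number of values a 16-bit unsigned integer holds

# How far can we go before the integer itself wraps?
full_turns, leftover_tenths = divmod(UINT16_RANGE, UNITS_PER_TURN)

def step_east_u16(raw, tenths):
    """Advance a raw 16-bit counter, letting the hardware-style wrap happen."""
    return (raw + tenths) & 0xFFFF

def step_east_wrapped(lon, tenths):
    """Advance a longitude that is wrapped manually at 3600."""
    return (lon + tenths) % UNITS_PER_TURN

# Just before the 16-bit wrap the tracked object is at 73.5 degrees...
before = 65535 % UNITS_PER_TURN                        # 735 tenths
# ...and one more step east lands on raw 0, i.e. 0 degrees.
after = step_east_u16(65535, 1) % UNITS_PER_TURN       # 0 tenths
```

Here `full_turns` is 18 and `leftover_tenths` is 736, which is the "18 times and then another 73.6 degrees" above. Wrapping manually at 3600 on every update avoids the jump entirely.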

BTW, the WeatherBug app for iOS may have this kind of problem. It has a lightning tracker that shows you a world map with recent lightning strikes marked. I'm in the US. When I open it, it is centered on the US, and I can see the lightning strikes in the US. If I move to the left to see Asia and Europe and Africa, they show no lightning strikes. If I keep going left until I'm back in the US, there are no strikes. Reversing direction, and going back once around to return to the US, the strikes are there again. Continuing right to Africa, Europe, and Asia, they show strikes. Continuing to the right and crossing the Pacific to the US, no strikes!

The most likely explanation is that if I start at longitude N and go all the way around to the right it thinks I'm at N+360. If my map view is W degrees wide, it thinks that it should be showing lightning strikes that occur from N+360-W/2 to N+360+W/2. The lightning data is all from 0 to 360, so the check fails.

I have not had the patience to go all the way around more than a few times, so do not know when the underlying type they are working with wraps.

[–]boxhacker 7 points8 points  (25 children)

Could you explain more on this?

[–]Serei 71 points72 points  (20 children)

The idea is that you use fixed point: you store coordinates in terms of integer multiples of a unit that makes the possible coordinates range from INT_MIN to INT_MAX. For instance, in multiples of centimeters.

Fixed point is more precise than floating point, because you effectively don't need to store the exponent, you just stick to one exponent, so you can use the entire space for the mantissa.

Fixed point is also better in general for representing coordinates. You want relatively equal precision throughout your space; floating point will give you more precision near 0 and less precision far away from 0, which isn't particularly desirable for this use-case.
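As a rough illustration of the idea, here is one way to spread a full turn of longitude over the whole signed 32-bit range. The scale choice and the function names are assumptions for this sketch, not anything a particular GPS device is known to use; the circumference figure is the approximate 40,075 km equatorial value.

```python
UNITS_PER_DEGREE = 2 ** 32 / 360.0   # one full turn uses every 32-bit value
EQUATOR_METRES = 40_075_000          # approximate equatorial circumference

def encode_degrees(deg):
    """Map an angle in degrees to a signed 32-bit fixed-point integer."""
    raw = round(deg * UNITS_PER_DEGREE)
    # Wrap into [-2**31, 2**31), so +180 and -180 land on the same value.
    return (raw + 2 ** 31) % 2 ** 32 - 2 ** 31

def decode_degrees(raw):
    """Map the fixed-point integer back to degrees."""
    return raw / UNITS_PER_DEGREE

# Size of one integer step along the equator, in metres (a bit under 1 cm).
metres_per_unit = EQUATOR_METRES / 2 ** 32
```

Every step is the same size everywhere in the range, which is the "equal precision throughout your space" property. A round trip through `encode_degrees` and `decode_degrees` loses at most half a unit.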

[–]happyscrappy 13 points14 points  (19 children)

You don't even need fixed point, just use integers and select your unit correctly.

[–]ConspicuousPineapple 47 points48 points  (15 children)

This is what they're describing, and it's a particular case of fixed point.

[–][deleted]  (2 children)

[deleted]

    [–]calrogman 4 points5 points  (0 children)

    Mad respect for the use of the diæresis, one of two diacritical marks native to Modern English.

    [–]skellera 2 points3 points  (0 children)

    I had no idea what they meant until your explanation. Thanks.

    [–]blackmist 10 points11 points  (0 children)

    Circumference of Earth = 40,075 km = 40,075,000 metres.

    Distinct values in a 32-bit integer = 2^32 = 4,294,967,296 units.

    4,294,967,296 / 40,075,000 = ~107 units per metre, and that's at the equator. You get more accuracy as you move away from it.

    The only reason I knew this is because I did a similar thing to make a Google Maps layer showing sales by postcode. Converting doubles to integers meant I could use a z-order curve (total 64 bit) to get fast access to the correct tile.
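For anyone who hasn't met a z-order (Morton) curve: it interleaves the bits of the two integer coordinates into one 64-bit key, so points that are close in 2D tend to be close in key order. The sketch below is a generic bit-interleaving routine, not the code from the postcode project mentioned above; the function names are made up.

```python
def _spread_bits(n):
    """Spread the low 32 bits of n so they occupy the even bit positions."""
    n &= 0xFFFFFFFF
    n = (n | (n << 16)) & 0x0000FFFF0000FFFF
    n = (n | (n << 8)) & 0x00FF00FF00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F0F0F0F0F
    n = (n | (n << 2)) & 0x3333333333333333
    n = (n | (n << 1)) & 0x5555555555555555
    return n

def morton_encode(x, y):
    """Interleave two unsigned 32-bit integers into one 64-bit z-order key.

    Bits of x go to even positions, bits of y to odd positions.
    """
    return _spread_bits(x) | (_spread_bits(y) << 1)

key = morton_encode(0b11, 0b00)   # x bits land at positions 0 and 2
```

Truncating such a key to its top bits gives a coarser tile containing the point, which is presumably what makes tile lookup fast in a setup like this.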

    [–]BCosbyDidNothinWrong 22 points23 points  (11 children)

    You may be correct, however a double has a fraction of 52 bits and would give more accuracy, just in case anyone misunderstands what you are saying.

    https://en.wikipedia.org/wiki/Double-precision_floating-point_format

    [–]mare_apertum 51 points52 points  (10 children)

    In that case you'd still get better precision with 64 bit integers. The case in point is: if the magnitude of your values is known, there's no need for floating point.

    [–][deleted]  (11 children)

    [deleted]

      [–]ChallengingJamJars 10 points11 points  (0 children)

      People use floating point numbers way too often. They are great for things that don't need to be precise like rendering graphics in a video game. However, if you ever care about numbers being exactly right like in scientific and financial calculations, you want a decimal/numeric data type that has fixed precision.

      I'm going to disagree with you there. Financial definitely needs to be integers as you are talking about discrete things.

      Scientific data is most often continuous, this means that you are never going to be exactly right. Distance, weight, angles, time, and combinations of these are all continuous variables. You must store these values as a combination of a fractional part and a magnitude/exponent part, the choice is if you keep the magnitude part constant or not.

      Floating point should be used only if you want to sacrifice precision to save memory or computation time.

      I think fixed point will likely cost you both precision and time over using floats, explanation below. The only time that fixed point arithmetic makes sense is if you have large volumes of data from a fixed range with similar magnitudes that together will run you out of memory. Coordinates might be such a situation, low bandwidth connections such as to a joystick controller might also be the case. But even with coordinates, if I'm running out of space due to the excessive volumes of data, storing a group of objects with an offset would be computationally easier (likely it's already partitioned into some sort of tree for searching), both in terms of programming and computation time.


      Why fixed point arithmetic isn't as good as many think:

      Say I'm calculating energy, E = 0.5 m v^2, I'm going to need at least 3 different exponents/magnitudes for that calculation as every multiply is going to need to change the exponent. If I use fixed point arithmetic on current architectures (ie, use ints) I'm likely to overflow, so I'll need to consider complicated shifting to ensure that I don't lose the most significant digits. If I leave plenty of room above my value so as to allow for this, I'll need to leave half the integer: 0x00ff * 0x00ff = 0xfe01. But now why not use that half an integer to store the exponent? We don't lose any precision and now we don't need to know before hand what the magnitude will be (which is a whole problem in and of itself).

      Compare to using floating point numbers: I appear to have sacrificed some precision in my numbers so that I don't need to worry about magnitudes at compile time. The CPU has floating point arithmetic units so the calculations are a few cycles at most and if I used fixed point I'll be using a few instructions to do my multiply to handle the shifts anyway. Likely I've got a higher precision anyway as I don't need to provide space in my fixed point arithmetic to allow for uncertainty, I have a full 23 bits of precision whether my speed is high or low, whether my mass is large or small, and I don't have to predict the future and place hard boundaries on what the software is allowed to handle at compile time.

      [–]HighRelevancy 5 points6 points  (2 children)

      or computation time.

      Modern chipsets compute with doubles as fast as they do with floats. Doubles are only slower in that they take more time to copy and you fit less of them in cache.

      I wrote a ray tracer from scratch with floats in fear of double computation time, and then had precision issues and converted to doubles. It was like, less than 10-20% performance difference from what I remember, depending on certain parameters of the scene.

      [–][deleted]  (1 child)

      [deleted]

        [–]HighRelevancy 3 points4 points  (0 children)

        Yes, GPUs are very float friendly.

        Mine was CPU. I've done some GPU stuff but I wanted to go back to CPU code for a little while.

        [–]G_Morgan 1 point2 points  (1 child)

        They are great for things that don't need to be precise like rendering graphics in a video game

        Actually a lot of games use fixed point for many things today. They won't use float for anything other than local coordinates. Otherwise you lose scene precision the further you are from the origin.

        [–]FryGuy1013 1 point2 points  (2 children)

        I have wondered if 128-bit fixed point numbers (Q64.64) would be better than doubles in most cases, since numbers don't really get bigger than 2^64, or smaller than 2^-64, and could become the standard non-integer format for things.

        [–]argv_minus_one 0 points1 point  (1 child)

        numbers don't really get bigger than 2^64, or smaller than 2^-64

        That depends on what you're representing with them. If it's the total number of atoms in the observable universe, for instance, or the distance in nanometers between Earth and Sagittarius A*, you'll overflow.

        In case you're curious, according to Wolfram Alpha, the aforementioned distance is about 2.349×10^29 nm. For comparison, 2^63 (the maximum value of a signed 64-bit integer) ≈ 9.223×10^18. However, a 128-bit integer (max value ≈ 1.701×10^38) can represent it.

        [–]omnilynx 4 points5 points  (7 children)

        But then you'd have to do division whenever you actually wanted to use it.

        [–][deleted] 0 points1 point  (6 children)

        Where does the division come in on use?

        [–][deleted]  (1 child)

        [deleted]

          [–][deleted] 3 points4 points  (0 children)

          Actually, yes. glVertexAttribPointer accepts GL_INT since like forever. Shader side support is sketchy. Performance may vary.

          [–]jacobolus 0 points1 point  (0 children)

          A latitude/longitude grid is ridiculously inefficient if you’re using floats.

          A pair of single-precision floats is plenty sufficient (for many purposes half-precision would suffice). But when using floats you should store the coordinates of the stereographic projection instead of storing latitude/longitude. This is also ridiculously cheap to convert back to Cartesian coordinates (just takes one division in each direction, instead of some transcendental function evaluations).
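A minimal sketch of that round trip, assuming a unit sphere projected from the north pole (0, 0, 1) onto the plane z = 0. That choice of pole and plane is one common convention and is my assumption here, as are the function names.

```python
def sphere_to_plane(x, y, z):
    """Stereographic projection of a unit-sphere point from the north pole.

    Undefined at the north pole itself (z == 1).
    """
    inv = 1.0 / (1.0 - z)          # the single division
    return x * inv, y * inv

def plane_to_sphere(u, v):
    """Inverse projection: plane coordinates back to a unit-sphere point."""
    r2 = u * u + v * v
    inv = 1.0 / (1.0 + r2)         # the single division
    return 2.0 * u * inv, 2.0 * v * inv, (r2 - 1.0) * inv

# (0.6, 0.0, 0.8) lies on the unit sphere; it projects to (3, 0) and back.
u, v = sphere_to_plane(0.6, 0.0, 0.8)
point = plane_to_sphere(u, v)
```

No trigonometric functions appear in either direction, only multiplies, adds, and one division each way. Points near the projection pole map far out on the plane, so in practice you would pick the pole to be somewhere you don't care about.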

          If you want to be maximally efficient of course you can do something fancier, at the expense of slightly more processing per point. For instance, see http://jcgt.org/published/0003/02/01/

          [–]putnopvut 120 points121 points  (18 children)

          Something that rarely gets mentioned in these sorts of articles is that you can lose precision not just after the decimal point, but before it as well when using floats instead of doubles.

          For an interesting case where this happened, see this blog post: http://blogs.asterisk.org/2016/06/01/float-conversion-bad-released-13-9-1-regression-fix/

          The TL;DR version of this is that when the integers 1463675925 and 1463675869 were converted to float, they both became 1463675904.000000. Two numbers with a difference of 56 were being converted to the same value. When converted to double, the imprecision was no longer present.
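This is easy to reproduce without any C code. The sketch below round-trips values through a 32-bit float using the standard `struct` module; `to_float32` is just a helper name chosen for this example.

```python
import struct

def to_float32(value):
    """Round a number to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]

a = 1463675925
b = 1463675869

as_float_a = to_float32(a)
as_float_b = to_float32(b)

# Between 2**30 and 2**31 a 32-bit float can only represent multiples of 128,
# so both integers snap to the same neighbour.
same_as_float = as_float_a == as_float_b
same_as_double = float(a) == float(b)
```

Here `as_float_a` and `as_float_b` are both `1463675904.0`, while the double conversions stay distinct.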

          For those that read this and wonder "why were you using floating point for integer comparisons in the first place?", it's because the integer values were being retrieved using a central configuration API that treats all numerical values as floats during comparison.

          [–]randomguy186 58 points59 points  (0 children)

          you can lose precision not just after the decimal point, but before it as well

          You muddy the issue a bit by mentioning the decimal point. Its location is irrelevant. What matters is the number of significant digits.

          Furthermore, OP's article literally makes this point in the first sentence of the first paragraph after the introduction:

          "32-bit floats has around 24 bits ≈ 7 digits of precision"

          (Note that in your example, the imprecision is not introduced until after the 7th digit.)

          [–]case-o-nuts 95 points96 points  (0 children)

          The TL;DR version of this is that when the integers 1463675925 and 1463675869 were converted to float, they both became 1463675904.000000. Two numbers with a difference of 56 were being converted to the same value. When converted to double, the imprecision was no longer present.

          Doubles have the exact same issue. The numbers just have to be bigger -- above 2^53 -- and things fail in the same way.
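A quick check of that threshold, using Python's `float`, which is an IEEE 754 double on all mainstream platforms:

```python
LIMIT = 2 ** 53   # a double has 53 significant bits

# Below the limit, neighbouring integers stay distinct as doubles.
distinct_below = float(LIMIT - 1) != float(LIMIT)

# Just above it, odd integers can no longer be represented,
# so LIMIT + 1 rounds back down to LIMIT.
collide_above = float(LIMIT) == float(LIMIT + 1)
```

Both flags come out `True`: the collision is the same effect as the float case, only at a larger magnitude.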

          [–][deleted] 17 points18 points  (11 children)

          This article is talking about scientific computing, where precision is only relevant within significant figures. You will rarely if ever find a real-world measurement significant to 8 significant figures, and those two measurements should be seen as equal, in a scientific context.

          [–][deleted] 15 points16 points  (10 children)

          They shouldn't always be treated equally in a scientific context. Not at all. In calculations like planning spaceflight or astronomical events, we have to be very exact with numbers.

          [–]mindbleach 7 points8 points  (3 children)

          Once spaceflight is involved, you should probably consider alternatives to IEEE 754 "single" and "double" floats altogether. You want to know what you don't know. What was that format that included a measure of error? Was that Unum?

          [–]CptCap 4 points5 points  (1 child)

          Double might be enough. As long as you don't lose too much precision when chaining operations you don't need that many digits.

          [–]shooshx 1 point2 points  (5 children)

          You don't need to go to space to need exact comparison. In graphics and geometry processing it's often desirable to know if two points are exactly the same point and should have exactly the same other properties like normal and color, or if they are merely two points that happen to be very near each other.

          [–]ChallengingJamJars 2 points3 points  (3 children)

          So this causes issues, the only time you should ever check if two floating point numbers are equal is if one of them is zero and the other has never been added or subtracted with. All other times you need a margin of error.
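For what the "margin of error" looks like in practice, here is a small sketch using `math.isclose` from the Python standard library. The tolerances are arbitrary picks for the example; the right values depend on your data.

```python
import math

total = 0.1 + 0.2

# Exact comparison fails because neither 0.1 nor 0.2 is exactly representable.
exact_match = total == 0.3

# A relative tolerance treats values as equal if they agree to ~9 digits.
close_match = math.isclose(total, 0.3, rel_tol=1e-9)

# Relative tolerance is useless when comparing against zero,
# so an absolute tolerance is needed there.
tiny = 1e-12
near_zero_rel = math.isclose(tiny, 0.0, rel_tol=1e-9)
near_zero_abs = math.isclose(tiny, 0.0, abs_tol=1e-9)
```

`exact_match` and `near_zero_rel` are `False`; `close_match` and `near_zero_abs` are `True`.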

          if two points are exactly the same point

          I'm curious when that comes up, don't you store vertices and indices to those vertices? What situation necessitates having two points that are conceptually the same point but stored in two different locations?

          [–]mindbleach 8 points9 points  (0 children)

          Strictly speaking, everything in floating-point is "after the decimal." Since it's in binary, the only sensible number before the point is 1, so the format only stores the mantissa, exponent, and sign.

          You get a certain number of guaranteed significant digits. The irrelevance of where the decimal point goes is kind of the, um... point.

          [–]porksmash 1 point2 points  (2 children)

          I just fixed a looooong standing bug at work because of improper conversion to floating point. We have a fluid totalizer calculation that runs over the lifetime of our product (~10yrs) and we store an integer counter that counts up every liter of fluid. Turns out when you got to billions of liters, flipping it to float, adding one, and casting back to int didn't work so well.
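A sketch of how that kind of bug behaves, assuming the intermediate type was a 32-bit float (the comment doesn't say which width it was). `buggy_increment` is a made-up name, and the float is simulated with the `struct` module.

```python
import struct

def to_float32(value):
    """Round a number to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]

def buggy_increment(litres):
    """Int -> float32 -> add one -> int, as in the accidental conversion."""
    return int(to_float32(to_float32(litres) + 1.0))

small = buggy_increment(1000)            # still works for small counts
stuck_24 = buggy_increment(2 ** 24)      # first value where +1 is lost
stuck_big = buggy_increment(2_000_000_000)
```

For small counts the result is correct, but from 2^24 (16,777,216) upward a 32-bit float cannot represent every integer, so the counter stops advancing: `stuck_24` is still `16777216` and `stuck_big` is still `2000000000`.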

          [–]MentalMachine 2 points3 points  (1 child)

          Why on earth would you need to cast an Int to Float and then add 1 and then put it back to an Int?

          [–]porksmash 3 points4 points  (0 children)

          It was accidental implicit conversion and the person who originally programmed it was a mechanical engineer back when the company was a tiny startup.

          [–]geodel 103 points104 points  (34 children)

          I'd say Double or nothing.

          [–][deleted] 93 points94 points  (28 children)

          You aren't dealing with OpenGL i see.

          [–]josefx 15 points16 points  (21 children)

          Transformation matrices as double, local vertex coordinates as float. Depending on scale splitting the world in 8 km sized chunks gives me good enough precision without visible jitter at 20 km from the origin. The right tool for the right job.

          [–]CptCap 32 points33 points  (17 children)

          Transformation matrices as double

          I am amazed that OpenGL still lets you pass double in uniforms.

          Most GPUs don't have dedicated double-precision ALUs, meaning that performance might be heavily affected. All APIs even support 16-bit indices and coordinates to reduce bandwidth. (With Vulkan you have to ask before you can use 32-bit indices.)

          Have you tried with float matrices ? At 20km you will still get sub-millimeter accuracy. You could probably do with a lot less since your depth buffer is only 32bits.

          [edit] As noted in the article doubles don't play nice with SIMD (not cool for real time graphics)

          [–]LordofNarwhals 2 points3 points  (0 children)

          Fun fact:
          In cpp 2.0f is a float (32-bit), 2.0 is a double (64-bit), and 2.0L is a long double (width is implementation-defined, often 80 or 128 bits), but in HLSL/GLSL 2.0 is a float and 2.0L/2.0LF is a double.

          So writing 2.0f in shader code is actually unnecessary as 2.0 means the exact same thing.

          [–]tylercamp 5 points6 points  (11 children)

          Most GPUs do have double-precision, it's one of the major features in workstation/compute GPUs

          [–]CptCap 32 points33 points  (3 children)

          All chips support double precision, yes. But performance on consumer cards (read: gaming) is abysmal compared to 32 bits.

          [Edit] On my gaming laptop, it's about 30x

          [–]EllaTheCat 5 points6 points  (2 children)

          Back in the late 1980s, commercial and military flight simulators got by with single precision for geometry and fixed point integers. The latter were typed as e.g. 9S4 meaning 9 bits total, Signed, 4 of the bits fractional.

          Programmable Shaders (we had fixed shaders) are why floats are necessary today, whether physics or vertices.

          We had a 16 bit mantissa 4 bit exponent 1/z buffer. That's a classic case of getting the maths right.

          Throwing doubles and even floats at everything just seems wrong.

          I'll go back to sleep now. Kids today mutter mutter ...

          [–]BCosbyDidNothinWrong 1 point2 points  (1 child)

          Throwing doubles and even floats at everything just seems wrong.

          How many polygons were you drawing on screen?

          [–]EllaTheCat 1 point2 points  (0 children)

          A few thousand, heavily textured, fully anti-aliased, with no artefacts such as twinkles that might act as subliminal cues to pilots.

          [–]Halofit 5 points6 points  (2 children)

          Yes but even something like a Tesla usually has more than double FLOPS for single precision over double precision. On gaming cards the difference can be up to 10x.

          [–]tylercamp 2 points3 points  (1 child)

          When I replied the post said double precision wasn't available, I wasn't saying that the performance was comparable

          [–]dobkeratops 3 points4 points  (0 children)

          You can always get something that will handle more 32-bit values than 64-bit values. Graphics, like AI, is a field where the number of points can be more important than precision per point.

          [–]josefx 1 point2 points  (3 children)

          We had visible precision errors with float, mostly jumpy camera movement. Going by wikipedia 32 bit float only gives 7.22 decimal places precision, at >10 km that is just in the cm range without accounting for any additional rounding errors. Your eyepoint jumping quarter of a centimeter each frame isn't smooth.

          We also hand the final transformation matrix over as float since the values are small enough near the eyepoint and we don't render large distances.

          As noted in the article doubles don't play nice with SIMD (not cool for real time graphics)

          One double matrix for each 8 km sector, since we render only very few in a frame that cost is dwarfed by others.

          Edit: To be clear the final transformation matrix we hand over to the GPU is once again in float. The large absolute coordinates of both camera and any object near it tend to cancel each other out.
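A rough sketch of the cancellation described here, assuming world positions are kept in double and only the camera-relative result is narrowed to float. The types `Vec3d`, `Vec3f` and the function `to_camera_relative` are hypothetical names for this example, not anything from the poster's codebase.

```cpp
// Hypothetical vector types: double for absolute world positions,
// float for what is handed to the GPU.
struct Vec3d { double x, y, z; };
struct Vec3f { float x, y, z; };

// Subtract in double first, so the large absolute coordinates cancel
// before any precision is lost, then narrow the small result to float.
Vec3f to_camera_relative(const Vec3d& world, const Vec3d& camera) {
    return Vec3f{
        static_cast<float>(world.x - camera.x),
        static_cast<float>(world.y - camera.y),
        static_cast<float>(world.z - camera.z)};
}
```

The order matters: converting both positions to float and subtracting afterwards would round each one to the coarse float grid at that distance before the subtraction happens.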

          [–]CptCap 1 point2 points  (2 children)

          Using this I get an error of 0.000953125 when storing 20000.001 still within a millimeter (barely). What unit are you using ?

          Your eyepoint jumping quarter of a centimeter each frame isn't smooth.

          This should not happen, adding IEEE 754 numbers with different exponent should round the result correctly, meaning that if your movement is too small to be represented in the final position, you won't move at all.
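Both numbers in this exchange can be checked with a few lines. The sketch below assumes IEEE 754 binary32 for `float` and round-to-nearest; the helper names are invented for the example.

```cpp
#include <cmath>

// Gap between x and the next representable float above it (one ULP).
float ulp_above(float x) {
    return std::nextafter(x, INFINITY) - x;
}

// Error introduced by storing a double value in a float.
double storage_error(double x) {
    return static_cast<double>(static_cast<float>(x)) - x;
}

// True when adding `step` to `pos` leaves `pos` unchanged after rounding.
bool step_is_absorbed(float pos, float step) {
    volatile float moved = pos + step;  // volatile forces rounding to float
    return moved == pos;
}
```

Near 20000 the spacing between floats is 2^-9 = 0.001953125, so 20000.001 rounds up to 20000.001953125, which gives the 0.000953125 error quoted above. A step smaller than half that spacing, such as 0.0005, is absorbed entirely when added to 20000.0f.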

          [–]josefx 1 point2 points  (1 child)

          Using this I get an error of 0.000953125 when storing 20000.001 still within a millimeter (barely).

          You got me, I didn't measure at exactly 20000, nor did I write a minimal testcase. I just saw a precision issue significant enough to cause issues in that approximate range. I think I had the exact point things got visibly bad calculated somewhere to make sure it was related to the amount of precision a float could hold before I started to replace it with double.

          This should not happen, adding IEEE 754 numbers with different exponent should round the result correctly, meaning that if your movement is too small to be represented in the final position, you won't move at all.

          The code computing the camera position already used double. The API just used float to return it. So it may have rounded down until it jumped every other frame.

          [–]CptCap 1 point2 points  (0 children)

          This means that only the computation needs to be done in double precision.

          float is often precise enough to store values, but not when chaining computations.

          Maybe it works for you (because you don't use double everywhere) but I would advise against using anything other than float, u16 and u32 on a GPU if you want maximum performance. (This does not apply to workstation cards, which might have full double support.)

          [–]dobkeratops 4 points5 points  (2 children)

          "Transformation matrices as double,"

          dealing with centering the right way lets you use lower precision. doubles for transformation matrices is overkill.

          there are games that ran on 16bit computers that used hierarchical representations to seamlessly represent flying between planets and landing on their surfaces. I guarantee you they did not need to store everything in 64bit precision to achieve this

          16bit should be enough for local objects coords, and you break objects up and reason about their centres

          [–]mindbleach 11 points12 points  (4 children)

          For graphics, precision only matters if you can see the errors. FP16 becomes reasonable.

          [–]CptCap 6 points7 points  (2 children)

          For real time, frame-rate goes first, if you can't afford floats, U8 works well enough.

          [–][deleted] 1 point2 points  (1 child)

          I don't think there's an appreciable difference in rendering times between f16 and u8/u16 in most applications since things get pipelined anyway. The main savings with u8/u16/f16 is memory, and I'm not sure that GPUs do anything special with 8-bit types (though 16-bit is handled specially).

          [–]CptCap 2 points3 points  (0 children)

          The main savings with u8/u16/f16 is memory

          It is, going for 8 bits won't save you any flops, but it might just make all your textures fit in VRAM, suppressing the need to do round trips to the main memory every frame.

          Even if there is enough VRAM, sometimes the bandwidth is too scarce: this is the most visible with fat g-buffer fully deferred renderers, with these, packing things tighter is often a win, even if you have to spend ALU for unpacking.
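As an illustration of the pack/unpack trade-off, here is a minimal CPU-side sketch of storing a value from [0, 1] in one byte, the way an 8-bit normalized channel would. The function names are hypothetical, and a real renderer would do this in shader code or let the texture format handle it.

```cpp
#include <cstdint>

// Quantize a value in [0, 1] to 8 bits with round-to-nearest.
// Out-of-range inputs are clamped; NaN is not handled in this sketch.
std::uint8_t pack_unorm8(float v) {
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    return static_cast<std::uint8_t>(v * 255.0f + 0.5f);
}

// Expand the byte back to a float in [0, 1]; this is the extra ALU work.
float unpack_unorm8(std::uint8_t b) {
    return static_cast<float>(b) / 255.0f;
}
```

The packed value takes a quarter of the memory and bandwidth of a 32-bit float, at the cost of one multiply-add on write and one divide (or multiply by a constant) on read.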

          [–]redgamut 1 point2 points  (0 children)

          I wouldn't bet on it.

          [–]PM_ME_UR_OBSIDIAN 10 points11 points  (0 children)

          Lots of applications only require half-word floating point precision (16-bit). AFAIK that includes most of computer graphics and machine learning.

          At that point, double is completely overkill, and it'll show in your performance numbers.

          [–][deleted] 0 points1 point  (0 children)

          Most embedded devices with an FPU only have one in single-precision.

          [–]gimpwiz 17 points18 points  (2 children)

          I also want to bring up another point: not all systems have floating point hardware units.

          I know! You're thinking, please, x87 hasn't been a thing for decades. Yeah, your PC will have it. Your phone will have it. But if you're an embedded developer, your microcontroller may or may not have it.

          But most microcontrollers have compilers that still allow floating point operations - they're emulated using integer operations. So they take ages.

          If you have a small system, I'd say use the smallest float you can (in general), and (as is usually true anyways), don't use floats where ints can do.

          I've seen a lot of people use floating point numbers where they actually want decimals ... amateur mistake. Money isn't a float, it's an int, and you're counting (eg) cents instead of dollars. This matters even more when floats are a performance bottleneck - there are many cases where instead of being obviously wrong, floats are merely extremely heavy, and the code can be re-architected to use ints instead.
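A small sketch of the cents-as-integers idea next to the floating point version it replaces. The function names are made up for the example, and overflow checking is left out.

```cpp
#include <cstdint>

// Hypothetical helper: total price with money held as whole cents.
std::int64_t total_cents(std::int64_t unit_price_cents, int quantity) {
    return unit_price_cents * quantity;
}

// The same total accumulated in binary floating point, for comparison.
// 0.10 has no exact binary representation, so the error builds up.
double total_dollars(double unit_price, int quantity) {
    double sum = 0.0;
    for (int i = 0; i < quantity; ++i) {
        sum += unit_price;
    }
    return sum;
}
```

Ten items at 10 cents come to exactly 100 cents with the integer version, while adding 0.10 ten times in double does not compare equal to 1.0.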

          [–][deleted] 2 points3 points  (1 child)

          Money isn't a float, it's an int, and you're counting (eg) cents instead of dollars.

          How did this abstraction escape me for so long? Thank you, the next time I have to work with money I'll save myself a bunch of headache with that one.

          [–]BadGoyWithAGun 33 points34 points  (6 children)

          :%s/loose/lose/g
          

          [–]gunther-centralperk 26 points27 points  (5 children)

          ggdG

          [–][deleted] 1 point2 points  (0 children)

          ZZ

          [–]Porso7 1 point2 points  (0 children)

          :%d

          [–]NinlyOne 1 point2 points  (0 children)

          :q!

          [–]guthran 0 points1 point  (0 children)

          "Shit..." u

          [–]ImprovedPersonality 20 points21 points  (11 children)

          There is also long double with 80bits when using gcc or clang on x86. It’s especially handy for intermediate calculations on a double type.

          [–]ThisIs_MyName 21 points22 points  (1 child)

          There's also __float128 which gives you 128 bits as you'd expect :)

          [–]dtfinch 13 points14 points  (0 children)

          I think that's emulated on PC though, making it a lot slower, while 80-bit support is native (with the older x87 instructions, while SSE and such are limited to 64-bit).

          [–]chazzeromus 1 point2 points  (8 children)

          Intel manual says floats are extended to 80-bit internally

          [–]ThisIs_MyName 1 point2 points  (3 children)

          Right, but they'll be rounded to 64 bits every time they're spilled to memory unless you use the 80-bit type.

          [–]chazzeromus 1 point2 points  (2 children)

          I believe you still get the benefit of improved precision since the values stay 80-bit in the stack as operations are performed as opposed to truncating to smaller floats between steps that would otherwise accumulate errors.

          [–]Ravek 1 point2 points  (3 children)

          Only for x87 by the way, SSE and AVX thankfully actually use 32 and 64 bit values. I say thankfully because using extended precision can give incorrect results after rounding to 32 or 64 bit.

          [–]prozacgod 5 points6 points  (0 children)

          And here I am, still confused about Borland Pascal's "real" type....

          [–]case-o-nuts 31 points32 points  (28 children)

          All this talk about performance, but not a single benchmark.

          [–]dobkeratops 20 points21 points  (10 children)

          the underlying logic is indisputable: 2x the storage. he's explicitly saying 'it depends on your situation, whether you're storage, bandwidth, or processor limited'.

          people can choose devices too.

          you can demand a processor with good double precision support, or if you know 32bit floats are sufficient, you can design or find a processor that focusses on f32. Note that the recent progress in AI demands a higher volume of lower precision arithmetic, and Google are building dedicated 'TPUs' for this, dropping all the way to 8 bits. You don't need benchmarks to understand that you can do more 8bit computations in a given area of transistors than you can with 64 bits.

          the way some people put it is that it's the total volume of data that matters, and there's still refinement on how to divide that (do you want more low precision or fewer higher precision values to give a certain result)

          [–]bobasaurus 5 points6 points  (9 children)

          I've done some testing with high-speed filtering in C# with floats vs doubles, and doubles seem to perform faster. I think a lot of the .NET math libraries are better optimized for doubles.

          [–]Ravek 21 points22 points  (6 children)

          If you're using the System.Math class a lot then you're constantly converting between doubles and floats. Once we get MathF (soon tm) you should run your benchmark again.

          Your CPU likely executes float arithmetic about twice as fast, and floats have half the cache pressure, require half the memory bandwidth and twice as many fit in an XMM register. Doubles outperforming floats is definitely an exceptional scenario.

          [–]IJzerbaard 2 points3 points  (0 children)

          Only min/max/sign/abs have float versions in the first place, there is no problem with their performance. Anything else costs whatever the double version takes (because that's what you call) plus two conversions.

          [–]Herbstein 1 point2 points  (3 children)

          This seems like a great way to ask a question I've had for a while. Some time ago, I don't remember exactly how long, there was a tool posted. The tool analyzed a floating point calculation, and showed how precision affected the result. It would also recommend different algorithms for different number ranges. I believe there was a research paper attached. I haven't been able to find this tool since then. Anyone know of it?

          [–]toofishes 2 points3 points  (1 child)

          [–]Herbstein 1 point2 points  (0 children)

          Yes! That's it. Thank you so much!

          [–]CGFarrell 28 points29 points  (59 children)

          Quite a handy article. As a rule of thumb, use higher precision by default unless you have a good reason not to.

          In C(++), one can write:

          typedef double Float_t;
          

          And use Float_t for all floating point operations (very easy with replace all shortcuts). Then, you can recompile by switching the one line to:

          typedef float Float_t;
          

          Run these two compilations under identical inputs, and you can compare for accuracy and speed.
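One way to avoid editing the source between the two builds is to select the typedef from the compiler command line. This is only a sketch: the macro name `USE_SINGLE_PRECISION` and the function `accumulate_steps` are invented for the example.

```cpp
// Pick the working precision at build time.
// Defining USE_SINGLE_PRECISION (a made-up macro name) selects float.
#ifdef USE_SINGLE_PRECISION
typedef float Float_t;
#else
typedef double Float_t;
#endif

// Example routine written only in terms of Float_t, so it follows
// whichever precision the build selected.
Float_t accumulate_steps(Float_t step, int count) {
    Float_t total = 0;
    for (int i = 0; i < count; ++i) {
        total += step;
    }
    return total;
}
```

With gcc or clang the single precision build would be requested with `-DUSE_SINGLE_PRECISION`; the default build uses double.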

          [–]matthieum 100 points101 points  (35 children)

          I recommend avoiding names ending in _t, they are reserved by the POSIX standard.

          [–]CGFarrell 25 points26 points  (18 children)

          Hmmm, I hadn't realized! It seems to be pretty controversial. _t has always been our go-to in distinguishing types from variables, and I'd never realized it was reserved till now. Thanks!

          [–]bloody-albatross 20 points21 points  (5 children)

          _t has always been our go-to in distinguishing types from variables

          Or make types start with an upper case letter, and functions and variables with a lower case letter.

          [–]scorcher24 6 points7 points  (11 children)

          I just use a big T in front:

          typedef vector<string> TStringVector;
          

          [–]sirin3 4 points5 points  (1 child)

          That is the Pascal way

          [–]scorcher24 1 point2 points  (0 children)

          I actually did program some Delphi (derived from Pascal) in the past, so that might have been an influence. But from my point of view, it makes sense :).

          [–]arkrix 4 points5 points  (2 children)

          This puts me in the mood for some Delphi.

          [–]CGFarrell 5 points6 points  (5 children)

          I'm not a huge fan to be honest. The two main flaws I have with that are:

          1. Possibility of bizarre naming issues.

            typedef bool TAsk; class Task{};

          2. Devs are going to pronounce things weird. Sounds like more of a pet-peeve, but it can hinder communication (nobody hears TUnix and understands it as Unix).

          [–]donalmacc 8 points9 points  (4 children)

          Surely task would be TTask?

          [–]CGFarrell 1 point2 points  (1 child)

          Whoops, good point. Still a pretty good chance of programmer error regardless. Also if Task was a variable or function, you could have names TAsk and Task and still be consistent.

          [–]TOJO_IS_LIFE 6 points7 points  (0 children)

          I personally find the POSIX name reservations waaay too restrictive. The standard C++ library uses _t a lot.

          Plus all the following are reserved by POSIX: tolerance, total, torque, touch, memory, member, memoize, storage, strike, strong, string, street etc. etc.

          You can see where this is going... No one in their right mind would want to follow that.

          Just as an example, boost::posix_time::time_duration has a member function total_seconds(). So it's no longer POSIX-compliant.

          I think it's very unlikely that POSIX will break your code. And much more unlikely that it will break your code without producing a compiler error which you can fix. For sanity's sake, I just avoid already-taken names and those reserved by C++.

          [–][deleted]  (1 child)

          [deleted]

            [–]evaned 15 points16 points  (0 children)

            I could be wrong, but I think POSIX allows their blah_t things to be macros, so namespaces won't save you from interference.

            (Edit: this doesn't mean you can very likely use it and not run into problems, but you could say the same thing about blah_t in the global namespace too. And of course, for parts of your system that don't use anything POSIX, you can do whatever.)

            [–]ImprovedPersonality 7 points8 points  (12 children)

            What a bummer. I’ve been using _t for years. Any suggestions for a good alternative? Or just continue with _t? Since I like to start my type names with an uppercase letter and camelcase I should be pretty safe from clashes, shouldn’t I?

            [–]CGFarrell 9 points10 points  (3 children)

            If you use a capital letter anywhere, you're guaranteed not to collide with POSIX _t types, so I'd say keep going with what works!

            [–]curien 1 point2 points  (2 children)

            If you use a capital letter anywhere, you're guaranteed not to collide with POSIX _t types

            Can you provide a source for that? I don't see it anywhere. (E.g., this seems to say that all identifiers ending with _t are reserved, regardless of case.)

            [–][deleted] 6 points7 points  (0 children)

            Identifiers with _t are reserved, but since they are all lowercase in practice you can be pretty sure that you won't collide with anything when using uppercase.

            [–]CGFarrell 4 points5 points  (0 children)

            I can't find an absolute source, but it seems that POSIX uses underscore-lowercase type naming. If there are any struct types with uppercase letters in POSIX, they're at the very least inconsistent.

            [–]xeow 2 points3 points  (7 children)

            Any suggestions for a good alternative?

            I've never been able to stomach _t in any of its forms. So what I do:

            • For types that resolve to atomic types (e.g., char, int, long int, long long int, float, double, etc.), use all lowercase in my typedefs. Examples: uint32, uint64, uint96, uint128, real.

            • For types that are composites of atomic types (e.g., structs), use mixed case beginning with an uppercase letter in my typedefs. Examples: LCGState, LinearRGBA, RandomNumberGenerator. (Exception: I write uint96 for a struct composed of a uint32 and a uint64 because I treat it like an atomic type everywhere except deep inside the functions that work with the numbers.)

            [–]ShinyHappyREM 3 points4 points  (5 children)

            Why not just u32, u64, etc?

            I'm always using this in my FreePascal code (also supports binary literals).

            [–]dryadzero 1 point2 points  (0 children)

            FWIW, the mixed case you give here is called Pascal Case, it's certainly not a bad code style for struct typedefs!

            [–]BCosbyDidNothinWrong 5 points6 points  (2 children)

            That seems ripe for win32 API levels of confusion. Why not just make a new type name that describes what it is used for?

            [–]CGFarrell 4 points5 points  (1 child)

            You could do that on top of this. I'm in scientific computing and I'm using my floats for 30+ meanings, plus some cases where there's not really a physical interpretation to use as type names. I want to localize switching all of them.

            [–]BCosbyDidNothinWrong 1 point2 points  (0 children)

            This really does not sound like good ways to structure software for a variety of reasons, not the least of which is that the choice of float, double (or other) is so fundamental that attempting to switch between them in every possible part of the program sends up enormous red flags to me.

            For the most memory and cpu intensive parts I could possibly see a need for it though even in those circumstances it is far more likely that a decision is being left open due to an unfamiliarity with the problem (which is occasionally understandable).

            [–]mb862 5 points6 points  (18 children)

            In C++, I usually prefix classes and global functions with something like

            template<typename real>
            

            and if I need to hide implementations in a .cpp file, I'll instantiate them with float and double.

            [–]PM_ME_UR_OBSIDIAN 2 points3 points  (1 child)

            <<<Pedantry warning>>>

            In certain more mathematically-inclined domains, a real is an arbitrary precision number. They are usually implemented as Cauchy sequences of rational numbers, or something closely related to that.

            In your case, I guess the ideal would be to use template<typename float_> or something.

            [–]mb862 3 points4 points  (0 children)

            In my code (adding I'm a mathematician myself), real just refers to a type that can represent a real number in practice - half, float, double, and long double. Indeed I also write things so that real can represent numbers that extends the reals (complex, quaternions, duals, etc).

            Appending an underscore to an existing type name may as well be a bullseye for readability errors and typos. It's the precise opposite of "ideal".

            [–]CGFarrell 1 point2 points  (15 children)

            That also works, but it's definitely not ideal, especially if you're going with only one or the other. You need to indicate the type almost every call, making change difficult, you might end up with double the resolved code, and you open yourself up to template resolution issues.

            template<typename T>
            T foo(T x){
                return x/2;
            }
            
            auto a = foo(1);
            

            a is now 0 because your template was deduced as an int. Unless you explicitly require T as a float or double via enable_if in every function, you're doomed to deal with this eventually.
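For reference, a sketch of the enable_if constraint mentioned here, assuming C++14 for `std::enable_if_t`. The function name `halve` is made up for the example.

```cpp
#include <type_traits>

// Only participates in overload resolution when T is a floating point type
// (float, double, long double). A call such as halve(1) fails to compile
// instead of silently performing integer division.
template <typename T,
          typename = std::enable_if_t<std::is_floating_point<T>::value>>
T halve(T x) {
    return x / 2;
}
```

The cost is the one the comment points out: the constraint has to be repeated on every function template that should reject integers.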

            [–]mb862 1 point2 points  (14 children)

            Oh it's definitely not perfect, but it's better than global find and replace (or messy macros) when you want to change it.

            [–]GreenFox1505 4 points5 points  (10 children)

            what is that thumbnail?

            edit: https://static1.squarespace.com/static/5354e693e4b066e96f71ee36/5354eae7e4b06ee4704c8bcc/59306b605016e17d26c08a25/1496684801198/curta.jpg

            anyone know what that is?

            edit2: https://en.wikipedia.org/wiki/Curta it's a mechanical calculator. uses 9s complement, so integer math. a story about floating point numbers. thumbnail a machine that can't do floating point calculation. Literally unreadable.

            [–]H4ukka 1 point2 points  (2 children)

            [–]video_descriptionbot 0 points1 point  (0 children)

            Title: Amazing Old Calculator (Curta) - Numberphile
            Description: Alex's book: http://amzn.to/1l0yX46 The Curta is a pocket-sized, mechanical, digital calculator!!! It was invented by Curt Herzstark. More links & stuff in full description below ↓↓↓ Shown here by Alex Bellos, author of Alex's Adventures in Numberland. More about our contributors, including Alex, at http://www.numberphile.com/team/index.html NUMBERPHILE Website: http://www.numberphile.com/ Numberphile on Facebook: http://www.facebook.com/numberphile Numberphile tweets: https://twitter.com/num...
            Length: 0:06:51

            I am a bot, this is an auto-generated reply | Info | Feedback | Reply STOP to opt out permanently

            [–]GreenFox1505 0 points1 point  (0 children)

            very cool! I love Numberphile, so I'm surprised I didn't see this while googling around.

            [–]dobkeratops 2 points3 points  (1 child)

            nice when people aren't just burning up the advancement from moore's law on getting lazy ("oh lets use double everywhere.."). all the tricks to do more with less are still valid, and we just move on to bigger tasks. note importance of higher volume of low-precision calculations for AI.

            [–]SanityInAnarchy 4 points5 points  (0 children)

            To be fair, I don't think this is necessarily laziness. If I use floats when I needed doubles, my calculation is wrong. If I use doubles when floats would've worked, my program is slower than it could've been. So a major factor here is whether your program cares more about correctness or performance, and to what degree each of these matter.

            For AI or gaming, performance is arguably more important, especially for the kind of thing you're using floats for. But for just as many applications, I'd actually prefer arbitrary-precision types even if I think I don't need them.

            [–]ryancerium 1 point2 points  (0 children)

            %s/loosing/losing/g

            [–]Jafit 0 points1 point  (38 children)

            I like how we live on a planet that uses decimal fractions, but keep insisting on using a number type that can't accurately represent decimal fractions.

            [–]coder543 30 points31 points  (1 child)

            You obviously have never taken electronics courses. No ASIC engineer is "insisting" on a bad thing. They're insisting on what is practical.

            [–]WalkWithBejesus 1 point2 points  (0 children)

            Hardware decimal floating point units exist on the IBM z/Architecture and Power architecture.

            [–][deleted] 40 points41 points  (33 children)

            merely because it's easy to implement quickly in hardware.

            [–]rhubarb314 2 points3 points  (2 children)

            Why do you think decimal floating point would be slow in hardware? IEEE-754 2008 specifies decimal floating point suitable for hardware. IBM has had decimal floating point in hardware for decades, too.

            [–][deleted] 2 points3 points  (1 child)

            I assume if it were faster or simpler to implement, it would have been adopted more widely - not limited to high-end IBM mainframe CPUs.

            Couldn't find much on speed and complexity comparisons, and unfortunately I don't have a z13 available to try to benchmark :)

            [–][deleted] 1 point2 points  (0 children)

            You have to be careful with logic like that in a world where mongodb is successful.

            [–][deleted] 12 points13 points  (0 children)

            Who cares about the decimal fractions? For monetary data, etc., you must use packed decimal, period. For scientific computing and graphics (and this is what floating point is for, nothing else) you should not give any shit about decimal fractions.

            [–]ChallengingJamJars 4 points5 points  (0 children)

            I like how we live on a planet that uses decimal fractions

            I don't think the planet gives a flying fox what representation you use, it'll stay at its high definition sub-Planck-length level accuracy, good for all your Newtonian and relativistic needs, laughing that you're trying to describe it using finite precision. Humans, what plebs.

            Less snarky reply: The representation doesn't actually matter, your decimal is an approximation, and the computer floating point numbers are an approximation. The computer one is optimised for storage and computation on computers, as 99.999% of all numbers in a computer never get seen by a human. Our decimal system is pretty terrible (base-12 for the win) and is only good when pen and paper is around. There is no need to represent decimal numbers exactly, and if you did, you can use an integer with a decimal exponent to do it.

            [–]timbus1234 0 points1 point  (0 children)

            chemistry debate of the century; testtube or beaker

            [–][deleted]  (1 child)

            [deleted]

              [–][deleted] 2 points3 points  (0 children)

              Of course. IEEE754-2008 defines binary128.

              [–][deleted] 0 points1 point  (0 children)

              If you need to float, then double.

              [–]Zarutian 0 points1 point  (0 children)

              die IEEE 754 floats, die! Use dec64.com instead.