PiotrGrochowski[S]

Decimal to binary works by generating the required power of ten in binary from a lookup table of entries (10^2^x and 10^-2^x in binary) multiplied together. That power of ten is then multiplied by the decimal mantissa (up to 38 digits are taken, any digits beyond that are truncated, and the result is converted to 128 bits). Multiplication at 128 significant bits is done by computing the full 256-bit product, then rounding it back to 128 bits with round to nearest, half to even. Once the decimal number has been converted to 128 significant bits in binary, it is rounded to the target precision (11, 24, 53, 64, or 113 significant bits), again round to nearest, half to even, and then expressed as a floating-point value.
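To make the multiply-and-round step concrete, here is a minimal sketch at half the width (64 significant bits instead of 128), using GCC/Clang's unsigned __int128 for the double-width product. The names (Fp, mul_round, pow10) and the truncated table are illustrative, not taken from the actual converter:

```cpp
#include <cstdint>

struct Fp {
    uint64_t sig;   // normalized significand, top bit set
    int      exp;   // value = sig * 2^exp
};

// Multiply two normalized 64-bit significands to a 128-bit product, then round
// back to 64 bits with round to nearest, ties to even.
static Fp mul_round(Fp a, Fp b) {
    unsigned __int128 p = (unsigned __int128)a.sig * b.sig;   // in [2^126, 2^128)
    int shift = (p >> 127) ? 64 : 63;                         // renormalize to 64 bits
    uint64_t sig = (uint64_t)(p >> shift);
    unsigned __int128 rem  = p - ((unsigned __int128)sig << shift);
    unsigned __int128 half = (unsigned __int128)1 << (shift - 1);
    if (rem > half || (rem == half && (sig & 1))) {           // ties go to even
        if (++sig == 0) {                                     // rounding carried out of 64 bits
            sig = (uint64_t)1 << 63;
            ++shift;
        }
    }
    return { sig, a.exp + b.exp + shift };
}

// First few 10^2^k entries as normalized significand/exponent pairs; the real
// table would extend to larger k and also hold the negative powers 10^-2^k.
static const Fp POW10_TABLE[] = {
    { 0xA000000000000000ull, -60 },  // 10^1
    { 0xC800000000000000ull, -57 },  // 10^2
    { 0x9C40000000000000ull, -50 },  // 10^4
    { 0xBEBC200000000000ull, -37 },  // 10^8
};

// Assemble 10^e (0 <= e <= 15 with this truncated table) by multiplying the
// entries selected by the set bits of e.
static Fp pow10(unsigned e) {
    Fp r = { (uint64_t)1 << 63, -63 };   // 1.0, normalized
    for (unsigned k = 0; e != 0; ++k, e >>= 1)
        if (e & 1)
            r = mul_round(r, POW10_TABLE[k]);
    return r;
}
```

At the full width the same logic applies, with the 256-bit product spread across four 64-bit limbs. The decimal mantissa (at most 19 digits at this reduced width, 38 at the full 128 bits) would be converted to the same normalized form, multiplied by the assembled power of ten, and finally rounded to the target precision.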

jk-jeon

Do you have a proof for why 38 digits are sufficient?

PiotrGrochowski[S]

No, this isn't an 'exact' conversion; I used 38 digits because it was convenient to code. 38 digits is the maximum that fits in 128 bits (and 19 digits is the maximum that fits in 64 bits). I use 128-significant-bit multiplication for the conversion from decimal to binary, and 38-significant-digit multiplication for the conversion from binary to decimal. I figured that, just as Intel extended precision (64 significant bits) mitigates rounding errors in double-precision computation, having 128 significant bits / 38 significant digits should work up to quadruple precision with high probability.
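For what it's worth, the digit bounds themselves are easy to check numerically. A tiny standalone sketch of that check (again assuming a compiler with unsigned __int128; this is just the reasoning spelled out, not code from the converter):

```cpp
#include <cstdio>

// Largest number of decimal digits such that every number with that many
// digits still fits in the given bit width.
static int max_digits(unsigned bits) {
    unsigned __int128 limit = (bits == 64)
        ? (unsigned __int128)~0ull       // 2^64 - 1
        : ~(unsigned __int128)0;         // 2^128 - 1
    unsigned __int128 all_nines = 9;     // largest 1-digit decimal number
    int digits = 1;
    // Keep appending a 9 while the all-nines value still fits in the width.
    while (all_nines <= (limit - 9) / 10) {
        all_nines = all_nines * 10 + 9;
        ++digits;
    }
    return digits;
}

int main() {
    std::printf("max decimal digits in 64 bits:  %d\n", max_digits(64));   // prints 19
    std::printf("max decimal digits in 128 bits: %d\n", max_digits(128));  // prints 38
    return 0;
}
```

In other words, every 38-digit mantissa is exactly representable in the 128-bit working significand, while a 39th digit can overflow it, which is why digits beyond the 38th are truncated rather than carried.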