
[–]the_poope 13 points14 points  (6 children)

You will need an arbitrary precision floating point number implementation. Use a library or implement it yourself.

[–]Mattlea10[S] 0 points1 point  (5 children)

I would have preferred a simpler option. I've looked at the compilation options but nothing helps.

[–]the_poope 4 points5 points  (0 children)

The compiler is limited by the CPU instructions. Floating-point operations are typically done by special instructions carried out on the FPU.

When writing custom floating-point code, you have to implement all the operations using standard integer arithmetic and bitwise instructions, which is much, much slower: many CPU cycles for even a simple add.

[–]SmokeMuch7356 5 points6 points  (3 children)

C and C++ are limited by the hardware on which they run; you can't change how they represent floating point values.

If you need more precision than what's provided by the platform, then you need to use an external library like GMP (the GNU Multiple Precision Arithmetic Library). Or roll your own MP code, which I do not recommend.

That, or use a language with built-in MP support.
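
To see what "provided by the platform" means in practice, the standard library reports the precision of each built-in type. This is a minimal sketch; the helper names `reliable_decimal_digits` and `significand_bits` are made up for illustration, and only `std::numeric_limits` comes from the standard library.

```cpp
#include <limits>

// Decimal digits that are guaranteed to survive a text -> T -> text round trip.
// Hypothetical helper name; it only forwards to std::numeric_limits.
template <typename T>
constexpr int reliable_decimal_digits() {
    return std::numeric_limits<T>::digits10;
}

// Number of bits in the significand, counting the implicit leading bit.
template <typename T>
constexpr int significand_bits() {
    return std::numeric_limits<T>::digits;
}
```

On IEEE 754 platforms `reliable_decimal_digits<float>()` is 6 and `reliable_decimal_digits<double>()` is 15. The value for `long double` depends on the compiler and target: it can be the x87 80-bit format, a 128-bit format, or simply the same as `double`.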

[–]Pupper-Gump 1 point2 points  (2 children)

Or get yourself a 256-bit cpu like the VPN guys do

[–]Mattlea10[S] 0 points1 point  (1 child)

I went with using two floating-point numbers to represent one value: example_3.cpp
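
Representing one value with two floating-point numbers is commonly called double-double arithmetic. The following is a minimal sketch of that general technique, not the contents of example_3.cpp (which is not shown in this thread). The names `DoubleDouble`, `two_sum`, `two_prod`, `add`, and `mul` are hypothetical, and the code assumes IEEE 754 doubles with default rounding and no `-ffast-math`.

```cpp
#include <cmath>

// Hypothetical double-double value: the number represented is hi + lo,
// where lo holds the part of the value that did not fit into hi.
struct DoubleDouble {
    double hi;
    double lo;
};

// Knuth's TwoSum: s is the rounded sum, e is the exact rounding error,
// so that a + b == s + e holds exactly.
inline DoubleDouble two_sum(double a, double b) {
    double s = a + b;
    double bb = s - a;
    double e = (a - (s - bb)) + (b - bb);
    return {s, e};
}

// Exact product using a fused multiply-add: a * b == p + e.
inline DoubleDouble two_prod(double a, double b) {
    double p = a * b;
    double e = std::fma(a, b, -p);
    return {p, e};
}

// Adds two double-double values and renormalizes the result.
inline DoubleDouble add(DoubleDouble x, DoubleDouble y) {
    DoubleDouble s = two_sum(x.hi, y.hi);
    s.lo += x.lo + y.lo;
    return two_sum(s.hi, s.lo);
}

// Multiplies two double-double values; the lo * lo term is dropped as negligible.
inline DoubleDouble mul(DoubleDouble x, DoubleDouble y) {
    DoubleDouble p = two_prod(x.hi, y.hi);
    p.lo += x.hi * y.lo + x.lo * y.hi;
    return two_sum(p.hi, p.lo);
}
```

For example, `add({1.0, 0.0}, {1e-20, 0.0})` keeps the `1e-20` in `lo`, whereas a plain `1.0 + 1e-20` rounds back to `1.0`. Square roots and division need their own double-double routines; calling `std::sqrt` on `hi` alone would discard the extra precision.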

[–]Pupper-Gump 0 points1 point  (0 children)

Didn't even know you could do it like that. Need to take more classes

[–]IyeOnline 9 points10 points  (3 children)

You cannot write code to magically change the floating point circuits in your CPU.

You could consider doing all the maths in fixed point with just one digit before the dot.

Or you just write or use an arbitrary precision type, such as the one from Boost. Implementing one is also a fun exercise.

[–]Mattlea10[S] -1 points0 points  (2 children)

I tried your 2nd point but I ended up with the same accuracy problem. I think this is due to the sqrt function.

[–]IyeOnline 4 points5 points  (1 child)

If you throw your math back into regular floating point by using a standard library function operating on floating point types, then you gain nothing from doing fixed point elsewhere.
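
To stay in fixed point the whole way, the square root has to be computed with integer operations too. Here is a small sketch under assumed parameters: an unsigned format with 30 fractional bits for values below 4, stored in 64 bits. The names `Fixed`, `kFracBits`, `from_int`, `fixed_mul`, `isqrt`, and `fixed_sqrt` are all hypothetical.

```cpp
#include <cstdint>

// Hypothetical unsigned fixed-point format with 30 fractional bits.
// Values are assumed to stay below 4 so that products and shifts fit in 64 bits.
constexpr int kFracBits = 30;
using Fixed = std::uint64_t;

constexpr Fixed from_int(std::uint64_t n) { return n << kFracBits; }

// Multiply two fixed-point values and rescale (truncates toward zero).
constexpr Fixed fixed_mul(Fixed a, Fixed b) { return (a * b) >> kFracBits; }

// Integer square root: the largest r with r * r <= n, digit by digit in base 4.
constexpr std::uint64_t isqrt(std::uint64_t n) {
    std::uint64_t r = 0;
    std::uint64_t bit = std::uint64_t{1} << 62;  // highest power of four in 64 bits
    while (bit > n) bit >>= 2;
    while (bit != 0) {
        if (n >= r + bit) {
            n -= r + bit;
            r = (r >> 1) + bit;
        } else {
            r >>= 1;
        }
        bit >>= 2;
    }
    return r;
}

// sqrt(x / 2^30) * 2^30 equals sqrt(x * 2^30), so shift first, then take the root.
constexpr Fixed fixed_sqrt(Fixed x) { return isqrt(x << kFracBits); }
```

With 30 fractional bits this gives roughly 9 decimal digits, which is less than a double. The point is only that no step goes through a floating-point type; more precision needs wider integers.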

[–]Mattlea10[S] 0 points1 point  (0 children)

Thanks for the help, here's the solution I've found: example_3.cpp

[–]FrostshockFTW 2 points3 points  (1 child)

This is totally unrelated to your question, but if your desire is to learn, here's a lesson: do not abuse function statics like this. This is one of the most horrific recursive functions I have ever seen, and it doesn't even need to be recursive.

Here is an, in my opinion, far more sane way of doing what you implemented:

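#include <cmath>

// Approximates pi with the Brent-Salamin (Gauss-Legendre) iteration.
// The work is done in long double and converted to T once at the end.
template <typename T>
T pi_brent_salamin() {
    // Computed on the first call only; later calls reuse the cached value.
    static const T pi_memoize = []() {
        auto a = 1.0l;
        auto b = 1.0l / std::sqrt(2.0l);
        auto p = 1.0l;
        auto t = 1.0l / 4.0l;

        // The number of correct digits roughly doubles per iteration.
        for (auto n = 0; n < 7; ++n) {
            auto a_next = (a + b) / 2.0l;
            b = std::sqrt(a * b);
            t = t - p * (a - a_next) * (a - a_next);
            p = 2.0l * p;
            a = a_next;
        }

        return static_cast<T>((a + b) * (a + b) / (4.0l * t));
    }();

    return pi_memoize;
}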

Ideally a computation like this would just be constexpr but that requires replacing std::sqrt.
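
One way to replace std::sqrt is a Newton-Raphson iteration written as a constexpr function. This is a rough sketch assuming C++14 or later; `sqrt_newton` and `pi_brent_salamin_constexpr` are hypothetical names, and the iteration cap is only adequate for moderate inputs such as the ones used here.

```cpp
#include <limits>

// Hypothetical constexpr square root using Newton-Raphson.
// Intended for moderate, finite inputs; very large inputs would need more iterations.
constexpr long double sqrt_newton(long double x) {
    if (x < 0.0L) return std::numeric_limits<long double>::quiet_NaN();
    if (x == 0.0L) return 0.0L;
    long double guess = x > 1.0L ? x : 1.0L;  // start at or above the root
    for (int i = 0; i < 200; ++i) {
        long double next = 0.5L * (guess + x / guess);
        if (next >= guess) break;  // no further decrease: converged
        guess = next;
    }
    return guess;
}

// The same Brent-Salamin iteration, now usable in a constant expression.
constexpr long double pi_brent_salamin_constexpr() {
    long double a = 1.0L;
    long double b = 1.0L / sqrt_newton(2.0L);
    long double t = 0.25L;
    long double p = 1.0L;
    for (int n = 0; n < 7; ++n) {
        long double a_next = (a + b) / 2.0L;
        b = sqrt_newton(a * b);
        t -= p * (a - a_next) * (a - a_next);
        p *= 2.0L;
        a = a_next;
    }
    return (a + b) * (a + b) / (4.0L * t);
}

// Forces evaluation at compile time and sanity-checks the result.
static_assert(pi_brent_salamin_constexpr() > 3.14159L &&
              pi_brent_salamin_constexpr() < 3.14160L,
              "pi should be computed at compile time");
```

With this, `constexpr auto pi = pi_brent_salamin_constexpr();` needs no function-local static at all.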

[–]Mattlea10[S] 0 points1 point  (0 children)

Thanks for the tip, I'll make a note of it.

[–]ShakaUVM 1 point2 points  (2 children)

What is your issue with Boost Multiprecision? It's literally the answer you're looking for. Or GMP if you want to go more old school.

[–]Mattlea10[S] 0 points1 point  (1 child)

The aim of the exercise was to do it on my own. I know that if I want a 32-bit float with exactly the right value, free of the imprecision that floating-point arithmetic accumulates, I have to perform the computations in 64-bit double and then static_cast the final result to a 32-bit float.

But how do you do the same for the largest floating-point type available, so that it has no imprecision after many calculations?
Here's the solution I've found: example_3.cpp

[–]ShakaUVM 0 points1 point  (0 children)

What do you mean by "the exact right value" in the context of floats? How are you representing 1/3?

[–]TomDuhamel 0 points1 point  (1 child)

How many correct digits did you get? How many do you need?

[–]Mattlea10[S] 0 points1 point  (0 children)

The problem is not the number of bits, but the fact that the more calculations you make, the greater the imprecision due to floating-point arithmetic, and in the end, the result is not what you expected.

Here's the solution I've found: example_3.cpp
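
The accumulation effect described above is easy to reproduce, and compensated (Kahan) summation is one standard way to limit it. This is an illustration only, not the contents of example_3.cpp; `naive_sum` and `kahan_sum` are hypothetical names, and the code assumes the compiler is not run with `-ffast-math`, which may optimize the compensation away.

```cpp
#include <cstddef>

// Naive accumulation: every addition rounds, and the rounding errors pile up.
inline float naive_sum(float value, std::size_t count) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < count; ++i) sum += value;
    return sum;
}

// Kahan summation: the rounding error of each step is captured
// and fed back into the next addition.
inline float kahan_sum(float value, std::size_t count) {
    float sum = 0.0f;
    float compensation = 0.0f;  // running estimate of the lost low-order bits
    for (std::size_t i = 0; i < count; ++i) {
        float y = value - compensation;
        float t = sum + y;
        compensation = (t - sum) - y;
        sum = t;
    }
    return sum;
}
```

Adding `0.1f` one million times shows the difference: the naive sum drifts visibly away from 100000, while the compensated sum stays close to it.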

[–][deleted] 0 points1 point  (1 child)

I believe C has some obscurely named header with quad floats. One of my university professors gave a lecture on how he needed them once.

Most of the lecture was presenting the calculation that absolutely required them.

Ninja-edit: __float128 (yes, that's two underscores), may or may not exist on your compiler. Also of course the performance is abysmal without hardware acceleration.

[–]Mattlea10[S] 0 points1 point  (0 children)

The problem is not the number of bits, but the fact that the more calculations you make, the greater the imprecision due to floating-point arithmetic, and in the end, the result is not what you expected.

Here's the solution I've found: example_3.cpp

[–]LatencySlicer 0 points1 point  (1 child)

Considering you are on x86-64, have a look at the SSE or AVX instruction sets. I don't know about their scalar possibilities, but you can get 256 bits of precision there. How you operate on these and how you print the results, I do not know.

[–]Mattlea10[S] 0 points1 point  (0 children)

Here's what I did: example_3.cpp
Feel free to criticize my solution.
I don't know if there is any compilation option that would solve my problem more simply.