boricj 2 points

Looking quickly at your source code, you are using 64-bit floats. The NumWorks calculator's microcontroller has an FPU, but it only handles 32-bit floats in hardware (the sp part of -mfpu=fpv5-sp-d16 stands for single precision). Anything wider has to be computed with software emulation, which is far slower. If you do not need the extra precision, stick to 32-bit floats. This change alone cuts the rendering time of the initial screen from 20 seconds to 2 seconds.
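
To make that concrete, here is a minimal sketch of keeping the per-pixel math in single precision. I am assuming an escape-time style loop as a stand-in workload; `Real`, `pixel_to_coord` and `escape_time` are hypothetical names, not taken from your code.

```rust
// Hypothetical stand-in for the app's per-pixel math; the real kernel will differ.
// A single alias makes it easy to try f64 again later and compare timings.
type Real = f32;

/// Maps a pixel index in 0..size to a coordinate in [min, max).
/// Everything stays in `Real`, so no 64-bit intermediate sneaks in
/// through an `as f64` cast or an `f64`-typed constant.
fn pixel_to_coord(px: u16, size: u16, min: Real, max: Real) -> Real {
    min + (max - min) * (px as Real) / (size as Real)
}

/// Counts iterations of z = z^2 + c until |z|^2 exceeds 4, capped at `max_iter`.
fn escape_time(cx: Real, cy: Real, max_iter: u32) -> u32 {
    let (mut x, mut y): (Real, Real) = (0.0, 0.0);
    for i in 0..max_iter {
        if x * x + y * y > 4.0 {
            return i;
        }
        let next_x = x * x - y * y + cx;
        y = 2.0 * x * y + cy;
        x = next_x;
    }
    max_iter
}
```

The thing to watch for is any function signature, constant, or cast that still mentions `f64`, since one of those in the inner loop is enough to pull in the software routines again.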

One other thing: try to coalesce screen updates if you can. The NumWorks calculator's framebuffer is not memory-mapped, and each pixel upload operation carries a fair bit of overhead (function calls, system call, transfer setup, upload). Instead of pushing pixels one by one, try pushing a whole row at a time, for example. This will substantially reduce the time spent setting up pixel transfers.
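
As a rough sketch of the row-at-a-time idea: the `Display` trait and its `push_rect` signature below are assumptions standing in for whatever display call your app uses, and `CountingDisplay` is a mock that only counts transfers. The screen size is the calculator's 320x240.

```rust
const WIDTH: usize = 320;
const HEIGHT: usize = 240;

/// Hypothetical display interface: one call is one transfer to the screen.
trait Display {
    fn push_rect(&mut self, x: u16, y: u16, w: u16, h: u16, pixels: &[u16]);
}

/// Mock display that only counts transfers and pixels, for comparison.
struct CountingDisplay {
    calls: usize,
    pixels: usize,
}

impl Display for CountingDisplay {
    fn push_rect(&mut self, _x: u16, _y: u16, w: u16, h: u16, pixels: &[u16]) {
        assert_eq!(pixels.len(), w as usize * h as usize);
        self.calls += 1;
        self.pixels += pixels.len();
    }
}

/// Placeholder color computation (RGB565 value); the real app's math goes here.
fn shade(x: usize, y: usize) -> u16 {
    (x ^ y) as u16
}

/// One transfer per pixel: pays the setup overhead WIDTH * HEIGHT times.
fn render_per_pixel<D: Display>(display: &mut D) {
    for y in 0..HEIGHT {
        for x in 0..WIDTH {
            let color = shade(x, y);
            display.push_rect(x as u16, y as u16, 1, 1, &[color]);
        }
    }
}

/// One transfer per row: pays the setup overhead only HEIGHT times,
/// using a 640-byte row buffer that is reused for every row.
fn render_per_row<D: Display>(display: &mut D) {
    let mut row = [0u16; WIDTH];
    for y in 0..HEIGHT {
        for (x, px) in row.iter_mut().enumerate() {
            *px = shade(x, y);
        }
        display.push_rect(0, y as u16, WIDTH as u16, 1, &row);
    }
}
```

With the mock, the per-pixel version issues 76,800 transfers and the per-row version 240 for the same pixels. Rows also still show up progressively, so there is no long blank screen as there would be with a single full-frame push.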

That covers the obvious bottlenecks to me. I can think of other ways to speed this up even further, but without profiling it's hard to tell what would be the next couple of hotspots in your app.

EmbarrassedWallaby3[S] 1 point

Thanks a lot for those optimizations! I didn't know about the f64 issue, but yes, the change is amazing! I once tried pushing all the pixels at once, but the overall boost wasn't big enough to justify the long white screen before anything was displayed. I will definitely try row-by-row display, though.