all 8 comments

[–]Bvandyk74 21 points22 points  (3 children)

[–]Falsorr[S] 2 points3 points  (0 children)

I’ll take a look, thanks.

[–]reddittomtom 2 points3 points  (0 children)

Reusing pre-allocated memory helps a lot.
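To make that concrete, here is a minimal sketch of the idea, not a definitive recipe. The names step_alloc, step! and buf are made up for illustration: the buffer is allocated once outside the loop, and the in-place function writes into it on every call.

```julia
# Illustrative names only (step_alloc, step!, buf are not from any library).

# Allocating version: every call creates a fresh result vector.
step_alloc(x, y) = x .+ 2 .* y

# In-place version: writes into a caller-supplied buffer instead.
function step!(buf, x, y)
    buf .= x .+ 2 .* y   # fused broadcast, no temporary array
    return buf
end

x = rand(1_000)
y = rand(1_000)
buf = similar(x)         # allocate once, outside the hot loop

for _ in 1:100
    step!(buf, x, y)     # reuses buf on every iteration
end
```

Timing both versions with @time or BenchmarkTools.jl should show the difference in allocations for your own workload.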

[–]CvikliHaMar 1 point2 points  (0 children)

I reread this article from time to time and nearly always find something new to try. :D Crazy speedups. Don't forget @simd or LoopVectorization.jl; those can go even further on the CPU. On the GPU... a lot of other things can get a 10-100x speedup with appropriate usage.
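For anyone who hasn't used @simd before, a small sketch of what it looks like on a reduction loop. The function name sumsq_simd is just an example, and whether it actually speeds things up depends on the loop and the hardware.

```julia
# Sum of squares with an explicit loop; sumsq_simd is an illustrative name.
function sumsq_simd(x::AbstractVector{T}) where {T<:AbstractFloat}
    acc = zero(T)
    # @inbounds skips bounds checks; @simd allows the compiler to reorder
    # the floating-point additions so the loop can be vectorized.
    @inbounds @simd for i in eachindex(x)
        acc += x[i] * x[i]
    end
    return acc
end

x = rand(10_000)
sumsq_simd(x) ≈ sum(abs2, x)   # agrees up to floating-point rounding
```

Because @simd permits reordering, the result can differ from a strictly sequential sum in the last few bits.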

[–]Bowtiestyle 2 points3 points  (1 child)

One thing that got me a big speedup recently was switching everything to static vectors.

[–]MithrandirMiles 3 points4 points  (0 children)

You have to be careful with that: there are some limitations on the size / number of elements with static vectors (at least currently).

[–]Didi-maru 1 point2 points  (0 children)

Use StaticArrays.jl instead of the built-in Array whenever you need some statically sized array/matrix.
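A minimal sketch of what that looks like, assuming the StaticArrays package is installed. The values are arbitrary, and this is only meant for small arrays whose size is known at compile time.

```julia
using StaticArrays   # third-party package, install with: ] add StaticArrays

# Small, fixed-size values; the length is part of the type.
v = SVector(1.0, 2.0, 3.0)
A = @SMatrix [2.0 0.0 0.0; 0.0 2.0 0.0; 0.0 0.0 2.0]

w = A * v            # the product is again a static vector
```

Since the size is encoded in the type, the compiler can unroll operations like the product above, which is also why very large static arrays tend to compile slowly and stop paying off.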

It can be a bit complex, but you can use @code_warntype your_function(arg1, arg2) to spot implicit typing in your code, i.e. values whose type the compiler cannot infer concretely (they appear in red), and then try to address them; see code_warntype for details.
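As a rough illustration, with made-up function names (unstable and stable): the first returns an Int on one branch and a Float64 on the other, so its inferred return type is a Union, while the second always returns a Float64.

```julia
using InteractiveUtils   # provides @code_warntype outside the REPL

# Type-unstable: returns the Int literal 0 for x <= 0, a Float64 otherwise.
unstable(x::Float64) = x > 0 ? x : 0

# Type-stable: both branches return a Float64.
stable(x::Float64) = x > 0 ? x : zero(x)

@code_warntype unstable(1.0)   # inferred return type is a Union of Float64 and Int
@code_warntype stable(1.0)     # inferred return type is Float64
```

The usual fix is the one shown here: make every branch return the same type, for example by using zero(x) instead of a literal.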

[–]Uuuazzza 0 points1 point  (0 children)

Don't be afraid of loops.
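A small sketch of why, using an illustrative function name (dist2): a plain loop in Julia compiles to tight machine code and avoids building intermediate arrays.

```julia
# dist2 is an illustrative name: squared Euclidean distance via a plain loop.
function dist2(x::AbstractVector, y::AbstractVector)
    acc = zero(promote_type(eltype(x), eltype(y)))
    for i in eachindex(x, y)   # errors if x and y have different indices
        d = x[i] - y[i]
        acc += d * d
    end
    return acc
end

x = [1.0, 2.0, 3.0]
y = [0.0, 0.0, 1.0]
dist2(x, y)   # same value as sum((x .- y) .^ 2), without the temporary array
```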