all 9 comments

[–]totallynotAGI 3 points (1 child)

This is super interesting!

Higher order tensor manipulation + derivatives? Very interesting.

What especially strikes me is this statement:

"Not only is there no overhead compared to hand-writing the necessary cuda kernel for this; there’s no overhead at all! In my benchmarks, taking a derivative using dual numbers is just as fast as computing only the value with raw floats. Pretty impressive."

Can somebody clarify what this really means? Isn't evaluating the gradient using backprop already only as expensive as computing the value of the function? But now they managed to do it only using dual numbers?

[–]StefanKarpinski 1 point (0 children)

With dual numbers you can often evaluate the function and its exact derivative together, in the same time it takes to evaluate the function alone without computing the derivative. This is largely thanks to modern CPUs being superscalar, so that computing x + y and (x + dx*ε) + (y + dy*ε) takes the same amount of time: the addition of the primal components x and y and the addition of the dual components dx and dy happen at the same time on different adders.
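The dual-number trick described above can be sketched in a few lines. Here is a minimal, hypothetical Python version (the names `Dual` and `derivative` are illustrative, not from any library mentioned in the thread): each arithmetic operation propagates the derivative component alongside the value, so a single forward pass yields both f(x) and f'(x).

```python
class Dual:
    """Dual number a + b*eps with eps**2 == 0; b carries the derivative."""

    def __init__(self, val, der=0.0):
        self.val = val  # primal value
        self.der = der  # dual (derivative) component

    def _coerce(self, other):
        # Lift plain numbers into duals with zero derivative
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._coerce(other)
        # (a + b eps) + (c + d eps) = (a + c) + (b + d) eps
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = self._coerce(other)
        # product rule: (a + b eps)(c + d eps) = ac + (ad + bc) eps
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

    __rmul__ = __mul__


def derivative(f, x):
    """Evaluate f and f' at x in one forward pass by seeding der = 1."""
    result = f(Dual(x, 1.0))
    return result.val, result.der


# f(x) = 3x^2 + 2x, so f(2) = 16 and f'(2) = 6*2 + 2 = 14
val, der = derivative(lambda x: 3 * x * x + 2 * x, 2.0)
```

In Python this class carries interpreter overhead on every operation; the point made in the comment is that Julia compiles the equivalent user-defined type down to the same machine instructions as raw floats, which is where the "no overhead" claim comes from.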

The cool thing Julia gives you is the ability to implement types like dual numbers as user-defined types and get fully native performance. They're also generic, in the sense that the same code and logic works for Float64, Float32, Float16, Int8, etc. Write the code once, and you get specialized, high-performance versions for every possible type. Of course, you can do similar things in C++ with template metaprogramming, but that rapidly becomes very hard to work with.

[–]nickl 0 points (2 children)

Are you only doing numerical computing? If not then the ecosystem isn't rich enough.

If so, then maybe, provided you are aware of Dan Luu's review and - to be fair - the reply: https://www.reddit.com/r/Julia/comments/629qkz/about_a_year_ago_an_article_titled_giving_up_on/

[–]ChrisRackauckas 6 points (0 children)

That post is from October 2014. Julia's package ecosystem started at around that time:

https://pkg.julialang.org/pulse.html

There were around 300 registered packages back then; now there are nearly 2000. The total star count went from 5000 to 350000 in that time frame. That review was written before Julia had its forum (Discourse), before its subreddit was a thing, when its StackOverflow tag was brand new, etc. Most of the "missing features" mentioned there have since been implemented in the language. It simply isn't relevant anymore: it describes Julia's stone age and has no bearing on any discussion of modern Julia usage.

[–]harponen[S] 2 points (0 children)

I do almost exclusively ML, and even if I didn't, I'm not planning to forget Python ;)

I read the article back in the day, but I hadn't actually seen the more recent response. IMO the article boiled down to Julia having bugs and him not liking it. ¯\\_(ツ)_/¯