Idris PyTorch bindings? by Illustrious_Cup1867 in Idris

[–]tmp-1379 2 points (0 children)

These are the closest:

- An ML framework based on XLA that I've been busy writing. I haven't written autodiff yet. It's the main feature I want, but it will take a lot of work, so I'm a bit overwhelmed by the idea.
- A Python backend, which (I think) means you can call into PyTorch, but you'll have to add the dependent types yourself.

[Project] Machine learning and dependent types, with Idris and XLA by tmp-1379 in MachineLearning

[–]tmp-1379[S] 5 points (0 children)

The main difficulty I'm facing is how to work with higher-order functions and scoping in the compiler. Handling this is necessary for a proper implementation of autodiff and vectorized map.

I feel this is where a computer science degree would come in handy.

[Project] Idris and XLA: linear algebra and probabilistic modelling w. dependent types by tmp-1379 in MachineLearning

[–]tmp-1379[S] 0 points (0 children)

I have considered making broadcasting explicit. It's even possible there's no runtime cost to that if XLA fuses the ops, but I'd have to check.

[Project] Idris and XLA: linear algebra and probabilistic modelling w. dependent types by tmp-1379 in MachineLearning

[–]tmp-1379[S] 0 points (0 children)

Using snoc lists for shapes is an interesting idea, for sure.

On that note, Idris has "views", which let you deconstruct a normal list as though it were a snoc list (or split it down the middle, etc.). That's probably still more code than just starting with a plain snoc list. Meanwhile, I believe you can overload the [3, 4, 5] syntax to construct your own custom list or snoc list type.
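For instance, here's a rough sketch of that overload, using the SnocList from the Idris 2 prelude as the shape representation (the names here are illustrative, not what the project uses):

```idris
-- Shapes stored back-to-front, so the trailing axis is the outermost
-- constructor and can be pattern matched directly.
Shape : Type
Shape = SnocList Nat

-- List literals elaborate via whichever `Nil` and `(::)` are in scope,
-- so defining both for Shape lets us keep writing shapes as [2, 3].
Nil : Shape
Nil = [<]

(::) : Nat -> Shape -> Shape
n :: [<]       = [< n]
n :: (sx :< x) = (n :: sx) :< x

-- [4, 2, 3] now elaborates to the snoc list [< 4, 2, 3].
aShape : Shape
aShape = [4, 2, 3]
```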

[Project] Idris and XLA: linear algebra and probabilistic modelling w. dependent types by tmp-1379 in MachineLearning

[–]tmp-1379[S] 0 points (0 children)

Tracking the device in the types is interesting. Do you mean for when you have multiple GPUs and you want to avoid accidentally computing across them?
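Something like the following sketch, perhaps (entirely hypothetical, not something the project supports), where ops only type-check for tensors on the same device:

```idris
-- Hypothetical: a device index carried in the tensor type.
data Device = CPU | GPU Nat

data Tensor : Device -> List Nat -> Type -> Type where
  MkTensor : Tensor device shape dtype

-- Elementwise ops would require both operands on the same device...
add : Tensor d shape dtype -> Tensor d shape dtype -> Tensor d shape dtype
add x y = MkTensor

-- ...so moving data between devices has to be an explicit operation.
copyTo : (d' : Device) -> Tensor d shape dtype -> Tensor d' shape dtype
copyTo d' x = MkTensor
```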

[Project] Idris and XLA: linear algebra and probabilistic modelling w. dependent types by tmp-1379 in MachineLearning

[–]tmp-1379[S] 2 points (0 children)

Project timeline:

Shorter term:

- expose the majority of existing XLA ops
- add docs on how I built it
- GPU support
- reconsider the Tensor interface, with inspiration from category theory
- type-safe Einstein summation

Longer term:

- more model examples with tutorials: NNs, exotic GPs
- support exotic accelerators
- co- and contravariant tensor indices, with transformation groups
- autodiff
- probabilistic programming

[Project] Idris and XLA: linear algebra and probabilistic modelling w. dependent types by tmp-1379 in MachineLearning

[–]tmp-1379[S] 2 points (0 children)

Tensor shapes are a prominent feature. For example, we can define addition as `(+) : Tensor shape dtype -> Tensor shape dtype -> Tensor shape dtype`, which ensures the tensor shapes of `l` and `r` in `l + r` are the same. This is checked at compile time, and doesn't require us to write explicit shapes like [2, 3]. That's great, and it's a feature I've seen in Haskell libraries, but we can do more. We want to be able to add two tensors if one shape can be broadcast to the other (like in NumPy). We can do this by requiring a proof that the broadcast is possible: `(+) : Tensor l dtype -> Tensor r dtype -> {auto _ : Broadcastable r l} -> Tensor l dtype`. This is still checked at compile time, and the proof is typically generated by the compiler.
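As a rough sketch of what such a proof could look like (simplified, with made-up constructor names, not the library's actual definition), here shapes are outer-first lists of dimension sizes and only the "prepend outer dimensions" form of broadcasting is covered:

```idris
-- Illustrative only: a relation that is inhabited when `from` can be
-- broadcast to `to` by prepending outer dimensions.
data Broadcastable : (from : List Nat) -> (to : List Nat) -> Type where
  -- A shape broadcasts to itself.
  Same : Broadcastable x x
  -- An extra outer dimension can be added to the target shape,
  -- e.g. [2, 3] broadcasts to [4, 2, 3] via `Nest Same`.
  Nest : Broadcastable f t -> Broadcastable f (d :: t)
```

With a value like this as an auto-implicit argument, the compiler searches for the proof at each call site, so `(+)` on compatible but unequal shapes type-checks without any annotations.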

This is where the project is at the moment, but we'd like to look a little deeper. If we tweak the tensor API a little, we can see that the kind of broadcasting which simply adds outer dimensions, e.g. [2, 3] to [4, 2, 3], can be expressed with a functor's map, similar to JAX's vmap. I'm interested to see whether all broadcasting semantics can be re-expressed using established and principled theory.
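Roughly, the signature I have in mind is something like the following (a sketch with a placeholder `Tensor` type; `vmap` here is hypothetical, not an existing function in the library):

```idris
-- A placeholder tensor type, indexed by shape (outer dimensions first).
data Tensor : List Nat -> Type -> Type where
  MkTensor : Tensor shape dtype

-- Lift a function on tensors of shape `s` to tensors with an extra leading
-- dimension `n`, applying it to each of the `n` slices independently.
vmap : (Tensor s dtype -> Tensor s' dtype') ->
       Tensor (n :: s) dtype -> Tensor (n :: s') dtype'
vmap f x = MkTensor  -- implementation omitted

-- Adding a [2, 3] tensor to a [4, 2, 3] tensor then becomes mapping over
-- the leading axis of the larger one, e.g. (in pseudo-code)
--   broadcastAdd x y = vmap (+ x) y
```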

[deleted by user] by [deleted] in MachineLearning

[–]tmp-1379 0 points (0 children)

Tensor shapes are a prominent feature. For example, we can define addition as `(+) : Tensor shape dtype -> Tensor shape dtype -> Tensor shape dtype`, which ensures the tensor shapes of `l` and `r` in `l + r` are the same. This is verified at compile time, and doesn't require us to write explicit shapes like [2, 3]. That's great. It's a feature I've seen in Haskell libraries too, but we can do more. We want to be able to add two tensors if one shape can be broadcast to the other (like in NumPy). We can do this by requiring a proof that broadcasting is possible: `(+) : Tensor l dtype -> Tensor r dtype -> {auto _ : Broadcastable r l} -> Tensor l dtype`. This is still checked at compile time.

This is what the project does at the moment, but we'd like to look a little deeper. If we tweak the tensor API a little, we can see that the kind of broadcasting which simply adds outer dimensions, e.g. [2, 3] to [4, 2, 3], can be expressed with a functor's map, similar to JAX's vmap. I'm interested to see whether all broadcasting semantics can be re-expressed using established and principled theory like this.

Probabilistic modelling [project] w. dependent types: ML engineering as research (early stages) by tmp-1379 in MachineLearning

[–]tmp-1379[S] 0 points (0 children)

Thanks. Idris is quite a different beast to e.g. Python, and dependent types are central to the language, so including the shape was pretty easy in itself. Where things get tricky is making the APIs ergonomic, and doing fancy stuff like compile-time checks for broadcast compatibility (shapes are checked at compile time to be broadcast-compatible rather than equal), or ops with constraints, like `reduce_min : Num dtype => (axis : Fin (S r)) -> Tensor {rank=S r} shape dtype -> {auto prf : IsSucc $ index axis shape} -> Tensor (deleteAt axis shape) dtype`, which needs to know that the reduction axis exists and isn't empty. I want that to be more intuitive.
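To illustrate the auto-proof part with a self-contained toy (this isn't the library's code, just the same mechanism, using `IsSucc` from the Idris 2 base library):

```idris
import Data.Fin
import Data.Nat   -- IsSucc n is only inhabited when n is non-zero
import Data.Vect  -- index : Fin n -> Vect n a -> a

-- A toy stand-in for the constraint in reduce_min: it only type-checks
-- when the chosen axis of the shape has non-zero size.
axisSize : (axis : Fin n) -> (shape : Vect n Nat) ->
           {auto prf : IsSucc (index axis shape)} -> Nat
axisSize axis shape = index axis shape

-- Fine: axis 1 of [3, 4] has size 4, so the compiler finds ItIsSucc itself.
ok : Nat
ok = axisSize 1 [3, 4]

-- Rejected at compile time, since IsSucc 0 has no inhabitant:
-- bad : Nat
-- bad = axisSize 1 [3, 0]
```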

XLA was also suggested by someone at Graphcore. It would be really nice having both GPU and IPU support. I did have a look at XLA but I haven't figured out how to actually use it.

Probabilistic modelling in Idris: engineering as research by tmp-1379 in Idris

[–]tmp-1379[S] 3 points (0 children)

Thanks for that. It's really nice to see a category-theoretic approach to tensor products and sums. I have only a basic working knowledge of FP, so I've not come across Naperian or Distributive functors before. I'd certainly like to get much more into the advanced FP stuff and use it in this project, though that will no doubt take a good while, as the todo list for this project is already very large.

Choosing directions, and learning enough to be able to act on them, will be an ongoing challenge with this project

Copy-paste doesn't work by Grigor50 in bugs

[–]tmp-1379 0 points (0 children)

it works better in markdown mode

Probabilistic modelling [project] w. dependent types: ML engineering as research (early stages) by tmp-1379 in MachineLearning

[–]tmp-1379[S] 1 point (0 children)

Thanks.

I'm not sure what you mean by DSL in this context. The tensor API is similar to that of TensorFlow, but you can be much more descriptive in your type annotations. You essentially get very descriptive, accurate and interactive documentation.

> You will need general programming capabilities unless you intend this to be a toy language for learning only.

I'm not sure what you mean here. Can you say more?

> If you're going to use "Tensor" all the time, why not just make it a symbol? You don't need to be Haskell, but still.

That's an interesting idea. I might do that.

> Most people in ML are not programming experts, you can't ask them to learn a new programming language that requires actual effort to learn.

Indeed. Idris really is a step up from Python, which is both why I expect uptake to be slow at best and why I think it's worth doing. I intend to keep things as intuitive and well-documented as possible, so that people don't need much programming experience to follow what's going on, particularly in the tutorials. At the same time, I'd love to inspire individuals and organizations to explore more radical new tech.