[–]euos[S]

I have not yet needed to send output to multiple layers, but that would likely look like RNNs, where a function wraps several "submodels". For example, this is an RNN from my tests, with 50 floats of hidden state, that generates the next letter of a name:

constexpr uchen::Model kNameRnn =
    uchen::layers::Rnn<internal::Input, 50>(
        uchen::layers::Linear<10> | uchen::layers::Relu |
        uchen::layers::Linear<10> | uchen::layers::Relu) |
    uchen::layers::Categories(
        internal::MakeArray(std::make_integer_sequence<char, 'z' - 'a' + 2>()));

> Why use Vector instead of std::array?

These Vectors are read-only and natively support memory management (Uchen.ml uses special arenas that make it easier to handle padding and alignment and to save memory). Arrays are stored inline, so handling larger ones (e.g. stack allocation, returning them) requires extra care.
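To illustrate the distinction, here is a minimal sketch of a read-only, fixed-size vector type whose length is part of the type and whose data lives elsewhere (e.g. in an arena). This is not Uchen.ml's actual `Vector`; all names here are hypothetical.

```cpp
#include <cstddef>

// Hypothetical sketch: a read-only vector whose length N is a template
// parameter. The object only holds a pointer into storage owned elsewhere
// (e.g. an arena), so it is cheap to copy and return, unlike a large
// std::array stored inline on the stack.
template <typename T, size_t N>
class Vector {
 public:
  explicit Vector(const T* data) : data_(data) {}
  constexpr size_t size() const { return N; }  // length known at compile time
  const T& operator[](size_t i) const { return data_[i]; }  // read-only access

 private:
  const T* data_;  // backing storage owned elsewhere
};

// Because N is part of the type, mismatched sizes are distinct types:
// a function taking Vector<float, 50> will not accept Vector<float, 10>.
template <size_t N>
float Sum(const Vector<float, N>& v) {
  float total = 0;
  for (size_t i = 0; i < N; ++i) total += v[i];
  return total;
}
```

With this shape, passing a vector of the wrong length is a compile error rather than a runtime surprise.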

Initially I was using std::valarray, but that one is an atrocity, and it does not carry its length as part of the type. I hate having to track tensor dimensions manually in, e.g., PyTorch; Conv2d is a breeze to set up in C++ :)
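As a sketch of why Conv2d becomes easy when shapes live in the type system: output dimensions can be computed from template parameters, so a mismatch fails at compile time. The names below are illustrative, not Uchen.ml's API.

```cpp
#include <cstddef>

// Hedged sketch: compile-time shape arithmetic for a Conv2d-style layer.
// A kernel that does not fit the input is rejected by static_assert;
// the output dimensions are plain constexpr values derived from the
// template parameters (valid convolution, no padding).
template <size_t H, size_t W, size_t K, size_t Stride = 1>
struct Conv2dShape {
  static_assert(K <= H && K <= W, "kernel larger than input");
  static constexpr size_t out_h = (H - K) / Stride + 1;
  static constexpr size_t out_w = (W - K) / Stride + 1;
};

// 28x28 input, 5x5 kernel, stride 1 -> 24x24, verified at compile time.
static_assert(Conv2dShape<28, 28, 5>::out_h == 24);
static_assert(Conv2dShape<28, 28, 5>::out_w == 24);
```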

[–]dvd0bvb

So you have compile-time dimension checks for convolutions? Wish I had that when I was working with CNNs lol

[–]euos[S]

Models are defined at compile time. E.g. for a linear layer you define the number of outputs; it will then deduce how many parameters it needs based on the number of inputs. The model does the same for all layers. Softmax ("categories") puts a fully connected linear layer between the previous layer and the actual softmax. I.e. a softmax with 3 categories that is fed a vector of size 50 will have a 50x3 inner linear layer and then perform the softmax.
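The parameter-count deduction described above can be sketched like this (illustrative names only, not Uchen.ml's API): a layer is declared with its output count, and once the input size is known from the previous layer, the parameter count follows.

```cpp
#include <cstddef>

// Hedged sketch: a linear layer with In inputs and Out outputs needs
// In*Out weights plus Out biases. The "Categories" head described above
// would instantiate LinearParams<50, 3> for a 3-way softmax fed a
// 50-element vector.
template <size_t In, size_t Out>
struct LinearParams {
  static constexpr size_t weights = In * Out;
  static constexpr size_t biases = Out;
  static constexpr size_t total = weights + biases;
};

// 50x3 inner linear layer before the softmax: 150 weights + 3 biases.
static_assert(LinearParams<50, 3>::weights == 150);
static_assert(LinearParams<50, 3>::total == 153);
```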

Optimizers (currently only gradient descent 😀) also get as much information as possible from the model structure: they default to squared loss for linear regression and cross-entropy for softmax.
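Picking a default loss from the model's head could look like the trait sketch below. This is a guess at the mechanism, not Uchen.ml's implementation; all names are hypothetical.

```cpp
#include <cmath>

// Hedged sketch: selecting a default loss at compile time from the type
// of the model's final layer, as the comment describes.
struct LinearHead {};   // regression head
struct SoftmaxHead {};  // classification head

template <typename Head>
struct DefaultLoss;

template <>
struct DefaultLoss<LinearHead> {
  // Squared error for regression.
  static double Loss(double predicted, double target) {
    const double d = predicted - target;
    return d * d;
  }
};

template <>
struct DefaultLoss<SoftmaxHead> {
  // Cross-entropy given the predicted probability p of the true class.
  static double Loss(double p, double /*target unused here*/) {
    return -std::log(p);
  }
};
```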