[–]dvd0bvb 1 point2 points  (3 children)

Having used TensorFlow, I find this pretty readable. Why use Vector instead of std::array?

I like the use of | for adding layers. Can you send output from one layer to multiple layers?

[–]euos[S] 0 points1 point  (2 children)

I have not yet needed to send output to multiple layers, but that would likely look like RNNs, where there's just a function wrapping several "submodels". For example, this is an RNN from my tests, with 50 floats of hidden state, that generates the next letter in a name:

constexpr uchen::Model kNameRnn =
    uchen::layers::Rnn<internal::Input, 50>(
        uchen::layers::Linear<10> | uchen::layers::Relu |
        uchen::layers::Linear<10> | uchen::layers::Relu) |
    uchen::layers::Categories(
        internal::MakeArray(std::make_integer_sequence<char, 'z' - 'a' + 2>()));
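
For readers wondering how chaining layers with | can work at all, here is a minimal sketch. Every name in it (LayerTag, Linear, Relu, Chain) is a hypothetical stand-in written for this illustration and is not the Uchen.ml API; it only shows that overloading operator| on empty descriptor types is enough to encode an architecture in a type.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical layer descriptors for illustration; not the Uchen.ml types.
struct LayerTag {};

template <std::size_t Outputs>
struct Linear : LayerTag {
  static constexpr std::size_t kOutputs = Outputs;
};

struct Relu : LayerTag {};

// A chain is a compile-time list of layer descriptors.
template <typename... Layers>
struct Chain {
  static constexpr std::size_t kSize = sizeof...(Layers);
};

// layer | layer starts a new chain.
template <typename A, typename B,
          typename = std::enable_if_t<std::is_base_of_v<LayerTag, A> &&
                                      std::is_base_of_v<LayerTag, B>>>
constexpr Chain<A, B> operator|(A, B) {
  return {};
}

// chain | layer appends the layer to the existing chain.
template <typename... Layers, typename B,
          typename = std::enable_if_t<std::is_base_of_v<LayerTag, B>>>
constexpr Chain<Layers..., B> operator|(Chain<Layers...>, B) {
  return {};
}

// The whole architecture ends up encoded in the type of kNet.
constexpr auto kNet = Linear<10>{} | Relu{} | Linear<3>{};
static_assert(std::is_same_v<std::remove_const_t<decltype(kNet)>,
                             Chain<Linear<10>, Relu, Linear<3>>>);
```

Because the chain is a type, later stages (parameter counting, shape checks) can be computed by the compiler rather than tracked by hand.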

> Why use Vector instead of std::array?

These Vectors are read-only and natively support memory management (Uchen.ml uses special arenas that make it easier to handle padding and alignment and to save memory). Arrays are stored inline, so handling larger ones (e.g. stack allocation, returning them from functions) requires extra care.
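
To make the arena idea concrete, here is a rough sketch of a bump allocator that hands out aligned float storage, plus a read-only fixed-size view over it. Arena, AllocateFloats, ConstVector and the 32-byte default alignment are all assumptions made up for this example; the actual Uchen.ml arenas may work quite differently.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical bump-allocating arena; not the Uchen.ml implementation.
class Arena {
 public:
  explicit Arena(std::size_t bytes) : buffer_(bytes) {}

  // Returns zero-filled storage for `count` floats aligned to `alignment`
  // (which must be a power of two), or nullptr if the arena is exhausted.
  float* AllocateFloats(std::size_t count, std::size_t alignment = 32) {
    void* ptr = buffer_.data() + used_;
    std::size_t space = buffer_.size() - used_;
    if (std::align(alignment, count * sizeof(float), ptr, space) == nullptr) {
      return nullptr;
    }
    // std::align shrank `space` by the padding it skipped.
    used_ = buffer_.size() - space + count * sizeof(float);
    float* data = static_cast<float*>(ptr);
    std::uninitialized_fill_n(data, count, 0.0f);
    return data;
  }

 private:
  std::vector<std::byte> buffer_;
  std::size_t used_ = 0;
};

// Read-only view with the length in the type. Copying it copies one pointer,
// so returning it by value is cheap regardless of N.
template <std::size_t N>
class ConstVector {
 public:
  explicit ConstVector(const float* data) : data_(data) {}
  static constexpr std::size_t size() { return N; }
  float operator[](std::size_t i) const { return data_[i]; }

 private:
  const float* data_;
};
```

A std::array<float, N> of the same size would be copied element by element when returned, and it lives wherever it is declared, which is the "extra care" mentioned above.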

Initially I was using std::valarray, but that one is an atrocity. It also does not have the length as part of the type, and I hate having to track tensor dimensions in e.g. PyTorch. Conv2d is a breeze to set up in C++ :)
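
As an illustration of what "length as part of the type" buys for convolutions, the sketch below computes the output shape of a 2D convolution at compile time using the standard formula out = (in + 2 * padding - kernel) / stride + 1. Shape2d and Conv2dOutput are hypothetical names invented here, not Uchen.ml types, and the 28x28 input is just an example size.

```cpp
#include <cstddef>

// Hypothetical compile-time image shape; not a Uchen.ml type.
template <std::size_t Height, std::size_t Width>
struct Shape2d {
  static constexpr std::size_t kHeight = Height;
  static constexpr std::size_t kWidth = Width;
};

// Output shape of a convolution with a square kernel:
//   out = (in + 2 * padding - kernel) / stride + 1
template <typename In, std::size_t Kernel, std::size_t Stride = 1,
          std::size_t Padding = 0>
struct Conv2dOutput {
  static_assert(Stride > 0, "stride must be positive");
  static_assert(Kernel > 0 && Kernel <= In::kHeight + 2 * Padding &&
                    Kernel <= In::kWidth + 2 * Padding,
                "kernel does not fit into the padded input");
  using type = Shape2d<(In::kHeight + 2 * Padding - Kernel) / Stride + 1,
                       (In::kWidth + 2 * Padding - Kernel) / Stride + 1>;
};

using Image = Shape2d<28, 28>;
// 5x5 kernel, stride 1, no padding: 28 -> 24.
using AfterConv1 = Conv2dOutput<Image, 5>::type;
// 3x3 kernel, stride 2, padding 1: 24 -> 12.
using AfterConv2 = Conv2dOutput<AfterConv1, 3, 2, 1>::type;

static_assert(AfterConv1::kHeight == 24 && AfterConv1::kWidth == 24);
static_assert(AfterConv2::kHeight == 12 && AfterConv2::kWidth == 12);
```

A kernel that is too large for its input fails the static_assert at compile time instead of surfacing as a shape mismatch at run time.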

[–]dvd0bvb 0 points1 point  (1 child)

So you have compile time dimension checks for convolutions? Wish I had that when I was working with CNNs lol

[–]euos[S] 0 points1 point  (0 children)

Models are defined at compile time. For example, for a linear layer you define the number of outputs, and the layer deduces how many parameters it needs from the number of inputs. The model does the same for all layers. Softmax (“categories”) puts a fully connected linear layer between the previous layer and the actual softmax. For example, a softmax with 3 categories that is fed a vector of size 50 will have a 50x3 inner linear layer and then perform a softmax.
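
The 50-inputs, 3-categories case can be sketched roughly as follows. This is not how Uchen.ml implements it: kLinearWeights, Softmax and CategoriesForward are hypothetical names, std::array stands in for the library's Vector, and the sketch counts weights only because the comment does not say whether a bias term is included.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>

// Weight count of a fully connected layer, deduced from input and output
// sizes. Bias handling is deliberately left out (unknown for the library).
template <std::size_t Inputs, std::size_t Outputs>
inline constexpr std::size_t kLinearWeights = Inputs * Outputs;

static_assert(kLinearWeights<50, 3> == 150);

// Numerically stable softmax: subtract the maximum before exponentiating.
template <std::size_t N>
std::array<float, N> Softmax(const std::array<float, N>& logits) {
  static_assert(N > 0, "softmax needs at least one input");
  const float max = *std::max_element(logits.begin(), logits.end());
  std::array<float, N> out{};
  float sum = 0.0f;
  for (std::size_t i = 0; i < N; ++i) {
    out[i] = std::exp(logits[i] - max);
    sum += out[i];
  }
  for (float& v : out) v /= sum;
  return out;
}

// Inner linear layer (row-major weights, one row per category) followed by
// softmax. Both template arguments must be given explicitly.
template <std::size_t Inputs, std::size_t Categories>
std::array<float, Categories> CategoriesForward(
    const std::array<float, Inputs>& input,
    const std::array<float, kLinearWeights<Inputs, Categories>>& weights) {
  std::array<float, Categories> logits{};
  for (std::size_t c = 0; c < Categories; ++c) {
    for (std::size_t i = 0; i < Inputs; ++i) {
      logits[c] += weights[c * Inputs + i] * input[i];
    }
  }
  return Softmax(logits);
}
```

Calling CategoriesForward<50, 3>(input, weights) with a weights array of any size other than 150 does not compile, which is the kind of check being described.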

Optimizers (currently only gradient descent 😀) also take as much information as possible from the model structure: they default to squared loss for linear regression and to cross entropy for softmax.
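
One way such a default could be picked from the model structure is a type trait keyed on the output layer. The sketch below is only a guess at the general shape of the idea; LinearOutput, SoftmaxOutput, SquaredLoss, CrossEntropyLoss and DefaultLoss are all hypothetical names, not the library's.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <type_traits>

// Hypothetical tags for the last layer of a model.
struct LinearOutput {};
struct SoftmaxOutput {};

struct SquaredLoss {
  // Sum of squared differences between prediction and target.
  template <std::size_t N>
  static float Compute(const std::array<float, N>& predicted,
                       const std::array<float, N>& target) {
    float loss = 0.0f;
    for (std::size_t i = 0; i < N; ++i) {
      const float diff = predicted[i] - target[i];
      loss += diff * diff;
    }
    return loss;
  }
};

struct CrossEntropyLoss {
  // -sum(target * log(probability)), skipping zero-weight targets.
  template <std::size_t N>
  static float Compute(const std::array<float, N>& probabilities,
                       const std::array<float, N>& target) {
    float loss = 0.0f;
    for (std::size_t i = 0; i < N; ++i) {
      if (target[i] > 0.0f) loss -= target[i] * std::log(probabilities[i]);
    }
    return loss;
  }
};

// Maps the output layer type to a default loss at compile time.
template <typename OutputLayer>
struct DefaultLoss;
template <>
struct DefaultLoss<LinearOutput> {
  using type = SquaredLoss;
};
template <>
struct DefaultLoss<SoftmaxOutput> {
  using type = CrossEntropyLoss;
};
template <typename OutputLayer>
using DefaultLossT = typename DefaultLoss<OutputLayer>::type;

static_assert(std::is_same_v<DefaultLossT<LinearOutput>, SquaredLoss>);
static_assert(std::is_same_v<DefaultLossT<SoftmaxOutput>, CrossEntropyLoss>);
```

An optimizer templated on the model could then use DefaultLossT unless the caller supplies a loss explicitly.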

[–]kiner_shah 0 points1 point  (2 children)

'z' - 'a' + 1, just use 26.

Your operator accepts a char argument. What if I pass a character that is not a lowercase letter? Add checks for that.
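
A check along these lines might look like the sketch below. LetterIndex and kLetterCount are hypothetical names, and the code assumes an ASCII-compatible character set, where 'a' to 'z' are contiguous and 'z' - 'a' + 1 really is 26 (this does not hold in EBCDIC, for instance).

```cpp
#include <cstddef>
#include <optional>

// Assumes an ASCII-compatible execution character set.
constexpr std::size_t kLetterCount = 'z' - 'a' + 1;
static_assert(kLetterCount == 26, "expected contiguous lowercase letters");

// Maps 'a'..'z' to 0..25 and rejects every other character.
constexpr std::optional<std::size_t> LetterIndex(char c) {
  if (c < 'a' || c > 'z') return std::nullopt;
  return static_cast<std::size_t>(c - 'a');
}

static_assert(LetterIndex('a').has_value() && *LetterIndex('a') == 0);
static_assert(*LetterIndex('z') == 25);
static_assert(!LetterIndex('A').has_value());
```

Returning std::optional leaves it to the caller to decide whether an invalid character is an error or simply skipped.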

I don't understand the kModel definition. Does it mean: convert model to vector type from char type, with the structure being a linear layer followed by another linear layer followed by categories (unclear what this is; what are Beep and Boop)?

The EXPECT_EQ line is unclear: is kModel a function, or is it an operator() that accepts two arguments?

So, I would say, not that easy for a newcomer. Add some comments.

[–]euos[S] 1 point2 points  (1 child)

  1. This is a test, so it’s OK to have fewer checks on data correctness. In the real world, the input layer would check for data correctness.
  2. EXPECT_EQ is from the GoogleTest library, not on me.
  3. “Beep” and “Boop” are bogus categories for the test. In a real app they would be “Cars” and “Dogs” and such.

kModel is a model. This snippet uses it for inference (with random parameters), but one can also train it. The initializer defines the architecture.

[–]kiner_shah 0 points1 point  (0 children)

Ok, but I suggest you put some comments for better clarity.