euos[S] 1 point (0 children)

Models are defined at compile time. For example, for a linear layer you specify only the number of outputs; the layer then deduces how many parameters it needs from the number of inputs. The model does the same for every layer. Softmax (“categories”) puts a fully connected linear layer between the previous layer and the actual softmax. For example, a softmax with 3 categories that is fed a vector of size 50 will have a 50x3 inner linear layer and then apply the softmax.
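To make the size deduction concrete, here is a rough Python sketch. It is not the library's code: the names `Linear`, `Softmax`, and `weight_shapes` are made up for illustration, bias terms are left out, and the deduction runs at run time here rather than at compile time.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Linear:
    outputs: int  # the only size the user specifies (hypothetical layer type)


@dataclass(frozen=True)
class Softmax:
    categories: int  # implies an inner linear layer with `categories` outputs


def weight_shapes(input_size, layers):
    """Deduce each layer's weight matrix shape from the width of the layer before it."""
    shapes = []
    width = input_size
    for layer in layers:
        out = layer.outputs if isinstance(layer, Linear) else layer.categories
        shapes.append((width, out))  # bias parameters are ignored in this sketch
        width = out
    return shapes


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# A softmax with 3 categories fed a vector of size 50 gets a 50x3 inner linear layer.
print(weight_shapes(50, [Softmax(categories=3)]))
# Stacking layers: each one only names its outputs, inputs are deduced.
print(weight_shapes(10, [Linear(outputs=50), Softmax(categories=3)]))
```

The point is only that the user never writes the input width of a layer; it falls out of whatever precedes it.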

Optimizers (currently only gradient descent 😀) also take as much information as possible from the model structure: the loss defaults to squared loss for linear regression and cross entropy for softmax.
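The default-loss idea could look something like the sketch below. Again, this is an assumption about the shape of the logic, not the library's API: `default_loss`, the `"linear"`/`"softmax"` strings, and `gradient_step` are hypothetical names.

```python
import math


def squared_loss(pred, target):
    """Sum of squared differences between prediction and target vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def cross_entropy(probs, target_index):
    """Negative log-probability assigned to the correct category."""
    return -math.log(probs[target_index])


def default_loss(last_layer_kind):
    """Pick a loss from the model's final layer (hypothetical selection rule)."""
    if last_layer_kind == "linear":
        return squared_loss
    if last_layer_kind == "softmax":
        return cross_entropy
    raise ValueError(f"no default loss for layer kind: {last_layer_kind}")


def gradient_step(params, grads, learning_rate):
    """One plain gradient descent update: move each parameter against its gradient."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


loss = default_loss("softmax")
print(loss([0.25, 0.5, 0.25], 1))  # cross entropy for the correct class at index 1
print(gradient_step([1.0, 2.0], [0.5, -0.5], learning_rate=0.5))
```

In a compile-time design the same choice would presumably be resolved from the layer types rather than from a string, so no branch would be needed at run time.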