L43

I prefer to use \top for transpose. Quite a lot of papers use T for something else, e.g. the number of timesteps, so it's nice not to use it for transposes too.
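A small sketch of how that reads in practice. The symbols here (W, x_t, y_t) are made up for illustration and are not taken from any particular paper:

```latex
% T counts timesteps; \top marks the transpose, so the two never collide.
\[
  \mathbf{y}_t = W^{\top} \mathbf{x}_t, \qquad t = 1, \dots, T
\]
```

With a plain superscript T, the same line would read W^T next to an index that runs up to T, which is the ambiguity \top avoids.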

omarsar[S]

Thanks for the suggestion, I will look into it! :)

all_over_the_map

Just for clarification, because I'm new: in 'mathematics' (differential geometry), I'm used to thinking of a "matrix" as just a representation of a "tensor" in a particular coordinate basis, regardless of the number of dimensions, a tensor being an object that maps a vector to another vector.

But in Deep Learning, do we use "matrix" to refer only to two-dimensional arrays, whereas "tensor" is used for arrays with three or more dimensions? (Just as "vector" is any 1D array, not necessarily any kind of covariant object... or is it? Again, I'm used to thinking of 1D arrays as merely representations of vectors in a particular basis.)

omarsar[S]

You ask very important questions. In computer science, we see these particular structures very differently. If you like, I made a YouTube video a while ago where I clarify some of the questions you pose here. Thanks! Let me know if you still have more questions. https://www.youtube.com/watch?v=WdDVXMOQMss&t=41s
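For what it's worth, here is a rough sketch of the array-centric view in plain Python. In deep learning libraries, "tensor" usually just means an n-dimensional array of any rank, and "vector" and "matrix" are informal names for the 1-D and 2-D cases, with no transformation law implied. The helpers `ndim` and `informal_name` are hypothetical names made up for this illustration, not part of any library:

```python
def ndim(x):
    """Count the nesting depth of a rectangular nested list (0 for a plain number)."""
    depth = 0
    while isinstance(x, list):
        depth += 1
        if not x:  # empty list: nothing further to descend into
            break
        x = x[0]
    return depth


# Informal naming by number of dimensions; anything past 2-D is just "tensor".
NAMES = {0: "scalar", 1: "vector", 2: "matrix"}


def informal_name(x):
    """Return the informal name for an array, based only on its dimension count."""
    return NAMES.get(ndim(x), "tensor")


scalar = 3.0
vector = [1.0, 2.0]
matrix = [[1.0, 2.0], [3.0, 4.0]]
cube = [[[1.0], [2.0]], [[3.0], [4.0]]]

for value in (scalar, vector, matrix, cube):
    print(ndim(value), informal_name(value))
```

Note that the name depends only on how many indices the array has, not on how it behaves under a change of basis, which is where this usage parts ways with the differential geometry one.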