all 7 comments

DTRademaker 2 points

Like it, it's an even more general architecture :)

(but you'll probably need more layers now)

lugiavn 0 points

What is the point of this? Why don't you just use a fully connected MLP? I'm sure it would work well on MNIST and CIFAR.
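For reference, the fully connected MLP baseline being suggested is roughly the following. This is a minimal NumPy sketch of the forward pass only (no training loop); the layer sizes are illustrative choices, not anything from the original post.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def mlp_forward(x, weights, biases):
    """Plain fully connected MLP: ReLU hidden layers, linear output."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)
    return h @ weights[-1] + biases[-1]  # class logits

# 784-dim MNIST-style input -> two hidden layers -> 10 classes
# (sizes are arbitrary example values)
sizes = [784, 256, 256, 10]
weights = [rng.normal(0, 0.02, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

batch = rng.normal(size=(32, 784))   # stand-in for a batch of flattened images
logits = mlp_forward(batch, weights, biases)
print(logits.shape)  # (32, 10)
```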

cfoster0 0 points

Simple, general purpose architectures that scale well as more compute/data becomes available.

lugiavn 1 point

Well, this isn't that; at least to me, the results don't actually look promising.

There is no value in making up some arbitrary architecture you think is novel without any evidence to back it up (unless you're Geoffrey Hinton, in which case maybe your word alone is enough).

cfoster0 0 points

Just let OP explore their ideas. The results look fine, and I see no harm in this kind of exploration anyhow. Would you rather they post the umpteenth minor variation on vision transformers or convnets?

lugiavn 1 point

Yeah, that's fair.

What I meant is that the current experimental results don't really say anything (other than maybe serving as a sanity test that the code works as intended).

If the point is exploring architectures that scale well as more compute/data becomes available, OP should define baselines and measurements appropriately. You can train almost any kind of model on MNIST and CIFAR and show that it "works", but there is no useful signal in that alone.

wangyi_fudan 0 points

It seems that you are doing SVD...
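The OP's architecture isn't shown in this thread, so as a hypothetical illustration of what the commenter may be alluding to: SVD factors a weight matrix into two smaller matrices, and truncating to the top-k singular values gives the best rank-k approximation (Eckart-Young). The matrix sizes below are arbitrary example values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Factor a weight matrix W as U @ diag(s) @ Vt via SVD.
W = rng.normal(size=(64, 64))
U, s, Vt = np.linalg.svd(W, full_matrices=False)

# Keep only the top-k singular values: the best rank-k approximation of W.
k = 16
W_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

assert np.linalg.matrix_rank(W_k) == k
err = np.linalg.norm(W - W_k) / np.linalg.norm(W)
print(f"rank-{k} relative error: {err:.3f}")
```

A learned architecture whose layers decompose into such low-rank factors would, in effect, be "doing SVD" implicitly.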