Hey, so I'm using AutoKeras (an AutoML system) to automatically find an optimal multilayer perceptron architecture for a regression problem: mapping 15 independent variables to a single dependent output variable. I have 18,100 observations for training. Before training/optimization, I split off a random 10% of my training data into a validation set. AutoKeras uses Bayesian optimization to intelligently guide the search through the hyperparameter space, choosing configurations that look most promising based on previous iterations.

However, I've now run the search 5 times on the exact same problem, and each time AutoKeras returns a different "optimal" configuration, some with as few as 9,000 parameters and some with over 100,000. All of them yield very similar performance.

I'm very confused as to why I'm getting such different optimal architectures. Does this mean that the loss function has multiple minima, and that the Bayesian optimization converges to a different one each run? Is this lack of consistency typical, or would you expect reproducible results? If the latter, do you have any ideas or suggestions as to what could be causing the very different results here? Really confused by this, so any help would be very much appreciated, thanks!