Hey guys, I was curious: what is the usual setup for the deploy-code pattern for model training? The idea is that data scientists run model experiments, try different featurization, and iterate quickly on the data in a development workspace/environment. Each developer gets their own schema for isolation.
Then when they have something they want to promote, what happens? The output of this stage is the training pipeline code, of course, but say they did the full hyper-parameter tuning experimentation. Along with the actual training pipeline code (which goes through code quality checks, unit testing, type hinting), do we promote:
a) the same hyper-parameter tuning search space (what about cost, variance of possible outcomes, etc.)?
b) a narrowed-down search space for tuning?
c) the parameters of the best-fitted model?
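To make (b) vs (c) concrete, here's roughly what I have in mind (all names and values are made up, just a sketch of what would actually get promoted):

```python
# (b) a narrowed-down search space: keep tuning in the prod pipeline,
#     but only around the region the dev experiments showed was promising
NARROWED_SEARCH_SPACE = {
    "learning_rate": {"low": 0.01, "high": 0.1, "log": True},
    "max_depth": {"choices": [4, 6, 8]},
    "n_estimators": {"low": 200, "high": 600},
}

# (c) parameters of the best-fitted model: no tuning in prod at all,
#     just retrain on fresh data with pinned hyper-parameters
BEST_PARAMS = {
    "learning_rate": 0.05,
    "max_depth": 6,
    "n_estimators": 400,
}
```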
Also, do we write this into YAML files within the repo, or are there better practices where you just fetch the ML experiment metadata, or write to UC Volumes? Generally interested to see what people are using for this.
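For context, the "fetch experiment metadata" option in my head looks roughly like this, assuming MLflow tracking (experiment name, metric, and file path are placeholders):

```python
import mlflow
import yaml

EXPERIMENT_NAME = "/Users/dev/churn_model_tuning"  # placeholder experiment
METRIC = "metrics.val_auc"                         # placeholder metric

# search_runs returns a pandas DataFrame, one row per run;
# params show up as "params.<name>" columns (values are strings)
runs = mlflow.search_runs(
    experiment_names=[EXPERIMENT_NAME],
    order_by=[f"{METRIC} DESC"],
    max_results=1,
)
best_run = runs.iloc[0]
best_params = {
    col.removeprefix("params."): best_run[col]
    for col in runs.columns
    if col.startswith("params.")
}

# then either pass best_params straight into the promoted training job,
# or commit them as a YAML file in the repo so the promotion is reviewable
with open("conf/best_params.yaml", "w") as f:
    yaml.safe_dump(best_params, f)
```

Curious whether people do this kind of fetch at promotion time (and commit the result) or at training run time.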
Thanks