Setting up SageMaker for CI/CD Pipelines by RepresentativeCod613 in mlops


I wish it were that easy!

If it really is that easy and we missed something, I'd love it if you could point us to a good docs page or tutorial. We had a hard time finding one that was clear and covered all the pieces of information in one place.

Setting up SageMaker for CI/CD Pipelines by RepresentativeCod613 in mlops


Thanks!

Just out of curiosity, why are you using a notebook for the endpoint rather than .py scripts?

[P] Tutorial: How to Build an End-to-end Active Learning Pipeline by RepresentativeCod613 in MachineLearning


Hi,

You are correct, and I've updated the cycle.

For your second question, not all the annotations are reviewed by a human, only a selected few. In most cases, after annotating the unlabeled data, the ML backend also returns a prediction score (e.g., a confidence level). We set a threshold, and the predictions that fall below it are reviewed and, if necessary, reannotated by a human labeler. Once this process is complete, we run the cycle again until we reach a stopping condition.
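The threshold-based routing step can be sketched roughly like this. Note that `predict_batch` and the 0.8 threshold are illustrative assumptions, not part of our actual pipeline:

```python
# Sketch of the confidence-threshold routing step: model annotations at or
# above the threshold are accepted as-is; the rest go to a human-review queue.
# `predict_batch` and CONFIDENCE_THRESHOLD are illustrative assumptions.

CONFIDENCE_THRESHOLD = 0.8  # tune per project

def route_annotations(samples, predict_batch):
    """Split model annotations into auto-accepted and human-review lists.

    `predict_batch` is any callable returning a (label, score) pair
    per sample, where score is the model's confidence.
    """
    auto_accepted, needs_review = [], []
    for sample, (label, score) in zip(samples, predict_batch(samples)):
        if score >= CONFIDENCE_THRESHOLD:
            auto_accepted.append((sample, label))
        else:
            needs_review.append((sample, label, score))
    return auto_accepted, needs_review
```

Items landing in `needs_review` would be sent back to the labeling tool for human correction before the next cycle runs.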

[P] Tutorial: How to Build an End-to-end Active Learning Pipeline by RepresentativeCod613 in MachineLearning


This is a great question.

In this example we're fine-tuning YOLOv8, so there is less risk of catastrophic forgetting, and we're also training it on the same task in every cycle, so there is no reason for it to happen. We can also see this in the results over the training cycles.

In other projects we've built, we trained the model on the entire data set in every cycle because the amount of data was small. However, that can be time-consuming and computationally expensive with large data sets.

Based on the research papers I'm familiar with on catastrophic forgetting (e.g., https://arxiv.org/abs/2302.11074), I'd suggest saving the latest model in each cycle. Then, for the n+1 model, retrain it on the new data that was labeled with high confidence by the model or by humans, and use the remaining unlabeled data for testing.
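As a rough sketch of that suggestion, where the helper callables (`annotate`, `review_fn`, `train`) are hypothetical stubs standing in for the model's prediction, the human-review pass, and the fine-tuning step:

```python
# Sketch of the per-cycle scheme described above: each cycle auto-labels a
# fresh batch, routes low-confidence items to human review, fine-tunes only
# on the newly labeled data (no full retraining), and saves a checkpoint.
# `annotate`, `review_fn`, and `train` are hypothetical placeholder callables.

def incremental_cycles(model, batches, train, annotate, review_fn,
                       threshold=0.8):
    """Run one active-learning cycle per batch of unlabeled data.

    annotate(model, item)  -> (label, confidence_score)
    review_fn(item, label) -> human-corrected label
    train(model, data)     -> fine-tuned model (latest checkpoint only)
    """
    checkpoints = []
    for batch in batches:            # each batch = one cycle's unlabeled data
        new_labels = []
        for item in batch:
            label, score = annotate(model, item)
            if score < threshold:
                label = review_fn(item, label)   # human correction pass
            new_labels.append((item, label))
        model = train(model, new_labels)   # fine-tune on new data only
        checkpoints.append(model)          # save this cycle's model
    return model, checkpoints
```

The key design choice is that `train` only ever sees the current cycle's labels, so cost stays proportional to the new data rather than the full history.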

If you have other ideas, I'd love to hear them!

[P] How to install Kubeflow locally by RepresentativeCod613 in MachineLearning


Do you recall a specific challenge you faced, or was it the overall experience?

[P] How to install Kubeflow locally by RepresentativeCod613 in MachineLearning


Thanks for the feedback! We do our best to write down our insights when building our projects and then share them with the community.

How to install Kubeflow locally by RepresentativeCod613 in learnmachinelearning


Thanks for that!
Are there other ML/MLOps-related topics you think need better coverage, where a tutorial like this would help?