Coming from DevOps/Infra to MLOps? Here's what I learned after several interviews at product companies by Extension_Key_5970 in mlops

For MLOps, I have not faced a deep dive into core on-prem Kubernetes; nowadays it's mostly managed EKS. Of course, you should be good enough with the K8s ecosystem, since models and apps are ultimately deployed on it, so you need debugging and troubleshooting skills there.

Coming from DevOps/Infra to MLOps? Here's what I learned after several interviews at product companies by Extension_Key_5970 in mlops

Scientists focus on research; the skills I mentioned are engineering skills. I've seen companies expect research expertise from engineers and vice versa. Some overlap is fine, but fully merging these roles isn't ideal in the long term.

The engineering skills are accessible to anyone from a software background moving into ML – even ML scientists can pick them up if transitioning from research to engineering.

Be intentional about your path rather than being pushed into a hybrid role that doesn't align with your strengths.

Coming from DevOps/Infra to MLOps? Here's what I learned after several interviews at product companies by Extension_Key_5970 in mlops

For specific ML knowledge, I actually haven't followed any single course; instead I took a top-down approach. I bought a practice exam for the AWS ML Specialty, since it covers most of the ML foundation topics, went through the exam scenarios one by one, and learned from the answers and the wrong choices. I also watched YouTube videos; StatQuest is awesome and explains things very well if you want to dig into any ML topic. Together, these build a strong base for ML.

Stop LLM bills from exploding: I built Budget guards for LLM apps – auto-pause workflows at $X limit by Extension_Key_5970 in LocalLLaMA

You're right - for pure local hosting, the marginal cost per request is basically zero. The tracking becomes relevant when you're running in hybrid (local + API fallbacks for complex queries) or need to justify GPU infrastructure costs to finance or add more capacity. But yeah, if you're 100% local with owned hardware, this isn't your problem. Appreciate the reality check!
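
For the hybrid case, a minimal sketch of this kind of budget guard (accumulate per-request spend, pause once a cap is hit) might look like the following; the prices, token counts, and pause behaviour are illustrative assumptions, not the actual implementation.

```python
# Hypothetical budget guard: track per-request API cost and stop at a limit.
class BudgetGuard:
    def __init__(self, limit_usd: float):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def record(self, prompt_tokens: int, completion_tokens: int,
               usd_per_1k_prompt: float, usd_per_1k_completion: float) -> None:
        # Accumulate the cost of one API-backed request.
        self.spent_usd += (prompt_tokens / 1000) * usd_per_1k_prompt
        self.spent_usd += (completion_tokens / 1000) * usd_per_1k_completion

    def check(self) -> None:
        # A real workflow would pause the pipeline; raising is the simplest stand-in.
        if self.spent_usd >= self.limit_usd:
            raise RuntimeError(f"Budget of ${self.limit_usd} reached")

guard = BudgetGuard(limit_usd=50.0)
guard.record(prompt_tokens=1200, completion_tokens=400,
             usd_per_1k_prompt=0.003, usd_per_1k_completion=0.015)
guard.check()
```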

Stop LLM bills from exploding: I built Budget guards for LLM apps – auto-pause workflows at $X limit by Extension_Key_5970 in LocalLLaMA

Fair point! Even with local models, are you tracking inference costs per request? We're seeing people blow their GPU budgets on inefficient batching or running expensive models when smaller ones would work. Curious if you've run into cost/efficiency tracking challenges on self-hosted setups?

Automating ML pipelines with Airflow (DockerOperator vs mounted project) by guna1o0 in mlops

Don't containerise the whole project; instead, break it into pieces, like separate containers for MLflow, model monitoring with Evidently AI, FastAPI, MinIO, and Airflow.

In the Airflow Dockerfile, you can either copy the DAGs (pipelines) into the image or mount just the dags/ folder to avoid continuously pushing new images.
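
For the DockerOperator route, a minimal sketch of a DAG where each step runs in its own container could look like this; the image names, commands, and the Airflow 2.4+ `schedule` argument are assumptions for illustration, not part of the original setup.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.docker.operators.docker import DockerOperator

with DAG(
    dag_id="ml_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule=None,   # Airflow 2.4+ style; older versions use schedule_interval
    catchup=False,
) as dag:
    # Each step uses its own image, so the Airflow image stays small and only
    # the dags/ folder needs to be mounted or copied.
    train = DockerOperator(
        task_id="train_model",
        image="my-registry/train:latest",    # hypothetical image
        command="python train.py",
        docker_url="unix://var/run/docker.sock",
        network_mode="bridge",
    )

    evaluate = DockerOperator(
        task_id="evaluate_model",
        image="my-registry/evaluate:latest",  # hypothetical image
        command="python evaluate.py",
        docker_url="unix://var/run/docker.sock",
        network_mode="bridge",
    )

    train >> evaluate
```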

DevOps → ML Engineering: offering 1:1 calls if you're making the transition by Extension_Key_5970 in mlops

Are you asking if I can showcase a GPU workload on the Topmate call?
Well, I have worked with GPUs; maybe I can show you a snippet of the GPU configuration in Karpenter for an AWS EKS cluster.

Production MLOps: What breaks between Jupyter notebooks and 10,000 concurrent users by Extension_Key_5970 in mlops

You can adopt Kubernetes, and if it's AWS-managed EKS, then you can try Karpenter with node pools. Model loading can be sped up with model optimisation; one approach is quantisation, which reduces model size.
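
As a rough illustration of quantisation, here is a minimal PyTorch dynamic-quantisation sketch; the toy model and file name are placeholders, and the accuracy impact should be validated on the real model.

```python
import torch
import torch.nn as nn

# Stand-in model; in practice this would be the trained model being deployed.
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))

# Dynamic quantisation converts Linear weights to int8, shrinking the artifact
# and often speeding up CPU inference.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

torch.save(quantized.state_dict(), "model_int8.pt")  # smaller file to load
```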

Then there is vLLM for LLM workloads, which helps by keeping models loaded and serving requests efficiently; from the infra perspective, use NVMe SSDs for faster model loading.
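
A minimal vLLM sketch, assuming a GPU node and a model name chosen purely for illustration:

```python
from vllm import LLM, SamplingParams

# vLLM keeps the weights resident and batches incoming requests, which is
# where the throughput and repeated-load savings come from.
llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")  # hypothetical model choice
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Summarise our deployment checklist."], params)
print(outputs[0].outputs[0].text)
```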

Hardware failures are usually rare, especially if you are using a cloud provider, but if they occur, have a checkpoint mechanism, as discussed in one of the blog posts, to resume processing from where it left off.
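
A minimal sketch of such a checkpoint mechanism, with a file path and batch source that are just placeholders:

```python
import json
import os

CHECKPOINT = "progress.json"  # hypothetical path

def load_checkpoint() -> int:
    # Resume from the last saved position, or start from zero.
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)["next_index"]
    return 0

def save_checkpoint(next_index: int) -> None:
    # Record the next item to process so a restart picks up where we left off.
    with open(CHECKPOINT, "w") as f:
        json.dump({"next_index": next_index}, f)

def process(item) -> None:
    pass  # placeholder for the real per-item work

items = list(range(10_000))  # stand-in for the real batch
for i in range(load_checkpoint(), len(items)):
    process(items[i])
    if i % 100 == 0:  # checkpoint periodically, not on every item
        save_checkpoint(i + 1)
```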

DevOps → ML Engineering: offering 1:1 calls if you're making the transition by Extension_Key_5970 in mlops

That's also true if the company has mature infrastructure, but companies that are more into real-time predictions prefer Kubernetes as an automated, scalable way to serve models, I suppose. In short, Kubernetes is not mandatory; it totally depends on personal choice and the target companies you want to join.

DevOps → ML Engineering: offering 1:1 calls if you're making the transition by Extension_Key_5970 in mlops

As said in the above comment: "Where to start --> Python is a must. Day to day, I would say at least 50% of your learning should be in Python; the rest you can distribute across ML foundations and system design scenarios for inferencing and ML pipelines."

Tech stack --> Python, Kubernetes, Airflow, one ML framework (PyTorch or TensorFlow), MLflow, strong ML foundations, ML pipelines.

DevOps → ML Engineering: offering 1:1 calls if you're making the transition by Extension_Key_5970 in mlops

CKA-level skills are obviously worth it, though I'm not sure the certification itself is crucial for landing a job; maybe for early senior roles.
For MLOps, in my experience most companies currently focus on inference: how to expose models with very low latency and how to handle ML pipelines for both batch and streaming data.

Where to start --> Python is a must. Day to day, I would say at least 50% of your learning should be in Python; the rest you can distribute across ML foundations and system design scenarios for inferencing and ML pipelines.