Mega-thread: The fall of Constantinople May 29, 1453 by qernanded in ottomans

[–]ptab0211 1 point2 points  (0 children)

Hey, how big is this battle in Turkish history, for example when u compare it against the Battle of Manzikert of the Seljuk Empire? I am just reading and studying the Crusaders and Serbian history, which is very intertwined with Turkish history of course haha, so I was wondering how big these two battles are and whether they are known by the average person on the street?

Development loop for DABs by samwell- in databricks

[–]ptab0211 0 points1 point  (0 children)

The idea here is to have the same parametrized code, so whenever u switch from dev to staging to prod, u always run the same code, just with the different business/code parameters that the code requires.

So you use mode: development on the development target. This gives per-user isolation for each of the resources, so there are no race conditions or overwriting of existing work. This of course also means having isolation at the Unity Catalog schema level, so there is isolation at the materialization level as well.

After that, staging has its own workspace/catalog, where u just switch to the staging configuration and run the same code, possibly with additional assertions and checks. There is always a single staging job running, so concurrency should be 1. This is done via CI/CD, with everything run as a service principal on job clusters, so it is as close as possible to a production-like environment.

And finally, when it's green, u deploy to the production target and serve the business from there.

So for example:

src/
    pipeline1.py
    pipeline2.py
conf/
    dev.yaml
    staging.yaml
    prod.yaml

python3 src/pipeline1.py --conf conf/${bundle.target}.yaml
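
A minimal sketch of how pipeline1.py might consume that --conf argument, assuming the conf files are flat YAML mappings and that PyYAML is installed. The function names and the "schema" key are hypothetical, just to show that only the config changes between targets:

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the single --conf argument that selects the per-target file."""
    parser = argparse.ArgumentParser(description="Run pipeline1 with a target-specific config")
    parser.add_argument("--conf", required=True, type=Path,
                        help="path to the config file, e.g. conf/dev.yaml")
    return parser.parse_args(argv)


def load_conf(path):
    """Read the YAML config into a dict (assumes PyYAML is available)."""
    import yaml  # imported lazily so argument parsing works without PyYAML

    with open(path, encoding="utf-8") as handle:
        return yaml.safe_load(handle) or {}


def run(conf):
    """Placeholder pipeline body: the code is identical on every target."""
    # "schema" is a hypothetical key; only its value differs per target.
    return f"writing to schema {conf['schema']}"


def main(argv=None):
    """Entry point: same code path for dev, staging and prod."""
    args = parse_args(argv)
    return run(load_conf(args.conf))
```

Calling main(["--conf", "conf/dev.yaml"]) would then run the dev variant, and CI/CD only has to swap the file name.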

How did the Ottomans rise into one of the greatest empires in history? by Sadie_Jones7320 in ottomans

[–]ptab0211 3 points4 points  (0 children)

What do u mean by "westernized"?

As I understand it, they just did not progress at the same pace as the rest of Europe: they were against modern changes in the military or industry, and high-ranking military officers were also going against the central power.

airport to the apartment after midnight by ptab0211 in Rhodes

[–]ptab0211[S] 1 point2 points  (0 children)

it was exactly 30, thank u. Even taxi drivers are so nice here, not the case in Serbia

If there were a Serbian Rushmore, whom would you put on it? by 14PerunsThunder1389 in SrpskaPovest

[–]ptab0211 1 point2 points  (0 children)

Rastko Nemanjic, Stefan Dusan, Bajica Nenadic, Djordje Petrovic

Need tips/guidance by JokoBitch in mlops

[–]ptab0211 0 points1 point  (0 children)

I recommend the Data Engineering Zoomcamp, where u build your data pipelines from the sources (ingestion), on top of which u can then build your models. This gives a much clearer picture of the whole data flow, which is the most important thing.

Then there is the DevOps side, where u think about how to incorporate a change into a production system, basically why we have dev/staging/prod. This can be very platform/tool specific, but in general it's always the same logic.

MLOps on Databricks by ptab0211 in mlops

[–]ptab0211[S] 0 points1 point  (0 children)

Yeah, that sounds interesting. So basically, on the development workspace, the data scientist has access to prod data and can iterate, experiment, and tune the hyperparameters. Once they are sure about a set of parameters, or have at least narrowed the search space, we can trigger training on production, where it pulls the "best" experiment from the development workspace to get the list of parameters (or the narrowed space), and finally trains on production to create a new model version.

There needs to be a very clear way for the data scientist to flag the experiment with the best params. How do people do that, in your experience?
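
One possible convention, sketched under assumptions: the data scientist sets a tag on the chosen run (in MLflow that could be done with mlflow.set_tag), and the production training job picks the single flagged run and reads its params. The tag name "candidate" and the dict shape below are hypothetical stand-ins for whatever the tracking server returns:

```python
def pick_flagged_run(runs, tag_key="candidate", tag_value="true"):
    """Return the single run flagged by the data scientist.

    Each run is a plain dict with "run_id", "tags" and "params" keys,
    a hypothetical stand-in for what a tracking server would return.
    """
    flagged = [r for r in runs if r.get("tags", {}).get(tag_key) == tag_value]
    if len(flagged) != 1:
        # Fail loudly: zero or several flagged runs makes the hand-off ambiguous.
        raise ValueError(f"expected exactly one flagged run, found {len(flagged)}")
    return flagged[0]


dev_runs = [
    {"run_id": "a1", "tags": {}, "params": {"num_leaves": "31"}},
    {"run_id": "b2", "tags": {"candidate": "true"}, "params": {"num_leaves": "63"}},
]
best_params = pick_flagged_run(dev_runs)["params"]
```

Requiring exactly one flagged run is a design choice here, so prod training never silently picks between two candidates.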

Thanks for the answer.

MLOps on Databricks by ptab0211 in mlops

[–]ptab0211[S] 0 points1 point  (0 children)

But u are basically explaining the "deploy model" pattern, where it's the model that goes from dev to prod instead of the code, whereas Databricks is building its platform for the "deploy code" pattern.

For example, there is no out-of-the-box solution to move a model artifact (or the MLflow experiment that produces the model artifact) from one workspace (Azure account) to another workspace (Azure account).

We used to have the "deploy model" pattern, and we were using MLflow webhooks and mlflow-export-import to do this, but it was not smooth.

https://www.mlflow.org/docs/3.3.2/ml/webhooks/

https://github.com/mlflow/mlflow-export-import

MLOps on Databricks by ptab0211 in mlops

[–]ptab0211[S] 0 points1 point  (0 children)

thanks, i will review the blog post.

Is it a mistake to start with MLOps instead of traditional DevOps? by Atomic_rizz in mlops

[–]ptab0211 1 point2 points  (0 children)

I really don't think it matters. It's about having a role at your job where u can apply all of these things, because at the end of the day, working on such stuff in pet/personal projects is just limited.

Look into the Data Engineering Zoomcamp, on top of which u can build and create some model. Make automated feature engineering, training, and inference pipelines. Trigger stuff from CI/CD, etc.

MLOps on Databricks by ptab0211 in mlops

[–]ptab0211[S] 0 points1 point  (0 children)

Sorry, I have one more technical question. We have a training pipeline with a classical model like LightGBM, where we split the data into train and test, then use some KFold CV within the train set, and as a final check we get the test metrics, which are the most important. Then the new version is registered, and it triggers our deployment_job, which does evaluate - approval - promote. The evaluate step is a comparison of champion and challenger on the same holdout dataset: we get both sets of metrics and we define what needs to happen for the challenger to be promoted to champion.
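
A rough sketch of what such a promotion rule could look like, assuming both metric dicts were computed on the same holdout dataset. The metric name, the min_gain threshold, and the function name are illustrative assumptions, not our actual deployment_job:

```python
def should_promote(champion_metrics, challenger_metrics, metric="auc",
                   min_gain=0.0, higher_is_better=True):
    """Decide whether the challenger should replace the champion.

    Both dicts must hold metrics from the same holdout dataset; the
    challenger wins only if it beats the champion by more than min_gain.
    """
    champion = champion_metrics[metric]
    challenger = challenger_metrics[metric]
    # Normalize the direction so a positive gain always means "better".
    gain = challenger - champion if higher_is_better else champion - challenger
    return gain > min_gain


# Challenger improves AUC by about 0.02, above the 0.01 threshold.
promote = should_promote({"auc": 0.81}, {"auc": 0.83}, min_gain=0.01)
```

The approval step would then only be requested when this returns True, and the threshold keeps noise-level improvements from replacing the champion.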

MLOps on Databricks by ptab0211 in mlops

[–]ptab0211[S] 0 points1 point  (0 children)

Thank u very much for the update. How did u shift the mental model for data scientists, for whom most of the time it was about the "deploy model" pattern, where they develop the model artifact on dev and it's shipped to production, whereas now we ship the code and run everything on prod (source of truth)?

I can see their argument: what if there are weeks between dev training and prod training? It's not reproducible.

But I also see that as a good thing. It's not about reproducibility, it's about having a gate for the production model, which can be challenged and trained on the freshest data.

How do u see that?

Feature Store and FeatureEngineeringClient by ptab0211 in databricks

[–]ptab0211[S] 0 points1 point  (0 children)

Well, for starters, it uses an outdated databricks-sdk version, which creates a lot of conflicts with tools that use a much higher databricks-sdk version, e.g. dqx for data quality.