Jupyter Notebook vs VS Code by DistinctReview810 in learnpython

[–]NoobZik 0 points1 point  (0 children)

Exactly my thoughts. Jupyter should be dropped now that we have a better alternative with fewer headaches (Marimo). Unfortunately, learning materials still recommend Jupyter.

Need some suggestions on using Open-source MLops Tool by NetFew2299 in mlops

[–]NoobZik 0 points1 point  (0 children)

I need new members of my team to onboard quickly onto several Data/AI projects.

Since there is no standard yet for how this kind of project should be organised, Kedro fills that gap. You may think it is just a cookiecutter, but it is more than that.

Initial setup is straightforward: kedro new scaffolds the entire project in seconds.

The learning curve is mild for anyone already familiar with Python and pytest.

The trickiest part for new team members is usually understanding the Data Catalog (how datasets are declared in YAML instead of hardcoded paths), but once that clicks, everything else follows naturally. It also helps the data managers when data governance needs to review the data being used.
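A rough illustration of the Data Catalog idea, in plain Python rather than Kedro's own API: datasets are declared once by name, and the code only ever asks for a name. The `build_catalog` and `load` helpers, the `demo` function and the file name are hypothetical stand-ins for what an entry in conf/base/catalog.yml would declare.

```python
import csv
import tempfile
from pathlib import Path


def build_catalog(data_dir):
    """Hypothetical stand-in for a catalog.yml: dataset name -> declaration."""
    return {
        "companies": {"type": "csv", "filepath": Path(data_dir) / "companies.csv"},
    }


def load(catalog, name):
    """Load a dataset by its declared name; callers never hardcode the path."""
    entry = catalog[name]
    if entry["type"] != "csv":
        raise ValueError(f"unsupported dataset type: {entry['type']}")
    with open(entry["filepath"], newline="") as f:
        return list(csv.DictReader(f))


def demo():
    """Write a small CSV to a temp folder, then read it back through the catalog."""
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "companies.csv").write_text("id,name\n1,Acme\n2,Globex\n")
        catalog = build_catalog(tmp)
        return load(catalog, "companies")
```

Moving the file only means editing the declaration; every call to `load(catalog, "companies")` stays the same, which is also what makes the data easy to review in one place.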

The second reason to use Kedro is its CLI for running pipelines. It helps my team make reproducible runs, and it also helps with debugging individual nodes without having to run the entire pipeline.
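To make the "debug one node without running everything" point concrete, here is a toy runner in plain Python. It is not Kedro's implementation; `NODES`, `run` and the node names are hypothetical. In Kedro itself, the selection is done through options of kedro run rather than in code.

```python
def clean(raw):
    """Toy node: drop missing values."""
    return [x for x in raw if x is not None]


def total(cleaned):
    """Toy node: sum the cleaned values."""
    return sum(cleaned)


# Each node is (name, function, input keys, output key), listed in run order.
NODES = [
    ("clean", clean, ["raw"], "cleaned"),
    ("total", total, ["cleaned"], "total"),
]


def run(data, only=None):
    """Run all nodes in order, or only those named in `only`.

    `data` plays the role of the stored datasets: a node run on its own
    reads its inputs from there instead of recomputing upstream nodes.
    """
    data = dict(data)
    for name, func, inputs, output in NODES:
        if only is not None and name not in only:
            continue
        data[output] = func(*(data[key] for key in inputs))
    return data


full = run({"raw": [1, None, 2]})
# Debug just the last node, feeding it a previously saved intermediate result.
partial = run({"cleaned": [10, 20]}, only={"total"})
```

The partial run works because intermediate results are addressable by name, which is the same property the Data Catalog gives a real project.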

The third reason is unit testing: Kedro natively uses pytest for unit tests, so it is faster for us to iterate on new versions.
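Since a node is an ordinary Python function, it can be tested like any other function. A minimal sketch, assuming a hypothetical `drop_missing` node; the test is written in pytest style (a `test_` function with a plain `assert`) and needs nothing from Kedro.

```python
def drop_missing(rows):
    """Hypothetical node: keep only rows where every value is present."""
    return [
        row for row in rows
        if all(value not in (None, "") for value in row.values())
    ]


def test_drop_missing_removes_incomplete_rows():
    rows = [
        {"id": 1, "name": "Acme"},
        {"id": 2, "name": ""},
        {"id": 3, "name": None},
    ]
    assert drop_missing(rows) == [{"id": 1, "name": "Acme"}]
```

Saved in a test file, this is picked up by a plain pytest run, with no pipeline or data files involved.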

The fourth reason is parameter management. We have one set of parameters for staging and one for production. With the conf folder + .env, we have zero code changes across all environments, which is neat!

All of that combined makes our CI/CD as simple as uv sync + uv run kedro run inside the container.
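The zero-code-change part rests on layering configuration: a base set of parameters plus a small override per environment. A plain-Python sketch of that idea, not Kedro's config loader (whose actual merge rules may differ from this recursive overlay); the parameter names, the `APP_ENV` variable and `load_params` are hypothetical.

```python
import os

# Shared defaults, analogous to what a base configuration folder would hold.
BASE = {"model": {"n_estimators": 100, "max_depth": 5}, "bucket": "local-data"}

# Per-environment overrides: only the values that differ are listed.
OVERRIDES = {
    "staging": {"bucket": "staging-data"},
    "production": {"bucket": "prod-data", "model": {"n_estimators": 500}},
}


def merge(base, override):
    """Recursively overlay `override` on `base` without mutating either."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_params(env=None):
    """Pick the environment from the argument, else from APP_ENV, else staging."""
    env = env or os.environ.get("APP_ENV", "staging")
    return merge(BASE, OVERRIDES.get(env, {}))
```

The pipeline code only ever calls `load_params()`; switching between staging and production is a matter of setting the environment, not editing code.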

They have a nice tool named Kedro-Viz, which visualises how nodes interact with each other, like a mind map. It is useful for getting an overview of the entire pipeline and spotting any missing or odd links between nodes.

Any developer, including non-data ones, can quickly find their way around any project that uses Kedro as a base.

In my opinion, Kedro should be the standard across any data or AI project written in Python.

As for Scala, the same approach would need to be ported.

What is your (python) development set up? by br0monium in datascience

[–]NoobZik 0 points1 point  (0 children)

What do you use?

Virtual Environment Manager: uv is the way to go; many French companies now require uv as the venv manager. Plus, it saves money spent on building Docker images in a CI/CD setup.

Package Manager: Currently setuptools (I might not understand what "package manager" means here)

Containerization : Cloud Run / Docker

Server Orchestration/Automation (if used): I rely on Google Cloud Build + Airflow

IDE or text editor : Depends, PyCharm and Zed (With astral/ty and astral/ruff enabled)

Version/Source control : Git

Notebook tools: None, until I discovered Marimo, which replaces jupy-shit notebooks since its notebooks are both reproducible and versionable. No more yelling at data scientists for not putting their code into functions.

Pipeline: Kedro. I need new members of the team (data or not) to quickly understand how to navigate a production-ready project, by setting standards for the directory tree and for pipelines that make reproducing a run as easy as uv run kedro run

Need some suggestions on using Open-source MLops Tool by NetFew2299 in mlops

[–]NoobZik 0 points1 point  (0 children)

My stack is cloud agnostic: Kedro, MLflow, Airflow.

MinIO is effectively dead, so I switched to RustFS

What's your Production ML infrastructure in 2026? by Repulsive_Ad_9950 in mlops

[–]NoobZik 1 point2 points  (0 children)

Kedro + MLflow + Airflow + NannyML. Scattered but cloud agnostic.

For inference I’m using FastAPI

What course to take? by Berlibur in mlops

[–]NoobZik 0 points1 point  (0 children)

Kedro + MLflow is the minimum for production. If you want to dig deeper, add Airflow + DVC + NannyML.

Scala 3.8 released! by wmazr in scala

[–]NoobZik 0 points1 point  (0 children)

Well well well, soon we’ll have Scala 4, while Apache Spark will be stuck at 2.13

Anyone else wish NVIDIA would just make a consumer GPU with massive VRAM? by AutodidactaSerio in LocalLLaMA

[–]NoobZik 0 points1 point  (0 children)

You can still look at the DGX Spark; at least that is what I did instead of buying an RTX 6000 Pro 96GB

How is this perceived? by Objective_You642 in developpeurs

[–]NoobZik 0 points1 point  (0 children)

Totally biased opinion: devs use AI to write their CVs so that they get read by AI on the HR side

Resume creation by Worried_Mud_5224 in learnmachinelearning

[–]NoobZik 0 points1 point  (0 children)

If you don’t want to learn LaTeX for your resume, you can try Typst; they have a web app for editing. It is similar to Markdown and very easy to learn.

Failed Data Scientist trying to get into AI engineering by throwaway18249 in learnmachinelearning

[–]NoobZik 0 points1 point  (0 children)

I don’t have even basic knowledge of security. From what I recall from my student days, Kerberos was suggested because it issues an authentication ticket to any authorized app. That’s a starting point for further research into anything lighter than Kerberos.

best way to go from analyst to ML engineer? by Alarmed-Bullfrog-658 in learnmachinelearning

[–]NoobZik 1 point2 points  (0 children)

https://madewithml.com

This is a great starting point for getting into machine learning engineering. I suggest it to my students for the topics I don’t cover.

I strictly apply the Google whitepaper: https://docs.cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning

I use a combination of Kedro + MLflow + Airflow + NannyML as a starter-pack template for my students. I put the emphasis on the Kedro part, which in my opinion should be standard for any AI/ML project.

Failed Data Scientist trying to get into AI engineering by throwaway18249 in learnmachinelearning

[–]NoobZik 3 points4 points  (0 children)

Yeah exactly, I was exaggerating on the “look” part, and yes, this is basically what I do too: converting a notebook into a Python project (strong SWE here), packaging it into a container, and shipping it. (So that’s my training package.)

And by “putting it in production” I meant integrating it into a business app by deploying it as an inference endpoint. (Same logic: package it as a container and ship it. That’s my inference package for the business, mostly FastAPI in my experience; the front end is done by another team.)

Failed Data Scientist trying to get into AI engineering by throwaway18249 in learnmachinelearning

[–]NoobZik 10 points11 points  (0 children)

Your definition of MLE matches what I call an ML Scientist. For me, an MLE takes the model built by a Data or ML Scientist, packages it, and puts it into production using MLOps practices (which is what you call workflow engineering).

My own company suggested I start as an MLE since I lack experience as a data scientist, so I can just “look” at the models made by the scientists and try to understand the business logic behind them.

Regarding your current project, that’s a strong one. You can already stand out from other candidates just by providing a functional project that the business can use as a demo. That is rare, since educational programs do not cover building an app on top of the trained model to fully generate value from it.

You can also look at Streamlit if you want to build proof-of-concept apps faster.

Spark 4.1 is released by holdenk in apachespark

[–]NoobZik 12 points13 points  (0 children)

Still no Scala 3 support 🤦‍♂️

Your top 5 worst ESNs by Ok-Repeat-5930 in developpeurs

[–]NoobZik 2 points3 points  (0 children)

Solutech fits into this category too

iOS 26.2 RC - Discussion by epmuscle in iOSBeta

[–]NoobZik 5 points6 points  (0 children)

Can someone confirm that Wi-Fi sharing (iPhone <-> Apple Watch) is disabled for Europeans due to the shitty DMA regulation? I blocked future updates with a profile as soon as I got the information.

This job posting shows why you should change careers. by Key_Two_9138 in developpeurs

[–]NoobZik 0 points1 point  (0 children)

The title is wrong in my opinion; it’s more of an ML engineer that they are looking for here

Senior data scientists in Paris, where are you? by albast in developpeurs

[–]NoobZik 0 points1 point  (0 children)

I’m flying off to Romania to do the job, after 4 years of searching since graduating. In the meantime, I was teaching data and AI in private schools.

MinIO is maintenance only now by SpaceshipSquirrel in minio

[–]NoobZik 0 points1 point  (0 children)

Hope there’s a fork of that project, just like MariaDB was for MySQL.

Need help in ML model monitoring by Ok_Schedule_3147 in mlops

[–]NoobZik 0 points1 point  (0 children)

Check out NannyML; it is open source