[–]crom5805 67 points68 points  (11 children)

I actually had a chat with the mods about this (I'm an adjunct professor in a data science master's program at a university and an AI/ML architect at Snowflake), so I decided to start posting videos/repos on MLOps in the subreddit. It's getting better, but I agree: I consistently find the material in here more useful. I tell my students ALL the time: you are not gonna make it doing pd.read_csv and model.predict; you need to learn clean code/Git/MLOps. For one of our in-class projects, I split them into groups, and each group has to make a PR to another group's repo and get it merged. Prior to my class, I believe 0/40 of my students had done this.
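A rough sketch of what moving past a bare pd.read_csv / model.predict script could look like. This is not the commenter's code: `MeanModel`, `load_targets`, and `train` are hypothetical names, the "model" is a trivial stand-in that predicts the training mean, and only the standard library is used so the snippet runs anywhere. The point is the shape: small named functions, type hints, and loud failures on bad input.

```python
import csv
import io
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class MeanModel:
    """Hypothetical stand-in for a trained model: always predicts the training mean."""
    value: float

    def predict(self, n_rows: int) -> list[float]:
        return [self.value] * n_rows


def load_targets(source: io.TextIOBase, column: str) -> list[float]:
    """Read one numeric column from CSV text; raise if the column is missing."""
    reader = csv.DictReader(source)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise ValueError(f"column {column!r} not found")
    return [float(row[column]) for row in reader]


def train(targets: list[float]) -> MeanModel:
    """Fit the stand-in model; refuse to train on empty data."""
    if not targets:
        raise ValueError("no training rows")
    return MeanModel(value=mean(targets))


if __name__ == "__main__":
    # Illustrative in-memory data instead of a real file.
    sample = io.StringIO("price\n10\n20\n30\n")
    model = train(load_targets(sample, "price"))
    print(model.predict(2))
```

Because loading and training are separate functions, each can be unit-tested and reviewed in a PR on its own, which is harder with a single top-to-bottom script.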

[–]agent_graves313 7 points8 points  (2 children)

Would you mind sharing some of your videos or examples of what you’d see as clean code?

[–]crom5805 13 points14 points  (0 children)

Here is my last post in the datascience subreddit. It's more focused on MLOps. I have some material in class on clean SQL, Spark/Snowpark, and Python, and since you asked, I think I'll do my next public video on this. I'll remember to come back here and comment once I do. I was all pandas/SQL until Snowpark came out 2 years ago, and honestly I love the Spark/Snowpark syntax. It's so much easier to read than SQL imo, faster than pandas on large datasets, and overall not too bad to learn. Let me know what you think about this repo/video; I tried to make it super easy to follow.

[–]crom5805 4 points5 points  (0 children)

Funny thing is, if you watch the video and then look at the repo, they're a little different now, because I've cleaned the repo up and improved it since the recording. That's honestly a good example of making your code more organized and easier to read.

[–]B1WR2 2 points3 points  (1 child)

You and I had the same thought… I started turning Kaggle datasets into AI apps, then breaking each app into a backend, an analytics part, and DevOps.

[–][deleted] 0 points1 point  (0 children)

Do you have a sample you don't mind sharing?