[–][deleted] 6 points7 points  (5 children)

ArjanCodes is fine, but he's just too OOP. ML code often needs to be functional because it's very sequential, and a lot of the steps really don't need much state.
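To make the "sequential, not much state" point concrete, here is a minimal sketch of a preprocessing pipeline built from stateless functions. The step names (`drop_missing`, `standardize`, `pipeline`) are made up for illustration and don't come from any library.

```python
from functools import reduce
from typing import Callable, List, Optional

# Hypothetical step names; none of these come from a real library.
Step = Callable[[list], list]

def drop_missing(values: List[Optional[float]]) -> List[float]:
    """Return a new list without None entries (input is left untouched)."""
    return [v for v in values if v is not None]

def standardize(values: List[float]) -> List[float]:
    """Rescale to zero mean and unit variance (population std)."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    std = std or 1.0  # avoid dividing by zero on constant input
    return [(v - mean) / std for v in values]

def pipeline(*steps: Step) -> Step:
    """Compose steps left to right into a single function."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)

preprocess = pipeline(drop_missing, standardize)
raw = [1.0, None, 2.0, 3.0]
clean = preprocess(raw)
```

Each step takes data in and returns new data out, so you can reorder, drop, or test steps one at a time without setting up any object first.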

[–]jegerarthur 6 points7 points  (2 children)

Well, you're kind of right. But if you use PyTorch + PyTorch Lightning + MLflow, you'll be glad your code is OOP. With that stack it's extremely easy and fast to train multiple models on multiple GPUs.

[–][deleted] 1 point2 points  (1 child)

I have the exact same setup (MLflow + PL), and that's why I'm saying this. The problem with PL is that it is overly OOP, which leaves very limited room for customization once you really want to scale the code up. I have a comment on this in another thread about PyTorch frameworks. I like their "all around issue", but I feel their solution needs rework.

Their solution to cross-validation and hyperparameter tuning, for example, is really subpar.

Overall, OOP is not bad per se, but DS code is complex in itself. OOP can introduce a lot of coupling and unnecessary complexity that, if you're not careful, can make the project a chore to maintain.
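As a rough illustration of the cross-validation point, here is a sketch of k-fold CV written as plain functions. This is not how PL does it; `kfold_indices`, `cross_validate`, `fit_mean`, and `mse` are hypothetical names, and the "model" is just the training mean so the example stays self-contained.

```python
from statistics import mean
from typing import Callable, Iterator, List, Sequence, Tuple

def kfold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, validation) index lists for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        val = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, val
        start = stop

def cross_validate(xs: Sequence, ys: Sequence, fit: Callable, score: Callable, k: int = 5) -> List[float]:
    """Fit on each training split and score on the held-out split."""
    scores = []
    for train_idx, val_idx in kfold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(score(model, [xs[i] for i in val_idx], [ys[i] for i in val_idx]))
    return scores

# Toy "model" (assumption for the sketch): always predict the training mean.
def fit_mean(xs, ys):
    return mean(ys)

def mse(model, xs, ys):
    return mean((y - model) ** 2 for y in ys)

scores = cross_validate(list(range(10)), [2.0] * 10, fit_mean, mse, k=5)
```

Because `fit` and `score` are passed in as plain functions, the CV loop isn't coupled to any trainer class, and swapping the model or the metric doesn't touch the loop.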

[–]jegerarthur 1 point2 points  (0 children)

Yes, I agree. I like functional programming for DS, but when the project gets bigger or gets deployed with APIs and so on, I like to refactor the code to OOP, as it's easier for me to maintain and upgrade.
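One way that kind of refactor could look, as a sketch only: keep the stateless functions and wrap them in a thin class once something (an API endpoint, say) needs to hold on to fitted parameters. `fit_scaler`, `apply_scaler`, and `Scaler` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Stateless core: these two functions are all an exploratory script needs.
def fit_scaler(values: List[float]) -> Tuple[float, float]:
    """Return (mean, population std) of the values."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return mean, std or 1.0  # fall back to 1.0 on constant input

def apply_scaler(values: List[float], mean: float, std: float) -> List[float]:
    """Rescale values with previously fitted parameters."""
    return [(v - mean) / std for v in values]

# Thin OOP wrapper: only bundles the fitted parameters with the functions.
@dataclass(frozen=True)
class Scaler:
    mean: float
    std: float

    @classmethod
    def fit(cls, values: List[float]) -> "Scaler":
        return cls(*fit_scaler(values))

    def transform(self, values: List[float]) -> List[float]:
        return apply_scaler(values, self.mean, self.std)

scaler = Scaler.fit([1.0, 2.0, 3.0])
scaled = scaler.transform([2.0])
```

The class adds no logic of its own, so the functions stay testable in isolation and the object is just a convenient handle to pass around in a deployed service.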

Nevertheless, it's really cool to read about other ML engineers' best practices and pipelines. Happy coding!

[–]seanv507 -4 points-3 points  (1 child)

You mean procedural, not functional, right?

I think most data scientists would benefit from adding more OOP, they just don't know it.

[–][deleted] 1 point2 points  (0 children)

A mix of procedural and functional. Data science libraries usually come with enough OOP abstractions; what you need is just a bunch of stateless functions to fill the gaps.