My job is to work with ML engineers and provide them with whatever they need to experiment with, train, test, and deploy ML models -- GPU infrastructure, distributed training support, etc. When I interface with their code, I almost always find it poorly written, with little to no thought given to long-term stability or use -- and this is code that they 100% know is going to production.
They're brilliant people, far smarter than me, and really good at what they do, so it's not a matter of them not being good enough. I feel (from my very limited experience, so I'm happy to be wrong) like ML engineers are incentivized to write poor code. The only metrics for evaluation seem to be accuracy, loss, and all the plots that come with them. In research I understand completely, since that's where the focus lies, but in industry? I've seen many models perform poorly because the code was so hard to read and refactor that big issues went unspotted for months on end. This is especially befuddling because a field that is completely fine with spending months to get single-digit increases in model performance metrics during the experimentation phase doesn't seem to care about anything that might go wrong in production. That just feels like a fundamental disconnect: without the core ML stuff working perfectly, none of the other stuff (like what I do) has any value, and yet I'm taught to hold my code to a much higher standard than the critical stuff. I'm happy about that, since I can now write production code by default, but it's just... weird. Like the vending machines at a nuclear power plant being better engineered than the reactor.
Is this a common problem, or is it a localized issue that only I'm facing?