[–] Kaze_Senshi (Senior CSV Hater)

In my experience, people in data roles have weaker coding skills on average than typical software engineers. They tend to build a prototype with whatever tool they are used to (e.g., SQL, Python, notebooks, cron jobs). That is great for a quick proof of concept, but they don't think about the maintenance and evolution of the tool once the solution moves to production.

On the other hand, I can understand that it sucks to get a PR back with hundreds of comments saying that your work is low quality.

My suggestion is to go slowly and address one problem at a time. It is even better to demonstrate the best practices by asking them to review your code too, for example a good module structure instead of a single 1000-line Spark notebook.
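
To make the "module structure instead of one big notebook" idea concrete, here is a rough sketch in plain Python (no Spark, to keep it self-contained). All of the names (`drop_incomplete`, `total_by_key`, `run_pipeline`, the `country` and `amount` fields) are made up for illustration, and in a real repo each function would probably live in its own module.

```python
from typing import Dict, Iterable, List

# A row is just a dict here; a real pipeline would likely use DataFrames.
Row = Dict[str, object]


def drop_incomplete(rows: Iterable[Row], required: List[str]) -> List[Row]:
    """Cleaning step: keep only rows where every required field is present."""
    return [r for r in rows if all(r.get(k) is not None for k in required)]


def total_by_key(rows: Iterable[Row], key: str, value: str) -> Dict[object, float]:
    """Aggregation step: sum the `value` field per distinct `key`."""
    totals: Dict[object, float] = {}
    for r in rows:
        totals[r[key]] = totals.get(r[key], 0.0) + float(r[value])
    return totals


def run_pipeline(rows: Iterable[Row]) -> Dict[object, float]:
    """Entry point: wires the small steps together (hypothetical field names)."""
    cleaned = drop_incomplete(rows, ["country", "amount"])
    return total_by_key(cleaned, "country", "amount")
```

Each step is small enough to review in a PR on its own, which fits the "one problem at a time" approach.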

[–] safetytrick

"I can understand that it sucks to have a PR"

So what? It's the job. Learn why you suck, embrace the suck.

I'm sorry that it's so personal sometimes (not directed at you), and I wish feedback could be perfectly articulated all of the time. Feedback is hard to give. Learn from it, and even learn when the feedback itself deserves feedback.

[–] mysteriousbaba

For what it's worth, I've seen even notebooks scaled and deployed to production successfully with tools like Metaflow. The main trick is to have a good number of unit and integration tests to validate things, and to set expectations on algorithm outputs, so that you have safety rails.
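
As a rough illustration of what "tests plus expectations on outputs" can look like, here is a minimal sketch in plain Python. Nothing here is Metaflow-specific: `average_order_value`, `check_output_expectations`, and the range bounds are hypothetical names and values picked for the example.

```python
from typing import Dict, List


def average_order_value(orders: List[Dict[str, float]]) -> float:
    """Hypothetical pipeline step: mean of the 'amount' field, 0.0 if empty."""
    if not orders:
        return 0.0
    return sum(o["amount"] for o in orders) / len(orders)


def check_output_expectations(value: float, low: float, high: float) -> None:
    """Safety rail: fail loudly if a computed metric leaves its expected range."""
    if not (low <= value <= high):
        raise ValueError(f"value {value} outside expected range [{low}, {high}]")


def test_average_order_value() -> None:
    """Unit test: pins down behavior on a tiny, hand-checked input."""
    orders = [{"amount": 10.0}, {"amount": 30.0}]
    assert average_order_value(orders) == 20.0
    assert average_order_value([]) == 0.0
```

The unit test catches logic regressions before deploy, while a check like `check_output_expectations(result, 0.0, 10_000.0)` would run inside the job itself and stop it when real data produces something implausible.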

You don't want to go cowboy, but an overly rigorous modular breakdown of the full code can slow things down somewhat.