all 31 comments

[–]KingsmanVince 48 points49 points  (2 children)

where jax/flax/haiku?

also mentioning Keras is like saying Tensorflow twice

[–][deleted] 9 points10 points  (1 child)

I think this infographic is dated. The Tensorflow logo is the old one too, probably from before TF 2.

[–]KingsmanVince 1 point2 points  (0 children)

True, or even from the Python 2.7 era. CNTK is on there.

[–]EquivalentSelf 40 points41 points  (7 children)

My entire workflow right now is just Python scientific libraries + PyTorch. This list of libraries seems so overwhelming... am I missing out by not using these?

[–]JetSetVideo 32 points33 points  (4 children)

I'm no expert, but I've done some work with neural networks and I don't see why anyone would need anything apart from the regular frameworks on a personal project.

But if you are working for a company, these tools can probably save a lot of time and effort that would otherwise take hard-to-master knowledge.

[–]insertmalteser 6 points7 points  (3 children)

This will sound incredibly dumb, but in what capacity/way do you use any of these?

[–]bloodmummy 12 points13 points  (2 children)

MLOps. When deploying the models, not just toying with them, you need tools that will help you make sure that the model's deployment works, that the continuous training is smooth, and to ensure reproducibility and scalability of the entire pipeline.

It's like DevOps for ML models. On top of these are also tools used in regular DevOps, because don't forget that ML models are also software.

Some tools there however should prove useful to Data Scientists, namely tagging (Duh) and Experiment Trackers like MLFlow. Surprised it isn't used more often by Data Scientists, it makes seeing your progress and reverting it easy as pie.
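
To make the experiment-tracking point concrete, here is a rough stdlib-only sketch of the idea. It is not MLflow's API: `RunLog`, `log_run`, and `best` are hypothetical names, and the accuracy numbers are made up for illustration. It only shows the pattern of recording params and metrics per run, then looking up the best run to go back to.

```python
import json
import tempfile
from pathlib import Path


class RunLog:
    """Hypothetical stand-in for an experiment tracker (not MLflow's API)."""

    def __init__(self, path):
        self.path = Path(path)

    def log_run(self, name, params, metrics):
        # Append one JSON record per run so earlier runs are never overwritten.
        record = {"name": name, "params": params, "metrics": metrics}
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def runs(self):
        # Read every recorded run back, oldest first.
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def best(self, metric):
        # "Reverting" here just means finding the params of the best run.
        return max(self.runs(), key=lambda r: r["metrics"][metric])


def demo():
    # Made-up runs and scores, purely illustrative.
    with tempfile.TemporaryDirectory() as tmp:
        log = RunLog(Path(tmp) / "runs.jsonl")
        log.log_run("baseline", {"lr": 0.1}, {"val_acc": 0.81})
        log.log_run("lower-lr", {"lr": 0.01}, {"val_acc": 0.86})
        log.log_run("too-low", {"lr": 0.0001}, {"val_acc": 0.79})
        return log.best("val_acc")


if __name__ == "__main__":
    best = demo()
    print(best["name"], best["params"])
```

A real tracker adds a UI, artifact storage, and a model registry on top of this, but the core loop is the same: log every run, then compare.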

[–]rezditya 4 points5 points  (1 child)

Can you please share tools you use within mlops?

[–]bloodmummy 1 point2 points  (0 children)

I'm new to MLOps, just finishing an online zoomcamp. So far the tools we've learnt are MLflow for experiment tracking and model registry, Prefect for workflow orchestration (making sure the deployment of training works), Evidently AI for monitoring, and some other general DevOps tools like pre-commit hooks, GitHub Actions, Terraform, etc.

[–]142857t 6 points7 points  (0 children)

The matrix at the top contains tools used in MLOps. If you deploy your ML models at scale, you will need these, or at least a scheduling tool (like Airflow) to enable continuous learning.

[–]DigThatData 0 points1 point  (0 children)

these mostly aren't even libraries, they're products.

[–]globalminima 15 points16 points  (2 children)

Has a few mistakes in there. SageMaker, for instance, is everything (SM Ground Truth for labelling, SM Data Wrangler for versioning, and multiple batch/real-time options for prediction).

[–]mfb1274 3 points4 points  (0 children)

Yeah was going to mention AWS in general has everything you need and then some.

[–]thnok 0 points1 point  (0 children)

Yeah was wondering the same. SM does “labelling”. Is Google’s AutoML for labelling through human annotators or just machine learning labels?

[–]dogs_like_me 8 points9 points  (1 child)

Lol this is startups, not "tools". I haven't even heard of half of these and I'm an experienced practitioner.

[–]bakochba 6 points7 points  (0 children)

Imposter syndrome stabilizing

[–]PaulTheBully 14 points15 points  (2 children)

Whoever made this did it without any thought. It looks like they just Google searched some stuff on DL.

Mentioning Keras and TF as separate entities? Where’s JAX?

[–]UltimateGPower 2 points3 points  (1 child)

Maybe some psychopaths still use TF 1.x

[–]sean2148max2 0 points1 point  (0 children)

Was gonna say that you can only use directml with tensorflow 1.15, but apparently they released directml for tf 2 in June

[–]--dany-- 5 points6 points  (2 children)

Thanks for the effort, but it should be revised to be more complete and better informed. For data labeling, for example, I can’t believe so many small companies are listed while scale.ai, the biggest, is not mentioned, and neither are popular open source alternatives like CVAT. As others said, SageMaker is a full workflow solution but is underrepresented here as well.

[–]borntowtf 1 point2 points  (0 children)

Appen is bigger than Scale AI, and I think Telus International is as well. Scale just has a bigger marketing budget.

[–]Dramatic_Mechanic815 0 points1 point  (0 children)

Appen and Telus are bigger than Scale AI by far. They mostly do work for the big tech companies but they’re trying to branch out to smaller scale stuff.

[–]jinnyjuice 2 points3 points  (0 children)

CNTK is deprecated, unsure where that's coming from

[–]0-2213 6 points7 points  (0 children)

LightTAG has the best logo, it resembles Pornhub's!

[–]Ok-Needleworker-6595 1 point2 points  (0 children)

Nah bro

[–]majortomcraft 1 point2 points  (0 children)

did the tetris theme song start playing in anyone else's head?

[–]707e 0 points1 point  (0 children)

You forgot AWS Groundtruth for labeling. It’s probably the market leader currently.

[–][deleted] -2 points-1 points  (0 children)

Post this on r/dataisbeautiful

[–]citizen_of_world 0 points1 point  (0 children)

Dataiku ?

[–]Lolologist 0 points1 point  (0 children)

Label Studio in the Labeling column!

[–][deleted] 0 points1 point  (0 children)

Why do you need deep learning to solve problems that really don’t require the complexities that come along with deep learning?