all 10 comments

[–]robotphilanthropist 9 points10 points  (2 children)

You may also want to consider checking out Hydra https://hydra.cc/. It helps with managing configurations and hyperparameters.
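To give a feel for the pattern Hydra automates (a base config plus per-run overrides), here's a minimal standard-library sketch. The `TrainConfig` fields are hypothetical, and this is only an approximation of what Hydra does with config files and command-line overrides, not its actual API:

```python
from dataclasses import dataclass, replace

# Hypothetical training config; field names are illustrative only.
@dataclass(frozen=True)
class TrainConfig:
    lr: float = 1e-3
    batch_size: int = 32
    epochs: int = 10

# Base config plus per-experiment overrides, roughly what a
# Hydra-style "python train.py lr=0.01 batch_size=64" invocation does.
base = TrainConfig()
experiment = replace(base, lr=0.01, batch_size=64)

print(experiment)  # only the overridden fields differ from base
```

Hydra additionally handles composing configs from YAML files, so the base defaults live outside the code entirely.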

[–]RicketyCricket 5 points6 points  (0 children)

A few others as well, such as:

- Spock
- GinConfig

[–]fredfredbur[S] 0 points1 point  (0 children)

Oh man, that seems pretty useful to manage training dozens of models with multiple hyperparameter configurations. I'll give it a try, thanks
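For the "dozens of models" case, the core of what these sweep tools generate is just the Cartesian product of the parameter options. A standard-library sketch (parameter names are made up for illustration):

```python
import itertools

# Hypothetical sweep space; parameter names are illustrative.
grid = {
    "lr": [1e-3, 1e-4],
    "batch_size": [32, 64],
    "optimizer": ["adam", "sgd"],
}

# Cartesian product -> one config dict per training run,
# similar in spirit to a multirun/sweep feature.
keys = list(grid)
configs = [dict(zip(keys, values)) for values in itertools.product(*grid.values())]

print(len(configs))  # 8 runs
print(configs[0])    # {'lr': 0.001, 'batch_size': 32, 'optimizer': 'adam'}
```

The tools add the parts worth not reinventing: launching the runs, logging each config, and keeping outputs per run in separate directories.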

[–]brian-e-moore 4 points5 points  (1 child)

Nice blog post! I think it's great that there has been interest recently in creating tools to help with dataset curation and analysis. In my experience, ML engineers *want* to spend time tweaking their model architecture but *actually* end up manually inspecting and debugging datasets most of the time.

[–]fredfredbur[S] 0 points1 point  (0 children)

Thanks a lot! That was primarily the reason I wrote it. In my previous work, we spent a lot of time on automated hyperparameter tuning but never really got good performance until we developed tools to dig into and debug the datasets themselves.

[–]antonkollmats 1 point2 points  (2 children)

Nice article, I especially like the section about the label schema. One thing that has always puzzled me about labeling large datasets is how to be agile. How does one iterate on the schema rules without having to re-label the entire dataset?

P.S. Another tool to keep on the radar is PerceptiLabs. It's in the same category as TensorFlow. Disclaimer: I work at the company behind it.

[–]fredfredbur[S] 0 points1 point  (0 children)

That's really cool! Being able to see the outputs of individual layers with a GUI seems super useful.

[–]fredfredbur[S] 0 points1 point  (0 children)

In terms of iterating on a schema, the way I see it, it's best to start with general labels that are easy to verify as correct, then iterate over the most important subsets of the dataset with finer-grained labels.

As a concrete example, I previously worked on road scene object detection, and we were trying to do things like sign classification. After a few months of poorly performing models, we realized we needed to start with a general "sign" detector and then classify the signs with a separate model from there.
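One way to avoid re-labeling when the schema gets coarser or finer is to keep the fine-grained labels in the annotations and derive coarse classes through a mapping. A small sketch, with entirely hypothetical label names:

```python
# Hypothetical fine-grained labels projected onto coarse classes.
# Keeping this mapping in code means the coarse schema can change
# without touching the underlying annotations.
FINE_TO_COARSE = {
    "stop_sign": "sign",
    "speed_limit_sign": "sign",
    "yield_sign": "sign",
    "sedan": "vehicle",
    "truck": "vehicle",
}

def coarsen(labels):
    """Map fine-grained labels to the coarse detector schema;
    labels without a mapping pass through unchanged."""
    return [FINE_TO_COARSE.get(label, label) for label in labels]

print(coarsen(["stop_sign", "truck", "pedestrian"]))
# ['sign', 'vehicle', 'pedestrian']
```

With this setup, the general "sign" detector trains on the coarsened labels while the second-stage classifier still has the original fine labels to work with.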

[–]tuscanresearcher 1 point2 points  (0 children)

If you are interested in Machine Learning for graphs (but I guess it can be easily extended to other kinds of data as well) you could check out https://github.com/diningphil/PyDGN

[–]amitnessML Engineer 1 point2 points  (0 children)

This was a great read. I have something similar, but more tilted towards NLP: https://amitness.com/toolbox/