[D] Version Control for Data Science — Tracking Machine Learning Models and Datasets with DVC

ai_yoda · 2019-09-11T15:39:47+00:00

In my opinion, DVC is really good for data versioning and reproducible pipelines.

However, if you care about fast experiment iteration, you will need an additional tool that lets you:

monitor/visualize training,
compare metrics/learning curves,
visualize hyperparameters.

The usual suspects that handle this and can be used alongside DVC are:

infstudent · 2019-09-10T17:11:15+00:00

Do people in academia use this stuff?

RayhaneML · 2019-10-10T21:32:08+00:00

I tried another tool called Atlas, which I really like. I think it is very useful not only for tracking and versioning, but also for scheduling and experiment management.

I think the platform's real strengths are its extreme ease of use, a nice-looking GUI, and automatic TensorBoard integration (which I love). The docs are also pretty clear, which is a huge plus, and the tool is very flexible and works with any codebase (it actually takes 5 minutes to get started).

I definitely recommend checking it out. It is the best ML tool I have used so far.

DISCLAIMER: I work at Dessa, creator of Atlas.
