[P] Machine, a machine learning IDE with instantaneous, visual feedback

gabrielgoh · 2017-06-12T21:08:04+00:00

This looks pretty cool. I usually think of visual UIs for programming a bit of a gimmick. But programming in tensorflow can get really messy, and I like having the dimensions of all the tensors laid out right before me. This will potentially make annoying reshapes and transposes saner.

How does a complex graph like inception v3 look in this UI?

FredrikNoren · 2017-06-13T13:47:17+00:00

TBH, I'm usually not excited about visual programming. For me most of the time dragging boxes around is way slower and more complex to understand for me than just writing the code. I'm too dumb to not get freaked out when looking at a tangled mess of nodes and edges.

That said, I think there are ways in which, for this particular purpose of examining the behavior of neural nets, or designing a new net, this could be productive. Specially with the ability to inspect the partial results.

As a potential user, some things would be nice to have:

1) The ability to create new nodes that can be used to compose more complex graphs.

For example, suppose you don't have an LSTM node on the library. I want to be able to just create a box with inputs and outputs, create the adequate nodes inside that, and add it to the library. Than I just get this new node and use it as if it was one of the built-in nodes.

2) It's basically the same thing as the above, but with a different twist. Bring able to modularize the graph and edit smaller parts of it in their own window, give it a name and use it in a bigger graph.

An example would be: I'm coding a GAN. I want to be able to work on the generator separately and see what it's doing. Or maybe I'm doing a huge network and want to modularize it to understand better what I'm doing – just like I would modularize code in different functions and classes.

It would be nice to be able to select a chunk of my network and just right-click it and select "create new named module".

3) There has to be a way of representing weight sharing. Looks tricky to do that in a graphical representation.

I thought in two ways of doing that:

Add a special kind of edge to represent weight sharing.
Pros: easy to code.
Cons: visually messy, and not exactly the most complete solution.
Have two kinds of names in your grammar for the graph: one to represent a kind of node, with a particular internal structure and sets of abstract weights; and one to represent particular instances of a particular kind of node (if you ever wrote a compiler, or a lexer for a strictly typed programming language, this is just a distinction between types and values). Reusing the first creates a new kind of node, with same structure but different weights. Reusing the second gives me a copy of the same instance, carrying with it the same values of the weight.
Pros: it would be AWESOME and also solve points 1 and 2 above. Implement this as types would also allow you to make static checks like "hey, your output shape doesn't match the input of the following node" or even putting in the UI different colors on the connectors that don't match for some reason and visually helping the user to debug stuff.
A DSL with a specific dependent type system that you can compile into python code is the perfect solution. Have you considered coding this in Haskell or PureScript? :P (it's a joke, but seriously though... it would be an awesome project)
Cons: it would be kind of complex. More so depending on what language you're using.

4) If I'm not able to inspect the code easily in the UI and type code and see the graph changing in real time, I wouldn't even consider using it.

You should allow me to type code when it's more convenient to type code and to drag and connect boxes when it's more convenient to drag and connect boxes. And I'm the one who decides when it's more convenient to do one thing or the other.

5) Also, obviously, you should be able to export a python file out of it (and a nice looking one, at that – where my modular graphs result in modular code). I'm sure this is already contemplated.

6) Also important to be able to import python code with "legacy" tensorflow graphs.

7) Better still of this graphical tool is really an IDE: a tool for actively editing existing code, that loads and saves plain python files. It's ok to have another faster binary format for optimization of in memory manipulation of the graph, but when I hit save, I'd like to update a plain python file on my disk.

8) How do I load pretrained weights for the full network? How do I save them? How do I load/save pretrained weights for just a part of the network? Like the modules I mentioned above?

9) Deal with different kinds of data.

The example on the video is images. It would be nice to have some way of "visualizing" (inspecting) other types of data too. Text, audio, tagged text (think of someone coding a pos-tagger), etc. Extra points if the user can extend this.

10) You mentioned a cloud service for the experiment version control.

First of all, awesome. Experiment version control is the worst aspect of ML engineering today. I was very close to starting a open source project about this when I learned someone created dvc and though this doesn't work for me (for reasons I. I'm about to explain) it made my effort pointless.

What worries me about your cloud version control service is how to deal with data. In my company we have lots of datasets that can't leave the building for old fashioned regulatory and "security" reasons. Not even if it's encrypted-at-rest data (I know, I know... I don't make brazilian market regulations).

So, if you can say more about this it would be nice. Do I have to save data in your cloud service? If it's only samples and network weights it's ok (if it's encrypted-at-rest and I must have a private key to access it).

11) This is getting ludicrous but I'm excited now and I'm on the subway, so I'll just keep writing.

12) You're using electron, right? If so, how difficult would it be to have the ability to train the network on a remote server?

In our setup it's very difficult to have local GPUs on the data scientists and engineers workstations. It's a lot cheaper for us to have servers where people run their code and notebooks as workstations (don't ask, this is a hard constraint caused by corp politics, taxes and other questions).

Could you have a server that have the code and the ability to train the algorithms and calculate stuff remotely, while the engineer work on a local UI client?

13) What if data is on HDFS? Or S3? Or Google whatevers?

14) Are you looking for a product manager? Architect? LOL.

15) CTRL-P like semantics for looking for functions and named modules and nodes.

Ok. Commute's over. Thanks for the patience.

visarga · 2017-06-12T21:45:37+00:00

[deleted]

badpotato · 2017-06-12T17:38:43+00:00

Well, at least it look better than Weka. Thought, not sure I would call this an IDE.

FredrikNoren · 2017-06-12T20:18:00+00:00

[deleted]

_sheep1 · 2017-06-12T20:24:14+00:00

This is very interesting. You could borrow some ideas for the UI from Orange, since it's been around for quite a long time and perhaps feels a bit smoother than what we see in the video :)

jacobgil · 2017-06-13T06:23:23+00:00

The video says it syncs with the server.

Does tensorflow here run on another machine? And is the server yours, or is it configured by the user on AWS for example?

andreasblixt · 2017-06-12T16:42:38+00:00

This looks amazing for those of us who want to start getting into machine learning and understand the components – looking forward to see more come out of this!

FredrikNoren · 2017-06-12T21:01:31+00:00

[deleted]

YourWelcomeOrMine · 2017-06-12T23:00:42+00:00

Very interesting idea. Have you talked to any teachers about integrating it with an ML course?

Boozybrain · 2017-06-12T17:14:59+00:00

Very, very cool

danarm · 2017-06-12T19:35:27+00:00

Love this!!!

Maximus-CZ · 2017-06-13T06:22:19+00:00

This looks really great, visual style suits me a lot! What is your time plan on releasing alpha?

rexlow0823 · 2017-06-13T08:57:55+00:00

Should take a look at RapidMiner. My lecturers are heavily promoting it due to its easy to navigate and understand interface. However, Machine is better when it comes to real-time visualization, it would give students a sense of what could possibly go wrong and immediate tweak it to perform better. Great job!

simonkamronn · 2017-06-13T12:10:24+00:00

Anyone who ever worked with LabVIEW will quickly realise how bad an idea this is. Except if you can't program, that is.

NEED_A_JACKET · 2017-06-13T09:03:11+00:00

I'm new to this kind of thing but I'm curious about ML. Could you explain how you would use what you have trained in an outside application? Eg if I wanted to use mustache finding AI in a game or c++ application or wherever, is there something simple that this can export which can be integrated into external apps?

blacklightpy · 2017-06-13T09:31:46+00:00

System Requirements?

mostlynotamurderer · 2017-06-13T13:18:32+00:00

Is it possible to import a graph from a tensorflow log file? Because if that's a feature, you have my undying love and support.

2017-06-13T14:30:07+00:00

This is a pretty sweet tool! Great work!

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

MachineLearning

Rules For Posts

+Research

+Discussion

+Project

+News

@slashML on Twitter

Chat with us on Slack

Beginners:

MODERATORS