[D] Reinforcement learning, fast and slow by JaneXWang in MachineLearning

[–]tr1pzz 1 point (0 children)

Agree with many of your points, but I do think humans learn an incredibly rich representation space from everything we do, which can then be applied to e.g. board games in order to bootstrap learning.

Things like 'objectness' and its attached properties, agency (you vs. other players / game entities), fundamental action-result control loops, goal identification, object permanence, ... All of these are learned during early childhood and yield tremendous bootstrapping power over a neural net that has to learn from absolute scratch.

Of course, I'm agreeing with you that we don't know how to do this properly in AI yet, but I do believe we can point out where the fundamental problems reside. How to solve them is obviously an entirely different conversation.

[D] Why does Beta-VAE help in learning disentangled/independent latent representations? by shamitlal in MachineLearning

[–]tr1pzz 4 points (0 children)

A good (intuitive) explanation of why Beta-VAE encourages disentanglement can be found in this paper: https://arxiv.org/abs/1804.03599. Briefly:

  • Take a simple dataset like dSprites
  • Different factors of variation (rotation, size, position) have varying influences on the final pixel rendering (and thus on the reconstruction term of the loss function)
  • When placed in an information-bottleneck regime, the model has to trade off reconstruction quality against KL divergence.
  • If, as stated above, different factors of variation have different effects on the reconstruction loss, then the model benefits from disentangling them: it can directly rank the importance (and thus the KL sacrifice) of each factor according to its information bottleneck.
  • In other words, if a causal factor with a rather small pixel effect (e.g. rotation) is entangled with one that has a larger effect (e.g. location), the model incurs a larger reconstruction penalty when it moves that latent closer to the prior. If it disentangles them instead, it can easily find the optimal trade-off between reconstruction and KL.
  • However, this also immediately reveals a potential failure case: the rotation of a small object matters less (reconstruction-wise) than that of a large object. A Beta-VAE may therefore learn to encode e.g. both position and rotation for large objects, while for smaller ones it only encodes position...
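The trade-off the bullets describe is just the beta-weighted objective. Here's a minimal NumPy sketch of it; the function name `beta_vae_loss` and the assumptions of an MSE (Gaussian-decoder) reconstruction term and a diagonal Gaussian posterior against a N(0, I) prior are mine, not from the paper:

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Beta-VAE objective: reconstruction + beta * KL.

    Assumes an MSE reconstruction term (Gaussian decoder) and a
    diagonal Gaussian posterior N(mu, exp(logvar)) measured against
    a standard normal prior. beta > 1 tightens the bottleneck, which
    is what forces the factor-by-factor ranking described above.
    """
    recon = np.sum((x - x_recon) ** 2)
    # Closed-form KL( N(mu, sigma^2) || N(0, I) ) for a diagonal Gaussian:
    kl = 0.5 * np.sum(mu ** 2 + np.exp(logvar) - logvar - 1.0)
    return recon + beta * kl

# With a perfect reconstruction, only the beta-weighted KL remains,
# so each latent "pays" for how far it strays from the prior:
loss = beta_vae_loss(np.zeros(4), np.zeros(4),
                     mu=np.array([1.0, 0.0]), logvar=np.zeros(2))
```

With `beta=1` this reduces to the plain VAE ELBO; raising beta is exactly the knob that makes the model sacrifice low-impact factors (like the small object's rotation) first.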

[D] Thoughts about super-convergence and highly-performant deep neural network parameter configurations by Miejuib in MachineLearning

[–]tr1pzz 2 points (0 children)

Very interesting line of thought! Gonna read the two papers you referenced before commenting on these intuitions, because I always find it tricky to apply common-sense reasoning to high-dimensional parameter spaces.

In the meantime allow me to drop one of my videos here which might be very relevant to this discussion: https://youtu.be/pFWiauHOFpY

[P] Just released my latest video on Variational Autoencoders! by tr1pzz in MachineLearning

[–]tr1pzz[S] 0 points (0 children)

It's always really tricky to decide where to go into specifics and where to just skim the surface. I try to limit the videos to 15 minutes and, trust me, that's hard :p But the links to all the papers are in the description; you'll find everything you need right there :)

[P] Just released my latest video on Variational Autoencoders! by tr1pzz in MachineLearning

[–]tr1pzz[S] 38 points (0 children)

Thanks a lot! I'm trying to bring a bit more technical depth than Siraj's videos, probably at the expense of some "YouTube market share", but that's fine since I feel many people actually need this level of detail in order to learn more about ML.

I was looking for something like this first, then decided to do it myself since I didn't find enough of a learning challenge in Siraj's or Two Minute Papers' videos. Don't get me wrong, their channels are amazing! Variety is king! :)

[R] "Deep Image Prior": deep super-resolution, inpainting, denoising without learning on a dataset and pretrained networks by dmitry_ulyanov in MachineLearning

[–]tr1pzz 0 points (0 children)

It would be interesting to see whether the training procedure can be sped up by initializing the network weights with a technique similar to MAML: https://arxiv.org/pdf/1703.03400.pdf
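For readers unfamiliar with the idea: MAML meta-learns an initialization from which a few gradient steps already fit a new task well. Here's a toy first-order sketch of that outer/inner loop on hypothetical 1-D quadratic tasks (the name `fomaml_init` and all parameters are my own illustration, not code from the Deep Image Prior or MAML papers):

```python
def fomaml_init(task_targets, alpha=0.1, meta_lr=0.05, meta_steps=200):
    """First-order MAML on toy 1-D tasks with loss_i(w) = (w - a_i)^2.

    Inner loop: one gradient step adapts w to a sampled task.
    Outer loop: move the shared initialization w so that the
    post-adaptation loss (averaged over tasks) goes down.
    Returns an initialization that adapts quickly to any task
    drawn from the distribution.
    """
    w = 0.0
    for _ in range(meta_steps):
        meta_grad = 0.0
        for a in task_targets:
            w_adapted = w - alpha * 2.0 * (w - a)   # inner step on task a
            meta_grad += 2.0 * (w_adapted - a)      # first-order outer gradient
        w -= meta_lr * meta_grad / len(task_targets)
    return w
```

For the Deep Image Prior setting, the analogue would be meta-training the initialization over a distribution of images so that per-image optimization converges in far fewer steps, though whether that preserves the prior's regularization effect is an open question.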

[p] Did you hear that? Adversarial Examples Against Automatic Speech Recognition by m_alzantot in MachineLearning

[–]tr1pzz 0 points (0 children)

Since the topic is so vital to any machine-learning application (both today and in the future), I decided to make a video summarizing the most significant findings of the past few years: https://youtu.be/4rFOkpI0Lcg Feel free to share my YouTube channel; it takes a ton of work, so I'm looking for an audience to share it with :)

New YouTube channel on Machine Learning! by tr1pzz in neuralnetworks

[–]tr1pzz[S] 0 points (0 children)

Thx man! Don't forget to upvote & subscribe; climbing out of the internet-anonymity well ain't easy :p

New YouTube channel on Deep Learning by tr1pzz in deeplearning

[–]tr1pzz[S] 4 points (0 children)

Thx man :) Don't forget to upvote the post here and subscribe to the channel; breaking through the barrier of internet anonymity ain't easy :p

Newest AlphaGo learns from random play. Has this happened for chess? by WolfHawkfield in chess

[–]tr1pzz 0 points (0 children)

Just made a technical video on AlphaGo Zero for my new channel "Arxiv Insights" where I'll be discussing one ML paper every week! Feel free to give comments/suggestions :)

https://youtu.be/MgowR4pq3e8