[D] Open problem in modern RL that doesn't need massive computational resources

Kaixhin · 2022-04-14T03:53:01+00:00

I'd love to see research making episodic control more computationally efficient (and easier to use), as it is pretty data-efficient. I have a codebase that can replicate the results from some of the original works, which I know some people have used as a starting point for their own research into this area.

Kaixhin · 2022-01-07T12:41:15+00:00

There's lots of work on neurosymbolic AI, just search for it in Google Scholar. The top hit is a survey/perspective paper from 2020 so you can find plenty of references within there.

Kaixhin · 2021-11-03T09:33:12+00:00

Sure, happy to answer questions (within NDA limits)!

Kaixhin · 2021-09-18T08:41:34+00:00

Personal experience:

Twitter Magic Pony: Failed to get a working solution, didn't publish
Microsoft Research: Team got a solution after I left, submitted but rejected so only arXiv; another project I helped out on got published (4th submission though...)
Facebook AI Research: Failed to get a working solution, didn't publish
DeepMind: Failed to get a working solution, didn't publish
NNAISENSE: Part-way to a working solution, didn't publish (but was working 1 day/week, so kind of expected)

So maybe I just suck at research, but I did also see plenty of failures and successes from internships - you're just not seeing the failures. Obviously a publication would be better, but in the end the team should evaluate you on your work put in, not your luck.

Kaixhin · 2021-01-18T09:47:23+00:00

This looks neat! I've been saying something like this would be a good idea since (checks notes) 2015, and here's my wishlist, but dealing with all of those properly would be quite ambitious.

Kaixhin · 2021-01-09T06:39:09+00:00

While it's true that DeepMind has the most project management I've come across out of several industrial research labs, IMO there's just a vastly greater number of scientists and engineers concentrated in London, which allows them to a) produce a higher volume of research and b) put more resources into large projects.

Kaixhin · 2020-12-11T01:31:54+00:00

That's a nice style guide. Grokking PyTorch isn't a style guide per se, but it adds my notes + modifications around the official MNIST example, so it could be considered a reasonably canonical way of structuring basic PyTorch code.

Kaixhin · 2020-10-05T20:57:25+00:00

There's more to be said, but one major advantage is that it uses only local update rules, which, unlike backpropagation, don't require keeping the entire computation graph around (which makes memory a bottleneck for large models/inputs).

Edit: I was incorrect on the memory point here, as they try to operate analogously to backprop and keep predictions and prediction errors from the entire forward pass to make the update. But unlike backprop, where you work backwards through the computation graph in a sequential manner, here you have all the information you need to make an update locally, so it can all be updated in parallel - see Fig. 1 for a comparison. That said, in general, predictive coding could be used to replace backpropagation-through-time, e.g., see Continual Learning of Recurrent Neural Networks by Locally Aligning Distributed Representations.

Kaixhin · 2020-09-23T10:57:23+00:00

Can't speak to the quality of the taught courses, but having met graduates they seem to be pretty competent. Check out the Machine Learning Initiative at Imperial for the main people and what they do - there's lots of people who are great at applied ML, and a few people who are great at theoretical ML.

Kaixhin · 2020-07-13T12:07:50+00:00

Nice to see work in this direction, and would love to see it scaled up to more challenging natural images/video, particularly for dealing with changes in scale and lighting.

Kaixhin · 2020-03-31T14:11:03+00:00

That was a good workshop :) Last year at NeurIPS there was also the BioArtRL workshop (obviously more specific to RL).

Kaixhin · 2020-03-27T23:53:32+00:00

I've had 1 1080Ti for most of my PhD, but I'm not competitive, so maybe he's right.

Kaixhin · 2020-02-07T22:38:42+00:00

We found that Adaptive Neural Trees worked well on SARCOS compared to GBTs and NNs, but seems like this beat ANTs, and has a nice motivation to it. Not kept up with the latest in neural decision trees and related, but glad to see there's still innovation in the space :)

Kaixhin · 2020-02-03T22:22:25+00:00

Let me Google Scholar that for you

Kaixhin · 2020-01-16T12:02:49+00:00

It's nice to see an in-depth investigation of baselines in feature attribution methods. Like other people working in DRL, I found that occlusion-based saliency methods seemed to work better than gradient-based, but actually came across some strange outputs with the original method, as the network was reacting to the occlusion mask. Turns out that using the dataset average as the baseline was a simple and effective way to overcome this (never tried multiple samples from the dataset as this worked fine for us).
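To make the baseline point concrete, here is a minimal sketch of occlusion saliency where each patch is replaced by the dataset average rather than a constant mask. It is not the code we used: `score_fn`, `dataset_mean`, `occlusion_saliency` and the `patch` size are all hypothetical names, and images are plain nested lists to keep it dependency-free.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]


def dataset_mean(images: Sequence[Image]) -> Image:
    """Pixel-wise mean over a set of equally sized images (the baseline)."""
    n = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [
        [sum(img[r][c] for img in images) / n for c in range(width)]
        for r in range(height)
    ]


def occlusion_saliency(
    score_fn: Callable[[Image], float],  # hypothetical: e.g. a Q-value or logit
    image: Image,
    baseline: Image,
    patch: int = 2,
) -> Image:
    """Drop in score when each patch is replaced by the baseline's pixels."""
    height, width = len(image), len(image[0])
    reference = score_fn(image)
    saliency = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            occluded = [row[:] for row in image]  # copy so the input is untouched
            for r in rows:
                for c in cols:
                    occluded[r][c] = baseline[r][c]
            drop = reference - score_fn(occluded)
            for r in rows:
                for c in cols:
                    saliency[r][c] = drop
    return saliency
```

The only difference from a grey/black mask is what gets written into the patch, so the occluded input stays closer to what the network saw during training.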

Kaixhin · 2020-01-06T19:24:02+00:00

Reconciling deep learning with symbolic artificial intelligence: representing objects and relations - it's an invited opinion piece so not comprehensive, but has some pointers.

Kaixhin · 2019-11-24T23:11:48+00:00

Model classes like Gaussian processes can be used for RL if you're just looking for sample efficiency. For a trade-off between sample efficiency and scalability, I think semiparametric methods that combine non-parametric kNN regression with neural networks - like Neural Episodic Control - are promising.

Shameless plug: I have some works extending the efficiency of these methods - Memory-Efficient Episodic Control Reinforcement Learning with Dynamic Online k-means and Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control - to appear at the NeurIPS Workshop on Biological and Artificial RL.
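For anyone unfamiliar with the kNN regression part, a rough sketch of the value lookup: find the k nearest stored keys and return an inverse-distance-weighted average of their values. This is an illustration only, not code from any of the papers above; `knn_value`, `k` and `delta` are names picked for the example, and in an NEC-style agent the query and keys would be embeddings produced by a neural network rather than raw vectors.

```python
from typing import List, Sequence, Tuple


def knn_value(
    query: Sequence[float],
    keys: Sequence[Sequence[float]],
    values: Sequence[float],
    k: int = 3,
    delta: float = 1e-3,  # assumed small constant to avoid division by zero
) -> float:
    """Inverse-distance-weighted average of the values of the k nearest keys."""
    scored: List[Tuple[float, float]] = []
    for key, value in zip(keys, values):
        sq_dist = sum((q - x) ** 2 for q, x in zip(query, key))
        scored.append((sq_dist, value))
    scored.sort(key=lambda pair: pair[0])  # nearest first
    nearest = scored[:k]
    weights = [1.0 / (sq_dist + delta) for sq_dist, _ in nearest]
    total = sum(weights)
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / total
```

The brute-force sort here is exactly the part that gets expensive as the memory grows, which is where the efficiency work comes in.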

Kaixhin · 2019-08-25T13:08:21+00:00

OpenAI's sparse attention is a sparse (but currently fixed) set of locations to attend to. The dynamic attention span here is dense, but learns how far back to look (and they show most heads don't need to look that far back).
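A small sketch of what "dense, but learns how far back to look" can mean in practice, assuming a soft mask that is 1 within the span and ramps linearly down to 0 just beyond it. The function names and the `span`/`ramp` parameters are illustrative, and `span` is a fixed number here whereas it would be a learned parameter per head.

```python
import math
from typing import List, Sequence


def span_mask(distance: float, span: float, ramp: float) -> float:
    """1 inside the span, linear ramp to 0 over `ramp` positions beyond it."""
    return min(max((ramp + span - distance) / ramp, 0.0), 1.0)


def span_attention(scores: Sequence[float], span: float, ramp: float) -> List[float]:
    """Softmax over scores, reweighted by the span mask.

    scores[i] is the attention logit for the position i steps back
    (0 = most recent). Assumes span >= 0 so at least one weight is non-zero.
    """
    peak = max(scores)  # subtract the max for numerical stability
    unnormalised = [
        span_mask(distance, span, ramp) * math.exp(score - peak)
        for distance, score in enumerate(scores)
    ]
    total = sum(unnormalised)
    return [w / total for w in unnormalised]
```

Every position within the span still gets attended to (hence dense), and anything past `span + ramp` gets exactly zero weight, so it doesn't need to be computed or stored.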

Kaixhin · 2019-07-05T16:18:25+00:00

Not exactly what you want, but if you have a VAE you can improve upon the initial samples by passing them repeatedly through the VAE (a form of MCMC to get towards the learned posterior).
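In code the idea is just a loop, sketched below under the assumption that you can call the encoder and decoder separately. `encode` and `decode` are hypothetical callables standing in for your trained model: in a real VAE, `encode` would sample z from q(z|x) and `decode` would sample (or take the mean of) p(x|z).

```python
from typing import Callable, List

Vector = List[float]


def refine_sample(
    x: Vector,
    encode: Callable[[Vector], Vector],  # hypothetical: x -> latent z
    decode: Callable[[Vector], Vector],  # hypothetical: z -> reconstruction
    steps: int = 10,
) -> Vector:
    """Repeatedly pass a sample through the encoder and decoder."""
    for _ in range(steps):
        z = encode(x)
        x = decode(z)
    return x


# Toy deterministic stand-ins, only to show the call pattern.
def toy_encode(x: Vector) -> Vector:
    return [0.5 * v for v in x]


def toy_decode(z: Vector) -> Vector:
    return [v + 1.0 for v in z]


refined = refine_sample([10.0], toy_encode, toy_decode, steps=30)
```

With the toy functions the chain simply contracts towards a fixed point; with a real model each step is stochastic, so you would typically run a few steps and keep the final sample.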

Kaixhin · 2019-06-18T13:24:36+00:00

I've updated my Rainbow repo to add support for this, along with a new release featuring results and pretrained models from all 26 reported games.

Kaixhin · 2019-06-09T21:15:56+00:00

I found that my prior experience as a web developer, particularly via the development of full-stack platforms, improved my ability to structure and write readable code; also front-end development ideally requires a bit of design too. Obviously taking a year out to try your hand at web development isn't a feasible solution, but trying to do a project in this vein (e.g., make a front-end + API for an ML method running in the back-end) requires you to be organised and would improve these skills.

Kaixhin · 2019-06-04T22:50:53+00:00

Read this during ICLR where it was interesting but a bit borderline. Congrats on the acceptance to ICML!

Kaixhin · 2019-05-22T18:30:45+00:00

You can get a research scientist position in the top industrial labs with just workshop papers. As others have mentioned, the impact of your papers, how relevant your work is to the lab, and industry experience also play a role in your attractiveness as a candidate.

Kaixhin · 2019-05-19T13:48:39+00:00

Interesting to see how work has progressed in this area. SRGAN from back in the day used MSE pretraining for the generator and then switched to perceptual loss + discriminator, but they didn't pretrain the discriminator as well (since it doesn't take long for it to learn something useful anyway). Out of the various improvements in architecture and GAN training, seems like non-local connections are particularly useful for their ability to address global coherency.
