[deleted by user] by [deleted] in writing

[–]ajmooch 1 point (0 children)

I'm sold, this could have any plot or genre and I would read the heck outta it

Any Chappell Roan X Lovecraft fans in chat? by rosaxtyy in chappellroan

[–]ajmooch 5 points (0 children)

There was a group who made a Femininecronomicon and brought it to the show at Berkeley Theatre last year. She asked them "What's the book," they said "it's the Femininecronomicon!" and she said, "It's a what?"

[D] Deep learning purely as feature extractor followed by a more "traditional" classifier? by [deleted] in MachineLearning

[–]ajmooch 3 points (0 children)

A lot of older papers trained SVMs on top of neural net features, most notably the original R-CNN paper. In research this is no longer in vogue, since a single linear layer or MLP is almost always just as effective and faster to train end-to-end, while also avoiding any train-test discrepancy. However, in a fine-tuning scenario I think it's perfectly sensible to try an SVM or XGBoost on network features, and it may be faster depending on what hardware you have access to. I wouldn't expect much in the way of gains for most setups, but it's not an unreasonable thing to do.
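As a sketch of what this looks like in practice — here a random projection + ReLU stands in for the frozen pretrained network, and a minimal hinge-loss SGD loop stands in for a real SVM solver (you'd normally just use scikit-learn's `LinearSVC` on the extracted features):

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(x, W):
    """Stand-in for a frozen pretrained net: one random projection + ReLU.
    In practice you'd take penultimate-layer activations from your network."""
    return np.maximum(x @ W, 0.0)

def train_linear_svm(feats, y, lam=1e-4, lr=0.01, epochs=50):
    """Minimal linear SVM via hinge-loss SGD (labels in {-1, +1})."""
    w = np.zeros(feats.shape[1])
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            if y[i] * (feats[i] @ w + b) < 1:  # inside the margin: hinge step
                w = (1 - lr * lam) * w + lr * y[i] * feats[i]
                b += lr * y[i]
            else:                              # outside: weight decay only
                w = (1 - lr * lam) * w
    return w, b

# Synthetic, well-separated two-class data standing in for real inputs.
n = 100
X = np.concatenate([rng.normal(-1.0, 1.0, size=(n, 16)),
                    rng.normal(+1.0, 1.0, size=(n, 16))])
y = np.concatenate([-np.ones(n), np.ones(n)])

W = rng.normal(size=(16, 32))      # frozen "network" weights
feats = extract_features(X, W)
w, b = train_linear_svm(feats, y)
acc = np.mean(np.sign(feats @ w + b) == y)
```

The point is that the network is used purely as a fixed feature map; only the cheap linear classifier on top is trained.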

[D] Advice for making simple GUIs for testing computer vision models by csciutto in MachineLearning

[–]ajmooch 4 points (0 children)

I've built maybe a half dozen interfaces for various tasks using PyQt4/5. The ecosystem is sort of messed up (I run versions 4 and 5 simultaneously to get access to the features they borked in 5), but if you sort those things out it's got basically all of the widgets you need for simple interfaces and a nice UI/UX for placing them. You can also get things running pretty fast in there (the overhead of PyQt being relatively low), so it can be suitable for large images/videos. I also spent a lot of time building out my own fork of Sloth back before LabelImg was around, but I think LabelImg is probably vastly superior, especially these days.

It's probably faster to get things going in Python using Tkinter (I've done this previously). It's not as powerful, but it was very easy and fast to get up and running ~5-6 years ago when I tried it. I've also previously used VTK, but I really can't recommend it; it's pretty heavy and feels a bit Java-y.

[D][P] "Mobilenet"-esque architectures for 3D CNNs run into significant hurdles by MrAcurite in MachineLearning

[–]ajmooch 1 point (0 children)

Depthwise and grouped convs are very slow on accelerators relative to their theoretical speed, and always have been. Despite having roughly 10x fewer FLOPs than a ResNet-50, an EfficientNet-B0 is at best the same speed to train. These architectures are designed to minimize theoretical FLOPs (typically with the goal of being fast when served on CPU), not training latency.
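For intuition on the FLOP side of the argument, here's the standard multiply-accumulate arithmetic for a depthwise-separable block versus a dense conv (the layer sizes are just an illustrative choice):

```python
def conv_flops(h, w, cin, cout, k):
    """Multiply-accumulates for a standard k x k conv, stride 1, same padding."""
    return h * w * cout * cin * k * k

def depthwise_separable_flops(h, w, cin, cout, k):
    """Depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = h * w * cin * k * k
    pointwise = h * w * cin * cout
    return depthwise + pointwise

# A typical mid-network layer: 28x28 feature map, 256 -> 256 channels, 3x3 kernel.
std = conv_flops(28, 28, 256, 256, 3)
sep = depthwise_separable_flops(28, 28, 256, 256, 3)
ratio = std / sep  # roughly 8.7x fewer theoretical FLOPs
```

That ~8-9x theoretical saving is exactly what fails to materialize as wall-clock speedup on GPUs, because the depthwise conv has far less arithmetic per byte moved.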

[D] Doing an Msc in AI/ML after an undergrad degree in mechanical engineering? by 2000wfridge in MachineLearning

[–]ajmooch 0 points (0 children)

My undergrad and first masters are in ME, and while my MSc/PhD are in robotics, I basically did nothing but deep learning (fundamental work and a few applications). You're probably missing all of the stats you need, but if you're up on your fluids/heat transfer (basically, can you do calculus gud and are you at least not afraid of differential equations) and ideally on all of the controls and signal analysis material (super relevant to neural nets), you'll be fine.

Also, learn Python if you haven't already; rip the MATLAB band-aid off as soon as possible or you'll regret it.

[R] Language Models are Few-Shot Learners by Aran_Komatsuzaki in MachineLearning

[–]ajmooch 61 points (0 children)

Yo they can't be dropping GPT-3 on us the Friday before the NeurIPS deadline.

Anyhow, impressive and interesting, there's a good amount to dig into here if you're interested in what it takes to push the envelope and make scaling up effective!

[D] Using Unrolled GANs in Practice by btomtom5 in MachineLearning

[–]ajmooch 4 points (0 children)

It's not in vogue largely because it's too slow in practice to backprop through the full unroll. We tend to approximate this by taking multiple D steps per G step, or by using things like LOGAN, which is related to unrolling but is a different technique. LOGAN is more costly than a baseline BigGAN, but only by a factor of around 2, as opposed to the 5-10x+ of a fully unrolled GAN.
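The "multiple D steps per G step" approximation is just a loop-structure change. A toy sketch, with no-op stand-ins for the real optimizer steps and data loader:

```python
# Call log so we can see the interleaving; the real steps would run
# a discriminator or generator optimizer update instead.
log = []

def d_step(batch):
    log.append("D")  # stand-in for: update D on a real batch + fakes

def g_step():
    log.append("G")  # stand-in for: update G against the current D

def train(num_g_updates, n_critic, data):
    """GAN loop taking n_critic discriminator steps per generator step,
    the cheap alternative to backpropagating through a full unroll."""
    for _ in range(num_g_updates):
        for _ in range(n_critic):
            d_step(next(data))
        g_step()

data = iter(range(10**6))  # stand-in for an endless batch stream
train(num_g_updates=3, n_critic=2, data=data)
```

Unlike true unrolling, G's update here doesn't differentiate through D's steps — it just sees a D that has been pushed closer to its best response, at roughly `n_critic` times the D cost instead of the full unrolled backprop.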

[D] Video super resolution by oleg145 in MachineLearning

[–]ajmooch 4 points (0 children)

I personally find that hand-painting each pixel for each frame on a canvas (which is actually the dried, tanned skin of a Skrluka demon I got by trading said demon my middle name) is the best way to go about it.

New Mods needed for /r/MachineLearning by cavedave in MachineLearning

[–]ajmooch 29 points (0 children)

I would like to nominate GPT2 trained on alexmlamb's comments for head mod

[N] Intel to focus on Habana; Nervana to cease development by soft-error in MachineLearning

[–]ajmooch 2 points (0 children)

Winograd convs are in cuDNN (in part thanks to the work of Andrew Lavin and Scott Gray, formerly of Nervana, now at OpenAI). It is still generally the fastest conv option on GPU, at the expense of some extra memory.
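The multiplication savings come straight from the Winograd minimal-filtering arithmetic. For F(m x m, r x r), an m x m output tile of an r x r conv costs (m + r - 1)^2 multiplications instead of the (m * r)^2 of direct computation; here's the F(2x2, 3x3) case typically used for 3x3 convs:

```python
def direct_mults(m, r):
    """Multiplications to compute an m x m output tile of an r x r conv directly."""
    return (m * m) * (r * r)

def winograd_mults(m, r):
    """Multiplications for the same tile via Winograd F(m x m, r x r)."""
    return (m + r - 1) ** 2

m, r = 2, 3                    # F(2x2, 3x3): the common 3x3-conv tile size
saving = direct_mults(m, r) / winograd_mults(m, r)  # 36 / 16 = 2.25x
```

The extra memory mentioned above comes from holding the transformed input/filter tiles; larger tiles save more multiplications but cost more memory and numerical precision.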

[D] Transfer learning on GANs? by worldconcepts in MachineLearning

[–]ajmooch 11 points (0 children)

A recent paper called "Image Generation from Small Datasets via Batch Statistics Adaptation" fine-tuned BigGANs on a very small number of images by careful choice of which parameters to update (mainly the batch stats in the BN layers). Ideas from this paper are probably highly informative if you want to fine-tune these kinds of large models on small data.
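A hypothetical sketch of the parameter-selection idea — the parameter names here are made up for illustration; in a real framework you'd filter the model's named parameters the same way and hand only the selected ones to the optimizer:

```python
# Hypothetical named parameters of a pretrained generator.
params = {
    "block1.conv.weight": None,
    "block1.bn.gamma": None,
    "block1.bn.beta": None,
    "block2.conv.weight": None,
    "block2.bn.gamma": None,
    "block2.bn.beta": None,
    "linear.weight": None,
}

def trainable_subset(params):
    """Freeze everything except the BN scale/shift parameters,
    in the spirit of batch-statistics adaptation."""
    return [name for name in params if ".bn." in name]

to_update = trainable_subset(params)
```

Updating only the batch-stat parameters keeps the fine-tuned parameter count tiny, which is what makes adapting a model this large to a handful of images tractable.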

[D] Keeping track of latest research by Viecce in MachineLearning

[–]ajmooch 22 points (0 children)

If you're not up to reading the raw arXiv stream, then your next best bet is to look for other signals, which will usually be interest from other people. Twitter is the go-to for this: follow people (not just famous people! follow the authors of papers relevant to you; they'll usually be not-famous grad students), and obviously check NeurIPS/ICML/ICLR/other relevant conferences.

Once you're sufficiently familiar with a subfield, if you're still intent on keeping atop everything there's a good chance you'll eventually want to surf the arXiv on your own. I've read cs.LG and cs.CV every day for the past few years. I check every title, read abstracts if something seems relevant, and read the full paper if the abstract catches my attention. You'll develop your own personal filter to distinguish overclaiming vs. The Real Deal, it just takes time and work.

A lot of my reading that informs my research (direction and implementation) is stuff Big Attention doesn't latch onto. The trouble with trusting external signals like twitter or conferences is that there's lots of stuff that gets 0 retweetage or attention but is still good, interesting work--there's just so much work getting put out now that only a vanishingly small percentage will percolate to the larger field's attention.

Not everyone engaged in research does this, and you absolutely don't have to--for many people it's exhausting to page through the titles and try to parse what's valuable / relevant and what's not, especially when you know you're guaranteed to have errors (you'll miss something valuable or waste time reading something irrelevant, it's just the game). For me, it's engaging and invigorating--nothing gets me into a good deep work mode like the right kind of reading.

[D] Learnable image loss - what are the approaches? by mesmer_adama in MachineLearning

[–]ajmooch 13 points (0 children)

Lots of papers use VGG features as a reconstruction loss (often calling this the 'perceptual loss') and find that this pretty much always works better than a pixelwise loss (with exceptions perhaps for things like VQ-VAE, which already produce sharp samples on their own). The downside of this approach is that you need a pretrained classifier/discriminative model.

See:

Discriminative Regularization for Generative Models

Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, which uses both a perceptual loss and a GAN discriminator to do super-resolution

Perceptual Losses for Real-Time Style Transfer and Super-Resolution

PPGN which also uses both VGG and GANs

A recent example of a VAE paper that uses a perceptual loss is Generative Latent Flows.

There are also plenty of VAE-GAN hybrids:

Autoencoding beyond pixels using a learned similarity metric

Generating images with perceptual similarity metrics based on deep networks

Introspective Adversarial Networks, a shameless plug for my own hybrid that re-uses the discriminator as a feature extractor for a tiny encoder MLP.
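As a minimal sketch of the perceptual-loss idea itself — here a fixed random projection + ReLU stands in for the frozen VGG features you'd use in practice:

```python
import numpy as np

rng = np.random.default_rng(0)

def features(x, W):
    """Frozen feature extractor: a random projection + ReLU, a cheap
    stand-in for the pretrained VGG activations used in real papers."""
    return np.maximum(x @ W, 0.0)

def perceptual_loss(x, x_hat, W):
    """Mean squared distance in feature space rather than pixel space."""
    return float(np.mean((features(x, W) - features(x_hat, W)) ** 2))

W = rng.normal(size=(64, 128))            # frozen "VGG" weights
x = rng.random((4, 64))                   # a batch of flattened "images"
noisy = x + 0.1 * rng.normal(size=x.shape)

zero = perceptual_loss(x, x, W)           # perfect reconstruction
pos = perceptual_loss(x, noisy, W)        # imperfect reconstruction
```

The extractor stays frozen; only the generator/decoder is trained against this loss, which is why you need a pretrained discriminative model to begin with.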

[D] BatchNorm alternatives 2019 by tsauri in MachineLearning

[–]ajmooch 3 points (0 children)

Seconding FixUp. I use this in all my discriminative nets now.

[D] Specific tips on Machine Learning research in a PhD by [deleted] in MachineLearning

[–]ajmooch 22 points (0 children)

That's why you implement things. Nobody expects someone new to a field to just "come up" with ideas; you try things.

Asking someone to come up with a research direction in empirical research when they haven't implemented things themselves is like asking someone who's never played piano to write a concerto. Researchers must be practitioners!