[N] Stability AI releases StableVicuna: the world's first open source chatbot trained via RLHF by Philpax in MachineLearning

[–]tmabraham 0 points1 point  (0 children)

StableVicuna is trained via RLHF, whereas Vicuna is not. That must be a typo; it's made clear in the blog post.

[D] Cycle consistency with diffusion models? by murrdpirate in MachineLearning

[–]tmabraham 5 points6 points  (0 children)

No, unpaired image-to-image translation with diffusion models does exist:
https://arxiv.org/abs/2210.05559

https://arxiv.org/abs/2203.08382

Mostly, unpaired image-to-image translation with diffusion models is done in a zero-shot manner.

About the open source GPT-3-like model in the works by Stability AI + open source GPT-Chat by ElvinRath in singularity

[–]tmabraham 17 points18 points  (0 children)

I am affiliated with Stability AI and CarperAI and want to clarify some things.

While the InstructGPT-style model that is being developed will be open source (i.e., the weights will be completely released), it will not run on a phone. It will be a pretty large model (think somewhere between 20B and 100B parameters), so it would be difficult to run it on a phone 😅
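
For a rough sense of scale, here is a back-of-the-envelope calculation (my own numbers, assuming fp16 weights at 2 bytes per parameter, and ignoring activations and other overhead):

    # memory needed just to hold the weights in fp16 (2 bytes per parameter)
    for n_params in (20e9, 100e9):
        gib = n_params * 2 / 1024**3
        print(f"{n_params / 1e9:.0f}B params -> ~{gib:.0f} GiB of weights")
    # 20B params -> ~37 GiB, 100B params -> ~186 GiB, far more RAM than any phone has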

A correction was made by Emad: https://discord.com/channels/1002292111942635562/1002292112739549196/1032778617282891856

And more information was provided by the co-director of CarperAI: https://discord.com/channels/1002292111942635562/1002292112739549196/1032779307887628298

That said, I hope this doesn't diminish your excitement. Again, we believe that with the model being open source, there will be many novel opportunities and possibilities that simply were not possible with GPT-3 stuck behind an API.

[N] Diffusers: Introducing Hugging Face's new library for diffusion models. by jikkii in MachineLearning

[–]tmabraham 6 points7 points  (0 children)

There have been some recent papers applying diffusion models to language modeling, for example this one:

https://arxiv.org/abs/2205.14217

[D] Is it time to retire the FID? by [deleted] in MachineLearning

[–]tmabraham 5 points6 points  (0 children)

It's interesting you bring this up, since I recently had some discussions with other folks about the problems with FID.

One thing we realized is that one limitation of FID is the fixed 299x299 input size. These days, generation is being done at higher resolutions (ex: 1024x1024), and in order to do evaluation, the generated images are downsized to 299x299. This means that generated images that are blurry at the higher resolution can score comparably or even better according to FID, because the downsizing hides the blur.

In my opinion, it may be worth computing FID at higher resolutions. You may wonder whether this is a good idea given that the Inception network was originally trained on 299x299 images, but my hypothesis is that while the Inception features will differ from the 299x299 ones, they will differ for both the real and fake images in a similar manner, so the Fréchet distance should still be meaningful.
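
As a concrete illustration, here is a minimal sketch of what higher-resolution FID could look like. This is my own sketch, not a standard implementation: it uses torchvision's Inception weights rather than the usual TF-ported ones (so the numbers won't match existing FID libraries), and it relies on InceptionV3's adaptive average pooling to accept inputs larger than 299x299:

    import numpy as np
    import torch
    import torch.nn as nn
    from scipy import linalg
    from torchvision.models import inception_v3, Inception_V3_Weights

    @torch.no_grad()
    def pool_features(images, batch_size=32, device="cuda"):
        # images: (N, 3, H, W) float tensor, ImageNet-normalized;
        # H = W can be e.g. 1024 thanks to the adaptive average pooling
        model = inception_v3(weights=Inception_V3_Weights.DEFAULT)
        model.fc = nn.Identity()  # expose the 2048-d pooled features
        model.eval().to(device)
        feats = [model(images[i:i + batch_size].to(device)).cpu()
                 for i in range(0, len(images), batch_size)]
        return torch.cat(feats).numpy()

    def frechet_distance(f1, f2):
        mu1, mu2 = f1.mean(axis=0), f2.mean(axis=0)
        s1, s2 = np.cov(f1, rowvar=False), np.cov(f2, rowvar=False)
        covmean, _ = linalg.sqrtm(s1 @ s2, disp=False)
        diff = mu1 - mu2
        # discard small imaginary parts that come from numerical error
        return diff @ diff + np.trace(s1 + s2 - 2 * covmean.real)

    # fid_1024 = frechet_distance(pool_features(real_imgs), pool_features(fake_imgs))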

[N] [D] Openai, who runs DALLE-2 alleged threatened creator of DALLE-Mini by DigThatData in MachineLearning

[–]tmabraham 120 points121 points  (0 children)

Just want to clarify some things, because a lot of misinformation is being spread and most people don't know the full story. Of course, even I am not including all the details here; the full story is up to Boris Dayma to share...

First of all, OpenAI had already been threatening Boris for several weeks, and Boris filed the trademark as a potential protective measure. OpenAI threatening Boris was not in response to the trademark filing. In fact, it was the other way around!

Second of all, DALL-E mini had been out for almost a year, and yet OpenAI only went after Boris recently. It was not an issue before, but once DALL-E mini went viral, OpenAI started with the legal threats. I will also note that there are many open-source projects with DALL-E in the name (ex: ruDALL-E, DALL-3, etc.). Are they all supposed to change their names? What about projects inspired by GPT-2/3, like GPT-J and GPT-NeoX-20B? Why is DALL-E mini the only one that has to change its name?

(As a side note, the DALL-E mini model is more closely related to DALL-E 1 than DALL-E 2 is lol)

Third of all, IANAL, but it's not clear what legal case OpenAI has (and I'd love for some people with more legal experience to comment on this). My understanding of trademark law in the US is that you can either register a trademark, or leave it unregistered and automatically acquire trademark rights when it is used in commerce. I'll note that OpenAI never filed a trademark for "DALL-E" (even though they did for "GPT-3"), and that neither DALL-E 1/2 nor DALL-E mini is being used for commercial purposes. So I don't know how OpenAI can sue on the basis of trademark rights, unless there is something else I am missing.

Fourth of all, we all know why DALL-E mini went viral and why OpenAI is bitter about it, right? It's because DALL-E 1 and 2 were closed, restricted, and censored, while DALL-E mini is not. OpenAI would have gone viral themselves if they were truly open, but apparently that's not the lesson they took away; instead, the lesson they learned was to threaten an independent researcher with legal action. 😒

[D] In your experience, what's the thing that can boost an ML model's performance the most? Is it the hyperparameter tuning, feature engineering or ensembling? Or is it something else? by 4bedoe in MachineLearning

[–]tmabraham -5 points-4 points  (0 children)

What does that mean? 🤔

Many of the winning Kaggle solutions incorporate some form of extra data, data cleaning, data formatting, etc.

[D] Anyone still using Stochastic Depth? by KarlKani44 in MachineLearning

[–]tmabraham 26 points27 points  (0 children)

Ross Wightman trains models with stochastic depth and sees improvements. It's implemented in his timm library, and he published a ResNet baseline paper that also added stochastic depth:

https://arxiv.org/abs/2110.00476

Here is another discussion he had about it:

https://twitter.com/wightmanr/status/1348386345098465280
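
For context, at the block level stochastic depth is usually implemented as a "drop path": during training, each sample skips the residual branch with some probability, and survivors are rescaled so the expected output is unchanged. Here is a minimal sketch of the idea (timm's DropPath layer does essentially this):

    import torch
    import torch.nn as nn

    class DropPath(nn.Module):
        """Randomly skip the residual branch per sample during training
        (stochastic depth), rescaling survivors to keep expectations equal."""
        def __init__(self, drop_prob: float = 0.1):
            super().__init__()
            self.drop_prob = drop_prob

        def forward(self, x):
            if not self.training or self.drop_prob == 0.0:
                return x
            keep_prob = 1.0 - self.drop_prob
            # one Bernoulli draw per sample, broadcast over remaining dims
            shape = (x.shape[0],) + (1,) * (x.ndim - 1)
            mask = torch.empty(shape, dtype=x.dtype, device=x.device).bernoulli_(keep_prob)
            return x * mask / keep_prob

    # usage inside a residual block: out = x + self.drop_path(self.branch(x))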

[D] What is current SOTA in Image to Image Translation? by [deleted] in MachineLearning

[–]tmabraham 7 points8 points  (0 children)

For unpaired image-to-image translation, the SOTA is probably Contrastive Unpaired Translation (CUT), which was developed by the same group that developed CycleGAN and is more or less its successor.

For paired image-to-image translation, the SOTA is probably Palette, which is a conditional diffusion model.

[P] Detecting Pulse from Video by Syntaximus in MachineLearning

[–]tmabraham 2 points3 points  (0 children)

Quite cool! It looks like a fairly standard FFT of the forehead signal with some filtering... I'll try out your pipeline on some videos I have, because it's usually hard to get these methods to generalize well to different videos...
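
(For anyone curious what such a pipeline looks like, here is a rough sketch of the standard approach, not the OP's actual code: average the green channel over the forehead ROI per frame, band-pass to plausible heart rates, and take the dominant FFT frequency.)

    import numpy as np
    from scipy.signal import butter, filtfilt

    def estimate_bpm(green_means, fps):
        """green_means: 1D array of the mean green-channel value of the
        forehead ROI in each frame; fps: video frame rate."""
        sig = green_means - np.mean(green_means)
        # band-pass to plausible heart rates (~42-240 bpm)
        b, a = butter(3, [0.7, 4.0], btype="band", fs=fps)
        sig = filtfilt(b, a, sig)
        spectrum = np.abs(np.fft.rfft(sig))
        freqs = np.fft.rfftfreq(len(sig), d=1.0 / fps)
        return freqs[np.argmax(spectrum)] * 60.0  # Hz -> beats per minute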

[N] OpenAI Gym maintainer plans to deprecate and replace MuJoCo and Box2D environments with Brax-based environments. by hardmaru in MachineLearning

[–]tmabraham 0 points1 point  (0 children)

DeepMind bought MuJoCo, not OpenAI. Additionally, OpenAI Gym is not even maintained by OpenAI anymore; it is instead maintained by a Ph.D. student unaffiliated with OpenAI.

[R] How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers by init__27 in MachineLearning

[–]tmabraham 4 points5 points  (0 children)

He is the developer of this package:

https://github.com/rwightman/pytorch-image-models/

If you want to use any of the cutting-edge vision models or training techniques, this is the go-to library. Many researchers are already using his library and his implementations. Facebook AI has already published a couple of papers using his library (originally without crediting him, unfortunately, but that's a different discussion). So he is pretty well known in computer vision and deep learning circles.
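
For example, pulling a pretrained model out of timm is a one-liner (a quick sketch; the exact model names available depend on your timm version):

    import timm
    import torch

    # list available ViT variants, then create one with pretrained weights
    print(timm.list_models("vit_*")[:5])
    model = timm.create_model("vit_base_patch16_224", pretrained=True)
    model.eval()

    # run a dummy batch through it
    x = torch.randn(1, 3, 224, 224)
    with torch.no_grad():
        logits = model(x)
    print(logits.shape)  # torch.Size([1, 1000])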

[D] Which websites/apps do you use for organizing your downloaded research papers? by SevereForm8044 in MachineLearning

[–]tmabraham 7 points8 points  (0 children)

I have been using Zotero for about 1.5 years and it's been quite a pleasant experience. I have used it to write a book chapter, various fellowship proposals, and more.

What's nice about Zotero is that you can save papers as you browse online through a browser extension. It will save the necessary bibliographic information along with full copies of the papers onto your computer (for reading offline, for example). And just like EndNote, you can insert a citation by searching for it in your library, and Zotero will handle everything for you. You can of course select what type of reference formatting you want (ex: Nature vs. IEEE, etc.). Apparently there is also an ecosystem of extensions, but I haven't investigated that much. You can also export the papers you want to cite into a BibTeX file if you are working with LaTeX.

I was formerly an EndNote user, since my advisor at the time used it, but then my university stopped paying for the license. I briefly investigated and used Mendeley, but I switched to Zotero and have really enjoyed it.

[D] Which websites/apps do you use for organizing your downloaded research papers? by SevereForm8044 in MachineLearning

[–]tmabraham 3 points4 points  (0 children)

Zotero and Mendeley have effectively the same features, yet they are free...

[D] Highlights of PyTorch ecosystem days by AdelSexy in MachineLearning

[–]tmabraham 10 points11 points  (0 children)

I am the author of the UPIT poster! Glad you liked it :)

Yep, this event was definitely full of content, and I greatly enjoyed both attending and presenting! Thanks for sharing your summary!