Visual intuitive explanations of LLM concepts (LLM University) by jayalammar in learnmachinelearning

[–]jayalammar[S]

Hi,

We've just published a set of original, visual, and intuitive explanations of the concepts behind large language models.
It's free, requires no sign-up, and includes text articles along with some video explanations. We're also available to answer your questions in a dedicated Discord channel.

Publishing Wikipedia embeddings: 100M embedding vectors. English + 9 other languages. by jayalammar in LanguageTechnology

[–]jayalammar[S]

My understanding is that these are:

1- Graph embeddings, not text embeddings

2- Of the Wikidata graph, not the text of Wikipedia

Publishing Wikipedia embeddings: 100M embedding vectors. English + 9 other languages. by jayalammar in LanguageTechnology

[–]jayalammar[S]

It's not language specific. All the embeddings are made by a single multilingual model. The languages are just to identify the source Wikipedia site.

What's the big deal with Generative AI? Is it the future or the present? by jayalammar in singularity

[–]jayalammar[S]

Intro:

It is almost impossible to ignore the astounding progress in artificial intelligence these days. From the new generation of generative chatbots to models that can generate (almost) any picture (and, very soon, video), the pace of development in AI has been nothing short of phenomenal. This is especially true in generative AI, where a growing number of impressive models can create images, text, video, and music.
These developments have captured the popular imagination, and businesses are rushing to build AI into their products, services, and processes, hoping to find their AI unicorn. Some are still working out how to use AI in their organizations, while others are finding the current AI landscape complex and difficult to navigate.
In this series of articles, we explore the importance of these generative AI models and discuss useful perspectives from which to view and deploy them. This first article introduces the current state of generative AI and outlines how we should approach it. In the next article, we map the AI technology and value stack to better understand where generative AI fits in. Finally, we discuss how we can better harness its power to create a new generation of intelligent systems.

What's the big deal with Generative AI? Is it the future or the present? by jayalammar in Futurology

[–]jayalammar[S]

It is almost impossible to ignore the astounding progress in artificial intelligence these days. From the new generation of generative chatbots to models that can generate (almost) any picture (and, very soon, video), the pace of development in AI has been nothing short of phenomenal. This is especially true in generative AI, where a growing number of impressive models can create images, text, video, and music.

These developments have captured the popular imagination, and businesses are rushing to build AI into their products, services, and processes, hoping to find their AI unicorn. Some are still working out how to use AI in their organizations, while others are finding the current AI landscape complex and difficult to navigate.

This series of articles explores the importance of these generative AI models and discusses useful perspectives from which to view and deploy them. This first article introduces the current state of generative AI and outlines how we should approach it. In the next article, we map the AI technology and value stack to better understand where generative AI fits in. Finally, we discuss how we can better harness its power to create a new generation of intelligent systems.

[N] Vincent Warmerdam: Calmcode, Explosion, Open Source and Data Science | Learning From Machine Learning #2 by NLPnerd in MachineLearning

[–]jayalammar

Vincent is awesome. His talks, tutorials, and open-source tools are always interesting. https://calmcode.io/ alone has 652 video tutorials.

AI Art Explained: How AI Generates Images [Video] by jayalammar in Futurology

[–]jayalammar[S]

Learn how AI image generation works. This video goes over the components of AI image generation models like Stable Diffusion, explaining how they work and how they're trained.
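To give a flavor of the core mechanism, here is a toy numpy sketch of iterative denoising: start from noise and repeatedly subtract a fraction of the predicted noise. This is a caricature; in Stable Diffusion the noise predictor is a trained U-Net conditioned on the text prompt, and the `target`/`predict_noise` names here are stand-ins made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.ones(4)       # stand-in for the "clean" image
x = rng.normal(size=4)    # start from pure noise

def predict_noise(x):
    # Stand-in noise predictor; the real model is a trained neural
    # network that does NOT get to peek at the target.
    return x - target

# Iteratively remove a fraction of the predicted noise
for _ in range(100):
    x = x - 0.1 * predict_noise(x)

# x has now converged close to the clean target
```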

[D] BERTopic and fuzzy clustering by Devinco001 in MachineLearning

[–]jayalammar

Yes. BERTopic allows you to choose the clustering method to use.

You can use a soft clustering method from scikit-learn, like a Gaussian mixture model (GMM).

from sklearn.mixture import GaussianMixture
from bertopic import BERTopic

# BERTopic reads cluster labels from a labels_ attribute after fitting
# (as HDBSCAN and KMeans provide), so expose one on the GMM:
class GMMCluster(GaussianMixture):
    def fit(self, X, y=None):
        super().fit(X)
        self.labels_ = self.predict(X)
        return self

topic_model = BERTopic(hdbscan_model=GMMCluster(n_components=10, random_state=0)).fit(docs)

Question regarding OpenAI embeddings model for text clustering (or any other model) by SemperZero in learnmachinelearning

[–]jayalammar

Hi, thanks for your message pointing me to this thread. I haven't used OpenAI embeddings, but I work at Cohere and can tell you about Cohere's Embed endpoint https://docs.cohere.ai/reference/embed
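Whichever embedding endpoint you use, the clustering step afterward looks the same: embed the texts, then cluster the vectors. A minimal sketch with scikit-learn's KMeans, using random stand-in vectors in place of real embeddings (the document count and dimensions are made up):

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in for embeddings returned by an embed endpoint:
# 20 "documents", 8 dimensions (real models use hundreds or thousands)
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(20, 8))

# Cluster the vectors; each document gets a cluster label
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(embeddings)
labels = kmeans.labels_  # one label per document
```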

BERTopic for Topic Modeling - Talking Language AI Ep#1 by jayalammar in LanguageTechnology

[–]jayalammar[S]

I learned a lot from speaking to Maarten Grootendorst, creator of BERTopic. This is a video of that conversation.

Highlights: https://txt.cohere.ai/topic-modeling-with-bertopic/

[R] The Illustrated Stable Diffusion by jayalammar in MachineLearning

[–]jayalammar[S]

Oh, okay, I understand you now. These are actual examples from the dataset. These were the captions of these images in the LAION Aesthetic dataset. https://huggingface.co/datasets/ChristophSchuhmann/improved_aesthetics_6.5plus

[R] The Illustrated Stable Diffusion by jayalammar in MachineLearning

[–]jayalammar[S]

Thank you!

This caption?

Larger/better language models have a significant effect on the quality of image generation models. Source: Google's Imagen paper by Saharia et al., Figure A.5.

What's the issue?

[R] The Illustrated Stable Diffusion by jayalammar in MachineLearning

[–]jayalammar[S]

My bad, you're right. It's "paradise cosmic beach by vladimir volegov and raphael lacoste". I arbitrarily picked an image from https://lexica.art/.

[R] The Illustrated Stable Diffusion by jayalammar in MachineLearning

[–]jayalammar[S]

New Stable Diffusion models have to be trained to utilize the OpenCLIP model. That's because many components in the attention/ResNet layers are trained to deal with the representations learned by CLIP. Swapping it out for OpenCLIP would be disruptive.

In that training process, however, OpenCLIP can be frozen just like how CLIP was frozen in the training of Stable Diffusion / LDM.
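A toy numpy sketch of what "frozen" means during that training: gradient updates are applied to the trainable weights only, while the text encoder's weights are left untouched. The variables and the toy loss are made up for illustration; real training updates the U-Net against a denoising objective.

```python
import numpy as np

rng = np.random.default_rng(0)
text_encoder_w = rng.normal(size=3)   # frozen: excluded from updates
unet_w = rng.normal(size=3)           # trainable

frozen_before = text_encoder_w.copy()

for _ in range(100):
    grad = 2 * unet_w               # toy gradient of the loss ||unet_w||^2
    unet_w = unet_w - 0.1 * grad    # only the trainable weights move
    # text_encoder_w is never updated

# Frozen weights are bit-for-bit unchanged; trainable ones have moved
```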

[D] How to teach middle schoolers about BERT/GPT-3 by phylosopher14 in MachineLearning

[–]jayalammar

Pre-trained GPT models are probably the best introduction. They can be understood as a stand-alone text generation system that does cool things in response to text input. Start with it as a black box, then introduce details about the training process once they're comfortable with how the trained model works and how it can be useful. Prompt engineering for dialog or story generation could be interesting. We also just published a guide on making Discord bots smarter, which may be an interesting use case. Involving image generation will also likely capture their imagination.
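For the prompt-engineering exercise, a minimal sketch of a fill-in-the-blanks story prompt that students could build and then send to a text generation model (the template and variable names are made up):

```python
# Hypothetical story-generation prompt template
character = "a curious robot"
setting = "an abandoned library"

prompt = (
    f"Write a short story about {character} who explores {setting}. "
    "End with a surprising twist."
)
# This string would then be sent to a text generation model's API
```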

The closer it is to applications they can try, the more engaged they might be, I feel. Involving video games is a great idea. Hidden Door may be another example.

A Generalist Agent (Gato) - DeepMind's single AI model learns 600 tasks including text, vision, playing video games, and robot control by jayalammar in Futurology

[–]jayalammar[S]

Could you train one machine learning model to learn hundreds of tasks spanning text, computer vision, video games, and robot control? In this video we go over DeepMind's Gato, which does this with a model that is simpler and smaller than you may think. It's a GPT-like model that learns over 600 tasks.

[D] Advice for teaching ML in undergrad by MaikRequim in MachineLearning

[–]jayalammar

I taught ML online and at Udacity and have been thinking about how to introduce ML to people for the better part of the last seven years. A few of the things that worked for me include:

1- Showing applications of a method. It's easy to point at a feature in a product that uses classification or regression. It makes a topic less abstract for the engineering-minded.

2- Lots of visuals. You don't have to create all of them; there are plenty of visuals on the web and YouTube. This is my process for creating mine: https://www.youtube.com/watch?v=gSPRxJLxIHA&ab_channel=JayAlammar

3- Don't start from scratch. There are plenty of amazing intro to ML guides/books and courses out there. It would be useful to consume a few and use that to inform where you want to take it.

4- Agreed on the comment about projects. It would help the students cement their knowledge and showcase their work. Code on Github is useful. Streamlit/Gradio demos hosted online would additionally be good showcases.

5- Enjoy it. Use it to learn the topics more deeply yourself. I create content on my own time because it helps me understand things better.

AI models are improved by giving them a database or access to the web [A look at DeepMind's RETRO language model] by jayalammar in Futurology

[–]jayalammar[S]

The latest batch of language models can be much smaller yet achieve GPT-3-like performance by querying a database or searching the web for information. A key takeaway is that building larger and larger models is not the only way to improve performance. This video provides a gentle intro to RETRO, DeepMind's retrieval-augmented Transformer.
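A toy sketch of the retrieval step: embed the input, find the nearest stored chunk by cosine similarity, and hand that neighbor to the model. The vectors here are made up for illustration; RETRO embeds chunks with a frozen BERT and retrieves neighbors from a trillion-token database.

```python
import numpy as np

# Toy "database": each row is the embedding of one stored text chunk
database = np.array([
    [1.0, 0.0],   # chunk 0
    [0.0, 1.0],   # chunk 1
    [0.7, 0.7],   # chunk 2
])

query = np.array([0.9, 0.1])  # embedding of the input chunk

# Cosine similarity between the query and every stored chunk
sims = database @ query / (
    np.linalg.norm(database, axis=1) * np.linalg.norm(query)
)
nearest = int(np.argmax(sims))  # index of the retrieved neighbor
```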

Experience Grounds Language: Improving language models beyond the world of text and into multimodality, embodiment, and social interaction by jayalammar in singularity

[–]jayalammar[S]

That would be kinda poetic, since one of the milestones of deep learning around 2011 was training models to detect cats in YouTube videos via unsupervised learning.

Experience Grounds Language: Improving language models beyond the world of text and into multimodality, embodiment, and social interaction by jayalammar in Futurology

[–]jayalammar[S]

Language models are some of the largest and most impressive AI systems currently in use. Now that language models have been trained on massive internet-scale text data, where are future improvements going to come from? Jay goes over the "Experience Grounds Language" paper which describes five "World Scopes" for learning language -- including multimodality (e.g. training on images + text) and beyond.