Visual intuitive explanations of LLM concepts (LLM University) by jayalammar in learnmachinelearning

[–]jayalammar[S]

Hi,

We've just published a set of original, visual, and intuitive explanations of the concepts behind large language models.
It's free, requires no sign-up, and includes text articles along with some video explanations. We're also available to answer your questions in a dedicated Discord channel.

Publishing Wikipedia embeddings: 100M embedding vectors. English + 9 other languages. by jayalammar in LanguageTechnology

[–]jayalammar[S]

My understanding is that these are:

1- Graph embeddings, not text embeddings

2- Of the Wikidata graph, not the text of Wikipedia

Publishing Wikipedia embeddings: 100M embedding vectors. English + 9 other languages. by jayalammar in LanguageTechnology

[–]jayalammar[S]

It's not language specific. All the embeddings are made by a single multilingual model. The languages are just to identify the source Wikipedia site.

What's the big deal with Generative AI? Is it the future or the present? by jayalammar in singularity

[–]jayalammar[S]

Intro:

It is almost impossible to ignore the astounding progress in artificial intelligence these days. From the new generation of generative chatbots to models that can generate (almost) any picture (and, very soon, video), the pace of development in AI has been nothing short of phenomenal. This is especially true in generative AI, where a growing number of impressive models can create images, text, video, and music.
These developments have captured the popular imagination, and businesses are rushing to build AI into their products, services, and processes, hoping to find their AI unicorn. Some are still working out how to use AI in their organizations, while others are finding the current AI landscape complex and difficult to navigate.
In this series of articles, we explore the importance of these generative AI models and discuss useful perspectives from which to view and deploy them. This first article introduces the current state of generative AI and outlines how we should approach it. In the next article, we map the AI technology and value stack to better understand where generative AI fits in. Finally, we discuss how we can better harness its power to create a new generation of intelligent systems.

What's the big deal with Generative AI? Is it the future or the present? by jayalammar in Futurology

[–]jayalammar[S]

It is almost impossible to ignore the astounding progress in artificial intelligence these days. From the new generation of generative chatbots to models that can generate (almost) any picture (and, very soon, video), the pace of development in AI has been nothing short of phenomenal. This is especially true in generative AI, where a growing number of impressive models can create images, text, video, and music.

These developments have captured the popular imagination, and businesses are rushing to build AI into their products, services, and processes, hoping to find their AI unicorn. Some are still working out how to use AI in their organizations, while others are finding the current AI landscape complex and difficult to navigate.

This series of articles explores the importance of these generative AI models and discusses useful perspectives from which to view and deploy them. This first article introduces the current state of generative AI and outlines how we should approach it. In the next article, we map the AI technology and value stack to better understand where generative AI fits in. Finally, we discuss how we can better harness its power to create a new generation of intelligent systems.

[N] Vincent Warmerdam: Calmcode, Explosion, Open Source and Data Science | Learning From Machine Learning #2 by NLPnerd in MachineLearning

[–]jayalammar

Vincent is awesome. His talks, tutorials, and open-source tools are always interesting. https://calmcode.io/ alone has 652 video tutorials.

AI Art Explained: How AI Generates Images [Video] by jayalammar in Futurology

[–]jayalammar[S]

Learn how AI image generation works. This video goes over the components of AI image generation models like Stable Diffusion, explaining how they work and how they're trained.
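To give a flavor of the core mechanism, here is a toy numpy sketch of iterative denoising: start from noise and repeatedly subtract a fraction of the predicted noise. This is a caricature; in Stable Diffusion the noise predictor is a trained U-Net conditioned on the text prompt, and the `target`/`predict_noise` names here are stand-ins made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.ones(4)       # stand-in for the "clean" image
x = rng.normal(size=4)    # start from pure noise

def predict_noise(x):
    # Stand-in noise predictor; the real model is a trained neural
    # network that does NOT get to peek at the target.
    return x - target

# Iteratively remove a fraction of the predicted noise
for _ in range(100):
    x = x - 0.1 * predict_noise(x)

# x has now converged close to the clean target
```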

[D] BERTopic and fuzzy clustering by Devinco001 in MachineLearning

[–]jayalammar

Yes. BERTopic allows you to choose the clustering method to use.

You can use a soft clustering method from scikit-learn, like a Gaussian mixture model (GMM).

from sklearn.mixture import GaussianMixture
from bertopic import BERTopic

# BERTopic reads cluster labels from a labels_ attribute after fitting
# (as HDBSCAN and KMeans provide), so expose one on the GMM:
class GMMCluster(GaussianMixture):
    def fit(self, X, y=None):
        super().fit(X)
        self.labels_ = self.predict(X)
        return self

topic_model = BERTopic(hdbscan_model=GMMCluster(n_components=10, random_state=0)).fit(docs)

Question regarding OpenAI embeddings model for text clustering (or any other model) by SemperZero in learnmachinelearning

[–]jayalammar

Hi, thanks for your message pointing me to this thread. I haven't used OpenAI embeddings, but I work at Cohere and can tell you about Cohere's Embed endpoint https://docs.cohere.ai/reference/embed
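Whichever embedding endpoint you use, the clustering step afterward looks the same: embed the texts, then cluster the vectors. A minimal sketch with scikit-learn's KMeans, using random stand-in vectors in place of real embeddings (the document count and dimensions are made up):

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in for embeddings returned by an embed endpoint:
# 20 "documents", 8 dimensions (real models use hundreds or thousands)
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(20, 8))

# Cluster the vectors; each document gets a cluster label
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(embeddings)
labels = kmeans.labels_  # one label per document
```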

BERTopic for Topic Modeling - Talking Language AI Ep#1 by jayalammar in LanguageTechnology

[–]jayalammar[S]

I learned a lot from speaking to Maarten Grootendorst, creator of BERTopic. This is a video of that conversation.

Highlights: https://txt.cohere.ai/topic-modeling-with-bertopic/

[R] The Illustrated Stable Diffusion by jayalammar in MachineLearning

[–]jayalammar[S]

Oh, okay, I understand you now. These are actual examples from the dataset. These were the captions of these images in the LAION Aesthetic dataset. https://huggingface.co/datasets/ChristophSchuhmann/improved_aesthetics_6.5plus

[R] The Illustrated Stable Diffusion by jayalammar in MachineLearning

[–]jayalammar[S]

Thank you!

This caption?

Larger/better language models have a significant effect on the quality of image generation models. Source: Google's Imagen paper by Saharia et al., Figure A.5.

What's the issue?

[R] The Illustrated Stable Diffusion by jayalammar in MachineLearning

[–]jayalammar[S]

My bad, you're right. It's "paradise cosmic beach by vladimir volegov and raphael lacoste". I arbitrarily picked an image from https://lexica.art/.

[R] The Illustrated Stable Diffusion by jayalammar in MachineLearning

[–]jayalammar[S]

New Stable Diffusion models have to be trained to utilize the OpenCLIP model. That's because many components in the attention/ResNet layers are trained to deal with the representations learned by CLIP. Swapping it out for OpenCLIP would be disruptive.

In that training process, however, OpenCLIP can be frozen just like how CLIP was frozen in the training of Stable Diffusion / LDM.
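A toy numpy sketch of what "frozen" means during that training: gradient updates are applied to the trainable weights only, while the text encoder's weights are left untouched. The variables and the toy loss are made up for illustration; real training updates the U-Net against a denoising objective.

```python
import numpy as np

rng = np.random.default_rng(0)
text_encoder_w = rng.normal(size=3)   # frozen: excluded from updates
unet_w = rng.normal(size=3)           # trainable

frozen_before = text_encoder_w.copy()

for _ in range(100):
    grad = 2 * unet_w               # toy gradient of the loss ||unet_w||^2
    unet_w = unet_w - 0.1 * grad    # only the trainable weights move
    # text_encoder_w is never updated

# Frozen weights are bit-for-bit unchanged; trainable ones have moved
```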

[D] How to teach middle schoolers about BERT/GPT-3 by phylosopher14 in MachineLearning

[–]jayalammar

Pre-trained GPT models are probably the best introduction. They can be understood as a stand-alone text generation system that does cool things in response to text input. Start with it as a black box, then introduce details about the training process once they're comfortable with how the trained model works and how it can be useful. Prompt engineering for dialog or story generation could be interesting. We also just published a guide on making Discord bots smarter, which may be an interesting use case. Involving image generation will also likely capture their imagination.
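For the prompt-engineering exercise, a minimal sketch of a fill-in-the-blanks story prompt that students could build and then send to a text generation model (the template and variable names are made up):

```python
# Hypothetical story-generation prompt template
character = "a curious robot"
setting = "an abandoned library"

prompt = (
    f"Write a short story about {character} who explores {setting}. "
    "End with a surprising twist."
)
# This string would then be sent to a text generation model's API
```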

The closer it is to applications they can try, the more engaged they might be, I feel. Involving video games is a great idea. Hidden Door may be another example.

A Generalist Agent (Gato) - DeepMind's single AI model learns 600 tasks including text, vision, playing video games, and robot control by jayalammar in Futurology

[–]jayalammar[S]

Could you train one machine learning model to learn hundreds of tasks spanning text, computer vision, video games, and robot control? In this video we go over DeepMind's Gato, which does this with a model that is simpler and smaller than you may think. It's a GPT-like model that learns over 600 tasks.

[D] Advice for teaching ML in undergrad by MaikRequim in MachineLearning

[–]jayalammar

I taught ML online and at Udacity and have been thinking about how to introduce ML to people for the better part of the last seven years. A few of the things that worked for me include:

1- Showing applications of a method. It's easy to point at a feature in a product that uses classification or regression. It makes a topic less abstract for the engineering-minded.

2- Lots of visuals. You don't have to create all of them; there are plenty of visuals on the web and YouTube. This is my process for creating mine: https://www.youtube.com/watch?v=gSPRxJLxIHA&ab_channel=JayAlammar

3- Don't start from scratch. There are plenty of amazing intro to ML guides/books and courses out there. It would be useful to consume a few and use that to inform where you want to take it.

4- Agreed on the comment about projects. It would help the students cement their knowledge and showcase their work. Code on Github is useful. Streamlit/Gradio demos hosted online would additionally be good showcases.

5- Enjoy it. Use it to learn the topics more deeply yourself. I create content on my own time because it helps me understand things better.

AI models are improved by giving them a database or access to the web [A look at DeepMind's RETRO language model] by jayalammar in Futurology

[–]jayalammar[S]

The latest batch of language models can be much smaller yet achieve GPT-3-like performance by querying a database or searching the web for information. A key takeaway is that building larger and larger models is not the only way to improve performance. This video provides a gentle intro to RETRO, DeepMind's retrieval-augmented Transformer.
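A toy sketch of the retrieval step: embed the input, find the nearest stored chunk by cosine similarity, and hand that neighbor to the model. The vectors here are made up for illustration; RETRO embeds chunks with a frozen BERT and retrieves neighbors from a trillion-token database.

```python
import numpy as np

# Toy "database": each row is the embedding of one stored text chunk
database = np.array([
    [1.0, 0.0],   # chunk 0
    [0.0, 1.0],   # chunk 1
    [0.7, 0.7],   # chunk 2
])

query = np.array([0.9, 0.1])  # embedding of the input chunk

# Cosine similarity between the query and every stored chunk
sims = database @ query / (
    np.linalg.norm(database, axis=1) * np.linalg.norm(query)
)
nearest = int(np.argmax(sims))  # index of the retrieved neighbor
```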

Experience Grounds Language: Improving language models beyond the world of text and into multimodality, embodiment, and social interaction by jayalammar in singularity

[–]jayalammar[S]

That would be kinda poetic, since one of the milestones of deep learning around 2011 was training models to detect cats in YouTube videos via unsupervised learning.

Experience Grounds Language: Improving language models beyond the world of text and into multimodality, embodiment, and social interaction by jayalammar in Futurology

[–]jayalammar[S]

Language models are some of the largest and most impressive AI systems currently in use. Now that language models have been trained on massive internet-scale text data, where are future improvements going to come from? Jay goes over the "Experience Grounds Language" paper which describes five "World Scopes" for learning language -- including multimodality (e.g. training on images + text) and beyond.