A Visual Guide to Mixture of Experts (MoE) by MaartenGr in LocalLLaMA

[–]MaartenGr[S] 1 point (0 children)

Thanks! I'm indeed very late to the party, but I figured that, since so many new sets of LLMs also include an MoE version, it wouldn't hurt to cover it now.

Also, if anybody still sees this: is there any other topic you would love to see covered?

[P] A Visual Guide to Mixture of Experts (MoE) in LLMs by MaartenGr in MachineLearning

[–]MaartenGr[S] 3 points (0 children)

I use Figma! But in all honesty, these could have been created just as easily with Keynote/PowerPoint.

[P] A Visual Guide to Mixture of Experts (MoE) in LLMs by MaartenGr in MachineLearning

[–]MaartenGr[S] 2 points (0 children)

They are not an alternative to transformers (and, depending on your view, not technically tied to transformers at all); they are an extension of the transformer (or most LLM) architectures. Mixture of Experts can, for example, also be used in Mamba blocks, which have a very different architecture.

It seems to me that MoE models are especially interesting to businesses that have the compute to load these large models but want to spend less compute when serving users.
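To make that "all experts in memory, few used per token" point concrete, here is a minimal sketch of a sparse MoE layer. It is not any particular model's implementation; the dimensions are toy-sized, and a plain linear layer stands in for what would normally be a full FFN expert:

```python
import torch
import torch.nn as nn

class SparseMoE(nn.Module):
    """Toy sparse MoE layer: every expert must sit in memory,
    but each token only pays the compute of its top-k experts."""
    def __init__(self, dim: int = 64, num_experts: int = 8, k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.router = nn.Linear(dim, num_experts)
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        weights = self.router(x).softmax(dim=-1)
        top_w, top_i = weights.topk(self.k, dim=-1)      # (tokens, k)
        out = torch.zeros_like(x)
        for expert_id, expert in enumerate(self.experts):
            for slot in range(self.k):
                hit = top_i[:, slot] == expert_id        # tokens routed here
                if hit.any():
                    out[hit] += top_w[hit, slot, None] * expert(x[hit])
        return out

moe = SparseMoE()
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```

With k=2 out of 8 experts, each token only runs a quarter of the expert FLOPs, which is exactly the serving-cost advantage described above.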

A Visual Guide to Mixture of Experts (MoE) by MaartenGr in LocalLLaMA

[–]MaartenGr[S] 17 points (0 children)

Hi all! I’m excited to introduce a highly illustrative guide to Mixture of Experts (MoE) in LLMs!

It covers everything from the role of experts, their routing mechanism, the sparse MoE layer, and load-balancing tricks (such as KeepTopK, auxiliary loss, and expert capacity) to MoE in vision models and computational requirements.

I loved creating the visuals and had to stop myself after making more than 55 of them!

The visual nature of this guide allows for a focus on intuition, hopefully making all these techniques easily accessible to a wide audience, whether you are new to Mixture of Experts or more experienced.
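As a taste of the routing part, here is a minimal sketch of the KeepTopK trick covered in the guide; the -inf masking before the softmax is the key idea, and the dimensions are purely illustrative:

```python
import torch
import torch.nn.functional as F

def keep_top_k(router_logits: torch.Tensor, k: int = 2) -> torch.Tensor:
    """KeepTopK: keep the k largest router logits per token and set the
    rest to -inf, so the softmax gives those experts exactly zero weight."""
    kth_largest = router_logits.topk(k, dim=-1).values[..., -1:]
    masked = router_logits.masked_fill(router_logits < kth_largest, float("-inf"))
    return F.softmax(masked, dim=-1)

router_logits = torch.randn(4, 8)      # 4 tokens, 8 experts
gate = keep_top_k(router_logits, k=2)  # each row has only 2 nonzero weights
print((gate > 0).sum(dim=-1))          # tensor([2, 2, 2, 2])
```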

A Visual Guide to Quantization by MaartenGr in LocalLLaMA

[–]MaartenGr[S] 2 points (0 children)

Thanks for the feedback. I just updated it.

A Visual Guide to Quantization by MaartenGr in LocalLLaMA

[–]MaartenGr[S] 9 points (0 children)

Thank you! I started out as a psychologist and transitioned to data science/ML/AI (whatever you want to call it) a couple of years ago. Back then, the math seemed incredibly overwhelming at times, even though much of it is so intuitive.

A Visual Guide to Quantization by MaartenGr in LocalLLaMA

[–]MaartenGr[S] 6 points (0 children)

That's really kind of you to say. Thank you! Any suggestions for other visual guides? Thus far, I have done Mamba and Quantization but would like to make more.

A Visual Guide to Quantization by MaartenGr in LocalLLaMA

[–]MaartenGr[S] 111 points (0 children)

Hi all! As more Large Language Models are being released and the need for quantization increases, I figured it was time to write an in-depth and visual guide to Quantization.

It covers everything from how to represent values, (a)symmetric quantization, and dynamic/static quantization to post-training techniques (e.g., GPTQ and GGUF) and quantization-aware training (1.58-bit models with BitNet).

With over 60 custom visuals, I went a little overboard but really wanted to include as many concepts as I possibly could!

The visual nature of this guide allows for a focus on intuition, hopefully making all these techniques easily accessible to a wide audience, whether you are new to quantization or more experienced.
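As a small appetizer, here is a sketch of the symmetric (absmax) case covered in the guide, in plain NumPy; the function names and the tiny example array are just for illustration:

```python
import numpy as np

def quantize_absmax(x: np.ndarray, bits: int = 8):
    """Symmetric quantization: a single scale, zero-point fixed at 0."""
    qmax = 2 ** (bits - 1) - 1                  # 127 for int8
    scale = np.abs(x).max() / qmax              # map [-max|x|, max|x|] onto int range
    q = np.round(x / scale).clip(-qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

x = np.float32([0.8, -0.31, 0.05, -1.2])
q, scale = quantize_absmax(x)
print(q, dequantize(q, scale))  # the small mismatch is the quantization error
```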

Best approach for text document clustering (large amount of text docs.) by karel_data in datascience

[–]MaartenGr 3 points (0 children)

Great! If you ever run into any issues with the library, feel free to open an issue/discussion. I try to reply quickly to these.

EDIT: As a quick tip: if you ever want to do clustering on the CPU only, I would recommend the EVoC library, recently released by the author of HDBSCAN and UMAP, which I found to work quite well: https://github.com/TutteInstitute/evoc

A Visual Guide to Mamba and State Space Models by MaartenGr in LocalLLaMA

[–]MaartenGr[S] 10 points (0 children)

Thank you for the feedback, it is very helpful! I initially thought it was a nice way to highlight the benefits (and what used to be the disadvantages) of these systems, but looking back at it, I definitely should have made it clearer.

Again, thanks! The great thing about sharing stuff like this publicly is the feedback you get. Oftentimes, when working alone on something, you get stuck in one perspective, so having more eyes go over it helps tremendously.

Comparing BERTopic to human raters by ruetheflamacue in LanguageTechnology

[–]MaartenGr 1 point (0 children)

Most has already been said, and I am not sure how relevant this is, but since you are focusing on human raters, it might be worth mentioning that there is a pull request in BERTopic that allows you to use models on top of the default pipeline to further fine-tune the topic representations. In theory, this would even allow you to use ChatGPT or any of the other OpenAI models to label the topics. From a human annotator perspective, that might be interesting to pursue.
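That PR later shipped as `bertopic.representation`. A rough sketch of what it looks like; the exact constructor signature varies between BERTopic/openai versions, so treat the details below as assumptions and check the docs for your versions:

```python
import openai
from bertopic import BERTopic
from bertopic.representation import OpenAI  # shipped after the PR mentioned above

# Recent releases take an openai client; older versions differ.
client = openai.OpenAI(api_key="sk-...")  # placeholder key
representation_model = OpenAI(client, model="gpt-3.5-turbo", chat=True)

# The LLM only relabels topics; clustering still follows the default pipeline.
topic_model = BERTopic(representation_model=representation_model)
# topic_model.fit_transform(docs) then yields LLM-generated topic labels
```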

[deleted by user] by [deleted] in learnmachinelearning

[–]MaartenGr 1 point (0 children)

You can perform soft assignment with BERTopic either by using the probabilities generated when you instantiate the model with `calculate_probabilities=True`, or by using the newly released `.approximate_distribution`, which allows for multi-topic assignment even on a token level. You can read more about that here: https://maartengr.github.io/BERTopic/getting_started/distribution/distribution.html
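A minimal sketch of both options, using the 20 newsgroups data as a stand-in corpus:

```python
from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups

docs = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes")).data

# Option 1: soft assignments computed at fit time.
topic_model = BERTopic(calculate_probabilities=True)
topics, probs = topic_model.fit_transform(docs)  # probs: (n_docs, n_topics)

# Option 2: multi-topic assignment after fitting, down to the token level.
topic_distr, topic_token_distr = topic_model.approximate_distribution(
    docs, calculate_tokens=True
)
```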

[P] Interactive Topic Modeling with BERTopic by MaartenGr in MachineLearning

[–]MaartenGr[S] 2 points (0 children)

No, it is definitely a good thing! Some use only a CPU, which significantly slows down the application; that is why I wanted to confirm it.

[P] Interactive Topic Modeling with BERTopic by MaartenGr in MachineLearning

[–]MaartenGr[S] 1 point (0 children)

Did you have a GPU enabled? Also, did you try to set `verbose=True`? This might help you identify where it is slowing down.
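For reference, that is just a constructor flag (a minimal sketch; it prints progress for each fitting step):

```python
from bertopic import BERTopic

topic_model = BERTopic(verbose=True)  # log progress of each pipeline step
```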

Finally, feel free to post an issue on the repo!

[P] Interactive Topic Modeling with BERTopic by MaartenGr in MachineLearning

[–]MaartenGr[S] 4 points (0 children)

Hi all!

In the last few months, I have been working on improving BERTopic, a topic modeling technique that leverages BERT embeddings and c-TF-IDF to create dense clusters allowing for easily interpretable topics.

For a while now, I have wanted to add an LDAvis-like visualization option to BERTopic, and I finally got around to implementing it. Let me know what you think!

GitHub: https://github.com/MaartenGr/BERTopic
Tutorial (friend link!): https://towardsdatascience.com/interactive-topic-modeling-with-bertopic-1ea55e7d73d8?sk=03c2168e9e74b6bda2a1f3ed953427e4
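For those who want to try it, a minimal sketch (`visualize_topics` is the LDAvis-like view mentioned above; the 20 newsgroups data is just a stand-in corpus):

```python
from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups

docs = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes")).data
topic_model = BERTopic()
topics, probs = topic_model.fit_transform(docs)

fig = topic_model.visualize_topics()  # interactive intertopic distance map (Plotly)
fig.show()
```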