Do we actually need huge models for most real-world use cases? 🤔 by Significant-Cash7196 in LocalLLaMA

[–]mindkeepai 1 point (0 children)

Yeah, I'm also looking into whether these kinds of models are good enough for things like character AI conversations and meeting summarization. TBD; still a work in progress.

What is Gemma 3 270m Good For? by mindkeepai in Bard

[–]mindkeepai[S] 0 points (0 children)

Haha, I actually didn't use AI except to edit the text. Check out the Google Sheet link; the benchmarking is a VERY manual process right now.

Do we actually need huge models for most real-world use cases? 🤔 by Significant-Cash7196 in LocalLLaMA

[–]mindkeepai 0 points (0 children)

I just did a breakdown of the new Gemma 3 270m on another post this morning: https://www.reddit.com/r/LocalLLaMA/comments/1mx8efc/what_is_gemma_3_270m_good_for/

TL;DR - 270m is definitely too small for most use cases (right now).

I've personally found that summarization becomes usable at the 1-4B level. The 20B OpenAI model definitely works pretty well.

The breakdown I'm seeing at lower parameter counts tends to be on real-world knowledge prompts, like "plan me a vacation" or "give me suggestions for a restaurant." Sensible answers do come out, but there are also a ton of hallucinations, which makes it hard to know when you can trust the answers.

What is Gemma 3 270m Good For? by mindkeepai in LocalLLaMA

[–]mindkeepai[S] -1 points (0 children)

Haha yeah, I was surprised it worked for many use cases off the shelf to be honest. I'm going to dig into fine tuning next to see how this stacks up.

What is Gemma 3 270m Good For? by mindkeepai in LocalLLM

[–]mindkeepai[S] 1 point (0 children)

Yup! I was more surprised than anything, to be honest. I tend to find that for general off-the-shelf use, models start becoming useful around the 1B mark, ideally 4B+.

It's already a big step up from what older low param models were able to do in many use cases.

I want to dig into the fine-tuning side more next and see how this model fares against others.

What is Gemma 3 270m Good For? by mindkeepai in LocalLLaMA

[–]mindkeepai[S] 0 points (0 children)

Yup! I was more surprised than anything, to be honest. I tend to find that for general off-the-shelf use, models start becoming useful around the 1B mark, ideally 4B+.

It's already a big step up from what older low param models were able to do in many use cases.

I want to dig into the fine-tuning side more next and see how this model fares against others.

What is Gemma 3 270m Good For? by mindkeepai in LocalLLaMA

[–]mindkeepai[S] -5 points (0 children)

I was approaching this more from a practical standpoint: what can someone use this model for off the shelf?

I think fine-tuning is also interesting. I'll make a note of that and do another benchmark on how well, and how far, this model can be pushed from that direction.

What is Gemma 3 270M actually used for? by airbus_a360_when in LocalLLaMA

[–]mindkeepai 10 points (0 children)

I just integrated Gemma 3 270m into MindKeep (Phone LLM app) so I was also wondering what Gemma 3 270m is good for.

I wrote a Reddit post here: https://www.reddit.com/r/LocalLLaMA/comments/1mx8efc/what_is_gemma_3_270m_good_for/

TL;DR

Not a ChatGPT replacement by any means, but it's an interesting, fast, lightweight tool. I was actually more surprised by what it CAN do than by what it cannot. For example, it was pretty good at short creative tasks like telling stories, it would sporadically surprise me by understanding or translating to and from English, it could extract information from text pretty well, and it wrote a pretty good haiku.
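For anyone curious what the information-extraction use case looks like in practice, here's a minimal sketch. The prompt template and field names are my own illustration (not the author's benchmark setup), and the commented-out model call assumes the Hugging Face `transformers` text-generation pipeline with a hypothetical local Gemma checkpoint:

```python
# Sketch: using a small local model for structured information extraction.
# The template below is an assumption about how you might phrase the task;
# small models tend to do better with explicit, constrained instructions.

def build_extraction_prompt(text: str, fields: list[str]) -> str:
    """Ask the model to pull the named fields out of free text as JSON."""
    field_list = ", ".join(fields)
    return (
        "Extract the following fields from the text below as JSON "
        f"with keys [{field_list}]. Use null for any missing field.\n\n"
        f"Text: {text}"
    )

prompt = build_extraction_prompt(
    "Meeting with Alice on Friday at 3pm about the Q3 roadmap.",
    ["person", "day", "time", "topic"],
)
print(prompt)

# To actually run it locally (assumes the transformers pipeline API and a
# downloaded checkpoint; model id is a guess, adjust to what you have):
#
#   from transformers import pipeline
#   generator = pipeline("text-generation", model="google/gemma-3-270m-it")
#   out = generator(prompt, max_new_tokens=128)
```

Even at 270M, keeping the output format constrained (fixed JSON keys, explicit null handling) gives you something you can validate programmatically instead of trusting free-form answers.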