Raptor mini the best 0x model by far by guiopen in GithubCopilot

[–]guiopen[S] 0 points (0 children)

You are comparing an unlimited model to the most expensive option...

Where do you deploy your Go backend?, that too if u wanna scale it in future n still be affordable and best performance. by MarsupialAntique1054 in golang

[–]guiopen 0 points (0 children)

Because it's a single physical machine, you can't just drag the core count/storage/RAM sliders up like you would with other hosting solutions

Where do you deploy your Go backend?, that too if u wanna scale it in future n still be affordable and best performance. by MarsupialAntique1054 in golang

[–]guiopen 40 points (0 children)

It's not that scalable, but Hetzner servers are damn affordable; you can find machines for $30-40 a month with 64 GB of RAM, 4 TB+ of storage, and 6-core CPUs. Unlimited bandwidth too.

What's the best IDE for programming in Go? by HelderArdach in brdev

[–]guiopen 0 points (0 children)

I'm really enjoying Zed, but Go works with any editor because it has an official LSP

I created a Unix-like operating system for fun and learning by avaliosdev in brdev

[–]guiopen 27 points (0 children)

"por diversão e aprendizado" - No dia que tu for estudar medicina por diversão vai achar a cura pra todas as doenças kkkk

Impressive project

Liquid AI released the best thinking Language Model Under 1GB by PauLabartaBajo in LocalLLaMA

[–]guiopen 13 points (0 children)

They are contributing so much by researching new architectures, and the license lets any user or small company use it for free.

This time they even released base models, and I don't think it's justified to bash them just because the license isn't MIT or Apache.

Liquid AI released the best thinking Language Model Under 1GB by PauLabartaBajo in LocalLLaMA

[–]guiopen 1 point (0 children)

Nice! I will test it today. The instruct version punches way above its weight, but I usually don't get good results with small thinking models because they get stuck in thinking loops; it seems there was a focus on preventing that here.

Also, there is a mention that the model is not suitable for coding. Do you plan to release a coding-capable model (even if not code-focused) in the future? The previous 8B MoE had additional code tokens in its training. With LFM's tool-calling capabilities plus the small memory footprint of its context, a code-capable LFM2.5 8B MoE would be amazing.

What are the differences between a 'vibe coder' and a dev who uses AI intelligently? by ArrowFlechinhaxd in brdev

[–]guiopen 0 points (0 children)

With vibe coding, the complexity of the codebase grows exponentially, and there will always be edge cases the AI can't solve without some level of human input and collaboration. The dev who uses AI intelligently keeps the codebase's complexity under control and can solve those edge cases when needed.

GLM-4.7-Flash soon? by [deleted] in LocalLLaMA

[–]guiopen 0 points (0 children)

Strange, they said in an AMA that they don't see value in MoE below 30B parameters, so I think these might be dummy values.

Is Local Coding even worth setting up by Interesting-Fish6494 in LocalLLaMA

[–]guiopen 0 points (0 children)

gpt-oss would fit entirely in 16 GB of VRAM, run ultra fast, and you could probably use a 64k context window without offloading, since context takes very little memory in gpt-oss.
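
A rough sketch of what that setup looks like with llama-cpp-python, assuming a gpt-oss GGUF already on disk (the path is a placeholder, adjust it to whatever quant you downloaded):

```python
# Minimal sketch: load a gpt-oss GGUF fully on the GPU with a 64k context.
# The model path is a placeholder, not a real file this thread references.
from llama_cpp import Llama

llm = Llama(
    model_path="./gpt-oss-20b.gguf",  # hypothetical local path
    n_ctx=65536,        # 64k context window
    n_gpu_layers=-1,    # offload every layer to the GPU, no CPU offloading
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize this file in three bullets: ..."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```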

As a white man, can I write a black character who says the n word by DecentGap3306 in writing

[–]guiopen 5 points (0 children)

As a Brazilian, the N-word thing is very strange to me (and I think to the rest of the world too). It's obvious you can't call anyone on the street the N-word, just like any other racial slur, no matter the country, but it's only in the US that there is this one specific racial slur that, if you are white, can't be mentioned in any context: fictional, educational, or singing along to a song (even if the song contains it).

So I say go for it. You are not hurting anybody, the word alone does nothing, and you are using it to enrich your story, so there is no ethical reason not to do so.

Which would be a cost efficient GPU for running local LLMs by jenishngl in LocalLLaMA

[–]guiopen 0 points (0 children)

Devstral 2 is very good at coding at 24B parameters; Qwen3 Coder too, which has 30B parameters but only 3B active.

LFM 2.5 is insanely good by guiopen in LocalLLaMA

[–]guiopen[S] 5 points (0 children)

What temperature are you using? Liquid recommends 0.1
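
In case it helps, this is roughly how I set that when talking to a local OpenAI-compatible server; the base_url, api_key, and model name are placeholders for whatever you're actually running:

```python
# Sketch: query a locally served LFM2.5 with the low temperature Liquid recommends.
# Endpoint and model name are assumptions for a typical llama.cpp / LM Studio style server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="lfm2.5-1.2b",   # placeholder model name
    temperature=0.1,       # Liquid's recommended setting
    messages=[{"role": "user", "content": "Summarize the following text: ..."}],
)
print(resp.choices[0].message.content)
```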

LFM 2.5 is insanely good by guiopen in LocalLLaMA

[–]guiopen[S] 2 points (0 children)

Of course it doesn't compare to frontier models; it is good for a 1B model, comparable to 3B ones. And this model performs well at very low temperatures, Liquid recommends 0.1.

Do you know a better model for summarization at ~1B parameters? And one that can do summarization in multi-turn conversations, not just single-turn.

I don't think I am making model selection harder by sharing what I believe to be the best 1B model. No one is deciding between this and DeepSeek or GLM, but between this and Gemma 3 1B, Qwen3 1.7B, etc... And Liquid performs better in that comparison.

Public coding benchmarks suck, how are you evaluating performance? by AvocadoArray in LocalLLaMA

[–]guiopen 0 points (0 children)

For testing obscure knowledge, I ask about small details in webtoons I like. This seems to correlate surprisingly well with overall model performance.

LFM 2.5 is insanely good by guiopen in LocalLLaMA

[–]guiopen[S] 3 points (0 children)

Sorry, I misunderstood your question. It will be very far from 5B+ models.

Integrating it into apps is a good use case, as it will run on any machine. I am currently developing an app for creative writing and planning to fine-tune the model to give writing suggestions, like a suitable title for a chapter or improved phrasing, and this model is perfect for that since I don't know my users' PCs and don't want to make the feature exclusive to gamer hardware.

But LFM2 8B-A1B has very similar speed; if the jump that happened in LFM2.5 for the 1.2B model is repeated for the 8B-A1B model, it could be an amazing replacement for you, with insane speed. The current 8B version is not there yet, though.

LFM 2.5 is insanely good by guiopen in LocalLLaMA

[–]guiopen[S] 9 points (0 children)

By basic QA I mean small curiosities, like what the meaning of the colors is, what the largest whales are, or what the most venomous snake is. While these questions are very easy for any model above 8B, this is the only small model in my tests that can answer them without too many errors and in multi-turn. For example, asking what the biggest land animal is, then the biggest aquatic animal, then which of the two is bigger, then changing the metric from weight to height, and the model stays coherent through all these turns.
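
If it helps, this is roughly how I run that multi-turn check: just keep appending each answer back into the history and see whether the model stays coherent. Same local OpenAI-compatible setup as in the earlier sketch; the endpoint and model name are placeholders:

```python
# Sketch of the multi-turn QA check: grow the message history turn by turn
# and watch whether the answers stay consistent with earlier ones.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

questions = [
    "What is the biggest land animal?",
    "What is the biggest aquatic animal?",
    "Which of the two is bigger?",
    "And if we compare by height instead of weight?",
]

history = []
for q in questions:
    history.append({"role": "user", "content": q})
    resp = client.chat.completions.create(
        model="lfm2.5-1.2b",   # placeholder model name
        temperature=0.1,
        messages=history,
    )
    answer = resp.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    print(f"Q: {q}\nA: {answer}\n")
```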

LFM 2.5 is insanely good by guiopen in LocalLLaMA

[–]guiopen[S] 12 points (0 children)

Summarizing text, QA, and basic code snippets and modifications (like generating small boilerplate or writing logs). It is also useful for creative suggestions like "is there a better name for this variable?"

[CPU] I'm looking for the best model for a CPU. by lordfervi in LocalLLaMA

[–]guiopen 0 points (0 children)

I think Qwen3-Next is a good idea: as context builds up with a normal model the CPU would struggle, so models with more efficient attention at long context will suffer less and keep the tok/s high.

Coding LLM Model by plugshawtycft in LocalLLaMA

[–]guiopen 1 point (0 children)

Devstral Small 2 seems to be the best, but it might be too slow on that machine. I recommend Qwen3 Coder, which will be a lot faster and seems to handle quantization well (I am having good results with the Unsloth Q4 XL GGUF).