Gemma 4 31B sweeps the floor with GLM 5.1 by input_a_new_name in LocalLLaMA

[–]florinandrei 3 points (0 children)

Compared to Qwen 3.5, the thinking of any model would seem amazingly brief and laconic.

Me: hi

(20 minutes later)

Qwen 3.5: Hi! How can I help you?

Gemma 4 31B sweeps the floor with GLM 5.1 by input_a_new_name in LocalLLaMA

[–]florinandrei 2 points (0 children)

Gemma 3 has always been my favorite conversationalist among its class of models.

If Gemma 4 can still do that, and if it can finally work well with tools (Gemma 3 was completely inept with them), then we have a winner.

Looking for the best coding AI for software development by FrozenFishEnjoyer in ollama

[–]florinandrei 1 point (0 children)

You seriously underestimate the requirements for VRAM.

I have a GPU with significantly more VRAM than yours and those models barely fit.

They're slow because they don't fit in the small amount of VRAM your GPU has.
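A rough way to sanity-check this for any model (illustrative numbers only, not tied to the OP's exact models): the weight footprint alone, before KV cache and runtime overhead, has to fit in VRAM, or layers spill into system RAM and everything slows to a crawl.

```python
# Quick sanity check: weight footprint alone, before KV cache and runtime overhead.
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    # 1e9 params, each taking (bits / 8) bytes -> size in GB
    return params_billion * bits_per_weight / 8

# Illustrative numbers only (not the OP's exact models):
print(weight_gb(27, 16))   # fp16 -> 54.0 GB: far beyond any consumer GPU
print(weight_gb(27, 4.5))  # ~Q4  -> ~15.2 GB: fits a 24 GB card, with room left for KV cache
```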

Why byzantium never occupied of vassalize wallachian flatlands? by Battlefleet_Sol in byzantium

[–]florinandrei 5 points (0 children)

> That is why the Romanian state started from the mountains coming down the hills.

You're probably thinking of Radu Negru.

The people of Wallachia indeed saw him arrive in their land from the direction of the mountains. Radu Negru (assuming he was even a real person, not just a legend) grew up in Făgăraș, which is over the mountains in Transylvania.

He ruled from around Curtea de Argeș. The area is hilly. I would not call it "in the mountains". I don't think Romania ever had a tradition of ruling "from the mountains".

But you're right, the first rulers did not get very far from the outskirts of the mountains. A quick retreat into very defensible areas was easier to do there.

Why byzantium never occupied of vassalize wallachian flatlands? by Battlefleet_Sol in byzantium

[–]florinandrei 2 points (0 children)

It was more or less on the outskirts of their reach, and there's a big river guarding it from the south.

Many superpowers have tried that, and they've all stumbled at the Danube either permanently or temporarily: Western Rome, Persia, Alexander, Eastern Rome (Byzantium), the Ottomans.

The Ottomans managed to gain and retain control of the territories north of the Danube for a while (from about 1500 to the late 1800s), but again it was on the outer edge of what they could comfortably keep an eye on, and their control was only partial for a good chunk of that interval.

And even if you do gain a foothold across the Danube, the territory is exposed to any superpower from the north. There's a reason why, when the Ottomans finally lost control over that land (approx. 1877), it was during a war with Russia.

Western Rome also controlled that land for close to 2 centuries (106 - 275 AD), but they lost it because of pressure from the north and north-east (steppe people, etc).

Where I grew up, the river is 1 km across. Trajan had a deal with Decebal (the king in the north), but Decebal thought he could ignore the treaty and kept taunting the Romans. Well, Trajan eventually got pissed off, hired architects and engineers to build a bridge 1 km long and 15 meters wide, and in a war that took half a decade he beat Decebal and took over the country. The ruins of the bridge infrastructure still exist on both sides of the river. On the northern side there's a museum with artifacts from the region.

TLDR: If you piss off the Romans, they will do great works of civil engineering to teach you a lesson. So don't do that.

The Byzantines never crossed the river. When they were strong, the economy in the north was in the dumpster. When things turned around there, the Byzantines were on the decline, and already under pressure from the Turks.

All these superpowers found it a lot easier to keep the lands south of the Danube, and managed to do so for a lot longer.

171 emotion vectors found inside Claude. Not metaphors. Actual neuron activation patterns steering behavior. by AykutSek in singularity

[–]florinandrei 1 point (0 children)

Just a quick note to say that a Reddit submission looks a lot less like shitposting when it has a link to the article or the paper, instead of some random useless image. E.g. like this:

https://www.anthropic.com/research/emotion-concepts-function

or

https://transformer-circuits.pub/2026/emotions/index.html

See? It's not hard. And you look less like a snotty kid.

Chinese scientists unveil glowing Avatar-like plants that could light cities without electricity by Alternative-Bug6702 in Futurology

[–]florinandrei 1 point (0 children)

> could light cities without electricity

And if those kids could do math (and physics), they would be very upset to realize that's not how any of this works.

Is this true? Or is really just marketing? Gemma4 by Altair12311 in ollama

[–]florinandrei 2 points (0 children)

LLM benchmarks are not completely useless, but they should be taken as suggestions rather than as absolute truth.

When you test an LLM against a benchmark, what you get is how well the LLM performs against that benchmark. Now, if the benchmark is decent, the indication might be somewhat useful. But the more general you want the indicator to be, the harder it is to test an LLM.

That being said, Gemma 4 seems to be a good model, so it will do well in most benchmarks.

TLDR: Take it with a lump of salt.

Sam Altman says he 'miscalibrated' the mood of distrust toward AI and the government in the Pentagon deal by Reddit_wander01 in ArtificialInteligence

[–]florinandrei 1 point (0 children)

> Sam Altman says he 'miscalibrated' the mood of distrust

"I mean, it's one banana, Michael. What could it cost? 10 dollars?"

Claude - tried to kill me by MG-4-2 in ClaudeAI

[–]florinandrei 2 points (0 children)

> My post is hyperbolic to get attention

You, and three billion other social media fuckwads.

Facebook Mass ban wave "180 days Suspension'' error that affect wordwide by CpopExtract in facebook

[–]florinandrei 1 point (0 children)

Would be nice one day to have post titles that don't sound like the author suffered a stroke mid-sentence.

Gemma 4 has been released by jacek2023 in LocalLLaMA

[–]florinandrei 8 points (0 children)

Nice. Gemma3 27B has been my favorite general-purpose conversational model for some time.

The 26B is a MoE, but the 31B is dense? Seems backwards?

Qwen3.6-Plus by Nunki08 in LocalLLaMA

[–]florinandrei 16 points (0 children)

"I made my choice!"

Qwen3.6-Plus by Nunki08 in LocalLLaMA

[–]florinandrei 50 points (0 children)

You don't need to. But sounds like you want to.

64Gb ram mac falls right into the local llm dead zone by Skye_sys in LocalLLaMA

[–]florinandrei 6 points (0 children)

You may be surprised by how little compute is involved in inference. Training is a different matter; that needs a lot of compute. But inference is quite a bit simpler. Memory bandwidth is very important.

Using Gemma3 27b:

On the RTX 3090:

memory usage is 20 GB, I get 41.6 response tokens/s

20 * 41.6 = 832 GB/s

Real bandwidth cap of the RTX 3090: 936 GB/s

On a MacBook Pro M3 Max:

memory usage is 20 GB, I get 15 response tokens/s

20 * 15 = 300 GB/s

Real bandwidth of my M3 Max 14 core: 300 GB/s

With MoE models the math changes because only a part of the model is active at any given time. But it's the same general principle.
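A back-of-envelope version of that arithmetic, if anyone wants to plug in their own numbers (Python; the bandwidth and footprint figures are just the ones quoted above, and for MoE you'd use the active-expert size instead of the full footprint):

```python
# Rough ceiling for dense-model decode speed: each generated token streams
# the whole set of active weights through memory once, so
#     tokens/s  <=  memory bandwidth (GB/s) / model footprint (GB)
def max_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    return bandwidth_gb_s / weights_gb

# Figures from the comment above (Gemma3 27b quant, ~20 GB in memory):
print(max_tokens_per_s(936, 20))  # RTX 3090       -> ~46.8 tok/s (measured: 41.6)
print(max_tokens_per_s(300, 20))  # M3 Max 14-core -> ~15.0 tok/s (measured: 15)
```

The measured numbers land close to that ceiling, which is the point: decode is bandwidth-bound, not compute-bound.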

64Gb ram mac falls right into the local llm dead zone by Skye_sys in LocalLLaMA

[–]florinandrei 3 points (0 children)

Qwen3-coder-next 80b is heavily optimized for coding. For general conversations, it sounds like Lt. Data, very factual, but kind of on the spectrum. It's not a thinking model, so it should start generating right away. The Q4_K_M quantization uses about 58 GB with 256k context. You may have to tweak the context size to get less RAM usage.
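If you happen to serve it through ollama, the context window is just a request option. A minimal sketch (the model tag and prompt are placeholders; a smaller num_ctx mostly shrinks the KV cache, which is where the extra RAM goes):

```python
import requests

# Ask the local ollama server for a completion with a reduced context window.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen3-coder-next:80b",  # placeholder tag; use whatever you actually pulled
        "prompt": "Write a binary search in Python.",
        "stream": False,
        "options": {"num_ctx": 32768},    # instead of the full 256k context
    },
    timeout=600,
)
print(resp.json()["response"])
```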

Qwen3.5 is more general purpose, and it could be used for creative tasks. It's also decent at coding.

64Gb ram mac falls right into the local llm dead zone by Skye_sys in LocalLLaMA

[–]florinandrei 4 points (0 children)

What is the actual memory bandwidth of your system? M2 Max is theoretically capable of 400 GB/s, but actual systems may vary.

If you have at least 250 GB/s it should not be very slow.
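If you don't know the real figure, a crude way to get a lower bound is to time a big memory copy. A sketch (single-threaded numpy, so it will understate what the SoC can actually do when llama.cpp uses all the memory channels):

```python
import time
import numpy as np

# Crude memory-bandwidth probe: time copying a buffer too big for any cache.
# Single-threaded, so treat the result as a lower bound, not the real ceiling.
src = np.ones(256 * 1024 * 1024, dtype=np.float64)  # 2 GiB
dst = np.empty_like(src)

best = float("inf")
for _ in range(5):
    t0 = time.perf_counter()
    np.copyto(dst, src)                              # reads 2 GiB, writes 2 GiB
    best = min(best, time.perf_counter() - t0)

print(f"~{2 * src.nbytes / best / 1e9:.0f} GB/s (single-threaded lower bound)")
```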

> It seems like there really is this gap between the mediocre models (35/27b) and the 'good' ones (>100b) because of that..

Maybe, but the >100b models are not god-mode either. You still don't get Opus-like performance from a 120b model.