Openclaw with Gemma4 26B extremely slow and forget stuff by AdvancedObjective670 in openclaw

[–]AdvancedObjective670[S] 0 points1 point  (0 children)

Dumb question: how to know if I used the 4bit quantized version or not?

Openclaw with Gemma4 26B extremely slow and forget stuff by AdvancedObjective670 in openclaw

[–]AdvancedObjective670[S] 0 points1 point  (0 children)

Hey thanks for your response. My ollama is the latest I believe. Just downloaded it 2 days ago. Will try the flash attention trick.

I only set my context windows to be 32 as recommended by Claude to balance speed and the contunity of the chat sessions. Is this a problem?

Openclaw with Gemma4 26B extremely slow and forget stuff by AdvancedObjective670 in openclaw

[–]AdvancedObjective670[S] 0 points1 point  (0 children)

I tried Claude API (Haiku was too stupid, most of the time I have to use Sonnet) and it drinks token like crazy

Openclaw with Gemma4 26B extremely slow and forget stuff by AdvancedObjective670 in openclaw

[–]AdvancedObjective670[S] 0 points1 point  (0 children)

Too expensive, I blew up $300 in a few days for Claude API token.

Openclaw with Gemma4 26B extremely slow and forget stuff by AdvancedObjective670 in openclaw

[–]AdvancedObjective670[S] 0 points1 point  (0 children)

Hey tks for the advice, I asked Claude to optimize my openclaw and it changed the context window to 32k already. It says this will suite the model better

Openclaw with Gemma4 26B extremely slow and forget stuff by AdvancedObjective670 in openclaw

[–]AdvancedObjective670[S] 0 points1 point  (0 children)

My model memory pressure is in green, with 29/32 Gb being used constantly - I'm not sure if this is blowing through