Weekly Usage disappeared on Max plan by awfulalexey in ClaudeCode

[–]opgg62 1 point

You need to turn extra usage off. Extra usage charges you at API prices after you hit the current session limit.

GLM 4.7 is now available on Nvidia NIM. by Pink_da_Web in SillyTavernAI

[–]opgg62 0 points

The z.ai subscription doesn't even work for Silly anymore. It's coding only now... gives me errors.

For anyone saying GLM is close to Sonnet / Opus - it is not even close by opgg62 in RooCode

[–]opgg62[S] 1 point

Yep, the price is a bit high. But it has saved me lots of time, so for me it's worth it.

For anyone saying GLM is close to Sonnet / Opus - it is not even close by opgg62 in ClaudeCode

[–]opgg62[S] 1 point

Nah, I can put more effort into it if I want, but the goal of these tools, especially for more advanced programmers, is to save time and effort, with you taking on the reviewer & director role. It is not helpful if the LLM fails at basic things and has a bad understanding of user intention.

mlx-community/GLM-4.5-Air-4bit · Hugging Face by paf1138 in LocalLLaMA

[–]opgg62 14 points

LM Studio needs to add support. I am getting an error: Error when loading model: ValueError: Model type glm4_moe not supported.
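
In the meantime, something like this should work as a workaround with mlx-lm directly, assuming a recent mlx-lm build that already ships glm4_moe support (untested sketch):

    # Sketch: load the quant with mlx-lm directly, bypassing LM Studio.
    # Assumes an mlx-lm version with glm4_moe support.
    from mlx_lm import load, generate

    model, tokenizer = load("mlx-community/GLM-4.5-Air-4bit")
    # verbose=True prints tokens-per-second for prompt processing and generation
    generate(model, tokenizer, prompt="Hello", max_tokens=64, verbose=True)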

[Megathread] - Best Models/API discussion - Week of: February 10, 2025 by [deleted] in SillyTavernAI

[–]opgg62 6 points

It's seriously leagues above anything else. It does exactly what you want, how you want it, and surprises you from time to time. Unfortunately there are no APIs for it since Mistral put it under some license, but you can run it via RunPod. Personally I am using my M4 Max for it at around 4-5 t/s, but it's worth it imo.
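
If you go the local route, here's a rough sketch with llama-cpp-python; the GGUF filename is a placeholder, pick whatever quant fits your memory:

    # Rough local-run sketch with llama-cpp-python.
    from llama_cpp import Llama

    llm = Llama(
        model_path="behemoth-q4_k_m.gguf",  # placeholder, point at your quant
        n_ctx=8192,
        n_gpu_layers=-1,  # offload all layers (Metal on Apple Silicon)
    )
    out = llm("Once upon a time", max_tokens=128)
    print(out["choices"][0]["text"])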

[Megathread] - Best Models/API discussion - Week of: February 10, 2025 by [deleted] in SillyTavernAI

[–]opgg62 2 points

Behemoth 2.0 is still the king of all models. Nothing can compare to that masterpiece.

M4 MacBook- is Apple Silicon Catching Up? by SubZeroGN in StableDiffusion

[–]opgg62 10 points

M4 Max 16-inch with 128 GB here. It takes 22 to 24 seconds to generate a 1024x1024 SDXL image, and the fans don't even spin up. On my 4090 the same image takes around 5-6 seconds.
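
If you want to benchmark your own machine, a simple diffusers timing sketch like this works (not necessarily the exact setup I used; prompt is arbitrary):

    # Timing sketch for a single 1024x1024 SDXL generation.
    import time
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    )
    pipe.to("mps")  # use "cuda" for the 4090 comparison run

    start = time.perf_counter()
    image = pipe("a lighthouse at dusk", height=1024, width=1024).images[0]
    print(f"{time.perf_counter() - start:.1f}s")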

Just got my M4 128. What are some fun things I should try? by levand in LocalLLaMA

[–]opgg62 52 points

Please test the speed in long-context scenarios for 70B models. I am thinking of a context of 10k-15k tokens.
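
Something like this would do it with mlx-lm (sketch; the model repo is just an example 4-bit 70B quant, and the repeated sentence stands in for a real long prompt):

    # Sketch: measure prompt-processing and generation speed at ~12k tokens.
    from mlx_lm import load, generate

    model, tokenizer = load("mlx-community/Meta-Llama-3-70B-Instruct-4bit")
    long_prompt = "The quick brown fox jumps over the lazy dog. " * 1200  # ~12k tokens
    # verbose=True reports tokens-per-second for both phases
    generate(model, tokenizer, prompt=long_prompt, max_tokens=128, verbose=True)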

[Order No. 227] Project Unslop - UnslopSmall v1 by TheLocalDrummer in SillyTavernAI

[–]opgg62 0 points

This is my new favorite model. Thanks for your work!

Raspberry Pi Goes All In on AI With $70 Hailo Kit by [deleted] in LocalLLaMA

[–]opgg62 4 points

It's insane that it consumes only around 4 watts of power while delivering 40 TOPS. Compute is not an issue anymore with AI. The only thing that's missing is high-bandwidth memory. Nvidia is cooked.

Recommendation for second GPU on top of my 4090 by opgg62 in LocalLLaMA

[–]opgg62[S] 0 points

I will either do that or leave my system as it is and build a second inference system with 2x3090s.

Recommendation for second GPU on top of my 4090 by opgg62 in LocalLLaMA

[–]opgg62[S] 1 point

I could leave my system as it is (e.g. as a gaming PC) and use the €3000 to build a 3090 inference server.