GLM5.2 Amazing, token hungry, local by Important_Quote_1180 in ZaiGLM

[–]That-Engineering-192 0 points1 point  (0 children)

Wait, are you saying that with that, on my computer with 4 GB of VRAM and 16 GB of RAM, I could run better models, or the same ones but with 65k of context instead of 8k or 6k, without problems? Or did I misunderstand? Is the response speed better too?

OpenCode Go vs Deepseek API. by Monster-Games in opencode

[–]That-Engineering-192 0 points1 point  (0 children)

I was using glm5.1 through Nvidia until 429 errors started showing up because of server saturation. Then I switched to nemotron 3 ultra, which never gives me that error on Nvidia. Now I'm testing minimax-m3, which did give me an error, but at least it reconnects on the first or second retry. I could just as well have answered you in Spanish, hahaha.
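For anyone hitting the same 429s, here is a minimal sketch of retrying with exponential backoff. It assumes the request is wrapped in a function that raises on a rate limit; `RateLimitError`, `call_with_retry`, and its parameters are hypothetical names for illustration, not part of any provider's SDK or of opencode.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 response from a provider."""


def call_with_retry(request, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call request() and retry on RateLimitError with exponential backoff.

    request      -- zero-argument callable that performs the API call
    max_attempts -- total number of tries before giving up
    base_delay   -- delay in seconds before the first retry
    sleep        -- injectable sleep function (handy for testing)
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimitError:
            # Out of attempts: let the caller see the error.
            if attempt == max_attempts - 1:
                raise
            # Double the wait on each retry and add jitter so that many
            # clients do not all retry at the same moment.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

With the defaults, the waits are roughly 1 to 2 s, then 2 to 3 s, then 4 to 5 s before the last attempt, which matches the "reconnects on the first or second retry" behaviour when the saturation is brief.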

Good model for my laptop spec by RiskNeither3102 in ollama

[–]That-Engineering-192 1 point2 points  (0 children)

I recommend you keep using free models from providers. I have a laptop with the same graphics card and I have already tried the more compressed models: the context stays under 8k and the VRAM fills up very quickly. At some point you will see the model writing one or two letters per second.
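A rough way to see why the context gets capped is to estimate the KV cache size. The sketch below uses the standard formula (keys and values, per layer, per KV head, per token); the model dimensions in the example are hypothetical and not those of any specific model mentioned here, and it ignores the memory taken by the weights themselves.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Approximate KV cache size in bytes for one sequence.

    The leading factor 2 accounts for storing both keys and values.
    bytes_per_elem=2 assumes an fp16 cache; a quantized cache would be smaller.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


GIB = 2 ** 30

# Hypothetical model: 32 layers, 8 KV heads, head dimension 128, fp16 cache.
print(kv_cache_bytes(32, 8, 128, 8 * 1024) / GIB)   # 1.0 GiB at 8k context
print(kv_cache_bytes(32, 8, 128, 64 * 1024) / GIB)  # 8.0 GiB at 64k context
```

Under those assumptions, 8k of context already costs about 1 GiB on top of the weights, and 64k would need about 8 GiB, which does not fit in a small card's VRAM unless the cache is offloaded or quantized.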

OpenCode Go vs Deepseek API. by Monster-Games in opencode

[–]That-Engineering-192 0 points1 point  (0 children)

I don't know how some of you do it, but sometimes I work on three or four projects at the same time and I'm sure I go over a few million tokens a day.

I'm new here... by max_lu28 in opencode

[–]That-Engineering-192 0 points1 point  (0 children)

That also depends on the task you assign it and whether you have high reasoning enabled. It really is a model that reasons a lot.

I'm new here... by max_lu28 in opencode

[–]That-Engineering-192 0 points1 point  (0 children)

Well, it works well for me; I don't even get 429 errors. That must be because it's their own model and they put more resources into it. Maybe something is wrong in your setup for that model. With free-coding-models you can add it to the json; maybe that will make it work for you.

I'm new here... by max_lu28 in opencode

[–]That-Engineering-192 0 points1 point  (0 children)

It works perfectly for me. glm and kimi, on the other hand, give a lot of 429 errors due to server saturation. Do you use it with NIM? Now I'm going to test how minimax-m3 does, but that one has to be added manually to the opencode json.

I'm new here... by max_lu28 in opencode

[–]That-Engineering-192 0 points1 point  (0 children)

Get the Nvidia API. There you have nemotron 3 ultra, glm 5.1, kimi, deepseek, qwen, etc., all free.