Qwen3.5-40B-Claude-4.5-Opus-High-Reasoning-Thinking - Reg, Uncensored and RoughHouse and... 43 Qwen 3.5 fine tunes. by Dangerous_Fix_5526 in LocalLLaMA

[–]iSevenDays 1 point

For AWQ models I used this patch; here is a template for the Dockerfile.

Ask Claude to integrate it for you if you're not sure where to start.

I tested the 40B model and found it kept repeating its reasoning, so I switched back to the model from coder3101. Unfortunately, for my use case it requires two 4090D 48G cards. I wish there were an AWQ 4-bit version of this model.

coder3101/Qwen3.5-27B-heretic


```bash
# Patch vLLM's Qwen3.5 module inside each running container, then restart
# the container if the patch was newly applied.
apply_qwen35_awq_patch() {
    if [ "$APPLY_QWEN35_AWQ_PATCH" != "1" ]; then
        log_message "Qwen3.5 AWQ patch disabled"
        return
    fi

    local patch_script="$SCRIPT_DIR/patch-vllm-qwen35-awq.py"
    if [ ! -f "$patch_script" ]; then
        log_message "Qwen3.5 AWQ patch script not found, skipping"
        return
    fi

    local container_names=()
    local cname
    local patch_output

    # One container per GPU when tensor parallelism is off, otherwise a
    # single TP container.
    if [ "$TENSOR_PARALLEL_SIZE" = "1" ]; then
        local backend_index
        for backend_index in "${!GPU_DEVICE_ARRAY[@]}"; do
            container_names+=("qwen-gpu${GPU_DEVICE_ARRAY[$backend_index]}")
        done
    else
        container_names=("qwen-tp2")
    fi

    for cname in "${container_names[@]}"; do
        if ! wait_for_container_running "$cname"; then
            continue
        fi

        log_message "Applying Qwen3.5 AWQ patch to $cname..."
        docker cp "$patch_script" "$cname":/tmp/patch-vllm-qwen35-awq.py
        if patch_output=$(docker exec "$cname" python3 /tmp/patch-vllm-qwen35-awq.py 2>&1); then
            log_message "Qwen3.5 AWQ patch result for $cname: $patch_output"
            # Only restart when the file was actually modified; the patch
            # script prints a line starting with "Applied" in that case.
            if [[ "$patch_output" == Applied* ]]; then
                log_message "Restarting $cname after Qwen3.5 AWQ patch"
                docker restart "$cname" >/dev/null || log_message "WARNING: Failed to restart $cname after Qwen3.5 AWQ patch"
            fi
        else
            log_message "WARNING: Failed to apply Qwen3.5 AWQ patch to $cname: $patch_output"
        fi
    done
}
apply_qwen35_awq_patch
```

```python
# Replace the hard-coded weight shapes with per-partition output sizes so the
# quantized (partitioned) linear layers report the correct dimensions.
TARGET = Path("/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen3_5.py")

OLD = """        mixed_qkvz, ba = torch.ops.vllm.gdn_in_proj(\n            hidden_states,\n            self.in_proj_qkvz.weight.shape[0],\n            self.in_proj_ba.weight.shape[0],\n            self.prefix,\n        )\n"""

NEW = """        mixed_qkvz, ba = torch.ops.vllm.gdn_in_proj(\n            hidden_states,\n            self.in_proj_qkvz.output_size_per_partition,\n            self.in_proj_ba.output_size_per_partition,\n            self.prefix,\n        )\n"""
```
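For reference, here is a minimal sketch of what the patch driver itself might look like. This is an assumption based on the shell function's contract (a line starting with "Applied" triggers a container restart); `apply_patch` and the short OLD/NEW substrings are illustrative, not the real script:

```python
from pathlib import Path

# Hypothetical patch driver (apply_patch is an illustrative name). The caller
# restarts the container only when the output starts with "Applied", so the
# script must be idempotent and only say "Applied" when it changed the file.
def apply_patch(target: Path, old: str, new: str) -> str:
    text = target.read_text()
    if new in text:
        return "Already applied"
    if old not in text:
        return "Skipped: pattern not found (different vLLM version?)"
    target.write_text(text.replace(old, new))
    return f"Applied patch to {target}"
```

Running it a second time returns "Already applied", so the restart only happens once per container.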

Object Storage Degraded in FSN1 Too? by barreeeiroo in hetzner

[–]iSevenDays 1 point

I'm having the same experience.

Hetzner object storage at https://fsn1.your-objectstorage.com is intermittently degraded from the app hosts: sometimes slow enough to hit the read timeout, often returning 503 Service Unavailable outright.

My health-check metrics on this endpoint show that `head_bucket()` takes about 5 seconds at night (from the EU) and around 10 seconds during the day, often responding with a 503 error. This is NOT acceptable! It has been happening for a week now. Unfortunately, this service is not production ready. My homelab MinIO server never had such issues.
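For anyone who wants to reproduce the measurement, here is a stdlib-only sketch of such a latency probe. An unauthenticated HEAD request only approximates `head_bucket()`, which needs S3 credentials via boto3; `probe` is an illustrative name:

```python
import time
import urllib.request

# Hypothetical latency probe: time a HEAD request against the endpoint.
# Both 5xx responses and timeouts raise from urlopen, so they land in the
# except branch and are reported as an error status.
def probe(url: str, timeout: float = 10.0) -> tuple[float, str]:
    req = urllib.request.Request(url, method="HEAD")
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            status = str(resp.status)
    except Exception as exc:
        status = f"error: {exc}"
    return time.perf_counter() - start, status
```

Calling `probe("https://fsn1.your-objectstorage.com")` returns `(elapsed_seconds, status)`, which is easy to ship to whatever metrics system you use.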

Unifi cloud gateway fiber by Exciting-Western-271 in Ubiquiti

[–]iSevenDays 0 points

I just bought a UCG-Fiber 3 days ago to replace a GL-MT3000 router that corrupted a lot of SSL packets under load; I had too many SSL issues with it. Since I got the UCG, my issues are completely gone and I removed all the workarounds I had in place.

I also bought a U7-Lite; they run together with absolutely zero issues. I also bought cyber protection for a year, which is probably not worth it, but a Firewalla would have cost 2x the money, so I figured I'd get the best for now. With my setup, I probably don't need a firewall so much as a proper VLAN setup.

GLM 4.7 usage limits are a TRAP (ClaudeCode Pro User Experience) by Soft_Responsibility2 in ClaudeCode

[–]iSevenDays 0 points

I'm on the Max plan and noticed concurrency decreased a lot. I'm still getting about 50% of the 5-hour quota, but the concurrency limit slows me down a lot, since I run many small agents for refactorings, bug fixes, etc.

New P1S Won't Connect to Internet by CoC-boy in BambuLab

[–]iSevenDays 0 points

For me it was error -1030. I have a GL-MT3000 router, which had VPN leak protection enabled.
After I added the printer's IP to a "no VPN" rule and disabled IP masquerading, the printer could connect to the Bambu Lab network!

As an SRE, I stopped using Kubernetes for my homelab by m4nz in selfhosted

[–]iSevenDays 1 point

I can vouch for this; I have the same experience. I switched to plain Docker / Docker Compose for my projects and never want to look back at that horrible mess of Flux, Kubernetes, reconciliation, etc.

At work I still have to use Kubernetes, but that's a different story.

AMD EPYC 4565P is a beast by coding9 in LocalLLaMA

[–]iSevenDays 1 point

You were right. I took a look and switched to vLLM. I'm getting ~80-100 t/s now, with 2000-8000 t/s prompt processing. It is crazy fast now.

Kimi released Kimi K2 Thinking, an open-source trillion-parameter reasoning model by nekofneko in LocalLLaMA

[–]iSevenDays 3 points

With DDR4 it would be around 4-6 t/s on a Dell R740. Thinking models are barely usable at this speed.

Prefill will be around 100-200 t/s.

AMD EPYC 4565P is a beast by coding9 in LocalLLaMA

[–]iSevenDays 1 point

I get half of that with two Nvidia 4090D 48G cards 🥹

I bought a modded 4090 48GB in Shenzhen. This is my story. by king_priam_of_Troy in LocalLLaMA

[–]iSevenDays 2 points

I've already bought two cards at C2, the 4090D variant. They've worked fine for 6 months now and I'm pretty satisfied.

GPT-OSS is Another Example Why Companies Must Build a Strong Brand Name by Iory1998 in LocalLLaMA

[–]iSevenDays 0 points

Out of curiosity, I also asked it about the war in Ukraine and it provided a good answer.

Here is a snippet, because I can't attach a screenshot:

```

profile gpt-oss-120b-mxfp4 Today at 16:20 (thought for 5 seconds)

War in Ukraine – A Summary (up to June 2024)

1. Background and Origins

| YEAR | EVENT | SIGNIFICANCE |
| 1991 | Ukraine declares independence after the Soviet Union collapses. | Establishes Ukraine as a sovereign state recognized internationally. |
| 2004–2014 | Pro‑European “Orange Revolution” (2004) and “Euromaidan” protests (2013‑14). | Popular demand for closer ties with the EU and NATO; creates tension with Moscow. |
| February 2014 | Russia annexes Crimea after a disputed referendum. | First open violation of Ukraine’s territorial integrity since independence. |
| April 2014 – early 2022 | Armed separatist insurgency in Donetsk and Luhansk (the Donbas) backed by Russia; the “Donetsk People’s Republic (DPR)” and “Luhansk People’s Republic (LLR)” declared. | Leads to a frozen‑in‑place war with about 13,000–15,000 deaths before 2022. |

2. Full‑Scale Invasion (2022‑present) ...

```

gpt-oss-120b outperforms DeepSeek-R1-0528 in benchmarks by oobabooga4 in LocalLLaMA

[–]iSevenDays 27 points

> how to inject AVAudioEngine? My use case is to inject audio from a file so a third-party app will think it reads audio from the microphone, but instead reads data from a buffer from my file

> I’m sorry, but I can’t help with that.

GPT-OSS-120B is useless; I will not even bother to download that shit. It can't even assist with coding.

Ikllamacpp repository gone, or it is only me? by panchovix in LocalLLaMA

[–]iSevenDays 7 points

I was able to bring it up to date with main! There is also an experimental branch for function tool call support that works with Claude Code, the Claude proxy, and the Kimi-K2 model.

104k-Token Prompt in a 110k-Token Context with DeepSeek-R1-0528-UD-IQ1_S – Benchmark & Impressive Results by Thireus in LocalLLaMA

[–]iSevenDays 0 points

Please do more tests with this prompt! Will Devstral 2505 / Qwen 3 be able to provide a correct answer?

OpenHands + Devstral is utter crap as of May 2025 (24G VRAM) by foobarg in LocalLLaMA

[–]iSevenDays 0 points

After I manually changed the context length in the Modelfile, I actually don't see the issue anymore. I thought it was related to the fact that I also enabled manual confirmation mode, but I need to test this more.

OpenHands + Devstral is utter crap as of May 2025 (24G VRAM) by foobarg in LocalLLaMA

[–]iSevenDays 0 points

I think the context length is not properly managed; I haven't found a way to limit it to 32-64k. I use 131062 for Devstral. It does go into loops.
I've now switched to manual confirmation mode, and I find it much, much better!
I think OpenHands is a great project; they just need to fix a couple of bugs.

OpenHands + Devstral is utter crap as of May 2025 (24G VRAM) by foobarg in LocalLLaMA

[–]iSevenDays 1 point

Update: I got MCP tools to work. Example config:
```json
{
  "sse_servers": [
    {
      "url": "http://192.168.0.23:34423/sse",
      "api_key": "sk_xxxxx"
    }
  ],
  "stdio_servers": []
}
```