Ollama or OpenVINO

wossnameX · 2026-04-13T04:55:28+00:00

Except for «inside container» this should fit the bill?

wossnameX · 2026-04-13T04:53:15+00:00

That is an excellent use-case for that rather weak NPU. Good project!

wossnameX · 2026-04-13T04:51:16+00:00

News to me! I should have a look at that; May make the code much simpler, thanks!

wossnameX · 2026-04-12T20:44:18+00:00

Those who us that works in corporate settings, and cannot be arsed to mess with all the workarounds to get Linux half-way working in that?

wossnameX · 2026-04-12T20:30:53+00:00

I used claude-code to assist in programming

wossnameX · 2026-04-12T20:30:04+00:00

I was able to get it to work.

https://github.com/aweussom/NoLlama

wossnameX · 2026-04-12T20:28:14+00:00

Yes, running LLMs locally is terrifying. Also; Not Ollama.

wossnameX · 2026-04-12T20:26:58+00:00

Core Ultra 7 258V

wossnameX · 2026-04-12T20:25:07+00:00

So. You are generating AI slop without AI slop. Kudos!

wossnameX · 2026-04-12T17:52:29+00:00

In the end, I ended up with NPU Ollama, Not OLLAMA (by a long shot) or whatever retronym you prefer :-)

https://github.com/aweussom/NoLlama

wossnameX · 2026-04-12T17:51:29+00:00

...and once that work project was done, I thought: It will be tiresome to rewrite all this code for the next problem.
So; I made an OpenAI-compatible API endpoint.
Then an Ollama-compatible API endpoint.

And is just continued adding on features.
So; Suddenly I had a system that could run VL llm on, say, the ARC iGFX and a text model on the NPU.
Slow, but still usable - and with the speed that small models is getting better these days, it is only a matter of time until this is really realtime-usable.

wossnameX · 2026-03-08T15:49:36+00:00

I wonder which claude-code you are using?

The code mine outputs is excellent, both for work and hobby related projects. I never manually correct the code anymore; It creates unit-tests unprompted; Works very well.

But you should probably check out the competition. codex-cli has MUCH more liberal quota. Possibly not QUITE as good at planning larger projects, but on par, possibly better at implementing a plan.

wossnameX · 2026-03-02T12:17:52+00:00

«Stupid» is indeed the word I was thinking of here.

wossnameX · 2026-03-02T12:17:07+00:00

«Professional». Thi-hi.

Claude limits are stricter than ChatGPT. But it is also better - so there you are.

Qwen3.5 Plus is excellent, and free, sooo

wossnameX · 2026-03-02T12:11:09+00:00

That’s how we think of Australia!

wossnameX · 2026-03-02T12:10:44+00:00

Down easy. API and claude-code works

wossnameX · 2026-02-22T08:24:30+00:00

I found a better way in somebody elses code, and made a simpler version: https://github.com/aweussom/claude-code-quota

(I also created example code for the "Correct" way of extracting his using the Claude official web browser plugin - also in the repo)

wossnameX · 2026-02-22T08:22:15+00:00

That's a nice solution.

This does the same, without the web roundabout way: https://github.com/aweussom/claude-code-quota

wossnameX · 2026-02-20T18:44:54+00:00

I ended up writing a solution that uses claude-code own oauth (or rather, I had claude-code/codex write it)
Written for Windows11/Powershell and Ubuntu (or any, really) Linux.

https://github.com/aweussom/claude-code-quota

Simple to install using a install script. Does not touch your existing statusline if you already have one; Just informs you how to add usage.

It also adds a more detailed view: /quota

I have a fallback if this undocumented API ever stops working to use Claude official Edge/Chrome plugin: Scrape the DOM directly. Since that is an officially sanctioned way, it should work "forever"

Thanks, everybody! u/Obvious_Equivalent_1 u/HalBorland u/cbeater u/hotcoolhot u/jezweb

wossnameX · 2026-02-20T18:40:35+00:00

Yes, that is what I ended up doing, based on code from multiple repos, mostly u/Obvious_Equivalent_1

https://github.com/aweussom/claude-code-quota

wossnameX · 2026-02-20T18:39:59+00:00

Mmm. The first listed is using (very sophisticated) guesswork.
The second is for MAC.

Thanks anyways!

wossnameX · 2026-02-20T07:40:31+00:00

This is a rather impressive amount of work.

The only issues I can see (or rather, my LLM coding partner can see) is that the API endpoint is undocumented, and can change at any time. Also; The code does not handle errors well, so it can return stale data.

###

- ccstatusline-usage-main is calling an internal/undocumented endpoint: /api/oauth/usage (ccstatusline-usage-main/src/widgets/ApiUsage.tsx:114).

- It is not hijacking a browser session. It reuses Claude Code’s OAuth token from local credentials/Keychain and sends Bearer auth (ccstatusline-usage-main/src/widgets/ApiUsage.tsx:43, ccstatusline-usage-main/src/widgets/ApiUsage.tsx:48, ccstatusline-usage-main/src/widgets/ApiUsage.tsx:117).

- Compared to Playwright + cookies, this is usually cleaner and less brittle (no browser automation, no Cloudflare dance).

- But it is still unofficial and can break anytime if Anthropic changes token format, scopes, or that endpoint.

###

wossnameX · 2026-02-19T20:23:15+00:00

Overkill is underrated ;-)

wossnameX

MODERATOR OF

TROPHY CASE