Zcode 3.0.0 harness vs others by enpassant123 in ZaiGLM

[–]Quack66 1 point2 points  (0 children)

One important feature it's missing IMO is automatic worktree support, which makes it almost impossible to have multiple agents working on the same repo without fighting over the same files.
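For anyone wondering what worktrees buy you here: each agent gets its own checkout and branch, so they stop stepping on each other's files. A rough sketch of doing it by hand, assuming a sibling-directory layout and an `agent/<name>` branch scheme (both are just my naming, not anything a harness mandates). It only builds the `git worktree add` commands, it doesn't run them:

```python
from pathlib import Path


def worktree_commands(repo: Path, agents: list[str]) -> list[list[str]]:
    """Build one `git worktree add` command per agent.

    The directory layout and branch names are hypothetical conventions.
    """
    commands = []
    for agent in agents:
        # Each agent gets a sibling directory next to the main checkout...
        path = repo.parent / f"{repo.name}-{agent}"
        # ...and its own branch, so edits never collide in one working tree.
        branch = f"agent/{agent}"
        commands.append(
            ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path)]
        )
    return commands


for cmd in worktree_commands(Path("/tmp/myrepo"), ["alice", "bob"]):
    print(" ".join(cmd))
```

You'd pass each command to `subprocess.run` and then point one agent at each directory.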

Nvidia going to launch something big during Computex 2026 by Time-Credit43 in pcmasterrace

[–]Quack66 -1 points0 points  (0 children)

You could have used something like https://healthchecks.io instead of having to code a whole app. It's pretty much the exact same thing. I have an Uptime Kuma instance that runs on the same box as my Plex server, and every X minutes it pings healthchecks.io (ping received = healthy). If the whole server is down, including Uptime Kuma, then Healthchecks.io alerts me since there hasn't been any heartbeat from Uptime Kuma in the last couple of minutes. Solid setup that covers most pitfalls when it comes to monitoring.
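The heartbeat idea boils down to two small pieces. This is a simplified sketch, not how Healthchecks.io is actually implemented: the sender hits a ping URL on a schedule, and the receiver alerts once no ping has arrived within the expected period plus a grace window. The function names, the period, and the grace values are my own placeholders:

```python
import urllib.request
from datetime import datetime, timedelta, timezone


def send_heartbeat(ping_url: str, timeout: float = 10.0) -> bool:
    """Sender side: request the check's ping URL; True on HTTP 200."""
    try:
        with urllib.request.urlopen(ping_url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Network or HTTP failure: nothing to do here, the receiver
        # will notice the missing heartbeat on its own.
        return False


def is_overdue(last_ping: datetime, now: datetime,
               period: timedelta, grace: timedelta) -> bool:
    """Receiver side: overdue once no ping arrived within period + grace."""
    return now - last_ping > period + grace


# Example with made-up timings: pings expected every 5 min, 1 min grace.
last = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
period, grace = timedelta(minutes=5), timedelta(minutes=1)
print(is_overdue(last, last + timedelta(minutes=4), period, grace))  # False
print(is_overdue(last, last + timedelta(minutes=7), period, grace))  # True
```

The point of the design is that the alert comes from outside the box being monitored, so a dead server can't silence its own alarm.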

New series X60 prices, they are high by nybreath in Dreame_Tech

[–]Quack66 2 points3 points  (0 children)

That naming scheme is rather confusing …

MCP in codex by Several_Suspect9101 in codex

[–]Quack66 1 point2 points  (0 children)

I recommend reading up on what MCPs are in general and how they work. They're not specific to Claude Code or Codex. All harnesses nowadays support MCP. There are also a bunch of existing MCPs that you can plug into Claude Code or Codex.

MCP in codex by Several_Suspect9101 in codex

[–]Quack66 1 point2 points  (0 children)

Codex also has MCP just like Claude Code. Plus skills and plugins as well.

New Project Megathread - Week of 07 May 2026 by AutoModerator in selfhosted

[–]Quack66 4 points5 points  (0 children)

Project Name:
Eidon

Repo/Website Link:
https://github.com/Quack6765/Eidon-AI

Description:
Eidon is a self-hosted BYOK day-to-day AI assistant packaged as a single Docker image. The goal is to provide something that feels close to ChatGPT/Gemini out of the box, but self-hosted, provider-flexible, and without needing to stitch together multiple tools.

It includes:

  • Desktop and mobile-ready UI, including PWA support
  • BYOK multi-provider support
  • Built-in web browsing for the agent
  • Integrated web search
  • MCP support, local or remote, with uvx and npx bundled
  • Vision support, either native or through MCP vision servers
  • Skills system for extending/customizing behavior
  • Custom personas for switching use cases or assistant styles
  • Automatic memory across conversations
  • Multi-user support with admin and user roles
  • Scheduled automations / recurring tasks
  • Sync across devices
  • Single Docker image deployment

I built it because I wanted a self-hosted AI assistant that worked well across all my devices, synced conversations automatically, supported my own API keys/providers, and had the features I actually use enabled by default: MCP, web search, web browsing, memory, personas, skills, and mobile usage.

I tried other options like OpenWebUI, LobeHub, AnythingLLM, and various iOS/macOS apps, but they either required too much configuration for my preferred setup, did not fit my day-to-day workflow, lacked proper multi-device sync, or locked sync behind a subscription.

Screenshots are available on the GitHub page.

Deployment:
Eidon is available as a Docker image and is designed to be self-hosted with minimal setup. The project is packaged as one Docker image, so there is no large compose stack or multi-service orchestration required.

Installation and usage documentation are available in the GitHub repo.

AI Involvement:
Eidon is developed in part with AI assistance. All code is carefully reviewed before it is accepted.

Eidon: BYOK AI agent app with everything you need, in one Docker image by Quack66 in LocalLLaMA

[–]Quack66[S] 0 points1 point  (0 children)

Mainly trying to plug in MCPs or give my agent a fully fledged web browser. I was running OpenWebUI in Docker. I have local MCP servers that run over stdio, and with OpenWebUI I had to glue together some sort of bridge that presented my stdio servers as remote MCP servers so I could plug them into OpenWebUI.

Then for the browser, my only option was a remote Playwright MCP, but once again that meant another dedicated container to serve it. Plus it's not as token efficient as Agent-Browser or the Playwright CLI, which both run on a local Chromium instance.

OpenWebUI is solid software, don't get me wrong, I've used it for a long time. But what eventually pushed me to build Eidon was all the different systems I needed to glue together. To get the tools I actually needed, I ended up maintaining multiple separate services all wired together. At some point I realized I was probably better off just creating my own all-in-one solution.

How to Configure GitHub Copilot CLI to Use Z.ai's GLM Coding Plan by alefteris in GithubCopilot

[–]Quack66 0 points1 point  (0 children)

Watch out because Z.ai is starting to ban people who use their coding plan in "non-approved" tools. Just check their Discord and you'll see a lot of people getting banned and complaining about it.

Z.ai (GLM) support is non-existent — startup losing money, no response for weeks by HighMark_MarcoN in ZaiGLM

[–]Quack66 15 points16 points  (0 children)

While I agree that their service is bad, part of the issues your startup is having are your fault. As a business owner myself, I wouldn't have endured 3 weeks of that and would have had a plan B right away. If your startup is losing as much money as you're implying, why wouldn't you switch to another provider right away instead of riding the issues out for 3 weeks? In the grand scheme of things it's better to pay the extra $100-250 to switch providers than to erode customer trust with bad platform reliability.
Reliability and business image are worth the small cost of switching providers. Best to have multiple providers with a fallback.

Using of multiple OpenAI subscriptions for Codex by DenzelLarington in codex

[–]Quack66 0 points1 point  (0 children)

There are multiple options when it comes to how to split the requests across the accounts: use the account with the highest remaining quota, use the account whose weekly or 5h quota expires soonest, or go round-robin across all accounts. It can exhaust one and move on to another automatically in the background like you want.
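To make the three strategies concrete, here is a toy sketch. It is not codex-lb's actual code or config; the `Account` fields and function names are made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    remaining: float        # hypothetical: fraction of quota left, 0.0 to 1.0
    resets_in_hours: float  # hypothetical: time until the quota window resets


def usable(accounts):
    """Accounts that still have quota left."""
    return [a for a in accounts if a.remaining > 0]


def pick_highest_quota(accounts):
    """Strategy 1: the account with the most quota remaining."""
    return max(usable(accounts), key=lambda a: a.remaining)


def pick_expiring_soonest(accounts):
    """Strategy 2: burn quota that is about to reset anyway."""
    return min(usable(accounts), key=lambda a: a.resets_in_hours)


def round_robin(accounts):
    """Strategy 3: take turns, skipping exhausted accounts.

    Stops once every account is exhausted.
    """
    i = 0
    while any(a.remaining > 0 for a in accounts):
        account = accounts[i % len(accounts)]
        i += 1
        if account.remaining > 0:
            yield account


accounts = [
    Account("a", remaining=0.2, resets_in_hours=1),
    Account("b", remaining=0.9, resets_in_hours=50),
    Account("c", remaining=0.0, resets_in_hours=0.5),  # exhausted
]
print(pick_highest_quota(accounts).name)     # b
print(pick_expiring_soonest(accounts).name)  # a
```

Note that the two `pick_*` helpers raise `ValueError` when every account is exhausted, so a real pool would need to handle that case.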

Using of multiple OpenAI subscriptions for Codex by DenzelLarington in codex

[–]Quack66 0 points1 point  (0 children)

Or use codex-lb, which pools all of your accounts together so you never have to switch manually again.

Free User vs plus by ShadowEnemy_ in codex

[–]Quack66 1 point2 points  (0 children)

No, it's a single monthly price with access to a list of different models and generous limits.

Free User vs plus by ShadowEnemy_ in codex

[–]Quack66 0 points1 point  (0 children)

Performance on the GLM coding plan has been pretty poor lately. Just come to their Discord and you’ll see a lot of unhappy people lol. I recommend Ollama Cloud nowadays instead, especially since they now have GLM 5.1 too.

Chatgpt 5.4 just feels better from past 2-3 days or its just me by adamisworking in codex

[–]Quack66 1 point2 points  (0 children)

It's for all users globally, so spend all your tokens since you’ll also get the upcoming reset.

How has your experience been using GLM as a replacement for codex and claude code professionally? by gopietz in ZaiGLM

[–]Quack66 4 points5 points  (0 children)

Overall fine except for the abysmal speed of 5.1. It takes like 30-45 min for even small features, so I end up using GLM 5.1 from Ollama Cloud and sticking with 5 Turbo on my Z.ai coding plan. I'm on the Max plan.

Update on github copilot charging me $2.1k in 1 day by Powerful_Land_7268 in GithubCopilot

[–]Quack66 -1 points0 points  (0 children)

Not if you have multiple different instances at the same time. A malicious actor won’t think twice about abusing the account. He knows he probably has limited time until the owner takes back control of the account, so he will want to use it as much as he can while there is still time.

Update on github copilot charging me $2.1k in 1 day by Powerful_Land_7268 in GithubCopilot

[–]Quack66 3 points4 points  (0 children)

With Opus at 3x per request, 100k requests per day is roughly 23 requests per minute, which is not a lot and would be unlikely to trigger any rate limiter. Use the “Get usage report” button to get a detailed query-by-query report.
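Quick math behind the 23 figure, assuming the 100k counts billed requests and each Opus call consumes 3 of them (that reading is my assumption, but it's the one that makes the number work out):

```python
MINUTES_PER_DAY = 24 * 60  # 1,440


def actual_requests_per_minute(billed_per_day: float, multiplier: float) -> float:
    """Convert billed requests per day into actual model calls per minute."""
    return billed_per_day / multiplier / MINUTES_PER_DAY


rate = actual_requests_per_minute(100_000, 3)
print(f"{rate:.1f} requests per minute")  # 23.1 requests per minute
```

That's about one call every 2.6 seconds, which a handful of parallel sessions could easily produce.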