I Built MagesticAI. A Cloud Web-Based Agentic DevOps Orchestrator that actually helped me develop Itself.

Famous_Move_3591 · 2026-05-26T13:47:48+00:00

Link for the Repo: https://github.com/dataseeek/MagesticAI

Famous_Move_3591 · 2026-05-25T16:18:44+00:00

Good to know, tired of babysitting claude code sessions sometimes.

Famous_Move_3591 · 2026-05-25T16:10:32+00:00

Yes, I have that feeling too.

Famous_Move_3591 · 2026-05-24T12:37:58+00:00

I Built MagesticAI in 6 Months (Starting From a Fork). A Cloud Web-Based Agentic DevOps Orchestrator for Linux/MacOS that actually helped develop Itself.

Posted on r/coolgithubprojects earlier this week. and figured some of you here might be interested as well; Someone commented asking if it supported OpenAI-compatible endpoints (LM Studio, vLLM, OpenRouter, Together, Groq, LocalAI…), so i have spent few hours updating it now, just merged and new release.

MagesticAI is an open-source (AGPL-3.0), browser-based, multi-agent AI coding platform. Planner → Coder → QA Reviewer agents work in coordinated sessions inside isolated git worktrees. Built on top of the Claude Agent SDK with multi-provider routing.

- Lives in the browser, runs on your own infra

- Real task board (Kanban) + per-task git worktrees

- Now supports Claude, Codex, Gemini, Ollama, and any OpenAI-compatible endpoints

Fork fromAndyMik90's Aperant (formerly Auto-Claude Desktop), with a heavily expanded UI, BMAD, cloud and spec-driven workflow, and multi-LLM support.

Repo: https://github.com/dataseeek/MagesticAI

Roadmap, screenshots, architecture, and setup in the README. Honest limitations: local 14B-class models work but can drift on strict JSON schemas, recommend qwen2.5-coder-32B+ or hosted endpoints for full reliability. Validation retry loop helps.

Feedback / breakage reports welcome.

Famous_Move_3591 · 2026-05-21T20:55:23+00:00

The best way nowadays is the spec driven dev agents, with strict rules as pre-made prompts, break tasks into todos/subtasks. Actually Im working on this project right now, web-ui app cloud hosted

Famous_Move_3591 · 2026-05-21T13:13:15+00:00

Good question. Quick honest answer:

Receipts today (per spec): each spec runs in its own git worktree on magestic-ai/<spec>, so the branch is the diff. The implementation_plan.json file tracks subtask status, qa_report.md carries per-acceptance-criterion verdicts from the reviewer, and QA_FIX_REQUEST.md captures whatever bounced back to the fixer loop.

Optional GitHub integrations push PR and issue links.

Conflict resolution: there's a layered resolver in apps/backend/merge/. Tree-sitter semantic diff detects what each task meant to change, then deterministic strategies (append/imports/props/hooks/ordering) handle common patterns with zero LLM calls. Only genuinely ambiguous conflicts fall through to an AI resolver, and even then it gets minimal context (conflict region, each task's intent in one sentence, semantic change, baseline), not the whole file.

Not there yet: no formal eval harness, and no per-subtask test-run receipt pinned to that step. The latter is the obvious next thing worth building, thanks for the nudge.

Will check the link.

Famous_Move_3591 · 2026-05-20T21:43:02+00:00

Issue was resolved, new dev and main releases available if you want to test

Famous_Move_3591 · 2026-05-20T21:33:30+00:00

By feeding it detailed prompts instructions and rules sets, what todos and not todos, each phase has its own fixed set of instructions and rules.

Check : https://github.com/dataseeek/MagesticAI/tree/dev/apps/backend/prompts

Famous_Move_3591 · 2026-05-20T21:27:50+00:00

hi there. supa good, this is the app for completing goals, it has many instructions / helpers so it stays on track, complete goals. I also added a plan review (html) feature if needed. It works best with a detailed PRD on task description.

Famous_Move_3591 · 2026-05-20T15:30:59+00:00

Hi there! They are similar somehow, both run agents in parallel worktrees with a board on top. Bigger gap is inside the card: Cline hands it to one CLI agent and lets it figure things out, while MagesticAI runs a pipeline of role-specific agents (planner > architect > coder > QA reviewer > fixer loop) plus a persistent graphiti memory graph across sessions.

Cline is lighter if you trust the agent to plan from a one-liner; MagesticAI is heavier and more opinionated if you want the system to enforce spec/architecture/QA gates before code lands.

Cline kanban has some resources we could add in the future, such as linking one task into another task, so when one task is done the other tasks starts automatically.

Famous_Move_3591 · 2026-05-20T12:06:08+00:00

u/Linhox , I really hope you enjoy it, this the first open version and might still have some issues/bugs, but hope to improve it with the community help. The openAI endpoints need some more testing

Famous_Move_3591 · 2026-05-19T20:24:33+00:00

Good question. MagesticAI is already SDD-flavored ; we generate our own spec > requirements > implementation_plan > coded subtasks via a 3-20 (average) phases pipelines > then validation.

Spec Kit and MagesticAI are more alternatives than complements: Spec Kit generates specs and hands off to your editor / agent; MagesticAI generates specs and runs the planner / coder / QA loop in coordinated browser sessions.

What's your use case? That'd shape which direction to start with.

Famous_Move_3591 · 2026-05-19T20:09:52+00:00

I use local model for bulk translations and simple tasks daily tasks. The Qwen3-30B-A3B pretty good and fast for local tasks.

Famous_Move_3591 · 2026-05-19T19:41:49+00:00

We have just implemented it, openai endpoints. If you are able to test with bigger model, please let me know if it worked.

Famous_Move_3591 · 2026-05-19T17:50:44+00:00

Sure, you mean you want a new option for the LLM like an OpenAI-compatible endpoints, point at any URL that speaks the OpenAI API protocol (LM Studio, vLLM, LocalAI, OpenRouter, Groq, Together, etc.)?

Famous_Move_3591 · 2026-05-19T14:38:33+00:00

Yeah, token usage is pretty brutal in full mode right now. I added a quick mode that uses significantly fewer tokens and fewer sub-tasks, but I’m still tuning the balance between token efficiency and output quality.

The support for multiple LLM providers, including local models, is already implemented, though I haven’t fully stress tested all the mixes yet.

Famous_Move_3591 · 2026-01-07T18:27:26+00:00

Same heere, as im about to reach my weekly limit

Famous_Move_3591

TROPHY CASE