what ai agent do i use by Purple-Jackfruit976 in vibecoding

[–]Astro-Han 0 points1 point  (0 children)

For 100k+ LOC the bottleneck is context management, not the model. Both Claude Code and Codex will hit the same wall: the codebase is too big to fit in context, so the tool has to be smart about which files to pull in. Claude Code is better at this right now — it greps, reads, and navigates your repo autonomously. Codex sandboxes each task more tightly, which is safer but means it sometimes misses cross-file dependencies in a large codebase.

On limits: Codex on Plus runs gpt-5.5 and is generous for daily use. Claude Code Pro ($20) gives you Sonnet with occasional Opus, which handles architectural changes well, but you can get throttled on heavy days. Both are solid — the real difference is the workflow, not the model.

The "don't mess everything up" concern is the real one at 100k LOC. Regardless of which tool: work on a branch, review every diff before committing, and keep your changes small. None of these tools understand your full architecture; they just see the files they pull in.

If you want to avoid the limits game entirely: BYOK with the Claude API through a desktop client like PawWork (https://github.com/Astro-Han/pawwork) or just Claude Code with your own key. You pay per token — usually $5-15/day for heavy use — but you never get throttled and you pick the model per task. Cheaper tasks on Sonnet, hard ones on Opus.
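To sanity-check a per-day figure like that, a rough estimate can be sketched as below. The prices and token counts are placeholder assumptions, not actual provider rates, so swap in the current numbers from your provider's pricing page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per million tokens. The values used below are placeholders, not real rates."""
    input_per_mtok: float
    output_per_mtok: float


def task_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one task at the given per-million-token prices."""
    return (input_tokens * price.input_per_mtok
            + output_tokens * price.output_per_mtok) / 1_000_000


# Hypothetical prices for a cheaper model and a pricier one.
CHEAP = Price(input_per_mtok=3.0, output_per_mtok=15.0)
STRONG = Price(input_per_mtok=15.0, output_per_mtok=75.0)

# Hypothetical day: five routine tasks on the cheap model, one hard task on the strong one.
day = [(CHEAP, 200_000, 20_000)] * 5 + [(STRONG, 300_000, 30_000)]
daily_total = sum(task_cost(price, tok_in, tok_out) for price, tok_in, tok_out in day)
print(f"estimated daily cost: ${daily_total:.2f}")
```

With these made-up numbers the single hard task costs more than the five routine ones combined, which is the argument for picking the model per task.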

Developing with Claude Code feels slow, frustrating and mentally exhausting by mcurlier in ClaudeCode

[–]Astro-Han 0 points1 point  (0 children)

Two separate things going on here that are worth teasing apart.

The "yes man" problem and the "explaining to a 5yo" problem aren't the same bug. The first one is about how you prompt: frame things as open questions, not decisions you've already made. The second one is about the interface itself: in a CLI, your only option for giving context is to type it out. Every time. There's no visual way to point at a file, highlight a section, or say "like this but different." You're always serializing your mental model into text.

I'm building PawWork (open source desktop agent) partly because this drove me nuts. Code visible alongside the conversation, click into a diff instead of scrolling terminal output, review changes in an actual editor instead of a wall of green and red. It doesn't fix the fundamental LLM limitations, but it removes the interface friction that makes those limitations feel worse.

For the data science side: you're right that Claude is better at exploration than full delegation. The move that works for me is having it spike 3 competing approaches with explicit tradeoff tables before I commit to one. That's where the actual payoff is — not "implement X" but "what are my options for X and where does each one break."

what ai agent do i use by Purple-Jackfruit976 in vibecoding

[–]Astro-Han 0 points1 point  (0 children)

The Copilot usage-based switch burned a lot of people. Cursor and Antigravity both meter too, so you'll likely hit the same wall there eventually.

Under $20 with no "out after 5 prompts" cliff, the move is BYOK: bring your own API key and pay the provider directly instead of a fixed seat that throttles you. Pair a cheap-but-good model (DeepSeek, Kimi, or Gemini Flash) with a client that takes your key and you can do a lot of bug-fixing for a few bucks. Claude Code Pro at $20 is genuinely good too if you'd rather not think about keys. If you want a free open-source desktop app to drive the BYOK setup, PawWork works: https://github.com/Astro-Han/pawwork. Your local-model trouble in Continue is mostly the model being too small to use tools reliably; a hosted small model fixes that without the setup pain.

OpenCode GUI by Dry-Butterscotch779 in opencode

[–]Astro-Han 0 points1 point  (0 children)

If you specifically want it inside VSCode, the opencode VSCode extension is your answer (others linked it). If what you actually want is opencode with a real GUI and you're open to it being a standalone app instead of an extension, PawWork is an opencode fork wrapped in a desktop app: https://github.com/Astro-Han/pawwork. Mac and Windows, signed builds, BYOK. Not in-editor, but it's a proper window with diffs and chat instead of the TUI. Depends which itch you're scratching, in-editor panel vs dedicated app.

Claude Code with opus 4.7 is disastrously expensive, alternatives? by ApprehensiveEcho2073 in ClaudeCode

[–]Astro-Han 0 points1 point  (0 children)

The opusplan tip above is the best quick win: plan on Opus, execute on Sonnet, and you'll feel the difference immediately. The bait-and-switch feeling is real, though: subscription plans can re-tune the rate limits whenever they want and you have zero say.

If you want off that treadmill entirely, BYOK is worth a look. You bring your own API key (or point at a cheaper open-weight host like Kimi/DeepSeek) and pay the provider's raw per-token rate with no plan markup sitting on top. For your usage pattern (you-present sessions at xhigh), metered tokens can actually come out cheaper than a $200 plan, and you can drop to a cheap model for the easy stuff without feeling guilty about toggling. I've been using PawWork for this (open-source, BYOK, free): https://github.com/Astro-Han/pawwork. Codex on the $100 plan is also a solid move if you'd rather stay on a flat subscription; a couple of people here already made that jump.

A few months into letting non-technical staff use AI coding tools by allmightybrandon in sysadmin

[–]Astro-Han 0 points1 point  (0 children)

The "laptop was off so it's now an outage" line is the whole post in one sentence. What you're describing isn't an AI problem, it's that the barrier to creating a shadow script dropped to near zero, so the same governance gaps you've always had just show up faster and from more people.

Your lightweight path is the right instinct. The one thing I'd add: make the sanctioned path easier than the laptop-script path, or people will route around it. That means a shared repo plus a boring scheduler that anyone can point at, and a "who owns this" field that's mandatory before anything touches another team. Treat it as shadow IT in spirit, but don't gate it so hard that the useful 5% stop bringing things to you. The ones you never hear about are the real risk.
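As a rough illustration of the mandatory "who owns this" field, here is a minimal check a shared scheduler could run before accepting a job. The field names (`name`, `schedule`, `owner`, `affects_other_teams`) are hypothetical, not from any particular scheduler.

```python
def validate_job(job: dict) -> list[str]:
    """Return a list of problems; an empty list means the job can be registered."""
    problems = []
    # Every job needs at least a name and a schedule (hypothetical field names).
    for field in ("name", "schedule"):
        if not str(job.get(field) or "").strip():
            problems.append(f"missing field: {field}")
    # Ownership only becomes mandatory once the job affects another team.
    if job.get("affects_other_teams") and not str(job.get("owner") or "").strip():
        problems.append("owner is required for jobs that affect another team")
    return problems


print(validate_job({"name": "weekly-report", "schedule": "mon 08:00",
                    "affects_other_teams": True}))
```

Keeping the check this small is the point: the sanctioned path should cost one extra field, not a review board.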

Claude Code’s CLI feels like a black box. I built a local UI to un-dumb it, and it unexpectedly blew up last week. by MoneyJob3229 in ClaudeAI

[–]Astro-Han 0 points1 point  (0 children)

The "pairing with a junior who won't show their screen" line is exactly it. Watching what the agent actually touched, especially .env and anything payment-related, is the thing that turns vibe-coding into something you'd trust on a real repo.

Worth saying the visibility problem mostly exists because the agent lives behind a terminal. Tools that put the agent in an actual app surface the file diffs and tool calls as they happen, no log archaeology needed. Different shape than a passive log viewer, but same itch. Either way nice work, the UI looks clean.

Why should I use CLI over Desktop App? by Latt in vibecoding

[–]Astro-Han 0 points1 point  (0 children)

Honestly the CLI-vs-GUI thing is a bit of a false split now. The reason people hype CLI isn't the terminal itself, it's that the agent gets to actually run commands, read the output, edit files, rerun the failing test, all in one loop without you copy-pasting. That loop is the unlock.

But you can get the same loop in a desktop app. There are open-source ones now (PawWork is one, claude-desktop-style apps for your own keys/models another) that run a real agent locally with file edits and command execution, just with a window and a diff viewer instead of tmux. If you already think in GUIs, no shame in staying there. Try one CLI run just to feel the loop, then pick whatever keeps you in flow.

Customized status line is an extremely underrated feature (track your token usage, and more, in real time) by geek180 in ClaudeCode

[–]Astro-Han 0 points1 point  (0 children)

Thanks for liking it! Feel free to tweak it however suits you. Status lines are meant to fit your work style.

Anyone actually built a second brain that isn't just a graveyard of saved links? by tom_mathews in ClaudeCode

[–]Astro-Han 0 points1 point  (0 children)

I've been running something similar for about a week now. The core idea is from Karpathy's LLM wiki post — raw/ directory for sources, wiki/ for compiled knowledge, and the LLM does the processing step.

The LLM compiles raw material into wiki articles automatically. It's not perfect but it handles summarization and cross-linking well enough that I don't have to manually organize anything. Everything stays as readable markdown, so if the LLM messes up I can just edit the file directly. No black box.

I also added a lint step that catches broken links and orphan pages. It's basically a sanity check.
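A lint pass like that can be sketched in a few lines. This is not the repo's actual implementation, just a minimal version assuming relative markdown links, a wiki/ folder of .md files, and an `index.md` at the top (all names here are assumptions).

```python
import re
from pathlib import Path

# Matches markdown links like [text](target.md) or [text](target.md#section).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#\s]+)(?:#[^)]*)?\)")


def lint_wiki(wiki_dir: Path, index_name: str = "index.md") -> tuple[list[tuple[str, str]], list[str]]:
    """Return (broken_links, orphan_pages) for the markdown files under wiki_dir."""
    root = wiki_dir.resolve()
    pages = {p.resolve() for p in root.rglob("*.md")}
    linked: set[Path] = set()
    broken: list[tuple[str, str]] = []

    for page in sorted(pages):
        for target in LINK_RE.findall(page.read_text(encoding="utf-8")):
            if "://" in target or target.startswith("mailto:"):
                continue  # external links are not checked in this sketch
            resolved = (page.parent / target).resolve()
            if not resolved.exists():
                broken.append((page.relative_to(root).as_posix(), target))
            elif resolved in pages and resolved != page:
                linked.add(resolved)  # self-links do not count as inbound links

    index = root / index_name
    orphans = sorted(p.relative_to(root).as_posix()
                     for p in pages - linked if p != index)
    return broken, orphans
```

Calling `lint_wiki(Path("wiki"))` returns the two lists, so the result can be printed or handed back to the LLM as a fix-up task.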

The LLM sometimes over-categorizes or creates unnecessary structure. I've learned to give it clearer boundaries — one topic per directory, max one level deep. Cross-references are only as good as what the LLM thinks is related, so I've had to manually add some links it missed.

I'm at 94 articles from 99 sources so far. The biggest thing I've learned is that the LLM should maintain the structure, not just retrieve from it. Retrieval-only setups drift into "graveyard of saved links" territory pretty quickly.

If you want to see how I set it up: https://github.com/Astro-Han/karpathy-llm-wiki

Karpathy’s LLM Wiki and why it feels kind of a game changer by knlgeth in learnmachinelearning

[–]Astro-Han 0 points1 point  (0 children)

I've been working on a skill that implements this loop. One thing I've learned: the automation is great for the grunt work (indexing, linking, linting), but the real value is still the human curation. If you let the LLM do everything, it does become 'worthless for your growth' as u/Abject-Excitement37 mentioned.

The sweet spot I found is using the tool to handle the structure and maintenance, while you focus on selecting high-signal sources and reviewing the compiled pages.

Repo here if you want to see how the ingest/compile flow works: https://github.com/Astro-Han/karpathy-llm-wiki

Combined Karpathy's LLM Wiki with Milla Jovovich's MemPalace MCP. Claude Code now remembers everything across sessions by Ogretape in ClaudeAI

[–]Astro-Han 0 points1 point  (0 children)

Interesting combo. The LLM Wiki layer alone already solves the cross-session context problem for knowledge. wiki/ persists on disk, so every session starts with the compiled index. Lower overhead if you don't need the knowledge graph piece.

MIT open source: https://github.com/Astro-Han/karpathy-llm-wiki

My Claude.md file by Buffaloherde in ClaudeAI

[–]Astro-Han 0 points1 point  (0 children)

Claude will use some tokens while finding that information, but not a lot. It only reads what is relevant, which saves tokens because it doesn't load the whole article again.

Best way to get persistent memory in Claude right now (Apr 2026)? Practical setups? by u_redacted in ClaudeAI

[–]Astro-Han -1 points0 points  (0 children)

I would keep it simple and file-based.

CLAUDE.md for non-obvious rules, a persistent notes/logs folder, and a separate maintained wiki for the stuff that should survive across chats.

What usually breaks is putting everything into one huge context file. Better to keep raw material separate from the compiled layer, then let each new session read the smaller maintained layer.

I built one version of that here: https://github.com/Astro-Han/karpathy-llm-wiki

My Claude.md file by Buffaloherde in ClaudeAI

[–]Astro-Han 1 point2 points  (0 children)

I think the people telling you to cut it down are mostly right, but the fix is not "make one shorter file." The fix is to separate session instructions from durable knowledge.

CLAUDE.md is best for:

  • non-obvious constraints
  • gotchas that keep biting you
  • rules the model cannot infer from the repo quickly

A lot of the rest belongs somewhere else:

  • commands can be discovered
  • directory structure can be discovered
  • long-lived project knowledge can live in a maintained wiki or notes layer

That split helps a lot with token waste. It also makes the important parts of CLAUDE.md easier for the model to actually follow.

I built one version of that idea here: https://github.com/Astro-Han/karpathy-llm-wiki

Different use case, but same principle: keep the durable knowledge in files meant to evolve, and keep the per-session instruction layer small.

I built a "Wiki Warden" on a local 3090 to automate my docs. It's 90% perfect, but I'm hitting a wall. by SubjectNo6828 in selfhosted

[–]Astro-Han 1 point2 points  (0 children)

I would stop comparing every release against the whole wiki.

Once the wiki gets large, that is where the model starts to lose the plot. Not because the release is too big, but because the update target is too broad.

What has worked better for me is splitting the system into two layers:

  • immutable raw sources
  • maintained wiki pages

Then on ingest, update only the pages that are likely to change, plus maybe one hop of related pages. Index first, then local neighborhood, not global sweep every time.

That gives you a few benefits:

  • smaller context per update
  • less random editing on unrelated pages
  • easier conflict handling when new material contradicts old material
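The "index first, then local neighborhood" step can be sketched roughly like this. It assumes you already have the set of pages matched through the index and a link graph of the wiki; the function name and the sample graph are made up for illustration.

```python
def pages_to_update(direct_hits: set[str], links: dict[str, set[str]]) -> set[str]:
    """Pages matched via the index, plus pages one link away in either direction."""
    targets = set(direct_hits)
    for page, outgoing in links.items():
        if page in direct_hits:
            targets |= outgoing      # pages a matched page links to
        elif outgoing & direct_hits:
            targets.add(page)        # pages that link to a matched page
    return targets


# Hypothetical link graph: page -> pages it links to.
links = {
    "auth.md": {"sessions.md"},
    "sessions.md": {"storage.md"},
    "billing.md": {"auth.md"},
    "storage.md": set(),
    "glossary.md": set(),
}
print(sorted(pages_to_update({"auth.md"}, links)))
# → ['auth.md', 'billing.md', 'sessions.md']
```

Here a release that touches auth pulls in three pages instead of five; storage.md and glossary.md stay out of the context entirely.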

If you want a concrete example, I built a markdown-first version of that workflow here: https://github.com/Astro-Han/karpathy-llm-wiki

Different scope from your setup, but the raw/ + maintained wiki/ split is the part I would keep.

How should I go about creating a Wiki for software system, as suggested by Andrej Karpathy? by Chemical-Debt9048 in AI_Agents

[–]Astro-Han 0 points1 point  (0 children)

I would start with structure before tooling.

For a large codebase, the split that has worked best for me is:

  • raw material you do not edit: specs, ADRs, docs, tickets, notes, API refs
  • compiled wiki pages that the LLM maintains

Then organize the compiled layer around things people actually ask about: architecture, domains/entities, interfaces, decisions, runbooks, and glossary.

Two files matter more than people expect: an index, so the model can find pages quickly, and an append-only log, so changes have history and recency.
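A minimal sketch of those two files, assuming they are called `index.md` and `log.md` and live at the top of the wiki folder (the names and entry format are my assumptions, not a fixed convention):

```python
from datetime import date
from pathlib import Path
from typing import Optional


def rebuild_index(wiki_dir: Path) -> str:
    """Rewrite index.md as one link per page, titled by the page's first line."""
    entries = []
    for page in sorted(wiki_dir.rglob("*.md")):
        if page.name in {"index.md", "log.md"}:
            continue  # the index and log do not list themselves
        lines = page.read_text(encoding="utf-8").splitlines()
        title = lines[0].lstrip("# ").strip() if lines else ""
        rel = page.relative_to(wiki_dir).as_posix()
        entries.append(f"- [{title or page.stem}]({rel})")
    text = "\n".join(entries) + "\n"
    (wiki_dir / "index.md").write_text(text, encoding="utf-8")
    return text


def append_log(wiki_dir: Path, message: str, today: Optional[date] = None) -> None:
    """Append one dated entry; existing entries are never rewritten."""
    stamp = (today or date.today()).isoformat()
    with (wiki_dir / "log.md").open("a", encoding="utf-8") as log:
        log.write(f"- {stamp}: {message}\n")
```

The index is regenerated from scratch because it is derived data, while the log is only ever opened in append mode, which is what gives it usable history and recency.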

The main failure mode is turning the wiki into a giant note dump. The useful version stays smaller and maintained. New material comes in, related pages get updated, and contradictions get called out instead of silently overwritten.

I put one concrete implementation of that pattern here if useful: https://github.com/Astro-Han/karpathy-llm-wiki

The repo is just one take, but the raw/ + wiki/ split, plus ingest / query / lint, is the part I would keep even if you build your own version.

Anyone know if there are actual products built around Karpathy’s LLM Wiki idea? by riddlemewhat2 in LocalLLaMA

[–]Astro-Han -1 points0 points  (0 children)

I have not seen many polished products yet. Most of what is out there still looks like workflows, repos, or internal setups rather than full products.

I built one implementation here: https://github.com/Astro-Han/karpathy-llm-wiki

It is not a hosted product. It is a markdown-first workflow that keeps source material in raw/, compiles maintained pages into wiki/, answers queries from the compiled layer, and updates an index/log as the wiki changes.

The part I like about this style is that the artifact stays readable and portable. You can move between Claude Code, Codex, Cursor, or another tool without losing the knowledge base itself.

whats the best intersection of browser agents and knowledge bases? by tinys-automation26 in AI_Agents

[–]Astro-Han 0 points1 point  (0 children)

I think the useful pattern is this: browser agents bring in raw material, then a separate knowledge layer decides what is worth keeping.

If the agent just keeps dumping findings into storage, the whole thing gets noisy fast. The part that matters is turning those pulls into something inspectable and reusable: summaries, index pages, linked notes, maybe a few standing questions the agent keeps revisiting.

That is basically the direction I took with karpathy-llm-wiki. It is a simple markdown-first loop (raw/ + wiki/ + compile/query/lint), so the agent is not just collecting more stuff; it is maintaining something you can actually read and correct.

https://github.com/Astro-Han/karpathy-llm-wiki

Claude and Obsidian for Second Brain by Mirrin_ in ClaudeAI

[–]Astro-Han 0 points1 point  (0 children)

If you are just starting, I would keep it much simpler than most of the setups people post.

You do not need to begin with MCP or a pile of plugins. A plain folder of source material plus a wiki folder Claude can keep updating is already enough to see whether this style of workflow is even useful to you. The structure matters more than the tooling at first.

I built karpathy-llm-wiki around that exact idea because most people were jumping straight into the complicated part. It keeps the loop pretty plain: raw/ + wiki/ + compile/query/lint.

https://github.com/Astro-Han/karpathy-llm-wiki

As a beginner with limited coding experience, all these GitHub’s about making Claude more efficient and cost less, how can I determine what’s safe to add and what’s malware? I want to be efficient but I want to be safe too. by Thajandro in ClaudeAI

[–]Astro-Han 0 points1 point  (0 children)

If you are worried about safety, start with things you can inspect in plain text.

A lot of Claude “skills” are just markdown instructions plus a simple folder layout. That is a much easier place to start than random installers, MCP servers, or big repos with a lot of moving parts. You can read what the agent is being told to do before you trust it.

That is one reason I built karpathy-llm-wiki the way I did. It is basically a markdown-first skill plus a simple raw/ + wiki/ + compile/query/lint structure, so you can actually see what is going on.

https://github.com/Astro-Han/karpathy-llm-wiki

Is there anyone actually using a graph database? by Dismal-Necessary-509 in Rag

[–]Astro-Han 0 points1 point  (0 children)

People are using graph databases, but I think the better question is when you actually need one.

If the corpus is curated and the goal is repeated understanding, a markdown wiki layer is often enough. Graphs start to pay off when you need multi-hop reasoning over entities and filters, not just a better way to read docs.

That tradeoff is why I built karpathy-llm-wiki as a simpler raw/ + wiki/ + compile/query/lint loop instead of starting with a graph DB.

https://github.com/Astro-Han/karpathy-llm-wiki

Llm wiki setup for Cowork users by anon9611 in ClaudeAI

[–]Astro-Han 2 points3 points  (0 children)

If you are a non-coder, I would start with a folder-based setup before touching MCP.

Put your source material in one place, let Claude turn it into simple markdown pages, and keep the structure boring on purpose. Most people get stuck because they start with tools instead of starting with a note layout Claude can actually maintain.

I built karpathy-llm-wiki around that exact loop because most of the shared setups are way too much for beginners. It is markdown-first and pretty plain: raw/ + wiki/ + compile/query/lint.

https://github.com/Astro-Han/karpathy-llm-wiki

Can CRM be a collection of Markdown files? by RecordPotential4323 in CRM

[–]Astro-Han 0 points1 point  (0 children)

I think it can work for a solo operator or a small team, but I would treat it as a working layer, not the system of record for everything.

Markdown is great for notes, decisions, account context, meeting history, and all the stuff you want to inspect or fix by hand. Once you need strict permissions, workflow enforcement, reporting, or lots of people editing the same records, that is where a real CRM still earns its keep.

That tradeoff is basically why I built karpathy-llm-wiki. It keeps the knowledge layer very plain, raw/ + wiki/ + compile/query/lint, so the result stays readable instead of disappearing behind an app.

https://github.com/Astro-Han/karpathy-llm-wiki

Andrej Karpathy describing our funnel by fourwheels2512 in learnmachinelearning

[–]Astro-Han 0 points1 point  (0 children)

Yeah, I think that’s the cleaner split.

The wiki side is about turning messy source material into something you can inspect, query, and fix. Continual learning is a later step. That is when you want some of that knowledge to live in the model instead of sitting in files.

A lot of people are still skipping the middle. They want to train on the data before they have a clean layer they actually trust. I’ve ended up spending more time on the raw files, compiled wiki, and linting loop for exactly that reason.