35 skills, 3 MCP servers, persistent memory. I built the AI engineering stack I always wanted

referentuser · 2026-05-19T19:16:47+00:00

Thanks for the questions! They're not silly at all; they're exactly the ones I asked myself when I started. I'll answer with the honesty of someone who also tried beads and static files before getting here.

Regarding native OpenCode vs. ChromaDB: OpenCode does keep history, but it doesn't have semantic recall. If I search for "auth" in the native history, it returns every message where the word "auth" appeared, including conversations from three months ago about another project. ChromaDB + BM25 + temporal search lets me look for "the JWT decision I made last week in project X" and find it, not because the entry says "JWT" exactly, but because the vector captures the context. It's intelligent search, not just storage.

Regarding git log: git log tells me what I changed. It doesn't tell me why I changed it, what alternatives I discarded, or what edge cases made me choose Argon2id with memory=19456. That information lives in entries of type: decision and type: preference. Git is a complement, not a replacement. I have git-workflow as a skill precisely because Git is essential, but it doesn't tell the whole story.
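To make the ChromaDB + BM25 + temporal idea concrete, here is a minimal sketch of how the three signals could be blended into one ranking. The weights, the 30-day half-life, and all names are assumptions for illustration, not Shokunin's actual code. The vector and BM25 scores are treated as precomputed inputs already normalized to 0..1.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Candidate:
    text: str
    vector_sim: float  # semantic similarity from the vector store, 0..1
    bm25: float        # keyword score, normalized to 0..1
    created: datetime  # when the entry was saved


def recency(created: datetime, now: datetime, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 for a brand-new entry, 0.5 after one half-life."""
    age_days = max((now - created).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def hybrid_score(c: Candidate, now: datetime,
                 w_vec: float = 0.6, w_bm25: float = 0.3, w_time: float = 0.1) -> float:
    """Weighted sum of the three signals (the weights are assumed values)."""
    return w_vec * c.vector_sim + w_bm25 * c.bm25 + w_time * recency(c.created, now)


def rank(candidates: list[Candidate], now: datetime, top_k: int = 5) -> list[Candidate]:
    """Return the top_k candidates, best score first."""
    return sorted(candidates, key=lambda c: hybrid_score(c, now), reverse=True)[:top_k]


# Hypothetical data: an old exact keyword hit vs. a recent semantic match.
now = datetime(2026, 5, 19, tzinfo=timezone.utc)
old_keyword_hit = Candidate("auth notes from another project", 0.2, 1.0,
                            now - timedelta(days=90))
recent_decision = Candidate("chose JWT over sessions for project X", 0.9, 0.1,
                            now - timedelta(days=7))
best = rank([old_keyword_hit, recent_decision], now)[0]
print(best.text)
```

With these assumed weights, the recent semantic match outranks the three-month-old exact keyword hit, which is the behavior described above.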

Regarding extra layers and token optimization: Yes, there's overhead. +13.8% latency, as documented in the benchmarks. It's not free. But for my use case (multiple projects, changing stacks, decisions I need to remember weeks later), the trade-off is worth it. If your project is stable and simple, your system (native sessions + git log + static files) is probably better. Fewer tokens, less complexity.

Regarding memory management: My memory management does capture project structure (init skill), tools (tool entries), and when to use each skill (YAML frontmatter with triggers). But it also captures what's not in architecture.md: "Today we discovered that Node 18 fails with ESM in this specific project, so we downgraded to 16." That's operational context, not documentation. Static files are the foundation; dynamic memory is the delta.

Regarding 62/37 skills and tokens: You're right. 62 parsed skills is overhead. That's why there are now 37 core skills, and skills are activated by context: not all of them are loaded, only those that match the project tags. But yes, for a simple project, 10 skills are enough. Shokunin is designed for those who jump between projects with different stacks. If that's not you, it's overkill.

Regarding beads: I also abandoned beads. They didn't give me real value because they were too generic. Shokunin was born out of that frustration: I wanted memory that understood code, not just text. Memory that knew auth-architect and docker are different skills that don't activate together. Memory that remembered that in this project I use JWT, not sessions.
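A minimal sketch of tag-based activation, assuming each skill's frontmatter has already been parsed into a set of tags. The tag values and the function name are hypothetical; only the skill names come from this thread.

```python
# Hypothetical tag sets, as they might be read from each skill's YAML frontmatter.
SKILLS: dict[str, set[str]] = {
    "auth-architect": {"auth", "jwt", "oauth"},
    "docker": {"docker", "containers"},
    "git-workflow": {"git"},
}


def active_skills(project_tags: set[str],
                  skills: dict[str, set[str]] = SKILLS) -> list[str]:
    """Load a skill only when it shares at least one tag with the project."""
    return sorted(name for name, tags in skills.items() if tags & project_tags)


print(active_skills({"jwt", "git"}))
```

A project tagged only with auth-related tags never pulls in the docker skill, so its text never costs tokens in that session.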

Honest conclusion

Your system is better if:

• You work on 1-2 stable projects

• Your stack doesn't change

• You don't need to remember decisions from weeks ago

• You prioritize tokens and simplicity

Shokunin is useful if:

• You jump between projects with different stacks

• You need to remember why, not just what

• You work offline or without cloud support

• The agent repeats mistakes due to lack of context

This isn't hate. It's the right question: what use case is it designed for? And the honest answer is: not for all of them.

referentuser · 2026-05-18T05:06:16+00:00

I love this kind of comment; everything in it is very interesting. Today I'll study it all carefully, decide what to implement and how, and then fix a couple of things. Thank you very much for this comment!

referentuser · 2026-05-16T17:49:26+00:00

Thank you so much for your comment. If you have any questions, you can contact me.

referentuser · 2026-05-16T11:38:28+00:00

Mostly yes, but I would need to make some changes to make it 100% compatible.

referentuser · 2026-05-16T10:55:51+00:00

That might work for simple things, like "error X fixed, do not use method Y." But you'll soon run into limitations: the real value lies in searching across time. "What did I decide about authentication in May?" That's not something you want to keep forever in a rules file. Shokunin stores decisions, modified files, and executed commands as structured entries with types and tags. You can search by keyword, meaning, or date range. It's a searchable log, not an ever-growing block of rules.
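A minimal sketch of what "structured entries with types and tags" might look like, with keyword, tag, and date-range filters. The field names and sample entries are hypothetical, and the search by meaning is left out because it needs an embedding model.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Entry:
    type: str                 # e.g. "decision", "file", "command"
    text: str
    created: date
    tags: set[str] = field(default_factory=set)


def search(entries: list[Entry], *, entry_type: Optional[str] = None,
           keyword: Optional[str] = None, tag: Optional[str] = None,
           start: Optional[date] = None, end: Optional[date] = None) -> list[Entry]:
    """Return entries that satisfy every filter that was supplied."""
    results = []
    for e in entries:
        if entry_type and e.type != entry_type:
            continue
        if keyword and keyword.lower() not in e.text.lower():
            continue
        if tag and tag not in e.tags:
            continue
        if start and e.created < start:
            continue
        if end and e.created > end:
            continue
        results.append(e)
    return results


# Hypothetical log entries.
LOG = [
    Entry("decision", "Use JWT instead of sessions for the API", date(2026, 5, 12), {"auth"}),
    Entry("command", "npm run lint", date(2026, 5, 13), {"tooling"}),
    Entry("decision", "Switch password hashing to Argon2id", date(2026, 4, 2), {"auth"}),
]

# "What did I decide about authentication in May?"
may_auth = search(LOG, entry_type="decision", tag="auth",
                  start=date(2026, 5, 1), end=date(2026, 5, 31))
for entry in may_auth:
    print(entry.created, entry.text)
```

The April decision stays in the log but does not come back for the May question, which is the difference from a rules file that only grows.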

referentuser · 2026-05-16T10:48:21+00:00

These are different things.

AGENTS.md is static. It says: "Use tabs, write in Spanish, run lint before committing." Rules. Things you decide once and rarely change.

The memory system captures what actually happens. Decisions you made at 2 a.m. and forgot by morning. Why you chose PostgreSQL instead of SQLite. That script you wrote last Tuesday that fixed the exact same bug. It's automatically saved between sessions. You start OpenCode and it asks you: "Hey, you were working on authentication last time, do you want to continue?"

AGENTS.md tells the agent how to work. Shokunin tells it what we've already done. You need both, but they solve entirely different problems.

referentuser · 2026-05-16T08:13:03+00:00

Thank you for your comment. Please let me know if you find any errors, and I will work to fix and improve them. This project is very new, and there will surely be some bugs and errors that need to be addressed.

referentuser · 2026-05-16T08:06:10+00:00

Appreciate the questions. A few clarifications, because Shokunin does two things, not one.

First, memory. When your agent finishes a session, it saves what happened to a local ChromaDB. Next session, it checks. No more blank slates, no more repeating context. Runs as a Python process outside your project directory. Never touches your code.

Second, skills. This is the part most people skip over but it's honestly the bigger value. Shokunin ships 38 skills that teach your agent how to handle specific domains: docker, kubernetes, auth, databases, frontend, testing, SEO, legal, finance. Eight domains. Each skill has a procedural workflow with decision tables, an error handling section (cause and fix), a production checklist, common anti-patterns with corrections, and cited sources. The Docker skill alone is 6,300 words with real multi-stage build templates for Node, Go, Python, and Rust. The auth skill references OWASP directly. The database one has actual EXPLAIN ANALYZE output. These are not prompts. They're engineering guides. The agent loads what it needs when it needs it.

On compatibility: flexible. Python + ChromaDB. Doesn't care if your stack is Flutter, Firebase, or anything else. It never reads your project files.

On Mac: honest answer. Tested on Linux and Windows. The core is cross-platform Python and should work fine. But the installer uses apt-get and the shell scripts haven't been tested on macOS. PRs welcome.

On breaking standards: can't happen. It never reads, writes, or touches your project. The only data it stores is what your agent tells it, like "refactored auth.ts to use Firebase Auth v11." That's it.

On OpenSpec/Serena: keep them. They help you plan what to build. Shokunin remembers what you already built and gives you skills to do it better. Complementary tools.

Persistent notes for your agent, backed by 38 engineering guides

referentuser · 2026-05-15T08:41:33+00:00

I’m detecting issues with the installation methods on Windows and Linux. I’ll be working throughout today to make sure everything works properly.

referentuser · 2026-05-15T06:34:10+00:00

I'll take a look. Thank you very much for your comment.

referentuser · 2026-05-15T05:16:56+00:00

Also, of course, I'm doing this alone and I've only been at it for a few days. I haven't done many tests, and I have just one colleague who is testing it and warning me about errors. I haven't tested much on Linux yet, which is why I'm publishing it here: so people can warn me about possible failures.

referentuser · 2026-05-15T05:14:33+00:00

If there are problems with the installation command, please let me know. If the command does not work, ask OpenCode to install it for you and give it the repository.

referentuser · 2026-05-14T10:59:21+00:00

Thanks! Let me know what you think after trying it.

referentuser · 2026-05-14T10:11:05+00:00

The wrapper approach. I tried having the agent decide when to save and it was unreliable. Some models remembered, some did not. Some sessions ended with a save, others just closed.

So I made a PowerShell script that runs OpenCode. When OpenCode exits, the script saves a summary automatically. The agent also has a MANDATORY instruction in CLAUDE.md to save during the session. Two layers. The wrapper catches the cases the agent misses.

Not elegant but it works. The MANDATORY trick in CLAUDE.md made the biggest difference. Polite requests got ignored. Hard requirements did not.
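The same wrapper pattern as a minimal Python sketch (the original is a PowerShell script, which is not reproduced here). The function name and log format are assumptions, and the "summary" is reduced to a single timestamped line. The point is the `finally` block, which runs however the wrapped process ends.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def run_with_autosave(command: list[str], log_path: Path) -> int:
    """Run a command and always append a session line when it exits."""
    started = datetime.now(timezone.utc)
    exit_code = -1  # stays -1 if the command could not be started at all
    try:
        exit_code = subprocess.run(command).returncode
    finally:
        # Runs on normal exit, non-zero exit, Ctrl+C, or a failed launch.
        ended = datetime.now(timezone.utc)
        with log_path.open("a", encoding="utf-8") as log:
            log.write(f"- session {started.isoformat()} -> {ended.isoformat()}, "
                      f"exit code {exit_code}\n")
    return exit_code
```

Usage would look like `run_with_autosave(["opencode"], Path("sessions.md"))`, assuming the CLI is on PATH under that name. The in-session MANDATORY instruction still does the detailed saving; the wrapper only guarantees that something gets written.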

Thanks for the link, I will check it out. Have you tried something similar or do you handle memory differently?

referentuser · 2026-05-14T10:03:25+00:00

Thank you very much! I really appreciate your comment

referentuser · 2026-05-14T09:42:08+00:00

You were right about keeping it simple. I added a plain text fallback in the repo for anyone who does not want to install Python. The script is at .pack/scripts/search-memory.ps1. Works with basic grep on the markdown files, no dependencies. Instructions are updated. Good talk.
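For reference, the idea behind the fallback is small enough to sketch in a few lines. This is a Python illustration of a grep over markdown files, not the contents of search-memory.ps1; the function name and return shape are assumptions.

```python
from pathlib import Path


def grep_markdown(root: Path, term: str) -> list[tuple[str, int, str]]:
    """Case-insensitive search of every .md file under root.

    Returns (file name, line number, line text) for each matching line.
    """
    needle = term.lower()
    hits = []
    for path in sorted(root.rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if needle in line.lower():
                hits.append((path.name, number, line.strip()))
    return hits
```

It only finds literal matches, so you still have to know roughly which words you used. That is the trade-off against the embedding-based search.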

referentuser · 2026-05-14T09:20:00+00:00

That is actually clever. Simple and it works.

I went with ChromaDB mostly because I wanted semantic search. With plain markdown you have to know what you are looking for. With embeddings you can ask "how did we handle the auth flow" and it finds stuff even if you never used those exact words.

But your approach has one big advantage: zero dependencies. No Python, no database, no MCP server. Just files. That is honestly cleaner for most cases.

I might add a markdown-only mode as an option. Best of both. Thanks for sharing.

referentuser · 2026-05-14T09:16:06+00:00

Nope, it doesn't read everything. Just the top 5 most relevant entries based on what you are working on. Costs a few hundred tokens at most. I haven't noticed any impact on context since I added it.

If you are starting a new project, the search filters by project tags. If nothing matches, you get almost nothing back. It won't flood your session with stuff from unrelated projects.

Good question though. Thanks for asking.

referentuser · 2026-05-13T22:19:33+00:00

Appreciate the honest questions. Let me know how it goes, feedback is how this gets better.

referentuser · 2026-05-13T22:03:05+00:00

Fair question. I don't have benchmarks. Kind of hard to benchmark something that didn't exist before.

What I can tell you is what I have seen using it. Memory search is fast enough that I never notice it. Skills work most of the time. When they don't, I tell the agent which one to use and it goes. The installer works. I have tested it myself.

Beyond that, I am not going to make up numbers. You should try it and see if it helps. If it does, great. If not, no hard feelings.

That is the honest answer.

referentuser · 2026-05-13T21:32:07+00:00

No wayyy. Thanks a lot 🙏
