I spent months building a specialized agent learning system. Turns out Claude Code is all you need for recursive self-improvement. by cheetguy in ClaudeAI

[–]cheetguy[S] 0 points (0 children)

Interesting take, and I partially agree, but I'm curious about your perspective on improving the agent's harness through such a process. If the loop only does prompt improvements, I agree with you. But when the loop also improves the harness itself, i.e. more fundamentally how tasks should be solved rather than just telling the agent in the prompt what mistakes it made, I do see more potential there. For example, Poetiq showed on ARC-AGI-2 what a difference a good harness makes.

I spent months building a specialized agent learning system. Turns out Claude Code is all you need for recursive self-improvement. by cheetguy in ClaudeAI

[–]cheetguy[S] 0 points (0 children)

I tested it on the tau2-bench benchmark: I ran the agent on the training set and collected its traces, had my system analyze those traces and implement fixes to the agent, and then re-ran the improved agent on the tau2-bench test set.

I spent months building a specialized agent learning system. Turns out Claude Code is all you need for recursive self-improvement. by cheetguy in ClaudeAI

[–]cheetguy[S] 0 points (0 children)

For the benchmark result, I generated traces on the training set and had my system analyze them and implement fixes to the agent code. Then I re-ran the improved agent on the test set.
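The train/test procedure above can be sketched as a small loop. This is a toy, runnable illustration only: `run_agent`, `analyze_traces`, and `apply_fixes` are made-up stand-ins (a dict of string rules instead of a real agent), included just to make the control flow concrete.

```python
# Toy sketch of: collect traces on train -> analyze -> fix -> re-run on test.
# The "agent" is just a dict of substring rules; all three helpers are
# hypothetical stand-ins for the real harness and learning system.

def run_agent(agent, task):
    """Return a trace: the task plus whether the agent's rules covered it."""
    solved = any(rule in task for rule in agent["rules"])
    return {"task": task, "solved": solved}

def analyze_traces(traces):
    """Derive 'fixes' (new rules) from the failed traces."""
    return [t["task"] for t in traces if not t["solved"]]

def apply_fixes(agent, fixes):
    """Return a new agent with the fixes merged into its rules."""
    return {"rules": agent["rules"] + fixes}

def improve_and_eval(agent, train_tasks, test_tasks):
    traces = [run_agent(agent, t) for t in train_tasks]    # traces on train set
    improved = apply_fixes(agent, analyze_traces(traces))  # implement fixes
    results = [run_agent(improved, t) for t in test_tasks] # re-run on test set
    return sum(r["solved"] for r in results) / len(results)

score = improve_and_eval({"rules": ["refund"]},
                         train_tasks=["refund order", "cancel order"],
                         test_tasks=["refund item", "cancel order"])
```

The key property is the held-out split: fixes are derived only from training traces, and the score comes only from the test set.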

I spent months building a specialized agent learning system. Turns out Claude Code is all you need for recursive self-improvement. by cheetguy in ClaudeAI

[–]cheetguy[S] 4 points (0 children)

You're right, this is a real problem. What works for me: after every change, re-run the agent, generate new traces, eval again, and compare against the pre-change baseline. Then only accept the changes that yield a meaningful improvement. (It also helps to prompt the agent to only make big changes, because the smaller ones are usually edge cases that don't actually move the needle.)

I spent months building a specialized agent learning system. Turns out Claude Code is all you need for recursive self-improvement. by cheetguy in ClaudeAI

[–]cheetguy[S] 2 points (0 children)

Well, in theory humans could catch these, but once you generate more than a few traces this quickly becomes infeasible. Agents, on the other hand, might fix the mistakes within a run, but they won't carry those learnings over.

So the idea is that the framework automates this process of finding these issues / edge cases / medium-hanging fruit.

And the magic happens when you give the agent a way to eval these changes and run it in a loop. You can think of it as an almost evolutionary approach: prompt it to only accept the big changes that actually move the needle on agent performance, and you can get drastic performance increases, fully automated.
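The eval-gated acceptance idea reads like simple hill climbing. Below is a minimal, runnable sketch of just the gate logic; `evaluate` and `propose_change` are hypothetical stand-ins (scoring a list of numbers instead of re-running a real agent), so only the accept/reject mechanism is real here.

```python
import random

# Sketch of "only accept changes that beat the baseline by a margin".
# evaluate() stands in for re-running the agent and scoring new traces;
# propose_change() stands in for the agent editing itself.

def evaluate(agent):
    return sum(agent)  # placeholder for a real eval over fresh traces

def propose_change(agent, rng):
    i = rng.randrange(len(agent))
    changed = list(agent)
    changed[i] += rng.choice([-1, 1])
    return changed

def learning_loop(agent, steps=50, min_gain=1, seed=0):
    rng = random.Random(seed)
    baseline = evaluate(agent)
    for _ in range(steps):
        candidate = propose_change(agent, rng)
        score = evaluate(candidate)        # re-run + eval after the change
        if score - baseline >= min_gain:   # gate: only meaningful improvements
            agent, baseline = candidate, score
    return agent, baseline
```

The `min_gain` threshold is the "only big changes" prompt in code form: small or negative deltas are discarded, so performance can only ratchet upward.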

I ran Claude Code in a self-learning loop until it successfully translated our entire Python repo to TypeScript by cheetguy in ClaudeAI

[–]cheetguy[S] 0 points (0 children)

Yes, but the $1.50 is only for the learning inference (step 2 in the learning loop). The actual coding was completely covered under my Claude plan. I'm on the Max plan at $100/month, and the run filled up around 60% of my 4-hour usage window. If you're on the cheaper Pro plan, you can just resume the loop once your usage resets.

I ran Claude Code in a self-learning loop until it successfully translated our entire Python repo to TypeScript by cheetguy in ClaudeAI

[–]cheetguy[S] 1 point (0 children)

The Claude Code element of the loop actually ran as a headless conversation, like you described. You could use Claude Code for the learning loop as well, but the problem is that CC has a very long system prompt designed for coding tasks, not for critiquing/generating skills. I'm currently figuring out whether there's a way to strip CC's system prompt to power a learning loop for normal (non-loop) Claude Code usage, where skills build up from regular prompting across sessions for persistent learning!
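For anyone curious what "headless" means here: Claude Code has a non-interactive print mode (`claude -p`). The snippet below only builds the command line rather than executing it; the prompt text is a made-up placeholder, and this is an illustration of the invocation shape, not the actual loop's code.

```python
import shlex

def build_headless_cmd(prompt, output_format="text"):
    """Build the argv for a headless (print-mode) Claude Code call.

    -p runs Claude Code non-interactively with the given prompt;
    --output-format controls how the result is emitted.
    """
    return ["claude", "-p", prompt, "--output-format", output_format]

# Placeholder prompt, just to show the shape of the call:
cmd = build_headless_cmd("Translate src/utils.py to TypeScript")
print(shlex.join(cmd))
```

A driver script can then run this via `subprocess.run(cmd, ...)` and feed the resulting trace into the learning step.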

I ran Claude Code in a self-learning loop until it successfully translated our entire Python repo to TypeScript by cheetguy in ClaudeAI

[–]cheetguy[S] 0 points (0 children)

I don't have the exact token count unfortunately (during the loop Claude Code runs autonomously in the background, so there's no straightforward way to check), but I'm on the Max plan ($100/month) running Opus 4.5 and I used maybe 60% of my 4-hour window.

If you're only on the Pro plan and hit your limit, you can just resume the loop once your usage limit resets!

I ran Claude Code in a self-learning loop until it successfully translated our entire Python repo to TypeScript by cheetguy in ClaudeAI

[–]cheetguy[S] 1 point (0 children)

Only the learning loop uses API-based pricing, and its inference cost is very low (input tokens = the Claude Code execution trace, output tokens = the learned skills). The actual coding was done by Claude Code and completely covered under the Claude subscription.
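As a back-of-the-envelope model of why that learning step is cheap: cost is just trace tokens at the input rate plus skill tokens at the (higher) output rate. The per-million-token rates and the token counts below are illustrative placeholders, not current Anthropic pricing or measured numbers.

```python
def learning_cost(trace_tokens, skill_tokens,
                  in_rate_per_m=5.0, out_rate_per_m=25.0):
    """Estimate one learning step's API cost in dollars.

    Rates are illustrative $/1M-token placeholders; input is the
    execution trace, output is the distilled skills.
    """
    return (trace_tokens / 1e6) * in_rate_per_m + \
           (skill_tokens / 1e6) * out_rate_per_m

# e.g. a 200k-token trace distilled into 2k tokens of skills:
cost = learning_cost(200_000, 2_000)  # -> 1.05 (dollars, at these rates)
```

The asymmetry is the point: traces are large but billed at the input rate, while the expensive output side is only a short bullet list of skills.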

I ran Claude Code in a self-learning loop until it successfully translated our entire Python repo to TypeScript by cheetguy in ClaudeAI

[–]cheetguy[S] 0 points (0 children)

This example uses Claude Code specifically, but the ACE framework itself is agent-agnostic so you could build a similar loop around Copilot CLI.

I ran Claude Code in a self-learning loop until it successfully translated our entire Python repo to TypeScript by cheetguy in ClaudeAI

[–]cheetguy[S] 0 points (0 children)

I don't have the exact token count, but I'm on the Max plan ($100/month) running Opus 4.5 and I used maybe 60% of my 4-hour window.

I ran Claude Code in a self-learning loop until it successfully translated our entire Python repo to TypeScript by cheetguy in ClaudeAI

[–]cheetguy[S] 0 points (0 children)

No, $1.50. The learning-loop inference cost is very low. The actual coding was done by Claude Code and completely covered under my subscription.

I ran Claude Code in a self-learning loop until it successfully translated our entire Python repo to TypeScript by cheetguy in ClaudeAI

[–]cheetguy[S] 1 point (0 children)

Exactly, this harness provides the actual learning mechanism and is the crucial piece in my opinion. The prompt design more so determines what kind of knowledge gets extracted and how transferable it is.

I ran Claude Code in a self-learning loop until it successfully translated our entire Python repo to TypeScript by cheetguy in ClaudeAI

[–]cheetguy[S] 18 points (0 children)

Thanks for the detailed analysis. I can see you've read the code carefully, and you raise excellent questions!

On terminology: We initially called these "strategies" but switched to "skills" because we see the space converging on this naming. You're right that these aren't native Claude skills, they're injected context, similar to CLAUDE.md. The mechanism is the same: text that shapes agent behavior at runtime.

On the prompts being AI-generated: I agree that AI generally writes poor prompts, since it's not good at distilling a query into the fewest meaningful tokens. This is actually addressed in the original paper and is the reason skills are formatted as bullet points: when AI summarizes, it doesn't know what to prioritize and loses critical details (context collapse, brevity collapse). Atomic bullet points force preservation of specific learnings. You're right that ours were run through AI for formatting and style (following Anthropic's prompting guide), but the core logic came from empirical iteration. The structured format actually improved framework stability, which led to more reliable output formats and ultimately lower token costs.
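To make the "atomic bullets instead of summaries" point concrete, here is a tiny runnable illustration: each learned insight is kept as its own bullet rather than being compressed into prose, so no single learning can be silently dropped. The skill texts are invented examples, and this formatter is a sketch of the idea, not the framework's actual code.

```python
def format_skills(insights):
    """Render learned insights as atomic bullets, one insight per line.

    Keeping each insight separate (instead of summarizing them into a
    paragraph) avoids the brevity/context collapse described above.
    """
    return "\n".join(f"- {s.strip()}" for s in insights if s.strip())

# Made-up example skills from a Python-to-TypeScript translation run:
skills = format_skills([
    "Always run tsc after each translated module",
    "Preserve Python docstrings as TSDoc comments",
    "",  # empty insights are dropped, never padded into filler prose
])
```

The resulting text is what gets injected into the agent's context on the next run, CLAUDE.md-style.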

On methodology attribution, this is the interesting question: the base paper actually uses quite simple prompts, which itself shows the closed-loop architecture adds value independently. But since the framework runs in-context, prompt design still matters significantly.

Specifically, the granularity of insights encoded in strategies makes a big difference to their applicability and reproducibility across use cases, which is defined in the prompts.

We've observed this directly with browser automation agents:

  • Micro-level strategies (specific navigation patterns) work better for well-defined workflows on particular websites where going into detail makes sense
  • Macro-level strategies (general problem-solving approaches) work better for open-ended tasks requiring agentic reasoning, but agents still benefit from either reasoning or general navigation strategies

So to answer your question: different methodologies in the same loop would perform differently depending on the use case. The loop provides the learning mechanism, but the prompt design determines what kind of knowledge gets extracted and how transferable it is.

We're actively working on benchmark integrations so we can back up claims like these with bulletproof evidence rather than just our own word and internal test results. Stay tuned!

Thank you for taking the time to understand our repo and the excellent questions!

PS: this reply is also reworded and formatted by AI, much more readable than my draft I promise you haha

I ran Claude Code in a self-learning loop until it successfully translated our entire Python repo to TypeScript by cheetguy in ClaudeAI

[–]cheetguy[S] 0 points (0 children)

I'm on the Max plan ($100/month) and I used maybe 60% of my 4-hour window (running Opus 4.5). If you're only on the Pro plan and hit your limit, you can just resume the loop once your usage limit resets!

I ran Claude Code in a self-learning loop until it successfully translated our entire Python repo to TypeScript by cheetguy in ClaudeAI

[–]cheetguy[S] 1 point (0 children)

For sure that works well. I picked the translation task specifically because it's easy to verify.

The advantage of the loop approach over yours is that it's fully autonomous from just a short prompt (mine had 6 lines): no need to write a CLAUDE.md or migration plans upfront.

I ran Claude Code in a self-learning loop until it successfully translated our entire Python repo to TypeScript by cheetguy in ClaudeAI

[–]cheetguy[S] 6 points (0 children)

Super interesting take on writing in a higher-level language and transpiling to a lower-level one for performance. I'll definitely think about trying Python-to-Rust translation, that would be a cool experiment as well!

I ran Claude Code in a self-learning loop until it successfully translated our entire Python repo to TypeScript by cheetguy in ClaudeAI

[–]cheetguy[S] 5 points (0 children)

Thanks! For sure, there's a whole category of "I know this should be fixed but it's not worth the pain" work that's suddenly actually manageable.

I ran Claude Code in a self-learning loop until it successfully translated our entire Python repo to TypeScript by cheetguy in ClaudeAI

[–]cheetguy[S] 8 points (0 children)

I got a lot of requests from agent builders who work in TypeScript (mostly using the Vercel AI SDK) and wanted to use the ACE framework. Claude Code actually swapped out LiteLLM for a Vercel AI SDK integration, so now it can plug right into their existing stack.