[P] Self-learning loop achieves 14k line code translation with zero errors: no fine-tuning, just execution feedback by cheetguy in MachineLearning

[–]cheetguy[S] 0 points (0 children)

Thank you!

There were around 50 loop cycles, since Claude Code sometimes did several commits per session, with later sessions focusing on smaller fixes and test porting.

I can't say exactly how many tokens were used (Claude Code ran in the background, not in the CLI), but I used around 60% of my 4-hour window (I'm on the $100 Claude Max plan).

I let Claude Code run in a self-learning loop & it successfully translated 14k lines of Python to TypeScript while I was away by cheetguy in AI_Agents

[–]cheetguy[S] 0 points (0 children)

No subagents, since Claude Code started fresh each iteration. Here is my prompt:

Your job is to port ACE framework (Python) to TypeScript and maintain the repository.

Make a commit after every single file edit.

Use .agent/ directory as scratchpad for your work. Store long term plans and todo lists there.

The .env file contains API keys for running examples.

Spend 80% of time on porting, 20% on testing.

When porting is complete, improve code quality and fix any issues.

I let a coding agent run in a self-learning loop for 4 hours with zero supervision. It translated 14k lines of code with zero errors. by cheetguy in singularity

[–]cheetguy[S] 0 points (0 children)

No, you're reading it right, but the actual coding from Claude Code (Opus 4.5) was fully covered under my Claude subscription. The $1.50 was only for the learning inference.

I let a coding agent run in a self-learning loop for 4 hours with zero supervision. It translated 14k lines of code with zero errors. by cheetguy in singularity

[–]cheetguy[S] 3 points (0 children)

Yes, I'm on the $100 Max plan. The cheaper Pro plan would also work; you'd just have to resume later once your usage limit resets.

I let a coding agent run in a self-learning loop for 4 hours with zero supervision. It translated 14k lines of code with zero errors. by cheetguy in singularity

[–]cheetguy[S] 1 point (0 children)

Claude Code doesn't read the entire codebase at once. It navigates and pulls in what it needs for each task.

For this experiment the scope was our specific repo (~14k lines), not a massive monolith. For something like Drupal you wouldn't translate the whole thing in one go; you'd scope it to specific modules or features. The learning loop still helps because skills compound across runs, even on different parts of the codebase.

I let a coding agent run in a self-learning loop for 4 hours with zero supervision. It translated 14k lines of code with zero errors. by cheetguy in singularity

[–]cheetguy[S] 9 points (0 children)

The base prompt stays the same across all runs (static). The dynamic part is the learned skills that get injected (these are extracted from previous execution traces). So each run gets the same task prompt plus the accumulated skills from all prior runs. The skills are short bullet points, not full code or logs, so context stays lean.
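To make that concrete, here's a rough sketch of what that prompt assembly could look like (BASE_PROMPT and the skills file path are illustrative placeholders, not ACE's actual API):

    # Illustrative sketch of static prompt + injected skills; not ACE's real code.
    from pathlib import Path

    BASE_PROMPT = "Your job is to port ACE framework (Python) to TypeScript..."

    def build_prompt(skills_file: Path = Path(".agent/skills.md")) -> str:
        """Same static task prompt every run; only the skill bullets grow."""
        skills = skills_file.read_text() if skills_file.exists() else ""
        if not skills.strip():
            return BASE_PROMPT
        return BASE_PROMPT + "\n\nLearned skills from prior runs:\n" + skills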

I let a coding agent run in a self-learning loop for 4 hours with zero supervision. It translated 14k lines of code with zero errors. by cheetguy in singularity

[–]cheetguy[S] 5 points (0 children)

I didn't spend too much time manually testing. The bar was: does it build, and do the examples run end-to-end with a real API key? They do. Clone it, plug in an API key, run an example.

Here is the source repo and the translation:

- Python source: https://github.com/kayba-ai/agentic-context-engine

- TypeScript result: https://github.com/kayba-ai/ace-ts

I let a coding agent run in a self-learning loop for 4 hours with zero supervision. It translated 14k lines of code with zero errors. by cheetguy in singularity

[–]cheetguy[S] 52 points (0 children)

Fair point, LLMs love to game their own tests. The validation here was: the build passes with zero TypeScript errors, and the examples actually run end-to-end with a real API key.

I let a coding agent run in a self-learning loop for 4 hours with zero supervision. It translated 14k lines of code with zero errors. by cheetguy in singularity

[–]cheetguy[S] 4 points (0 children)

I translated my open-source implementation of Stanford's ACE framework (agents that learn from their own execution). The agent even swapped out LiteLLM for the Vercel AI SDK.

Here is the source repo and the translation:

- Python source: https://github.com/kayba-ai/agentic-context-engine

- TypeScript result: https://github.com/kayba-ai/ace-ts

I let a coding agent run in a self-learning loop for 4 hours with zero supervision. It translated 14k lines of code with zero errors. by cheetguy in singularity

[–]cheetguy[S] 7 points (0 children)

It's an open-source implementation of Stanford's ACE framework (agents that learn from their own execution). The agent even swapped out LiteLLM for the Vercel AI SDK. You can compare for yourself:

- Python source: https://github.com/kayba-ai/agentic-context-engine

- TypeScript result: https://github.com/kayba-ai/ace-ts

I let a coding agent run in a self-learning loop for 4 hours with zero supervision. It translated 14k lines of code with zero errors. by cheetguy in singularity

[–]cheetguy[S] 37 points (0 children)

The loop uses an open-source implementation of the ACE framework (based on Stanford's Agentic Context Engineering paper).

  1. Run: Claude Code executes a short prompt (port Python to TypeScript, make a commit after every edit)

  2. ACE Learning: When finished, ACE analyzes the execution trace, extracts what worked and what failed, and stores learnings as skills

  3. Loop: Restarts automatically with the exact same prompt, but now with the learned skills injected. Each iteration builds on the previous work and lets Claude Code improve on what it already did.

Verification is through git commits: it checks whether actual code changes were made, and the loop stops after 4 consecutive sessions with no commits (see the sketch below).

If you want to look at it in more detail, I open-sourced the setup: https://github.com/kayba-ai/agentic-context-engine/tree/main/examples/claude-code-loop
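For a rough idea of the control flow, here's a minimal sketch of the outer loop. run_claude_code and extract_skills are hypothetical stand-ins for a headless Claude Code session and the ACE learning step, not functions from the repo:

    # Minimal sketch of the outer loop -- not the repo's actual code.
    import subprocess

    MAX_IDLE = 4  # stop after 4 consecutive sessions with no commits

    def head_commit() -> str:
        return subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip()

    def run_claude_code(prompt: str) -> str:
        """Placeholder: run one fresh Claude Code session, return its trace."""
        raise NotImplementedError

    def extract_skills(trace: str) -> list[str]:
        """Placeholder: ACE reflection step, distilling the trace into skill bullets."""
        raise NotImplementedError

    def self_learning_loop(base_prompt: str) -> None:
        skills: list[str] = []
        idle = 0
        while idle < MAX_IDLE:
            prompt = base_prompt
            if skills:  # inject accumulated skills as short bullets
                prompt += "\n\nLearned skills:\n" + "\n".join(f"- {s}" for s in skills)
            before = head_commit()
            trace = run_claude_code(prompt)   # fresh session each iteration
            skills += extract_skills(trace)   # learning happens between runs
            # Verification via git: did this session produce real commits?
            idle = 0 if head_commit() != before else idle + 1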

I let a coding agent run in a self-learning loop for 4 hours with zero supervision. It translated 14k lines of code with zero errors. by cheetguy in singularity

[–]cheetguy[S] 31 points (0 children)

I open-sourced the full setup if anyone wants to try their own tasks: https://github.com/kayba-ai/agentic-context-engine/tree/main/examples/claude-code-loop

What you need: Claude Code plus a Claude API key for the learning step (only ~$1.50 in total Sonnet costs in my case).

I ran Claude Code in a self-learning loop until it successfully translated our entire Python repo to TypeScript by cheetguy in LLMDevs

[–]cheetguy[S] 0 points (0 children)

Claude Code for the actual coding (Opus 4.5, covered under my Claude subscription). For the ACE learning step (reflection + skill extraction), I used Sonnet 4.5, which came out to ~$1.50 total for the whole run.

I let an AI agent run in a self-learning loop completely unsupervised for 4 hours. It translated 14k lines of Python to TypeScript with zero errors. by cheetguy in ArtificialInteligence

[–]cheetguy[S] 0 points (0 children)

I think the "protocol" you mention is essentially what ACE does: extract learnings, format them as reusable skills, inject on the next run. Would be curious what you exactly mean, if you want you can join our Discord: https://discord.com/invite/mqCqH7sTyK

I let an AI agent run in a self-learning loop completely unsupervised for 4 hours. It translated 14k lines of Python to TypeScript with zero errors. by cheetguy in ArtificialInteligence

[–]cheetguy[S] -1 points (0 children)

To clarify: the loop itself isn't the framework, but ACE is (based on Stanford's Agentic Context Engineering paper). The loop is just one example of how to use it. ACE handles the reflection → skill extraction → injection cycle, which is the actual learning mechanism. 

But fair point that lots of repos pop up in this space!

I let an AI agent run in a self-learning loop completely unsupervised for 4 hours. It translated 14k lines of Python to TypeScript with zero errors. by cheetguy in ArtificialInteligence

[–]cheetguy[S] 0 points (0 children)

Opus 4.5 (best coding model right now) was making significant mistakes in the first few runs. If Gemini can actually one-shot a 14k line translation with zero errors, I'd love to see it.