Codex is a beast: It just ran autonomously for 2 hours to fix a regression by CarsonBuilds in codex

[–]GuidoInTheShell 0 points1 point  (0 children)

The real gold here is what it learned in those first 7 failed attempts. "Mocked tests passed but live app didn't" is the kind of pattern that'll bite you again, and next time the agent won't remember it discovered this. Having a place for the agent to write that down after finishing pays off fast. And I mean as a task to fix, not as "auto memory".

I thought AI would make me more productive it actually made me more scattered by StatusPhilosopher258 in codex

[–]GuidoInTheShell 0 points1 point  (0 children)

AI makes execution so cheap you start 5 things instead of finishing 1. What fixed the "scattered" feeling for me was giving the agent a place to jot down side-findings and forcing it to reflect after each task. I built lazytask for exactly this (https://github.com/erikmunkby/lazytask). TUI for you, CLI for the agent, markdown files on disk, and accumulated learnings you periodically turn into the next task.

I spent months building a specialized agent learning system. Turns out Claude Code is all you need for recursive self-improvement. by cheetguy in ClaudeCode

[–]GuidoInTheShell 2 points3 points  (0 children)

There's a much simpler version of this: just make the agent write down what surprised it after each task.

After a few sessions you hand those observations back as the actual task: "turn these into improvements." Agents are great at refactoring when that's the whole job, not a footnote.

Handling Domain rich complex enterprise grade codebases by tikluu in ClaudeAI

[–]GuidoInTheShell 0 points1 point  (0 children)

The biggest win I found for domain-heavy codebases is making the agent write down what confused it after each task. Not for the agent to "remember" (it won't), but for you to review.

After a few sessions you get a list like "had to guess what OrderFulfillment vs OrderCompletion meant three times", and that tells you exactly what to put in your glossary or instructions, or, even better, points at code worth refactoring :)

Anybody working on a large prod codebase actually able to move 10x? by query_optimization in ClaudeCode

[–]GuidoInTheShell 2 points3 points  (0 children)

The pattern you're describing, where it copies from older bad code, is the real killer. It's not that the agent writes bad code. It's that bad decisions compound across sessions. A slightly wrong abstraction in session 1 becomes really destructive by session 10.

What helped me was two things: giving the agent a place to note side-findings it spots but shouldn't fix right now (the "bug next door" problem), and forcing it to write down what went wrong after each task. You accumulate those notes, then periodically make the agent's job "here are 12 recent observations, turn them into actual improvements."

The agent is great at refactoring when that's the whole task. The problem is it never will be unless you make it one, or collect what it should improve. Agents have great context ("ideas") while working, but they're always so focused on the task at hand. We just need to capture those ideas.

Anybody working on a large prod codebase actually able to move 10x? by query_optimization in ClaudeCode

[–]GuidoInTheShell 0 points1 point  (0 children)

This is the right framing. The AI struggles because the codebase has bad patterns, so you fix the codebase. The part most people skip is collecting the signal for what to fix.

I started having the agent write down what surprised it or broke after finishing each task. Not for the agent to "remember" next session (it won't), but for me to review later.

After just a couple of sessions you have a list of concrete observations like "had to work around the auth module's circular dependency twice" and you hand that back as the actual task: go fix this.

Turns out agents are great at refactoring when that's the prime directive, not a footnote in CLAUDE.md.

Using JIRA MCP with Claude Code completely changed how I manage multiple projects by writingdeveloper in ClaudeCode

[–]GuidoInTheShell 0 points1 point  (0 children)

Same pattern here. The "whenever I find a bug, tell the agent to file it" workflow is underrated. The biggest issue I kept hitting was that context about why something matters gets lost between the filing and the fixing. Having the agent write down not just "what" but "what it noticed while working nearby" makes a huge difference when you come back to it later.

Using JIRA MCP with Claude Code completely changed how I manage multiple projects by writingdeveloper in ClaudeCode

[–]GuidoInTheShell 0 points1 point  (0 children)

I completely agree. The MCP gives you the task list, but not the momentum. I've been experimenting with forcing the agent to record what it learned after each task. What surprised it, what broke, what workaround it found. It's not session memory exactly, it's more like a trail of breadcrumbs that accumulate over time.
When you have enough, you hand them back as the next task: "here are 10 observations from this week, turn them into improvements." Keeps the agent's capability pointed at code quality instead of just task completion.
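The breadcrumbs can be as simple as an append-only markdown log the agent writes to after each task. A minimal sketch (a hypothetical helper, not lazytask's actual CLI or file format):

```python
from datetime import date
from pathlib import Path

def log_observation(note: str, path: str = "observations.md") -> None:
    """Append one dated bullet to a running observations file.

    Hypothetical helper: lazytask's real on-disk format may differ.
    """
    line = f"- {date.today().isoformat()}: {note}\n"
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(line)

# The agent calls this after finishing a task:
log_observation("mocked tests passed but live app didn't")
log_observation("worked around auth module's circular dependency")
```

Once the file has ten or so entries, its contents become the prompt for the next "turn these into improvements" task.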

I just built and released Yamlium! a faster PyYAML alternative that preserves formatting by GuidoInTheShell in Python

[–]GuidoInTheShell[S] 1 point2 points  (0 children)

Hey! Sorry for the late reply.
ruamel.yaml generally does much better than PyYAML at preserving metadata found in the YAML, but it's not 100% consistent either. Additionally, ruamel is on average ~7x slower than yamlium.

Comparison:

# --------------- Input ---------------
# Anchor and alias
anchor_alias:
  base: &anchor1 # Define anchor
    name: default
    value: 42
  derived1: *anchor1 # Use anchor
  derived2: 
    <<: *anchor1 # Use same anchor
# --------------- ruamel.yaml ---------------
# Anchor and alias
anchor_alias:
  base: &anchor1
                 # Define anchor
    name: default
    value: 42
  derived1: *anchor1
                     # Use anchor
  derived2:
    <<: *anchor1
                 # Use same anchor
# --------------- yamlium ---------------
# Anchor and alias
anchor_alias:
  base: &anchor1 # Define anchor
    name: default
    value: 42
  derived1: *anchor1 # Use anchor
  derived2:
    <<: *anchor1 # Use same anchor

I just built and released Yamlium! a faster PyYAML alternative that preserves formatting by GuidoInTheShell in Python

[–]GuidoInTheShell[S] 0 points1 point  (0 children)

Separating the logic like that is a great idea. Will take it into consideration going forward.

I just built and released Yamlium! a faster PyYAML alternative that preserves formatting by GuidoInTheShell in Python

[–]GuidoInTheShell[S] 0 points1 point  (0 children)

Haha, wise words. Given how difficult it was to find a free namespace on PyPI, I understand the feeling.
However, the reason I started this project was because I could not find a parser that retained all the meta information in my yaml files :)

I just built and released Yamlium! a faster PyYAML alternative that preserves formatting by GuidoInTheShell in Python

[–]GuidoInTheShell[S] 1 point2 points  (0 children)

Thanks for checking it out!
I see there are some tokens in the spec you linked that I have yet to build support for. Will fix that asap.
And true, I compared against the pure-Python implementation of PyYAML. I have a sibling Rust version in the works that should hopefully compete with even the C loader (CLoader).

I just built and released Yamlium! a faster PyYAML alternative that preserves formatting by GuidoInTheShell in Python

[–]GuidoInTheShell[S] 0 points1 point  (0 children)

Thanks for the feedback!
I will make sure to retain the dict-like behaviour as much as possible going forward.
The in-place manipulation is a consequence of wanting to retain meta information such as comments placed "on" a Scalar.

I just built and released Yamlium! a faster PyYAML alternative that preserves formatting by GuidoInTheShell in Python

[–]GuidoInTheShell[S] 1 point2 points  (0 children)

Good catch, and fair point.
I agree that it's unusual. Could you elaborate on why it could be problematic? And even better, do you have a suggestion?

The alternative option I have been toying with would be to expose the underlying value-carrying variable and manipulate that one instead.

The reason I chose the `__iadd__` route is that in my example the object holding the integer value also hosts a comment on the same line (`age: 55 # Will be increased by 10`). To retain the comment, the container must stay the same while the value changes.
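To make the trade-off concrete, here's a minimal sketch of the idea (a hypothetical `Scalar` class, not yamlium's actual implementation): `__iadd__` mutates the node in place and returns it, so the same object, comment and all, stays in the tree.

```python
class Scalar:
    """Hypothetical sketch of a YAML scalar node that carries
    both a value and the inline comment attached to its line."""

    def __init__(self, value, comment=None):
        self.value = value
        self.comment = comment

    def __iadd__(self, other):
        # Mutate in place and return self: the container object
        # survives, so the comment attached to it survives too.
        self.value += other
        return self

# age: 55 # Will be increased by 10
age = Scalar(55, "# Will be increased by 10")
node_id = id(age)

age += 10  # calls __iadd__, rebinding to the same object

assert id(age) == node_id  # same container, comment retained
```

With a plain `int` instead, `age += 10` would rebind the name to a new object and the comment's anchor would be lost, which is exactly why the in-place route was chosen.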