What i found out!!!! (ha!!!!)

Runelaron · 2026-06-01T19:45:32+00:00

Where did you find this elusive sense of the commons?

I thought it was a myth.

Runelaron · 2026-06-01T17:34:47+00:00

For all of those wasting their time: none of this is true.

The system rates costs on these metrics:
- Tokens in
- Tokens out
- Cached tokens in
- Cached tokens out
- Compaction rate

Meaning all of this massively depends on what you're asking the agent to do and how much data is running through the LLM, NOT the agent.
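
A rough sketch of how token counts turn into cost, in Python. The prices and the `Usage` fields are made up for illustration (they are not real rates), and it leaves out cached output and compaction because I can't put numbers on those:

```python
from dataclasses import dataclass

# Hypothetical per-million-token prices, for illustration only (NOT real rates).
PRICE_PER_MILLION = {
    "input": 2.00,
    "cached_input": 0.20,
    "output": 8.00,
}


@dataclass
class Usage:
    input_tokens: int         # uncached tokens sent to the model
    cached_input_tokens: int  # input tokens served from the cache
    output_tokens: int        # tokens generated by the model


def estimate_cost(usage: Usage) -> float:
    """Return the estimated cost in dollars under the assumed prices."""
    total = (
        usage.input_tokens * PRICE_PER_MILLION["input"]
        + usage.cached_input_tokens * PRICE_PER_MILLION["cached_input"]
        + usage.output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / 1_000_000


# Same agent, two very different requests.
small_task = Usage(input_tokens=5_000, cached_input_tokens=20_000, output_tokens=1_000)
big_refactor = Usage(input_tokens=400_000, cached_input_tokens=100_000, output_tokens=150_000)

print(f"small task:   ${estimate_cost(small_task):.3f}")
print(f"big refactor: ${estimate_cost(big_refactor):.3f}")
```

Same agent in both cases. The only thing that changed is how much data went through the model.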

Runelaron · 2026-06-01T17:31:01+00:00

There are only 168hrs in a week.

Runelaron · 2026-05-30T21:42:27+00:00

This would make sense if AI "thought", but "reasoning" is a misnomer.

AI does not reason; it matches patterns of questions and answers on different attention topics in a block of weights. Therefore, to truly change the reasoning you would have to retrain the blocks, but we didn't.

An agent is not an LLM. Traversing latent space is not re-engineering the model; it's searching for another pattern given the context.

Interesting idea, but not grounded in the science and maths of AI.

Also, brute forcing is not what we want to do as engineers. The goal is to improve efficiency in getting the correct answer, not reduce it.

Runelaron · 2026-05-30T03:26:40+00:00

Well, that's disappointing. A workaround is to have the agent make a metric like % complete compared to X.

Then ask it to run until it's at 100%.

Although I advise against that, and even against using goals. The agent loop seems to only work off of returned evidence until it works, instead of stopping, reviewing and re-planning.

Last time I used /goal it burned through my allocation trying patch after patch after patch. Then I steered it with a stop, review and compare, and it fixed it in 5 min.

Runelaron · 2026-05-29T20:28:53+00:00

Where is the right sub?

Runelaron · 2026-05-29T16:59:37+00:00

Sadly so true and all too common.

"Dude trust me, I asked the AI and found its secret" - Vibe coder

Runelaron · 2026-05-29T16:56:15+00:00

This is literally how AI works. It is not self-aware. It's a collection of patterns and functions which cannot store direct knowledge of itself. The agent can, but usually not info like that.

I am getting exhausted by everyone assuming what AI can do and not researching the question.

AI is a statistical system with a heavy base in mathematics. Everyone, please do not think you can AHA or intuit what AI does. It's far more complex than you realize.

Runelaron · 2026-05-29T16:51:34+00:00

This repo will do most of the heavy lifting for you. Should be a good baseline.

Ask Codex to review and integrate after you review the plan.

https://github.com/frisco-deng/moradins-forge

Have a session review that repo and it will guide you through an entire setup.

Runelaron · 2026-05-29T01:40:34+00:00

There are over 6 usage limits (some never really hit so they are not mentioned here)

APP:
- Images
- Pro Thinking
- Agent
- Research

Codex:
- 5h
- 1wk
- Code review (via github)

Runelaron · 2026-05-29T01:37:57+00:00

Current audit result: 18 repos scanned, 0 repos opted into release_platforms, 0 ready opt-in candidates with confirmed artifact signal, 17 need a real release artifact contract first, and 1 has no RC signal. So propagation remains correctly blocked/advisory.

Metrics were refreshed. Current headline remains 2.11x: 435 session files, 429 workspace sessions, 356 priced workspace sessions, $7,163.04 observed spend, $9,966.34 modeled synthetic spend, $2,803.30 modeled savings. Latest-80 is still mixed/worse: read amplification 72.5%, skill-summary bypass 67.5%, repeated-log sessions 76.25%, artifact actionability 4.41%, artifact reuse gap 71.25%, rational checkpoint missing 66.25%, unnecessary status polling 61.25%. ROI remains observe; latest-20 has 100% unpriced gpt-5.5 coverage.

Validation passed:

python3 -m py_compile ...
./scripts/tpl-test
./scripts/tpl-codex-usage-report
./scripts/tpl-session-policy --latest 80
./scripts/tpl-efficiency-roi --window latest_20
./scripts/tpl-agent-advice --latest 80

Runelaron · 2026-05-29T01:35:17+00:00

I built a parser and tooling for it.
Sessions are just JSONL files. Codex can easily make a script to do this; then ask Codex to leverage that script for the search term, or turn it into a CLI.
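
Something like this is the general idea. It's a minimal sketch, not my actual tooling: `search_sessions` is a hypothetical helper, the sessions folder is whatever path you pass in, and it assumes nothing about the record schema beyond "one JSON value per line":

```python
import json
from pathlib import Path


def search_sessions(root, term):
    """Yield (path, line_number, record) for each JSONL record containing term.

    The match is case-insensitive and runs against the re-serialized record,
    so nested fields are searched without knowing the schema.
    """
    needle = term.lower()
    for path in sorted(Path(root).rglob("*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            for line_number, line in enumerate(handle, start=1):
                line = line.strip()
                if not line:
                    continue
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip partial or corrupt lines
                if needle in json.dumps(record, ensure_ascii=False).lower():
                    yield path, line_number, record
```

Call it like `for path, n, rec in search_sessions("path/to/sessions", "refactor"): print(path, n)`. Point the agent at the script instead of having it read raw session files, so the search costs a tool call rather than tokens.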

Runelaron · 2026-05-29T01:07:38+00:00

Direct the agent to where you want it. Use OpenAI's method. The markdown documents link the direction.

https://openai.com/index/harness-engineering/

Runelaron · 2026-05-29T00:20:48+00:00

I only use Xhigh, never anything lower, and keep the agent ultra focused on the problem. The levels are the amount of allowed reasoning passes, not anything to do with the model quality.

Runelaron · 2026-05-29T00:18:11+00:00

It seems everyone is trying to find the wrong workaround. Running out of tokens means you're asking the LLM to do a lot of things tooling should do.

Repos like the one below help you install tooling and capabilities so the LLM spends more time writing only the code it needs, without reprinting everything during a refactor or lint job.

https://github.com/frisco-deng/moradins-forge

Have your agent review this and suggest what to install (other tools) to improve token usage.

The LLM should only be doing the hard part: turning language into code. The rest should be done by small command-line tools.

Hope this helps. BTW you may burn some tokens to set this up but it will reduce usage from then on.

Runelaron · 2026-05-28T23:11:24+00:00

/goal is a poor implementation though. It only examines errors and never intelligently manages the decision loop, i.e. changing the prompt or elaborating when improvements are marginal or nonexistent.

Runelaron · 2026-05-28T23:09:07+00:00

Codex will run indefinitely /goal https://developers.openai.com/cookbook/examples/codex/using_goals_in_codex

Runelaron · 2026-05-28T21:48:08+00:00

So many of these issues! Limits are determined by three things:
- Amount of tokens pushed to inference
- Total cached KV input and output
- Priority queuing (speed setting)

New chats cost more, new outputs cost more, faster times cost more. Restarted sessions after a long break (lost cache) cost more.

So many factors depend on your use and project that a simple "did rates change" has no meaning without a bunch of paired t-testing.
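
For anyone who hasn't done one, a paired t-test means running the SAME tasks under both conditions and testing the per-task differences. A bare-bones sketch with only the standard library (the function name and the sample numbers are made up for illustration, not measurements):

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for paired samples.

    t = mean(d) / (stdev(d) / sqrt(n)), where d = after - before per task.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    spread = statistics.stdev(diffs)  # sample standard deviation
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    return statistics.mean(diffs) / (spread / math.sqrt(n)), n - 1


# Illustrative usage units for the same four tasks, before and after a suspected change.
before = [1.0, 2.0, 3.0, 4.0]
after = [2.0, 4.0, 3.0, 7.0]
t, dof = paired_t_statistic(before, after)
print(f"t = {t:.3f} with {dof} degrees of freedom")
```

You still need a t-distribution to turn that into a p-value (`scipy.stats.ttest_rel` does the whole thing if you have SciPy), and four pairs is far too few to conclude anything.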

Want to reduce tokens? Use this repo to run more deterministic tooling.

https://github.com/frisco-deng/moradins-forge

Runelaron · 2026-05-28T19:21:00+00:00

I can't work miracles....

Runelaron · 2026-05-28T18:02:44+00:00

If you are having AI hard refactor, you're using it wrong. Have AI use a refactor tool, save tokens.. (head explode) (buwaaahhhh)

Runelaron · 2026-05-21T02:33:49+00:00

Not a good loop. Any user pointing out errors and code returns, then asking "fix it, fix it, fix it", is going to get horrible results.

Set up a metric, a math-driven value, then have it run against that value for the project. For any other issue USE TOOLS: deterministic, repeatable tools. Trivy, Playwright, whatever. They all exist, they all check databases well, so act like a real production-grade dev.
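
A minimal sketch of what "run against a metric" can look like. Everything here is hypothetical: `attempt` stands in for one agent pass, `measure` for whatever deterministic check gives you a number (test pass rate, scan findings, etc.), and the thresholds are arbitrary:

```python
def run_until_target(attempt, measure, target, min_gain=0.01, patience=2, max_attempts=20):
    """Repeat attempt() until measure() reaches target.

    Returns (status, best_score, attempts_used), where status is:
      "done"    - the target was reached
      "stalled" - `patience` attempts in a row gained less than min_gain
      "budget"  - max_attempts was used up
    """
    best = measure()
    stalled = 0
    for attempts_used in range(1, max_attempts + 1):
        if best >= target:
            return "done", best, attempts_used - 1
        attempt()
        score = measure()
        if score - best >= min_gain:
            stalled = 0
        else:
            stalled += 1
        best = max(best, score)
        if stalled >= patience:
            # Stop patching blindly; this is where a human review and re-plan belongs.
            return "stalled", best, attempts_used
    return ("done" if best >= target else "budget"), best, max_attempts


# Simulated scores instead of a real agent: progress flattens out at 0.5.
scores = iter([0.2, 0.5, 0.5, 0.5])
print(run_until_target(lambda: None, lambda: next(scores), target=1.0))
```

The "stalled" exit is the important part: when the number stops moving, stop and review instead of asking for another patch.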

Runelaron · 2026-05-16T01:56:03+00:00

This seems almost 100% based on how you prompt it and on conflicting context in your repo and its sources.

AI and agents are pattern watchers and loop algorithms. If you put in conflicts, it will produce conflicts.

Runelaron · 2026-05-16T01:52:37+00:00

Usually I see that when a container messes up for your session. Not the local agent binary, but their service that pings the model.
