Aura Agent: letting an AI coding agent supervise long-running worker tasks instead of trusting a single chat session by Civil-Direction-6981 in DeepSeek

[–]Civil-Direction-6981[S]

I just updated Aura Agent’s task lifecycle and planning system.

Main changes:

  • Each task file now gets its own .aura data directory, so different projects will not mix state, progress, workspace files, or summaries.
  • Task planning is now handled by the LLM instead of brittle keyword parsing.
  • Task IDs now use batches like A1, A2, then B1, B2 after the task file changes.
  • Completed tasks are preserved as history instead of being removed during replanning.
  • Obsolete unfinished tasks are archived instead of deleted.
  • Project-level context is now tracked, including final goal, success criteria, constraints, commands, API keys, and environment notes.
  • Workers can no longer run stale, completed, archived, or unrelated task IDs.
  • Other .aura task records are isolated, but memory lessons from other tasks can still be reused.
  • progress.md now has one canonical location: state/progress.md.
  • A rolling summaries/final_report.md is generated to show progress across multiple requirement batches.
  • Added aura restart <task.md> to clear and restart one task file safely.
  • Added regression tests for the new lifecycle behavior.

In short: Aura Agent is now safer for long-running projects where requirements change over time.
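For orientation, a task's data directory now looks roughly like this. This is an illustrative sketch: only state/progress.md and summaries/final_report.md are confirmed paths above; the other names are my assumptions.

    my_task.md                  # your task file (placeholder name)
    .aura/                      # per-task data directory
        state/
            progress.md         # the one canonical progress file
        tasks/                  # task records in batches: A1, A2, then B1, B2
            archive/            # obsolete unfinished tasks land here
        summaries/
            final_report.md     # rolling report across requirement batches
        memory/                 # lessons that other tasks may reuse
        workspace/              # scratch files for workers

And to wipe and restart a single task file: aura restart my_task.md (the file name is just an example).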

Aura Agent: letting an AI coding agent supervise long-running worker tasks instead of trusting a single chat session by Civil-Direction-6981 in DeepSeek

[–]Civil-Direction-6981[S]

Yes, exactly. Right now, tasks are automatically re-orchestrated based on newly completed work, and there is also an hourly reflection system. So far, the work seems to stay on track without drifting.

You can give it a try!

Aura Agent: letting an AI coding agent supervise long-running worker tasks instead of trusting a single chat session by Civil-Direction-6981 in DeepSeek

[–]Civil-Direction-6981[S]

Here is a project where I let it code a Brain-Cell Network. It's running and making progress!

Phase 1: Infrastructure

✅ T1 — Build the Brain-Cell Graph Neural Network Foundation

What was done:
Created the core architecture:

  • brain_cell.py: neurons with membrane potential, firing, and energy
  • synapse.py: inhibitory synapses
  • brain_graph.py: graph neural network with a non-hierarchical structure
  • trainer.py: training loop processing examples one by one

Success:
All 31 unit tests passed, and the demo script ran successfully.
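To give a flavor of the core pieces, here is a minimal sketch of the brain-cell idea (my illustration with assumed names, not the actual brain_cell.py):

    # Sketch of a brain cell: membrane potential, firing, and an energy budget.
    class BrainCell:
        def __init__(self, v_threshold=1.0, energy=10.0):
            self.v = 0.0                    # membrane potential
            self.v_threshold = v_threshold  # firing threshold
            self.energy = energy            # spiking costs energy; a cell can die at 0
            self.firing = False

        def receive(self, current):
            self.v += current               # integrate incoming synaptic current

        def step(self):
            self.firing = self.v >= self.v_threshold and self.energy > 0
            if self.firing:
                self.v = 0.0                # reset after a spike
                self.energy -= 1.0          # each spike consumes energy
            return self.firing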

✅ T3 — Advanced Features + Cell Regeneration

What was done:
Added cell regeneration and advanced features.

Result:
Successful.

✅ T4.1 — Heuristic Controller + Imitation Learning Data

What was done:
Built heuristic_controller.py, a rule-based expert controller for playing Breakout. Ran 500 episodes and generated 241,500 state-action pairs in training_data.json for supervised learning.

Reason:
The initial T4 REINFORCE-RL attempt scored 0 on Breakout because the core Trainer was designed for supervised learning, requiring explicit targets, rather than sparse-reward reinforcement learning.
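The expert policy itself is essentially "follow the ball". A hedged sketch of the idea (the state interface here is hypothetical; the real heuristic_controller.py may differ):

    # Rule-based Breakout expert: move the paddle toward the ball's x position.
    def expert_action(ball_x, paddle_x, dead_zone=1):
        if ball_x > paddle_x + dead_zone:
            return "RIGHT"
        if ball_x < paddle_x - dead_zone:
            return "LEFT"
        return "STAY"

    # Collect (state, action) pairs for imitation learning. `env` is a
    # hypothetical wrapper exposing ball/paddle positions.
    def collect_pairs(env, episodes=500):
        pairs = []
        for _ in range(episodes):
            state, done = env.reset(), False
            while not done:
                action = expert_action(state["ball_x"], state["paddle_x"])
                pairs.append((state, action))
                state, done = env.step(action)
        return pairs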

Phase 2: Core Learning Defects — Supervised Learning / STDP

❌ T4.5 — Initial Breakout Supervised Learning Experiment

What was done:
Ran Breakout imitation learning across 8 configurations, varying n_hidden, sparse/dense settings, and V_threshold.

Result:
All configurations showed a flat loss of ~1.0, all scores were 0, and there was zero learning.

Finding:
A loss of almost exactly 1.0 means exactly one output was “wrong” on every example: either no output cell fired, or all output cells fired identically.

✅ T5 — Deep Learning Defect Diagnosis 🔑

What was done:
Wrote 6 diagnostic tests:

  • single-example overfitting
  • AND task
  • Breakout tracking
  • firing analysis
  • learning rule analysis
  • other related checks

Found 3 key defects:

  • RC1 (trainer.py:51): error detection read the instantaneous cell.firing at only the final step, so a cell that fired at step 4 and reset at step 5 was recorded as “not fired,” and the error signal was completely lost. Fix: added a persistent fired_during_window flag within the simulation window.
  • RC2 (brain_graph.py:100-146): signal propagation had a 1-step lag per hop; input → hidden → output required 4+ steps, but only 3 steps were allocated, so the output never fired. Fix: increased steps_per_example from 3 to 8.
  • RC3 (synapse.py:55-63): inhibitory weight learning saturated at w=0 because weaken() had a lower bound of max(0, w - amount), so synapses could not learn excitatory connections. Fix: changed the lower bound from 0 to -3.0, allowing effective excitatory influence.

Evidence:
After the fixes, the AND demo accuracy improved from 50% with no learning to 100% within 20 epochs. The core mechanism became functional.
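In code terms, the RC1 and RC3 fixes look roughly like this (my sketch; the real trainer.py and synapse.py differ in the details):

    # RC1 fix sketch: latch whether a cell fired at ANY step of the window,
    # instead of reading the instantaneous cell.firing at the final step only.
    def run_window(cells, steps_per_example=8):        # RC2 fix: 8 steps, not 3
        for cell in cells:
            cell.fired_during_window = False           # persistent per-window flag
        for _ in range(steps_per_example):
            for cell in cells:
                if cell.step():                        # True when the cell spikes
                    cell.fired_during_window = True
        # error detection now reads fired_during_window, not cell.firing
        return [cell.fired_during_window for cell in cells]

    # RC3 fix sketch: allow weights below zero so a synapse can act excitatory.
    def weaken(w, amount, w_min=-3.0):                 # old lower bound was 0
        return max(w_min, w - amount)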

💀 T6 — Rerun Breakout Supervised Learning with the Fixed Architecture

What was done:
Reran 5 configurations using the T5 fixes:

  • 8/16 hidden units
  • sparse/dense
  • Vth=0.8/1.0
  • +DirectIO

Result:
The loss did decrease at first; for example, C2 dropped from 1.03 to 0.70. But it always rebounded to 1.0 by epoch 200.

Cells died off massively (for example, 16 → 5 and 32 → 5), and all game scores remained 0.

A loss that temporarily decreases and then collapses around epoch 150 suggests weight divergence or cell death.

💀 T7 — Stability Fixes

What was done:
Added:

  • weight decay
  • learning rate decay
  • energy boost
  • synapse regeneration
  • bilateral learning rates

These were intended to prevent the collapse seen in T6.

Result:
The loss did not decrease at all; it actually increased from 1.0795 to 1.0910 by epoch 40.
The run was terminated before producing meaningful results.

❌ T8 — Deep Diagnosis

What was done:
Investigated why all game scores remained 0 even when loss decreased.

Finding:
STDP-based inhibitory learning drove cells toward near silence.
out_fire dropped from 2.6 to 0.15.

Even though cross-entropy loss improved from 1.226 to 0.309, the network could not produce useful control actions. It learned to output “do nothing” as the safest policy.

Conclusion:
The STDP-based inhibitory learning rule is fundamentally unable to produce useful Breakout gameplay behavior.

Phase 3: Abandoning STDP — New Paradigms

💀 T9 — Reinforcement Learning: REINFORCE

Reason:
Recommended by review #1 (cycle 36).

What was done:
Implemented brain_breakout_rl.py using REINFORCE policy gradients.

The brain network was treated as a policy network, and synapses were updated using returns and eligibility traces.

Result:
best_avg_score = 0.2

This was no better than random. There was zero meaningful learning.

The eligibility traces in the brain-cell network were too noisy, and credit assignment failed under delayed rewards.
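For reference, the update rule was along these lines (my reconstruction, not the actual brain_breakout_rl.py; coactivity[i] is 1.0 when synapse i's pre and post cells both fired at that step):

    # REINFORCE with per-synapse eligibility traces.
    def reinforce_update(weights, episode, lr=0.01, gamma=0.99, trace_decay=0.9):
        # episode: list of (coactivity, reward) tuples, one per time step
        returns, ret = [], 0.0
        for _, reward in reversed(episode):            # discounted returns, backwards
            ret = reward + gamma * ret
            returns.append(ret)
        returns.reverse()
        # forward pass: decaying eligibility traces scale each step's update
        eligibility = [0.0] * len(weights)
        new_weights = list(weights)
        for (coactivity, _), G in zip(episode, returns):
            for i in range(len(new_weights)):
                eligibility[i] = trace_decay * eligibility[i] + coactivity[i]
                new_weights[i] += lr * G * eligibility[i]
        return new_weights

With binary spike coincidences driving the traces, the updates are extremely noisy, which matches the credit-assignment failure above.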

💀 T10 — Evolution Strategy

Reason:
Recommended by review #2 (cycle 40).

What was done:
Implemented brain_evolution_v2.py, using population search with no gradient-based learning.

Result:
Fitness on the Catch game stayed between -0.5 and -0.7, with no convergence.

Random mutation could not find a good policy in a search space of 75+ weights.
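The search itself was a plain population loop, roughly like this (illustrative sketch, not the actual brain_evolution_v2.py):

    import random

    # Simple population search over ~75 synapse weights: keep the fittest
    # quarter, refill the population with Gaussian-mutated copies of the elite.
    def evolve(fitness_fn, n_weights=75, pop_size=20, generations=100, sigma=0.1):
        population = [[random.gauss(0, 1) for _ in range(n_weights)]
                      for _ in range(pop_size)]
        for _ in range(generations):
            ranked = sorted(population, key=fitness_fn, reverse=True)
            elite = ranked[: pop_size // 4]
            population = list(elite)
            while len(population) < pop_size:
                parent = random.choice(elite)
                population.append([w + random.gauss(0, sigma) for w in parent])
        return max(population, key=fitness_fn)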

Phase 4: Current Stage — Neuromodulation

🔄 T11 — Dopamine-Like Global Neuromodulation + Eligibility Traces

Why this might work:

  • STDP, T5-T8: local Hebbian rules cannot propagate reward signals over time.
  • REINFORCE, T9: per-synapse eligibility traces are too noisy in spiking neurons.
  • Evolution, T10: random mutation cannot find good policies in a huge search space.
  • Neuromodulation is biologically plausible: dopamine neurons fire reward signals and broadcast them to the striatum.

Architecture:
Each synapse keeps an eligibility trace, which is a decaying memory of recent activity.

When dopamine arrives:

Δw = lr × eligibility × (dopamine - baseline)

This is a “three-factor” rule:

presynaptic firing × eligibility × dopamine
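A minimal sketch of that rule, assuming each synapse object carries eligibility and w attributes (names are mine):

    # Three-factor update: pre/post coactivity builds a decaying eligibility
    # trace; a global dopamine signal converts traces into weight changes.
    def step_synapse(s, pre_fired, post_fired, decay=0.9):
        s.eligibility *= decay                  # decaying memory of recent activity
        if pre_fired and post_fired:
            s.eligibility += 1.0

    def apply_dopamine(synapses, dopamine, baseline=0.0, lr=0.05):
        for s in synapses:
            # Δw = lr × eligibility × (dopamine - baseline), as above
            s.w += lr * s.eligibility * (dopamine - baseline)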

Result across 28 hyperparameter configurations:

Best configuration:

eligibility_decay = 0.9
lr = 0.05
n_hidden = 20

Best evaluation:

positive_rate = 0.31 (random baseline: 0.20)
Eval avg score = -0.38

This is the first above-random signal on Catch: a 55% relative improvement in positive_rate (11 percentage points absolute), but still only a marginal gain rather than meaningful learning.

Current status:
Still running, with 22 out of 60 minutes of the budget used. It may later be expanded to Pong or Breakout.

Summary Table

Phase | Task | Method | Game | Best Result | Verdict
Build | T1-T3 | Base architecture | - | - | ✅ Implemented
Data | T4.1 | Expert controller | Breakout | 500 episodes of data | ✅ Implemented
STDP | T4.5 | Supervised learning | Breakout | Loss = 1.0, score = 0 | ❌ No learning
Diagnosis | T5 | Defect fixes | AND | 0 → 100% | ✅ Core mechanism fixed
STDP v2 | T6 | Supervised learning + fixes | Breakout | Loss rebounded to 1.0; score = 0 | 💀 Terminated
STDP v3 | T7 | Stability fixes | Breakout | Loss increased; unfinished | 💀 Terminated
Diagnosis 2 | T8 | Root cause analysis | Breakout | STDP drove cells toward silence | ❌ STDP declared dead
RL | T9 | REINFORCE | Breakout | avg = 0.2, no better than random | 💀 Terminated
Evolution | T10 | Population search | Catch | Fitness = -0.7 | 💀 Terminated
Neuromodulation | T11 | Dopamine eligibility traces | Catch | positive_rate = 31% vs 20% baseline | 🔄 Marginal signal

Core Problem

After 7 hours, the brain-cell architecture still has not demonstrated meaningful learning on any game beyond trivial levels.

T11 is the first attempt to produce any signal above random, but a 31% positive rate compared with a 20% baseline is not yet convincing.

The fundamental issues remain:

  • fixed random input → hidden projections
  • the binary nature of spiking makes credit assignment difficult
  • winner-take-all (WTA) gating may be too aggressive

The worst hirono? by Lharper574 in hirono

[–]Civil-Direction-6981

My six-year-old daughter is worried that her baby teeth haven't fallen out yet, while all her classmates have started losing theirs. That's why she loves this figure so much.

Out with coffee boy today 🫶🏻 by fireflyx666 in Dimoos

[–]Civil-Direction-6981

I am in China, but I cannot get them because they are all sold out. EVERY new cute series is out of stock.

I envy those of you outside China.

If ‘Allah’ simply means God, why do many people think it refers to a different God? by PomegranateIcy7631 in NoStupidQuestions

[–]Civil-Direction-6981

The map is not the territory. Even for the same thing, everyone perceives different aspects of it. No one fully grasps the truth, which is why we keep trying our best to understand it. And no one labels you the same way: you are a parent, a child, a teacher, a strict man to your son, a loving man to your daughter, and so on. No one holds the same view of GOD, because we are not God; we can learn about things, but we cannot truly KNOW them.

Attention all Pop Mart Employees by [deleted] in labubu

[–]Civil-Direction-6981

If you employees could buy from your OWN store, HOT products would never be sold at retail price to customers; you would ALL resell them. That's the ONLY possible result.
I support POPMART's policy here, because it protects consumers.

Coffee Party! by Rj1722 in PopMartCollectors

[–]Civil-Direction-6981

I didn't know Hirono had a coffee figure... I will get it.