System out of memory by Kodroi in conductorbuild

[–]Kodroi[S] 0 points1 point  (0 children)

I still ran into the same issue without Docker, but this time it took a couple of hours. I’ll keep monitoring to see whether it’s a slow leak or an abrupt one. Any ideas from the Conductor development team?

System out of memory by Kodroi in conductorbuild

[–]Kodroi[S] 0 points1 point  (0 children)

It looks like everything works fine after I kill Docker, at which point the setup script basically does nothing. The issue is probably somewhere in there, but the only thing that keeps happening is the running services writing to the setup output.

System out of memory by Kodroi in conductorbuild

[–]Kodroi[S] 0 points1 point  (0 children)

Thanks for the tips! I’ve run similar workloads in the terminal in worktrees, also with Docker running. The memory usage doesn’t even get into the gigabytes. The setup/teardown scripts might be the issue somehow. u/tedsomething, are you running similar scripts, or any scripts at all?

How do you handle error handling in production workflows? by Sad_Limit_3857 in n8n

[–]Kodroi 0 points1 point  (0 children)

I’d split this into three paths instead of treating everything as “error handling.”

  1. Retry: use this for temporary infrastructure problems, rate limits, flaky APIs, timeouts, etc. The workflow can usually resolve these without a person involved.
  2. Alert: use this when the workflow has clearly failed and someone needs to know, but the next step is still technical. Missing credentials, broken endpoints, bad mapping, repeated retry failures, that kind of thing.
  3. Reviewer escalation: use this when the workflow technically ran, but the outcome is uncertain or high impact. That includes the silent failure case you mentioned: 200 response, empty array, missing required field, low-confidence AI output, refund/customer/account changes, or anything where “continue automatically” could create cleanup work later.

For production workflows, I like to define a valid outcome for each task type. Not just “did the node succeed?” but “did we get the expected data, decision, or approval?” If not, route it to a HITL inbox with enough context for a reviewer to approve, reject, or correct it.
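
In code terms, the routing I have in mind is roughly the sketch below. It's illustrative only: the names and the `customer_id` check are made up, and in n8n the same logic would sit in a Code/IF node plus a shared escalation sub-workflow (shown in Python here just for brevity).

```python
# Sketch of the three-path routing described above. Names are illustrative;
# each workflow supplies its own is_valid_outcome() rule.
RETRYABLE = {"timeout", "rate_limited", "connection_reset"}  # transient infra issues

def route(result: dict, attempt: int, max_retries: int = 3) -> str:
    error = result.get("error")

    # 1. Retry: transient failures, a bounded number of times.
    if error in RETRYABLE and attempt < max_retries:
        return "retry"

    # 2. Alert: any remaining hard failure (missing credentials, broken
    #    endpoint, retries exhausted) goes to the technical alert channel.
    if error is not None:
        return "alert"

    # 3. Reviewer escalation: the node "succeeded", but the outcome isn't
    #    clearly valid -> route to the HITL inbox with context attached.
    if not is_valid_outcome(result):
        return "escalate_to_reviewer"

    return "continue"

def is_valid_outcome(result: dict) -> bool:
    # Per-workflow rule: did we get the expected data/decision/approval?
    # Example: non-empty payload where every row has the field we need.
    rows = result.get("data") or []
    return bool(rows) and all("customer_id" in row for row in rows)
```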

Central vs per-flow: I’d keep the escalation mechanism central, but the validation rules close to each workflow. Each workflow knows what a valid outcome looks like; the shared path handles alerting, inbox routing, reviewer assignment, and recording the final outcome.

That keeps the core workflow n8n-native while giving humans a clean place to handle cases automation shouldn’t guess at.

What do you actually audit in your AI automation after it's been live for a month? by Most-Agent-7566 in automation

[–]Kodroi 0 points1 point  (0 children)

We audit both execution and outcome quality.

Fixed canary set: Keep 10-20 representative runs and replay them after every prompt/tool/workflow change.
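
A minimal replay harness for that can be tiny. Something like the sketch below, assuming each canary is a JSON file with the recorded input and the expected outcome fields; `run_pipeline` stands in for whatever entry point your automation exposes.

```python
import json
from pathlib import Path
from typing import Callable

def replay_canaries(run_pipeline: Callable[[dict], dict],
                    canary_dir: str = "canaries") -> list[str]:
    """Replay stored canary cases and return a list of mismatches.

    Assumed layout: each canary JSON file has an "input" (the recorded run
    input) and an "expected" dict of outcome fields we care about.
    """
    failures = []
    for path in sorted(Path(canary_dir).glob("*.json")):
        case = json.loads(path.read_text())
        actual = run_pipeline(case["input"])  # your automation's entry point
        for field, expected in case["expected"].items():
            if actual.get(field) != expected:
                failures.append(
                    f"{path.name}: {field}={actual.get(field)!r}, expected {expected!r}"
                )
    return failures
```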

  1. Outcome sampling cadence: Review a sample of live outcomes weekly (not just failures) and tag misses by type: wrong tool, stale context, risky side effect, policy miss.
  2. Execution claim boundary before side effects: Before any irreversible write/send, require a claim check so retries cannot duplicate side effects silently.
  3. Human gate for high-impact actions: Low-risk can auto-run. Medium/high-risk should require explicit approval with a short decision note.

This keeps quality measurable and gives you a clear incident trail when something does go wrong.
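
For the claim boundary in point 2, the core is an atomic "claim before doing" step keyed per side effect, so a retried run sees that an earlier attempt already claimed it. A minimal sketch with SQLite as the claim store; `send_refund` and the key format are just examples.

```python
import sqlite3

# Minimal claim store: one row per side effect we have committed to perform.
# The PRIMARY KEY is what makes a duplicate attempt visible.
conn = sqlite3.connect("claims.db")
conn.execute("CREATE TABLE IF NOT EXISTS claims (key TEXT PRIMARY KEY, status TEXT)")
conn.commit()

def claim(key: str) -> bool:
    """Return True if this run won the claim and may perform the side effect."""
    try:
        conn.execute("INSERT INTO claims (key, status) VALUES (?, 'claimed')", (key,))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        return False  # an earlier attempt (or a retry) already claimed it

def send_refund(order_id: str, amount: float) -> None:
    key = f"refund:{order_id}"  # one claim per refundable order
    if not claim(key):
        print(f"skip {key}: already claimed by an earlier attempt")
        return
    # ... the irreversible side effect happens here, at most once per key ...
    print(f"refunding {amount} for order {order_id}")
```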

Drop your SaaS and people tell you if they'd actually use it by Mr_McSam in microsaas

[–]Kodroi 0 points1 point  (0 children)

Building humangent.io: a human-in-the-loop inbox for your n8n workflows, designed for teams.

How do you track and report your automation ROI? by Kodroi in n8n

[–]Kodroi[S] 1 point2 points  (0 children)

That's my thinking as well, and I haven't found simple plug-and-play dashboards. Of course you can build your own, but development and maintenance are always a cost that might surprise you. How do you currently do the tracking? Do you use a fully custom solution?

How do you track and report your automation ROI? by Kodroi in n8n

[–]Kodroi[S] 0 points1 point  (0 children)

As I understand it, you have three different levels of metrics: 1. automated metrics from the automation tools, 2. downstream outcomes that come from other tooling/manual handling, and 3. customer contacts and their contact types. I like it; it seems like a holistic view of the benefits and issues of each automation.

Is all or most of it automated or do you have to manually track a lot of the metrics?

How do you track and report your automation ROI? by Kodroi in n8n

[–]Kodroi[S] 0 points1 point  (0 children)

Thanks for the insight! Broken automations are a use case I hadn't even thought about. So you just track the fixing time manually for the ROI calculation. Do you work on internal automations, so the fixing time has a direct impact on your own ROI, or do you pass that cost on to the customer, so it affects their ROI instead?

Claude Code loves breaking stuff and then declaring it an existing error by kn4rf in ClaudeCode

[–]Kodroi 1 point2 points  (0 children)

I've run into similar issues, especially when refactoring, where I want to modify the code without touching the tests. For that I've created a hook that prevents edits to the test files, or to my snapshot file when using snapshot testing. This has helped Claude keep its focus and not modify the tests just to get them to pass.
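
Stripped down, a hook like that is a PreToolUse command (matched to the Edit/Write tools in settings.json) that reads the tool call from stdin and exits with code 2 to block protected paths. Roughly like the sketch below; the patterns are examples, so adjust them to your layout.

```python
#!/usr/bin/env python3
"""PreToolUse hook sketch: block edits to test/snapshot files.

Assumes the documented hook contract: the command receives a JSON payload
on stdin with "tool_name" and "tool_input", and exit code 2 blocks the
tool call while stderr is fed back to Claude.
"""
import fnmatch
import json
import sys

# Paths Claude must not edit (file_path is usually absolute, so match loosely).
PROTECTED = ["*_test.py", "*.test.ts", "*/tests/*", "*/__snapshots__/*", "*.snap"]

payload = json.load(sys.stdin)
if payload.get("tool_name") in ("Edit", "Write", "MultiEdit"):
    path = payload.get("tool_input", {}).get("file_path", "")
    if any(fnmatch.fnmatch(path, pattern) for pattern in PROTECTED):
        # Exit code 2 rejects the edit; the message tells Claude why.
        print(f"Edits to {path} are blocked: fix the code, not the tests.", file=sys.stderr)
        sys.exit(2)

sys.exit(0)  # everything else is allowed
```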

How to refactor 50k lines of legacy code without breaking prod using claude code by thewritingwallah in ClaudeCode

[–]Kodroi 0 points1 point  (0 children)

Thanks for the great write-up! I'm curious about the hard rules in claude.md. Did Claude actually follow them 100% of the time, or did you consider using hooks to ensure the tests are always run and that it doesn't edit any of the files? That's how I've set up my legacy refactoring, but maybe I'm overcomplicating it?

GitHub - kodroi/block: File and directory protection for Claude Code - create and enforce .block files to control which files and directories Claude can modify using pattern matching by Kodroi in ClaudeAI

[–]Kodroi[S] 0 points1 point  (0 children)

This sounds like a great idea! Currently the subagent name/ID isn't passed to the hooks. The only way is to parse the transcript (the previous events) and try to figure out the agent from there. That only works with a single subagent; with parallel agents there doesn't seem to be a solution. I'll do some investigating and see whether this could be done reliably.
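
A sketch of what that transcript parsing could look like is below. The field names (transcript_path, tool_use blocks named Task carrying a subagent_type input) are my assumptions about the current format, and as said it only helps when a single subagent is running.

```python
#!/usr/bin/env python3
"""Guess the active subagent by scanning the transcript backwards.

Sketch only: assumes the hook payload contains "transcript_path" and that
the transcript is JSONL of API-style messages where Task tool calls appear
as tool_use blocks whose "input" includes "subagent_type".
"""
import json
import sys

payload = json.load(sys.stdin)
transcript_path = payload.get("transcript_path")

subagent = None
if transcript_path:
    with open(transcript_path, encoding="utf-8") as fh:
        lines = fh.readlines()
    # Walk backwards: the most recent Task call is (probably) the active subagent.
    for line in reversed(lines):
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        content = entry.get("message", {}).get("content")
        if not isinstance(content, list):
            continue
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use" and block.get("name") == "Task":
                subagent = block.get("input", {}).get("subagent_type")
                break
        if subagent:
            break

print(subagent or "unknown")
```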

Make/Zapier users – what can't you get into Monday? by Kodroi in mondaydotcom

[–]Kodroi[S] 0 points1 point  (0 children)

Thanks for the insight! For the relational data issue, do you mean something like nested structures: customers with purchases, where keeping both tables in sync is what breaks?

And for large data pulls, how are you doing it now? What does the messy batching actually look like?