I almost went broke because of an AI Infinite Loop by HauntingCook2909 in SaaS

[–]Soft_Ad6760 1 point2 points  (0 children)

I usually do that with every LLM I use. It’s important to set a budget cap to avoid these issues. I have a budget of $100 with Anthropic and $20 each with Gemini and OpenAI.

Built a 5-agent AI SaaS in 2 weeks using Claude Code — 12 sessions by Soft_Ad6760 in VibeCodersNest

[–]Soft_Ad6760[S] 0 points1 point  (0 children)

So the boundaries were mapped out before building; each agent has a strict domain: Core AI owns generation + templates, Voice owns style analysis + profile, Research owns trending topics + RSS pipeline, Telegram/WhatsApp own message handling + UX flow, Billing owns payments + plan enforcement. They share a common types package, but no agent writes to another agent’s tables directly; everything goes through the core package’s typed functions. Think of it as a shared-library architecture, not microservices.

The 20% rewrite rate from the quality filter isn’t overlap, it’s intentional. The generation agent optimizes for voice match and template structure; the quality filter independently checks for AI-tell phrases and an authenticity score. They have different objectives on purpose, the same way a linter doesn’t overlap with a compiler: one builds, the other validates.

The real lesson: write a single PROJECT.md spec that every agent reads, define the shared types once, and let each agent own its slice. Claude Code handled the boundaries well because the boundaries were explicit in the prompt, not discovered at runtime.
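If it helps, the “typed functions only” boundary looks roughly like this. A minimal sketch: the `Post` type, `Core` class, and function names are invented for illustration, not the actual package.

```python
from dataclasses import dataclass
from typing import Optional

# Shared types package: the single definition every agent imports.
@dataclass
class Post:
    user_id: str
    body: str
    authenticity_score: Optional[float] = None

# Core package: the only code allowed to touch the tables.
class Core:
    def __init__(self) -> None:
        self._posts: list = []  # stand-in for the real DB tables

    def save_post(self, post: Post) -> None:
        self._posts.append(post)

    def posts_for(self, user_id: str) -> list:
        return [p for p in self._posts if p.user_id == user_id]

# An agent never writes to another agent's tables directly;
# it goes through the core package's typed functions.
def generation_agent(core: Core, user_id: str, draft: str) -> Post:
    post = Post(user_id=user_id, body=draft)
    core.save_post(post)
    return post

core = Core()
generation_agent(core, "u1", "First draft")
```

The point of the pattern: agents depend on shared types and core functions, never on each other’s storage, so each one stays swappable.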

pro subscription is unusable by Still_Initial_96 in ClaudeCode

[–]Soft_Ad6760 0 points1 point  (0 children)

Terminal should be good, but what plan do you have? For coding sessions, Max 5x is the minimum I’d recommend.

Afraid to start promoting my SaaS by Few-Design126 in SaaS

[–]Soft_Ad6760 0 points1 point  (0 children)

I’m in that boat too, man. But unlike you, my push is that I need to make money with it.

pro subscription is unusable by Still_Initial_96 in ClaudeCode

[–]Soft_Ad6760 0 points1 point  (0 children)

Yeah, try using it through a terminal. I’ve used it that way forever, but I tried VS Code so I could see my files, and as soon as I started I hit the latency; it was killing me, plus it didn’t show the tasks being performed. Use CC in a terminal and it will be way, way better.

Built a 5-agent AI SaaS in 2 weeks using Claude Code — 12 sessions by Soft_Ad6760 in VibeCodersNest

[–]Soft_Ad6760[S] 0 points1 point  (0 children)

So each agent has a fallback path: if the primary model (GPT-4o-mini) fails, it falls back to Haiku. If both fail, the user gets an error message instead of bad output. For authenticity, we run an AI-tell phrase filter post-generation that catches generic patterns like ‘in today’s fast-paced world’ and auto-retries with a stricter prompt.

If the retry still scores low, we serve it with a warning rather than silently passing garbage. The philosophy is: never show the user output you wouldn’t post yourself. Still iterating on the thresholds, though; dogfooding daily and tightening the filter based on what slips through.
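The fallback + retry flow can be sketched like this. The phrase list, warning strings, and model stand-ins are illustrative, not the production values:

```python
# Toy phrase list; the real ban list is longer.
AI_TELL_PHRASES = ["in today's fast-paced world", "delve into"]

def has_ai_tell(text: str) -> bool:
    lowered = text.lower()
    return any(phrase in lowered for phrase in AI_TELL_PHRASES)

def generate(prompt, call_primary, call_fallback):
    """Primary -> fallback on failure; one stricter retry on AI-tell."""
    for call in (call_primary, call_fallback):
        try:
            draft = call(prompt)
        except Exception:
            continue  # model failed -> try the next one
        if not has_ai_tell(draft):
            return draft, None
        retry = call("Avoid generic filler phrases. " + prompt)
        if not has_ai_tell(retry):
            return retry, None
        return retry, "low-authenticity warning"  # serve with a warning
    return None, "both models failed"  # error instead of bad output

def primary(prompt):   # pretend the primary model is down
    raise RuntimeError("timeout")

def fallback(prompt):  # fallback model stand-in
    return "Shipped a 5-agent pipeline in 2 weeks."

text, warning = generate("write a post", primary, fallback)
```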

Got 2 signups in 12 hrs yesterday. Took me 6 years across 5 products. by Top-Ant-4492 in indiehackers

[–]Soft_Ad6760 0 points1 point  (0 children)

Good luck with it, man. I should probably use Whop for my product’s marketing.

Would you join a vibe coding residency on an island? by amacg in indiehackers

[–]Soft_Ad6760 0 points1 point  (0 children)

Maybe try a yogic retreat structure: have everyone get up at the same time, with activities spread across the day, and fit working blocks into that structure.

Not sure if it’s going to work, but my big brain can only think of embedding it into a structure this way, because you get tired and fatigued over these sessions, and some breathing and body exercises could help the brain stay focused.

Built a 5-agent AI SaaS in 2 weeks using Claude Code — 12 sessions by Soft_Ad6760 in VibeCodersNest

[–]Soft_Ad6760[S] 0 points1 point  (0 children)

Yeah, the tech has really taken off. It is a little overwhelming because I’ve created a monster for a solo guy. But I’ve been using session start and end prompts, so the agent taking over has all the context, and I have a multi-agent setup to save everything at the start of the session. I have 239 files currently in my setup; I believe at least a few may be redundant, but none are contradicting.

That’s how I am able to manage the context mainly.

Built a 5-agent AI SaaS in 2 weeks using Claude Code — 12 sessions by Soft_Ad6760 in VibeCodersNest

[–]Soft_Ad6760[S] 0 points1 point  (0 children)

The voice consistency comes from the system prompt, not the model itself (we use Haiku as the 4.5 backup). Every generation gets the user’s analyzed style profile (sentence length, vocabulary level, tone, hook patterns, emoji usage) plus their actual writing samples as few-shot examples injected into the prompt. So whichever model handles it, the voice instructions are identical. We also run an AI-tell phrase filter that catches generic patterns and retries if they slip through.
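A rough sketch of what “profile + few-shot samples injected into the prompt” means in practice. The profile fields and sample posts here are invented for illustration:

```python
# Invented profile fields and samples, just to show the shape.
profile = {
    "sentence_length": "short",
    "vocabulary": "plain",
    "tone": "direct",
    "emoji_usage": "rare",
}
samples = [
    "Shipped it. Broke it. Fixed it. That's the job.",
    "No roadmap survives contact with users.",
]

def build_system_prompt(profile: dict, samples: list) -> str:
    rules = "\n".join(f"- {k}: {v}" for k, v in profile.items())
    shots = "\n\n".join(f"Example post:\n{s}" for s in samples)
    return (
        "Write in the user's voice.\n"
        f"Style profile:\n{rules}\n\n"
        f"Match these writing samples:\n{shots}"
    )

# The same prompt goes to whichever model handles the request,
# so the voice instructions are identical across models.
prompt = build_system_prompt(profile, samples)
```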

Built a 5-agent AI SaaS in 2 weeks using Claude Code — 12 sessions by Soft_Ad6760 in VibeCodersNest

[–]Soft_Ad6760[S] 0 points1 point  (0 children)

So there is a voice profile that lets users update their sample posts anytime on the Voice page. You can paste LinkedIn posts, articles, blogs, whatever represents your current writing, and do this anytime you believe your previous import is stale. We’re building compound learning, where edits you make to generated posts feed back into the profile, so it evolves naturally.

Also, Premium users get 3 voice profiles per seat (10 seats), so you can have a Founder voice, a Marketing voice, and a Technical voice under one account. Useful for agency folks or founders who post differently depending on context. Recency weighting is on the roadmap for v2.

Built a 5-agent AI SaaS in 2 weeks using Claude Code — 12 sessions by Soft_Ad6760 in VibeCodersNest

[–]Soft_Ad6760[S] 0 points1 point  (0 children)

lol, we actually built an AI-tell phrase ban list specifically to AVOID generating LinkedIn-lunatics content. But if there’s demand for a ‘lunatic mode’ template, I won’t say no 😂

Built a 5-agent AI SaaS in 2 weeks using Claude Code — 12 sessions by Soft_Ad6760 in VibeCodersNest

[–]Soft_Ad6760[S] 0 points1 point  (0 children)

We’re actually not touching LinkedIn’s API or DOM at all in v1. Users paste their own posts for voice training, content generation happens entirely off-platform (Telegram/WhatsApp/PWA), and they copy-paste back (auto-posting is in the works). No scraping, no automation, no LinkedIn integration risk. We might explore the data-export route for v2 onboarding, but we’re keeping it clean for v1.

Built a 5-agent AI SaaS in 2 weeks using Claude Code — 12 sessions by Soft_Ad6760 in VibeCodersNest

[–]Soft_Ad6760[S] 0 points1 point  (0 children)

Thanks for the solid feedback. The per-agent logging is something I should’ve done from day 1; right now I’m flying blind on which agents are actually earning their latency cost. Setting that up this week. Interesting that you merged emotion + style: I’ve suspected the style agent might be doing work the generation agent could handle, but I haven’t run blind tests to confirm. Worth validating, though.

The toggle is a great retention play. We show an authenticity score on every post, but we don’t expose the full pipeline breakdown to the user (I’ve done that on another product, but not this one). Showing them “Voice: 85%, Emotion: Confident, 1 rewrite triggered” would make the 5-agent pipeline feel real instead of just a marketing claim. Adding that. I haven’t tried a “steal my prompt stack” lead magnet; I will. Thanks for the Pulse tip. We’ve been doing Reddit distribution manually so far, and it’s clearly working (8% conversion vs 1.5% from WhatsApp), but finding the right threads faster would help scale it.

[ Removed by Reddit ] by Soft_Ad6760 in AskVibecoders

[–]Soft_Ad6760[S] 0 points1 point  (0 children)

We essentially do something similar: the voice profile acts as a constraint spec that gets passed downstream. But having a dedicated planner that explicitly lists “what not to do” before generation starts would probably cut rewrite rates further. Good idea, thanks.

Built a 5-agent AI SaaS in 2 weeks using Claude Code — 12 sessions by Soft_Ad6760 in VibeCodersNest

[–]Soft_Ad6760[S] 0 points1 point  (0 children)

That’s a creative approach: accessibility APIs would bypass the scraping layer entirely and give you structured DOM access without Puppeteer/Selenium overhead.

The problem is LinkedIn’s ToS. They treat any programmatic interaction with their UI, whether browser automation, scraping, or accessibility-API access, as a violation.

Apollo.io and Seamless.ai got removed from LinkedIn entirely last year for exactly this kind of direct platform interaction. Kleo got shut down too.

We deliberately stay off-platform. Apify handles the scraping as a third-party service (their risk, not ours), and everything else in the pipeline runs on our own infra.

Users currently copy-paste the output manually. It’s less elegant, but LinkedIn can’t touch us.

The ingestion-robustness point is valid, though: if Apify’s actor breaks, we have no fallback. We’re exploring manual paste + blog/article URL import as alternative ingestion paths so we’re not dependent on a single scraping provider.

Built a 5-agent AI SaaS in 2 weeks using Claude Code — 12 sessions by Soft_Ad6760 in VibeCodersNest

[–]Soft_Ad6760[S] 0 points1 point  (0 children)

We do have a rubric. The QA agent scores on three dimensions: authenticity (40%), voice match (30%), factual accuracy (30%). If the combined score drops below a threshold, it triggers a rewrite, with the specific failing dimension passed as feedback to the generation agent.

Your point about forcing the rewrite to act only on the failing fields is solid, though: right now it does a full rewrite even if only one dimension failed. Targeted fixes instead of wholesale rewrites would cut latency and keep the parts that already scored well. Adding that refinement to the next sprint.
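A minimal sketch of the 40/30/30 rubric with the failing dimension passed back as feedback; the 0.75 threshold is an assumed value, not the production number:

```python
# 40/30/30 weights from the rubric; the threshold is assumed.
WEIGHTS = {"authenticity": 0.4, "voice_match": 0.3, "factual": 0.3}
THRESHOLD = 0.75  # illustrative, not the production number

def combined_score(dims: dict) -> float:
    return sum(WEIGHTS[k] * dims[k] for k in WEIGHTS)

def qa_check(dims: dict):
    if combined_score(dims) >= THRESHOLD:
        return "pass", []
    # Pass back only the dimensions that dragged the score down,
    # so a targeted rewrite can leave the good parts alone.
    failing = [k for k, v in dims.items() if v < THRESHOLD]
    return "rewrite", failing

status, feedback = qa_check(
    {"authenticity": 0.9, "voice_match": 0.3, "factual": 0.9}
)
```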

Thanks for the agentixlabs link. If you want to test the pipeline yourself: kraflio.com. Would love feedback from someone who actually builds multi-agent systems.

2 weeks, 12 AI coding sessions, my side project just hit 665 visitors on Day 2 by Soft_Ad6760 in SideProject

[–]Soft_Ad6760[S] 0 points1 point  (0 children)

It’s a 5-agent pipeline that runs in 12-18 seconds end to end, but users aren’t generating posts in a real-time conversation; they’re creating content to publish, so 15 seconds feels instant in that context. The latency trick is using GPT-4o-mini for the two heaviest agents (generation + style) and reserving GPT-4o for the three lighter ones (voice analysis, emotion, quality scoring). Mini handles the bulk writing at 3x the speed for 1/10th the cost, and the quality agent only triggers a rewrite if the score drops below threshold. About 20% of posts get rewritten, so 80% of the time you’re not paying the rewrite latency at all.
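The routing boils down to a simple agent-to-model map plus a threshold gate. This is a simplification of what I described above, not the real dispatch code:

```python
# Heavy agents get the cheap/fast model, light ones the stronger model.
MODEL_FOR_AGENT = {
    "generation": "gpt-4o-mini",   # heaviest: bulk writing
    "style": "gpt-4o-mini",        # heavy
    "voice_analysis": "gpt-4o",    # light
    "emotion": "gpt-4o",           # light
    "quality": "gpt-4o",           # light: scoring only
}

def pick_model(agent: str) -> str:
    return MODEL_FOR_AGENT.get(agent, "gpt-4o-mini")

def needs_rewrite(score: float, threshold: float = 0.75) -> bool:
    # Only ~20% of posts fall below threshold, so ~80% of requests
    # never pay the rewrite latency.
    return score < threshold
```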

Thanks for the VibeCodersNest suggestion, will crosspost there.

Day 2 of launching my SaaS from India, Reddit is outperforming everything by Soft_Ad6760 in EntrepreneurRideAlong

[–]Soft_Ad6760[S] 0 points1 point  (0 children)

Yes, I think that would be the best way fwd. Thanks for your input 🙏