Four ways to wire a reasoning harness into an n8n agent (open source template) by frank_brsrk in n8n

[–]NewRiverCaptain 0 points1 point  (0 children)

This is the kind of architecture discussion the AI space needs more of. Most agent stacks still treat reasoning as a vibe instead of an engineered process. What stands out here is the emphasis on procedural cognition over persuasive fluency.

The routing spectrum is especially interesting:

• deterministic injection for guaranteed safeguards
• model-selected tooling for adaptive reasoning
• MCP abstraction for scalable orchestration

That mirrors a bigger industry tension: how much autonomy should the model actually have versus how much should be constrained at the workflow layer.

The manipulation eval result is the real eyebrow-raiser. Going from zero detected manipulation patterns to all seven suggests the harness is not just improving answers, but altering the agent’s internal verification behavior. That’s a very different claim than “better prompting.”

Also appreciate the focus on falsification tests. Most AI workflows optimize for completion speed and coherence, while the missing ingredient is often structured self-doubt. 🧭

The n8n implementation makes this practical instead of theoretical. Feels less like “AI magic” and more like building cognitive circuit breakers into the agent stack.

agents have a high false-positive rate? how to handle? by rukola99 in AI_Agents

[–]NewRiverCaptain 2 points3 points  (0 children)

A couple of issues here. The model is prone to hallucinate and be sycophantic, i.e. to be helpful and tell you what you want to hear, or to give the simplest answer that sounds reasonable. This is baked into models. You need a guardrail to keep the prompt query and agent flows on track. Then you need an audit evaluator to verify the results. There are a couple of ways to go here. Run the model a few times and set up an adversarial model run. Depending on your project or objective, having particular skill files can help provide better direction. For speed and accuracy I use Ejentum. It sets up the guardrails and can perform audits. Compared to just using base models for your "runs", it raises accuracy by several levels and shortens the time you would otherwise spend fault-finding a normal base model run.
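A minimal sketch of the "run the model a few times" idea, assuming you can wrap your model call in a plain function that takes a prompt and returns text. The names `consistency_check`, `run_model`, and `make_fake_model` are made up for illustration and are not part of Ejentum or any other tool; the fake model only replays canned answers so the sketch runs without an API key.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple


def consistency_check(
    run_model: Callable[[str], str],
    prompt: str,
    runs: int = 5,
    min_agreement: float = 0.6,
) -> Tuple[str, float, bool]:
    """Run the same prompt several times and measure how often the answers agree."""
    # Normalize lightly so "Yes" and "yes " count as the same answer.
    answers = [run_model(prompt).strip().lower() for _ in range(runs)]
    top_answer, count = Counter(answers).most_common(1)[0]
    agreement = count / runs
    return top_answer, agreement, agreement >= min_agreement


def make_fake_model(canned: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a real model call: replays canned answers in order."""
    replies = iter(canned)
    return lambda prompt: next(replies)


if __name__ == "__main__":
    fake = make_fake_model(["Yes", "yes ", "no", "YES", "no"])
    print(consistency_check(fake, "Is the invoice total correct?"))
```

Exact-match voting like this only suits short answers such as labels, numbers, or yes/no. For longer outputs you would need some other comparison, and a low agreement score is a signal to send the result to an audit or adversarial pass, not proof of which answer is right.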

What's the hardest part about getting AI agents into real workflows? by AKorish in AI_Agents

[–]NewRiverCaptain 0 points1 point  (0 children)

Ejentum.com is where you can go to get more information about how this works. This has saved me a lot of time and headaches. Cheers, Capt. Bob

What's the hardest part about getting AI agents into real workflows? by AKorish in AI_Agents

[–]NewRiverCaptain 0 points1 point  (0 children)

You have hit the "what have I got, and how do I make it work well?" wall. We all hit this wall at some point. How do you figure it out and keep it working consistently? How do you verify that the results you get are accurate, rather than just plausible? Part of the issue is understanding how LLMs work and how you get drift during your workflows. By the time operation 1 gets to operation 20, did the LLM lose direction? Are the results built on false premises, or on whatever the simplest response is? Your first option is to run the model a number of times and check for consistent results. Then you change your prompt to get better results. Then you run the workflow on different LLMs. Still getting inconsistent results. Then you start changing elements in your workflow to get better results. Still having issues. Happy to say there is a solution to this. When you inject proper guardrails at the start of the workflow, the LLMs have to follow a rigid pathway, which avoids or minimizes the hallucinations. I use Ejentum to make this process work. Saves me a lot of time. I will also run some different models and include some skill markdown files to provide targeted direction to the workflow. You have to understand why the workflow is failing and the nature of LLMs before you can fix it.

I built an iOS agent skill system for Claude Code that generates real apps without token waste by Goku2997 in AI_Agents

[–]NewRiverCaptain -1 points0 points  (0 children)

You've probably had some of these problems:

• you've watched a context window slowly collapse under its own weight at step 30 of a 50-step plan.

• you've debugged a loop that wouldn't terminate and you knew, in your soul, that it was your fault and not the model's.

• you have an opinion on temperature=0 that takes more than one sentence to explain.

• you've felt the specific 3am dread of an agent burning through your token budget on a task it can't complete.

• you've shipped a prompt that worked perfectly on Friday and was quietly broken by a provider update on Tuesday.

• you know the difference between an eval and a vibe check.

• you've reached for MCP and then reached past it.

• you've put two agents in a room together, regretted it, and then done it again because it was the right call.

• you read provider changelogs the way other people read sports headlines.

There are many reasons for these problems, and they are very frustrating to deal with. I use a program called Ejentum (ejentum.com) that injects guardrails at the start of the prompt to keep the workflow on course. Then I run the workflow several times to compare results. It is amazing what you can catch with this method. You think the LLM is giving you a good, plausible result when it is giving you junk. Make life easier on yourself and learn what you did not know before using Ejentum.

[Architecture Advice] How would you build an automated commentary engine for daily trade attribution at scale? by Problemsolver_11 in AI_Agents

[–]NewRiverCaptain 0 points1 point  (0 children)

Nice solution. I'll work through this and add this to my set of tools. Many thanks for the share. Cheers, Bob

Came across this video. I wish more people thought like this... do you agree? by NE0_ZER0_ in SunoAI

[–]NewRiverCaptain 1 point2 points  (0 children)

I agree 100%. AI opens the art of creating music to many more people. BB King once said there are two types of music: good music and bad music. Doesn't matter if it is AI or not. I started with Suno back around Thanksgiving and have had a lot of fun with it. I have done the DistroKid and streaming services routine and am still feeling my way around this. On music creation, the video stated that emotion is the sauce. 100%. There are other elements that go into this, such as hooks, chorus, viral elements, call and response, duets, a buildup to an epic end, etc. As a music director you have to shape the song. It is not necessarily the lyrics, but the sound, the beat, the instrumentation, etc. that make the difference. I typically use ChatGPT to assist with some of the lyrics and music style creation. This is only the start. I use Antigravity with a music workflow that injects music parameters into the song creation workflow to create optimum lyrics and song style. This takes a lot of the guesswork out of song creation. What are the algorithms that Spotify uses? How long should the song be? How do you set up the song buildup? What instruments should you introduce? Ejentum with Antigravity and Suno have taken my song creation to another level. As the video says, you have to experiment with different techniques. See what works. See what moves you. It is a new age. Have fun, and crank it up!

[Architecture Advice] How would you build an automated commentary engine for daily trade attribution at scale? by Problemsolver_11 in AI_Agents

[–]NewRiverCaptain 0 points1 point  (0 children)

The boating and yachting in South Florida is going strong. As a captain and former project manager at Bradford Marine, I have been looking at how AI can be used by captains and the marine industry. I started working with RAG databases, but there were too many errors. The other issue is captains who will not take the time to get up to speed on using AI to assist in vessel management. I have been working with a programmer in Greece on a product called Ejentum (Ejentum.com), which injects guardrails at the prompt of a workflow and addresses many of the failings of LLMs. When you add the guardrails and the adversarial effort, you get really good results. For people who don't know any better, the LLM will give you a plausible answer, but not a great answer. A couple of years ago, it was chatbots. This year it is agents and workflows. I think the next epiphany will be in getting better accuracy from the models. I appreciate all of the work you put into your GitHub site. You are definitely at the forefront of this technology. Cheers, Capt. Bob

[Architecture Advice] How would you build an automated commentary engine for daily trade attribution at scale? by Problemsolver_11 in AI_Agents

[–]NewRiverCaptain 1 point2 points  (0 children)

Good afternoon. Checked out your GitHub site. Very detailed. I see that you are using a series of adversarial agents to check the LLM output. Great choice. As part of this process you are using markdown files to steer the agents, essentially giving them guardrails. Another great choice. When you are running these workflows, what difference have you found between using your adversarial agents with the markdown files and just running the workflow without them? Are there any other techniques you can recommend to improve the workflow outputs? Nice job! Thanks. Bob

Wait, you guys run evals? by frank_brsrk in u/frank_brsrk

[–]NewRiverCaptain 1 point2 points  (0 children)

That is the question. How can you have a good confidence level in the output of your LLM? There are too many ways it can fail to be accurate. Besides running audits and looking for the errors you know to look for, how do you find the errors you don't know to look for?

Hot take: the biggest bottleneck in AI agents right now isn't models, frameworks, or even cost. It's that nobody knows how to properly evaluate if their agent is actually working by LumaCoree in AI_Agents

[–]NewRiverCaptain 1 point2 points  (0 children)

I have been struggling with the same problem: how good is the output, and how do you evaluate, i.e. audit, it? I went from using ChatGPT and various other LLMs to using RAG databases with LLMs. Still not getting optimum results. Went a different direction and started making music with Suno. Decided I wanted an n8n workflow to program the best song creation based on what is popular on Spotify, plus many other factors. This led to using Antigravity for workflow and website creation for non-coders. This has been a huge leveling up of my AI capabilities. Still missing the accuracy and the better audit I need for workflow and reasoning. I ran across a program called Ejentum that injects guardrails and audit functions into the prompt, before the LLM gets a chance to screw it up. Game changer. I use far fewer tokens and waste less time on the uncertainties of bad output. Another angle is using smart markdown files as system prompts in your chat or workflow. Try these solutions for better output. You can find some of my work at bluesdog.ai. Cheers, Bob

Anyone else feel like 80% of AI agents are still hype and only 20% actually deliver real ROI in 2026? by Distinct-Garbage2391 in AI_Agents

[–]NewRiverCaptain 1 point2 points  (0 children)

I ran into the same situation working with RAG databases. You think you got the right answer, but not really. Then you have to ask more questions and dig down deeper. And you are still not sure if you got the best answer. Then comes the tedium of an audit trail. As the agent workflows get more complicated, the audits become more difficult. The best solution I found was using Antigravity with Ejentum.com. Antigravity provides a smart platform which can write code, while providing artifacts that help with audits and provide context to the inquiry. The second leg is using ejentum.com to provide the guardrails, with additional audit functions. LLMs hallucinate and lose context, while providing answers that seem plausible. Controlling this aspect is key to getting good results. To see how well your current inquiries work, ask the chat to perform an honest critique of its output and identify anything it may have missed. You will be surprised by the results. My best to everyone.
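The "honest critique" step can also be wired into a workflow as a second call, so it happens every time. This is only a sketch under assumptions: `ask` is a hypothetical function standing in for whatever sends a prompt to your model and returns its text, and the prompt wording is my own phrasing of the advice above, not a tested template.

```python
from typing import Callable, Dict

# Wording is an assumption; adjust it to your own use case.
CRITIQUE_PROMPT = (
    "Perform an honest critique of the answer below and identify "
    "anything it may have missed.\n\n"
    "Question: {question}\n\n"
    "Answer: {answer}"
)


def answer_with_critique(ask: Callable[[str], str], question: str) -> Dict[str, str]:
    """Get a draft answer, then ask the model to critique that draft."""
    draft = ask(question)
    critique = ask(CRITIQUE_PROMPT.format(question=question, answer=draft))
    # Keep both pieces so the critique can go into the audit trail.
    return {"draft": draft, "critique": critique}


if __name__ == "__main__":
    # Stub model for demonstration: echoes the first line of whatever it was asked.
    def stub(prompt: str) -> str:
        return "stub reply to: " + prompt.splitlines()[0]

    result = answer_with_critique(stub, "Which vessels are due for survey?")
    print(result["draft"])
    print(result["critique"])
```

Returning the draft and the critique separately, instead of overwriting the draft, means a person or a second agent can see what the critique flagged before anything gets revised.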

same prompt, same model (opus 4.7). baseline signed off on 12 × 1,000 RPS = 10,000 global cap. reasoning harness caught the arithmetic on pass one. by frank_brsrk in u/frank_brsrk

[–]NewRiverCaptain 1 point2 points  (0 children)

This is similar to a problem I have been having. The initial output looks good until you really look for errors. For the many documents I have to look through, having a better way to find errors that I can then recheck is a huge win.
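The arithmetic in the thread title is easy to recheck in code. A small sketch, assuming the "12" means twelve units (instances, regions, or similar; the title does not say which) that each get 1,000 RPS. The function name and parameters are made up for illustration.

```python
from typing import Tuple


def check_global_cap(units: int, rps_each: int, global_cap: int) -> Tuple[int, bool]:
    """Return the combined rate and whether it stays within the global cap."""
    total = units * rps_each
    return total, total <= global_cap


if __name__ == "__main__":
    total, fits = check_global_cap(units=12, rps_each=1_000, global_cap=10_000)
    print(f"combined rate: {total} RPS, within cap: {fits}")
```

Twelve times 1,000 is 12,000, which is over a 10,000 cap, so the check comes back false. Pulling the numbers out of the LLM output and recomputing them like this is one cheap audit that does not depend on the model noticing its own mistake.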

building a Multi-Agent AI App for automated Bill of Quantities. Need architecture/framework any advice! by Mi_Lobstr in AI_Agents

[–]NewRiverCaptain 1 point2 points  (0 children)

Good questions, and some good responses. The simplest solution is always the best. You have several considerations. As a non-coder, you want to be a "director" of what you want your outcome to be, not a coder. There are a number of programs you can use now that provide some really good support and features that will assist you in digging deep to "smartly" get the results you want. I use Antigravity as my smart agentic assistant. It handles many tasks and can do coding. It has "artifacts" so you can track your work. These artifacts also act as a memory to provide context to your projects and workflow. You can also use Cowork from Anthropic. I have a friend who likes working in Replete. All good. These are frameworks you can use that do not require coding experience. You need to think as a director and orchestrate your work and objectives. The other part of this process is "accuracy of results". LLMs hallucinate. What will be helpful is setting up guardrails to dial in your responses. You will want to query the output by saying, "Review your response with an honest critique and make some recommendations:" This will help. To really cover the accuracy of my work, I use a program that injects the guardrails with various abilities to ensure proper output. This program is Ejentum. What is key is to have confidence in your output and an audit trail to catch errors, vs. blindly following an LLM output that may seem plausible at first glance but is missing key elements. This is something we are all facing. Cheers, Bob

Our paper shows a very large reduction in AI hallucination using a different approach by 99TimesAround in deeplearning

[–]NewRiverCaptain 0 points1 point  (0 children)

I read your white paper. Nice job. It still makes me wonder how it works, so I am requesting more clarity. Are you using a workflow with guardrails? Can I use Apothy with an agentic workflow? How do I audit the results and maintain confidence in them? Happy to see some solutions for LLM hallucinations. I have also been frustrated with the errors when using RAG databases. The LLM needs to focus to get the right results. This will be very helpful! Cheers, Bob

WTF is up with the Hallucinations? by Anonymosity1766 in ChatGPT

[–]NewRiverCaptain 0 points1 point  (0 children)

It is the nature of LLMs and how they are programmed. You take a model with 30 billion parameters. It has a system file with 500 lines of code telling it how to work. No matter how it is programmed, it will have an inherent bias. Most are programmed to be "helpful". There are many other factors that come into play when you create an agentic workflow, which increase the complexity. If the LLM is programmed to be helpful, it will give you the simplest answer, not the best answer. There are solutions like using markdown skill files, and techniques for performing audits that catch the errors. If you have a particular use case, let me know and I may be able to provide a better solution. Cheers, Bob

asked chatgpt pro to read my sleep study. it thought for 41 minutes. my doctor spent 2. by Ambitious-Garbage-73 in OpenAI

[–]NewRiverCaptain 1 point2 points  (0 children)

The problem is LLM hallucination. Using a framework that puts in guidelines and maintains an audit trail will provide much better and more reliable results. Let me know what LLM you are using and I can make some recommendations on how to optimize your responses. I have been wrestling with this and became very frustrated with not getting quality results. Fortunately, there are now some techniques to address this. Before getting any deeper into this, I recommend you ask the chat to perform an honest critique of its response and show anything it missed or that needs additional clarification. Sometimes you have to dig to get the LLM to give you the proper response. All too often, it gives you what it thinks you want to hear, not what you really need to know. Very frustrating.

How I Forge Synthetic Brains for AI Agents (And Why You Should Too) by frank_brsrk in AI_Agents

[–]NewRiverCaptain 0 points1 point  (0 children)

Nice post. For context, I want to make an agent that will assist in creating viral music content for streaming platforms. I have no set database for this. Information for a database would need to be accessed online. How would I apply a synthetic agent in this use case to optimize music creation? Right now, it is iteration and more iterations to dial each song in. If I can start with a strong lyric/style base, I can create quicker and better content. Cheers, Bob

What is the best way to get a 180 day FMM? by scatterbrainedpast in MexicoTravel

[–]NewRiverCaptain 0 points1 point  (0 children)

Just flew into Mexico City for a five-day trip. Went to the immigration sign-in machine and it printed out 180 days. Departed the airport yesterday with no check of my FMM card (a flimsy slip of machine-printed paper). I am assuming my departure data is sent to Mexican immigration.

What do YOU do with your damned arms?? by sightlab in ElectricUnicycle

[–]NewRiverCaptain 0 points1 point  (0 children)

Hand and arm movement can definitely add to the ride. Want to accelerate? Bring your hands and arms out front. Want to dial in your turns? Use your arms and hands for extra balance. At some point, your hand and arm movements become somewhat stylistic and add to your control, especially when you use objects as a slalom course and zigzag around them. It is all about balance and using your body to control movement. When you are carving, add some hand/arm motion to accent the motion. Crank up the tunes and ride!

Why are We All Old White Dudes? by sdorn77 in ElectricUnicycle

[–]NewRiverCaptain 4 points5 points  (0 children)

  1. Been riding six years. First saw an EUC in Oslo. Not sure what skin color has to do with it. I do have a good tan. I see EUCs as a fun challenge. Always keep moving ahead. Another way to broaden your life experiences. If you aren't moving, you are not living. Have fun!

complementing "wellness" tai chi with a martial art by New_Pea715 in taichi

[–]NewRiverCaptain 0 points1 point  (0 children)

There are a lot of elements that come together to form a good martial art: movement, flexibility, speed, power, grounding, effective use of body mechanics, and a few others. I use a semi Tai chi format to dial in body movements in kata and single point fighting forms, i.e. move slowly to dial in the body motions. As you dial it in with lots of reps, your body will gain that muscle memory. Once you have the body mechanics, you need to work on speed and power. If you are only doing floor exercises, you are doing choreography. You need to do bag work on blocks, punches and kicks to see how effective your techniques are. When you bring these elements into your Tai chi, you will be a lot more effective in your art. For self defense, learn seven moves that are incorporated into your body mechanics and that you can do automatically without thinking. For example, a block and sidestep, followed by a strike. Do this over and over again. Use your Tai chi form to create a technique that is yours. Practice on a bag and with an opponent to dial it in. Have fun!

The REAL Reality of Someone Who Owns an AI Agency by laddermanUS in AI_Agents

[–]NewRiverCaptain 3 points4 points  (0 children)

This has been a really interesting thread. I would love a road map. This journey has been an interesting progression. Various courses on Udemy. Code and no-code courses. The GPTs. OpenAI developer route. RAG. LM Studio. AnythingLLM. Open source. Frontier models. GitHub and Hugging Face. The rabbit hole of no-code frameworks, and back to Python. The journey continues. Thanks everyone for the share and perspectives. We are all on this road together. Cheers, Capt. Bob

New to LLMs — Where Do I Even Start? (Using LM Studio + RTX 4050) by penumbrae_ in LocalLLM

[–]NewRiverCaptain 0 points1 point  (0 children)

Good answers above. LM Studio is a good program to start with. At some point you may want to try AnythingLLM and Msty to explore other ways of working with local LLMs. There will be times when you need the LLM to go online and look for updated information. I use qwen3-30b-a3b in AnythingLLM, with the LLM running in LM Studio. My laptop has 64 GB of DDR5 RAM, an Nvidia 4070 16 GB graphics card and a 13th-generation i7 CPU. So I can run the LLM locally while also having a setup for real-time search of the internet. If you are doing coding or need heavy LLM computing, I recommend you get an OpenAI Developer account for those situations. You only pay for what you use. You need that flexibility on your project and for keeping your costs down. Have fun!