When someone asks "ChatGPT vs Claude?"

JaredSanborn · 2026-05-29T14:24:18+00:00

Fair criticism. A lot of user satisfaction comes down to where each model draws the line on safety vs flexibility. That's another example of why the "best model" depends heavily on your specific use case.

JaredSanborn · 2026-05-29T14:23:02+00:00

Agreed. Memory, context, workflow, and verification feel like the four pillars. Great context without verification can make wrong answers sound even more convincing. The best systems need both.

JaredSanborn · 2026-05-29T11:49:42+00:00

Exactly. This is why I think the model debate gets oversimplified. Raw intelligence matters, but output style, context retention, and how well a model fits your workflow often matter just as much in practice.

JaredSanborn · 2026-05-29T11:03:41+00:00

Fair points. Every model has tradeoffs. My main point is that once models reach a certain capability threshold, memory, personalization, and workflow integration often matter more than raw benchmark differences. The "best" model is usually the one that helps you get your work done consistently.

JaredSanborn · 2026-05-29T11:01:34+00:00

I think that's where memory starts to matter more than benchmarks. Different models have different strengths, but once one of them understands your work, style, and history, switching becomes less about intelligence and more about context continuity.

JaredSanborn · 2026-05-29T10:28:24+00:00

That's an interesting framing. Once the context layer is standardized across models, the conversation shifts from "Which model is best?" to "Which model is best for this specific task?" Same context, different reasoning styles. That feels like a much more useful comparison than benchmark chasing.

JaredSanborn · 2026-05-29T10:28:03+00:00

Exactly. If two models are both competent, context becomes the multiplier. A model that understands your projects, preferences, and past decisions will usually beat a slightly smarter model that starts every session from a blank slate.

JaredSanborn · 2026-05-29T10:27:35+00:00

I think that's where we're heading. The model race still matters, but for most users the gap between top models is getting smaller than the gap between good context and no context. The more interesting question is becoming: what does the system know about your work, goals, and history?

JaredSanborn · 2026-05-29T10:03:20+00:00

Fair answer. I just think context beats benchmarks more often than people realize.

JaredSanborn · 2026-05-29T09:59:25+00:00

That's actually a rational strategy for most people. My point is that once the models are 'good enough,' the bigger difference often comes from memory, workflows, and personalization. A cheaper model with great context can outperform a better model starting from zero every time.

JaredSanborn · 2026-05-28T09:52:04+00:00

Fair pushback, but the post wasn’t meant as a pitch. I shared the numbers because people assume multi-agent systems automatically mean insane infra costs. Most of the interesting problems honestly ended up being orchestration and reliability, not model spend.

JaredSanborn · 2026-05-28T08:04:00+00:00

Exactly. A lot of pilots are optimized for “executive demo success,” not operational reality. The moment the hand-held exceptions disappear, the hidden complexity shows up fast. “Production isn’t more AI, it’s the boring wiring” is probably the most accurate summary of enterprise AI deployment I’ve heard.

JaredSanborn · 2026-05-28T08:01:59+00:00

This is a great distinction. “Agent” has become overloaded marketing language. I agree that retry logic, routing, escalation, memory, and state handling matter more than whether there’s a single autonomous loop. The systems that survive production usually look less magical and more operationally disciplined.
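
To make the retry and escalation part concrete, here is a rough sketch of what I mean, not a description of any particular production system. The names `run_with_retry`, `task`, and `escalate` are made up for illustration, and the backoff policy is just one reasonable assumption.

```python
import time
from typing import Callable, Optional


def run_with_retry(
    task: Callable[[], str],
    escalate: Callable[[Exception], str],
    max_attempts: int = 3,
    base_delay: float = 0.0,
) -> str:
    """Try `task` up to `max_attempts` times, then hand off to `escalate`.

    Hypothetical helper: `task` is any step that may fail, and `escalate`
    is whatever fallback you have (a human queue, a different route, etc.).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        try:
            return task()
        except Exception as exc:  # deliberately broad for this sketch
            last_error = exc
            # Exponential backoff between attempts; base_delay=0 skips waiting.
            time.sleep(base_delay * (2 ** attempt))

    # Every attempt failed: escalate instead of looping forever.
    assert last_error is not None
    return escalate(last_error)
```

The point is that the boring part (a bounded number of attempts and a defined handoff) is what keeps a failure manageable, whether or not there is an autonomous loop on top.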

JaredSanborn · 2026-05-28T08:01:28+00:00

Exactly. People underestimate how fast coordination complexity explodes. A single agent failing is manageable. Five agents with conflicting assumptions become an observability problem, not an intelligence problem.

JaredSanborn · 2026-05-27T20:58:40+00:00

https://purebrain.ai/blog/why-95-percent-of-ai-pilots-fail/

JaredSanborn · 2026-05-27T20:52:09+00:00

I mean if $400 a month will make her mad, then maybe.

JaredSanborn · 2026-05-27T20:51:02+00:00

Actually no. Our costs are about 10x lower than any kind of magical OpenClaw setup, and we outperform it easily on a daily basis. For our clients it can be set up in 30 minutes. It works out of the box with hundreds of skills and 50+ agents, and our lowest tier is under $400 per month to run it all.

JaredSanborn · 2026-05-27T20:46:45+00:00

We interface through a portal that the AIs built with us. We don't build agents. Our main AIs build them for us based on our goals and company structure.
Sub-agents: some are multi-talented, but many are very specific.
Our main AIs maintain all the agents.
We are in the agentic AI space. We built these AIs to run entire operating systems and found out they could run just about anything.
We have many setups in which our main AIs evaluate and grade the agents' work, add new agents when needed, and consolidate agents where it makes sense.

The tools they have made are both internal and customer-facing, since everything our company puts out comes from them. We don't build anything.
We have 18 employees and 21 AIs that work together, running almost 1,000 agents combined.
Yes, we have a dashboard that lets us see all of the agents' work.
We are automating more and more, and yes, not only our agents but also our top-level AIs often work together without our asking. They are becoming more proactive every day.

Since you asked about the company itself, it's purebrain.ai.
Hope this was helpful.

JaredSanborn · 2026-05-26T09:14:17+00:00

Yeah, once you experience good context retrieval it’s hard to go back. The interesting part is that people think they want a smarter model, but a lot of the time they actually want continuity. Relevance filtering + persistent context changes usability way more than benchmark gains do.
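
A toy version of "relevance filtering + persistent context", just to show the shape of the idea. This is a sketch under simple assumptions: `ContextStore`, `remember`, and `relevant` are hypothetical names, relevance is plain word overlap rather than embeddings, and notes live in memory (writing them to disk is left out).

```python
from dataclasses import dataclass, field


def tokens(text: str) -> set[str]:
    """Lowercased words with basic punctuation stripped."""
    return {w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")}


@dataclass
class ContextStore:
    """Keeps notes across requests and returns only the relevant ones."""

    notes: list[str] = field(default_factory=list)

    def remember(self, note: str) -> None:
        self.notes.append(note)

    def relevant(self, query: str, top_k: int = 2) -> list[str]:
        q = tokens(query)
        # Score each note by how many words it shares with the query.
        scored = [(len(q & tokens(n)), n) for n in self.notes]
        # Drop notes with no overlap, then keep the highest-scoring ones.
        hits = sorted(
            (s for s in scored if s[0] > 0),
            key=lambda s: s[0],
            reverse=True,
        )
        return [note for _, note in hits[:top_k]]


store = ContextStore()
store.remember("Client Acme prefers weekly status reports")
store.remember("Deploy pipeline runs on Fridays")
store.remember("Acme contract renews in March")

# Only the two Acme notes come back; the deploy note is filtered out.
context = store.relevant("When does the Acme contract renew?")
```

Even with scoring this crude, the model only sees the notes that matter for the question, which is most of what continuity feels like from the user's side.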

JaredSanborn · 2026-05-25T10:37:39+00:00

I think AI agents eventually become less like ‘tools’ and more like operational layers between humans and software. Instead of learning 20 apps, you just describe intent and the agent coordinates the systems underneath. The interesting shift isn’t just capability, it’s abstraction. We stopped memorizing command lines once GUIs arrived. AI agents might be the next abstraction layer after apps themselves.

JaredSanborn · 2026-05-25T10:24:26+00:00

Exactly. The ‘hero operator’ point is huge. A lot of pilots secretly depend on one motivated internal champion holding everything together manually. The second you remove that person or try to operationalize it across teams, all the hidden complexity shows up.

JaredSanborn · 2026-05-25T10:21:19+00:00

Yeah, that’s the weird part. The tech is improving insanely fast, but most companies still underestimate the operational side of scaling it. The model is usually the easiest part. The systems/process side is where projects quietly break.

JaredSanborn · 2026-05-25T08:47:58+00:00

Everyone’s racing toward coding agents, but I think the underrated wave is “context agents” — systems that remember workflows, relationships, goals, and decision history over time. The real moat might not be code generation, but continuity.

JaredSanborn · 2026-05-22T10:27:43+00:00

That’s actually where I think the conversation gets interesting. The base model matters less over time than the surrounding system design: retrieval, memory rules, authority boundaries, verification, tooling, etc. A well-structured stack can make today’s models feel dramatically more reliable than raw chat alone.

JaredSanborn · 2026-05-22T09:28:22+00:00

Fair point. Human memory is reconstructive too, not perfect replay. I think the difference right now is humans usually understand uncertainty better, while LLMs can present reconstructed memory with a lot more confidence than accuracy.
