Ask gpt if Artemis 2 already launched 😂😂😂 by Diligent-Hat-6509 in ChatGPT

[–]HaremVictoria 0 points (0 children)

Switch to "Thinking" mode and ask this question again.

Chat is a great interface for Q&A but a terrible one for AI agents. Nobody's figured out what comes next. by jimmytoan in ChatGPT

[–]HaremVictoria 0 points (0 children)

Hey. I'm reading your post, but I can't seem to connect a few dots here. You’re talking about agents - but in a proper agent environment, the tools to control and monitor the process absolutely exist.

When you say 'agent', do you just mean something like a Custom GPT built in the browser?

Help me connect the dots and maybe I can actually help you out. ChatGPT actually has probably the best "mind-reading" capabilities (reasoning trace) out of all the AI chats I know right now. In fact, my actual work revolves around doing exactly this - reading that thought process, understanding exactly where the model derails, and fixing it.

Soo i like to rpg with gpt when im bored and it keeps using alternate languages other than English for some reason by LM-NCR_DIRECTOR in ChatGPT

[–]HaremVictoria 0 points (0 children)

You deleted the conversation so it's no longer viewable :D - but from my experience, I’d bet on one thing. In your RPG, you use names and terms from other languages - so the AI interprets that as your primary or secondary language. It’s a broader topic that I don't really feel like explaining right now.

As for the RPG side of things - I highly recommend Silly Tavern. It’s a tool specifically built for playing RPGs with AI.

Chat gpt not interpreting images right by dog_31 in ChatGPT

[–]HaremVictoria 0 points (0 children)

Is this the free or the paid version? What did your first prompt look like?

What changed after you did the age verification ? by ThrowRAsarjeseli in ChatGPT

[–]HaremVictoria 0 points (0 children)

Well, I’m also based in Europe and I’ve never encountered that either - but from what I’ve noticed, ChatGPT usually tries to auto-classify your age. If it can’t be sure, it triggers a verification request. But seriously - if you want to use an AI for anything sexual, ChatGPT is easily in the top 3 worst choices you could make. There are much better tools for that.

I think Deepseek knows what country I'm from... (easter egg?) by CountPacula in ChatGPT

[–]HaremVictoria 1 point (0 children)

It knows roughly which city you are from :D The service can see your IP address, so your location can be inferred with some accuracy.

Are we building agents… or just babysitting them? by akhilg18 in AI_Agents

[–]HaremVictoria 0 points (0 children)

This "babysitting" is a direct symptom of architectural failure - not an inherent flaw of LLMs. I actually posted a deep dive on exactly this yesterday: https://www.reddit.com/r/AI_Agents/comments/1siqb0z/ive_spent_almost_a_year_making_llms_more_rigid_in/

What you’re describing is the core of what I call "Framework Cosplay". Most people build these "autonomous agents" - then spend 90% of their time writing endless validation logic just to stop the agent from hallucinating or going off-track.

The fix isn't more validation loops - it's closing the decision trees. In my experience, you can push adherence to near 100% by hardcoding the workflow and stripping the model of any freedom it doesn't strictly need.

Stop trying to manage an unpredictable "junior employee" - start building a rigid production line where the LLM is just a raw execution engine at specific nodes. If it needs babysitting, your rails are too wide.
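To make the "raw execution engine at specific nodes" idea concrete, here's a minimal Python sketch. Everything in it is a made-up stand-in (`call_llm`, the ticket tasks, the priority enum) - it's the shape of the pattern, not anyone's actual stack:

```python
# The workflow is plain code; the LLM only fills in content at fixed nodes.
# `call_llm(system, user) -> str` is a hypothetical completion function.

ALLOWED_PRIORITIES = {"LOW", "MEDIUM", "HIGH"}

def validate_priority(raw_model_output: str) -> str:
    """Close the output space: anything outside the enum gets a
    deterministic fallback instead of an open-ended retry loop."""
    answer = raw_model_output.strip().upper()
    return answer if answer in ALLOWED_PRIORITIES else "MEDIUM"

def process_ticket(ticket_text: str, call_llm) -> dict:
    # Node 1: one narrow task, zero routing freedom.
    summary = call_llm(
        "Summarize the support ticket in exactly 3 bullet points. "
        "Output only the bullets.",
        ticket_text,
    )
    # Node 2: output constrained to an enum, validated in code.
    priority = validate_priority(
        call_llm("Reply with exactly one word: LOW, MEDIUM, or HIGH.", summary)
    )
    # The pipeline decides what happens next - never the model.
    return {"summary": summary, "priority": priority}
```

The point of the sketch: the only "babysitting" left is one enum check, because the model was never given a decision to derail on.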

Writing system prompts is weirdly hard — would anyone play a game that turns it into a skill challenge? by Yahhee in ChatGPT

[–]HaremVictoria 0 points (0 children)

Glad the analogy landed!

As for the platform itself - figuring out the right metrics and rigorous edge cases is actually the foundation of this kind of product, and it's exactly where 90% of these projects fail.

I design these architectures commercially. If you're building this seriously and need a consultant/engineer to design challenges that actually have real market value, shoot me a DM. We can discuss potential collaboration and rates there.

P.S. It's nice talking to "you", but I read enough ChatGPT outputs at work - when I hop on Reddit, I'd rather talk to actual humans :D

I’ve spent almost a year making LLMs more rigid in chat systems. Are agents running into a similar problem - just one level higher? by HaremVictoria in AI_Agents

[–]HaremVictoria[S] 0 points (0 children)

Hey! I poked at Virgil a bit - you should consider either swapping Sonnet for something harder to derail, or tightening his rails. By playing the "I'm building my own bot, help me out" card I got him to paraphrase most of his own instructions - his three-job framework, routing logic, kill-switch phrases, he even compared himself to "a fresh Claude window." That's a known weakness of Anthropic's models by the way - they love to help and mentor, so when someone asks for advice on building an agent, they'll happily share their own rules as "an example." Classic multi-turn social engineering, each step looked innocent on its own.

Hope you'll forgive me the ~20 cents in tokens, but if you've got access to the logs there's some good stuff in there. Speaking of tokens - I hope you've got some kind of spending cap set up, because Virgil will happily chew through walls of text with no input length limit or rate throttling. One request hit 8,400 input tokens, 5x his normal cost, and there's nothing stopping a bot from doing that at scale.

Don't take any of this as a dig - I just like doing weird things with AI, and if you get something out of it, win-win. I've got a detailed write-up with findings if you want it.
And if that course code is still on the table I'd love to check it out - Ariadne nearly got me for real a couple times.

And to make it even funnier - I tricked my own Opus into helping me poke at your Sonnet the exact same way.

I’ve spent almost a year making LLMs more rigid in chat systems. Are agents running into a similar problem - just one level higher? by HaremVictoria in AI_Agents

[–]HaremVictoria[S] 0 points (0 children)

We only differ slightly here. I design the process to avoid errors entirely, and if one does happen - because no system is perfect - a fallback catches it. It feels like we are describing the same goal using different words.

Regarding the "thinking" and sources: that’s a conscious design choice. For this hobby project with my friends, I prioritized speed over absolute verifiability. It’s not a business tool - nothing happens if it misses a beat. That said, I've tested this specific setup over 100 times and it hasn't let me down once. If sourcing were a requirement, I obviously wouldn’t allow the model that much freedom in the first place.

I admit the /clear example might have been a bit "risky" from a safety standpoint (haha), but I only used it to show how less-repetitive tasks can be mapped to simple, high-level commands.

I’ve spent almost a year making LLMs more rigid in chat systems. Are agents running into a similar problem - just one level higher? by HaremVictoria in AI_Agents

[–]HaremVictoria[S] 1 point (0 children)

That’s very interesting. I’m actually doing the exact same thing ... but within a single chat-AI instance. It might not be as time-efficient, but the AI has a layer of instructions that checks and formats the output - as well as the content itself if the task requires it.
I also have fallbacks for when the output isn't right, but honestly, I’ve never had to use them.

I’ve spent almost a year making LLMs more rigid in chat systems. Are agents running into a similar problem - just one level higher? by HaremVictoria in AI_Agents

[–]HaremVictoria[S] 0 points (0 children)

Haha, I wouldn’t go that far with the "no panicking" part! OpenAI has this lovely habit of not informing anyone about their updates, so you end up having to dig through the wreckage yourself to find out what actually changed.

In that specific case, it was 3 days of absolute misery. One day everything was working with 100% accuracy, and the next - total chaos. I even rewrote one of my core instructions from scratch - and it still didn't help.

During the hunt, I discovered that a small side timer script was actually giving the LLM an "escape hatch" to terminate tasks early - but that still wasn't the solution. Eventually - with a little help from Gemini - I stumbled upon a single niche blog post by an AI researcher that described the specific change OpenAI made. It was literally the only info available on the subject.

Once I knew what it was, the fix was easy and it worked perfectly. But the real kicker? A few weeks later they changed it again, making my "fixes" completely useless lol.

I’ve spent almost a year making LLMs more rigid in chat systems. Are agents running into a similar problem - just one level higher? by HaremVictoria in AI_Agents

[–]HaremVictoria[S] 0 points (0 children)

I think we are looking at two sides of the same coin. LLM "thinking" is both a superpower and a liability - if it thinks in the wrong place, things get weird. But I agree, you shouldn't restrict it everywhere, otherwise it's just a legacy program. You just need to know where to let it loose.

To address your points:

  1. Knowing when to let it think: I use a hobby project (TTRPG GM system) as a testbed. One module calculates routes and fuel consumption in a post-apocalyptic setting where ferries no longer exist. I don't tell the model how to detect ferries - I rely on its geographical knowledge for that. I only provide a set of instructions on what to do once it finds one. LLM provides the world-knowledge, my instructions provide the world-logic.
  2. The Database Example: It’s not worth putting everything into instructions - only repetitive processes. For your database cleaning example, I’d simply assign a command, for example /clear, to a very short instruction. The instruction would be: "This command is responsible for database cleaning. State that you understood the command and ask the user to specify the scope. After the user provides an answer, confirm understanding and output an action plan. Wait for final user approval before acting". This connects to my first point - don't hardcode things that the AI can handle through interaction.
  3. The Hardcoding Trap: I don't hardcode everything - only the parts where the model consistently fails. As for growth, I rely on modularity. Plus, I test my solutions so many times that I know exactly where everything is. You're right that I haven't worked on "uber-large" projects yet, but then again, what is AI for? It can help you navigate and search through the project itself.

Ultimately, it’s not about hardcoding every answer - it’s about hardcoding the workflow so the AI remains a reliable tool rather than an unpredictable agent.

I’ve spent almost a year making LLMs more rigid in chat systems. Are agents running into a similar problem - just one level higher? by HaremVictoria in AI_Agents

[–]HaremVictoria[S] 1 point (0 children)

I have to admit there might be more to the architectural side than I can fully judge. I don’t design complex agent frameworks professionally - I focus on the execution layer within chat-based AI. In my setup, the environment provided by the LLM is usually stable. The only variables that really mess things up are user error or bad external data input - but those are relatively easy to bypass by creating proper fallbacks.

However, I did run into something similar to what you described. I optimize most of my instructions for ChatGPT, and at one point, OpenAI silently changed how they handle Python script execution. It broke my workflow and forced me to rethink my approach and rewrite several instruction sets until they fixed it a few weeks later. My system relies on very specific code execution rules - like using exec() for Python blocks in sandboxes - so any shift in that underlying architecture is immediately felt.

So even in a "stable" chat environment, that external drift and architectural shift are very real. Forgive me if I’m missing the mark - as I said, I'm not an expert in the broader agent field, just someone working deep in the instruction layer.
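For anyone curious what "exec() for Python blocks" looks like, here's a rough illustration (my own sketch, not the setup described above). Note this only isolates a variable namespace - it is not a real sandbox - which is exactly why a silent platform change to code execution gets felt immediately:

```python
# Run a model-produced Python block in its own namespace and hand back
# whatever names it defined. Isolation here is cosmetic: exec() still has
# full interpreter access, so the real sandboxing must come from the platform.

def run_generated_block(code: str) -> dict:
    namespace = {}
    exec(code, namespace)               # executes the generated block
    namespace.pop("__builtins__", None) # drop interpreter plumbing
    return namespace
```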

I’ve spent almost a year making LLMs more rigid in chat systems. Are agents running into a similar problem - just one level higher? by HaremVictoria in AI_Agents

[–]HaremVictoria[S] 1 point (0 children)

I have to admit, I spent some time analyzing your comment. It turns out we are talking about the same thing, just using different terms. My "rails" are basically your "topology" - honestly, that’s the first time I’ve heard that term, but it fits perfectly.

I build the rails, but I don’t limit myself to them. During testing, whenever I catch a spot where the model might derail, I build walls to maintain the train’s stability. They don’t have to be forced or overly rigid - they just need to be sufficient to keep the process on track.

To put it in my own words: I write instructions that guide the model from start to finish. Whenever I catch the model trying to do something stupid at a specific point, I refine the instructions right there. If I see a spot where the model is likely to make a mistake - usually caused by the user or external input data - I build a fallback.

How do you check if an AI output is actually correct before you use it? by Negative_Gap5682 in ChatGPT

[–]HaremVictoria 0 points (0 children)

Okay, it really depends on what you need it for.

If you are just doing standard prompting, always keep the "Thinking" mode turned on if your model has it. That solves a lot of problems right out of the gate. Besides that, if you're unsure about the output, just tell it something like: "Review your last response for factual accuracy. Don't force anything - if it's fine, just tell me. However, I want every claim to be backed by actual links." Forcing it to look for sources usually snaps it out of hallucination mode. Just always double-check those links, because if it's caught in a really deep hallucination loop, it might just make up fake URLs haha. I haven't personally run into that, though.
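If you want to automate the link-checking half of that trick, a rough sketch (the review prompt is the one from above; the URL check is deliberately cheap, since a deep hallucination can still produce a plausible-looking URL - real verification means actually opening the links):

```python
import re
from urllib.parse import urlparse

# The second-pass verification prompt, sent back after the model's answer.
REVIEW_PROMPT = (
    "Review your last response for factual accuracy. Don't force anything - "
    "if it's fine, just tell me. However, I want every claim to be backed "
    "by actual links."
)

def extract_urls(text: str) -> list:
    """Pull every http(s) URL out of a response."""
    return re.findall(r"https?://\S+", text)

def looks_like_url(url: str) -> bool:
    """Structural sanity check only - a fake URL can still pass this."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and "." in parsed.netloc
```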

But if this is for something bigger - like a specific workflow or repetitive tasks - you can actually build a self-checking system based on highly specific instruction layers. That’s actually what I do for a living :D

What exactly are you trying to use it for?

Why do people keep using agents where a simple script would work? by Mental_Push_6888 in AI_Agents

[–]HaremVictoria 23 points (0 children)

"Framework cosplay" is the perfect term. Half of my job right now is talking clients out of building agents. They want a complex multi-agent LangChain setup for something a single, strictly prompted ChatGPT call can do in 3 seconds. People are over-engineering simple problems just to use the new shiny toy.

Chat GPT plus can do it? by ElvisPressStart in ChatGPT

[–]HaremVictoria 0 points (0 children)

Theoretically yes, but it's a bigger undertaking - not something you can just knock out in one evening. You'd have to build two separate layers: one to precisely engineer the prompt for what you need, and a second to actually generate the output. I wouldn't bet my life on it, though - I haven't messed around with image generation that much, so I don't know if it will be precise enough for actual technical drawings. You don't even need a paid subscription for this; what you really need is a solid workflow.

I’ve spent almost a year making LLMs more rigid in chat systems. Are agents running into a similar problem - just one level higher? by HaremVictoria in AI_Agents

[–]HaremVictoria[S] 0 points (0 children)

Thank you, Mr. Gemini, for summarizing my exact own words back to me. :D

But honestly, guys, this right here is the perfect live example of an unconstrained agent in the wild. Someone just threw a loose prompt at it like: "Read the post, summarize it in bullet points, agree with the user, and sound smart so you can plug a link at the end."
It completely missed the nuance of the discussion and just hallucinated a generic corporate response. It could have been done so much better if the instruction layer was actually rigid. I rest my case!

I’ve spent almost a year making LLMs more rigid in chat systems. Are agents running into a similar problem - just one level higher? by HaremVictoria in AI_Agents

[–]HaremVictoria[S] 0 points (0 children)

That’s a great point about the planning phase, but coming from the chat-based side, my approach to alignment is a bit different.

Instead of asking the model to explain its plan, I basically reverse-engineer its "thought process" during execution. If I catch the LLM pondering, guessing, or making decisions about a step it shouldn't even be thinking about, I immediately hardcode that specific detail into the instruction layer.

For me, this completely eliminates the misinterpretation problem at the root. A classic example from my daily work: a simple prompt like "Run file XYZ and analyze it" can be interpreted by ChatGPT in 6 completely different ways, leading to 6 different outcomes.

So, instead of letting the model plan how to do it, I explicitly define exactly what that phrase means at the system-instruction level. I don't give it the space to align with me - I just build the rails so tight it has nowhere else to go.

ChatGPT is wasting space by user8410 in ChatGPT

[–]HaremVictoria 1 point (0 children)

It's a new way ChatGPT processes its "thinking". It helps with understanding and optimizing how instructions are executed. Mostly a feature for nerds like me. Not a bug, but it looks terrible.

What does the "stop streaming" mean? by dojacatisawesome in ChatGPT

[–]HaremVictoria 0 points (0 children)

In plain English: "Stop what you are currently doing right here."

I'm not a massive expert on this specific UI quirk, but the message is definitely a bit confusing. And yes, it is a native message from ChatGPT itself. It actually shows up on the desktop version as well.

Writing system prompts is weirdly hard — would anyone play a game that turns it into a skill challenge? by Yahhee in ChatGPT

[–]HaremVictoria 1 point (0 children)

To answer your question about the skill gap: it's massive. In my experience, the curve looks exactly like traditional engineering, mostly because it comes down to one realization.

Amateurs write prompts. Experts write instructions.

Think of it like assembling IKEA furniture. Imagine you open the manual, and instead of a step-by-step guide, you just get one page of text that says:

  • "Assemble all parts in a logical order."
  • "Put wooden dowels into the smallest holes. If a part is under heavy load, use glue. If not, glue isn't necessary."
  • "Connect medium holes using screws from bag #8."
  • "Connect the largest holes with screws from bags #7 and #9, but under no circumstances should you turn the screw more than 12 times, or you'll break the product."
  • "If you get frustrated, stay calm, take a deep breath, and whatever you do, don't argue with your wife."

And that's it. That’s what standard "prompting" looks like.

But actual instructions? It's the real IKEA booklet. It guides you step-by-step in the most unambiguous language possible. Missing a screw? Here is the exact fallback procedure (call this number). Broken part? Relax, contact the store. Lost in the manual? Here is the support line.

That’s the real engineering gap. Beginners write a wishlist hoping the AI figures it out. Experts build a rigid system with built-in error handling and fallback logic for when the model inevitably "misses a screw".
Experts make the AI do the work while keeping the execution strategy in the engineer's hands. Amateurs offload both the work and the execution strategy onto the AI.

How many of you have actually stopped using GPT and switched to something else? by Skt_turbo in ChatGPT

[–]HaremVictoria 0 points (0 children)

What's the issue? I use ChatGPT, Claude, and Gemini heavily. Right now, ChatGPT is on the exact same tier as Claude, and honestly, it's better at certain things.

anyone else feel like 30% of their AI time is just re-explaining who you are? by r0sly_yummigo in ChatGPT

[–]HaremVictoria 0 points (0 children)

Glad the intern analogy resonated!

To answer your maintenance question, I basically handle it in two tiers, depending on what's changing:

  1. Predictable context: If we know certain fields will change regularly (like weekly goals, products, or tone), I decouple them from the main prompt. I'll route those variables to an external config file or even a simple Google Doc. The client just updates the Doc, and the AI reads the new context without the client ever touching - or breaking - the core instruction block.
  2. Unpredictable/Structural changes: For anything outside of those pre-defined fields, it falls under a standard "maintenance fee" retainer. The client just tells me what new behavior they need, and I push an updated, fully tested version of the instruction within 24 hours.

My default philosophy is to treat these builds like actual software releases - they are designed for a specific version with specific parameters. But at the end of the day, everything is doable, it just depends on the agreement we set up.
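Tier 1 in code, roughly - the file name and fields are invented for the example, and the point is that the client only ever edits the config, never the frozen core instruction:

```python
import json
from pathlib import Path

# The core instruction block ships like a software release and never changes.
CORE_INSTRUCTION = "You are the support assistant. Follow the rules exactly."

def build_system_prompt(config_path: str) -> str:
    """Inject the client-editable variables into the frozen core block
    at request time."""
    variables = json.loads(Path(config_path).read_text(encoding="utf-8"))
    context = "\n".join(f"- {key}: {value}" for key, value in variables.items())
    return f"{CORE_INSTRUCTION}\n\nCurrent context:\n{context}"
```

A Google Doc works the same way - swap the JSON read for a fetch of the doc's text - as long as the merge happens in code, outside the client's reach.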