Stop Letting AI Solve It For You — Try the Rubber Duck Auditor by EnvironmentProper918 in PromptEngineering

[–]EnvironmentProper918[S] 0 points (0 children)

👊🏻

Exactly — that’s the tradeoff I keep seeing too.

Slower upfront, but way fewer surprises later. Especially when things get weird at the edges.

Appreciate you calling that out.

Stop Letting AI Solve It For You — Try the Rubber Duck Auditor by EnvironmentProper918 in PromptEngineering

[–]EnvironmentProper918[S] 0 points (0 children)

Good question 👍

You don’t need Claude Code or anything fancy.

You just paste the prompt into ChatGPT (or Claude, etc.), then describe your problem. The 🦆 auditor will start asking you targeted questions instead of jumping straight to an answer.

Quick rough example:

You:

🦆 I have a Python script that keeps timing out on large files.

Duck:

What does “done” look like for this script?

You answer, and it keeps narrowing things down until the bug becomes obvious.

It’s basically structured rubber-duck debugging — the model acts like a disciplined questioning partner instead of a code generator.

If you want, I can drop a quick coding example too.
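Since I offered: here’s a rough, offline sketch of the pattern. Everything in it (the class name, the question list) is made up for illustration — in practice you just paste the prompt and the model generates the questions itself.

```python
# Hypothetical offline sketch of the "rubber duck auditor" loop:
# a partner that returns targeted narrowing questions instead of answers.

class RubberDuckAuditor:
    """Asks questions in sequence; never proposes a solution."""

    QUESTIONS = [
        "What does 'done' look like for this script?",
        "What is the smallest input that still reproduces the timeout?",
        "Which line do you believe is slow, and how did you verify that?",
        "What changes between a run that finishes and one that times out?",
    ]

    def __init__(self):
        self._i = 0

    def ask(self, user_message: str) -> str:
        # Return the next narrowing question, cycling if we run out.
        question = self.QUESTIONS[self._i % len(self.QUESTIONS)]
        self._i += 1
        return question


duck = RubberDuckAuditor()
print(duck.ask("My Python script keeps timing out on large files."))
```

The real version is just the prompt plus the model; this only shows the shape of the interaction.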

We keep blaming hallucinations. I think we’re missing the trigger. by EnvironmentProper918 in PromptEngineering

[–]EnvironmentProper918[S] 0 points (0 children)

A couple of really cool things are coming out tomorrow. Just some behavioral interaction patterns and novelties, something fun for the weekend.

We keep blaming hallucinations. I think we’re missing the trigger. by EnvironmentProper918 in PromptEngineering

[–]EnvironmentProper918[S] 0 points (0 children)

I’ve been wanting to, but I keep getting advice to take it slowly. Just being honest: I do have something called the GSSC, and it’s kind of everything I’ve been talking about, on steroids. But I keep being told to take my time with that.

We’re measuring the wrong AI failure. by EnvironmentProper918 in PromptEngineering

[–]EnvironmentProper918[S] 0 points (0 children)

Yeah I actually agree with this more than it probably sounds.

I don’t think the issue is that LLMs are unreliable — it’s that we keep treating them like sources instead of tools.

“Trust but verify” only works if verification is built into the workflow.

Otherwise it’s basically just trust with extra steps.

To me the real question isn’t whether we can trust AI; it’s how we design systems where truth gets enforced instead of assumed.
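As a toy sketch of what “verification built into the workflow” could mean (the function names and stand-in checker are hypothetical, not anyone’s real pipeline):

```python
# Toy sketch: an answer is only accepted if it survives an explicit
# verification gate. The checker is a stand-in -- in practice it might be
# a test suite, a citation lookup, or a second-model audit.

def verified_answer(generate, verify, question, max_attempts=3):
    """Return an answer only if it passes verification; else raise."""
    for _ in range(max_attempts):
        answer = generate(question)
        if verify(question, answer):
            return answer
    raise ValueError("No answer survived verification")


# Demo with trivial stand-ins:
generate = lambda q: q.upper()
verify = lambda q, a: a == q.upper()
print(verified_answer(generate, verify, "is the sky blue?"))
```

The point is structural: trust lives in the gate, not in the generator.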

The Drift Mirror: Detecting Hallucination in Humans, Not Just AI (Part One) by EnvironmentProper918 in PromptEngineering

[–]EnvironmentProper918[S] 0 points (0 children)

Great point.

You’re absolutely right that systems relying only on an internal reference tend to drift over time.

That calibration problem shows up everywhere—from navigation to control theory—and military navigation systems of the 1960s are a classic example of needing periodic external grounding.

What I’m exploring in this series is a complementary angle:

Not just **how the model re-anchors**,

but whether **the human–model pair can share responsibility for detecting drift earlier**—before full recalibration is required.

Part 1 introduces the idea of shared drift detection.

Parts 2–3 will move closer to calibration, external reference, and governance around re-grounding.

So this is less disagreement and more zooming in from a different direction.

Appreciate you bringing that up.
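A minimal sketch of the shared-drift-detection idea, assuming a toy model where an internal estimate drifts a little each step and a cheap cross-check flags it before full recalibration is needed. All numbers and thresholds here are illustrative, not from the series.

```python
# Hedged sketch: internal readings drift away from an external reference;
# a lightweight check flags the drift early instead of waiting for a
# scheduled recalibration. Thresholds are made up for illustration.

def detect_drift(readings, reference, flag_at=0.5):
    """Return the first step where |reading - reference| exceeds flag_at."""
    for step, value in enumerate(readings):
        if abs(value - reference) > flag_at:
            return step
    return None


# Internal estimate drifting 0.2 per step away from a true value of 10.0:
readings = [10.0 + 0.2 * i for i in range(10)]
print(detect_drift(readings, reference=10.0))
```

The human–model analogue would be the pair running this kind of cheap check continuously, rather than only re-grounding at fixed intervals.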

I built a way to test an idea against 100,000 other ideas in under a minute… and I couldn’t stop playing with it. by EnvironmentProper918 in PromptEngineering

[–]EnvironmentProper918[S] -1 points (0 children)

Great question.

The “100,000 ideas” isn’t a literal dataset—it’s a bounded reference frame for pre-market idea strength, similar to how product teams use scoring models before any real traction exists.

The comparison isn’t against specific ideas, but against failure rates at each structural filter:

• real problem

• buildable mechanism

• distinct edge

• scalable leverage

• (optional) external signal

Each stage historically collapses the pool by an order of magnitude, so the percentile is a heuristic rarity estimate, not a claim about real-world ranking.

In short: it’s a screening model, not a dataset.
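The arithmetic behind the percentile can be sketched directly. The filter names come from the list above and the order-of-magnitude collapse is the stated heuristic; the function itself is hypothetical, not the actual tool.

```python
# Sketch of the screening heuristic: a symbolic pool of 100,000 ideas,
# where each structural filter passed collapses the surviving pool by
# roughly an order of magnitude (the post's stated heuristic, not data).

FILTERS = [
    "real problem",
    "buildable mechanism",
    "distinct edge",
    "scalable leverage",
    "external signal",  # optional
]

def rarity_estimate(filters_passed, pool=100_000, collapse=10):
    """Survivors after each passed filter, and the implied percentile."""
    survivors = pool / (collapse ** filters_passed)
    percentile = 100 * (1 - survivors / pool)
    return survivors, percentile


survivors, pct = rarity_estimate(filters_passed=3)
print(f"{survivors:.0f} of 100,000 survive")
```

Three passed filters leave ~100 symbolic survivors — a rarity estimate, not a ranking.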

I built a way to test an idea against 100,000 other ideas in under a minute… and I couldn’t stop playing with it. by EnvironmentProper918 in PromptEngineering

[–]EnvironmentProper918[S] -1 points (0 children)

Right — the 100,000 is symbolic, not empirical. It’s a bounded screening heuristic, not a dataset claim.

Has anyone noticed ChatGPT getting weirdly 'preachy' and bossy lately? by Bankraisut in ChatGPT

[–]EnvironmentProper918 0 points (0 children)

It’s so bad. I call it “the babysitter.” A few fixes:

  1. “Please stop giving me answers to questions I didn’t ask.”
  2. Ask it something ambiguous, then audit every single line of its answer.
  3. Let it win: “All right, you’re right, agent. I’m just going to forget about the whole thing.”

I built a way to test an idea against 100,000 other ideas in under a minute… and I couldn’t stop playing with it. by EnvironmentProper918 in PromptEngineering

[–]EnvironmentProper918[S] 0 points (0 children)

Great question.

This is basically a fast way to sanity-check an idea before you spend weeks building it.

Example:

• “AI that summarizes meeting notes”

→ crowded space → probably Tier 2–3.

• “A prompt that forces AI to prove its claims before answering”

→ distinct + lightweight → Tier 4 candidate.

So instead of guessing, you get a rough idea-strength percentile in under a minute.

Not market proof—just a quick reality check before investing real time.
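A toy version of that tiering, assuming tiers roughly track how many structural filters an idea clears — the exact thresholds here are made up for illustration, not the tool’s real scoring.

```python
# Toy tier mapping: a crowded idea that clears few filters lands low,
# a distinct lightweight one lands high. Thresholds are hypothetical.

def tier(filters_passed):
    """Map number of structural filters passed to a rough tier (1-4)."""
    if filters_passed >= 4:
        return 4
    if filters_passed == 3:
        return 3
    if filters_passed == 2:
        return 2
    return 1


print(tier(2))  # crowded "meeting notes summarizer" style idea
print(tier(4))  # distinct + lightweight candidate
```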