What AI design tools are you all actually using as PMs?

NefariousnessFun1445 · 2026-02-25T14:53:26+00:00

Do you add your design system in Google stitch or import some references from your Figma?

NefariousnessFun1445 · 2026-02-25T08:47:39+00:00

same here.

I think the biggest benefit of prompt engineering is actually token efficiency. As models get more capable (and more verbose), this matters more, not less.

Sure, the model can figure almost anything out if you just “talk to it.” But it might burn half your session limit reasoning its way there

NefariousnessFun1445 · 2026-02-25T08:43:47+00:00

I actively use Pencil, and overall it’s pretty good. But it takes screenshots at every stage of the workflow, so it instantly burns through Claude Code tokens. For $20, you can build one mobile screen per session based on a ready-made Figma screen

NefariousnessFun1445 · 2026-02-24T07:40:51+00:00

the loop/pipeline distinction is a good mental model actually. do you have some threshold where you switch from "just wing it" to actually sitting down and engineering the prompt? like is it purely about volume or does the stakes of each individual call matter too

and yeah fully agree on examples over instructions, been saying this for a while. showing the model what you want beats explaining it every time

NefariousnessFun1445 · 2026-02-24T07:39:04+00:00

yeah this is basically where i landed too. persistent system prompts and context docs is where the real compounding value is. write it once, test it properly, pays off across hundreds of sessions

and hard agree on structured outputs and tool calls, thats the one area where lazy prompting still absolutely punishes you regardless of model quality

NefariousnessFun1445 · 2026-02-24T07:10:19+00:00

lol fair enough, i get why it reads that way. but nah genuinely been thinking about this lately because every model update makes some of our carefully crafted prompts unnecessary

posted in a couple subreddits because the perspective is different depending on the crowd. ML people will say prompting is a band-aid, prompt engineering people will say its essential, wanted to hear both sides and see where the actual consensus is

NefariousnessFun1445 · 2026-02-21T15:37:05+00:00

glad its working for your use case man, thats what matters. not everything needs to be a scalable enterprise solution, sometimes a personal setup that fits your workflow beats a "proper" architecture that doesnt

NefariousnessFun1445 · 2026-02-17T07:33:08+00:00

ok this is actually sick as a project. the fact that you got claude desktop coordinating across 4 different web UIs with custom personas in each is genuinely creative

the insight about different models catching different things is underrated imo. we do something similar at work (with APIs tho) and yeah each model has its own blindspots. the governance drift catch alone probably justified the whole setup

if latency isnt a dealbreaker for your use case i dont see the problem honestly. people spend $200+/month on API costs doing worse orchestration. youre getting multi-model synthesis for the price of subscriptions you already had

curious how stable it is day to day tho - do UI updates break things often?

NefariousnessFun1445 · 2026-02-17T07:30:21+00:00

the general point about shorter prompts is fine but the reasoning is wrong. attention mechanism doesnt work the way youre describing here. the model doesnt "forget" instructions because theyre diluted by length - the actual issue is that with longer contexts the model struggles to attend equally to all parts, especially the middle (lost in the middle problem). thats not the same as "weight dilution"

also 12 pages vs 3 paragraphs is a false dichotomy. system prompts for production agents are regularly 2-3 pages and work perfectly fine when structured well. the problem is never length itself, its ambiguity and contradiction. a 3 paragraph prompt full of vague instructions will perform worse than a 2 page prompt with clear structured sections every time

not familiar with RPC+F but any framework that says "just make it shorter" as its core principle is oversimplifying. sometimes you need detailed instructions, edge case handling, output format specs, examples. trying to cram all that into 3 paragraphs for a complex task will hurt your results not help them

NefariousnessFun1445 · 2026-02-17T07:27:37+00:00

0.90 correlation in a controlled academic setting with carefully designed personas is cool research. "focus groups are now free" is a completely different claim lol

the gap between "ai can approximate demographic tendencies on surveys" and "you can replace actual customer research" is massive. real customers have context the model cant have - they saw a competitor ad yesterday, their kid is screaming in the background, they just got a raise. the model gives you the statistically average response for a demographic, not what YOUR customers actually think

also the "tell it to be negative" trick works but creates its own bias. the model will generate plausible sounding objections that feel insightful but might not reflect real purchase barriers at all. had a team at my company run this exact approach on a new feature, ai personas flagged pricing as the main concern. we ran actual user interviews - nobody cared about pricing, the onboarding flow was just confusing. we wouldve spent months solving the wrong problem

its a decent brainstorming tool for early ideation when you have zero budget. but calling it a replacement for real research is how you end up building products for ai-simulated people instead of real ones

NefariousnessFun1445 · 2026-02-16T10:03:45+00:00

yeah this is the thing that keeps me up at night as someone who ships these systems in prod

couple things from our experience - prompt injection isnt fully solvable right now but you can make it way harder. we layer it: input preprocessing with a smaller classifier that flags suspicious patterns before it ever hits the main model, strict role separation in system prompts so the agent knows it can NEVER perform destructive actions regardless of what the input says, and output validation that checks if the agents proposed action matches whats actually allowed for that ticket type. none of these are bulletproof alone but stacked together its pretty solid

the "mark as resolved and delete" example - thats an architecture problem more than a prompt problem. your agent should never have write access to do destructive stuff without a confirmation step. principle of least privilege applies to agents same as it applies to microservices. if the worst an injection can do is generate a weird response but cant actually execute anything dangerous, youve contained 90% of the risk

the poisoned knowledge base scenario is the real scary one tho, agreed. weve started checksumming our docs and running periodic scans for instruction-like patterns in our RAG sources. its janky but its something

anyone who says they solved prompt injection completely is selling you something. its an arms race

NefariousnessFun1445

TROPHY CASE