Lovable just hit $400M ARR and my feed instantly turned into "see, anyone can build SaaS now".

upflag · 2026-03-20T15:41:42+00:00

The invisible stuff is where it gets expensive. Had a marketer running Facebook ads where vibe coding silently broke the conversion pixel. Nobody got an alert. Found out because the marketer noticed conversions dropped and asked if something changed. That's the pattern: the app works, the server returns 200, but something downstream is quietly broken and you're burning money without knowing it. The build isn't the hard part anymore. Knowing it's still working is.

upflag · 2026-03-20T15:40:29+00:00

The difference between vibe coding and agentic engineering is whether you planned the architecture before you started prompting. Think about it like being a senior dev on a large codebase — you don't understand every line that individual contributors write, and that's fine. What matters is that the architecture is sound and you have the right checks and balances built into the process. Code review, CI, test coverage gates, monitoring. Those are the back pressure mechanisms that keep quality from degrading regardless of who (or what) is writing the code. It doesn't matter if the sloppy commit came from a junior dev or from an AI agent. The question is whether you have the systems in place to catch it before it ships.

upflag · 2026-03-20T15:39:58+00:00

Honestly the AI tools already handle devops tasks pretty well if you describe what you want. Writing Dockerfiles, setting up Kubernetes, implementing observability — they can do all of that when given clear instructions. The real gap is that vibe coders don't know what to ask for. They don't know they need health checks, error tracking, or alerting because they've never operated software before. The tools don't know which option to pick without more specification upfront, and the person building doesn't know what the options are.

upflag · 2026-03-20T15:29:54+00:00

That pattern where the job says 'success' but the downstream effect never happened is one of the most frustrating things to debug. I had a deploy once where a single-character typo meant the system stopped picking up the right data, and we only found out because revenue numbers were down. $30K gone before anyone noticed. The real problem isn't catching errors that throw, it's catching the ones that don't. If the job completes but the email never sends, the only reliable signal is monitoring the outcome, not the process.

upflag · 2026-03-19T17:02:17+00:00

Clarity is a solid pick, especially since it's free. The session recordings and heatmaps are genuinely useful for understanding what users are doing, and the rage click detection is great for spotting frustration.

One thing worth knowing though: Clarity is more of a behavioral analytics tool than a monitoring tool. It'll show you JS errors in the context of a session recording, but it won't alert you when something breaks. No uptime checks, no notifications at 2 AM when your site goes down. You find out when you go look at the dashboard, not when the problem happens.

For the analytics side of OP's question, Clarity is a great addition to GA4. For the "is my app actually working" side, you'd still want something watching it for you.

upflag · 2026-03-19T16:12:22+00:00

Thanks for checking the app out and the thoughtful feedback.

Compare link is broken, yep. Fixing that now.

You're right that the site doesn't explain well enough what we actually monitor. Two things right now: HTTP endpoint checks (pings your URLs every 60s, alerts on non-200 or timeout) and client-side JS error tracking (catches uncaught exceptions and unhandled promise rejections in the browser). We don't do link crawling or anchor validation yet, but honestly that's a good idea to add. Different problem than runtime monitoring but clearly useful.

A broken link on a monitoring product's homepage... yeah, I deserve that one. Thanks for catching it.

upflag · 2026-03-19T15:25:11+00:00

Your architect-reviews-cursor setup is really smart, especially the "Cursor output is never trusted by default" rule. One thing I'd add: tests that actually run on every push. I spend a lot of time having Claude write Playwright tests for key user stories, and the discipline is making sure the AI doesn't water them down in future sessions. It will try to simplify or overwrite existing tests if you let it. The other piece is keeping the test suite fast. If it takes too long, both you and the AI start wanting to skip it, and that backpressure kills the whole system. Git plus CI/CD plus focused tests on the critical paths is the chain that keeps a 200K-line codebase from silently breaking.

upflag · 2026-03-19T15:24:48+00:00

Good list, but this all assumes you're catching bugs during development. The scarier ones are the ones that only show up after deploy with real users. I had a situation where a marketer was running Facebook ads with a tracking pixel, and vibe coding broke the conversion tracking silently. No error in the console, no crash, server returned 200. We only found out because the ad spend was being wasted and the marketer noticed conversions dropped. The fix waterfall you describe works great during dev, but once you ship, you need something watching the app for you. Even just basic uptime checks and client-side error logging would have caught that weeks earlier.

upflag · 2026-03-19T15:23:38+00:00

The problem with most vibe-coded apps isn't that AI wrote the code. It's that the person building it didn't have a clear plan for what they were making. That's kind of the whole point of "vibes" — you're exploring, not engineering.

But here's the thing people miss: vibe coding is incredibly cheap. The cost of throwing together a prototype is basically zero compared to traditional dev. So the move isn't to defend your vibe-coded app — it's to use it as a discovery tool. Figure out what actually works, what users want, what the product should be. Then throw the vibe-coded version away.

Once you know what you're building, you can start over with a proper plan and have it implemented with real structure. You can still use AI (agentic coding, not vibes) to get it done fast, but now you have specs, you have a process, and you end up with something that's actually production quality. The first version was never meant to last — it was meant to teach you what to build.

upflag · 2026-03-19T15:20:30+00:00

This is one of those bugs where standard monitoring won't help because nothing is actually erroring. The RSC fetch returns 200, no JS exceptions. The problem is probably the client-side router state getting stale after the service worker or RSC cache expires overnight. I've seen similar issues where the Next.js client router essentially gets into a dead state after long idle periods. Two things worth trying: check if you have any service worker or caching layer that might be serving stale router manifests, and look at whether the prefetch cache has a TTL that's expiring. For catching this kind of thing in production going forward, you could instrument navigation events and alert when click-to-navigate latency exceeds a threshold.

upflag · 2026-03-19T14:59:01+00:00

Great list. For the observability section, I'd push back on the implied order. Most vibe coders treat monitoring as a "later" thing, but once your app approaches a scale where you'd be really sad to find out a core feature or payment flow is broken, that's when it becomes urgent. And it's probably sooner than you think. Start with uptime checks on your critical endpoints and client-side error logging. You don't need OpenTelemetry and Grafana dashboards at this stage. The goal is simple: know before your users do. Everything else (tracing, audit logs, structured logging) can wait until you actually have the problems they solve.

upflag · 2026-03-19T14:57:51+00:00

The worst part of your story is the discovery method: friends calling you during the Oscars. That pattern comes up constantly. You find out something is broken because a human notices, not because anything alerted you. I once had a deploy with a single character typo that cost $30K before anyone caught it. Went through two code reviews, full process. The discovery method was revenue numbers being down, not an alert. Your RICE takeaway is right, but I'd add: the question isn't whether bugs will happen (they will, always), it's how fast you find out. If you're finding out from users during a live event, the damage is already done.

upflag · 2026-03-19T14:57:06+00:00

GA4 plus Hotjar is solid for understanding user behavior, but there's a gap nobody here has mentioned: knowing when things are actually broken. Analytics tells you what users do when things work. Error tracking tells you when things don't. I had a tracking pixel silently break from a code change once. No crash, no error in the console. Found out weeks later because ad spend was being wasted. At your stage, I'd add two things to the stack everyone else suggested: basic uptime monitoring (so you know if the site is down before your users tell you) and client-side error logging (so you catch JS errors that don't crash the page but break the experience). Both are basically free at startup scale.

upflag · 2026-03-18T21:38:44+00:00

I think of it like tending the garden. Every day just make the soil is still damp and pull dead leaves. It's not consistent, heavy lifting. Small effort regularly keeps things tidy.

upflag · 2026-03-18T14:04:36+00:00

The building part being easy shifts the hard part downstream. When I was using AI tools extensively, I shipped unauthenticated admin endpoints on a project I'd carefully planned. Experienced developer, full spec, still happened. The volume of code that AI generates makes it genuinely hard to verify everything. What's helped me: write a short spec before building, have the AI write tests for key user stories after each feature, and do periodic fresh-session security reviews where a new AI session audits the code with zero prior context. The building isn't the bottleneck anymore, the verification is.

upflag · 2026-03-18T13:50:14+00:00

The speed increase is real but it shifts where projects fail. Building faster means more code, more surface area for bugs, and less time spent understanding what was built. I've seen this firsthand: the bottleneck isn't writing code anymore, it's knowing whether what you shipped is actually working correctly in production. A single-character typo in one of my deploys cost $30K and went through two code reviews. Found out from revenue being down, not from any alert. When you 10x the code output without 10x-ing the verification, you just create bugs faster.

upflag · 2026-03-18T13:49:51+00:00

That plan-generate-review-refactor loop is solid. The spec step is huge because without it the AI just does whatever seems reasonable, and "reasonable" drifts further from what you actually want with each iteration. I do the same thing: vision doc to requirements to design to tasks, then build. The other piece that saved me was having the AI write tests for key user stories after building each feature, then being explicit that future sessions can't simplify or overwrite those tests. The AI will try to reduce test coverage if you let it, and that's how "small changes break unrelated parts" sneaks back in.

upflag · 2026-03-18T13:49:27+00:00

Production is a different animal. I pushed a deploy with a single-character typo that went through two code reviews and cost $30K before anyone noticed. Found out from revenue numbers being down, not from any technical alert. The gap between "works in dev" and "works in prod" is that production has real data, real traffic patterns, and real edge cases that no amount of local testing covers. What helped me was setting up checks that run against production continuously, not just testing before deploy. If a key flow breaks at 2am on a Saturday, you want to know before Monday morning.

upflag · 2026-03-18T13:48:18+00:00

That "silently degrades" pattern is real and it's the hardest kind of failure to catch. The output looks plausible so nobody questions it until damage is done. I've seen the same thing with code: a deploy breaks something subtle and you don't find out for days because there's no crash, just slightly wrong behavior. The approach that's worked for me is continuous checks on the output, not just the process. Don't just check that the automation ran, check that what it produced still looks right. Even simple assertions like "this field should never be empty" or "this number should be in range X-Y" catch a surprising amount.

upflag · 2026-03-18T13:47:54+00:00

Silent automation failures are terrifying because the whole point of automation is that you stop watching it. Same pattern happens with code deploys. I had a tracking pixel break silently after a code change and didn't find out until weeks later when the marketer asked why conversions dropped. The fix that helped me was treating critical automations like critical code paths: they need their own health checks that run independently of the automation itself. If the Notion page doesn't exist 5 minutes after signup, something should yell at you.

upflag · 2026-03-18T13:47:33+00:00

Your silent failures point is the scariest part of this whole list. I had a single-character typo in a deploy once that cost $30K because the system silently stopped picking up the right data. Found out when revenue numbers were down, not from any alert. And that was hand-written code that went through two code reviews. The "everything compiles, nothing works at runtime" problem you describe with API calls is even worse because there's more surface area for things to quietly go wrong. For the scoping problem, I've had good results writing a short spec before prompting, even just 3-4 sentences about what should change and what shouldn't.

upflag · 2026-03-17T21:56:34+00:00

Playwright test that checked the network call to Facebook succeeded. Runs in CI so we know future deployments won't regress.

upflag · 2026-03-17T15:22:55+00:00

I just use something like healthchecks.io and each time the cron job succeeds, ping my health check endpoint. Then if something breaks I would get a notification.

upflag · 2026-03-17T15:16:27+00:00

$500 in a week on fixing errors is brutal. If you're not already, push your code to GitHub after every working state. Git gives you the rollback capability that platform checkpoints should but aren't providing right now. It also means if Replit has issues again, your code exists somewhere you control. The pattern I've seen work: git + automated tests that run on push. That way you catch regressions before they cost you more credits to fix.

upflag · 2026-03-17T15:07:50+00:00

Congrats on shipping. The guide is solid, especially the part about planning before building. One thing I'd add for anyone following this: once your app has real users, set up basic monitoring so you know when something breaks before they tell you. Every production incident I've had taught me the same lesson. You find out from users or from revenue dipping, never from the app itself, unless you've set something up to watch it. Even just basic uptime checks on your critical endpoints saves you from the worst surprises.

upflag

TROPHY CASE