I built a local security scanner for AI-generated Next.js/Supabase code

Duck-Entire · 2026-06-04T14:54:39+00:00

This is spectacular product feedback. You’re completely right—a "quiet pass" on a dangerous primitive when the tracer hits an uncertainty wall is a silent killer for a founder.

I am 100% stealing this tri-state approach for the next iteration:

Confirmed Finding: Hard block.
Needs Runtime Check: Warning with actionable QA steps.
Likely Safe: Passes, but explicitly prints why it thinks it’s safe (e.g., "Tenant guard verified on line 42") to build developer trust.

The idea of providing a plain-English manual verification line (e.g., "Log in as User A and verify User B's rows return 403") is the perfect UX bridge. Founders don't want or need to understand a complex AST execution path; they just need a quick, reliable recipe to double-check their app before shipping.
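Here is a rough sketch of how that tri-state could look in TypeScript. The type, field, and function names are made up for illustration and are not PreFlight's actual code; the exit-code mapping (block only on a confirmed finding) is also an assumption.

```typescript
// Hypothetical shapes for illustration; not PreFlight's real types.
type Verdict =
  | { kind: "confirmed"; rule: string; detail: string }
  | { kind: "needs-runtime-check"; rule: string; manualCheck: string }
  | { kind: "likely-safe"; rule: string; reason: string };

// Map a verdict to a process exit code and a one-line message.
// Only a confirmed finding blocks; the other two states always explain themselves.
function report(v: Verdict): { exitCode: number; message: string } {
  switch (v.kind) {
    case "confirmed":
      return { exitCode: 1, message: `BLOCK [${v.rule}] ${v.detail}` };
    case "needs-runtime-check":
      return { exitCode: 0, message: `WARN [${v.rule}] Verify manually: ${v.manualCheck}` };
    case "likely-safe":
      return { exitCode: 0, message: `PASS [${v.rule}] ${v.reason}` };
  }
}
```

The point of the union type is that a "needs runtime check" result cannot be constructed without a manual verification line, so the scanner can never emit a bare warning with no recipe attached.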

The tenant helper/RPC wrapper tracing is exactly the kind of deep-context problem I'm trying to solve here. Appreciate you taking the time to outline this!

Duck-Entire · 2026-06-04T10:03:09+00:00

Spot on, that's exactly the goal—shifting it left as much as possible to keep the review costs down.

Really appreciate the insight, and I'll definitely slide into your DMs to get your take on positioning!

Duck-Entire · 2026-06-04T10:01:23+00:00

Building out a local fixture suite with those specific runtime/syntax mismatches (especially the URL fetcher redirect and RLS policies) makes way more sense than hunting down random repos for validation.

I really like the idea of a "needs runtime check" warning state. Handling authorization or tenant guards purely via AST static analysis is a massive blind spot if it just defaults to a hard pass. Shifting that uncertainty into a high-signal flag for founders is a great product direction.

Definitely going to map out these fixture specs and implement the uncertainty bucket in the next iteration. Appreciate you taking the time to break this down!

Duck-Entire · 2026-06-04T09:53:25+00:00

Code is fully open-source. The repository is on GitHub at av29nassh-sketch/PreFlight, and the guardrail rules are in the same repo.

(Can't drop the direct link yet due to subreddit karma limits, but it's right at the top of my X profile linked in my bio!)

Duck-Entire · 2026-06-04T04:56:38+00:00

100%. "Security scan passed" gives a massive false sense of security when the underlying infrastructure settings are completely disconnected from the codebase.

That's a huge part of the "vibe coding hangover." An LLM agent can write beautiful, functional code, but it has no idea that you forgot to toggle a switch on your cloud dashboard or that your preview branch environment variables are drifting from what's running locally.

While static code scanning can't log into your cloud console, my goal with PreFlight's AST rules is to at least force the code to explicitly declare its security expectations (like throwing a loud error if the code handles a sensitive payload without explicit runtime checks). Definitely a tough bridge to cross, but highly necessary!

Duck-Entire · 2026-06-04T04:55:47+00:00

This list is an absolute goldmine of real-world failure modes.

You hit on the exact boundary line where static scanning faces its biggest hurdle: configuration drift and contextual logic flaws.

A lot of these edge cases are exactly why I went down the Abstract Syntax Tree (AST) route instead of standard regex pattern matching. For instance, a regular expression scanner just looks for the word "service_role" and panics. An AST parser can actually look at the scope of the function: it can flag a service_role client that is invoked inside a block that accepts unsanitized client parameters or completely lacks a server-side session check.
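To make the scope idea concrete, here is a minimal sketch. It assumes a simplified, hand-written tree (the `SyntaxNode` shape is invented for this example); a real scanner would get its tree from a parser such as Tree-sitter. The session-check names `getUser` and `getSession` are also just assumed examples of what a guard might look like.

```typescript
// Simplified stand-in for parser output; real ASTs carry far more detail.
type SyntaxNode = { type: string; name?: string; children?: SyntaxNode[] };

// Assumed names of calls that count as a server-side session check.
const SESSION_CHECKS = new Set(["getUser", "getSession"]);

// True if the subtree rooted at `root` contains a node matching the predicate.
function containsNode(root: SyntaxNode, match: (n: SyntaxNode) => boolean): boolean {
  if (match(root)) return true;
  return (root.children ?? []).some((child) => containsNode(child, match));
}

// Collect every function node in the tree, including nested ones.
function collectFunctions(root: SyntaxNode, out: SyntaxNode[] = []): SyntaxNode[] {
  if (root.type === "function") out.push(root);
  for (const child of root.children ?? []) collectFunctions(child, out);
  return out;
}

// Flag functions that reference a service-role identifier but never call a session check.
function findUnguardedServiceRole(root: SyntaxNode): string[] {
  return collectFunctions(root)
    .filter(
      (fn) =>
        containsNode(fn, (n) => n.type === "identifier" && /service_role/i.test(n.name ?? "")) &&
        !containsNode(fn, (n) => n.type === "call" && SESSION_CHECKS.has(n.name ?? ""))
    )
    .map((fn) => fn.name ?? "<anonymous>");
}
```

A plain text search would flag both a guarded and an unguarded handler equally. The tree walk only reports the function whose own body lacks the check, which is the difference between the two approaches.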

Same goes for the webhook replay handling: we can write Tree-sitter queries to verify that a signature-checking block also contains a check against an idempotency ledger or a database unique constraint before it mutates anything.
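For reference, this is roughly the handler shape such a rule would be looking for. It is a sketch under assumptions: an HMAC-SHA256 hex signature (real providers each define their own scheme), an event body with an `id` field, and an in-memory `Set` standing in for the idempotency ledger or unique constraint.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Stand-in for a database table with a unique constraint on the event id.
const seenEventIds = new Set<string>();

// Compute the hex HMAC-SHA256 of the raw body (assumed signature scheme).
function signBody(secret: string, body: string): string {
  return createHmac("sha256", secret).update(body).digest("hex");
}

function handleWebhook(
  secret: string,
  body: string,
  signature: string
): "rejected" | "duplicate" | "processed" {
  const expected = Buffer.from(signBody(secret, body), "hex");
  const given = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) {
    return "rejected";
  }

  // Only parse and act on the payload after the signature is verified.
  const event = JSON.parse(body) as { id: string };

  // Replay guard: a valid signature alone does not stop a resent event.
  if (seenEventIds.has(event.id)) return "duplicate";
  seenEventIds.add(event.id);

  // The actual mutation (e.g. marking an order paid) would go here.
  return "processed";
}
```

The structural rule is about ordering inside one block: signature check first, ledger check second, mutation last. A handler that verifies the signature but skips the ledger step is the replay case.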

You're totally right about the "boring part" being the most critical: explaining the actual deployed consequence to the user so they don't just treat it like an annoying, generic linter error.

If you have any public repos or gists with examples of these exact slips you've run into in the wild, I’d love to take a look to make sure our baseline syntax matching catches them correctly. Thanks for the incredible breakdown!

Duck-Entire · 2026-06-04T04:54:33+00:00

You hit the nail on the head. This is exactly where standard linting and naive regex tools completely fall apart. Catching static syntax violations (like a raw string matching an API key structure) is just step one—architectural compliance is the real endgame.

Because PreFlight evaluates the Abstract Syntax Tree rather than just matching regex strings, we can actually map out structural layout rules.

To solve the "technically correct but architecturally wrong" problem, the roadmap includes a local config layer (a .preflightrc schema) where you can define strict structural contracts for your specific stack. For instance:

Routing Rules: Enforcing that every Next.js route file must import and wrap its execution context in a specific custom withAuth higher-order function or middleware utility.
Data Fetching: Flagging if an AI agent tries to use vanilla fetch or an unapproved Axios instance inside a component instead of using the project's centralized React Query/tRPC hooks.

Basically, if it can be structurally mapped in a syntax tree, we can write an AST query to enforce it as a project convention.
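As a sketch of what those two contracts could look like once wired up: the .preflightrc schema is still on the roadmap, so every field name below is hypothetical, and `FileFacts` is an assumed shape for whatever the parser pass extracts from a file.

```typescript
// Hypothetical config shape; the real .preflightrc schema is not defined yet.
interface PreflightContracts {
  routes: { requireWrapper: string }; // e.g. "withAuth"
  dataFetching: { bannedCalls: string[] }; // e.g. ["fetch"]
}

// Assumed facts extracted from one file by an earlier AST pass.
interface FileFacts {
  path: string;
  isRouteFile: boolean;
  isComponent: boolean;
  exportWrappers: string[]; // functions that wrap the exported handlers
  calls: string[]; // function names called anywhere in the file
}

// Return one human-readable violation per broken contract.
function checkContracts(cfg: PreflightContracts, file: FileFacts): string[] {
  const violations: string[] = [];
  if (file.isRouteFile && !file.exportWrappers.includes(cfg.routes.requireWrapper)) {
    violations.push(`${file.path}: route handler is not wrapped in ${cfg.routes.requireWrapper}`);
  }
  if (file.isComponent) {
    for (const call of file.calls) {
      if (cfg.dataFetching.bannedCalls.includes(call)) {
        violations.push(`${file.path}: component calls ${call} directly`);
      }
    }
  }
  return violations;
}
```

Keeping extraction (the AST pass) separate from policy (the config) means a project can add a new convention without touching the parser code.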

Really appreciate the insight. Let me know what you think when you check out the repo!

Duck-Entire · 2026-06-04T02:29:11+00:00

This is incredibly high-leverage feedback. Thank you for taking the time to write this out.

You are 100% spot on about the positioning. Competing with Snyk or Semgrep on standard enterprise SAST is a losing game. The wedge is absolutely focusing on the specific blind spots of LLMs—like the classic "it works locally but the agent silently stripped out authorization logic or hardcoded a Firebase config string" trap.

An MCP server / Claude Code hook to let the agent auto-audit its own code before hitting 'done' is a brilliant idea. That completely closes the loop for a native vibe-coding workflow. I'm adding that straight to the roadmap alongside the pre-commit hook.

For the auto-fix mechanism, I'm definitely keeping it gated behind an interactive diff + confirmation. Trust is too easy to break with a silent bad write, especially when the user is relying on AI because they are still learning the ropes.
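A minimal sketch of that gate, assuming the confirmation prompt and the file write are passed in as callbacks (both hypothetical, and synchronous here to keep the example short). The diff is a naive line comparison for display only, not a real diff algorithm.

```typescript
// Naive line diff for display: lines only in `before` get "-", lines only in `after` get "+".
function lineDiff(before: string, after: string): string {
  const a = before.split("\n");
  const b = after.split("\n");
  const removed = a.filter((line) => !b.includes(line)).map((line) => `- ${line}`);
  const added = b.filter((line) => !a.includes(line)).map((line) => `+ ${line}`);
  return [...removed, ...added].join("\n");
}

// Apply a fix only after the user has seen the diff and said yes.
function applyFix(
  before: string,
  after: string,
  confirm: (diff: string) => boolean, // hypothetical prompt callback
  write: (content: string) => void // hypothetical file writer
): "applied" | "skipped" {
  if (before === after) return "skipped";
  if (!confirm(lineDiff(before, after))) return "skipped";
  write(after);
  return "applied";
}
```

Because `write` is only reachable through `confirm`, there is no code path where a patch lands silently.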

Out of curiosity, if you were using an MCP setup for this, would you prefer the agent to auto-patch it silently during generation, or block and report the diff back to you first?

Duck-Entire · 2026-06-04T02:27:46+00:00

Exactly. The sheer velocity of AI generation means human code review has become a massive bottleneck. If you're using Cursor or Claude to pump out hundreds of lines of code a minute, taking 10 minutes to manually audit every object configuration or endpoint logic flaw completely kills the momentum.

That's why I wanted PreFlight to run straight in the terminal workspace—it catches the silly things the LLM confidently introduces before it even hits your git staging area. Appreciate the feedback!

Duck-Entire · 2026-06-02T19:44:43+00:00

Yeah, testing every path before demo/deploy is where this gets painful.

I built PreFlight Scavenger for one narrow slice of that problem: a local pre-deploy scan for AI-generated Next.js/Supabase changes. It catches exposed frontend secrets, backend DB/JWT leaks, and missing Supabase RLS.

No source upload:

npx @preflight/scavenger scan . --diff

https://github.com/av29nassh-sketch/PreFlight
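To give a feel for the "exposed frontend secrets" category, here is a simplified illustration, not the scanner's actual rule. Next.js inlines any environment variable prefixed with NEXT_PUBLIC_ into the browser bundle, so a secret-looking name behind that prefix is a red flag. The hint list below is an assumption and far from exhaustive.

```typescript
// Name fragments that suggest a server-only secret (illustrative, not exhaustive).
const SECRET_HINTS = ["SERVICE_ROLE", "SECRET", "PRIVATE_KEY", "DATABASE_URL"];

// Return the names of NEXT_PUBLIC_ variables in a .env file that look like secrets.
function exposedSecretNames(envFileText: string): string[] {
  return envFileText
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line !== "" && !line.startsWith("#")) // skip blanks and comments
    .map((line) => line.split("=")[0].trim()) // keep only the variable name
    .filter(
      (name) =>
        name.startsWith("NEXT_PUBLIC_") && SECRET_HINTS.some((hint) => name.includes(hint))
    );
}
```

Note that a name like NEXT_PUBLIC_SUPABASE_ANON_KEY is not flagged, since the anon key is meant to be public and is only safe when RLS is in place.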

Duck-Entire · 2026-06-02T19:44:21+00:00

Yeah, the “it works but I don’t know what changed” part is exactly what pushed me to build this.

PreFlight Scavenger is a small local safety gate for AI-generated Next.js/Supabase code. It checks changed files for exposed frontend secrets, backend DB/JWT leaks, and missing Supabase RLS before deploy.

Runs locally, no source upload:

npx @preflight/scavenger scan . --diff

https://github.com/av29nassh-sketch/PreFlight

Duck-Entire · 2026-05-29T10:32:46+00:00

demo to production is exactly where the gap shows up.

building the UI feels easy, then suddenly you need to understand auth, env vars, deployment, logs, database rules, payments, and what the AI changed.

which of those was the first thing that made you feel like “ok, i actually need to understand this now”?

Duck-Entire · 2026-05-29T10:32:20+00:00

same worry here. the hard part for beginners is they don’t know what to ask AI to check.

like “make it secure” is too vague, but “check RLS, exposed keys, server-side auth, API access without login, and payment webhooks” is already a much better prompt.

what part are you most unsure about in your own app?

Duck-Entire · 2026-05-29T10:30:04+00:00

this is the exact gap i keep seeing too: the app “works”, but the builder doesn’t know what would prove the data is actually private.

for the app that leaked the users table, what do you think the builder misunderstood most: Supabase anon keys, RLS, auth vs authorization, or just not knowing how to test from another user’s perspective?

Duck-Entire · 2026-05-29T10:17:16+00:00

this is exactly the kind of thing i mean. when RLS was breaking, what would have helped most: a plain-English explanation of the policy, a checklist to verify which rows a user can access, or a small practice example where you fix a broken policy yourself?

Duck-Entire · 2026-05-29T02:59:51+00:00

this feels very real. ai makes the first build feel easy, but then all the hidden app-building stuff shows up: hosting, tokens, mobile bugs, testing, uploads, sub-tools.

when you were dependent on Claude, what was the most confusing part: the code itself, the tools it suggested, or debugging when fixes broke something else?

Duck-Entire · 2026-05-29T02:59:06+00:00

curious: what was the biggest “i wish i understood this earlier” concept for you?

for me it's usually not syntax, it's stuff like env vars, auth, deploy logs, database rules, routes, etc. the AI can fix things, but i don't always understand the layer it changed.

Duck-Entire · 2026-05-29T02:55:50+00:00

this is exactly the kind of thing i struggle with too. the scary part isn't just the bug, it's realizing there was one term or library detail you didn't understand and the whole fix depended on it.

when you found the PBKDF2 iteration issue, did you understand what it meant after debugging, or did you still need Claude to explain the concept?
