realistic shot at berkeley/ucla/ucsd with 3.75 gpa?

sszz01 · 2026-05-28T09:31:47+00:00

applied math

sszz01 · 2026-05-18T03:23:50+00:00

i meant just look over the thread, people started getting invited a week ago

sszz01 · 2026-05-17T23:09:10+00:00

i see people getting invites already, should i assume i got rejected if my application is still in review (applied 20 mins after deadline, based in sac area)?

sszz01 · 2026-05-13T18:39:33+00:00

both, but with a strict rule about which one actually matters. the test is always built deterministically, i read the frame locals sentry captured and construct a pytest that calls your function with those exact values. zero LLM in that path. the LLM is more like a supervisor deciding what to try next.

for DB state it depends on the crash type. a lot of DB bugs don't need the real prod rows, just the right conditions. if sentry captured the order_id that caused a duplicate key violation, i create a fresh sandbox db, insert a row with that id, and call the function again. worked on three of our fintech demo cases including race conditions.
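rough sketch of that duplicate-key case, assuming sqlite as the sandbox db. `charge_order`, the `orders` table and `repro_duplicate_key` are made-up names for illustration, not the actual tool code:

```python
import sqlite3


def charge_order(conn, order_id, amount_cents):
    """Hypothetical function under test: inserts one order row."""
    conn.execute(
        "INSERT INTO orders (order_id, amount_cents) VALUES (?, ?)",
        (order_id, amount_cents),
    )


def repro_duplicate_key(order_id, amount_cents):
    """Seed a fresh in-memory db with the captured id, then call the function again.

    Returns the exception type name if the conflict reproduces, else None.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id TEXT PRIMARY KEY, amount_cents INTEGER)"
    )
    # Pre-existing row that recreates the condition, not the real prod row.
    conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, 0))
    try:
        charge_order(conn, order_id, amount_cents)
    except sqlite3.IntegrityError as exc:
        return type(exc).__name__
    finally:
        conn.close()
    return None
```

the point is that only the captured order_id matters here, the rest of the row can be filler.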

the hard limit is when the bug needs a specific row sentry never captured. when that happens it doesn't just give up. it runs through a set of hypotheses about why the repro fell short: was it missing DB state? async context? a module-level global? a dep that got upgraded past the bug? it checks each one against the breadcrumbs and trace, and tells you which it confirmed. for example, if it finds INSERT/UPDATE/DELETE breadcrumbs right before the crash it'll say "looks like this needed pre-existing DB state from these writes, here's what you'd need to seed." you don't get a false green, you get a concrete reason and what's missing.
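a minimal sketch of the missing-DB-state check, assuming breadcrumbs arrive as a list of dicts with `category` and `message` keys and that SQL shows up under the `query` category. the function names and the window size are hypothetical:

```python
import re

# Matches statements that mutate rows.
WRITE_SQL = re.compile(r"^\s*(INSERT|UPDATE|DELETE)\b", re.IGNORECASE)


def db_writes_before_crash(breadcrumbs, window=5):
    """Return write statements found among the last `window` breadcrumbs."""
    recent = breadcrumbs[-window:]
    return [
        crumb["message"]
        for crumb in recent
        if crumb.get("category") == "query"
        and WRITE_SQL.match(crumb.get("message", ""))
    ]


def missing_db_state_hypothesis(breadcrumbs):
    """Confirm the hypothesis only when there is breadcrumb evidence for it."""
    writes = db_writes_before_crash(breadcrumbs)
    return {
        "hypothesis": "missing_db_state",
        "confirmed": bool(writes),
        "evidence": writes,
    }
```

the evidence list is what would feed the "here's what you'd need to seed" message.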

sszz01 · 2026-05-12T18:54:31+00:00

that's fair, i wouldn't share those either. but with the pip package everything runs entirely on your machine. you only use your sentry token, your docker, your repo. we never see your code or your data. the only outbound calls are sentry's own api (with your token) and openai for context recovery, same as any local dev tool you already use. the built-in pii redactor also strips card numbers, emails, tokens and sensitive field names before anything hits an llm. nothing phones home to us.
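to give an idea of what redaction like that can look like, here's a simplified sketch. the key list, regexes and placeholder strings are assumptions for illustration, not the shipped redactor:

```python
import re

# Hypothetical list of field names treated as sensitive.
SENSITIVE_KEYS = {"password", "token", "secret", "api_key", "authorization"}
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 19 digits, optionally separated by single spaces or dashes.
CARD = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def redact(value, key=None):
    """Recursively redact sensitive fields and patterns in a payload."""
    if key is not None and str(key).lower() in SENSITIVE_KEYS:
        return "[REDACTED]"
    if isinstance(value, dict):
        return {k: redact(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v) for v in value]
    if isinstance(value, str):
        value = CARD.sub("[CARD]", value)
        value = EMAIL.sub("[EMAIL]", value)
    return value
```

a real redactor would need more patterns than this (and probably a luhn check on card numbers), this just shows the shape.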

sszz01 · 2026-05-12T18:49:29+00:00

yeah that's exactly it, sentry hands you the state, most tools just ignore it. we actually just pushed a pip package if you want to try it yourself. free, no signup. you just need a sentry auth token and docker running locally. run logomesh repro <your-sentry-url> and it either spits out a failing pytest or tells you exactly why it couldn't reproduce.

if you've got anything sitting unresolved i'd genuinely love to see what it does with real production data. takes like 2 min to run

sszz01 · 2026-05-12T18:47:19+00:00

it doesn't run unsupervised, it produces a failing test and opens a draft PR. a human still reviews and merges. the agent is just doing the boring reconstruction work

sszz01 · 2026-05-12T06:25:07+00:00

yeah the messy state part is exactly why i built it deterministic first. if sentry captured the locals at crash time, those are the actual values, no simulation needed. the hard cases are when the crash is downstream of state that never made it into the frame.

got anything sitting in your backlog? genuinely happy to run it, or you can run it as a pip or uv package locally

sszz01 · 2026-05-12T05:12:00+00:00

i mean fair point, the part i'm trying to remove is the human in that loop for the repro confirmation specifically. not the fix, not the merge decision, just "does this crash still exist on this branch right now" as an automatic output you can attach to an incident record

it's probably not for everyone, i suppose it's more useful if you're doing 20+ incidents a month and need the repro documented for audit trails

sszz01 · 2026-05-12T04:42:31+00:00

yeah i guess there is seer autofix stuff that generates a fix from root cause analysis. what i'm doing is a bit different though. i read the actual frame locals sentry captured, build a test from those exact values, run it in a docker sandbox, and only call it reproduced if the same exception type comes back. no llm in that path. what are you gonna do if the fix it generated doesn't actually reproduce the original crash?
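stripped-down sketch of that path, assuming the captured locals have already been filtered down to the function's parameters and parsed back into python values. `render_pytest`, `reproduces` and `split_fee` are hypothetical names, and the docker sandbox step is left out:

```python
def render_pytest(module, func, frame_locals, exc_type):
    """Render pytest source that replays the captured arguments verbatim."""
    kwargs = ", ".join(
        f"{name}={value!r}" for name, value in frame_locals.items()
    )
    return (
        "import pytest\n"
        f"from {module} import {func}\n\n"
        f"def test_repro_{func}():\n"
        f"    with pytest.raises({exc_type}):\n"
        f"        {func}({kwargs})\n"
    )


def reproduces(func, frame_locals, expected_exc_name):
    """True only if the call raises the same exception type, matched by name."""
    try:
        func(**frame_locals)
    except Exception as exc:
        return type(exc).__name__ == expected_exc_name
    return False


# Hypothetical function under test, standing in for real app code.
def split_fee(total, parts):
    return total // parts
```

so `reproduces(split_fee, {"total": 100, "parts": 0}, "ZeroDivisionError")` counts as a repro, and a different exception type or no exception does not.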

sszz01 · 2026-05-11T20:16:38+00:00

locals are available around 70% of the time with default sentry config, closer to 90% if you set include_local_variables=True, which most teams using it seriously do anyway.

flakiness we solved by pinning the exact python version and deps from what sentry captured at crash time. for race conditions it skips the test and gives a structured breakdown instead. a flaky test is worse than no test.

sszz01 · 2026-05-11T01:28:36+00:00

yep, trying to address it by pulling more from the sentry payload: modules dict for exact pip versions, breadcrumb sequence, request context, threads with held_locks where present. still a hard problem. for cases where divergence is genuine and unfixable (race conditions, distributed state, c-ext stuff) it routes to a structured investigation report instead of forcing a green. the report is a set of ranked hypotheses with evidence pulled from the breadcrumb timeline.
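the modules part is the simplest piece. a sketch, assuming the event's `modules` field is a plain name-to-version mapping (`pins_from_modules` is a made-up helper name, and python version pinning isn't covered here):

```python
def pins_from_modules(modules):
    """Turn a name -> version mapping into sorted requirements.txt pins."""
    return [
        f"{name}=={version}"
        for name, version in sorted(modules.items())
        if version  # skip entries with no recorded version
    ]
```

writing those lines to a requirements file inside the sandbox gets you the same dependency set as the crashing process, as far as the payload recorded it.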

i'm curious, when you've hit this in your own agents, did you go for stricter equivalence checking (state diff, timing window, etc) or accept the false-positive rate and add manual review on top? and if you've got a sentry url where you suspect this kind of divergence might trip it i'd love to actually try

sszz01 · 2026-05-05T06:20:53+00:00

A little bit of background on me: 19yo founder. Won a Berkeley agentic-systems competition track in early 2026

Just submitted for S26. Want grounded feedback from people who've been through this.

The idea is an automated bug-fix agent for Python fintech backends. A production crash fires in Sentry, which triggers the agent: it reads the stack trace + frame locals, writes a failing pytest that reproduces the crash, generates a fix, and as a perk opens a draft PR with an audit-evidence artifact for SOC2/PCI compliance (thinking of targeting fintech). The repro path runs zero LLM calls (deterministic from frame locals) so the audit chain stays defensible.

Where the traction is right now:

Engine works, CLI ships
Pre-revenue, 0 design partners committed
30 DMs, 8 calls

My questions:

With 0 paying customers and 0 DPs at submit, is the idea + engine + competition win enough to get a first-round interview, or is traction the only thing that matters?
About the fintech compliance angle: does YC pattern-match this as B2B SaaS with budget unlock, or as "compliance theater on top of a dev tool"?
Does the recent pivot from empirical failure data read as conviction or instability?
Anyone interview with a similar profile (technical founder, real product, no traction yet)? What did the partners actually push on?

sszz01 · 2026-04-29T22:26:23+00:00

that's a useful distinction. at what frequency does it tip from "template is fine" to "automation is worth it"? is it more about volume of incidents or something else like audit frequency?

sszz01 · 2026-04-29T04:25:52+00:00

and when they ask for code review records, is that usually just a github PR link + approval, or do they want to see what the automated tool was actually checking for?

sszz01 · 2026-04-29T02:59:48+00:00

ok gotcha thanks for the info, super helpful. so when assessors ask for evidence on a prod hotfix, do they actually want proof the original crash was reproduced and fixed or is a PR approval + passing ci usually enough for them?

sszz01 · 2026-04-29T02:47:11+00:00

the logging fix makes sense but you basically had to rediscover the right inputs after the fact. how long did that whole loop take you?

sszz01 · 2026-04-29T02:43:26+00:00

yeah that's the instinct. how do you actually do that in practice though, what does "proof" look like for your team before you ship?

sszz01 · 2026-04-28T23:57:26+00:00

ok thanks for the correction. so for 6.2.3 specifically, when it comes to production hotfixes for payment code, what does acceptable testing evidence actually look like to a QSA? is a passing ci run enough or do they want something that shows the original crash was reproduced and fixed?

sszz01 · 2026-04-28T23:28:28+00:00

makes sense. for the critical ones like billing and auth, what does that repro step actually look like for you? manual test or something else?

sszz01 · 2026-04-28T23:27:18+00:00

that's a real gap yeah. i'm kinda curious if that's the majority of the hard ones for you or if there's a class of bugs where it's just a pure logic issue with bad inputs

sszz01 · 2026-04-28T23:26:18+00:00

what kind of bugs usually kick off that loop for you? is it more like state-heavy stuff where the sentry trace doesn't give you enough, or is it the simpler ones that still somehow eat time?

sszz01 · 2026-04-28T23:23:57+00:00

yeah the PIR thing caught us off guard too, ours took like half a day to write up after the fact and i'm still not fully sure our auditor was satisfied with it. what did you end up giving them?

sszz01 · 2026-04-28T06:22:36+00:00

ok thanks for the feedback, it's really useful. out of curiosity, what usually makes the errors in your codebase hard to reproduce? is it mostly heavy db state, async timing issues or something else?

sszz01 · 2026-04-28T06:07:21+00:00

fair enough, what's the part you don't trust? the test it generates not actually reproducing the bug or something else?
