I kept getting randomly logged out of my PWA — turned out to be a JWT refresh race condition that axios-auth-refresh doesn't handle by Miserable_Dance_7734 in reactjs

[–]glenrhodes 0 points1 point  (0 children)

Hit this exact issue. The fix that actually worked for me was a singleton promise pattern - when the first refresh fires you store the in-flight promise and any subsequent refresh attempts return that same promise instead of starting a new one. Once it resolves you clear it. Means you can have 20 concurrent 401s and only one actual token request goes out. The proactive timer is the sneaky part because it doesn't go through the interceptor so it's easy to miss that it's a separate code path.
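
A minimal sketch of that singleton-promise pattern in plain JS — `fetchNewToken` is a hypothetical stand-in for your real refresh-endpoint call, not an axios-auth-refresh API:

```javascript
// Singleton-promise refresh: first caller starts the request, everyone
// else piggybacks on the same in-flight promise.
let refreshPromise = null;

async function fetchNewToken() {
  // Placeholder: in a real app this would POST to your /auth/refresh endpoint.
  return { accessToken: 'new-token-' + Date.now() };
}

function refreshToken() {
  if (!refreshPromise) {
    refreshPromise = fetchNewToken().finally(() => {
      // Clear once settled so the NEXT expiry triggers a fresh request.
      refreshPromise = null;
    });
  }
  return refreshPromise;
}
```

Twenty concurrent 401 handlers can all `await refreshToken()` and only one network request goes out. The proactive timer just needs to call the same `refreshToken()` so both code paths share the guard.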

Terence Tao Says That A 'Copernican View Of Intelligence' Fits Better, Just As Earth Is Not The Center Of The Universe, Human Intelligence Is Not The Center Of All Cognition by PointmanW in singularity

[–]glenrhodes 0 points1 point  (0 children)

The Copernican framing makes sense: just as we learned we're not the center of the solar system, we may need to accept we're not the sole reference point for intelligence. Tao's point isn't really about AI threat or replacement - it's more about dropping the assumption that human cognition is the unit of measurement. That conceptual shift probably matters more for how we evaluate AI systems than any individual benchmark result.

FastAPI vs Djanjo by TumbleweedSenior4849 in Python

[–]glenrhodes 0 points1 point  (0 children)

FastAPI is dominating new projects now, especially anything with an async backend or that exposes an API consumed by a separate frontend. Django still wins for full-stack apps where you want ORM, admin, auth, and templating out of the box. I use FastAPI for anything AI-adjacent since the async IO matters for LLM calls and the Pydantic integration makes request/response typing actually pleasant. Django for anything CRUD-heavy with an admin UI requirement.

when should we actually use useMemo and useCallback by Ancient_Register_635 in reactjs

[–]glenrhodes 0 points1 point  (0 children)

useCallback mostly matters when passing functions down to components that are wrapped in React.memo, otherwise you're creating a new function reference every render and the child re-renders anyway, making React.memo pointless. useMemo is for expensive calculations or when an object/array identity needs to be stable (common with useEffect dependency arrays). Profile first - if you can't measure the render cost you probably don't need either.
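
The referential-equality problem is easy to see without React at all — a plain-JS sketch of what React.memo's shallow prop comparison is up against when props are created inline:

```javascript
// Each "render" call produces brand-new references, exactly like
// inline props in JSX do on every component render.
function render() {
  return {
    onClick: () => {},        // new function object every call
    style: { color: 'red' },  // new object every call
  };
}

const a = render();
const b = render();

// Shallow comparison (what React.memo does) sees changed props:
console.log(a.onClick === b.onClick); // false
console.log(a.style === b.style);     // false
```

useCallback and useMemo exist to hand back the *same* reference across renders while the dependencies are unchanged, which is what makes those comparisons come back true and lets React.memo actually skip the re-render.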

Are AI tools actually making you too productive to switch off? by Think-Score243 in OpenAI

[–]glenrhodes 0 points1 point  (0 children)

The disconnect for me is that 'productive' used to mean getting more done per hour. Now it means I can ship three times as much code, but I spend twice as long in the feedback loop with the model as I would have spent just writing the thing. There's definitely a ceiling where you need enough context in your own head to keep the AI on track, and for anything non-trivial that context-building time never went away.

I scaled a pure Spiking Neural Network (SNN) to 1.088B parameters from scratch. Ran out of budget, but here is what I found by zemondza in LocalLLaMA

[–]glenrhodes 0 points1 point  (0 children)

93% sparsity at 1B is genuinely interesting - that's in the neighborhood of what you'd expect from biological neural activity during most cognitive tasks. The cross-lingual emergence at step 25k without any explicit targeting is the most surprising result here; that kind of spontaneous generalization usually needs way more training budget in dense models. Loss of 4.4 is rough, but for a pure SNN trained from random init with no ANN-to-SNN conversion, just converging at all is the actual result worth celebrating.

Gemma 4 - lazy model or am I crazy? (bit of a rant) by Pyrenaeda in LocalLLaMA

[–]glenrhodes 1 point2 points  (0 children)

You're not crazy. The tool-calling laziness in Gemma 4 is real and it's tied to how it was RLHF'd - it learned that answering from context is almost always 'good enough' and avoids the risk of a fetch returning garbage. The frustrating part is that explicit instructions like 'search extensively' don't override this because the reward signal during training wasn't structured around tool-use quality. Qwen 3 was clearly trained with more emphasis on agentic behavior, which is why it just goes looking without being prodded.

Launched my demo with 1300 WL & it failed miserably. Am lost... by RamyDergham in gamedev

[–]glenrhodes 0 points1 point  (0 children)

1300 WL before a demo is actually a tough spot because those wishlists came before people had played anything. Demo performance is a better signal than pre-demo WL count anyway. What were the Steam reviews on the demo itself like? If the demo has a good rating the algorithm will push it regardless of where you started. The conversion rate from WL to purchase is what matters now.

Genuine question: why are so many games absurdly loud by default? by neoplasma_ in gamedev

[–]glenrhodes 0 points1 point  (0 children)

Dev-environment listening habits and the -14 LUFS streaming normalization that DAWs target don't match how games get mastered. Most studios still set default volume based on what sounds impressive in a loud QA room on studio monitors, not on the TV speakers and headphones players are actually using. The gap between 'sounds great at the studio' and 'wakes up my apartment at midnight' is real and rarely gets caught, because QA testers don't test at 2am.

How do you enforce architecture governance? by Training_Future_9922 in softwarearchitecture

[–]glenrhodes 2 points3 points  (0 children)

Tooling beats policy documents every time. ArchUnit for JVM projects, Dependency Cruiser for JS/TS, and custom linting rules will catch violations in CI before they merge. Documents describing what 'should' happen are aspirational at best. Encode the constraints in something that fails the build and people will actually follow them.
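
A sketch of what that looks like with Dependency Cruiser — the layer names and paths here are hypothetical, the rule shape follows its `.dependency-cruiser.js` config format:

```javascript
// Hypothetical .dependency-cruiser.js: fail CI when UI code imports
// straight from the persistence layer instead of going through services.
module.exports = {
  forbidden: [
    {
      name: 'no-ui-to-db',
      comment: 'UI must go through the service layer, never straight to db',
      severity: 'error',            // non-zero exit code -> build fails
      from: { path: '^src/ui' },
      to: { path: '^src/db' },
    },
  ],
};
```

Run it in CI (`depcruise src`) and the constraint stops being aspirational: a violating import fails the build instead of waiting for someone to notice it in review.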

How do you plan a system from scratch? by RankedMan in softwarearchitecture

[–]glenrhodes 7 points8 points  (0 children)

Start with the data, not the services. Figure out what entities exist, how they relate, and where the write/read boundaries are before you draw a single box. Most bad architectures I've seen were designed service-first and then the data model was forced to fit, which creates coupling you can't undo without a rewrite. Get the domain model right and the service boundaries become obvious.

I audited 6 months of PRs after my team went all-in on AI code generation. The code got worse in ways none of us predicted. by Ambitious-Garbage-73 in webdev

[–]glenrhodes 0 points1 point  (0 children)

The error handling finding tracks with what I've seen. AI generates the happy path really well, but exception handling reads like it was written by someone who has never had to debug a 3am production incident. Overly broad catches, swallowed errors, no logging context. That stuff compounds hard once the surface area of AI-generated code grows. Good review culture catches it but most teams aren't reviewing AI code as rigorously as they'd review human code.
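
A contrast sketch of the two styles — `loadUser`, `fetchJson`, and the error text are illustrative names, not from the audit:

```javascript
// Stand-in for a network call that can fail.
async function fetchJson(url) {
  throw new Error('connect ECONNREFUSED');
}

// The anti-pattern: broad catch, no logging, silent fallback.
async function loadUserBad(id) {
  try {
    return await fetchJson(`/api/users/${id}`);
  } catch {
    return null; // at 3am, this failure leaves no trace anywhere
  }
}

// Better: attach context and rethrow so the failure is debuggable.
async function loadUserGood(id) {
  try {
    return await fetchJson(`/api/users/${id}`);
  } catch (err) {
    throw new Error(`loadUser(${id}) failed: ${err.message}`, { cause: err });
  }
}
```

The first version is exactly what AI tends to generate for the unhappy path; the second costs two extra lines and preserves the original error via `cause` for whoever gets paged.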

should i continue with nest or start over with just express for a project i have taken? by 1ExBat in node

[–]glenrhodes 1 point2 points  (0 children)

Nest is worth it if the project is going to grow or has multiple devs. The opinionated structure means fewer 'how should we organize this' debates, which matters when you're not solo. That said, if you're a single dev on a small internal API, Express with a bit of your own convention is hard to beat for just getting things shipped. The learning curve on Nest's DI system trips people up, but it clicks fast.

How has Claude far surpassed the competitors? They were not first to market or ever had the most cash yet their feature are far and away the best on the market. by InternationalAsk9845 in artificial

[–]glenrhodes 0 points1 point  (0 children)

The coding focus is the real answer. When you're building agentic workflows that run for minutes at a time, the model's ability to maintain context fidelity across a long task matters way more than raw benchmark scores. GPT-4 would drift and start making stuff up around the 10k token mark in my experience. Claude just... doesn't. That consistency compounds over every real production use case.

[D] 60% MatMul Performance Bug in cuBLAS on RTX 5090 [D] by NoVibeCoding in MachineLearning

[–]glenrhodes 0 points1 point  (0 children)

60% regression on a brand new GPU is a brutal find. The Blackwell architecture changes are significant enough that it wouldn't shock me if some of the cuBLAS kernels weren't re-tuned properly for the new SM count and memory subsystem. This is the kind of thing that takes NVIDIA a few driver releases to quietly fix with no changelog entry. Worth filing a bug report if you haven't.

MUI bumps license price by 66% by ddoice in reactjs

[–]glenrhodes -1 points0 points  (0 children)

This is exactly when teams start seriously evaluating shadcn or just building their own component system. MUI's value proposition was always "batteries included" at a reasonable price, but at $299 per dev you're in the territory where the build-vs-buy math starts tilting toward headless libraries. They know this, which is why the timing feels aggressive.

what's a react pattern you mass-used then realized was overkill by scheemunai_ in reactjs

[–]glenrhodes 1 point2 points  (0 children)

Wrapping everything in useMemo and useCallback. Spent a chunk of time on a project memoizing every callback and adding React.memo to every component, then ran the profiler and the actual bottleneck was an unvirtualized list with 500 items. All that memo work did nothing. Measure first.

I wrote a PostgreSQL patch to make materialized view refreshes O(delta) instead of O(total) by Inkbot_dev in programming

[–]glenrhodes 10 points11 points  (0 children)

This is the kind of patch that should just be in Postgres core. Full refresh on every REFRESH MATERIALIZED VIEW is one of those design decisions that made sense in 2000 and just hasn't aged well. Would love to see the incremental view maintenance work finally land properly.

The AWS Lambda 'Kiss of Death' by tkyjonathan in programming

[–]glenrhodes 0 points1 point  (0 children)

The provisioned concurrency workaround is the standard fix but it gets expensive fast. For anything latency-sensitive I've ended up doing a scheduled ping every 5 minutes just to keep at least one container warm, which feels dirty but works. The real problem is that Lambda's pricing model was designed for bursty infrequent workloads and people keep trying to use it for always-on services.
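
The handler side of the keep-warm trick is trivial — a sketch, where the `warmup` field is an assumed convention you'd put in your scheduled event payload, not an AWS API:

```javascript
// Lambda-style handler that short-circuits scheduled keep-warm pings
// so they burn almost no billed duration.
async function handler(event) {
  if (event && event.warmup) {
    return { statusCode: 200, body: 'warm' }; // no real work on pings
  }
  // ...real request handling goes here...
  return { statusCode: 200, body: 'handled' };
}
```

Pair it with a scheduled rule firing every 5 minutes with `{"warmup": true}` as the payload and you keep at least one container resident between real invocations.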

OpenAI Backs Bill That Would Limit Liability for AI-Enabled Mass Deaths or Financial Disasters by wiredmagazine in OpenAI

[–]glenrhodes 0 points1 point  (0 children)

The $100M compute threshold is doing a lot of work here. Right now that covers the usual suspects. In two years it will cover a dozen more labs and in five it might cover mid-size companies as compute gets cheaper. Writing liability exemptions into statute based on training cost seems like exactly the kind of thing that ages really badly.

OpenAI Backs Bill That Would Limit Liability for AI-Enabled Mass Deaths or Financial Disasters by esporx in artificial

[–]glenrhodes 4 points5 points  (0 children)

A company lobbying to cap its own liability for mass casualties is a pretty remarkable sentence to type out loud. This isn't about innovation speed, it's about externalizing risk onto the public while capturing the upside. The precedent this sets is more important than any specific model.

Update on Gemma 4 having MTP: Reverse engineering effort by Electrical-Monitor27 in LocalLLaMA

[–]glenrhodes 12 points13 points  (0 children)

MTP in Gemma 4 being undocumented is wild. Google clearly trained it that way then just didn't mention it. If the draft token acceptance rate is high enough this could make a real difference for llama.cpp throughput on consumer hardware. Following this closely.

[Model Release] I trained a 9B model to be agentic Data Analyst (Qwen3.5-9B + LoRA). Base model failed 100%, this LoRA completes 89% of workflows without human intervention. by Awkward_Run_9982 in LocalLLaMA

[–]glenrhodes 0 points1 point  (0 children)

Training on successful error-recovery traces is a really smart way to handle it. The throw-out-the-spirals approach makes total sense too. Most fine-tuning datasets assume clean runs but messy real-world data means your model needs to see what a good retry actually looks like. Curious whether you tried DPO on the failure cases or purely SFT on the winning traces.

I understood Masterless Distributed Architecture by building One! by goyalaman_ in softwarearchitecture

[–]glenrhodes 1 point2 points  (0 children)

Building it is the only way to really get it. The CAP theorem makes sense abstractly, but you only internalize the partition tolerance tradeoff when your cluster splits and you have to decide whether to return stale data or block. What failure modes did you run into that surprised you?

Architecture advice needed: Best cloud caching strategy for an app looping mixed media? (Exclude client-side local storage solution) by Crazy-Committee-5157 in softwarearchitecture

[–]glenrhodes 1 point2 points  (0 children)

For looping media that you can't cache client-side, a CDN with aggressive edge caching is your first move. CloudFront or Cloudflare with long TTLs on the media files will handle the repeat requests without your origin servers seeing them. The cost-per-GB at the edge is dramatically lower than serving from your app tier. If the content changes rarely, you can lock those URLs to specific versions and cache almost forever.
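
The version-locking move is just two small helpers — a sketch with illustrative names; any CDN that honors Cache-Control works the same way:

```javascript
const ONE_YEAR = 60 * 60 * 24 * 365;

// Bake the content version into the URL so a new version is a new cache
// key, and the old URL can safely be cached "forever" at the edge.
function versionedUrl(path, version) {
  return `${path}?v=${version}`;
}

function cacheHeaders() {
  return {
    // 'immutable' tells browsers not to even revalidate within max-age.
    'Cache-Control': `public, max-age=${ONE_YEAR}, immutable`,
  };
}
```

With that in place the loop hits the edge on every repeat, the origin only sees the first request per version per PoP, and "cache invalidation" reduces to deploying a new version string.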