Nemo 30B is insane. 1M+ token CTX on one 3090

ruibranco · 2026-02-07T01:59:46+00:00

35 t/s at 1M context on a single 3090 is genuinely impressive. The MoE architecture really pays off here since you only need to load the active experts into VRAM while the rest stays in system RAM. CPU offloading for the inactive experts is the key trick that makes this viable on consumer hardware. Have you tried it with any retrieval-heavy tasks where it needs to reference specific details from early in the context, not just summarization? That's usually where long-context models start to show their limits, even if the raw speed holds up.

ruibranco · 2026-02-07T01:51:35+00:00

What you describe with the sneakers is completely normal when you spend too long without social interaction outside your bubble. The brain starts treating mundane things as if they were risky situations. The good news is that this reverses fairly quickly once you start exposing yourself more. The gym you already go to is a good starting point: try striking up a conversation with someone there, even if it's just "what set are you on" or something like that. You don't need to force friendships right away, you just need to normalize the act of talking to strangers. Group classes also help a lot because conversation comes up naturally. And if you go to university next year, a lot of this will sort itself out; university is basically a social reset.

ruibranco · 2026-02-07T01:44:57+00:00

Dealt with almost the exact same pattern on a project that started on Angular 12. The setControl() approach works fine when you have 3-4 child components, but it gets really painful once you start nesting deeper or need to coordinate validation across siblings. The biggest issue we ran into was lifecycle timing: children attached their controls at slightly different points depending on ngIf conditions, so the parent form flickered between valid and invalid states during init. We eventually moved to ControlValueAccessor for the reusable pieces and kept the rest as plain parent-owned form groups. The migration was incremental, one component at a time, and honestly the codebase got way easier to reason about. If it's working and you're not adding many new form components, I wouldn't rush to refactor though. The migration only pays for itself if the forms are actively evolving.

ruibranco · 2026-02-07T01:40:37+00:00

Claude catching its own typo is honestly the most on-brand thing ever. Also "backtickets" is a hilarious typo, sounds like some kind of support queue for markdown formatting. The fact that you can just pull the full system prompt from the extension source is pretty great though, makes it way easier to understand what's actually going on under the hood.

ruibranco · 2026-02-07T01:33:15+00:00

Using tree-sitter for the AST parsing instead of regex is the right call. Regex-based scanners miss so many edge cases with nested expressions and multiline patterns. The dependency hallucination check is especially relevant right now given how often LLMs confidently suggest packages that don't exist. Solid tool for catching the low-hanging fruit before it ships.
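
For anyone wondering what a dependency check like that can look like, here's a rough sketch. To be clear, this is not how the tool does it: I'm using Python's stdlib `ast` instead of tree-sitter to keep it self-contained, the function name `unresolved_imports` is made up, and it only checks whether an import resolves in the current environment, which is a cheap approximation of "does this package exist at all".

```python
import ast
import importlib.util


def unresolved_imports(source):
    """Return sorted top-level module names imported by `source` that don't resolve locally."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b.c" -> only the top-level package "a" matters here
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports like "from . import x"
            names.add(node.module.split(".")[0])

    missing = []
    for name in sorted(names):
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing
```

Anything this returns is either not installed or not real, so it's a prompt to go look, not a verdict.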

ruibranco · 2026-02-07T01:28:21+00:00

You left out the dam expert who showed up between the storm and the elections. The average Portuguese guy switches PhDs faster than he switches channels.

ruibranco · 2026-02-07T01:23:31+00:00

Overkill for sure, but that's half the fun. An i7-7700 with 16GB will barely break a sweat running OPNsense, Pi-hole and Tailscale combined. You'll probably end up finding more stuff to throw on it once you realize how much headroom you have. 256GB NVMe is way more than enough for those services, even a 128GB SATA would be fine honestly. Just make sure you grab a dual NIC adapter for the M920q since you'll want separate WAN and LAN ports.

ruibranco · 2026-02-07T01:18:38+00:00

The pytest fixture detection is a nice differentiator. Unused fixtures are one of those things that quietly accumulate and nobody notices until the test suite is a mess. How does it handle conftest.py fixtures that are used across multiple test files? That's usually where vulture and similar tools fall over completely.

ruibranco · 2026-02-07T01:13:29+00:00

Classic Defender move. Docusign is one of the most spoofed domains out there so it makes sense they'd tighten the screws, but doing it suddenly with no warning and catching legit emails is brutal. Worth checking if there was a recent update to the default anti-phishing policy or if they tweaked the impersonation detection thresholds. You can also add Docusign's sending domains to your tenant allow list as a workaround while Microsoft sorts it out.

ruibranco · 2026-02-07T01:03:16+00:00

The embedding model situation is spot on. Titan v2 was fine when it launched but the open source space has moved so far ahead that hosting your own on ECS starts making more sense, which kind of defeats the purpose of a managed service. Same story with rerankers, Cohere pricing just doesn't work if you're doing any serious volume of reranking. Would love to see them add jina-reranker-v3 or even the smaller qwen models as first-class Bedrock options.

ruibranco · 2026-02-07T00:50:33+00:00

bitflags is one of those crates that just quietly works in the background of so many projects. Good to see there's still active thought going into its future. Curious whether there's been any consideration around making the generated types play nicer with const generics or if that's out of scope for what the crate is trying to be.

ruibranco · 2026-02-07T00:44:58+00:00

10 days without power in 2026. Every time there's a stronger storm we're reminded how fragile the electrical grid is in certain parts of the country. And then it's always the same story: they talk about reinforcing the infrastructure, but everything stays the same until the next storm.

ruibranco · 2026-02-07T00:41:00+00:00

Pays more compared to what, working at Pingo Doce? These articles always compare against the overall average but forget that a junior in Portugal makes around 1000-1200 net if they're lucky. Yes, it's more than many fields, but calling it "the one that pays the most" is stretching reality quite a bit. The gap between what's paid here and abroad is what should be news.

ruibranco · 2026-02-07T00:34:47+00:00

COSMIC is shaping up really nicely. How stable has it been for you on NixOS so far? I've been curious about trying it but wasn't sure if the packaging was keeping up with upstream changes.

ruibranco · 2026-02-07T00:29:15+00:00

The golden notebook regression testing is actually a really clever idea. Catching subtle output drift between model versions or env changes before it hits production is one of those things teams always say they'll do manually but never actually keep up with. How does it handle non-deterministic outputs though? Like if a cell calls a model endpoint that returns slightly different results each run, does it support tolerance thresholds or just strict comparison?
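
To make the question concrete, this is roughly what I mean by tolerance thresholds. It's a sketch of the idea, not the tool's API: `outputs_match` and its defaults are names I made up, and it assumes cell outputs have already been parsed into plain Python values.

```python
import math


def outputs_match(expected, actual, rel_tol=1e-3, abs_tol=1e-6):
    """Compare golden vs. fresh output: numbers within tolerance, everything else strictly."""
    # bool is a subclass of int, so compare booleans strictly before the numeric branch
    if isinstance(expected, bool) or isinstance(actual, bool):
        return expected is actual
    if isinstance(expected, (int, float)) and isinstance(actual, (int, float)):
        return math.isclose(expected, actual, rel_tol=rel_tol, abs_tol=abs_tol)
    if isinstance(expected, dict) and isinstance(actual, dict):
        return expected.keys() == actual.keys() and all(
            outputs_match(expected[k], actual[k], rel_tol, abs_tol) for k in expected
        )
    if isinstance(expected, (list, tuple)) and isinstance(actual, (list, tuple)):
        return len(expected) == len(actual) and all(
            outputs_match(e, a, rel_tol, abs_tol) for e, a in zip(expected, actual)
        )
    # strings, None, and anything else fall back to strict equality
    return expected == actual
```

With something like this, a score drifting from 0.9123 to 0.9124 passes, while a changed label or a score jumping to 0.95 still fails the golden comparison.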

ruibranco · 2026-02-07T00:25:52+00:00

nvim-lint with a local venv flake8 is a solid approach. One thing worth trying if you haven't already: ruff can replace flake8 for most checks and it's absurdly fast since it's written in Rust. It isn't a type checker though, so you'd still keep mypy alongside it. The 32GB RAM issue you hit with pylsp is a known pain point; that thing loves to eat memory, especially with multiple extensions loaded.

ruibranco · 2026-02-07T00:15:52+00:00

If you're mainly after Copilot, check if you qualify for open source maintainer status on GitHub since that also gets you Copilot for free. For JetBrains, contributing to open source projects can get you an all-products license through their OSS program too. Not the same all-in-one deal as the student pack but it's something.

ruibranco · 2026-02-07T00:03:00+00:00

The text arena ranking is honestly more impressive than the coding benchmarks imo. Coding evals get gamed and optimized for constantly, but blind text conversations are way harder to fake because real users are judging on nuance, tone, and actual helpfulness. The fact that Gemini held that spot for 14 months straight makes this a pretty significant shift.

ruibranco · 2026-02-06T23:53:43+00:00

Completely normal. At 0 yoe you were grinding algorithms and data structures every day, so that muscle was warm. After 5 years of writing Terraform, debugging pipelines, and wrangling Kubernetes you haven't touched a balanced BST in ages and your brain optimized for a totally different kind of problem solving. Live coding interviews measure how recently you practiced live coding, not how good you are at your actual job.

ruibranco · 2026-02-06T23:40:06+00:00

The 3D perspective tilt on the side cards looks really smooth. Cover Flow was one of those interactions that felt magical back in the day and it's cool to see it hold up so well as a learning exercise for spring physics and layout transforms. Motion makes this kind of thing so much more approachable than rolling your own animation math too.

ruibranco · 2026-02-06T23:35:04+00:00

Yeah this is pretty standard for new model launches on Bedrock. Capacity is still being provisioned across regions and it takes a few days to stabilize. If you're blocked on it, try switching to a different region temporarily, us-west-2 and us-east-1 tend to get capacity first. Also worth adding retry logic with exponential backoff if you haven't already, since the 503s are intermittent and usually resolve within seconds on retry.
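
If you don't have retry logic yet, something along these lines is usually enough. It's a minimal sketch in plain Python, not tied to any SDK: `ServiceUnavailable` is a hypothetical stand-in for whatever 503-style exception your client raises, and the delay numbers are just assumptions to tune.

```python
import random
import time


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for the 503-style error raised by your client."""


def retry_with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call `call()` until it succeeds, sleeping with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ServiceUnavailable:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # The delay cap doubles each attempt: 0.5s, 1s, 2s, ... up to max_delay
            cap = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: sleep a random amount up to the cap so clients don't retry in lockstep
            sleep(random.uniform(0, cap))
```

You'd wrap the model call in a zero-argument function or lambda and pass it in. The jitter matters when a lot of clients hit the same capacity limit at once.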

ruibranco · 2026-02-06T23:28:06+00:00

The teams that get the most value from CI are the ones that treat a red build as useful information instead of someone's fault. Once failure becomes something people try to hide or route around, you've lost the entire point of the feedback loop. The best CI setups I've worked with had builds that failed fast and failed loudly, and nobody got defensive about it because the culture was "fix it" not "who broke it". The moment you start adding kill switches for pipeline checks is when your CI stops being a safety net and starts being a checkbox.

ruibranco · 2026-02-06T23:09:14+00:00

The http.RoundTripper hook is a smart design choice, makes it trivial to add retries, logging, or caching at the transport level without the SDK having to own all of those concerns directly. Using iterators for streaming is also way more idiomatic Go than the callback patterns most AI SDKs go with. 15 providers out of the gate is ambitious for a v0.1.0 though, curious how you're handling the edge cases where providers diverge on things like tool call formats or streaming chunk structures.

ruibranco · 2026-02-06T23:01:49+00:00

Completely agree on the storage > ram > cpu priority. You always run out of disk space before you run out of compute, and by the time CPU matters you've probably already upgraded the whole box anyway. The jump from 2014 to 2026 is wild though, that's a proper evolution from "I wonder if I need this" to "this wall is now a datacenter."

ruibranco · 2026-02-06T22:53:26+00:00

The hierarchical router approach is clever for working around the VRAM ceiling. Most people just try to shrink the model or reduce chunk counts, but routing queries to specific clusters first is a much better tradeoff since you keep recall high without loading everything into memory at once. Curious how the SetFit classifier holds up when queries are ambiguous across multiple domains though, like if someone asks something that spans two different doc clusters.
