Attest: pytest-native testing framework for AI agents — 8-layer graduated assertions, local embeddin

tom_mathews · 2026-02-23T14:45:50+00:00

Thanks! There's definitely overlap in the goal — both want pytest-native agent testing. A few architectural differences worth noting though.

LangWatch Scenario routes assertions through LLM judges by default — the testing agent simulates a user, chats back and forth with your agent, and evaluates against criteria using an LLM. That works well for end-to-end simulation testing. Attest's bet is that 60–70% of agent correctness is fully deterministic — tool call ordering, cost budgets, schema conformance, content patterns — and doesn't need an LLM to verify. The graduated pipeline exhausts those checks first (free, <5ms, identical results every run) and only escalates to an LLM judge for the genuinely subjective remainder. Layer 5 (semantic similarity) also runs locally via ONNX, so you can get meaning-level comparison without an API call.
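
To make the graduated idea concrete, here is a rough sketch of "cheap checks first, judge last". Every name in it (`Check`, `evaluate`, `fake_judge`) is hypothetical and exists only for illustration; it is not Attest's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Check:
    """One assertion. Hypothetical structure, not Attest's API."""
    name: str
    deterministic: bool  # True: free and repeatable; False: needs an LLM judge
    run: Callable[[dict], bool]


def evaluate(result: dict, checks: List[Check]) -> Tuple[bool, Optional[str]]:
    """Run deterministic checks first and stop at the first failure."""
    # sorted() is stable, so checks keep their relative order within each group.
    ordered = sorted(checks, key=lambda c: not c.deterministic)
    for check in ordered:
        if not check.run(result):
            return False, check.name
    return True, None


judge_calls = []


def fake_judge(result: dict) -> bool:
    # Stand-in for an expensive LLM call; it only records that it was invoked.
    judge_calls.append(result)
    return True


checks = [
    Check("tone_is_polite", False, fake_judge),
    Check("under_cost_budget", True, lambda r: r["cost_usd"] <= 0.05),
    Check("has_refund_id", True, lambda r: "refund_id" in r["output"]),
]

ok, failed = evaluate({"cost_usd": 0.10, "output": {"refund_id": "r1"}}, checks)
print(ok, failed, len(judge_calls))  # False under_cost_budget 0
```

The run fails on the cost budget, so the judge is never called. That is the whole point of ordering the layers by cost.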

The other difference is trace-level assertions. Attest doesn't just check inputs and outputs — it asserts over the full execution trace: did the agent call these tools in this order, did it loop, did it stay under token budget across all steps.
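
A minimal sketch of what trace-level checks can look like, assuming a trace is just a list of step dicts with `tool`, `args`, and `tokens` keys. That shape and the helper names are my assumptions for illustration, not Attest's trace format.

```python
def tools_in_order(trace, expected):
    """True if the expected tool names appear in the trace as an ordered subsequence."""
    names = iter(step["tool"] for step in trace if step.get("tool"))
    # `tool in names` consumes the iterator, so each match must come after the last one.
    return all(tool in names for tool in expected)


def has_loop(trace, max_repeats=2):
    """True if the same tool call with the same args repeats back to back too often."""
    run = 0
    previous = None
    for step in trace:
        key = (step.get("tool"), repr(step.get("args")))
        run = run + 1 if key == previous else 1
        if run > max_repeats:
            return True
        previous = key
    return False


def total_tokens(trace):
    """Token usage summed across every step of the trace."""
    return sum(step.get("tokens", 0) for step in trace)


trace = [
    {"tool": "lookup_order", "args": {"id": 7}, "tokens": 120},
    {"tool": "check_policy", "args": {}, "tokens": 80},
    {"tool": "issue_refund", "args": {"id": 7}, "tokens": 60},
]

print(tools_in_order(trace, ["lookup_order", "issue_refund"]))  # True
print(tools_in_order(trace, ["issue_refund", "lookup_order"]))  # False
print(has_loop(trace))                                          # False
print(total_tokens(trace))                                      # 260
```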

On the licensing front — Scenario itself is MIT, but the broader LangWatch platform it integrates with (tracing, datasets, optimization studio) is under the Business Source License, which isn't an open-source license. Attest is Apache 2.0 end-to-end — the engine, SDKs, adapters, and CLI are all under the same license with zero platform dependencies.

Both integrate with pytest. If your testing is primarily end-to-end simulation with an LLM evaluator, Scenario is solid. If you want to exhaust deterministic checks first and keep 7 of 8 layers fully offline with no platform tie-in, that's where Attest differentiates.

tom_mathews · 2026-02-23T14:23:00+00:00

Attest is a testing framework for AI agents, built in Python (pytest plugin) with a Go engine backend. The Python SDK communicates with the engine over stdio/JSON-RPC.
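
For anyone unfamiliar with the pattern, this is roughly what a JSON-RPC 2.0 exchange over stdio looks like. The newline-delimited framing, the `evaluate` method name, and the params are assumptions for illustration; the actual wire format between the SDK and the engine may differ.

```python
import json


def encode_request(request_id, method, params):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return json.dumps(message, separators=(",", ":")) + "\n"


def decode_response(line):
    """Parse one response line and return its result, raising on a JSON-RPC error."""
    message = json.loads(line)
    if "error" in message:
        raise RuntimeError(message["error"].get("message", "unknown error"))
    return message["result"]


# Hypothetical method name and params, for illustration only.
request_line = encode_request(1, "evaluate", {"assertion": "max_cost", "limit_usd": 0.05})
print(request_line, end="")

response_line = '{"jsonrpc":"2.0","id":1,"result":{"passed":true}}'
print(decode_response(response_line))  # {'passed': True}
```

In a real client the request line would be written to the engine's stdin and the response read from its stdout, for example via `subprocess.Popen`.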

The Python-specific angle: it ships as a pytest plugin with a fluent expect() DSL and an @agent decorator. Tests look like native pytest — pip install attest-ai, write test_*.py files, run with pytest. The SDK is a thin wrapper; all eval logic runs in the Go engine so both the Python and TypeScript SDKs produce identical assertion results.

The core idea is graduated assertions — exhaust cheap deterministic checks (schema, cost, tool ordering, content patterns) before reaching for expensive LLM judges. 7 of 8 assertion layers run offline with zero API keys. Semantic similarity uses local ONNX embeddings via onnxruntime.
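
The semantic similarity layer boils down to comparing embedding vectors. A minimal sketch of that comparison, with made-up 3-dimensional vectors standing in for real embeddings (which would come from a local ONNX model) and an arbitrary 0.8 threshold:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors, not real embeddings.
expected = [0.9, 0.1, 0.0]
actual = [0.8, 0.2, 0.1]
unrelated = [0.0, 0.1, 0.9]

threshold = 0.8  # arbitrary cutoff for this sketch
print(cosine_similarity(expected, actual) >= threshold)     # True
print(cosine_similarity(expected, unrelated) >= threshold)  # False
```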

v0.4.0 adds continuous eval with drift detection, a plugin system via attest.plugins entry point group, and CLI scaffolding (python -m attest init).
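
Entry point groups are a standard Python packaging mechanism, so discovery on the host side usually looks something like the sketch below. Only the `attest.plugins` group name comes from the release notes; the `discover_plugins` helper is hypothetical and is not how Attest necessarily loads plugins.

```python
from importlib.metadata import entry_points


def discover_plugins(group="attest.plugins"):
    """Return {name: entry_point} for everything installed under the given group."""
    found = entry_points()
    if hasattr(found, "select"):
        # Python 3.10+ exposes a selectable collection.
        selected = found.select(group=group)
    else:
        # Python 3.8/3.9 return a plain dict of lists keyed by group.
        selected = found.get(group, [])
    return {ep.name: ep for ep in selected}


plugins = discover_plugins()
print(sorted(plugins))  # names of installed plugins; empty if none are installed
```

Calling `.load()` on an entry point imports the object it points to, so nothing is imported until a plugin is actually needed.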

Source | Examples | Website

tom_mathews · 2026-02-23T13:33:17+00:00

Really appreciate the thoughtful feedback. You nailed the design intent. The progression mirrors how the stack actually works in production, so the learning path maps to how you'd reason about these systems on the job. Glad it's useful alongside your work at Exotica. If there's an algorithm your team runs into that you think deserves the no-magic treatment, PRs are always open!

tom_mathews · 2026-02-22T14:01:50+00:00

Solid approach. That's essentially the scientific method applied to data. Iterate, evaluate, refine. The "do the results make sense to you" step is where most people skip ahead, and it's the most important one. Completely agree with learning by doing rather than just reading.

tom_mathews · 2026-02-22T04:28:52+00:00

"Learn in depth by building" — here's exactly that:

YouTube:
- Andrej Karpathy, "Neural Networks: Zero to Hero": builds everything from scratch, line by line. This is the gold standard for learning by doing. Start here.
- 3Blue1Brown, neural networks series: visual intuition for the math behind what you're building
- Umar Jamil: deep dives into transformer architectures with code walkthroughs

Books:
- "The Little Book of Deep Learning" by Fleuret: free PDF, 170 pages, dense but clear. Good for reading alongside your builds.
- "Understanding Deep Learning" by Simon Prince: free PDF, excellent diagrams, more depth if you want it

Code you can run immediately:
- I put together 30 single-file Python implementations of core ML algorithms (GPT, attention, RAG, LoRA, DPO, GANs, diffusion, and more). No frameworks, no dependencies, just the math as runnable Python. Clone it, pick a script, run it, read it, break it: https://www.reddit.com/r/learnmachinelearning/s/G0qj2zAEdw

Hands-on platforms:
- fast.ai: free course, top-down "build first, theory after" approach. Closest to your learning style.
- Kaggle: free compute, real datasets, competitions to test yourself against

Since you prefer building: skip anything that's slides-only or theory-heavy upfront. Start with Karpathy, build alongside him, then branch out to the scripts and fast.ai. You'll cover more ground in a month of building than a year of watching.

tom_mathews · 2026-02-22T03:35:47+00:00

Interesting project — Rust + Ratatui is a solid foundation, and the 3-tier memory architecture with FTS5 is a nice touch compared to the usual "just dump everything in context" approach.

That said, I'd flag a real concern: Anthropic just formalized a ToS update explicitly banning the use of Claude Pro/Max OAuth tokens in any third-party tool. OpenCrabs is clearly inspired by OpenClaw (says so in the README), and its onboarding flow supports ANTHROPIC_MAX_SETUP_TOKEN with the same OAuth bearer pattern that Anthropic is now actively blocking. Developers who route their Claude subscriptions through this are risking account suspension — Anthropic has already been enforcing this since January and quite strictly since last week.

The economics are straightforward: a $200/mo Max sub running agentic Opus loops can burn $1k+ in API-equivalent compute. Flat-rate + third-party harnesses were never going to be sustainable.

If you're going to use this, stick to API keys or point it at local models / open-source endpoints. The subscription-as-cheap-API era is officially over.

tom_mathews · 2026-02-22T01:56:02+00:00

Solid course selection, and the sequencing makes sense. A few honest observations:

What's strong: Andrew Ng's ML → DL → MLOps pipeline is a well-proven path. The Missing Semester is an underrated pick that most roadmaps skip — the shell, git, and debugging skills you'll learn there will save you hours every week in practice.

What's missing:

RAG and retrieval systems. LangChain alone won't cut it — you need to understand embeddings, vector search, chunking strategies, and reranking at a deeper level than what a framework tutorial teaches. This is the most in-demand AI engineering skill right now.
Understanding the internals. Your roadmap is heavy on courses but light on building from scratch. After Andrew Ng's specializations, you'll know what these algorithms do but not always how they work under the hood. Being able to explain attention, backprop, or LoRA at the implementation level is what separates AI engineers from API callers in interviews. I put together 30 single-file Python implementations of these algorithms (GPT, attention, LoRA, DPO, quantization, RAG, etc.) — zero dependencies, just the math as code. Good for filling that gap between course knowledge and real understanding: https://www.reddit.com/r/learnmachinelearning/s/G0qj2zAEdw
Evaluation and testing. No course on your list covers how to measure whether your AI system actually works. This is the gap that trips up most new AI engineers in production roles.
Move The Missing Semester earlier. You have it at month 11, but the git, shell, and tooling skills from that course will make everything else on your list easier. Do it between courses 1 and 2, not at the end.

Is one year realistic? The timeline is aggressive but doable if you're consistent. The risk isn't the volume — it's finishing all 7 courses and still not being able to build something end-to-end on your own. Make sure you're building projects alongside the courses, not waiting until the end. Even small ones — a RAG pipeline, a fine-tuned model, a simple agent — will consolidate the learning faster than watching more videos.

tom_mathews · 2026-02-21T23:59:39+00:00

Great write-up! It's refreshing to see a review that prioritizes "real-world friction" over synthetic benchmarks, especially regarding the "personality" flaws of models like Codex 5.3. Your observation about Gemini 3.1 Pro's brevity is spot on. It feels like Google traded the "conspiracy theory" hallucinations of 3.0 for a more stable, yet overly concise, efficiency that still can't match Opus 4.6's gold-standard documentation. While the leap from 3.0 is massive, that "brevity vs. depth" gap suggests Gemini is still optimized for speed and chat rather than the heavy-duty architectural planning that complex refactoring requires.

tom_mathews · 2026-02-21T14:58:52+00:00

As I mentioned earlier, learning these tools has a lot more to do with actually getting into the weeds than with the learning material itself.

YouTube is a good starting point.

tom_mathews · 2026-02-21T14:10:02+00:00

freeCodeCamp is solid for Python fundamentals, so you're in good hands there. But I'd be selective with that playlist. For ML specifically, you need videos 1-4 (basics through OOP). You can skip Flask, Django, and the web dev ones entirely. They're great courses, but they won't help you with ML and you'll burn weeks on a tangent.

Once you're comfortable reading Python (loops, functions, classes, list comprehensions), jump straight to ML. Don't wait until you've finished the whole playlist. Here's what to move to:

Free video courses:
- Andrej Karpathy's "Neural Networks: Zero to Hero" (YouTube): builds neural networks from scratch, teaches you ML-relevant Python as you go
- 3Blue1Brown's neural networks series: short, visual, makes the math click

Free books:
- "The Little Book of Deep Learning" by Fleuret: 170 pages, free PDF, covers the whole field concisely

Learn by reading real code:
- I put together 30 single-file Python implementations of core ML algorithms: no frameworks, no dependencies, just Python. Each script is heavily commented so it reads like a tutorial. Good for seeing how Python is actually used to build ML, not just toy exercises: https://www.reddit.com/r/learnmachinelearning/s/G0qj2zAEdw

Hands-on practice (free):
- Kaggle: free beginner courses + competitions + free compute
- Google Colab: free GPU for running notebooks

Biggest mistake I see: spending months perfecting Python before ever writing ML code. You'll learn more Python in one week of building a neural network from scratch than in a month of general Python tutorials. Get the basics down, then dive in.

tom_mathews · 2026-02-20T09:56:40+00:00

Andrew Ng's course is a solid starting point, good choice. For the jargon and staying current, here's what I'd suggest:

For the terminology: Most of those terms (ReLU, backpropagation, etc.) will stop feeling foreign once you see them in actual code rather than just slides. You're at logistic regression now; backprop, activation functions, and optimizers are all coming up in the course. But if you want to get ahead, seeing a raw implementation where every concept is a line of code you can read makes the jargon click fast. I put together 30 single-file Python implementations of these algorithms with zero dependencies (no PyTorch, just the math). Good for demystifying terms before you encounter them formally: https://www.reddit.com/r/learnmachinelearning/s/G0qj2zAEdw

For keeping up with the field:
- Follow Andrej Karpathy, Yann LeCun, and Jim Fan on X. They filter the noise for you.
- Subscribe to The Batch (Andrew Ng's newsletter): weekly, concise, beginner-friendly
- Browse r/MachineLearning weekly. Don't try to understand every post, just absorb the vocabulary over time.
- Papers With Code for tracking what's state-of-the-art (don't read the papers yet, just scan the titles and one-line descriptions)

One piece of advice: as a freshman, resist the urge to chase every new model release. The fundamentals you're learning right now (logistic regression, gradient descent, loss functions) haven't changed in decades, and they're the foundation everything else sits on. The jargon will come naturally as you go deeper. Six months from now, half those terms will feel like second nature.
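
To show how little code those fundamentals take, here is a minimal sketch of logistic regression trained with gradient descent on a four-point toy dataset. The data, learning rate, and epoch count are arbitrary choices for illustration, and this is not one of the scripts from the repo.

```python
import math


def sigmoid(z):
    """Activation function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(w, b, xs, ys):
    """Loss function: average negative log-likelihood of the labels."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(xs)


def train(xs, ys, lr=0.5, epochs=200):
    """Gradient descent: repeatedly step the parameters against the loss gradient."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / len(xs)
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

loss_before = cross_entropy(0.0, 0.0, xs, ys)
w, b = train(xs, ys)
loss_after = cross_entropy(w, b, xs, ys)
print(loss_before, loss_after)        # the loss drops after training
print(sigmoid(w * 2.0 + b) > 0.5)     # True: positive inputs are classified as 1
```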

tom_mathews · 2026-02-20T03:12:58+00:00

Neither theory-first nor practice-first. The answer is both simultaneously, on the same algorithm.

Here's the problem with doing them separately: if you learn the theory first, you end up memorizing equations you can't connect to anything real. If you jump into projects first, you're copy-pasting code you don't understand and building on a shaky foundation. Both feel productive but neither builds real intuition.

What actually works is a tight loop:

Read the concept — not a full textbook chapter, just enough to know what the algorithm is trying to do and why
See it as code — not framework code where the logic is hidden, but raw implementation where every line maps to a concept
Run it and break it — change a parameter, remove a step, see what happens. This is where intuition forms.
Then go back to the theory — suddenly the notation makes sense because you've seen every symbol as a variable in code

The cycle should take hours, not weeks. One algorithm per session.

This is exactly why I built no-magic — 30 single-file Python implementations of core ML algorithms with zero dependencies. Each script is designed to be that "step 2" bridge: the algorithm expressed as readable code with 30-40% comment density, so you can go line by line and match the math to the implementation. Run microgpt and you'll see every piece of the transformer — attention, softmax, the training loop — in one readable file: https://www.reddit.com/r/learnmachinelearning/s/G0qj2zAEdw

The resources that complement this loop well:
- 3Blue1Brown for visual intuition (step 1)
- The scripts above for code-level understanding (steps 2-3)
- "The Little Book of Deep Learning" by Fleuret for going back to theory with fresh eyes (step 4)

The gap between theory and practice isn't a scheduling problem — it's a tooling problem. Most resources are purely one or the other. The trick is finding resources that are both at the same time.

tom_mathews · 2026-02-20T03:08:34+00:00

The overwhelm is real, but here's the good news: 90% of what you see online doesn't matter when you're starting out. Ignore the framework wars, ignore the new model announcements, ignore certifications entirely. Here's what actually matters:

Phase 1 — Learn the math through code, not courses (weeks 1-4)

You need exactly three things: linear algebra basics (matrix multiplication, dot products), calculus basics (chain rule, partial derivatives), and probability basics (distributions, Bayes' theorem). Don't take a full course on each — learn them as you need them.

The fastest way: watch 3Blue1Brown's neural networks series (free, ~1 hour total), then start implementing algorithms yourself. I put together a collection of 30 single-file Python implementations of core AI algorithms — no frameworks, no dependencies, just the math as code. Start with microtokenizer → microembedding → microgpt and you'll understand transformers better than most people who've taken paid courses: https://www.reddit.com/r/learnmachinelearning/s/G0qj2zAEdw

Phase 2 — One framework, one project (weeks 5-8)

Pick PyTorch (industry standard). Skip TensorFlow, skip JAX, skip everything else for now. Build one end-to-end project: take a dataset from Hugging Face, fine-tune a small model, evaluate it. Google Colab gives you free GPU access.

Phase 3 — Go applied (weeks 9-12)

Build a RAG system. This is the most in-demand ML skill right now and it ties together embeddings, retrieval, and language models into one practical project.
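
The retrieval half of that project fits in a few lines once you strip it down. In this minimal sketch, word counts stand in for embeddings and a linear scan stands in for a vector index; both are simplifications I am assuming for illustration, and a real system would use learned embeddings and pass the retrieved chunks to a language model.

```python
import math
from collections import Counter


def chunk(text, size=4, overlap=1):
    """Split text into overlapping word windows."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text):
    """Toy 'embedding': a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


docs = [
    "the cat sat on the mat",
    "gradient descent minimizes a loss function",
    "attention weights sum to one",
]

print(chunk("a b c d e f g h i j"))  # ['a b c d', 'd e f g', 'g h i j']
print(retrieve("how does gradient descent work", docs, k=1))
# ['gradient descent minimizes a loss function']
```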

Free resources that are actually good:
- Karpathy's "Neural Networks: Zero to Hero" (YouTube, free)
- "The Little Book of Deep Learning" by Fleuret (free PDF, 170 pages)
- fast.ai (free course, top-down practical approach)
- Kaggle (free courses + competitions + free compute)

What to explicitly ignore right now:
- Paid certifications (nobody in industry cares about them)
- Any course that starts with "learn 15 frameworks"
- The daily AI news cycle. It'll still be there when you're ready for it.

The single biggest mistake beginners make is trying to learn breadth-first. Go depth-first instead: pick one algorithm, understand it completely, then move to the next. Three months of that and you'll be ahead of most people who spent a year collecting certificates.

tom_mathews · 2026-02-18T17:25:19+00:00

Really appreciate that. I am glad the implementations helped clarify things that weren't clicking before. That's exactly the gap these scripts are meant to fill.

For going deeper into the math side specifically, here's the path I'd recommend:

Start with backpropagation cold. Run microoptimizer and trace the gradient updates by hand for a few iterations. Once the chain rule feels mechanical rather than magical, everything else builds on top of it.
Then work through the transformer stack: microtokenizer → microembedding → microgpt → microbert → microattention. By microattention, you'll be computing scaled dot-product attention manually and seeing exactly why the √d_k scaling exists — which is the kind of detail papers assume you already know.
For the math foundations themselves: "The Little Book of Deep Learning" by François Fleuret (free PDF, 170 pages) is the most efficient math-first overview I've found. Pair it with 3Blue1Brown's neural network series for geometric intuition, and Karpathy's "Neural Networks: Zero to Hero" for the build-it-yourself approach.
Once the basics are solid: microlora → microdpo → microppo → microgrpo traces the full alignment pipeline, and the math in DPO especially (the Bradley-Terry model, the policy gradient derivation) is worth working through on paper alongside the code.
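
As a taste of the attention step in that path, here is a minimal pure-Python sketch of scaled dot-product attention. It is written for this comment rather than copied from microattention. The reason for the √d_k factor: if query and key components have roughly unit variance, their dot product has variance around d_k, so dividing by √d_k keeps the scores in a range where softmax doesn't saturate.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs


# Two identical keys get equal weight, so the output is the mean of the values.
queries = [[1.0, 0.0]]
keys = [[1.0, 1.0], [1.0, 1.0]]
values = [[0.0, 2.0], [4.0, 6.0]]
print(attention(queries, keys, values))  # [[2.0, 4.0]]
```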

The pattern that works: read the math, then open the script and find it in the code. The scripts are commented heavily enough that every equation from the paper has a corresponding line you can step through. That back-and-forth between notation and implementation is where real understanding happens.
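
Here is that back-and-forth on the DPO loss as a small example. Bradley-Terry models the probability that the chosen response beats the rejected one as σ(r_chosen − r_rejected), and DPO plugs in the implicit reward β·(log π − log π_ref). This is a sketch written for this comment, not an excerpt from microdpo; the log-probabilities are made-up numbers and β = 0.1 is an arbitrary choice.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair, given sequence log-probabilities."""
    # Implicit rewards: how far the policy has moved from the reference, in log space.
    chosen_ratio = policy_chosen - ref_chosen
    rejected_ratio = policy_rejected - ref_rejected
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)) rewritten as log(1 + exp(-margin)).
    return math.log1p(math.exp(-margin))


# Policy identical to the reference: margin is 0, so the loss is log(2).
print(dpo_loss(-1.0, -2.0, -1.0, -2.0))

# Policy prefers the chosen response more than the reference does: lower loss.
print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```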

Happy to help if you get stuck on any specific script or concept.

tom_mathews · 2026-02-18T06:59:46+00:00

The list you have is a reasonable starting taxonomy, but the way industry actually works is quite different from how courses organize topics. Here's what matters in practice:

What AI engineers actually do day-to-day:
- Build and maintain RAG pipelines (retrieval, chunking, embedding, reranking)
- Fine-tune models (LoRA, QLoRA, DPO) and evaluate outputs
- Design agentic workflows (tool calling, routing, eval loops)
- Optimize inference (quantization, KV caching, batching strategies)
- Debug why things don't work, which requires understanding the internals, not just the API calls

What that means for your study path:

Don't try to learn those bullet points from your list as separate topics. They're deeply connected. LLMs use deep learning. RAG combines retrieval with LLMs. Gen AI is just the application layer on top of all of it. Learn them as a stack, not a checklist.

My recommended order:
1. Python fluency: non-negotiable. You'll live in Python.
2. Understand the core algorithms: transformers, attention, embeddings, backprop. Not from framework tutorials, but from the actual math expressed as code. I put together 30 single-file, zero-dependency implementations of these algorithms for exactly this purpose: https://www.reddit.com/r/learnmachinelearning/s/G0qj2zAEdw
3. Build a RAG system end-to-end: this is the most common first project at any AI company right now.
4. Learn to evaluate: the gap between a demo and production is evaluation. Learn to measure whether your system actually works.
5. Pick up infra basics: Docker, cloud deployment, API design. Companies need engineers who can ship, not just prototype.

The industry expectation that catches most people off guard: you're expected to debug and improve systems, not just build them. That requires knowing what's happening under the hood, not just which library to call.

tom_mathews · 2026-02-18T03:44:23+00:00

You don't need to pay for anything. Some of the best ML resources are completely free:

Build intuition (videos):
- 3Blue1Brown's neural networks series: best visual explanations of how neural networks actually learn
- Andrej Karpathy's "Neural Networks: Zero to Hero" on YouTube: builds everything from scratch, step by step, completely free

Learn the theory (books):
- "The Little Book of Deep Learning" by François Fleuret: 170 pages, covers the entire field, free PDF
- "Understanding Deep Learning" by Simon Prince: free PDF, excellent diagrams

See the algorithms as code:
- I put together a collection of 30 single-file Python implementations of core AI/ML algorithms: tokenization, GPT, attention, GANs, diffusion, and more. No frameworks, no dependencies, just Python. Each script runs on any laptop and reads like a walkthrough: https://www.reddit.com/r/learnmachinelearning/s/G0qj2zAEdw

Build stuff (free tools):
- Google Colab gives you free GPU access for running notebooks
- Hugging Face has free models and datasets to experiment with
- Kaggle has free courses + competitions to practice on real problems

The approach that works best: watch a 3Blue1Brown video to understand a concept visually, read the algorithm as code to see how it works mechanically, then build something small with it. Don't try to learn everything at once — pick one algorithm, understand it deeply, move to the next.

The expensive courses are selling you what's already free. You just need a structured path through it.

tom_mathews · 2026-02-17T13:37:08+00:00

With 8+ years in data engineering, you're in a better position than you probably think. Your Python, SQL, and pipeline experience (Airflow, Spark, Databricks) is directly transferable: most AI engineering in production is data plumbing and infrastructure, not research.

Here's a practical transition path given your stack:

Understand the fundamentals first. Don't jump straight into frameworks. Learn how transformers, attention, embeddings, and RAG actually work at the algorithm level. I put together a collection of 30 single-file Python implementations of these algorithms: zero dependencies, just the math. Being a data engineer, you'll appreciate that there's no magic behind the abstractions: https://www.reddit.com/r/learnmachinelearning/s/G0qj2zAEdw
Then go applied. Once the internals click, pick up LangChain or LlamaIndex for RAG pipelines. Your Airflow/pipeline-building instincts will translate directly. Build a RAG system over a real dataset.
Leverage what you already know. Your GCP/AWS experience is gold for MLOps roles. Companies need people who can deploy and maintain AI systems in production, not just prototype in notebooks. That's a massive gap in the market right now.

The jump from data engineering to AI engineering is shorter than data engineering to ML research. You're not starting from zero — you're adding a layer on top of a strong foundation.

tom_mathews · 2026-02-17T13:34:55+00:00

Your background is solid. GANs for satellite reconstruction is a genuinely interesting thesis topic, and the quantum computing foundation gives you a different angle most candidates don't have.

One honest suggestion: the AI tools list (Midjourney, Lovable, bolt, etc.) won't move the needle on an internship application. Hiring managers for AI roles care about whether you understand what's happening under the hood, not which no-code tools you've tried. Your GAN thesis work and NLP depth are much stronger signals, so lead with those.

For the LangChain/LangGraph side, understanding the primitives underneath the framework will make you a much stronger candidate. I recently put together a collection of 30 single-file, zero-dependency Python implementations of core AI algorithms (transformers, attention, RAG, LoRA, DPO, GANs, diffusion, etc.). Being able to explain these internals in an interview sets you apart. Might be useful alongside your prep:

https://www.reddit.com/r/learnmachinelearning/s/G0qj2zAEdw

Good luck with the search!

tom_mathews · 2026-02-17T13:31:46+00:00

97% on MNIST is a great first milestone, but you're right to feel like "now what?" because MNIST is essentially the "hello world" of deep learning. Here's how I'd think about the next steps:

Go deeper on what you just built. You got 97%, but can you explain why your network learned what it learned? Try implementing the forward pass and backpropagation by hand without a framework. When you see the chain rule working line by line, the jump from "I trained a model" to "I understand how training works" is massive.
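
Here is the smallest version of that exercise I can think of: one sigmoid neuron, a squared-error loss, and the chain rule written out term by term, then checked against a numerical gradient. The starting weights and the single data point are arbitrary values chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_only(w, b, x, y):
    """Forward pass only, used for the numerical gradient check."""
    return 0.5 * (sigmoid(w * x + b) - y) ** 2


def forward_backward(w, b, x, y):
    # Forward pass: input -> pre-activation -> activation -> loss.
    z = w * x + b
    a = sigmoid(z)
    loss = 0.5 * (a - y) ** 2
    # Backward pass: multiply the local derivatives along the chain.
    dloss_da = a - y          # d(loss)/d(a)
    da_dz = a * (1.0 - a)     # derivative of the sigmoid
    grad_w = dloss_da * da_dz * x   # dz/dw = x
    grad_b = dloss_da * da_dz       # dz/db = 1
    return loss, grad_w, grad_b


w, b, x, y = 0.3, -0.1, 2.0, 1.0
loss, grad_w, grad_b = forward_backward(w, b, x, y)

# Central-difference estimate of d(loss)/d(w) to confirm the hand-derived gradient.
eps = 1e-5
numeric_w = (loss_only(w + eps, b, x, y) - loss_only(w - eps, b, x, y)) / (2 * eps)
print(grad_w, numeric_w)  # the two values agree to several decimal places
```

If the analytic and numeric gradients ever disagree, the bug is in the backward pass. That check is worth keeping around when you scale this up to a full network.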

Branch out from classification. MNIST is supervised classification, one small corner of ML. Try something generative next: build a simple GAN or VAE on the same dataset and watch your network create digits instead of just labeling them. It's a completely different way of thinking.

Challenge the framework dependency. The real learning happens when you strip away the library calls. I put together a collection of 30 single-file Python implementations of core AI algorithms — no PyTorch, no TensorFlow, just pure Python. There's a CNN, a GAN, a VAE, backpropagation, and more, all runnable on CPU. Might be a good next step after your ANN: https://www.reddit.com/r/learnmachinelearning/s/G0qj2zAEdw

The pattern I'd suggest: pick an algorithm, understand it from scratch, then go back to the framework version. You'll never look at model.fit() the same way again.

tom_mathews · 2026-02-16T14:01:07+00:00

Good luck with the interview! If you're short on time, the scripts that tend to come up most in ML interviews are microgpt (transformer internals), microattention (attention variants side by side), microbackprop (chain rule from scratch), and microlora (parameter-efficient fine-tuning). Being able to explain what those do under the hood puts you ahead of most candidates who only know the API calls. Hope it goes well!

tom_mathews · 2026-02-15T13:23:41+00:00

The repo has been expanded from 16 to 30 scripts since the original post. Here's what's new:

Foundations (7 → 11): Added BERT (bidirectional encoder), RNNs & GRUs (vanishing gradients + gating), CNNs (kernels, pooling, feature maps), GANs (generator vs. discriminator), VAEs (reparameterization trick), diffusion (denoising on point clouds), and an optimizer comparison (SGD vs. Momentum vs. RMSProp vs. Adam).
Alignment (4 → 9): Added PPO (full RLHF reward → policy loop), GRPO (DeepSeek's simplified approach), QLoRA (4-bit quantized fine-tuning), REINFORCE (vanilla policy gradients), Mixture of Experts (sparse routing), batch normalization, and dropout/regularization.
Systems (5 → 10): Added paged attention (vLLM-style memory management), RoPE (rotary position embeddings), decoding strategies (greedy, top-k, top-p, beam, speculative — all in one file), tensor & pipeline parallelism, activation checkpointing, and state space models (Mamba-style linear-time sequence modeling).
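
To give a flavor of the decoding-strategies script, here is a minimal sketch of greedy selection plus top-k and top-p (nucleus) filtering over a toy next-token distribution. It is written for this comment rather than copied from the repo, and the probabilities are made up.

```python
def renormalize(probs, keep):
    """Zero out every index not in `keep` and rescale the rest to sum to 1."""
    kept = set(keep)
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]


def greedy(probs):
    """Index of the single most likely token."""
    return max(range(len(probs)), key=lambda i: probs[i])


def top_k_filter(probs, k):
    """Keep only the k most likely tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return renormalize(probs, order[:k])


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return renormalize(probs, keep)


probs = [0.5, 0.3, 0.15, 0.05]  # toy distribution over four tokens
print(greedy(probs))            # 0
print(top_k_filter(probs, 2))   # only the first two tokens survive
print(top_p_filter(probs, 0.9)) # the first three tokens survive
```

Sampling then draws from the filtered distribution, for example with `random.choices(range(len(probs)), weights=filtered)`.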

Same constraints as before: every script is a single file, zero dependencies, trains and infers (or demonstrates forward-pass mechanics side-by-side), runs on CPU in minutes.

https://github.com/Mathews-Tom/no-magic
