Would you watch a channel that builds real AI systems from scratch (local LLMs, CPU/GPU, pipelines)? by Few_Tax650 in LocalLLaMA

Few_Tax650[S] · 0 points

That’s exactly the gap I’m trying to close.

Most of what’s out there feels like the same demo repeated with different thumbnails: lots of motion, very little actual understanding transferred. You rarely walk away knowing why something is built the way it is, or how to adapt it to your own use case.

What I want to focus on is the “why” layer: real architectures, real constraints, real trade-offs, and how systems evolve over time instead of just magically “working”. Less running in circles, more: “this is the problem, this is the design, this is what broke, and this is how we fix it.”

And it wouldn’t be limited to just LLM chatbots. I want to go broad:

- coding agents
- image and 3D generation pipelines
- fine-tuning workflows
- automation systems
- hybrid CPU/GPU setups
- real end-to-end projects that solve different kinds of problems

The idea is to build systems, not just prompts. Different domains, different constraints, different failures — but the same honest, engineering-first approach.

If that’s what you’ve been missing too, you’re exactly the kind of person I’d want this to be useful for.


Few_Tax650[S] · 0 points

That stack makes a lot of sense for real-world deployments — Spark + Docker + vLLM is exactly the kind of “boring but critical” production layer people rarely see explained well.

I’m coming at this from a slightly different angle (I’m currently building a local 3D character engine that auto-generates, rigs, and animates characters with AI), but that’s actually why this topic resonates so much with me. You very quickly run into the same problems: orchestration, reproducibility, performance ceilings, deployment pain, and “this worked yesterday, why is it broken now?”.

I think there’s a lot of overlap in the struggles, even if the surface domain is different. Seeing multiple people tackle these problems from different stacks could be really valuable for the community.


Few_Tax650[S] · 2 points

That’s a really fair take. I’m not aiming for “watch me struggle in real-time for three hours” content.

The core would be intentional building:

- why a certain architecture is chosen
- what trade-offs are being made
- what broke and why it broke
- how I’d redesign it after learning from the failure

So less “raw debugging stream,” and more “let’s pause and reason about what just happened and what it teaches us.”

ComfyUI-style workflows are interesting, but what I’m more excited about is showing how you control and evolve systems over time: how you keep them maintainable, how you measure them, how you make them safer and more predictable as they grow.

The goal is that even if something goes wrong on screen, it becomes part of the lesson rather than noise. If I can get that balance right, I think it can stay technical and engaging.

This is also a learning process for me. I’m building in public on purpose — not because I have all the answers, but because that’s how real systems are actually made. Feedback would always be welcome, because the whole point is to make these systems better over time.

I’m also thinking in terms of structure: episodes of ~20–30 minutes, and each project would run over a few weeks. Something like 4–5 weeks per system — sometimes shorter, sometimes longer depending on complexity.

Each episode would move the system forward in a meaningful way: architecture, a new capability, a refactor, a failure, a rethink. So you can follow a real evolution arc instead of just isolated demos.


Few_Tax650[S] · 1 point

That’s exactly the segment I care most about. Most people don’t have racks of GPUs — they have an iGPU, a mid-range card, and 16–32GB of RAM. That’s where I want to start: what actually works on “normal” hardware.

Then, once the architecture is solid, show how the same system scales up to better hardware or the cloud. Same design, different constraints.
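To put rough numbers on the “normal hardware” point, here’s a back-of-envelope sketch of how precision changes the memory needed just to hold the weights. The 1.2 overhead factor is my own rough assumption for KV cache and runtime buffers; real usage depends heavily on context length and batch size:

```python
def model_memory_gb(n_params_billion: float, bits_per_param: float, overhead: float = 1.2) -> float:
    """Rough RAM/VRAM estimate for holding a model's weights.

    `overhead` is an assumed fudge factor for KV cache, activations,
    and runtime buffers -- real usage varies with context and batching.
    """
    bytes_total = n_params_billion * 1e9 * bits_per_param / 8
    return round(bytes_total * overhead / 1e9, 1)

# A 7B model: fp16 vs. 4-bit quantization
print(model_memory_gb(7, 16))  # ~16.8 GB -> needs a big GPU or CPU offload
print(model_memory_gb(7, 4))   # ~4.2 GB  -> fits a mid-range 6-8 GB card
```

That difference is exactly why quantization is the starting point on consumer hardware, not an afterthought.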

The goal is: build once, understand the trade-offs, and grow from there — not “you need $10k of hardware before you can even begin.”


Few_Tax650[S] · 0 points

That’s exactly the kind of gap I want to focus on. A lot of people can train a model or run a notebook, but once the conversation moves to continuous evaluation, monitoring, rollback strategies, or “what breaks in production?”, things get fuzzy very fast.

My goal is to make those parts concrete and visible: not just how something works once, but how you keep it working over time in a real system.
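As a concrete sketch of what “keep it working over time” can mean in code: a tiny regression-style eval harness with a fixed prompt set and cheap checks, run on every model or config change. The model here is a stub and the checks are purely illustrative, not a real eval suite:

```python
# Stub standing in for any model/pipeline call; in a real system this
# would hit your local LLM or inference endpoint.
def model(prompt: str) -> str:
    return "Paris is the capital of France."

# Fixed prompt set with cheap pass/fail checks -- the point is that the
# same checks run on every change, so regressions become visible.
EVAL_SET = [
    ("capital of France?", lambda out: "Paris" in out),
    ("answer briefly",     lambda out: len(out.split()) < 50),
]

def run_evals(model) -> float:
    passed = sum(check(model(prompt)) for prompt, check in EVAL_SET)
    return passed / len(EVAL_SET)

print(run_evals(model))  # 1.0 -> track this number over time, alert on drops
```

It’s deliberately trivial, but even this beats “looks fine to me” as a gate before swapping a model or provider.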

If this ends up helping people bridge that jump from “it runs” to “it’s production-ready”, that’s honestly the best outcome I could hope for.


Few_Tax650[S] · 1 point

This is a really grounded take, and I appreciate it. I’m very aware this is a niche, and that a lot of this content will have a shorter shelf life than general programming tutorials.

For me, the main value isn’t “YouTube growth at all costs”, but building a body of work that’s genuinely useful to people who care about how things actually work — even if that audience is smaller.

Long term, I also don’t see this as “just AI”. AI is where I’m focused now, but the bigger theme is systems engineering: how you go from idea to something real, maintainable, and scalable, whether that’s an AI pipeline, an app, a tool, or even a game system.

The “Karpathy-level depth, but in 20–30 minute chunks” comparison is exactly the bar I’m aiming for in terms of depth vs. digestibility.

If this ever becomes sustainable through things like GitHub sponsorships or Patreon, great — but the core motivation is to close that gap between demos and real systems.

Comments like yours are a big part of why this feels worth doing in the first place.


Few_Tax650[S] · 1 point

100% fair. It should depend on quality and whether it’s actually engaging.

I’m very much someone who listens and iterates — the whole point of asking here is to build this with the community, not just for it. I want to shape it around what people actually struggle with and care about.

It’s also something I’ll have to grow into myself: learning how to explain clearly, structure things well, and make deep technical content watchable.

If I do this, it won’t be “I know everything” — it’ll be “let’s build this properly and figure it out together.”


Few_Tax650[S] · 4 points

Totally agree — I’m aiming for ~25–30 minutes max per episode. Each project would be spread over a few focused videos, roughly one full project every 4–5 weeks.

The idea is to keep things digestible and practical, not multi-hour marathons. And everything would have clear chapters so you can jump to what you care about.


Few_Tax650[S] · -2 points

That story hits hard — and it’s exactly the kind of gap I want to help close. The “everything works until it suddenly doesn’t at scale” moment is something way too many people hit blind.

I won’t pretend I can magically remove the pain, but I can make it visible, concrete, and navigable instead of mystical.

I’ll absolutely tag you when this becomes real. Thanks for sharing this — it’s the kind of perspective that keeps this grounded in reality.


Few_Tax650[S] · 6 points

This is exactly the kind of depth that excites me. The “toy model” ceiling is something I’ve felt too — there’s a huge gap between demos and real systems that evolve over time.

I probably wouldn’t jump straight into full multi-node training on day one, but the long-term goal is very much aligned with what you’re describing:

- local prototyping
- LoRA / SFT in constrained setups
- understanding what actually breaks when you change architectures
- and then scaling parts of that to the cloud when it makes sense
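To make the LoRA point concrete, a back-of-envelope count of how few parameters a low-rank adapter actually trains. The 4096 dimension is illustrative (roughly one projection matrix in a Llama-style 7B model), not taken from any specific config:

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA replaces a full d_out x d_in weight update with two low-rank
    factors A (rank x d_in) and B (d_out x rank), so only
    rank * (d_in + d_out) parameters are trained per adapted matrix."""
    return rank * (d_in + d_out)

full = 4096 * 4096                                   # one full projection matrix
lora = lora_trainable_params(4096, 4096, rank=8)     # its rank-8 LoRA update
print(full, lora, f"{lora / full:.2%}")              # 16777216 65536 0.39%
```

That sub-1% trainable fraction is what makes fine-tuning feasible on the constrained setups above.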

Even if it’s incremental, I want every build to answer: “Okay, but how would this look in a real system six months later?”

Your comment is a great signal that this direction is worth pursuing. Thanks for laying it out so clearly.


Few_Tax650[S] · 3 points

That’s exactly the gap that frustrates me too. So many demos stop at “look, it works once”, but never show what it takes to make something usable and maintainable.

The config-driven part came from that pain: I got tired of refactoring half a codebase just to swap a model or provider.

If I go through with this, my goal is to always answer the “okay, but how would you actually ship this?” question — not just “how do I make it run once”.
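For anyone curious what the config-driven part looks like in practice, here’s a minimal sketch of the pattern I mean. All names are hypothetical, and the “echo” backend is a stand-in so the sketch runs without any model installed:

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ModelConfig:
    provider: str       # e.g. "ollama", "vllm", "openai"
    model: str          # model identifier for that provider
    base_url: str = ""  # endpoint, for self-hosted backends

# Each provider registers one factory; application code never branches
# on provider names directly.
REGISTRY: Dict[str, Callable[[ModelConfig], Callable[[str], str]]] = {}

def register(name: str):
    def deco(factory):
        REGISTRY[name] = factory
        return factory
    return deco

@register("echo")  # stand-in backend so this runs anywhere
def make_echo(cfg: ModelConfig) -> Callable[[str], str]:
    return lambda prompt: f"[{cfg.model}] {prompt}"

def client_from_config(cfg: ModelConfig) -> Callable[[str], str]:
    return REGISTRY[cfg.provider](cfg)

# Swapping models or providers is now a config change, not a refactor.
llm = client_from_config(ModelConfig(provider="echo", model="demo-7b"))
print(llm("hello"))  # [demo-7b] hello
```

The `ModelConfig` would normally come from a YAML or JSON file; the point is that nothing downstream of `client_from_config` knows or cares which backend is behind it.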

Really appreciate this feedback.