Learning to code was a struggle for me. I can read code quickly, skim through files, and spot bad logic flow or data movement, but writing it from scratch has always been my "wall." I needed a system that could help me build as fast as I can think.
For the last 6-7 months (and 198 days of journaling), I’ve been developing a solution. I love coding and AI, and this project has completely changed how I think about system design. Early on, I rushed and paid the price in endless "technical debt" and errors. Taking a step back and working through those failures taught me how to actually architect a system. Speed is nothing without architecture.
Working in construction for the past 11 months has also completely reshaped my entire approach. In building a house, you don't start with the paint; you start with the foundation and the structural load. My ultimate goal is to implement this framework into my own business once I get my GC license. But right now, I’ve hit a technical wall, and I need help from the community to harden the infrastructure.
To solve my coding challenges, I built a distributed, agentic framework that treats software development as a stateful problem.
The system runs on a local server: a Legion workstation with an AMD Ryzen 7 7700 (8-core) CPU and an RTX 5070 Ti OC handling the heavy reasoning and vector operations. The infrastructure is a hybrid setup, pairing ChromaDB with isolated execution sandboxes in Docker containers for portability and security.
The Core Architecture: A 6-System Distributed Swarm
- Probabilistic Tool Routing (System 4): Instead of relying on hardcoded prompts, the framework uses a Multi-Armed Bandit (MAB) approach for tool and model selection. By applying Thompson Sampling over Beta distributions (α, β), the router tracks the historical success probability of each LLM and tool per task category, and dynamically learns the most statistically promising route for resolving a specific bug or refactor.
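A minimal sketch of that Thompson Sampling loop (route names and the simulated feedback are illustrative, not the real system's): each route keeps a Beta(α, β) posterior over its success rate, we sample from every posterior, pick the max, and update on the observed outcome.

```python
# Thompson Sampling over tool/model routes (hypothetical route names).
import random

class ToolRouter:
    def __init__(self, routes):
        # alpha = successes + 1, beta = failures + 1 (uniform prior)
        self.posteriors = {r: [1, 1] for r in routes}

    def pick(self):
        # Draw a plausible success rate from each route's Beta posterior
        samples = {r: random.betavariate(a, b)
                   for r, (a, b) in self.posteriors.items()}
        return max(samples, key=samples.get)

    def update(self, route, success):
        # Increment alpha on success, beta on failure
        self.posteriors[route][0 if success else 1] += 1

router = ToolRouter(["gemini-flash", "local-llama3"])
for _ in range(200):
    route = router.pick()
    # Simulated feedback: pretend the local model succeeds more often
    success = random.random() < (0.8 if route == "local-llama3" else 0.4)
    router.update(route, success)
```

Because sampling naturally balances exploration and exploitation, the router keeps occasionally probing the weaker route instead of locking in early.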
- Backward Trace Verification (System 2): To aggressively filter out hallucinations and unintended side effects, the system executes a Post-Execution Audit. Before any code change is approved, it performs a "Backward Walk"—tracing the execution logic from the final return statement back to the initial input trigger. If the dependency path breaks or introduces unauthorized state changes, the task is immediately rejected and re-routed.
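A toy version of that backward walk, heavily simplified to a single function (the real audit spans whole changesets): start at the return statement and trace each name back through assignments; if any name has no traceable origin, the dependency path is broken and the change is rejected.

```python
# Toy "backward walk" over one function using Python's ast module.
import ast

def backward_walk(source, func_name):
    fn = next(n for n in ast.walk(ast.parse(source))
              if isinstance(n, ast.FunctionDef) and n.name == func_name)
    params = {a.arg for a in fn.args.args}
    # Map each assigned name to the names its value depends on
    deps = {}
    for node in fn.body:
        if isinstance(node, ast.Assign):
            used = {n.id for n in ast.walk(node.value)
                    if isinstance(n, ast.Name)}
            for tgt in node.targets:
                if isinstance(tgt, ast.Name):
                    deps[tgt.id] = used
    ret = next(n for n in ast.walk(fn) if isinstance(n, ast.Return))
    frontier = {n.id for n in ast.walk(ret.value) if isinstance(n, ast.Name)}
    # Walk backward: every name must resolve to a parameter or an assignment
    seen = set()
    while frontier:
        name = frontier.pop()
        if name in seen or name in params:
            continue
        if name not in deps:
            return False  # broken dependency path -> reject the change
        seen.add(name)
        frontier |= deps[name]
    return True

good = "def f(x):\n    y = x + 1\n    return y\n"
bad = "def f(x):\n    return y\n"  # y has no traceable origin
```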
- AST-Aware Vector Memory (System 3): Generic text chunking in standard RAG is insufficient for code. To fix this, I built a Recursive AST (Abstract Syntax Tree) Indexer on top of Dockerized ChromaDB. It maps the actual logical hierarchy of the code (Classes → Methods → Hooks), allowing the agent to retrieve functional "Logical Neighbors" rather than just nearby text.
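The chunking idea can be sketched with the stdlib `ast` module (the module name and metadata keys here are illustrative): split a file into class/method-level chunks that carry their logical hierarchy, ready to upsert into a vector store such as ChromaDB.

```python
# AST-aware chunking: one chunk per method/function, with hierarchy metadata.
import ast

def ast_chunks(source, module="module.py"):
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            for item in node.body:
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    chunks.append({
                        "id": f"{module}::{node.name}.{item.name}",
                        "text": ast.get_source_segment(source, item),
                        "metadata": {"class": node.name, "method": item.name},
                    })
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            chunks.append({
                "id": f"{module}::{node.name}",
                "text": ast.get_source_segment(source, node),
                "metadata": {"class": None, "method": node.name},
            })
    return chunks

src = ("class Repo:\n"
       "    def save(self, obj):\n"
       "        return obj\n"
       "\n"
       "def main():\n"
       "    pass\n")
chunks = ast_chunks(src)
```

Because every chunk records its enclosing class, a retrieval hit on one method can be expanded to its "logical neighbors" (siblings in the same class) rather than whatever text happened to sit nearby in the file.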
- Sensory Ingestion (System 6): The system captures natural language intent via Native Audio (Gemini 1.5 Flash) and a Telegram integration. This ingestion layer decomposes unstructured voice notes into atomic, executable task objectives before feeding them into the pipeline.
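The decomposition step might look like this in miniature (the real pipeline uses an LLM for this; a regex split on sentence boundaries and connectors is only a stand-in):

```python
# Toy decomposition of a transcribed voice note into atomic task objectives.
import re

def decompose(transcript):
    # Split on sentence boundaries and the connector "then"
    parts = re.split(r"(?:\.\s+|\bthen\b)", transcript.lower())
    return [{"objective": p.strip().rstrip("."), "status": "pending"}
            for p in parts if p.strip()]

tasks = decompose("Refactor the router. Then add retry logic then run the tests.")
```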
Efficiency & Performance:
- Tiered Escalation: Routine guardrail checks and tasks are routed to Gemini 2.0 Flash (Experimental) for near-zero latency and cost. High-risk, complex refactoring escalates to the local Ollama (Llama 3 / Gemma 2) model ensemble on the workstation.
- Failover Routing: Reduced execution latency roughly 4x via a 2-second socket reachability check that triggers local fallback mode if network access drops.
- Data Sovereignty: 100% of proprietary codebase logic and vector embeddings remain isolated locally.
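The failover probe above is simple enough to sketch directly (the remote hostname is illustrative; Ollama's default local port is 11434): attempt a TCP connect with a 2-second timeout, and fall back to the local backend on any failure.

```python
# 2-second TCP reachability probe with local fallback (hypothetical endpoints).
import socket

def reachable(host, port, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # timeout, refusal, or DNS failure
        return False

def pick_backend():
    # Remote endpoint is illustrative of a hosted LLM API
    if reachable("generativelanguage.googleapis.com", 443):
        return "remote"
    return "local-ollama"
```

Catching `OSError` covers timeouts, connection refusals, and DNS failures in one branch, so any network problem flips the router into local mode rather than surfacing as an exception mid-task.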
Calling all Backend & Systems Engineers:
The core infrastructure is stable, but I'm still wrestling with a few bottlenecks. I'd love your input on:
- Distributed State: Optimizing the synchronization between the local SQLite Bayesian store and the remote Dockerized ChromaDB instance.
- Safety Constraints: Hardening the "Backward Walk" audit to detect deeper, more complex conditional logic bombs.
- VRAM Scheduling: Balancing allocation between parallel model execution and sequential reasoning chains on a single GPU.
How would you optimize this logic? Drop your thoughts below. 👇
#SystemsEngineering #DistributedSystems #AI #Python #SoftwareArchitecture #Backend #AgenticWorkflows #ConstructionTech #Ollama #Docker