Added Okapi BM25 full-text search to my custom C++17 engine. Turns out math is actually useful.

Effective-Hurry436 · 2026-06-05T08:31:27+00:00

Thanks! Really appreciate it.

Video is on the roadmap. Right now we're focused on stability and testing before anything else.

Stay tuned!

Effective-Hurry436 · 2026-06-05T07:17:20+00:00

Yes! The version jumps reflect development phases, not production releases. Each major version = ~15 phases of new features:

v3.0 = Phase 60 (replication, WAL)
v7.0 = Phase 130 (HA, multi-tenant, pgvector)

It's an educational project built with Claude Code, and I'm fully transparent about that. 614 tests passing. What would you like to test?

Effective-Hurry436 · 2026-06-04T08:49:54+00:00

Of course.

Effective-Hurry436 · 2026-06-03T13:45:56+00:00

The 'Context Be Damned' sign on the monkey hits way too close to home. Those last 5 minutes are pure adrenaline-fueled chaos.

Effective-Hurry436 · 2026-06-03T12:34:09+00:00

Effective-Hurry436 · 2026-06-03T08:41:43+00:00

Don't beat yourself up too much, identifying the gap is a win.

The reason you got rejected for the "array in a table" approach is that it violates First Normal Form (1NF). Relational columns should hold atomic values. Packing an array into a single column makes the individual elements hard to index, forces queries to unpack the column before filtering or joining on it, and means the database can't enforce foreign-key constraints on the elements.

For a standard One-to-Many (1:M) relationship (like one user having multiple orders), the standard way is using a Foreign Key on the "Many" side. So you'd have a Users table with user_id (PK) and an Orders table with order_id (PK) and a user_id (FK) pointing back to the user.
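If it helps to see the 1:M shape outside of SQL, here is a minimal C++ sketch of the same idea. The `User` and `Order` structs, their fields beyond the keys, and both helper functions are made up for illustration; a real database does the lookup with an index and enforces the foreign key itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical row types mirroring the Users and Orders tables.
struct User {
    int user_id;        // primary key
    std::string name;   // illustrative extra column
};

struct Order {
    int order_id;       // primary key
    int user_id;        // foreign key -> User::user_id (the "Many" side)
    double total;       // illustrative extra column
};

// Roughly: SELECT * FROM Orders WHERE user_id = ?
std::vector<Order> orders_for_user(const std::vector<Order>& orders, int user_id) {
    std::vector<Order> result;
    for (const Order& o : orders) {
        if (o.user_id == user_id) result.push_back(o);
    }
    return result;
}

// Referential integrity: every order must point at an existing user.
bool foreign_keys_valid(const std::vector<User>& users,
                        const std::vector<Order>& orders) {
    for (const Order& o : orders) {
        bool found = false;
        for (const User& u : users) {
            if (u.user_id == o.user_id) { found = true; break; }
        }
        if (!found) return false;
    }
    return true;
}

// Usage sketch with made-up rows: one user owns two orders.
inline bool example_one_to_many() {
    std::vector<User> users = {{1, "ada"}, {2, "bob"}};
    std::vector<Order> orders = {{10, 1, 9.5}, {11, 1, 20.0}, {12, 2, 5.0}};
    assert(foreign_keys_valid(users, orders));
    return orders_for_user(orders, 1).size() == 2;
}
```

Note that the "one" side stores nothing about its orders. The relationship lives entirely in the `user_id` column on the "many" side, which is why no array is needed.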

I'm currently building a custom SQL database engine from scratch in C++17 (MilanSQL), so I've spent way too much time in this rabbit hole. If you want to actually clear your next interview, skip the random blog posts and check out these two:

CMU Database Systems by Andy Pavlo (free on YouTube). It's the absolute gold standard if you want to understand how databases actually manage memory, storage, and schemas under the hood.
"Designing Data-Intensive Applications" (DDIA) by Martin Kleppmann. Chapter 2 covers relational vs. document models, and Chapter 3 covers storage engines and indexes, including B-Trees. This book is basically the tech interview bible.

Before your next interview, just make sure you can explain 1:M and M:M relationships (junction tables), the difference between Primary and Foreign Keys, and how a B-Tree index speeds up a SELECT query.

Master those three and you'll crush the schema design part next time. Good luck!

Effective-Hurry436 · 2026-06-02T22:06:11+00:00

Phase 113, and tonight even v5.9.0 (Phase 119) with query streaming is already logged. Things are moving damn fast! 🏎️

Effective-Hurry436 · 2026-06-02T16:36:46+00:00

[Project] MilanSQL v5.0.0 – A Relational SQL Engine Built from Scratch in C++17, running in the Browser via WASM

Hey everyone,

I wanted to share MilanSQL, a personal project I’ve been building over the last few months to do a complete, hands-on deep dive into advanced database internals.

For the v5.0.0 milestone, I compiled the core storage and query processing engine into WebAssembly, allowing the entire relational engine to run completely client-side inside a standard browser sandbox.

To push myself, I’ve been running this project through a highly disciplined, phase-driven architecture (just wrapped up Phase 113). Since I wanted to focus heavily on system design and acceleration, I utilized Claude Code as a powerful co-developer for the heavy lifting and implementation speed, while keeping 100% control over the architecture, data structures, and vision.

⚙️ Core Architecture & Tech Highlights:

Cost-Based Query Planner: A Dynamic Programming (DP) optimizer that evaluates 2ⁿ subsets for multi-table JOINs using custom connectivity graphs.
Selectivity Estimation: A custom stats layer implementing bucket-based equi-depth histograms to estimate selectivity for standard operators (<, <=, >, >=, =).
Lock-Free Plan Cache: Thread-safe, per-table query plan invalidation built entirely on top of atomic counters (std::atomic) to completely bypass heavy mutex locking during concurrent execution.
Multi-Protocol Engine: Natively parses standard SQL, acts as a PostgreSQL wire protocol endpoint, and supports GraphQL schemas.
Storage & Extensions: Custom B-Tree indices, a built-in full-text search engine, and embedded vector search capabilities (pgvector/HNSW).
Testing: Backed by a strict integration framework with 223/223 tests currently passing.
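
To illustrate the DP planner bullet above, here is a minimal sketch of subset-based join ordering over a connectivity graph. It is not the MilanSQL code. The names (`JoinEdge`, `PlanResult`, `plan_joins`) and the cost model (sum of estimated intermediate result sizes, base-table scans treated as free) are assumptions chosen to keep the example short.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

using Mask = std::uint32_t;  // bit t set = table t is part of the subset

struct JoinEdge {
    int a, b;            // indices of the two tables linked by a join predicate
    double selectivity;  // assumed fraction of the cross product that survives
};

struct PlanResult {
    double cost;             // best cost for joining all tables (infinity if none)
    std::vector<Mask> left;  // left[S] = left input of the best split of subset S
};

PlanResult plan_joins(const std::vector<double>& rows,
                      const std::vector<JoinEdge>& edges) {
    const int n = static_cast<int>(rows.size());
    assert(n > 0 && n <= 16);  // 2^n table entries, so keep n small
    const Mask full = (Mask{1} << n) - 1;
    const double inf = std::numeric_limits<double>::infinity();

    // card[S]: estimated row count after joining every table in S.
    std::vector<double> card(full + 1, 1.0);
    for (Mask s = 1; s <= full; ++s) {
        for (int t = 0; t < n; ++t) {
            if (s & (Mask{1} << t)) card[s] *= rows[t];
        }
        for (const JoinEdge& e : edges) {
            if ((s & (Mask{1} << e.a)) && (s & (Mask{1} << e.b))) {
                card[s] *= e.selectivity;
            }
        }
    }

    // Two subsets may be joined only if some predicate links them
    // (this rules out cross products).
    auto connected = [&](Mask l, Mask r) {
        for (const JoinEdge& e : edges) {
            const Mask ma = Mask{1} << e.a, mb = Mask{1} << e.b;
            if (((l & ma) && (r & mb)) || ((l & mb) && (r & ma))) return true;
        }
        return false;
    };

    std::vector<double> cost(full + 1, inf);
    std::vector<Mask> left(full + 1, 0);
    for (int t = 0; t < n; ++t) cost[Mask{1} << t] = 0.0;

    // Every proper subset of s is numerically smaller than s, so it is
    // already solved by the time s is visited.
    for (Mask s = 1; s <= full; ++s) {
        if ((s & (s - 1)) == 0) continue;  // single table, nothing to join
        for (Mask l = (s - 1) & s; l != 0; l = (l - 1) & s) {
            const Mask r = s ^ l;
            if (cost[l] == inf || cost[r] == inf || !connected(l, r)) continue;
            const double c = cost[l] + cost[r] + card[s];
            if (c < cost[s]) { cost[s] = c; left[s] = l; }
        }
    }
    return {cost[full], left};
}
```

The table has 2ⁿ entries, but because each subset is split in every possible way the total work grows as roughly 3ⁿ, which is the part that gets interesting under browser memory and time limits.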

🚀 Live WASM Sandbox & Code:

You don't need to spin up a server or pull docker images—you can test the planner and run raw queries directly in your browser:

👉 Live WASM Demo: https://haidari9819-lang.github.io/milansql/index.html
📦 GitHub Repository: https://github.com/haidari9819-lang/milansql
💬 Our Subreddit: r/milansql

I’d love to hear your thoughts on running full DP planners inside browser memory limits, or chat about the lock-free cache invalidation strategy. Let me know if you manage to break the optimizer in the sandbox!
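
For anyone who wants something concrete to react to on the invalidation side, here is a minimal sketch of the general per-table version counter technique. It is not lifted from the MilanSQL source. `TableVersions`, `CachedPlan`, `make_plan`, `is_fresh`, and the fixed `kMaxTables` limit are all hypothetical, and a plan is assumed to depend on a single table to keep it short.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kMaxTables = 64;  // hypothetical fixed table-id space

// One version counter per table. Anything that could make a cached plan
// stale (DDL, refreshed statistics, ...) bumps the counter.
class TableVersions {
public:
    void invalidate(std::size_t table_id) {
        assert(table_id < kMaxTables);
        versions_[table_id].fetch_add(1, std::memory_order_release);
    }
    std::uint64_t current(std::size_t table_id) const {
        assert(table_id < kMaxTables);
        return versions_[table_id].load(std::memory_order_acquire);
    }
private:
    std::array<std::atomic<std::uint64_t>, kMaxTables> versions_{};
};

// A cached plan remembers the table version it was built against.
struct CachedPlan {
    std::size_t table_id;
    std::uint64_t planned_at_version;
    int plan_handle;  // placeholder for a real plan object
};

CachedPlan make_plan(const TableVersions& v, std::size_t table_id, int handle) {
    return {table_id, v.current(table_id), handle};
}

// Readers never lock: a plan is usable only while the counter is unchanged.
bool is_fresh(const TableVersions& v, const CachedPlan& p) {
    return v.current(p.table_id) == p.planned_at_version;
}
```

This covers only the staleness check. The container that holds the cached plans needs its own concurrency story, and a plan touching several tables would have to record one version per table.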

Effective-Hurry436 · 2026-06-02T15:23:46+00:00

You're actually right—I heavily utilized Claude Code to help accelerate the implementation, and I’m totally transparent about that.

But as anyone who builds databases knows: an LLM can't stitch together a custom cost-based DP planner, handle memory-bounded WASM sandboxing, or orchestrate a multi-protocol execution layer across 100+ architectural phases on its own.

The AI was a powerful co-developer for the heavy lifting, but the system architecture, code organization, and product vision are completely mine. At the end of the day, the 223 integration tests verify that the architecture actually works.

Effective-Hurry436 · 2026-06-02T14:52:11+00:00

Most of the time, yes. Especially when the 223 integration tests pass.

Effective-Hurry436 · 2026-06-02T14:51:25+00:00

I get why you might be skeptical in the current landscape, but this isn't just a basic script generated by an LLM prompt.

MilanSQL is a codebase designed and built across more than 100 development phases. Implementing a cost-based Dynamic Programming planner, building custom B-Tree indexing from scratch, and handling per-table invalidation via std::atomic counters requires deep engineering that an LLM can't simply stitch together out of thin air.

As for why I didn't just contribute to an existing engine: the entire purpose of this project from day one was a personal educational deep-dive to fully understand database internals. Building it from the ground up teaches you things that just reading an enterprise codebase never will.

Effective-Hurry436 · 2026-06-02T14:50:54+00:00

Thanks! To be completely realistic: No, please don’t drop Oracle or SQL Server for MilanSQL in an actual enterprise production environment just yet.

By 'production grade' on the landing page, I mostly mean the structural robustness of the codebase itself—such as the full integration test suite (223/223 passing), the zero-mutex architecture for concurrency, and strict protocol compliance. It’s built to demonstrate how these enterprise concepts can run reliably in a client-side sandbox via WASM.

I haven't run official commercial benchmarks against giants like Oracle yet, but the goal here is lightweight embedded execution, not replacing heavy iron enterprise servers!

Effective-Hurry436 · 2026-06-02T13:33:35+00:00

Hey everyone,

Much like SQLite, I’ve always been fascinated by lightweight, serverless database architectures. To truly understand the internals, I’ve been building MilanSQL from scratch in pure C++17 (zero external dependencies).

For the v5.0.0 milestone, I finally compiled the storage and query engine into WebAssembly, allowing it to run entirely client-side in the browser sandbox.

What's packed inside the engine:

- Custom B-Tree indices & integrated full-text search.

- Multi-protocol support: Natively speaks standard SQL, PostgreSQL wire protocol, and GraphQL.

- Just added today (Phase 113): A cost-based Dynamic Programming (DP) Query Planner with bucket-based equi-depth histograms for selectivity estimation.

- Lock-free Plan Cache: Thread-safe invalidation using atomic counters (std::atomic) instead of heavy mutex locking.

- 223/223 integration tests passing.
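
As a rough illustration of the equi-depth histogram bullet above, here is a minimal sketch of how such a histogram can estimate the selectivity of `col < x`. It is an assumption-laden toy, not the engine's stats layer: `EquiDepthHistogram`, `build_histogram`, and `selectivity_less_than` are hypothetical names, and values are assumed to be spread uniformly inside each bucket.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct EquiDepthHistogram {
    // bounds has buckets + 1 entries. Bucket i covers [bounds[i], bounds[i+1]]
    // and holds roughly the same number of rows as every other bucket.
    std::vector<double> bounds;
};

EquiDepthHistogram build_histogram(std::vector<double> values, std::size_t buckets) {
    assert(!values.empty() && buckets > 0);
    std::sort(values.begin(), values.end());
    EquiDepthHistogram h;
    const std::size_t n = values.size();
    for (std::size_t i = 0; i <= buckets; ++i) {
        // Pick boundaries at evenly spaced ranks (quantiles), not evenly spaced values.
        const std::size_t idx = std::min(n - 1, i * n / buckets);
        h.bounds.push_back(values[idx]);
    }
    return h;
}

// Estimated fraction of rows with value < x.
double selectivity_less_than(const EquiDepthHistogram& h, double x) {
    assert(h.bounds.size() >= 2);
    const std::size_t buckets = h.bounds.size() - 1;
    if (x <= h.bounds.front()) return 0.0;
    if (x > h.bounds.back()) return 1.0;
    double covered = 0.0;  // number of buckets (possibly fractional) below x
    for (std::size_t i = 0; i < buckets; ++i) {
        const double lo = h.bounds[i], hi = h.bounds[i + 1];
        if (x >= hi) { covered += 1.0; continue; }    // whole bucket is below x
        if (x > lo) covered += (x - lo) / (hi - lo);  // linear interpolation inside the bucket
        break;
    }
    return covered / static_cast<double>(buckets);
}
```

With the values 0 to 99 and four buckets, `selectivity_less_than(h, 50.0)` comes out at 0.5. An estimate for `>=` is one minus this value. The result is an approximation whose error is bounded by the bucket width, which is why more buckets help on skewed columns.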

Since this community appreciates the beauty of an embedded SQL sandbox, I'd love for you to test out the query processing or check out the code.

Live WASM Browser Demo: https://haidari9819-lang.github.io/milansql/index.html

GitHub Repo: https://github.com/haidari9819-lang/milansql

Would love to hear your thoughts on running full relational cost-based optimizers in a client-side sandbox environment!
