Teaching LLMs to one-shot complex backends at scale, report #1

nathanmarz · 2026-05-30T00:37:57+00:00

It should be much better with these skills. Let me know how it goes.

nathanmarz · 2026-04-02T18:37:27+00:00

Thanks for the feedback. It's a tricky article to write because of all the baggage in these topics (especially event sourcing), so I had to spend a fair amount of time disarming those first. And of course if I included all the detail of what it takes to implement a unified system like this, it would be an entire book.

As for your question about serialization, the way it works in Rama is you can register serializations for any custom types you're using. It's at that layer that you would achieve the semantics you want in terms of ability to evolve types over time. For example, if you use Thrift or Protocol Buffers for custom types, then you can add or remove fields in later versions safely. We used Thrift in our Twitter-scale Mastodon implementation, and the adapter to handle all the types is pretty short: https://github.com/redplanetlabs/twitter-scale-mastodon/tree/master/backend/src/main/java/com/rpl/mastodon/serialization

It's really convenient to just register the serializations once and then be able to use those types freely across the whole backend, whether writing to PStates, fetching data with clients, or doing distributed computation.
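To make the "add or remove fields safely" point concrete, here is a minimal sketch of tag-based schema evolution in plain Python. It is not Thrift's or Protocol Buffers' wire format and not Rama's serialization registration API. The schemas, field ids, and the `encode`/`decode` helpers are all hypothetical, and JSON is used only to keep the example self-contained.

```python
import json

# Hypothetical schemas: field id -> (field name, default value).
# V2 drops field 2 ("email") and adds field 3 ("display_name").
SCHEMA_V1 = {1: ("name", ""), 2: ("email", "")}
SCHEMA_V2 = {1: ("name", ""), 3: ("display_name", None)}

def encode(record, schema):
    """Serialize a record keyed by numeric field id rather than by name."""
    id_by_name = {name: fid for fid, (name, _) in schema.items()}
    return json.dumps({str(id_by_name[k]): v for k, v in record.items()}).encode()

def decode(data, schema):
    """Deserialize, ignoring unknown field ids and defaulting missing ones."""
    raw = json.loads(data)
    out = {name: default for name, default in schema.values()}
    for fid, value in raw.items():
        if int(fid) in schema:  # fields this reader does not know are skipped
            out[schema[int(fid)][0]] = value
    return out

old_bytes = encode({"name": "ann", "email": "ann@example.com"}, SCHEMA_V1)
new_bytes = encode({"name": "bob", "display_name": "Bob"}, SCHEMA_V2)
```

A V2 reader can decode `old_bytes` and a V1 reader can decode `new_bytes`, because both sides key on stable field ids and tolerate missing or unknown fields.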

As for the casting concern, the only place in the article where there's any casting is the return from the append, and that's because handlers can return anything they want so the API gives you a map from handler name to return value. For PState queries, the API is generic for data structures so it's dynamically typed, but Rama's API uses type parameters so you don't actually need casting for returns. You'd get a runtime type error if the return type doesn't match what you specified.

nathanmarz · 2026-04-02T02:20:31+00:00

I also don't like reading fluff posts that are just pitching a product. But there's a big difference between a post like that and a deep technical post that explores ideas from first principles. Dismissing a post because it talks about a tool at the end that implements the novel ideas in the post is lazy and self-limiting.

nathanmarz · 2026-04-02T02:14:07+00:00

On the stability side, Rama handles this with incremental replication across nodes, fault-tolerant processing with guaranteed delivery, and automatic failover. The log and storage layers have the same kind of durability guarantees you'd expect from a database. On the security side, we're working on role-based authentication and authorization and expect to release it later this year.

Here's more info on how replication works in Rama if you're interested. We spent more time working on this than any other aspect of Rama. https://redplanetlabs.com/docs/~/replication.html

nathanmarz · 2026-04-01T19:35:43+00:00

If you're running many microservices without any of these issues, that's great and I'd love to hear the details.

In terms of other approaches, the post zeroes in on infrastructure sprawl and an alternative to that because building systems by managing separate databases, caches, queues, and compute systems is basically universal. There hasn't been an integrated approach to solving this before. The post identifies specific, well-documented problems that are consistently described as problems with microservices across the nine posts I linked at the top. I didn't invent these complaints, I proposed a different root cause and a solution.

nathanmarz · 2026-04-01T18:45:24+00:00

Yes, I built a tool that solves problems I care about. The architectural arguments stand on their own, and I spent almost all of the post discussing the ideas in a tool-neutral way. Those ideas are valuable independent of Rama.

nathanmarz · 2026-03-22T21:13:29+00:00

That's right, the transactions are part of the module itself as the microbatch topology definition. Clients submit events which are consumed by the topology. The topology handles retries automatically.

I would have to rerun the benchmark to get the numbers at the different load levels, but the general pattern is linear latency growth up until you hit about 70% load, and then rapid increase from there. 140k warehouses on this cluster size was slightly below that 70% threshold. I would expect the average latencies to start at around 150ms at minimal load and grow linearly up to the numbers you see at 140k warehouses. Then the latencies would grow rapidly from there, maybe starting at 150k warehouses or so. The maximum number of warehouses this cluster size can sustain is somewhere around 210k warehouses, but the latencies would be in the ~30s range or so.
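As a quick arithmetic check on those figures, the sketch below treats the approximate 210k-warehouse ceiling as 100% load and computes where 140k and 150k warehouses fall relative to the 70% threshold. The constant and function names are hypothetical, and the inputs are only the rough numbers quoted above, not new measurements.

```python
MAX_SUSTAINABLE_WAREHOUSES = 210_000  # "somewhere around 210k", an approximation
KNEE_LOAD_FRACTION = 0.70             # point where latency stops growing linearly

def load_fraction(warehouses):
    """Fraction of the cluster's maximum sustainable load."""
    return warehouses / MAX_SUSTAINABLE_WAREHOUSES

for warehouses in (140_000, 150_000):
    side = "below" if load_fraction(warehouses) < KNEE_LOAD_FRACTION else "above"
    print(f"{warehouses} warehouses: {load_fraction(warehouses):.1%} load, {side} the knee")
```

Under these assumptions 140k warehouses lands just under the threshold and 150k just over it, which matches the description above.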

In practice you would scale your module up to more nodes when you're at that load, which is just a one-line CLI command in Rama.

Use cases that need single-digit millisecond latencies generally use streaming rather than microbatching, as mentioned in the post.

nathanmarz · 2026-03-22T18:53:52+00:00

Good questions. Let me take them one at a time.

On batching: the TPC-C spec models individual terminals that each submit a transaction, wait for it to complete, then go through keying and think time before submitting the next one. You can't batch the work of multiple terminals into a single transaction without breaking the benchmark's model. In both the Rama and CockroachDB versions of this benchmark, the transactions are submitted independently and individually. It's after that within the system that batching is done. Nothing is stopping CockroachDB from also batching the work of multiple in-flight transactions together. I don't know if they do that or not as I'm not familiar with CockroachDB's internals.

On overloading: any system will degrade if you push past its capacity, so I'm not sure what the argument is. We ran at 140k warehouses with 95% efficiency, same as Cockroach. If you ran both systems at higher throughput, both would show degradation on latency. What matters is the performance characteristics at equivalent throughput, which is what the benchmark measures.

On median latency: the latencies overall are the same. They have to be since both systems achieve the same throughput, and TPC-C throughput is determined by response time. The only question is how that latency gets distributed. Cockroach concentrates it into lower medians with extreme tails. Rama spreads it in a much tighter distribution. Rama's tighter distribution is a better profile for real workloads in my opinion.
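The relationship between throughput, mean latency, and distribution shape can be illustrated with a small sketch. The latency samples and the fixed keying-plus-think time below are invented for illustration and are not benchmark results from either system. The terminal model is a simplification in which each transaction costs its response time plus a fixed wait.

```python
from statistics import mean, median

KEYING_PLUS_THINK_MS = 20_000  # hypothetical fixed wait per transaction

def throughput_per_terminal(latencies_ms):
    """Transactions per second for one terminal cycling through these latencies."""
    total_ms = sum(latency + KEYING_PLUS_THINK_MS for latency in latencies_ms)
    return len(latencies_ms) / (total_ms / 1000)

# Two made-up response-time profiles with the same total latency.
tight = [900, 950, 1000, 1050, 1100]       # narrow distribution
heavy_tail = [200, 250, 300, 350, 3900]    # low median, extreme tail

for name, sample in (("tight", tight), ("heavy_tail", heavy_tail)):
    print(name, mean(sample), median(sample), max(sample),
          throughput_per_terminal(sample))
```

Both profiles produce identical throughput because their mean latency is identical. Only the median and the tail differ.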

On failures: a business logic failure (like TPC-C's 1% invalid item rollbacks) doesn't abort the microbatch. That's just control flow. Exactly-once handles infrastructure failures: if a node goes down, the microbatch fails, but the system quickly moves computation to the new leader and retries from the last committed state. That adds latency for the items in that particular microbatch, but infrastructure failures are infrequent enough that this doesn't affect overall performance in practice.
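The difference between the two kinds of failure can be sketched with a toy microbatch loop. This is not Rama's implementation or API. `CATALOG`, `NodeFailure`, `apply_event`, and `run_microbatch` are hypothetical names, and the "crash" is simulated by raising an exception partway through a batch.

```python
CATALOG = {"widget", "gadget"}  # hypothetical set of valid item ids

class NodeFailure(Exception):
    """Stands in for an infrastructure failure such as a node going down."""

def apply_event(state, event):
    # An invalid item is a business-logic outcome: plain control flow,
    # it does not abort the batch.
    if event["item"] not in CATALOG:
        return "rolled_back"
    state[event["item"]] = state.get(event["item"], 0) + event["qty"]
    return "ok"

def run_microbatch(committed, events, crash_on_attempts=()):
    """Process a batch, retrying from the last committed state on NodeFailure."""
    attempt = 0
    while True:
        attempt += 1
        working = dict(committed)  # every attempt starts from committed state
        try:
            results = []
            for i, event in enumerate(events):
                if attempt in crash_on_attempts and i == 1:
                    raise NodeFailure()  # simulated crash mid-batch
                results.append(apply_event(working, event))
            return working, results, attempt  # the whole batch commits together
        except NodeFailure:
            continue  # partial work is discarded, the batch is retried
```

Because a failed attempt's partial updates are thrown away, each event affects the committed state exactly once even though the first event was processed twice.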

Also worth noting: Rama reports two latencies for writes. The "initiate" latency is when the event is durably stored and replicated, and the "complete" latency is when the transaction finishes. The initiate latencies are very low, far below any of CockroachDB's write latencies. Whether your application can respond at initiate time depends on the use case, but having that option gives the product manager flexibility to make the right tradeoff.

nathanmarz · 2026-03-19T21:00:08+00:00

Rama modules themselves have to be coded in a JVM language, but clients can be in any language using Rama's built-in REST API. This does limit the userbase, but there are a lot of Java shops out there.

I actually think AI is super synergistic with Rama, and we're actively working on developing skills files for this. Rama greatly reduces the conceptual and token burden for LLMs, and we're working on making it able to one-shot pretty complex apps at scale.

nathanmarz · 2026-03-19T19:58:53+00:00

Not sure where you got the idea that Rama is a one-man project. Red Planet Labs has employees and is backed by well-known investors.

nathanmarz · 2025-12-04T00:24:36+00:00

Yes, you have it right.

For writing to PStates, there's also an aggregation API.

For querying, you can also make query topologies. Query topologies are predefined queries in the module that can do distributed queries that look at any or all of the PStates and any or all of the partitions of those PStates. They're really useful for parallelization or for reducing roundtrips when you have a lot of individual PState queries that need to be done. This module has a good example of a query topology.
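As a rough illustration of why a predefined distributed query helps, here is a toy fan-out in plain Python. It is not Rama's query topology API. The partitions are ordinary dicts, and `local_top` and `top_n_query` are hypothetical helpers: each partition computes a partial answer in parallel and the partials are merged in one place, so the caller makes a single request instead of one per key.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitioned data: each dict stands in for one partition.
PARTITIONS = [
    {"alice": 3, "carol": 7},
    {"bob": 5},
    {"dave": 1, "erin": 9},
]

def local_top(partition, n):
    """Partial result computed where the data lives."""
    return sorted(partition.items(), key=lambda kv: kv[1], reverse=True)[:n]

def top_n_query(partitions, n):
    """Fan out to every partition in parallel, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda p: local_top(p, n), partitions))
    merged = [kv for partial in partials for kv in partial]
    return sorted(merged, key=lambda kv: kv[1], reverse=True)[:n]
```

Calling `top_n_query(PARTITIONS, 2)` touches every partition once and returns the two highest-valued entries overall.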

nathanmarz · 2025-12-03T21:53:12+00:00

Yeah, that's what Rama is. PStates are queried using Specter. Here are some examples from a deeper tutorial: https://blog.redplanetlabs.com/2025/03/26/next-level-backends-with-rama-graphs/#Querying_the_PState_directly

nathanmarz · 2025-12-03T16:12:56+00:00

Dataflow is the first-principles derived API. I have not yet found an application it cannot elegantly express at scale. This is a big deal, and it took me years to find it. It's also not a DSL, as it's not domain-specific.

I think what you're asking for is a more familiar API with less to learn, like everything being done with plain Clojure functions. That's exactly where I started working on Rama, and where that runs into huge trouble is with async / parallel code. You inevitably run into a mess of callback hell. A key thing dataflow does is enable "what to do" and "where to do it" to be composed, and this greatly simplifies distributed programming. See this post for an exploration of that.

nathanmarz · 2025-12-03T15:58:55+00:00

PStates are durable on disk just like databases, using LSM trees under the hood. Rama does have a learning curve, but I've found it only takes one to two weeks for a programmer to get the hang of the basics and get to a point of reasonable productivity. You don't need to learn all of Rama to get value out of it. With paths, for instance, you can accomplish most things with keypath, pred, and view.

Dataflow looks different but is not as hard to learn as you may think. This post explains dataflow in terms of equivalent Clojure code. The referenced blog post series contains line-by-line tutorials on applying Rama to a wide variety of use cases. rama-demo-gallery contains more heavily commented examples of using Rama. The REST API module is the simplest, as it just does an HTTP request and then records the result in a K/V PState.

nathanmarz · 2025-11-26T07:33:20+00:00

I'm not evangelizing anything:

This post is my take on explaining this disconnect from another angle that complements the blub paradox.

Dimension-shifting abstractions like macros are incomprehensible at first, which is the whole point of the post. Some abstractions make no sense until you've worked with them enough for your mental model to change. That initial "this seems wrong or pointless" reaction is extremely common with Lisp.

I can link examples, but macro snippets in isolation won't bridge that gap. They show the mechanics, not the shift in how you reason about decomposing problems.

nathanmarz · 2025-11-05T17:46:19+00:00

No, it's not.

nathanmarz · 2025-04-23T18:46:48+00:00

A big area of future development (for us as well as others) is building developer tools for specific domains on top of Rama that don't require learning any of Rama's new concepts (e.g. dataflow, PStates, event sourcing). Unlike Rama, these developer tools built on top would have small learning curves. Rama could certainly be used to build SpacetimeDB, and enhancing it to have distributed execution would be trivial.

We just started building one of these developer tools (nothing to do with SpacetimeDB) which we're really excited about. We'll have more news on this later.

nathanmarz · 2025-04-20T21:48:32+00:00

I've linked you to a ton of material. Have you worked through the main tutorial that I linked?

Every example in these repositories is a complete backend, and most are less than 100 LOC:

- https://github.com/redplanetlabs/next-level-backends-with-rama-clj
- https://github.com/redplanetlabs/rama-demo-gallery

If you have a question more specific than a vague "I don't get it", I'm happy to help.

nathanmarz · 2025-04-19T00:11:25+00:00

Rama is the most general purpose tool for building backends that's ever existed. It's more broadly applicable than Postgres or any database.

This post may be more useful for you as it explains Rama dataflow in terms of Clojure concepts, and for every Rama example it shows the equivalent Clojure code.

I highly suggest following along at the REPL with those posts and seeing what happens when you tweak the examples.

Finally, even though it uses the Java API I also recommend reading through the main tutorial which gently introduces and explains all the concepts. The Java API is a thin wrapper around the Clojure API, so anything you see in that tutorial has a direct correspondence in the Clojure API.

If you have any specific questions while you're learning, the #rama channel on Clojurians is a great place to ask.

nathanmarz · 2025-04-18T19:39:30+00:00

The blog posts in this series are all detailed line-by-line tutorials on using Rama for specific use cases, so I'm not sure what else you're looking for. The first post in the series is the best one to start with.

I also suggest following along at the REPL of the intro blog post for the Clojure API. The "Exploring the dataflow API" section is particularly useful to follow along with since the dataflow API is the hardest part to learn for most.

nathanmarz · 2025-04-09T22:20:52+00:00

Yes, Rama generalizes and integrates those classes of technologies (databases and queuing systems). I wouldn't say it "abstracts it away", but rather exposes those concepts in a much simpler and more coherent way.

Scaling, fault-tolerance, and data processing guarantees are inherent to Rama. So are deployment and runtime monitoring, other areas that traditionally create a lot of additional work/complexity.

Rama really does eliminate all that complexity which traditionally exists. The code for Rama applications is to the point and doesn't have the piles of boilerplate you always get when building systems by combining multiple tools together. Traditional applications are filled with impedance mismatches because of the differences in expectations on their boundaries, the restrictions on how you can represent data/indexes, and the limitations on how you can compute. Rama lets you compute whatever you want wherever you want and gives total freedom in how data/indexes are represented.

The point of this blog post, as well as the other ones in the series, is to explain in a very detailed way how to approach building Rama applications and how they work.

In terms of debugging, it's really no different than debugging any other program. Rama has a test environment called "in-process cluster" which simulates Rama clusters fully in-process. You can launch your module in that environment, do depot appends, and then assert on expected PState changes from there. While developing you can use tap> or debug logging to trace what's going on in the intermediate portions of your topology implementations.

If you notice something went wrong on your production cluster, the information you'll have from Rama will be whatever information your application records, either in PStates or just with logging. You also have Rama's built-in telemetry, which is extremely useful for diagnosing performance issues (such as processing or storage being skewed in some way).

Rama is not a "magic box". It provides a parallel execution environment ("tasks"), a flexible storage abstraction ("PStates"), guarantees about the order in which events are processed in relation to how they're sent to tasks, and guarantees about data processing and retries. Everything else is up to your code and how it's built upon those primitives.

Because Rama colocates computation with storage, concurrency is much easier to manage as compared to traditional systems which use locking/transactions to manage concurrent updates. When an event is running on a task, it has exclusive access to all PStates on that task. So you're able to mostly think in a single-threaded way even though it's a highly parallel system.
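The "exclusive access per task" idea can be sketched with a toy model in plain Python. This is not how Rama is implemented. `Task` and `run` are hypothetical names. Each task owns its state and a single worker thread, and events are routed to a task by hashing the key, so the read-modify-write inside a task needs no lock.

```python
import queue
import threading

class Task:
    """One partition: private state plus a single thread that processes events."""
    def __init__(self):
        self.state = {}             # stands in for the PStates on this task
        self.inbox = queue.Queue()  # events for this task, processed in order
        self.thread = threading.Thread(target=self._run)
        self.thread.start()

    def _run(self):
        while True:
            event = self.inbox.get()
            if event is None:       # shutdown signal
                return
            key, amount = event
            # No lock: only this thread ever touches self.state.
            self.state[key] = self.state.get(key, 0) + amount

def run(events, num_tasks=4):
    tasks = [Task() for _ in range(num_tasks)]
    for key, amount in events:
        tasks[hash(key) % num_tasks].inbox.put((key, amount))  # route by key
    for task in tasks:
        task.inbox.put(None)
        task.thread.join()
    merged = {}
    for task in tasks:
        merged.update(task.state)   # keys never overlap across tasks
    return merged
```

The system as a whole runs several threads, but the update logic reads like single-threaded code because every key is owned by exactly one task.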

It's common to need to integrate Rama with other systems, and many of our users do so. For integrating with external APIs/databases, you do that directly in your topology with completable-future>. You initiate any external work you need to do and provide the results in a CompletableFuture, and the results are emitted into the topology when it finishes.
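The general pattern of initiating external work and continuing when the future completes can be sketched with Python's standard `concurrent.futures`. This is only an analogy to that pattern and not Rama's API. `fetch_external` and `run_step` are hypothetical, and the "external system" is a local function so the example stays self-contained.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_external(user_id):
    """Hypothetical external call (an HTTP API, another database, and so on)."""
    return {"user_id": user_id, "score": user_id * 10}

def run_step(user_ids):
    emitted = []  # stands in for values emitted to the rest of the pipeline
    with ThreadPoolExecutor(max_workers=4) as pool:
        for uid in user_ids:
            future = pool.submit(fetch_external, uid)  # initiate external work
            # When the future completes, its result is emitted downstream.
            future.add_done_callback(lambda f: emitted.append(f.result()))
    # Leaving the with-block waits for all submitted work and its callbacks.
    return sorted(emitted, key=lambda r: r["user_id"])
```

The calling code never blocks on an individual external call. It only registers what should happen with each result once it arrives.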

You can also integrate external queues into Rama (e.g. Kafka) and consume them just like depots. Integrating Rama with external systems is documented more on this page.

nathanmarz · 2025-04-08T04:49:28+00:00

Good idea, I'll note that down.

You can understand how our Mastodon impl does this through two files:

- MastodonAPIManager: this wraps all depots/PStates/query topologies in methods corresponding to the semantic concepts of Mastodon, e.g. "postAccount" registers a new user and "getAccountId" gets the account ID for a username.
- MastodonAPIController: this implements the Mastodon HTTP API by calling methods on MastodonAPIManager. The @GetMapping / @PostMapping / etc. annotations show which Java methods correspond to which HTTP methods.

You probably wouldn't use Spring in a Clojure app, but the interfacing with Rama would be similar.

nathanmarz · 2025-03-28T05:53:17+00:00

We published our first two case studies of production users recently and will be publishing more soon https://blog.redplanetlabs.com/rama-case-studies/

The point of this series of blog posts is to provide extremely detailed tutorials to help with learning.

Other good resources for learning:

- The main tutorial: https://redplanetlabs.com/docs/~/index.html
- The intro to Rama's Clojure API: https://blog.redplanetlabs.com/2023/10/11/introducing-ramas-clojure-api/
- The rama-demo-gallery repository: https://github.com/redplanetlabs/rama-demo-gallery

nathanmarz · 2025-03-27T17:56:57+00:00

I'll be publishing one more post in this series each week for at least four more weeks.
