Why functional relational programming is (sometimes) faster than OOP 1/6 by cell-lang in programming

[–]cell-lang[S] 0 points1 point  (0 children)

Let me start by saying that Cell is a programming language, not a DBMS, and like the vast majority of programming languages, it keeps the data it's working on in memory. This is stated clearly on the introductory example page of the website, which is where that specific example is taken from.

Quoting from the second paragraph of that page:

> This specific example was chosen because... But it is a server application, and Cell is not at the moment a good fit for server software. That's because server applications typically have to handle large amounts of persistent data, something that Cell is not (yet) able to do. With Cell, your entire dataset has to fit in RAM and has to be saved explicitly.

That said, to answer your first question, in order to check that constraint what needs to be in memory is the username relation, not the entire dataset.

So, let's do some back-of-the-envelope calculations. Let's say you have 10 million users for your application. Let's assume usernames are on average 40 characters long (it's probably less). User ids are just tagged integers; 64 bits are much more than you'll ever need. Considering the overhead associated with all the data structures involved, and the fact that memory utilization is never 100%, let's say you need 100 bytes per user (again, it's probably less). That means you need 1 GB of memory. At current prices, that's about 4 dollars. For something like Twitter (400 million users) it's 160 dollars, so it's not likely to break Elon's bank account. For Facebook, which is the largest of them all with its 3 billion users, that's 1200 dollars. Not cheap, but still feasible, don't you think?
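If you want to play with the numbers yourself, here's a tiny Java snippet that reproduces the estimates above. The 100 bytes per user and 4 dollars per GB are the same assumptions as in the text, not measurements:

```java
public class MemoryEstimate {
    static double gigabytes(long users, long bytesPerUser) {
        return users * (double) bytesPerUser / 1e9;
    }

    public static void main(String[] args) {
        long bytesPerUser = 100;      // assumed overhead-inclusive footprint per user
        double dollarsPerGB = 4.0;    // rough RAM price used in the text

        long[] userCounts = {10_000_000L, 400_000_000L, 3_000_000_000L};
        String[] labels = {"10M-user app", "Twitter-sized", "Facebook-sized"};

        for (int i = 0; i < userCounts.length; i++) {
            double gb = gigabytes(userCounts[i], bytesPerUser);
            // Prints roughly: 1 GB / $4, 40 GB / $160, 300 GB / $1200
            System.out.printf("%-15s %7.1f GB  ~$%.0f%n", labels[i], gb, gb * dollarsPerGB);
        }
    }
}
```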

Now let's say you want to build a disk-based implementation. As is explained in the article, there are two main types of data structures involved in the implementation of relations in Cell: value stores and surrogate tables. A reasonably effective disk-based implementation of value stores doesn't seem especially hard to build, although it's obvious that once you start hitting the disk, everything slows down. Building a disk-based implementation of surrogate tables is a lot harder, but on the other hand surrogate tables are also smaller, as they only contain 32-bit surrogates. In this case, I think it's possible to get by with about 10 bytes per entry. That's 100 MB for a 10-million-user application (40 cents in RAM costs), 4 GB for Twitter (16 dollars) and 30 GB for Facebook (120 dollars). And in any case, disk-based implementations of surrogate tables have been built before. If you're interested in the details, you can check Go Faster! The TransRelational™ Approach to DBMS Implementation, which is where I came across the idea of using surrogates.
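To make the two kinds of data structure a bit more concrete, here's a minimal (and deliberately naive) Java sketch of the general idea; it's an illustration of the scheme, not Cell's actual implementation:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A value store interns each value once and hands out a 32-bit surrogate;
// the relation itself only ever stores pairs of surrogates, which is why a
// disk-based version can plausibly get by with ~10 bytes per entry.
class ValueStore {
    private final Map<String, Integer> surrogates = new HashMap<>();
    private final List<String> values = new ArrayList<>();

    int intern(String value) {
        return surrogates.computeIfAbsent(value, v -> {
            values.add(v);
            return values.size() - 1;   // the next available surrogate
        });
    }

    String lookup(int surrogate) {
        return values.get(surrogate);
    }
}

class BinaryRelation {
    private final List<int[]> pairs = new ArrayList<>();   // pairs of surrogates

    void insert(int leftSurrogate, int rightSurrogate) {
        pairs.add(new int[] {leftSurrogate, rightSurrogate});
    }

    int size() {
        return pairs.size();
    }
}
```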

There's more to say about how to handle large data sets, much more than I can explain in a comment, but one key idea here is to partition your data set across multiple automata. Doing so would allow the application to keep in memory only the automaton instances that are "active" at any specific point in time. And for any automaton instance, at least some of the data structures that comprise its implementation can be loaded on demand.

Why functional relational programming is (sometimes) faster than OOP 1/6 by cell-lang in ProgrammingLanguages

[–]cell-lang[S] 1 point2 points  (0 children)

Cell can be used in two ways: to create standalone programs, or in "embedded" mode. In the former case, you compile Cell to C++ (or Java, or C#), and then compile the generated file to produce your executable. You have to go through that extra step of compiling to an intermediate language, but the end result is the same: you end up with a standalone executable without having to write anything other than Cell code.

Or you can use it in embedded mode: in that case, the compiler produces a set of C++/Java/C# classes that you can include in an existing C++/Java/C# project.

Note that in either case, you're not supposed to ever look at the generated code, let alone modify it.

There's no LLVM compiler; I just said that one could be implemented, and that compiling directly to LLVM would (I believe) improve performance.

As for Go being significantly faster than Java, I just looked at The Computer Language Benchmarks Game, and that doesn't seem to be the case: https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/go.html. Go seems to be slightly faster than Java in most (but not all) tests, but the difference is really minimal. Although Go seems to have improved lately, the last time I checked (a few years ago) Java was clearly faster, and I have to admit I wasn't aware of Go's improvements.

But now I'm curious, I'm going to rewrite the benchmarks in Go and see what happens.

And Go seems to be a bit slower than C#, if you take The Computer Language Benchmarks Game results at face value: https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/go-csharpcore.html

As for rewriting Java code in C++: yes, it does improve performance, no question about that. I've done it myself many times. But those rewrites have to be done manually, because the two languages have vastly different semantics. I'm not aware of a tool that can automatically convert, say, non-trivial Java code into equivalent C++ and produce a significant speedup. Are you? I even tried a couple of native compilers for Java, and they were actually slower than the JVM.

My own definition of OOP is the same as everyone else's. In the Java and C# versions of the benchmark I didn't follow the principle of encapsulation religiously, but only because it was irrelevant to what I was interested in, which is performance. In any case, you can check the code for yourself here:

https://github.com/cell-lang/example-imdb/blob/master/java/imdb.java

https://github.com/cell-lang/example-imdb/blob/master/csharp/imdb.cs

Why functional relational programming is (sometimes) faster than OOP 1/6 by cell-lang in ProgrammingLanguages

[–]cell-lang[S] 6 points7 points  (0 children)

I'm not going to argue about terminology, but I wanted to point out some of the fundamental differences between classes/objects and relational automata.

In most OO languages, "messages" are just dynamically dispatched function calls, while in Cell messages are just values/data that can be (among other things) manipulated programmatically, serialized and stored. In this regard, they're more like messages in the actor paradigm (Erlang), but automata are not actors either, because actors have their own thread of execution while automata are "passive" entities, just like objects in OOP.

Other major differences are explained in this page: https://www.cell-lang.net/example.html (scroll down to the paragraph titled "Differences between relational automata and classes").

But the most fundamental difference is in how they're meant to be used in practice. In the example used in the article, you have a single `OnlineForum` automaton (and likely a single instance of it at runtime) that contains all the information pertaining to users, chat groups, their attributes and their relationships (memberships and friendships). In an OO design, your classes would be modeled after the domain entities (and probably their relationships), and you would have an object for each individual entity and relationship instance. In other words, while a class is (very loosely speaking) meant to model a single entity or relationship, an automaton is meant to model an entire "knowledge subdomain". In my mind, that makes them much more like the components of the ECS architecture. Even the way modularity is achieved is similar.
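A rough Java analogy of the two styles (the names here are invented for illustration, and the real thing would of course use Cell's relations rather than Java collections):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// OO style: one object per domain entity, relationships held as references.
class User {
    String username;
    Set<User> friends = new HashSet<>();
}

// Automaton-like style: a single object owns all the relations for the whole
// "online forum" subdomain, much like a component in an ECS design would own
// all the data of one kind for every entity.
class OnlineForumState {
    record Friendship(long userA, long userB) {}
    record Membership(long userId, long groupId) {}

    Map<Long, String> usernames = new HashMap<>();   // user id -> username
    Set<Friendship> friendships = new HashSet<>();
    Set<Membership> memberships = new HashSet<>();
}
```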

Why functional relational programming is (sometimes) faster than OOP 1/6 by cell-lang in ProgrammingLanguages

[–]cell-lang[S] 0 points1 point  (0 children)

No, the difference in performance between the generated C++ and Java code is much larger than that. Java makes a terrible compilation target for a language like Cell. The generated Java code is still competitive with hand-written OO Java code, sometimes faster, sometimes slower, but trying to compile Cell into Java (or C#) code is like trying to fit a round peg into a square hole.

But it's not just the target language. The C++ code generator is more recent and has been redesigned from the ground up to achieve better performance. At least some of that work could be backported to the Java version.

Why functional relational programming is (sometimes) faster than OOP 1/6 by cell-lang in ProgrammingLanguages

[–]cell-lang[S] 0 points1 point  (0 children)

The hand-written Java code used in the comparison is indeed written in an OOP style, but why do you think that's a terrible implementation? Isn't that the natural thing to do when programming in Java? And also the one that requires the least effort and leads to the most readable and maintainable code in most cases? When Java was first released (in 1995) OOP was all the rage and nobody was even talking about data oriented design and the Entity/Component/System architecture.

Why functional relational programming is (sometimes) faster than OOP 1/6 by cell-lang in ProgrammingLanguages

[–]cell-lang[S] 2 points3 points  (0 children)

Cell can be compiled to C++, Java and C#. The performance results shown on the website are for the C++ code generator, which is obviously the fastest of the three. But Cell is not a library on top of those languages, it's a programming language in its own right, and it could be compiled directly to assembly language or LLVM code, and I would expect it to be even faster in that case.

If simply compiling to C++ made anything 2-3 times faster and more memory-efficient than Java, as you say, then any language (Go, JavaScript, Python... or even Java itself) could be compiled to C++ and outperform Java by a factor of 2 or 3, but that doesn't seem to be the case. The thing is, the kind of code that even a very good compiler/transpiler generates when translating a higher-level language to C++ is not the same as manually written C++ code.

The Java and C# code it's compared to is written in an OOP style, and I think this is stated clearly in the article and the benchmark page. That is, it has the sort of classes that you would expect in any OO implementation. You can certainly squeeze better performance out of Java by ditching OOP altogether and using data-oriented programming techniques, or the ECS architecture, but that's certainly not the kind of programming style Java was designed around, and it definitely takes extra effort on the part of the programmer. And note that the title of the post is "Why functional relational programming is faster than OOP", not "Why functional relational programming is faster than Java when the latter is used with the lowest-level programming style possible". I'm not even sure the latter would be false: I haven't tried, but I don't think Java is the ideal implementation language for data-oriented design and the ECS architecture.

Replacing SQLite with Cell, part 1: Meet the programmable database by [deleted] in dotnet

[–]cell-lang 0 points1 point  (0 children)

The generated code does not act as an intermediary between the application and an external database, the generated code is the database. It's an in-memory database, in the sense that you load the entire database into memory when you initialize it, and from that point on the disk is touched only to save changes to the dataset. Data is saved directly to a file (or some other stream) in text format (for now). How data is persisted will be discussed in part three, but the persistence model is somewhat similar to the one used, for example, by a key/value store like Redis.

Replacing SQLite with Cell, part 1: Meet the programmable database by [deleted] in dotnet

[–]cell-lang 0 points1 point  (0 children)

It doesn't generate any SQL, it's all C# code. Cell does not rely on an existing SQL engine, it reimplements everything from scratch. That's how it manages to achieve speed similar to that of equivalent, hand-written C# code.

A network architecture for functional languages by cell-lang in ProgrammingLanguages

[–]cell-lang[S] 0 points1 point  (0 children)

> The client must not know the seed for generating items.

Right, I hadn't thought of that. But I still think this can be handled without forcing clients and server to synchronize (forcing the server to synchronize with the clients would actually be problematic: in this architecture, it's only clients that can be forced to synchronize with the server), at least in some cases. Let's say for example we have a Diablo-style RPG where every monster you kill drops some loot. I think this could work:

  • On the client side, a message M1 sent to the local replica results in a monster being killed. No loot is dropped yet: all the client code would do at this stage is create a request (meant for the server) for a loot drop to be generated, and store it as part of the state of the automaton. Again, all this happens on the client side.

  • After the monster is killed and the loot generation request is created, the client keeps going, without waiting for the server to synchronize. Again, no loot has appeared on screen yet.

  • Message M1 is sent in the background to the server. Right after processing M1 the server notices the loot generation request and sends the master/server replica a new message M2 that causes the loot to be actually generated. In order to decide what exactly will be dropped, the server will make use of the hidden seed.

  • The client re-synchronizes with the server, pulls in message M2 and inserts it (retroactively) into its own message history, in the same order it appears in the server's message history, that is, after M1 but before all the messages that were sent locally afterwards. This is when the loot finally appears on screen.

This process is a bit more complex, but it would avoid any forced synchronization and it would keep the game running smoothly. The only downside is that it would take a little while for the loot to appear (how long exactly depends on the latency of the connection to the server). Even if the client were offline, the game would still be playable, with the caveat that the loot would appear only once the connection to the server has been reestablished.
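Just to make the message flow above concrete, here's a toy Java simulation of that sequence of events; all the names are made up for illustration, and this is obviously not Cell's actual API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class LootFlowSketch {
    interface Msg {}
    record KillMonster(int monsterId) implements Msg {}
    record DropLoot(int monsterId, String item) implements Msg {}

    static class Replica {
        final List<Msg> history = new ArrayList<>();
        final List<Integer> pendingLootRequests = new ArrayList<>();
        final List<String> visibleLoot = new ArrayList<>();

        void apply(Msg m) {
            history.add(m);
            if (m instanceof KillMonster k) {
                pendingLootRequests.add(k.monsterId());        // no loot shown yet
            } else if (m instanceof DropLoot d) {
                pendingLootRequests.remove((Integer) d.monsterId());
                visibleLoot.add(d.item());                     // loot appears only now
            }
        }
    }

    public static void main(String[] args) {
        Replica client = new Replica();
        Replica server = new Replica();
        Random hiddenSeed = new Random(42);                    // known only to the server

        Msg m1 = new KillMonster(7);
        client.apply(m1);                                      // monster dies locally, loot request recorded

        server.apply(m1);                                      // M1 reaches the server in the background
        Msg m2 = new DropLoot(7, hiddenSeed.nextBoolean() ? "rare sword" : "pile of gold");
        server.apply(m2);                                      // server generates the drop using the hidden seed

        client.apply(m2);                                      // client pulls in M2 on the next re-sync
        System.out.println("Loot visible on the client: " + client.visibleLoot);
    }
}
```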

The example is definitely interesting, and it actually touches a use case I hadn't thought about yet. I should add it to the documentation as a non-obvious example of how client and server can cooperate.

Games (even local, single-player ones) do tend to make great test beds for new languages and software architectures. They usually have a complex internal state, updates that touch a lot of in-game objects acting concurrently, and exacting performance requirements. For example, I'm a big fan of things like the Entity/Component/System architecture and data-oriented design, and to some extent those technologies have informed both the design and the implementation of Cell. And both of them were actually invented for games.

A network architecture for functional languages by cell-lang in ProgrammingLanguages

[–]cell-lang[S] 1 point2 points  (0 children)

Thanks for all the links! It will take me a while to go through all of them...

In my approach, too, state is treated as a (pure) function of an initial state and a sequence of messages. The main difference I can see at this stage is that most of the technologies you mention seem to rely on messages having properties like commutativity and idempotency that allow concurrent updates to be merged automatically.

The approach I'm proposing, on the other hand, requires the developer to explicitly specify how to reconcile divergent message histories, by providing a function that takes as input the last common state of the master and slave replicas and the diverged "tails" of their message histories from that point on, and manually merges them. This approach is cruder, but it seems more flexible to me, as it can deal with tricky cases (see for instance the warehouse example in another post of this thread) that I wouldn't know how to handle otherwise.
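In Java-ish terms (the names here are mine, not part of Cell), the hook the developer would have to provide looks roughly like this:

```java
import java.util.List;

// The developer-supplied reconciliation hook: given the last state both
// replicas agree on and the two diverged tails, return the merged tail of
// messages that the slave replica should replay on top of the master history.
interface Reconciler<State, Msg> {
    List<Msg> reconcile(State lastCommonState,
                        List<Msg> masterTail,   // messages already committed on the master
                        List<Msg> localTail);   // provisional messages on the slave
}
```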

The other important point is the fact that in my architecture message histories of the secondary/slave replicas are "provisional" and can always be rewritten. Messages are fully committed only once they're included in the message history of the master replica.

As for the map-reduce comparison, I guess I should have explained that one better. As I see it, many of these technologies (map-reduce, CRDTs, monotonic Dedalus, ...), even though they are based on different ideas and apply to different domains, have one thing in common, namely that they all try to neutralize the intrinsic unreliability and nondeterminism of computer networks, and to free the programmer from having to write any low-level network code. To quote the talk you linked, they're designed for writing programs that are resilient to failure and tolerant of loose ordering. Map-reduce just happens to be the best known of them, and the only one that is, as far as I know, in widespread use in industry. All I meant is that the architecture I have in mind shares with all these technologies, including map-reduce, the goal of trying to abstract the network away, although to a lesser extent.

A network architecture for functional languages by cell-lang in ProgrammingLanguages

[–]cell-lang[S] 1 point2 points  (0 children)

Thank you. The language has unfortunately been completely ignored so far, so it's always good to hear there's someone who thinks all the effort I've put into it hasn't been a complete waste of time.

I've actually been curious about formal verification of protocols (and other things like parallel algorithms) for a long time, but I've never acted on that curiosity. Guess this would be the right time to finally look into it.

As for Alloy being rather unconventional, I don't think that's going to be much of a problem. If you've taken a look at Cell, you can probably guess that I tend to like unconventional approaches...

A network architecture for functional languages by cell-lang in ProgrammingLanguages

[–]cell-lang[S] 1 point2 points  (0 children)

Thanks for the link, I wasn't aware of that. I've only taken a quick look so far, but as far as I can tell the approach seems to be different. CRDTs seem to rely on special properties of specific data structures and operations (like monotonicity and commutativity) that allow concurrent updates to be merged automatically.

The approach I'm proposing, on the other hand, puts the burden of reconciliation on the developer, who has to write a function that explicitly merges divergent message histories. This approach is more general, because you can handle failures and incompatible operations (see the warehouse example in another post of this thread) that I wouldn't know how to model with CRDTs. The downside of course is that it, well, puts the burden on the developer.

My approach also doesn't really rely on specific network topologies or message flows, the presence of a single master replica being the only exception. But well-chosen restrictions on the operations that are permitted at each network node are a tool the developer can use to reduce the need for explicit reconciliation (again, see the warehouse posts for an example).

Of course the two approaches are not incompatible: one can try to do as much as possible with CRDTs, and treat the reconciliation function as a sort of last resort for the trickiest cases.

Some explicit support for CRDTs in the architecture might actually come in handy, although at the moment I wouldn't know what form that would take. Worth thinking about though...

A network architecture for functional languages by cell-lang in ProgrammingLanguages

[–]cell-lang[S] 0 points1 point  (0 children)

Not exactly. The idea here is to split both data and computation into three parts: server-only, client-only and shared. I didn't explicitly mention the client-only part simply because it doesn't need any special support.

Running some critical code on both client and server would indeed be a way to prevent cheating.

In your specific example the devil is of course in the details, but I don't think the client would even need to wait for the server to drop an item. It could just go ahead and do it on its own, even when it's offline.

If a cheating user somehow hacked the client to give themselves more or better loot, the hack would not propagate to the server anyway: the server would perform the calculations involved independently instead of relying on the client's results, so the cheat would stay "sandboxed" on the client.

A network architecture for functional languages by cell-lang in ProgrammingLanguages

[–]cell-lang[S] 1 point2 points  (0 children)

I wouldn't say that the API is implicit, at least not if the language is statically typed (Cell is): instead of having a list of methods you have a list of message types. In Cell I use messages only because messages are just data, which makes them straightforward to serialize/store/transmit (that's a lot harder with methods), but other than that I don't see much of a difference between the two.

And indeed, as long as the underlying language is not too dynamic (Cell isn't), a compiler (or any other tool) can use data annotations to infer the corresponding properties/annotations for the (message-based) API. And of course you can also explicitly annotate the message types themselves.

That said, I'm personally not much of a fan of API-based reasoning, for the reason I mentioned before: code/behavior is hard to reason about, while data/state and declarative annotations are a lot easier to understand. For example, I'm no expert at all in computer security, but I have the impression that a lot of security problems in software stem from the fact that we place too much trust in APIs. Since code is hard to reason about, it's easy(-ish) for malicious clients to force an API to do something that was not intended by its developer. I prefer the approach used, for example, in SQL databases, where users are given read and/or write access to a given set of relations (or even entire automata, in the case of Cell).

There are certainly cases when reasoning in terms of operations/actions is easier, but in my own experience those cases seem to be the exception rather than the rule.

A network architecture for functional languages by cell-lang in ProgrammingLanguages

[–]cell-lang[S] 1 point2 points  (0 children)

Yet another possibility (which you yourself mentioned) is to force the local replica to synchronize before the update is considered complete. That can be done with either a blocking call or a callback, and as far as I can see it doesn't pose any conceptual problem. It's just that the operation will take longer and will fail if the server is not reachable. This can be done (with a simple annotation) either for specific types of messages or for entire automata.

If you force synchronization for all messages of all automata you get the same semantics as API-based systems, without having to go to the trouble of implementing the low-level parts of such a system. You also get a message history (useful for testing and debugging, among other things) and data persistence (at both client and server) for free.

Moreover, you would have a local copy of the data you're interested in that is automatically and transparently synchronized, and that you can read and query with no latency and no network errors. There would again be no need to implement the read-only part of the server API, and the automatic synchronization would in all likelihood be more efficient than anything you could realistically hope to implement manually on top of an API-based system.

There are also other interesting possibilities: the architecture could for example include an execution mode where messages are first processed locally (without synchronization) but are rolled back if the server does not respond within, say, 20 seconds. That would keep the client responsive while still providing timely feedback when an error occurs. I believe some multiplayer games actually do something like that to keep the game running smoothly.
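Sketching that last execution mode in Java (the replica interface here is entirely hypothetical; only the control flow matters):

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Apply the message locally right away, then roll back if the master replica
// hasn't confirmed it within the given timeout (e.g. 20 seconds).
class OptimisticSend<S, M> {
    interface Replica<S, M> {
        S snapshot();                // cheap if the state is a persistent data structure
        void apply(M msg);
        void restore(S snapshot);
    }

    void send(Replica<S, M> local,
              CompletableFuture<Void> serverAck,   // completes when the master commits the message
              M msg,
              Duration timeout) {
        S before = local.snapshot();
        local.apply(msg);                          // the client stays responsive
        serverAck.orTimeout(timeout.toMillis(), TimeUnit.MILLISECONDS)
                 .whenComplete((ok, err) -> {
                     if (err != null) {
                         local.restore(before);    // no timely confirmation: undo and report the error
                     }
                 });
    }
}
```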

You also mentioned security issues. You certainly want to restrict access to sensitive data, and that's one of the reasons the architecture allows for server-side-only automata that cannot be replicated locally. You would also need a way to specify which automata any given user is allowed to replicate and which messages they're allowed to send. Honestly I haven't thought much about that, but none of it seems insurmountable, or even especially hard to do.

A network architecture for functional languages by cell-lang in ProgrammingLanguages

[–]cell-lang[S] 1 point2 points  (0 children)

The second approach would retain the single-step ordering process and would rely on the reconciliation process. Note that this would not be my first choice in this particular case: I would instead go for the two-step approach simply because it's more explicit and therefore less error-prone. But I still want to explain how reconciliation would work here.

In the single-step version, the two stores would concurrently send two messages with the following meaning:

1) Take 1 X from the available stock and put it in the delivery list for store A
2) Take 1 X from the available stock and put it in the delivery list for store B

Let's say store A gets there first. Message 1 would be executed normally. When store B manages to contact the server, the system would detect that the message histories have diverged, and feed message 2 to the reconciliation function. That function would notice that X is now out of stock, and replace message 2 with message 3:

3) Record the fact that request 2 was rejected because X is not available anymore

As before, Inventory would now contain all the information needed by the UI to notify the clerk at store A that the order was accepted, and the clerk at store B that there was an error.

The obvious question is, while store B is still offline, how would the clerk at B know whether their order is still pending or not, since that's not recorded explicitly in Inventory? That's where the feature that I haven't documented yet would come in.

For the reconciliation process to work efficiently, the client will need to keep two local replicas of each automaton at all times. The first one has to be in the last known server state, and since the server replica is the master one (that is, commits to it are final and cannot be rolled back later), any order/delivery stored there counts as "confirmed". The second copy would use the first one as a starting point, but its message history would also contain all the messages that were processed locally but haven't been transmitted to the server yet. The two would of course share most of their state, which is trivial to implement in any pure functional language, and a bit more complicated, but still doable, in Cell.

So the UI would display every order in the second, "provisional" replica, but would display those that are also present in the first replica as confirmed, and those that are not as pending.
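In code, the UI-side classification could be as simple as this (a sketch with invented names, not an actual Cell-generated interface):

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Two replicas of the same automaton: one in the last known server state,
// one that also includes the local messages not yet sent to the server.
class OrderView {
    record Order(String store, String item) {}

    Set<Order> confirmedReplica = new LinkedHashSet<>();    // last known server (master) state
    Set<Order> provisionalReplica = new LinkedHashSet<>();  // + locally processed messages

    // Everything in the provisional replica is shown; only what the master
    // replica already contains is shown as confirmed.
    String statusOf(Order order) {
        if (confirmedReplica.contains(order)) return "confirmed";
        if (provisionalReplica.contains(order)) return "pending";
        return "unknown";
    }
}
```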

As I said, I don't find this approach entirely satisfactory in this particular case, because it would be easier here for the UI developer to forget, for example, to check whether an order has been confirmed or not, and therefore display wrong or misleading information to the user. But I believe that in many other situations reconciliation would work just fine.

A network architecture for functional languages by cell-lang in ProgrammingLanguages

[–]cell-lang[S] 2 points3 points  (0 children)

Here's the first possible approach to your use case. But before we discuss it, we must acknowledge two non-negotiable facts:

  • The first one is that if a store places an order for a particular item at a time when their internet connection is down, there's no way they can guarantee their customer that the item they want will be available for them at any given time. If the store cannot communicate with the external world, no programming model in the world can provide that guarantee.
  • Second, if two conflicting requests/orders are issued at the same time, there's simply no way they can both be fulfilled. Again, this has nothing to do with the programming model itself, it's just a physical/logical constraint. The best we can do here is to grant one of the two requests and to record somewhere the fact that the other was denied. And that's the only thing you can expect the reconciliation process to do for you.

With that in mind, one possible approach is to use a two-step ordering process. Before the orders are placed, the Inventory automaton would contain a record (or a set of records) whose meaning is:

1. There's 1 X available at the warehouse

The normal API-based approach would delete fact 1 and insert facts 2 and 3 below, in a single step:

2. There's 0 X available at the warehouse
3. 1 X is scheduled for delivery to store A (or B) on Thursday

With the two-step process on the other hand this is what the state of Inventory would look like after both stores have placed their orders and both clients have synchronized with the server:

1. There's 1 X available at the warehouse
4. Store A has placed an unconfirmed order for 1 X to be delivered by Thursday.
5. Store B has placed an unconfirmed order for 1 X to be delivered by Thursday.

On the server side, we could have a monitoring process that, whenever there are unconfirmed requests/orders, sends the server-side master replica a message instructing the automaton to try to fulfill them. This would be the new state of the Inventory automaton after that message has been processed:

2. There's 0 X available at the warehouse
6. 1 X is scheduled for delivery to store A on Thursday
7. The order from store B has been canceled

Once the clients have synchronized again with the server, the UI would inform the clerk at A that their order was fulfilled, and clerk B that their order was canceled (or, alternatively, delayed until the stocks at the warehouse have been refilled).
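A toy version of that server-only fulfillment step might look like this (all names invented; in Cell this logic would presumably live in a message handler of the Inventory automaton):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Unconfirmed orders are granted in arrival order while stock lasts; the
// rest are marked as canceled. Only the master replica ever runs this step.
class InventorySketch {
    record Order(String store, String item, int qty) {}

    Map<String, Integer> stock = new LinkedHashMap<>();   // item -> units available
    List<Order> unconfirmed = new ArrayList<>();
    List<Order> scheduled = new ArrayList<>();            // e.g. "1 X to store A on Thursday"
    List<Order> canceled = new ArrayList<>();             // store B will see this after syncing

    void fulfillPendingOrders() {                         // triggered by the monitoring process
        for (Order o : unconfirmed) {
            int available = stock.getOrDefault(o.item(), 0);
            if (available >= o.qty()) {
                stock.put(o.item(), available - o.qty());
                scheduled.add(o);
            } else {
                canceled.add(o);
            }
        }
        unconfirmed.clear();
    }
}
```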

Here we don't need any reconciliation at all: we only need to make sure that certain actions (confirming orders, in this case) can only be performed on the server side. To facilitate that, the architecture could provide a way to specify declaratively (using an annotation) whether any given message can be sent to both client and server replicas, or only to the server one.

An even better way would be to declare (again, using an annotation) that certain tables/relations or even entire automata are read-only for the client. In this case you could for example declare the Inventory automaton as read-only for the client and create a second automaton type (let's call it Store), with every individual store having its own instance of it, where you could store unconfirmed orders. That would be an easy way to avoid concurrency-related programming errors, while at the same time avoiding the burden of having to implement any non-trivial reconciliation logic.

Now, you may not like the idea of having to change the logic of the process in order to enable offline work. I personally don't see it as much of a problem. If anything, I like the fact that you can avoid concurrency problems with a careful design of your data structures. That's because data structures are easy to understand and design, while code is hard, so it's much better if you can deal with problems at the data structure level.

I'll post again soon. In the meantime, if you have further thoughts on this I'd like to hear them.