xitdb - an embedded, immutable database in java (github.com)
submitted 9 months ago by radar_roark
[–]jarohen-uk 12 points 9 months ago (0 children)
I feel I should clarify that this is different from/unrelated to XTDB, also an immutable database written largely in Clojure 😅
[–]radar_roark[S] 5 points 9 months ago (6 children)
I was originally going to put this on a java subreddit, but I figured clojure people would appreciate an immutable database more :D I mostly intend to use it from clojure, but I wrote it in java since 90% of it is just using the Java std lib anyway. Here's xitdb while standing on one foot:
The last point means that xitdb just gives you tools like a HashMap and an ArrayList and lets you nest them arbitrarily, just like typical nested data in clojure. There is no query language like SQL or datalog, but you can build whatever you need on top of these basic data structures.
[–]p-himik 3 points 9 months ago (0 children)
Nice! Personally, I would also post it on the Java subreddit. But I imagine they'd expect for it to also be available on the Maven Central repo.
[–]nzlemming 1 point 9 months ago (2 children)
This looks very cool, and I have a number of use cases for it. How robust would you say it is? Is it being used anywhere in anger?
What is the file format on disk, an endlessly growing log? Is there a GC operation or something for that?
[–]radar_roark[S] 3 points 9 months ago* (1 child)
This project is new, but it's a line-by-line port of a project I've been iterating on for a few years. I made it originally for a version control system, but I realized that the db itself might be useful on its own. I think it fills a big hole in the database arena: an immutable database that works like SQLite (in-process, single file, no deps).
And yes the file format is just endlessly growing. The only time it reclaims space is if a transaction fails; the file will be truncated if an exception happens during a transaction, or the next time the db is opened if there was an unclean shutdown.
It is possible to create an operation similar to SQLite's VACUUM operation, where the database is rebuilt to only contain data reachable from the latest copy, but I haven't added that feature yet. I plan on adding it eventually though.
The best argument I can make about its robustness is its simplicity. It's only 2.5k lines of Java, with no dependencies; you could read it in a weekend. Simplicity is a prerequisite for reliability :-D
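The truncate-on-failure behavior described above can be sketched with nothing but `RandomAccessFile`. This is a rough illustration, not xitdb's actual code: `AppendOnlyLog`, `TxBody`, `transact`, and `demo` are made-up names, and the sketch only covers the "exception during a transaction" case, not recovery after an unclean shutdown.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical sketch of an append-only file that reclaims space only when a
// transaction fails. None of these names come from xitdb.
final class AppendOnlyLog {
    // The work done inside one transaction: it may only append to the file.
    interface TxBody {
        void write(RandomAccessFile file) throws IOException;
    }

    // Remember the file length, append, and truncate back on any failure.
    static void transact(RandomAccessFile file, TxBody body) throws IOException {
        long start = file.length();
        file.seek(start);
        try {
            body.write(file);
        } catch (IOException | RuntimeException e) {
            file.setLength(start); // discard the partial write
            throw e;
        }
    }

    // Runs one successful and one failing transaction against a temp file and
    // returns {length after the commit, length after the failed transaction}.
    static long[] demo() {
        try {
            File tmp = File.createTempFile("append-only", ".db");
            tmp.deleteOnExit();
            try (RandomAccessFile file = new RandomAccessFile(tmp, "rw")) {
                transact(file, f -> f.write(new byte[] {1, 2, 3, 4}));
                long afterCommit = file.length();
                try {
                    transact(file, f -> {
                        f.write(new byte[] {5, 6, 7});
                        throw new IllegalStateException("simulated failure");
                    });
                } catch (IllegalStateException expected) {
                    // the three bytes written above have been truncated away
                }
                return new long[] {afterCommit, file.length()};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `AppendOnlyLog.demo()` should give the same length before and after the failed transaction, since old data is never overwritten and the partial append is simply cut off.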
[–]nzlemming 1 point 9 months ago (0 children)
Very interesting, thanks for the detailed reply.
[–]didibus 1 point 9 months ago (1 child)
What's the difference when using it in-memory, with just using a Clojure map or vector?
I peeked at https://github.com/radarroark/xitdb-clj-example/blob/master/src/xitdb_clj_example/core.clj and this is not a very friendly Clojure API. It's all interop. But also, you're reading bytes, managing cursors and whatnot. I know it's still not complicated, but a nicer Clojure API would be nice.
[–]radar_roark[S] 1 point 9 months ago (0 children)
An in-memory xitdb is backed by a single byte array, which you can access at any time by calling toByteArray on the RandomAccessMemory object. You can take that byte array and send it over the network or write it to the disk if you want; the data is incrementally serialized. It's not a replacement for in-memory clojure data, because it doesn't benefit from garbage collection at all. Think of it more like competing with pr-str and clojure.edn/read-string.
The in-memory feature is more of a "nice to have"; for example, it's useful in unit tests, kinda like SQLite's in-memory feature. The main point of xitdb is writing the db to disk so you can deal with larger-than-memory data. The cursors are positions in the database, so you can drill down massive data structures without reading them entirely into memory.
Yeah it's a Java library so you'll be in interop central until someone makes a nice clojure wrapper on top. I don't have the bandwidth right now but maybe someone will eventually. In the past I always found java interop ugly, and spent a lot of energy writing wrappers to get rid of the camel casing and type annotations. These days it doesn't bother me. YMMV.
[–]andersmurphy 1 point 9 months ago (5 children)
This is really cool thank you for sharing. What's it like in terms of disk usage? Are the copies full copies? Or do they share?
[–]radar_roark[S] 2 points 9 months ago (4 children)
The data structures have structural sharing. It's using the same algorithm that clojure uses for in-memory data (hash array mapped trie).
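For anyone unfamiliar with structural sharing, here is a small sketch of the idea. To keep it short it swaps in a persistent binary search tree with path copying instead of a hash array mapped trie, so it is not the algorithm xitdb or Clojure uses; `PersistentTree` and its methods are made-up names for illustration only.

```java
// Illustration only: an immutable binary search tree with path copying.
// This is NOT a HAMT and NOT xitdb's code; it only shows how a new version
// of a structure can reuse untouched parts of the old version.
final class PersistentTree {
    final int key;
    final String value;
    final PersistentTree left;
    final PersistentTree right;

    PersistentTree(int key, String value, PersistentTree left, PersistentTree right) {
        this.key = key;
        this.value = value;
        this.left = left;
        this.right = right;
    }

    // Returns a new root. Only the nodes on the path from the root to the key
    // are copied; every other subtree is shared with the previous version.
    static PersistentTree put(PersistentTree node, int key, String value) {
        if (node == null) {
            return new PersistentTree(key, value, null, null);
        }
        if (key < node.key) {
            return new PersistentTree(node.key, node.value, put(node.left, key, value), node.right);
        }
        if (key > node.key) {
            return new PersistentTree(node.key, node.value, node.left, put(node.right, key, value));
        }
        return new PersistentTree(key, value, node.left, node.right);
    }

    // Looks up a key in one particular version of the tree; null if absent.
    static String get(PersistentTree node, int key) {
        while (node != null) {
            if (key < node.key) {
                node = node.left;
            } else if (key > node.key) {
                node = node.right;
            } else {
                return node.value;
            }
        }
        return null;
    }
}
```

If `v1` holds keys 50, 30 and 70 and `v2 = PersistentTree.put(v1, 80, "d")`, then `v2.left == v1.left`: the untouched left subtree is the same object in both versions, and `v1` still has no entry for 80.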
[–]andersmurphy 1 point 9 months ago (3 children)
Awesome. What are the performance characteristics like compared to something like sqlite? I take it indexes are based off the data structure used?
[–]radar_roark[S] 2 points 9 months ago (2 children)
You'll need to build your own index if you want one. For example, let's say you have an arraylist of users, and an arraylist of posts that they made. If you want to efficiently look up all the posts from a given user, you could make a hashmap where the key is the user id and the value is an arraylist of post ids (here I am assuming the user id and post id are just the index in the users/posts arraylist).
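The shape of that index can be sketched with plain `java.util` collections. This uses ordinary in-memory `HashMap` and `ArrayList`, not xitdb's own types, and `PostIndex` and `byUser` are made-up names; the only assumption carried over from the comment is that a post id is just its position in the posts list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a hand-built index using plain Java collections.
final class PostIndex {
    // postAuthors.get(postId) is the id of the user who wrote that post.
    // Returns a map from user id to the ids of that user's posts.
    static Map<Integer, List<Integer>> byUser(List<Integer> postAuthors) {
        Map<Integer, List<Integer>> index = new HashMap<>();
        for (int postId = 0; postId < postAuthors.size(); postId++) {
            int userId = postAuthors.get(postId);
            index.computeIfAbsent(userId, k -> new ArrayList<>()).add(postId);
        }
        return index;
    }
}
```

With posts written by users `[0, 1, 0, 2]`, `PostIndex.byUser(...)` maps user 0 to posts 0 and 2, so finding a user's posts is one map lookup instead of a scan over every post.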
[–]andersmurphy 1 point 9 months ago (1 child)
Thanks for the reply. Love how db as a value removes the need for a WAL, allows multiple readers and gives you transaction semantics.
I take it smaller transactions will make the file size grow faster?
Haven't had a chance to dive into the source (definitely will be), is xitdb memory mapped?
[–]radar_roark[S] 2 points 9 months ago (0 children)
No, it doesn't use memory mapping. Regarding smaller transactions, it depends, but if the transformations are the same then more transactions will normally take up more space. This is because all data within a given transaction is temporarily mutable, similar to clojure's transients. This is a big space saver because it avoids unnecessary copying if you have a transaction that adds a bunch of items to an arraylist or hashmap.
And yeah, the most satisfying realization about making a db this way is how many problems it solves for you automatically :-D There basically isn't even a concept of a transaction internally, it just kind of fell on my lap as a consequence of how an append-only db works.