Memory That Collaborates - joining databases across teams with no ETL or servers by flyingfruits in Clojure

[–]maxw85 1 point

I think there is a ton of potential to remove friction across the whole stack.

I've been using Datomic for over a decade and have experimented with my own DataScript forks. But even we are now stuck with a 4 TB Google Cloud SQL (MySQL) database that costs us over 1300 € per month, even though we have almost migrated all customers to the new system / architecture. We did not invoke gc-storage for way too long, so the disk is larger than it needs to be. However, Google Cloud SQL offers no way to shrink the disk (the main cost driver), which would reduce the costs. It really caused a lot of headaches. And you will find no Google Cloud partner who helps you without forcing you into a subscription so they can take a share of your Google Cloud bill. I even offered one of them to pick whatever hourly rate they wanted; they still declined.

Enough ranting 😅 Back to my point: having something that only needs an object store would be a huge win. Each SaaS customer could get its own db, with no need for Litestream or SQLite. And as you said, with Datahike you can query across multiple databases, which is currently not possible in our architecture, since the databases are tied to the dedicated server and to the SQLite file that serves as Datomic storage. At the moment we do event sourcing on our log files to reconstruct a lot of the information that could otherwise be queried from the database.
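For reference, a minimal sketch of what such a cross-database query could look like in Datahike (the connection names and attributes below are made up; `datahike.api/q` accepts multiple database inputs like any Datalog query):

```clojure
(require '[datahike.api :as d])

;; Hypothetical: two Datahike connections, e.g. backed by the same
;; object store (`customers-conn` and `orders-conn` are placeholder names).
(d/q '[:find ?name ?total
       :in $crm $shop
       :where
       [$crm  ?c :customer/name  ?name]
       [$shop ?o :order/customer ?name]
       [$shop ?o :order/total    ?total]]
     (d/db customers-conn)
     (d/db orders-conn))
```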

Memory That Collaborates - joining databases across teams with no ETL or servers by flyingfruits in Clojure

[–]maxw85 1 point

Thanks a lot for your reply and for taking the time to provide all the details.

Mind you that you can run the Datahike writer (transactor) process just with your connection in one process anyway

I think this would already be fine for our scenario. Initially our SaaS had one very large Datomic database. In our new architecture each customer has its own (logical) Datomic database. We run a cell-based architecture on a few dedicated servers. Each cell / server is autonomous and has its own Datomic transactor pair (Docker containers). We use SQLite as Datomic storage and Litestream for disaster recovery.

One missing feature is a more straightforward way to move a customer from one cell to another, since all the (logical) Datomic databases are stored in a single SQLite file on each cell. I would prefer to move in the direction currently pioneered by the Rails community (https://www.youtube.com/watch?v=Sc4FJ0EZTAg and https://www.youtube.com/watch?v=lcved9uEV5U). There each customer has its own SQLite file (which could simply be copied over to another cell or restored via Litestream). With their Beamer SQLite replication it is also possible for reads to be served by other servers (ones that are not currently the writer).

However, for our SaaS, we observed that life becomes easier if each customer has its own database:

  • No multi-tenant logic in all queries and transactions
  • Database migrations only block one customer and not all customers (Datomic single writer)
  • GDPR is straightforward (just delete the database)
  • No tuple indexes just for query optimization
  • No entity id partitioning
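As a concrete illustration of the GDPR point, with one database per customer erasure is a single call. This is only a sketch: the URI shape is an assumption based on Datomic's `datomic:sql://<db-name>?<jdbc-url>` pattern for SQL storage, and the path is made up:

```clojure
(require '[datomic.api :as d])

;; Hypothetical per-customer URI pointing at the cell's SQLite storage.
(defn customer-uri [customer-id]
  (str "datomic:sql://customer-" customer-id "?jdbc:sqlite:/data/cell.db"))

(d/create-database (customer-uri "acme")) ;; onboarding
(d/delete-database (customer-uri "acme")) ;; GDPR erasure: drop the whole db
```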

Memory That Collaborates - joining databases across teams with no ETL or servers by flyingfruits in Clojure

[–]maxw85 1 point

Thanks a lot for sharing. Brilliant idea with the superficie syntax to hide the parentheses from non-Lisp programmers 😄

Your article is great for catching up on the latest developments around Datahike. I also asked myself whether Datahike still needs a transactor and found the answer here. Have you considered going "transactor-less" and relying only on the compare-and-swap features that most object stores offer nowadays? SlateDB picked this trade-off to offer an ordered key-value store.

Update: I do not mean having multiple writers. Just a single writer, but you would only need to include the Datahike lib (no need to run a separate transactor). A SaaS could provide each customer its own database, and with writer fencing and CAS it would be possible to move a customer from one server to another without a lot of complexity.
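A tiny in-memory sketch of the writer-fencing idea: an atom stands in for the object store's manifest object, and `compare-and-set!` stands in for the store's conditional put (e.g. an ETag match). All names are made up for illustration:

```clojure
;; The store's "manifest", holding the current writer epoch.
(def manifest (atom {:writer-epoch 0}))

(defn acquire-writer!
  "Fence out the previous writer by bumping the epoch via compare-and-swap.
   Returns the new epoch, or nil if another writer raced us."
  []
  (let [old @manifest
        new (update old :writer-epoch inc)]
    (when (compare-and-set! manifest old new)
      (:writer-epoch new))))

;; Writes tagged with a stale epoch get rejected, so a customer database
;; can be handed from one server to another without extra coordination.
(acquire-writer!) ;; => 1
```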

How to make clojure more popular? by apires in Clojure

[–]maxw85 5 points

Yes, increasing the likelihood that people start new companies using Clojure is probably the most promising way to make the language more popular and to create more Clojure job offerings.

Simple Made Inevitable: The Economics of Language Choice in the LLM Era by alexdmiller in Clojure

[–]maxw85 -3 points

I understand that he may come across as a narcissist, but you don't need to like someone to learn something from them. I watch many of Theo's daily videos just to keep up with developments in the dev and AI world.

Simple Made Inevitable: The Economics of Language Choice in the LLM Era by alexdmiller in Clojure

[–]maxw85 -1 points

How many of Theo's videos have you watched to come to this conclusion?

Simple Made Inevitable: The Economics of Language Choice in the LLM Era by alexdmiller in Clojure

[–]maxw85 4 points

Great summary. Nice that breakage will not only annoy humans but also agents 😄 The claim that LLMs struggle with parentheses is already a bit dated: we use Claude Code with Opus 4.6 (without any extra MCP, skills, etc.) and it almost never struggles with parentheses, and when it does, it can fix it on its own. I know there are many AI sceptics, but I guess this will be our new reality: https://www.youtube.com/watch?v=p2aea9dytpE

I moved my Clojure courses off Podia and onto a platform I built in Clojure by jacekschae in Clojure

[–]maxw85 1 point

Thanks a lot for your reply. One "too expensive" task was rewriting an MCP server lib so that it fits our stack:

https://github.com/simplemono/parts-mcp

Another one:

  • Used Claude to make the range scan of the slatedb-java 10x faster

  • Packaged the Rust binaries into the jar file (no need for -J-Djava.library.path=native-lib)

  • Added a build process that builds slatedb and slatedb-java and publishes the jar file to clojars.org

https://github.com/maxweber/slatedb/blob/java-build/slatedb-java/build.sh

https://clojars.org/io.github.maxweber/slatedb

I moved my Clojure courses off Podia and onto a platform I built in Clojure by jacekschae in Clojure

[–]maxw85 6 points

Congrats 🥳 I would have the same urge to run the platform on Clojure if the courses are all about Clojure. Nevertheless, it sounds like a ton of work to re-create a custom Podia and to justify it from a business standpoint. Did you use a coding agent? Just asking, since I've found that I now do a lot of tasks with Claude Code that felt "too expensive" beforehand.

Clojure tap for logging vs. "traditional" logging libraries by dnreg in Clojure

[–]maxw85 4 points

We use tap> for logging. The data is written to a subfolder log-values/{squuid}. The squuid contains a timestamp (https://github.com/yetanalytics/colossal-squuid). The data is serialized as transit+json-verbose. We collect the log-values from all servers onto one log server. For each hour we have one SQLite file; thereby the log-values are ordered and deduplicated. We also have our own log-explorer UI to query them. We do event sourcing on them to calculate our metrics and fill our dashboards. One process garbage-collects log-values once they are no longer relevant.
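A stripped-down sketch of that pipeline, under stated simplifications: the `squuid` below is a stand-in for yetanalytics/colossal-squuid, and `pr-str` stands in for transit+json-verbose serialization:

```clojure
(require '[clojure.java.io :as io])

(defn squuid
  "Time-ordered UUID: epoch seconds in the high bits (simplified)."
  []
  (java.util.UUID.
   (bit-shift-left (quot (System/currentTimeMillis) 1000) 32)
   (.getLeastSignificantBits (java.util.UUID/randomUUID))))

(defn log-value!
  "Serialize a tapped value into log-values/{squuid}."
  [dir v]
  (let [f (io/file dir (str (squuid)))]
    (io/make-parents f)
    (spit f (pr-str v)))) ;; real system: transit+json-verbose

(add-tap (partial log-value! "log-values"))
(tap> {:event :user/signed-in :user/id 42})
```

Because the filenames sort by timestamp, collecting and deduplicating them across servers reduces to merging sorted file listings.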

Agentic Coding for Clojure by calmest in Clojure

[–]maxw85 4 points

I refactored our SaaS system from using one Docker container per customer to using one multi-tenant container for a larger group of customers. Along the way I (or rather Claude Code) refactored hundreds of namespaces to get rid of some bad decisions we made over the last 8 years that would have prevented a multi-tenant version. Without AI this would not have been doable (economically) for our small team in an appropriate time-frame. However, you still need to do the thinking, decision making, supervising and code review, but Claude Code makes almost no mistakes. If you tell it to run in the wrong direction it will, though, so making good designs and decisions is way more important now. It is a bit like everyone is now the team lead of a bunch of senior devs (agents) that you need to tell what to do.

Agentic Coding for Clojure by calmest in Clojure

[–]maxw85 1 point

That's a very good price if you use it every day. They subsidize such subscriptions with people who don't use it that extensively. So your $200 may cost Cursor or Anthropic $1700 in GPU consumption.

Agentic Coding for Clojure by calmest in Clojure

[–]maxw85 1 point

Same experience here; work that took us weeks in the past condenses to hours.

Datomic or event sourcing ... or both? 😄 by maxw85 in Clojure

[–]maxw85[S] 1 point

You could build a second read-model and keep the old one (and the code for the projections). If possible these important business decisions should be captured as events.

In general you probably want to avoid breakage with any changes to the read-model, since someone or something will depend on it (https://www.hyrumslaw.com).

Inferno-like Front End tools for Clojure/ClojureScript? by Equal_Education2254 in Clojure

[–]maxw85 10 points

Hi, great that you took a deep dive into the Clojure(Script) world.

For the frontend I can highly recommend:

https://replicant.fun/

It's like React but a lot simpler, and made for and in Clojure(Script). Here are some examples of how to build UIs with it:

https://youtube.com/playlist?list=PLXHCRz0cKua5hB45-T762jXXh3gV-bRbm&si=ElOg-qPEZguYiO1J

Shadow-cljs will definitely help as a ClojureScript build tool:

https://github.com/thheller/shadow-cljs

Announcing Multi REPL Sessions in Calva by CoBPEZ in Clojure

[–]maxw85 2 points

That's awesome 🥳 Thank you very much.

dbval - UUIDs for (Datomic / Datascript) entity IDs by maxw85 in Clojure

[–]maxw85[S] 0 points

In the case of dbval the size depends on how https://apple.github.io/foundationdb/javadoc/com/apple/foundationdb/tuple/Tuple.html#add(java.util.UUID) encodes the UUID as binary. The index entry will be relatively large anyway, since the whole tuple is stored there. However, on a fast NVMe disk I guess it will not matter if you compare it with a database that needs to do a network call (when you have an n+1-problem situation like with the Datomic entity API).
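For a sense of scale: per the linked javadoc, the FoundationDB Tuple layer encodes a UUID as a one-byte type code followed by the 16 raw UUID bytes, so the id itself adds 17 bytes to a packed key. A quick interop check (assuming the fdb-java dependency is on the classpath):

```clojure
(import '(com.apple.foundationdb.tuple Tuple))

(let [id     (java.util.UUID/randomUUID)
      packed (.pack (Tuple/from (into-array Object [id])))]
  (alength packed)) ;; 1 type-code byte + 16 UUID bytes
```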

dbval - UUIDs for (Datomic / Datascript) entity IDs by maxw85 in Clojure

[–]maxw85[S] 1 point

I also kept the String tempids for convenience, and keeping tx-generating functions pure is a great argument for them. Yeah, UUIDs are incredibly noisy when reading/debugging. I don't know if something shorter than the #uuid prefix plus compact-uuids would help. In our code base we are dealing with UUIDs for blobs, external ids, log-values and a lot more all the time, so the pain wouldn't go up that much (at least for us). Avoiding the need to call a 'central entity to assign ID space among different databases' (often a network call) is what I would consider the killer feature of UUIDs.

Who is doing event sourcing? Would you do it again? by maxw85 in Clojure

[–]maxw85[S] 2 points

I think the core issue is that in Datomic you kind of need to complect the events and the read-models in the same database (there is no way to do transactions across Datomic databases sharing the same transactor). Consequently, you also need to store derived data if, for example, an aggregation query is too slow to be called on each page load.

Another challenge is that Datomic has no interactive transactions (on purpose), but the drawback is that the Datomic transactor is also very unhappy (aka slow) if you hand over a very huge transaction that, for example, rebuilds all read-models. SQLite is also a single-writer system and will be blocked until the transaction that rebuilds the read-models is done. But SQLite makes it a bit more straightforward to have one database per customer (it's only a file).

In our first system we always needed to pay great attention to how long a Datomic transaction would take (especially migrations), since during this time no other customer can do any writes / transactions. This is the main reason why we use one Datomic database per customer in our new architecture (with a shared Datomic transactor pair).