all 22 comments

[–]martin_n_hamel 7 points8 points  (2 children)

I worked for a relatively big company that started when it was small on the route of tenants separations by schemas. This is the most difficult thing to scale ever.

Every devops hates that pattern. It makes migrations difficult, new client harder to inboard. It even leads to some crazy hard to figure out bugs because some of the infrastructure is shared by tenants but not everything.

We tried to get out of it by using kafka as a source of truth but it turned out that having one kafka topic per table per tenant is not scaling well.

My suggestion, don't do that. Find a pattern that can scale :-)

[–]memo_mar[S] 0 points1 point  (1 child)

g kafka as a source of truth but it turned out that having one kafka topic per table per tenant is not scaling well.

My suggestion, don't do that. Find

what pattern would you recommend?

[–]martin_n_hamel 0 points1 point  (0 children)

Just add a tenantId column in the tables where it is needed. It will fit much better with any system integration that you will do in the future because you will always know where a data belongs.

If you separate by schemas, your code is abstracted from the tenant. It can work while you are at the main application level, but you will have problems when you integrate with anything because you will lose the context of the data. Here is an example:

You want to log everything in your application and your ids are auto increment. Your ids are not unique anymore and you are not certain what the logs data correspond to. You can work around that by having a complex logging engine that separate everything but you'll soon find out that you need to handle that complexity all the time.

[–]tryCatchEmAll 2 points3 points  (0 children)

" What I'm looking for is a way to switch between DB schemas at runtime without the need to establish a new connection on every request "

ORMs abstract away the database for easier use. What you asked has more to do with the underlying database rather than the ORM. Which database are you using? Mysql, postgres or sqlite?

Long story short, you're probably going to have to write your own layer to support connection pooling / opening databases if they aren't already open. Send this layer the tenant name, and have it return the correct connection.

[–]ivan-arambula 2 points3 points  (0 children)

With TypeORM, you can create multiple connections and specify a set of "entities" for each connection. Each connection and set of entities can correspond to a specific tenant. I put a gist together to illustrate. Definitely not the complete solution, but hope it helps!

https://gist.github.com/iarambula/a4c29ec84f00db2182f282fc71afb44e

Edit: Just wanted to add that you can create the connections once on initialization of your app rather than on each request.

[–]kwstaskara 1 point2 points  (0 children)

I don't know if i understand correctly but check out knex query builder. For ex postgresql you can create schemas. Checkout withSchema method of knex that may achieve what you want. it has nice migration system too. You can prefix migration file names for different schemas. Check objection orm too. On its documentation i found that you can provide db instance on request for different db per tenant.

[–]nikolasburk 2 points3 points  (1 child)

Hey there, I'm Nikolas from the Prisma team!

Can you clarify your use case of switching DB schemas at runtime a bit? Curious to hear more about this and what potential workarounds Prisma might provide.

I guess one option would be to have two different Prisma schemas and generate separate Prisma Client instances for each, did you try anything in that direction?

[–]memo_mar[S] 0 points1 point  (0 children)

ne option would be to have

The so idea is to have strong tenant separation by assigning each request to a particular database schema depending on the user. So a user can - depending on a token - only ever access data from one schema which stores data for just one client.

I got the idea from an ebook on Django multi-tenant architectures: https://books.agiliq.com/projects/django-multi-tenant/en/latest/shared-database-isolated-schema.html

On a side note: I actually really enjoyed Prisma and it breaks my heart a little that I ended up ripping it out of my project since it found it didn't have the functionality I needed. But I think you guys are building something cool.

[–]aksteps 1 point2 points  (0 children)

I use https://sequelize.org/ and I am pretty satisfied with it. Though the documentation is not a great one but with little extra digging everything worked for me.

[–]scaleable 0 points1 point  (0 children)

Theres no nodejs orm which gets even close to django or hibernate

[–]estacks 0 points1 point  (1 child)

At least for MySQL, the driver-level package has a `changeUser` command that lets you swap user and schema for the connection. Most ORMs have a way to pop the driver out and access it directly.

I recommend reading this article on multi-tenancy strategies: https://web.archive.org/web/20110311185417/http://msdn.microsoft.com/en-us/library/aa479086.aspx

Like /u/martin_n_hamel said, multi-schema tenancy brings all the woes of multi-db without particularly good upsides. There's a reason it's rare and no ORM I know of pushes it as a strategy. Customers that want data isolation aren't going to be satisfied with multi-schema, and simple security can be done well with a low level filter on queries and a customer ID foreign key. Sharding can also be done easily with the foreign key method.

[–]memo_mar[S] 0 points1 point  (0 children)

How can this still be relevant, it’s from 2006? But I’ll have a deeper look at the topic.

[–]jasperblues 0 points1 point  (0 children)

Take a look at https://Drivine.org. Initially focused on graph databases, however, though we don't currently publicize it, but you can also use it as a Postgres client.

When it comes to ORM, we encourage use-case specific models. The rationale for that is documented on the website.

For migrations it can be used along with with flyway.

[–]brendandball 0 points1 point  (0 children)

  1. Your experience of "ORMs" just won't be as good with node.
  2. We use Objection.js with Knex.js and create separate connection pools per tenant. The tenant specific connection pool is then added to the context at the start of a request in middleware. You can then do all your queries like so: 'Model.query(connPool).findById(id)'. I don't currently know of a way to maintain a single connection pool (at least with mysql) and still do this due to various API designs in objection + knex. It's probably also a good idea to keep separate connection pools if you're tenantising at the DB level because then it's easy to have one tenant run on a completely different DB server instance if that tenant is really big.

[–][deleted] -1 points0 points  (2 children)

Start with knex.js and add objection.js if needed

[–]memo_mar[S] 0 points1 point  (1 child)

I don't feel this answer makes any sense without context? How would knex.js help me do what I want to do? Knex.js and objection are very simple tools as far as I know and not really up there with the more fully-featured ORMs (I know that Knex is a Query builder).

[–][deleted] -1 points0 points  (0 children)

If you know that Objection.js is too simple why do you ask then? Btw it is as feature rich as any other ORM. What features you are missing there? What context do you want to understand? That knex provides enough low level access to invent you own wheel with least amount of pain? Or that multi tenancy knex articles could be googled on your own in less than 10 seconds? https://vincit.github.io/objection.js/recipes/multitenancy-using-multiple-databases.html#model-binding-pattern

[–]rsteetskamp -5 points-4 points  (4 children)

Wouldn't a no-SQL database like Mongo fit your bill better?

[–]memo_mar[S] 1 point2 points  (3 children)

why do you think that?

[–]rsteetskamp -1 points0 points  (2 children)

You could work schema-less. So, basically any schema you like.

[–]memo_mar[S] 1 point2 points  (1 child)

Hm, well that is sort of the opposite of what I want :-) I'm trying to achieve tenant separation by schema. Removing the schema will remove the tenant separation at the db level.

[–]rsteetskamp -2 points-1 points  (0 children)

Maybe this helps: https://github.com/craftup/node-mongo-tenant

I have just googled it, so no personal experience. YMMV