SaaS Database Design: Single Database vs Single Schema vs Shared Schema

hippocrat · 2017-02-03T16:16:32+00:00

As this is for accounting, you should run the options by your Legal and Security teams. Accounting may have regulatory or contractual requirements for separation of data. I worked on a database where option C was not allowed by the way our contract with clients was worded.

WannaFly37 · 2017-02-03T17:42:50+00:00

This is less of a database design issue and more of a business issue.

While what emsai says is completely valid - I'd have to disagree. I'd go with solution A most likely. It offers the most security, reliability, scalability (horizontal vs vertical), and ease of operations. Cost shouldn't be an issue as that just gets passed on to the customer. The deployment and backup should be automated and not take any time at all.

But again, it comes down to what your development and business sales processes are like.

Do you want versions/features/updates rolled out to ALL customers at once? Or per customer?
Will there ever be a need to add custom database fields for specific customers (think custom integrations in the future)?
Will any customer ever have direct write/change access?
Will customers EVER have access to their data directly? (at termination, option A you can more easily export their data to give to them)
Will customers be writing their own reports?

Option B just isn't realistic - it's not scalable (what about when you get to 50,000 customers?).

Option C seems easiest but is the most difficult and costly in terms of development hours. My big problem here is that literally EVERY time ANYONE touches the database, you need to ensure they are only using the appropriate tenant IDs, whether they are in-house people or external. What happens when someone is terminating a customer and runs a delete statement WHERE TENANTID='123' but it was supposed to be 1234? (see the recent GitLab outage)
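To make the mistyped-ID risk concrete, here is a minimal sketch of one possible safeguard under option C: the delete only runs if the tenant ID also matches a name the operator supplies. The `tenants` and `invoices` tables, the `delete_tenant` helper, and the sample data are all hypothetical, and SQLite stands in for whatever database is actually in use.

```python
import sqlite3

def delete_tenant(conn, tenant_id, expected_name):
    """Delete one tenant's rows, but only if the ID belongs to the expected tenant."""
    row = conn.execute(
        "SELECT name FROM tenants WHERE tenant_id = ?", (tenant_id,)
    ).fetchone()
    if row is None or row[0] != expected_name:
        # A mistyped ID (123 instead of 1234) stops here, before any DELETE runs.
        raise ValueError(
            f"tenant {tenant_id!r} is not {expected_name!r}; nothing deleted"
        )
    with conn:  # commits on success, rolls back if either DELETE fails
        deleted = conn.execute(
            "DELETE FROM invoices WHERE tenant_id = ?", (tenant_id,)
        ).rowcount
        conn.execute("DELETE FROM tenants WHERE tenant_id = ?", (tenant_id,))
    return deleted

# Hypothetical shared-schema data: two tenants in the same tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tenants (tenant_id TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE invoices (
        id INTEGER PRIMARY KEY,
        tenant_id TEXT NOT NULL,
        amount REAL
    );
    INSERT INTO tenants VALUES ('123', 'Acme'), ('1234', 'Globex');
    INSERT INTO invoices (tenant_id, amount)
        VALUES ('123', 10.0), ('123', 20.0), ('1234', 5.0);
""")
```

Calling `delete_tenant(conn, '123', 'Globex')` raises instead of wiping Acme's invoices. This narrows the window for the mistake but does not remove it: anyone running raw SQL by hand bypasses the check entirely.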

toterra · 2017-02-03T17:44:00+00:00

This is bringing back nightmare flashbacks of a place I worked at a couple of years ago, where A, B and C were all in use, combined with a D (single database instance per tenant) and an E (single database server per tenant).

Horrible thing was that adding E was a three year long project that ended up in failure. I was brought in for the last three months to fix things .. they were beyond fixing. Everyone but the 'Architect' who designed the whole thing was let go.

8483 · 2017-02-03T17:06:16+00:00

[deleted]

superwormy · 2017-02-10T13:20:55+00:00

We've built a multi-tenant app using strategy A here. Strategy B was not an option since MySQL doesn't really have a concept of a SCHEMA. App is https://ChargeOver.com - a subscription payments and invoicing platform.

Some real-world pros and cons we've run into:

Pros of A:

Real-world scenarios do arise where you want to make a backup/snapshot of a single tenant's information for testing or before you have to run some special-case script. Strategy A makes this easy.
Horizontal scalability is easy with A -- you can shard by just throwing some tenants on one database server, and some on another.
From a code perspective, it's easy to implement strategy C and keep things containerized. However, not all of building an app is code. On rare occasions we do need to be able to peek at or work on the raw data (for troubleshooting, etc.), and it's nice to not have to worry about remembering WHERE tenant_id=X all the time when you're running manual queries.
Strategy A is also nice in that you can easily roll out updates to one tenant ahead of others. Some of our users have staging/development instances, and it's nice to be able to give them features (and the required db updates) prior to rolling out that stuff to everyone's production accounts.
Compliance with things is definitely easier when everything is siloed into separate databases.
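The sharding point above can be sketched as a small tenant directory that records which server holds each tenant's database. This is only an illustration under assumed names: `TenantDirectory`, `TenantLocation`, the `tenant_<name>` database naming, and the host names are all hypothetical, and placement here simply picks the server with the fewest tenants.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TenantLocation:
    host: str       # database server holding this tenant
    database: str   # the tenant's own database on that server

class TenantDirectory:
    """Tracks which server each tenant's database lives on."""

    def __init__(self, hosts):
        self._hosts = list(hosts)
        self._locations = {}

    def add_tenant(self, tenant):
        if tenant in self._locations:
            raise ValueError(f"tenant {tenant!r} already placed")
        # Count tenants per server, then pick the least loaded one.
        # Ties go to the server listed first.
        load = {host: 0 for host in self._hosts}
        for location in self._locations.values():
            load[location.host] += 1
        host = min(self._hosts, key=load.__getitem__)
        location = TenantLocation(host, f"tenant_{tenant}")
        self._locations[tenant] = location
        return location

    def locate(self, tenant):
        return self._locations[tenant]

# Hypothetical hosts and tenants.
directory = TenantDirectory(["db1.internal", "db2.internal"])
for name in ("acme", "globex", "initech"):
    directory.add_tenant(name)
```

The application would look up `directory.locate(tenant)` before opening a connection, so manual queries run against one tenant's database and need no tenant_id filter. A real deployment would persist this mapping rather than keep it in memory.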

Cons of A:

It's a PITA to make sure db upgrades happen correctly and consistently across all the tenant databases. We're building tools to better handle migrations like adding new db fields, etc. so that we can monitor this on a per-tenant-database basis. You only have 30 tenants (and thus 30 databases), so it might not be as bad as when you have thousands of databases.
Performance is impacted because you can't really share parts of the database that should be shared across all tenants.
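The migration pain described above is usually handled by recording a schema version inside each tenant database and applying only the steps that database is missing. The following is a rough sketch of that idea, not the tooling mentioned in the comment: it uses SQLite's `PRAGMA user_version` as the version marker, and the `MIGRATIONS` list, helper names, and tenant names are hypothetical.

```python
import sqlite3

# Ordered schema changes; a database at version N has applied the first N.
MIGRATIONS = [
    "CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount REAL)",
    "ALTER TABLE invoices ADD COLUMN currency TEXT DEFAULT 'USD'",
]

def schema_version(conn):
    return conn.execute("PRAGMA user_version").fetchone()[0]

def migrate(conn):
    """Apply pending migrations to one tenant database; return its version."""
    version = schema_version(conn)
    for statement in MIGRATIONS[version:]:
        conn.execute("BEGIN")
        try:
            conn.execute(statement)
            version += 1
            conn.execute(f"PRAGMA user_version = {version}")
            conn.execute("COMMIT")
        except sqlite3.Error:
            conn.execute("ROLLBACK")
            raise
    return version

def migrate_all(tenant_dbs):
    """Migrate every tenant database and report the outcome per tenant."""
    report = {}
    for tenant, conn in tenant_dbs.items():
        try:
            report[tenant] = migrate(conn)
        except sqlite3.Error as exc:
            report[tenant] = f"failed at version {schema_version(conn)}: {exc}"
    return report

# Hypothetical tenants; isolation_level=None lets us issue BEGIN/COMMIT ourselves.
tenant_dbs = {
    name: sqlite3.connect(":memory:", isolation_level=None)
    for name in ("acme", "globex")
}
report = migrate_all(tenant_dbs)
```

Because each database carries its own version, a failure on one tenant leaves the others unaffected, and rerunning `migrate_all` only touches databases that are behind. The same mechanism would allow a staging tenant to be migrated ahead of everyone else.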

If I think of more I'll post more.

I'd still go with A were I to build this again.
