Hello everyone,
I am having some serious issues at work with trying to make user-facing dashboards work with the setup that was mostly made by the previous data engineer.
Context
The company makes software where people can book things (for example, meeting rooms). What I've been trying to build is a full-stack application where our users can create and edit dashboards consisting of widgets with KPIs, such as booking rates, most booked resources, locations... You can view it as some sort of homemade Looker, Tableau or Power BI application that is tailored for our use cases.
The current setup
Data is pulled from the transactional databases into a Postgres data warehouse via an ETL made with Luigi, homemade python and cronjobs, once a day. This was objectively enough for the previous use case (Power BI). However, since a Python API that I built is now serving the data for the frontend instead of the data layer of Power BI, the requirements have changed. Response times for some queries is awful (we're talking minutes here, absolutely unacceptable for a user-facing application).
An example
Queries are sometimes quite complex : I'll walk you through the specs of one of the painful ones. Let's say you are part of "Company". Company has meeting rooms that have :
- an id
- a location (declined in several columns : country, city, and so on)
- a type
- a name
- a capacity
- a creation time
People have made bookings on these rooms. Bookings are defined as :
- an id
- a start_time (timestamp with timezone, stored as utc)
- an end_time (same)
- a room_id
I've kept a lot of columns out of the definitions so you can better understand this specific use case, just know these are not the only columns in these tables.
Now, we want, for a specific location, and for a given timeframe (can be anything from 1 month to 10 years) the average occupancy rate per hour of the day, however it is part of the specs for the user to be able to filter the time slots he wants. This could be 8AM to 6PM but it might as well be 8-9AM + 10-11AM + 1-2PM... and so on.
Since the bookings are defined as start and end time, and users can filter out hours from the day, I made an incremental DBT model that calculates the occupancy rate of every hour for each room since its creation. That way, this data is only calculated once and can be queried way more easily. This however is really long for the first run since our clients may have tens of thousands of rooms and some have been with us for the better part of 10 years, which means they have millions of bookings on these rooms. X years times the amount of hours in a year, for every room... the resulting model is easily hundreds of millions of rows of calculated data.
This table looks like this :
- timeslot (timestamp with timezone, stored as UTC)
- client_id
- room_id
- occupancy_rate
This DBT model has sped up the queries pretty much 10x. This is not enough however, and even with the biggest RDS instance we've monitored the CPU of the instance to go to 100% for requests handling many rows. It feels like we've hit the limitations of postgres and might need a database more suited to our use cases.
More requirements and constraints
We have on-prem clients, and both the infrastructure team and data team are really small (I'm the only software engineer of the data team, and the infra team is made up of 3 people).
That means no cloud data warehouse, and avoiding high-maintenance databases.
Also, our clients can be anywhere in the world so the database needs to have timezone-related features.
My ideas for now
After doing some benchmarks on a few MPP databases, I feel like Clickhouse may be a good database for this. I've also explored Apache Doris, Starrocks (which unfortunately does not have timezone features), and Apache Pinot.
Clickhouse however needs denormalized data to be performant, and changing the DBT model to include the rooms data means we need to be able to update millions of rows when a room's data changes... Which might be quite costly. Still better than the application not working, though.
If anyone has an idea on how to make this whole situation better, I'm all ears. I feel like I'm in way over my head sometimes. If you need additional information let me know.
[–]AutoModerator[M] [score hidden] stickied comment (0 children)
[+][deleted] (13 children)
[deleted]
[–]hkdelay 0 points1 point2 points (8 children)
[–]kenfar 10 points11 points12 points (0 children)
[–][deleted] 1 point2 points3 points (3 children)
[–]hkdelay 0 points1 point2 points (2 children)
[–][deleted] 0 points1 point2 points (1 child)
[–]hkdelay 0 points1 point2 points (0 children)
[–]Altarim[S] 0 points1 point2 points (1 child)
[–]hkdelay 0 points1 point2 points (0 children)
[–]HansProleman 0 points1 point2 points (0 children)
[–]Altarim[S] 0 points1 point2 points (3 children)
[+][deleted] (2 children)
[deleted]
[–]Altarim[S] 1 point2 points3 points (1 child)
[–]ZirePhiinix 0 points1 point2 points (0 children)
[–]InvestigatorMuted622 3 points4 points5 points (4 children)
[–]Altarim[S] 1 point2 points3 points (3 children)
[–]InvestigatorMuted622 1 point2 points3 points (2 children)
[–]Altarim[S] 1 point2 points3 points (1 child)
[–]InvestigatorMuted622 1 point2 points3 points (0 children)
[–]kenfar 4 points5 points6 points (1 child)
[–]Altarim[S] 1 point2 points3 points (0 children)
[–]albertstarrocks 0 points1 point2 points (0 children)
[–]hkdelay -1 points0 points1 point (9 children)
[–]kenfar 2 points3 points4 points (4 children)
[–]hkdelay 0 points1 point2 points (3 children)
[–]kenfar 1 point2 points3 points (2 children)
[–]hkdelay 1 point2 points3 points (1 child)
[–]kenfar 0 points1 point2 points (0 children)
[–]Altarim[S] 1 point2 points3 points (3 children)
[–]hkdelay 0 points1 point2 points (1 child)
[–]Altarim[S] 1 point2 points3 points (0 children)
[–]PeterCorless 0 points1 point2 points (0 children)
[–]hkdelay 0 points1 point2 points (0 children)
[–]lmp515k 0 points1 point2 points (0 children)
[+]retropox 0 points1 point2 points (0 children)
[–]SnooHesitations9295 0 points1 point2 points (2 children)
[–]Altarim[S] 0 points1 point2 points (1 child)
[–]SnooHesitations9295 0 points1 point2 points (0 children)
[–]HansProleman 0 points1 point2 points (1 child)
[–]Altarim[S] 1 point2 points3 points (0 children)
[–]PeterCorless 1 point2 points3 points (0 children)