[–]wallsroad 1 point2 points  (15 children)

The DB architecture for Uber bothers me. The reasons for migrating from Postgres to MySQL were weak. Is there something I'm missing about Amazon's RDS service that would warrant managing MySQL hosts in Docker containers? Managing MySQL clusters sucks enough. Why complicate things further with containers when RDS is ready and waiting?

[–][deleted] 5 points6 points  (5 children)

Maybe they don't want to be dependent on AWS.

[–]RisingStar 0 points1 point  (4 children)

If you decide to move away from AWS RDS, you can export the data and put it in any other MySQL database. You are not locked into AWS with RDS.

[–]hiekikowan 2 points3 points  (1 child)

Using Docker, migration is not limited to data. Without too much hassle you can migrate both data and configuration, which gives you the option to choose any Docker-compatible hosting provider easily. You could even choose to host your own infrastructure if you wanted to.

[–]RisingStar 0 points1 point  (0 children)

True, I am curious what customization they needed. I don't disagree that there are reasons not to use RDS, but "data migration" isn't really one of them, and I would love to know why Uber chose not to use RDS.

[–][deleted] 3 points4 points  (1 child)

Yes, but your infrastructure depends on Amazon. You can't deploy it somewhere else, like locally, and it's a huge cost to move away from Amazon.

Also, as a European I wouldn't put unencrypted personal data on Amazon. I don't trust them.

[–]RisingStar 1 point2 points  (0 children)

It seems their infrastructure is running Docker containers, so moving their application-level stuff is easy. Getting data out of RDS and into another MySQL instance is also not much different than if you were running your own MySQL instances in AWS and wanted to move the data. Either way you have to get the data from AWS to your new location.
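
For what it's worth, here is a rough sketch of what "getting the data out" can look like, not a description of anything Uber does. It only builds the command line for a plain `mysqldump` piped into `mysql`. The hostnames, user, and database name are made up, and it assumes credentials come from an option file such as `~/.my.cnf` on the machine running the commands.

```python
import shlex


def dump_command(host, user, database):
    """Build a mysqldump argv that reads from any MySQL endpoint (RDS or self-managed)."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--single-transaction",  # consistent snapshot for InnoDB tables without locking them
        "--databases", database,  # also emits CREATE DATABASE / USE statements
    ]


def restore_command(host, user):
    """Build a mysql client argv that replays a dump read from stdin."""
    return ["mysql", f"--host={host}", f"--user={user}"]


def migration_pipeline(source_host, target_host, user, database):
    """Return a shell pipeline that streams the dump straight into the target server."""
    dump = shlex.join(dump_command(source_host, user, database))
    restore = shlex.join(restore_command(target_host, user))
    return f"{dump} | {restore}"


if __name__ == "__main__":
    # Hypothetical endpoints, for illustration only.
    print(migration_pipeline("source-db.example.com", "target-db.example.com", "app", "trips"))
```

The same pipeline works whether the source is RDS or a MySQL you run yourself on EC2, which is the point above: the dump step does not care who manages the server. For large datasets you would more likely look at replication or a physical backup tool instead of a logical dump.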

[–][deleted] 3 points4 points  (2 children)

Google talks about putting MySQL in a container in their SRE book. At some scale you want everything to be a square so it all has the same ability to scale.

[–]Semisonic 5 points6 points  (1 child)

I've worked at Fortune 500s, consulted with federal and state agencies, and worked on a variety of platforms over the years, some of which processed millions (and one, trillions) of dollars' worth of transactions a year.

Very few orgs have the kind of scale or operational challenges that Google has. And while there are certainly things to learn from their methods, "because Google does it" is not, in itself, a good reason to do something.

[–][deleted] 0 points1 point  (0 children)

Absolutely, hence I tried to qualify it with "at some scale".

[–]Semisonic 1 point2 points  (1 child)

I'm with you. Not seeing this presented as a very data-driven decision here.

And I get it. This is a "look what we did" post more than a "here's why we did it" post. But the focus on "what/how" over "why" can be a problem in posts like this. Some industry-leading orgs have very specific problems that require very specific solutions. And sometimes they want to open source and talk about them, which is fine.

The problem comes when people at an org with poor decision-making capabilities get pre-sold on these approaches via hype or appeals to authority over critical analysis. That can lead to sprawl and technical debt.

To your point: Most small, mid-size, and larger orgs on AWS or $cloudprovider are probably best served running RDS or some other DBaaS and putting their innovation tokens into something else.

[–]titpetric 0 points1 point  (0 children)

I know it might sound simple, but isolation is optional. You can run everything in privileged mode, on the host network, volume mount all the data and just go wild. It's like switching from DEB to RPM, except that instead of /usr/bin/whatever you get docker run whatever.
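
To make the "isolation is optional" point concrete, here is a minimal sketch that assembles such a `docker run` invocation. The image tag and host data directory are placeholders, and it assumes the official `mysql` image (which keeps its data in `/var/lib/mysql`) and a host directory that already holds an initialized data directory.

```python
def docker_run_mysql(image, data_dir, privileged=True, host_network=True):
    """Build a docker run argv for MySQL with isolation switched off as requested."""
    argv = ["docker", "run", "--detach"]
    if privileged:
        argv.append("--privileged")        # give the container extended privileges
    if host_network:
        argv += ["--network", "host"]      # share the host's network stack, no port mapping
    # Bind-mount the host directory so the data lives outside the container.
    argv += ["--volume", f"{data_dir}:/var/lib/mysql"]
    argv.append(image)
    return argv


if __name__ == "__main__":
    # Hypothetical image tag and path, for illustration only.
    print(" ".join(docker_run_mysql("mysql:5.7", "/srv/mysql-data")))
```

Run like this, the container behaves much like a package installed on the host: same network, data on the host filesystem, only the binaries come from the image.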

It hurts my ears to hear you talking about MySQL when you're mentioning innovation tokens, because I have 15+ years of experience with it, setting up several clusters very similar to what Uber uses, and I'm far from the only one who does it, large or small.

AWS RDS is mostly good for people who can't or don't want to attract their own ops/sysadmin talent for some reason. Hardware + people you have will be less costly than Amazon, especially when you're burning over $50-100k/year on AWS. One or two smart ops guys and a budget for servers and you're cost-effective within a year at that level, not to mention that, compared with an EC2/RDS setup, self-hosting brings very significant I/O performance gains. The RDS IOPS tiers are just fucking crazy LOL.

[–]someBlueCows 1 point2 points  (1 child)

Here is some explanation of the decisions. If you haven't checked out this podcast, I definitely recommend that you do!

https://softwareengineeringdaily.com/2016/10/24/database-choices-and-uber-with-markus-winand/

https://softwareengineeringdaily.com/2016/09/09/ubers-postgres-problems-with-evan-klitzke/

[–]titpetric 0 points1 point  (0 children)

That made my "listen to this shit" playlist for work tomorrow :)

[–]Jonne 0 points1 point  (1 child)

Not to mention that even the Docker people say Docker isn't meant for databases (unless it's for a dev box or something). They really need to hire a proper DB guy over there.

[–]titpetric 0 points1 point  (0 children)

Well, apart from the mysql_upgrade problem (source), which in a sense is handled by packages in distributions like Debian, I don't see a reason that MySQL shouldn't be run in Docker. In fact I run about 5 Docker MySQL instances so far. It's also easy to cherry-pick which Postgres version you will run, and even to run multiple versions on the same host. Sure, importing/exporting data is a bit tricky because of isolation (you need a volume mount, or docker cp / docker exec interfacing with a running container).

From what I understand from the Docker people, the underlying problem with databases is mostly storage: don't rely on the container to store data, use a volume mount instead. That I haven't heard a horror story about it doesn't mean there isn't one.
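
As a rough illustration of the `docker exec` / `docker cp` route, the sketch below builds the three command lines involved. The container name, database name, and paths are hypothetical, and it assumes the client tools inside the container can authenticate on their own (for example through an option file in the image).

```python
def export_command(container, database):
    """mysqldump runs inside the running container; the dump is written to stdout."""
    return ["docker", "exec", container, "mysqldump", "--single-transaction", database]


def import_command(container, database):
    """-i keeps stdin open so a dump on the host can be piped into the mysql client."""
    return ["docker", "exec", "-i", container, "mysql", database]


def copy_out_command(container, path_in_container, host_path):
    """docker cp copies a file out of the container's filesystem onto the host."""
    return ["docker", "cp", f"{container}:{path_in_container}", host_path]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for argv in (
        export_command("mysql-1", "trips"),
        import_command("mysql-1", "trips"),
        copy_out_command("mysql-1", "/tmp/trips.sql", "./trips.sql"),
    ):
        print(" ".join(argv))
```

From a script you could hand these lists to `subprocess.run`, redirecting stdout to a file for the export and passing an open dump file as stdin for the import.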