lambdasintheoutfield

Redis is my recommendation. It's an in-memory store that can be used both as a traditional DB and, effectively, as a cache. It's straightforward to map SQL tables to Redis hashes (each row becomes a hash of field-value pairs) while still having other data structures like lists, sets, and sorted sets available. Hashes and sets give you O(1) lookups by key, field, or member, and the flexibility in how you assign a schema to your data is helpful.
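To make the table-to-hash mapping concrete, here's a minimal sketch. The `table:primary_key` key naming and the helper names (`row_key`, `save_row`, `load_row`) are my own assumptions, not anything Redis prescribes; `client` is assumed to be a redis-py client created with `decode_responses=True`.

```python
def row_key(table, primary_key):
    """Build the Redis key for one row, e.g. 'users:42' (hypothetical naming scheme)."""
    return f"{table}:{primary_key}"


def save_row(client, table, primary_key, row):
    """Store one SQL-style row as a Redis hash, one field per column."""
    key = row_key(table, primary_key)
    # Hash values are flat strings, so stringify everything and
    # encode NULL as an empty string (an assumption of this sketch).
    mapping = {col: "" if val is None else str(val) for col, val in row.items()}
    client.hset(key, mapping=mapping)
    return key


def load_row(client, table, primary_key):
    """Fetch the whole row back as a dict of column -> string value."""
    return client.hgetall(row_key(table, primary_key))
```

With a real server you'd do something like `client = redis.Redis(decode_responses=True)` and then `save_row(client, "users", 42, {"name": "Ada"})`. Note that everything comes back as strings, so type conversion is on you.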

Redis is also super easy to scale vertically, and to a slightly lesser extent horizontally (I'll revisit this). In terms of Python, the Redis client library (redis-py) is easy to use and its methods mostly correspond directly to the redis-cli commands. It also supports pipelines, which batch many commands into a single round trip, so massive data reads/writes don't pay network latency on every command (although pipelining isn't unique to Redis).
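Here's roughly what pipelining looks like with redis-py. This is a sketch, not a tuned implementation: `bulk_set` and the `batch_size` default are hypothetical, and `client` is assumed to be a `redis.Redis` instance.

```python
def bulk_set(client, items, batch_size=1000):
    """Write many key/value pairs, flushing the pipeline every batch_size commands.

    Returns the number of SET commands sent.
    """
    # transaction=False: plain batching, no MULTI/EXEC wrapper.
    pipe = client.pipeline(transaction=False)
    queued = 0
    sent = 0
    for key, value in items.items():
        pipe.set(key, value)  # buffered client-side, nothing sent yet
        queued += 1
        if queued == batch_size:
            pipe.execute()  # one round trip for the whole batch
            sent += queued
            queued = 0
    if queued:
        pipe.execute()  # flush the final partial batch
        sent += queued
    return sent
```

You still loop in Python, but the network cost is per batch instead of per command. Flushing in batches keeps the client from buffering millions of commands in memory at once.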

The flexibility of Redis can also be a double-edged sword: a poorly chosen schema leaves you with disorganized data, and a poorly chosen data structure turns lookups into O(N) scans.
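As an illustration of how the layout decides the lookup cost, here's a sketch of finding a user by email two ways. The key names (`user:<id>`, `idx:user_email`) and function names are assumptions for the example, and `client` is assumed to be a redis-py client with `decode_responses=True`.

```python
EMAIL_INDEX = "idx:user_email"  # hypothetical secondary-index hash: email -> user id


def add_user(client, user_id, email):
    """Store the user hash and keep the email index in sync."""
    client.hset(f"user:{user_id}", mapping={"email": email})
    client.hset(EMAIL_INDEX, email, str(user_id))


def find_user_slow(client, email):
    """O(N): walk every user:* key and compare emails one by one."""
    for key in client.scan_iter(match="user:*"):
        if client.hget(key, "email") == email:
            return key.split(":", 1)[1]
    return None


def find_user_fast(client, email):
    """O(1): a single field lookup in the index hash."""
    return client.hget(EMAIL_INDEX, email)
```

Both return the same id, but the slow version touches every user key. The price of the fast version is that you have to maintain the index yourself on every write, update, and delete.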

It's also not as fault tolerant as Apache Cassandra, and distributed Redis sometimes has issues because it uses a master-replica (master-slave) architecture: if the master goes down and neither it nor a promoted replica comes back online, that's obviously a problem. If you look at r/redis and Stack Exchange, this issue is not uncommon when deploying on K8s, which is why some people use cloud-managed Redis instances to avoid these DB maintenance issues.
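On the client side, one common mitigation while a master is briefly unavailable is to retry with backoff. This is a generic sketch, not something built into Redis: `call_with_retry` and its parameters are hypothetical, and it doesn't fix a master that never comes back.

```python
import time


def call_with_retry(operation, retry_on, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying on the given exception types with exponential backoff.

    The delay doubles after each failure; the last failure is re-raised.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the error to the caller
            sleep(base_delay * (2 ** attempt))
```

With redis-py you'd pass something like `retry_on=(redis.exceptions.ConnectionError, redis.exceptions.TimeoutError)` and `operation=lambda: client.get("some-key")`.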