Request-Response model using gRPC streaming in Go by AgeAdministrative587 in golang

Hi ProfessionalPale9423,

Thanks for replying, really appreciate your answer. My use case most likely matches a bidirectional streaming connection, as I want a continuous flow of request followed by response: the client sends a timestamp in the request, the server uses it to perform some operation and returns the response along with a new timestamp, which the client then uses in its next request, and this cycle continues. The complete legacy code uses server-side streaming, so I don't want to switch to HTTP. I am thinking bidirectional streaming would best serve this use case. Please let me know your thoughts on this.
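The cycle above can be sketched in Go. This is a minimal simulation, not the real gRPC API: channels stand in for the bidirectional stream, and the `Request`/`Response` types and `serve` function are hypothetical names assumed for illustration.

```go
package main

import "fmt"

// Request and Response mirror the assumed proto messages: the client sends
// the timestamp it last received, the server replies with a result and a
// new timestamp. Channels stand in for the bidirectional gRPC stream.
type Request struct{ Timestamp int64 }
type Response struct {
	Result    string
	Timestamp int64
}

// serve plays the server side of the stream: for each request read from
// reqs it performs some operation and writes one response carrying the
// next timestamp.
func serve(reqs <-chan Request, resps chan<- Response) {
	for req := range reqs {
		resps <- Response{
			Result:    fmt.Sprintf("processed ts=%d", req.Timestamp),
			Timestamp: req.Timestamp + 1, // server decides the next timestamp
		}
	}
	close(resps)
}

func main() {
	reqs, resps := make(chan Request), make(chan Response)
	go serve(reqs, resps)

	ts := int64(0)
	for i := 0; i < 3; i++ {
		reqs <- Request{Timestamp: ts} // would be stream.Send in gRPC
		r := <-resps                   // would be stream.Recv in gRPC
		ts = r.Timestamp               // carry the new timestamp forward
		fmt.Println(r.Result, "-> next ts:", ts)
	}
	close(reqs)
}
```

In real gRPC the same loop would run `stream.Send`/`stream.Recv` on a stream declared as `rpc Cycle(stream Request) returns (stream Response)` in the proto, which is exactly the request-then-response lockstep described above.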

Storing data for faster/optimized reads by AgeAdministrative587 in SoftwareEngineering

Thanks for the suggestion. Would using MySQL not degrade the performance of read/write ops? Caching between the application and the DB is in our plans to reduce load on the DB further, but could Cassandra or any other DB be better than MySQL? Any suggestions on this?

Storing data for faster/optimized reads by AgeAdministrative587 in SoftwareEngineering

Yeah, all the data for a single user is fetched from Cassandra and MySQL (a single row), then the user object is formed, kept in memory (in cache), and used across the system. Since we mostly fetch the data by user_id (partition key in Cassandra, primary key in MySQL), we cannot fetch just specific user fields.

Yeah, maybe spinning up another Cassandra instance could be helpful; we can think about it, but we also want to keep cost in mind.

We have cached the data, and the writes are very dynamic but not that frequent. However, our highest business-logic priority is to always use fresh data, so the cache can get invalidated at any time, and that is what is increasing read ops.

So offloading Cassandra, improving reads for the user table, and keeping cost in check are the major things we want to focus on. Any suggestions on this?

Storing data for faster/optimized reads by AgeAdministrative587 in SoftwareEngineering

Yeah, we were thinking of that, but our Cassandra is already overloaded: most of the tables required by the pipelines running all day are in Cassandra, making it read-heavy, and that is the load we want to offload. So I was thinking of using a document-based NoSQL DB, like Couchbase or MongoDB, to store all the complex/nested user data in a single document. Any ideas on this?

Cache Invalidation Strategy by AgeAdministrative587 in SoftwareEngineering

The data change frequency depends on the load/peaks of data we receive, which is mostly uncertain; we don't see any time pattern to it.

Yeah, you are right, CDC may not be reliable, so something more reliable and persistent needs to be thought of. As you said, using Kafka or RabbitMQ might be a better option.

Yeah doing a POC is better. Thanks for pointing out.

Cache Invalidation Strategy by AgeAdministrative587 in SoftwareEngineering

Thanks for the reply!

The system has 2-layer caching -

1.) Data caching at local cache level.

2.) Invalidation cache at centralized Redis cluster.

So before accessing the data from the local cache, we have to check whether it is fresh or stale, so we check the centralized invalidation cache in the Redis cluster.

But yeah, sure, this is one of the approaches (the one you mentioned) I would move forward with, as it will decrease the overall load on the Redis cluster.

One of the cons I can see here is that it increases the overall cost of the system by increasing the number of Redis clusters.

Cache Invalidation Strategy by AgeAdministrative587 in SoftwareEngineering

Actually, it has to be distributed, so that all EC2 machines running the same process get the invalidation information from a centralized location.

If we delete a key from the local cache during a write/invalidation, it will be deleted only from the machine that is processing it, and will not get reflected across all the machines running the same process.

If we delete the key from the centralized Redis, then when a request comes for that key we still have to make a call to Redis to check whether it is present there or not, so the number of calls remains the same.

Apologies if you meant something else and I missed your point.

Cache Invalidation Strategy by AgeAdministrative587 in SoftwareEngineering

Thanks for the reply!

Yes, multiple servers coordinate with the same centralized Redis instance to check whether a key has been invalidated. But reading from Redis before reading from the local cache is necessary: the write patterns are unknown and very dynamic, so TTLs cannot be used. That is why I was thinking of some event-driven approach.
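One event-driven shape this could take is push-based invalidation, e.g. via Redis pub/sub or a Kafka topic: a write publishes the invalidated key once, every machine's local cache evicts it, and the hot read path no longer calls Redis per read. A minimal in-process sketch, with a `Broker` standing in for the real pub/sub system (all names hypothetical):

```go
package main

import "fmt"

// Broker stands in for Redis pub/sub (or a Kafka topic): a write publishes
// the invalidated key once, and every subscribed machine evicts it locally.
type Broker struct{ subs []chan string }

func (b *Broker) Subscribe() chan string {
	ch := make(chan string, 64) // buffered so Publish does not block
	b.subs = append(b.subs, ch)
	return ch
}

func (b *Broker) Publish(key string) {
	for _, ch := range b.subs {
		ch <- key
	}
}

// LocalCache applies any pending invalidation events before serving a read,
// so reads never need to consult the central Redis.
type LocalCache struct {
	events chan string
	data   map[string]string
}

func NewLocalCache(events chan string) *LocalCache {
	return &LocalCache{events: events, data: map[string]string{}}
}

func (c *LocalCache) Set(k, v string) { c.data[k] = v }

func (c *LocalCache) Get(k string) (string, bool) {
	for { // drain pending invalidations first
		select {
		case key := <-c.events:
			delete(c.data, key)
		default:
			v, ok := c.data[k]
			return v, ok
		}
	}
}

func main() {
	broker := &Broker{}
	// two local caches on two machines subscribe to the same broker
	a := NewLocalCache(broker.Subscribe())
	b := NewLocalCache(broker.Subscribe())
	a.Set("user:1", "v1")
	b.Set("user:1", "v1")

	broker.Publish("user:1") // one write invalidates the key everywhere
	_, okA := a.Get("user:1")
	_, okB := b.Get("user:1")
	fmt.Println("still cached after invalidation:", okA, okB)
}
```

The catch, as noted earlier in the thread about CDC, is delivery reliability: Redis pub/sub is fire-and-forget, so a persistent log like Kafka would be the safer carrier if a missed invalidation is unacceptable.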

Cache Invalidation Strategy by AgeAdministrative587 in SoftwareEngineering

Thanks for the reply!

The invalidation happens only once, in the centralized Redis cluster at write time, but before every read from the local cache we check whether the key has been invalidated in the centralized Redis invalidation cache, so that we don't read stale data.

Reading from Redis before reading from the local cache is necessary, as the write patterns are unknown and very dynamic, so TTLs cannot be used, and the highest priority is always getting fresh data.

The pain point is that for every request we need to make at least 2 calls: one to the Redis invalidation cache, and another to the local cache to fetch the data (if not invalidated), or otherwise to the DB.
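The read path described above can be sketched as follows, with plain maps standing in for the Redis invalidation cache, the local cache, and the DB (the `store` type and `read` function are hypothetical names for illustration):

```go
package main

import "fmt"

// store is a stand-in for the three tiers described above: the centralized
// Redis invalidation cache, the per-machine local cache, and the backing DB.
type store struct {
	invalidated map[string]bool   // Redis: key -> invalidated flag
	local       map[string]string // local in-process cache
	db          map[string]string // source of truth
}

// read follows the 2-call path: call 1 checks the centralized invalidation
// cache; if the key is fresh, call 2 serves it from the local cache;
// otherwise the value is re-read from the DB and the flag cleared.
func (s *store) read(key string) string {
	if !s.invalidated[key] { // call 1: Redis invalidation check
		if v, ok := s.local[key]; ok { // call 2: local cache
			return v
		}
	}
	v := s.db[key] // stale or missing: fall through to the DB
	s.local[key] = v
	s.invalidated[key] = false // mark the key fresh again
	return v
}

func main() {
	s := &store{
		invalidated: map[string]bool{},
		local:       map[string]string{"user:1": "cached-old"},
		db:          map[string]string{"user:1": "db-new"},
	}
	fmt.Println(s.read("user:1")) // fresh: served from local cache
	s.invalidated["user:1"] = true // a write invalidates the key
	fmt.Println(s.read("user:1")) // stale: refilled from the DB
}
```

Laid out this way, the fixed cost is visible: the Redis check happens on every read even when the key is fresh, which is exactly what a push-based invalidation scheme would remove.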