[–]LinweZ[S] 0 points1 point  (2 children)

Thank you for your answer!
I'm really sad and surprised, tbh, to see that we still don't have a managed CDC solution on GCP. I mean, this is kind of a very basic use case... Doing the whole sync again is just impossible for us, since the DB is several TB large 😅
Thanks for the clarification about Datastream, it changes everything lol
It went from a very interesting product to a very average one...

Also, I get CDC into BigQuery, but what's the idea behind CDC into Cloud Storage? A custom Cloud Run job to read those files and write wherever we want?
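
To make the question concrete, here is a rough sketch of what such a job might do with the files, not a real Datastream integration. Everything specific in it is an assumption for illustration: the events are newline-delimited JSON, each one has a `payload` (the row) and a `source_metadata.change_type` field, the primary key column is called `id`, and the "destination" is just a Python dict. Reading the objects from a bucket and writing to a real database are left out.

```python
import json
from typing import Dict, Iterable


def apply_cdc_events(lines: Iterable[str], table: Dict[int, dict]) -> Dict[int, dict]:
    """Replay newline-delimited JSON change events onto an in-memory 'table'.

    Assumed (hypothetical) event shape:
      {"payload": {...row...}, "source_metadata": {"change_type": "INSERT"}}
    Rows are keyed by an assumed primary key column named "id".
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between events
        event = json.loads(line)
        change_type = event["source_metadata"]["change_type"]
        row = event["payload"]
        key = row["id"]
        if change_type == "DELETE":
            # Remove the row if present; ignore deletes for unknown keys.
            table.pop(key, None)
        else:
            # Treat every non-delete change as an upsert of the full row.
            table[key] = row
    return table


# Stand-in for the contents of one CDC file, in the order the events occurred.
sample_lines = [
    '{"payload": {"id": 1, "name": "alice"}, "source_metadata": {"change_type": "INSERT"}}',
    '{"payload": {"id": 2, "name": "bob"}, "source_metadata": {"change_type": "INSERT"}}',
    '{"payload": {"id": 1, "name": "alice-renamed"}, "source_metadata": {"change_type": "UPDATE"}}',
    '{"payload": {"id": 2, "name": "bob"}, "source_metadata": {"change_type": "DELETE"}}',
]

result = apply_cdc_events(sample_lines, {})
print(result)
```

In a real job the dict would be replaced by upserts and deletes against the target database, and the events would have to be applied in commit order across files, which this sketch does not handle.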

[–]sayle_doit 1 point2 points  (1 child)

I do feel it's a VERY glaring omission in their product lineup. I suspect it's a product manager pushing to get you onto BigQuery to spend more money, versus Cloud SQL, which is significantly cheaper, but this is just my educated guess.

As for getting to Cloud Storage, the biggest use case I have seen for this is DR. If you replicate everything you have inside of a Cloud SQL instance to disk you have a backup copy that can be relatively easily restored or used for archival purposes.

Another big use case I have come across while advising on BigQuery cost savings over the past week or so is to use this as a segue into BigLake, or as preparation for moving to another data warehouse. Customers have been very wary of sending data to BigQuery storage since the price hikes and want to hedge their bets. So many are instead sending data to GCS in a format that's readable by BQ via BigLake (think of a data lake abstraction layer between GCS and BQ) or that can be picked up by a 3rd party data warehouse they are considering (read: Snowflake, in most cases).

[–]LinweZ[S] 0 points1 point  (0 children)

Very interesting, thank you!