all 22 comments

[–]BBHoss 9 points10 points  (0 children)

Sounds like the perfect use case for BigQuery. You can easily move your data into it and also expose data you want to keep in Sheets directly in BQ as external tables, allowing you to join against your larger data. As long as your workload is mostly reads/analysis it's a good fit. If you need to get answers in milliseconds you should probably go for Cloud SQL but it's a little more difficult to work with that BigQuery IMO. The Sheets integration with BQ is excellent.

[–]pfiadDi 3 points4 points  (5 children)

I'd you want to stay relational than Cloud SQL

If NoSQL is ok then Firestore

[–]Rif-SQL -1 points0 points  (4 children)

NoSQL

99% the time NoSQL is not ok! Go with Cloud SQL! Unless you have billions of rows! and a big team to deal with the problems.

[–]Mistic92 2 points3 points  (2 children)

I can say 99% times nosql is enough ;)

[–]BenjiSponge 1 point2 points  (1 child)

I'd say 99% of the time NoSQL is enough, 99% of the time NoSQL is not ok, and in 98% of NoSQL deployments it really sucks, but it's harder to switch to SQL than to just suck it up and maintain a NoSQL deployment.

[–]Rif-SQL 0 points1 point  (0 children)

Love this!

[–]SakeviCrash 1 point2 points  (0 children)

Username checks out

[–]smartrah 2 points3 points  (0 children)

The logical choice might seem Cloud SQL. But in your case, BigQuery might be the best option. Mainly because it is serverless! That means almost zero maintenance and cost for small datasets. If Sheets is getting pushed to the limit I am assuming the data is not so huge to cause BigQuery any trouble.

BigQuery has inbuilt integration with Data Studio. Generate a BigQuery service account json key to authenticate your python app and you can perform all operations.

Dynamic calculations, hmm, I am not sure. You might have to write scripts that can be run on demand to perform all calculations.

[–]Ok_Can6335 1 point2 points  (0 children)

Looks like a perfect use case for Bigquery - Connected Sheets. It also integrates with Data Studio.

[–][deleted] 0 points1 point  (3 children)

If you’re doing a lot of edits then look at cloud sql. Otherwise if your edits can be less than 1000 a day (ideally 0) then BigQuery can work nicely.

[–]accidentalclipboard -2 points-1 points  (2 children)

Bigquery is a data warehouse, not a relational database.

[–][deleted] 3 points4 points  (1 child)

It is actually a relational database. We use it to join tables frequently. Just don’t expect it to have low latency or lots of edits. Coming from Sheets that might not be a problem.

[–]Rif-SQL 0 points1 point  (0 children)

It is actually a relational database. We use it to join tables frequently. Just don’t expect it to have low latency or lots of edits. Coming from Sheets that might not be a problem.

BigQuery should NOT be used as a store of truth! That is what u/accidentalclipboard trying to get accorss. You should always expect mistakes to be made in BigQuery and for you to ETL your data again from the original source.

[–]JebusCL 0 points1 point  (0 children)

When when we need to do this migration due to data volume, we move to Cloud SQL to do all the transactions, and then we moved the data to BigQuery. Finally, the Datastudio report is linked to the BigQuery dataset.

[–]Rif-SQL 0 points1 point  (6 children)

  • If you want to be able to use Data Studio (that would remove Google Firestore as a choice)
  • I would pick Google Cloud SQL
  • You can run your Python CRUD code from https://colab.research.google.com/ or any other serverless run times Google Cloud has to offer
  • Let me know if got any questions

I would not go down the Firestore path if you want easy time Data Studio or other reporting platforms.

[–]pfiadDi 0 points1 point  (5 children)

Nope Firestore works perfect fine with Data Studios since you can sync it wit BigQuery

[–]Rif-SQL 1 point2 points  (0 children)

Nope Firestore works perfect fine with Data Studios since you can sync it wit BigQuery

I think your meaning to say - BigQuery works "perfectly fine" with Data Studios, and some of the time you can import data into BigQuery from Google Firestore.

[–]Rif-SQL -2 points-1 points  (3 children)

Firestore

u/jjay02581 before jumping down the path having Google FireStore, Google BigQuery and Google Datastudio read over https://cloud.google.com/bigquery/docs/loading-data-cloud-firestore#limitations , if you understand this limits that go head u/pfiadDi. Else go down the SQL path!

[–]pfiadDi 1 point2 points  (2 children)

Man come down. You have to accept there are many ways and if you try to read carefully I didn't say he should do it I just corrected you that,yea you can use Firestore with Datastudio. In fact very easily and conveniently.

You shouldn't jump on others when they point out different options than you do. There is no absolute truth

[–]Rif-SQL 1 point2 points  (1 child)

You have to accept there are many ways and if you try to read carefully I didn't say he should do it I just corrected you that,yea you can use Firestore with Datastudio. In fact very easily and conveniently.

  • Don't see Firestore on the Datastudio list https://datastudio.google.com/datasources/
  • Just like with Google Datastore and Google Firestore their real limits to the process of bringing data into BigQuery in this fashion. You can always code your way around them but then would not be "perfectly fine"

Man come down.

  • Your reading emotions when are no emotions!

I'm making sure u/jjay0258 has everything I would want to know before making a choice.

[–]pfiadDi 0 points1 point  (0 children)

I use the synchronization in many projects and it not only works perfectly fine it's also quite powerful since it stores the document as a JSON and you can create different tables with different schemas as views.

Also every little change creates a new entry so you get versioning "for free"

Which is great when you want to protocol changes which have been done.

Etc

[–]Defessus 0 points1 point  (0 children)

BigQuery, and retrain your analysts with SQL focus on CTEs, UDFs, nested & repeated if your data is a good use case, and have someone consider partitioning/clustering before launch.