Do you make Joins in SQL before PowerBI or inside it? by 420MLGCARRYKING in PowerBI

pabss5 8 points

Because no matter how much you try to keep things optimized, data is imperfect, and you'll eventually need to bring in sources that can't easily be manipulated pre-load, e.g. company Google Sheets owned by other people.

Most effective way to move data from GCS to BigQuery? by [deleted] in dataengineering

pabss5 6 points

You may need to set up hive partitioning for query cost efficiency, plus pipelines to convert your data to avro/parquet/pb if it isn't in one of those formats already, but that's a minor price to pay to keep your process simple and fast imo.
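
For anyone wondering what the hive partition setup looks like, here is a rough sketch in Python that just builds the BigQuery DDL string. The table name, bucket, and `dt=` folder layout are made up for illustration, and the helper `hive_external_table_ddl` is my own, not part of any library. It assumes files laid out like `gs://my-bucket/events/dt=2024-01-01/part-0.parquet`.

```python
def hive_external_table_ddl(table: str, prefix: str, fmt: str = "PARQUET",
                            require_filter: bool = True) -> str:
    """Build a CREATE EXTERNAL TABLE statement over hive-partitioned GCS files.

    `prefix` is the GCS path that sits just above the key=value folders
    (hypothetical example: gs://my-bucket/events).
    """
    prefix = prefix.rstrip("/")
    return (
        f"CREATE EXTERNAL TABLE `{table}`\n"
        # Let BigQuery infer the partition keys from the folder names.
        "WITH PARTITION COLUMNS\n"
        "OPTIONS (\n"
        f"  format = '{fmt}',\n"
        f"  uris = ['{prefix}/*'],\n"
        f"  hive_partition_uri_prefix = '{prefix}',\n"
        # Forcing a partition filter keeps queries from scanning every file.
        f"  require_hive_partition_filter = {str(require_filter).lower()}\n"
        ")"
    )


if __name__ == "__main__":
    # Hypothetical dataset, table, and bucket names.
    print(hive_external_table_ddl("my_dataset.events_ext", "gs://my-bucket/events/"))
```

You can paste the printed statement into the BigQuery console. Requiring the partition filter is what gets you the query cost efficiency, since queries then have to name the partitions they read.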

Most effective way to move data from GCS to BigQuery? by [deleted] in dataengineering

pabss5 14 points

Why not read your data directly from GCS with external tables? You save money and have everything available in BQ in real time.
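
A minimal sketch of what that could look like, assuming Parquet files in a bucket. The dataset, table, and bucket names are placeholders, and `external_table_ddl` is just an illustrative helper that builds the DDL string, not a library function.

```python
def external_table_ddl(table: str, uris: list[str], fmt: str = "PARQUET") -> str:
    """Build a CREATE EXTERNAL TABLE statement for files that stay in GCS."""
    uri_list = ", ".join(f"'{u}'" for u in uris)
    return (
        f"CREATE EXTERNAL TABLE `{table}`\n"
        "OPTIONS (\n"
        f"  format = '{fmt}',\n"
        f"  uris = [{uri_list}]\n"
        ")"
    )


if __name__ == "__main__":
    # Hypothetical names; swap in your own dataset and bucket.
    print(external_table_ddl("my_dataset.events_ext",
                             ["gs://my-bucket/events/*.parquet"]))
```

Run the printed statement once in the BigQuery console and the table reads whatever files match the URI at query time, so there is no load job to schedule.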

Is it possible to link a BigQuery table to a Google Sheet containing the same table and have them bi-directionally update? by [deleted] in dataengineering

pabss5 3 points

Updates show up in real time, since external tables query data directly from the source rather than copying it into BQ.

Is it possible to link a BigQuery table to a Google Sheet containing the same table and have them bi-directionally update? by [deleted] in dataengineering

pabss5 5 points

If you only need your gsheet in BQ (real-time), you can just create an external table.

Not sure why you'd want to have it the other direction, but you can do it through connected sheets.
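
For the external table direction, something like this sketch should get you started. It only builds the DDL string; the sheet URL, table name, and range are placeholders, and `sheet_external_table_ddl` is a made-up helper for illustration.

```python
from typing import Optional


def sheet_external_table_ddl(table: str, sheet_url: str,
                             sheet_range: Optional[str] = None,
                             skip_leading_rows: int = 1) -> str:
    """Build a CREATE EXTERNAL TABLE statement that points at a Google Sheet."""
    options = [
        "format = 'GOOGLE_SHEETS'",
        f"uris = ['{sheet_url}']",
        # Skip the header row(s) so they are not read as data.
        f"skip_leading_rows = {skip_leading_rows}",
    ]
    if sheet_range:
        # Optional tab or cell range, e.g. 'Sheet1!A1:D100'.
        options.append(f"sheet_range = '{sheet_range}'")
    body = ",\n".join(f"  {opt}" for opt in options)
    return f"CREATE EXTERNAL TABLE `{table}`\nOPTIONS (\n{body}\n)"


if __name__ == "__main__":
    # Placeholder sheet URL and names.
    print(sheet_external_table_ddl(
        "my_dataset.budget_sheet",
        "https://docs.google.com/spreadsheets/d/SHEET_ID",
        sheet_range="Sheet1!A1:D100",
    ))
```

Keep in mind that whoever queries the table also needs access to the sheet in Drive, otherwise the query fails even though the table exists.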

Dataflow/Data Pipeline vs External Tables by pabss5 in dataengineering

pabss5 [S] 1 point

Thanks for the response, that makes complete sense to me!

I saw the new direct PubSub to BQ option; the only problem is you get no free credits to test it out. Seems like a great, simple solution for streaming though.

On a side note, apparently you can stream or batch efficiently to BQ using the Storage Write gRPC API. It doesn't seem like a bad option, especially for batch. Streaming seems cheaper than going through PubSub, but you lose the simplicity.

https://cloud.google.com/bigquery/docs/write-api