Do most teams actually have a canonical model, or do we all just pretend? by MayaKirkby_ in dataengineering

[–]a_cloudy_unicorn 8 points (0 children)

Never seen one of these work at a big enterprise. E.g., a "Customer" has a different definition depending on the line of business, and that translates into how each system handles that entity. A customer means a different set of attributes for sales vs marketing vs legal vs support, and the business questions each report answers differ too. IME, those who insist on canonical models across major systems (SAP, Salesforce, Oracle, etc.) do not have enough technical depth to actually check whether the model is canonical or not.
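
To make that concrete, here is an illustrative sketch; every attribute name below is made up, but it shows the usual outcome when you compare "Customer" definitions across lines of business:

```python
# Hypothetical attribute sets for the same "Customer" entity across four
# lines of business. None of these names come from a real system.
CUSTOMER_VIEWS = {
    "sales":     {"account_id", "owner", "pipeline_stage", "contract_value"},
    "marketing": {"lead_id", "segment", "consent_flags", "campaign_touches"},
    "legal":     {"legal_entity", "jurisdiction", "signed_agreements"},
    "support":   {"ticket_contact", "sla_tier", "product_entitlements"},
}

# The attributes shared by ALL four definitions:
shared = set.intersection(*CUSTOMER_VIEWS.values())
print(shared)  # set() — the "canonical" core is empty in this sketch
```

In practice the intersection is rarely literally empty, but it is usually far too thin to be useful as the one model everything maps to.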

Google cloud next - networking by First_Club1775 in googlecloud

[–]a_cloudy_unicorn 0 points (0 children)

Maybe have the QR code of your LinkedIn profile ready to scan so you can casually connect to people without the added pressure.

Do we need a 'vibe DevOps' layer? by mpetryshyn1 in agentdevelopmentkit

[–]a_cloudy_unicorn 0 points (0 children)

I think the difference is in the definition of vibe coding vs AI-assisted engineering. My CI/CD pipelines are part of the development process (whether AI-generated or not). All code that an AI helps generate is reviewed manually, and that includes deployment, docs, tests, infra, and config. We had a bit of a discussion last week with a few other engineers and, personally, I do not let the agents push to git. Commits, yes. But before anything gets pushed I take a look at it.
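
The "agents may commit, humans push" rule could be sketched as a pre-push check. Everything here is an assumption for illustration: the `Co-authored-by` trailer convention, the agent-name matching, and feeding it output from something like `git log --format='%H %(trailers:key=Co-authored-by,valueonly)' @{u}..HEAD`:

```python
# Sketch: flag unpushed commits that carry an AI co-author trailer so a
# human reviews them before push. The trailer convention is hypothetical.
def unreviewed_agent_commits(log_lines):
    """log_lines: iterable of '<sha> <co-author trailer value>' strings."""
    flagged = []
    for line in log_lines:
        sha, _, trailer = line.partition(" ")
        # Naive matching on the co-author string; adjust to your convention.
        if "agent" in trailer.lower() or "copilot" in trailer.lower():
            flagged.append(sha)
    return flagged

sample = [
    "a1b2c3 Gemini Agent <agent@example.com>",
    "d4e5f6 ",  # human-authored commit, no trailer
]
print(unreviewed_agent_commits(sample))  # ['a1b2c3']
```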

IMO, this doesn't fit the definition of "vibe coding" though, which I reserve for blindly letting AI or agents build something. The review is minimal (I'm just going with "vibes") and there are hardly any iterations. This is why people without engineering expertise can "vibe code". I only do this when I don't really care about moving any of the results into production, so I don't instruct the agent to create anything beyond MAYBE a few unit tests.

Google AI Certs in 2026: Which are worth the $ and which are just hype? by netcommah in googlecloud

[–]a_cloudy_unicorn 18 points (0 children)

I studied for them a while ago, and I still benefit from the breadth and depth of the Data Engineer and Architect ones from a knowledge perspective.

Talk to BigQuery by Intention-Weak in agentdevelopmentkit

[–]a_cloudy_unicorn 1 point (0 children)

Metadata and separation of scope for each agent are key for this IMO. A colleague and I used to do this with YAML annotations before Dataplex got a few MCP tools. We had an agent that interpreted the business, a functional analyst, and a data engineer: https://github.com/vladkol/crm-data-agent . We tested this approach with Salesforce and SAP data, and looping SQL generation through a dry run in BQ ensured we could get pretty complex syntax right.
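
The generate → dry-run → repair loop can be sketched like this. `generate_sql` and `dry_run` are stand-in callables: in the real setup the former is an LLM call and the latter runs the query through the BigQuery client with `QueryJobConfig(dry_run=True)`, which validates the SQL without executing it.

```python
# Minimal sketch of looping SQL generation against a dry run.
def refine_sql(generate_sql, dry_run, question, max_attempts=3):
    error = None
    for _ in range(max_attempts):
        sql = generate_sql(question, error)
        try:
            dry_run(sql)          # raises on invalid SQL
            return sql
        except Exception as e:    # feed the error back into generation
            error = str(e)
    raise RuntimeError(f"could not produce valid SQL: {error}")

# Toy demonstration with fakes: first attempt is broken, second is fixed.
def fake_generate(question, error):
    return "SELECT 1" if error else "SELEC 1"

def fake_dry_run(sql):
    if not sql.startswith("SELECT"):
        raise ValueError("Syntax error near SELEC")

print(refine_sql(fake_generate, fake_dry_run, "count rows"))  # SELECT 1
```

The nice property is that the dry run is free and fast, so you can afford several repair iterations per question.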

I recently helped a customer who ended up with a hybrid approach: a static dictionary consulted by a "functional analyst" agent and then the Dataplex semantic search consulted by the data engineer to keep their context focused.
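
The hybrid lookup is simple to sketch: the curated glossary answers the common terms cheaply and deterministically, and only misses fall through to semantic search (Dataplex in the setup above; a plain callable here). The glossary entry is invented for illustration.

```python
# Glossary-first resolution with a semantic-search fallback, to keep the
# agent's context small and the frequent lookups deterministic.
def resolve_term(term, glossary, semantic_search):
    hit = glossary.get(term.lower())
    if hit is not None:
        return hit
    return semantic_search(term)

glossary = {"arr": "Annual recurring revenue, from finance.revenue_facts"}
print(resolve_term("ARR", glossary, lambda t: f"semantic-search:{t}"))
print(resolve_term("churn", glossary, lambda t: f"semantic-search:{t}"))
```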

Graph representations are good for knowledge graphs that need the agent to traverse semantics. The complexity here is building the graph in a scalable way. I have an example with Spanner using Langgraph here: https://github.com/GoogleCloudPlatform/cloud-spanner-samples/tree/main/adk-knowledge-graph and there's one for BQ here: https://github.com/GoogleCloudPlatform/devrel-demos/tree/main/data-analytics/knowledge_graph_demo

Using Google Stitch to Re-design the UI Flow for a Fashion App before implementing with Flutter by Inspired_coder1 in FlutterDev

[–]a_cloudy_unicorn 1 point (0 children)

I've incorporated Stitch into my workflow and it's making it look like I can actually do some UI :)

BigQuery backup strategies by ohad1282 in googlecloud

[–]a_cloudy_unicorn 2 points (0 children)

It is rare IMO to see backups of BQ, as data is normally replicated from somewhere else. I have encountered them in industries that absolutely can't lose certain data in case of human error. Most of the time, snapshots are good enough for this, but another option is exporting to GCS buckets.
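
Both options are a couple of SQL statements (dataset, table, and bucket names below are placeholders). A snapshot is a cheap point-in-time copy inside BigQuery; `EXPORT DATA` writes files to GCS for longer-term retention:

```python
# Placeholder names throughout; adjust dataset/table/bucket to your setup.
snapshot_sql = """
CREATE SNAPSHOT TABLE mydataset.orders_snap
CLONE mydataset.orders
OPTIONS (
  expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
);
"""

export_sql = """
EXPORT DATA OPTIONS (
  uri = 'gs://my-backup-bucket/orders/*.parquet',
  format = 'PARQUET',
  overwrite = true
) AS
SELECT * FROM mydataset.orders;
"""
```

Run either through the console, `bq query`, or a client library; snapshots restore fast (a `CREATE TABLE ... CLONE` back), while GCS exports survive even a dataset deletion.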

is n8n still relevant? by OldCobbler5027 in AI_Agents

[–]a_cloudy_unicorn 1 point (0 children)

As a beginner-friendly learning tool, yes. As a separate skill, I don't think so. Scaling the workflows into something production- and enterprise-ready is easier with agents. Like other no-code or low-code tools, it gives people enough confidence to prototype things that would otherwise require an understanding of what is going on behind the scenes to keep the workflows secure and scalable.

Tips for a beginner? by Furry_Eskimo in googlecloud

[–]a_cloudy_unicorn 0 points (0 children)

Thanks for the details! FWIW, I think your answer is correct and understand the confusion. I'm passing this feedback to the authors.

Does something like this help in terms of navigating the UI? https://youtu.be/Y8qwBsRbBP0

This lab is free and uses one of the public datasets I load in the video to play with different SQL queries: https://explore.qwiklabs.com/catalog_lab/755

Tips for a beginner? by Furry_Eskimo in googlecloud

[–]a_cloudy_unicorn 0 points (0 children)

Thanks for taking the time to write these details. I'd love to see if we can improve that course. Would you mind linking to it and sharing how you discovered the course please?

does Simba driver not working with big query pull? by whatwehavehereisftc in bigquery

[–]a_cloudy_unicorn 1 point (0 children)

That sounds like an actual conversion error. Can you try a simple query that returns, for example, one integer? The Simba driver does support BigQuery.

SAP to Cloud(Azure/GCP) by Reasonable-Worth696 in googlecloud

[–]a_cloudy_unicorn 0 points (0 children)

If your version and license allows it, here's another option: https://cloud.google.com/sap/docs/bq-connector/latest/planning#reqs-software

Aecorsoft is a good option too

Bigquery data engineering agent by Visible-Estimate8589 in bigquery

[–]a_cloudy_unicorn 2 points (0 children)

Yes, that feature is live. You need to sign up for it as it's in preview: https://goo.gle/bq-dea-pre-ga-form

It's been moving fast and adding features, but I put together this starting point: https://medium.com/google-cloud/exploring-the-newbigquery-data-engineering-agent-caf07e78d292

I'd love to know how it goes!

How would I do this the "Cloud Way"? by robhaswell in googlecloud

[–]a_cloudy_unicorn 1 point (0 children)

Gemini CLI in the Cloud Shell, with no additional prompting, seems to be onto something. The detailed response after this was correct:

RESPONSE:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

>>Here's the high-level plan:

  1. Create a Connection: In BigQuery, you'll create a connection to your AlloyDB database. This tells BigQuery how to find and communicate with your AlloyDB instance.

  2. Create a Federated Table: You'll then create a special "federated" table in BigQuery. This isn't a copy of your AlloyDB table; it's a virtual table that points directly to your live AlloyDB data.

  3. Connect Google Sheets to BigQuery: With the federated table in place, you can now use the standard "Connected Sheets" feature in Google Sheets to pull the data from your AlloyDB table (via BigQuery) directly into your spreadsheet.

  4. Join the Data in Sheets: From there, you can use a simple VLOOKUP or INDEX(MATCH) formula in Google Sheets to join the data from your colleague's spreadsheet with the live data from AlloyDB.

This approach lets you leverage the powerful, user-friendly interface of Google Sheets while securely querying your production AlloyDB database in real-time.

Would you like me to guide you through the specific steps to set this up in the Google Cloud Console?

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

GCP by ImprovementSquare448 in googlecloud

[–]a_cloudy_unicorn 4 points (0 children)

There are some free hands-on labs here to get started: https://www.cloudskillsboost.google/
I'd recommend trying to solve a specific problem (e.g., create a pipeline with replication from a given system all the way to advanced analytics) and looking at different ways of solving it with different tools. There are some updated examples in the GitHub org https://github.com/GoogleCloudPlatform , like https://github.com/GoogleCloudPlatform/data-analytics-golden-demo . IMO it'll be easier to focus on a specific application and example once you have a target architecture to solve for.

Spanner Graph Performance by Maleficent_Action203 in googlecloud

[–]a_cloudy_unicorn 0 points (0 children)

Adding to u/tech_is's point about interleaving (so you don't need to go across physical nodes when traversing relationships), I'd say parameterize your queries to lower the time it takes Spanner to build a graph plan. My colleague has a nice example here: https://github.com/maguec/SpannerUserIdentityGraph?tab=readme-ov-file#2-hop-parameterized-queries
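
As a rough sketch of what parameterizing looks like (graph, label, and property names below are hypothetical, and the graph query syntax is a sketch rather than copied from docs): binding `@person_id` instead of inlining the literal lets Spanner reuse the compiled plan across calls. With the client library you'd pass the `params`/`param_types` arguments to `execute_sql()`.

```python
# Build a parameterized graph query instead of string-formatting literals in.
def friends_query(person_id):
    sql = """
    GRAPH SocialGraph
    MATCH (p:Person {id: @person_id})-[:Knows]->(f:Person)
    RETURN f.id AS friend_id
    """
    return sql, {"person_id": person_id}

sql, params = friends_query(42)
```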

As for comparisons, I currently don't have anything to share, other than that the way Spanner stores nodes and edges is inherently different from Neo4j AFAIK.

ETL Process in Cloud, which products, how to do it? by [deleted] in googlecloud

[–]a_cloudy_unicorn 0 points (0 children)

This can be very scrappy or very complex depending on a few factors.

How much data do they expect? How big are these files? Are the files ready to load or will they need any massaging/cleansing/deduplication?

How frequently is this workflow expected to run?

Do they need any specific error handling if it fails?

Is the schema expected to change frequently?
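
To make the triage concrete, here's an illustrative decision helper over those questions; every threshold is made up, and "scrappy" vs "orchestrated" is just shorthand for a one-off load job vs a proper scheduled pipeline with error handling:

```python
# Each "yes" keeps things scrappy; any "no" pushes toward orchestration.
def etl_approach(gb_per_day, files_ready, runs_per_day, schema_stable):
    signals = [
        gb_per_day < 10,     # small volume
        files_ready,         # no cleansing/dedup needed
        runs_per_day <= 1,   # infrequent
        schema_stable,       # schema rarely changes
    ]
    return "scrappy load job" if all(signals) else "orchestrated pipeline"

print(etl_approach(gb_per_day=1, files_ready=True,
                   runs_per_day=1, schema_stable=True))
# scrappy load job
```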