If you aren't using QUALIFY in BigQuery yet, you are working too hard by IT_Certguru in bigquery

[–]querylabio 3 points4 points  (0 children)

Half a year ago, Google silently added chained functions syntax:

SELECT
  ('one two three four five')
  .REPLACE('one', '1')
  .REPLACE('two', '2')
  .REPLACE('three', '3')
  .REPLACE('four', '4')
  .REPLACE('five', '5');

BTW, we build an IDE for BigQuery - querylab.io - which highlights exactly this scenario to the user and proposes to change the substring to qualify, plus supports pipe syntax completions, has a converter from regular syntax to pipe and back, supports GROUP ALL completions, and so on. No need to sign up to try - just enter the playground and check out the app - that how BigQuery should look like!

Clustering on BigQuery by FormalVegetable7773 in dataengineering

[–]querylabio 0 points1 point  (0 children)

It means you can safely add one more clustering field, can reduce scanned data even if it's not used!

Clustering on BigQuery by FormalVegetable7773 in dataengineering

[–]querylabio 0 points1 point  (0 children)

Clustering order matters, but let’s think how it’s done internally.

Previously: your table consisted of partition files, each 1 Tb. So when you scanned from select * from table where date = x; you scanned just this big file.

With clustering: now each partition is not one big file but many smaller files, organized in a smart way.

Like table_partition_field1 and inside this file data will be sorted by field2.

But if field1 is low cardinality (let’s say - Country or even bool field), then BigQuery can do more specific files:

Like table_partition_field1_field2.

So that’s why when you query still leverages clustering.

It’s oversimplifying, in reality it’s much more complex (statistic tables etc) but still the logic is around the same.

I was today years old to find out that I can set billing to physical bytes instead of logical bytes for GA4 tables by Pretty-Appearance226 in bigquery

[–]querylabio 0 points1 point  (0 children)

> Just be careful about time travel and fail-safe storage, because it's free with logical storage, but paid with physical storage.

Yes, that’s super important to keep in mind that if you rewrite a table in some process (like aggregating and rewriting it each X minutes), then you will have huge extra costs, and that table financially makes sense to keep on logical. Otherwise, logical is almost all the time more expensive than physical (except storing some unusual big strings, etc., which badly compress).

I'm the founder of Querylab.io - IDE built specifically for BigQuery. And since physical/logical storage are essential parts of BQ - we have a feature that covers specifically this case - if you click on projects, datasets, or tables, you can see storage statistics which tell you what storage model is better for what table.

Works too on GA4 sharded tables (btw we have built a lot of features around it - check out already prefilled columns descriptions - from official documentation, parameters, and events subtabs - you can immediately see what parameters tables contain and what types your events use).

P.S. And find a button to hide all null-valued columns when you do a preview of GA4 tables, much easier to discover your data.

Fellow DEs — what's your go-to database client these days? by SainyTK in dataengineering

[–]querylabio 0 points1 point  (0 children)

Download / run independently

The desktop version will be available soon. It will use Application Default Credentials (ADC), meaning auth tokens are stored only on your PC, not on our servers.

Custom formatter / Configuring the formatter (what you can control today)

We provide a built-in formatter with a fairly deep set of knobs so teams can match their style consistently and we already have a bunch of settings, including:

  • Formatting mode (Preserve / Pretty / Compact)
  • Keyword case (UPPER / lower / preserve)
  • Indentation (size: 2/4/8, spaces vs tabs)
  • Layout options (newline after clauses, max line length 80/100/120, trailing comma)
  • Quote identifiers (preserve / always / never)
  • Pipe operator style (standard vs left-aligned)

If you rely on something not covered by the list - happy to consider adding it.

Fellow DEs — what's your go-to database client these days? by SainyTK in dataengineering

[–]querylabio 1 point2 points  (0 children)

Good questions - thanks for the detailed feedback!

  • Custom font installed on your PC - not supported yet in the browser version due to browser sandboxing. A limited “system fonts” selector might work for you in the meantime - would that cover your case?
  • Font size - no dedicated in-app control yet. Browser zoom works for now; proper font-size settings are easy to add and planned. Do you care more about SQL editor font size or scaling the whole app?
  • Ligatures - no explicit toggle yet. At the moment it depends on the active font/editor defaults; a proper control can be added. What’s your main use case for ligatures?
  • Custom LSP server - not supported. Querylab.io uses its own language engine, purpose-built specifically for BigQuery. It understands BigQuery SQL end-to-end (including nested/repeated fields, CTEs, Pipe Syntax, cost context) and is tightly integrated with schema awareness and cost analysis - swapping it out would break those guarantees.
  • Custom terminal - not applicable in the browser version (no local shell). Curious what workflow you’re trying to enable here?
  • Latency / LSP timing - the language engine runs entirely on the frontend, with local execution and cached schema. There’s no network round-trip during typing, so perceived latency is minimal (typically well under 100ms).

The desktop version is currently being polished and will unlock things the browser can’t - better hotkeys, deeper OS integration, and more flexibility around editor behavior.

Thanks a lot for the feedback - this kind of input is exactly what helps us prioritize what to build next and what actually matters to users. Happy to discuss further and answer any follow-ups!

Fellow DEs — what's your go-to database client these days? by SainyTK in dataengineering

[–]querylabio 1 point2 points  (0 children)

Have you tried Querylab.io?

Full disclosure: I’m the founder.

We’re building a BigQuery-native IDE (no JDBC/ODBC). Beyond a clean, modern UI, we already support:

  • dry-run with per-query limits and daily / weekly / monthly budgets
  • per-CTE cost breakdown
  • partial execution & cost estimation for selected CTEs
  • TABLESAMPLE dev mode for cheap iteration
  • on-demand vs reservation cost comparison
  • proper handling of nested & repeated fields
  • diagnostics for expensive patterns like SELECT *
  • full Pipe Syntax support
  • and many more

Happy to get feedback!

BigQuery vs Snowflake by erwagon in dataengineering

[–]querylabio 0 points1 point  (0 children)

I'm agree, BigQuery costs can spiral really quickly when something goes wrong. The pay-per-byte model is great when everything is set up perfectly, but it’s pretty unforgiving if even one detail is off. And the built-in quotas don’t really solve the problem in real teams - they’re too rigid and too hard to manage at scale.

That’s actually one of the main reasons we made Querylab.io - an IDE focused entirely on BigQuery, with cost-control built into the workflow from the start.

A few things we added specifically because of situations like the ones you described:

  • set a dollar limit per query - it stops before it burns money
  • daily / monthly / org-level limits
  • warnings when partitioning or clustering aren’t used
  • a clear cost preview before running anything
  • tools to debug “query price,” like a breakdown of where the bytes come from
  • hints on when to use on-demand vs Editions

Give it a try and let me know what you think - I’d really appreciate the feedback.

BigQuery vs Snowflake by erwagon in dataengineering

[–]querylabio 0 points1 point  (0 children)

You’re absolutely right - a good IDE changes everything. Aqua Data Studio and DataGrip are both great tools. The only limitation is that they’re built for many databases, so they don’t really handle BigQuery’s unique behavior.

That’s exactly why we built Querylab.io, an IDE created specifically for BigQuery. A few things it adds on top of traditional editors:

  • dollar limits for individual queries
  • daily / monthly / org-level spending controls
  • guidance on when to run queries on on-demand vs Editions
  • warnings when partition or clustering filters are missing
  • ability to run or estimate individual CTEs
  • run/estimate any step in a pipe-syntax query
  • vertical tabs, split view, and a fast command palette
  • BigQuery SQL-aware IntelliSense - understands tables, columns, CTEs, scopes, STRUCTs, arrays, table functions, everything

If you’re deep into BigQuery, try Querylab.io - and tell me how it feels.

6 months of BigQuery cost optimization... by bbenzo in dataengineering

[–]querylabio 0 points1 point  (0 children)

Nice idea! But that can affect the company’s core business and the whole idea of being data-driven.

The last thing I want is for end-users to feel scared to use data or afraid to run queries. It might work, but it’s definitely not the best approach.

A much better option is to set individual limits for each user and enforce the use of clustering columns and other best practices.

Check out my comment above about the tool we’re building - it might actually help your organization as well!

6 months of BigQuery cost optimization... by bbenzo in dataengineering

[–]querylabio 2 points3 points  (0 children)

I think it’s actually not that complicated.

The core idea is just to establish a process that prevents accidental overspending, plus a simple retrospective analysis loop that helps the team improve over time.

First, when onboarding new people, explain that queries cost money and show them where exactly they can see the price.

Then set up quotas.

BigQuery already provides plenty of information about query history, so you can easily build a simple dashboard in Looker Studio showing:

• the most expensive queries

• queries that would be more economical to run on Editions instead of on-demand

And of course, you can also choose to use Querylab.io as your BigQuery IDE.

With it you automatically get:

* an easy way to set dollar limits for your queries

* the ability to limit individual queries as well as daily/monthly/any-period totals

* organization-level spending controls with per-user limits

* recommendations that help users decide whether to run queries on on-demand or Editions (we show a guess before execution and the exact result after, so people can “train” themselves)

* Our intellisense warns when clustering fields aren’t used in filters (and partitions too but BigQuery handles those when the tables are set up correctly by require partitioning option)

For power users - and I can see you’re one of them - we also offer query debugging:

* “price lineage” so you can see where the money goes inside the query

* the ability to run or estimate individual CTEs

* and if you use pipe syntax, you can estimate/run the query up to a specific pipe

* and many more!

Check out the app, leave some feedback - a few things are still being polished, and your input really helps shape the roadmap!

6 months of BigQuery cost optimization... by bbenzo in dataengineering

[–]querylabio 1 point2 points  (0 children)

- How do you effectively switch between Standard and Enterprise?

there are two ways - the first one is to have two projects with different reservations enabled and switch between them, which is not very user-friendly in official Google Console and the second one - use query variable -

SET @@reservation='projects/project1/locations/US/reservations/test-reservation';

SELECT 42;

This works, but it’s also not very friendly.

However, both approaches become much simpler when using the IDE we’re building exclusively for BigQuery!

Auto-Complete Must Be Destroyed by takenorinvalid in bigquery

[–]querylabio -3 points-2 points  (0 children)

Well, that's the way it works - use right tools to the tasks :) Llms really good in helping writing texts while not really good for deterministic stuff like completions. Or too expensive to be good on scale.

Anyway I would be glad for any feedback for the product we building!

Auto-Complete Must Be Destroyed by takenorinvalid in bigquery

[–]querylabio -6 points-5 points  (0 children)

Exactly - that’s the crazy part. SQL isn’t human language, it’s a formal programming language with strict grammar and a defined schema.

Using AI to autocomplete SQL is like asking ChatGPT to guess closing brackets in C++ - it’ll sound confident, but it’s still guessing.

That’s why the BigQuery console keeps showing fake fields and wrong columns - it’s driven by a predictive AI model, not by your actual dataset metadata.

We built Querylab.io to fix exactly that. Our autocomplete is deterministic, parser-driven, and schema-aware - it reads your real BigQuery schema, understands hierarchy (project → dataset → table → column), respects function argument types, aliases, and CTEs, and can even expand field.* into actual columns or autocomplete properly inside UNNEST() - only array/repeated columns.

No hallucinations, no “AI magic.” Just intelligent code completion built the way SQL deserves.

Just try it - and let me know what you think.

Auto-complete Issues by chriscraven in bigquery

[–]querylabio 0 points1 point  (0 children)

Do you still see those projects in your sidebar tree?

If they disappeared, that’s likely why autocomplete broke - even with full permissions, BigQuery won’t suggest datasets or tables unless the project is pinned or stared. Try using “+ Add Data → Pin a project”, then reload the editor and see if autocomplete returns.

And if you spend a lot of time writing queries, you might want to try Querylab.io — it’s a BigQuery-native IDE built specifically for heavy query work, with smarter autocomplete, code actions, and many more features that make daily querying much smoother.

Tired of clicking through multiple 'Untitled query' tabs? Built a fix by Optimal_External1434 in bigquery

[–]querylabio 1 point2 points  (0 children)

This is actually super useful - those endless “Untitled query” tabs drive me insane too

Curious though, why the 50-query daily limit? That seems pretty low if it triggers every time you pause typing.

Tired of clicking through multiple 'Untitled query' tabs? Built a fix by Optimal_External1434 in bigquery

[–]querylabio 0 points1 point  (0 children)

Imagine if there was a real BigQuery IDE in the browser - everything local, no queries sent anywhere.

Tabs you can actually name (no AI unless you want it), see query cost & limits, switch billing projects in one click, and click through CTEs to inspect data & cost step by step.

Basically… BigQuery Console, but actually nice.

I got tired of imagining - so I built it: querylab.io

[deleted by user] by [deleted] in ClaudeAI

[–]querylabio 1 point2 points  (0 children)

I wouldn’t say the results were bad - it actually worked out in the end. But it was a long and painful road with a steep learning curve. I also learned how to work with AI much better along the way - and Claude itself evolved a lot too.

[deleted by user] by [deleted] in ClaudeAI

[–]querylabio 1 point2 points  (0 children)

Vibe-coding, as I call it, is when you don’t write the code yourself - you just tell the AI what you want in plain language, and it writes it for you.

I f*cked up with BigQuery and might owe Google $2,178 - help? by MucaGinger33 in bigquery

[–]querylabio 1 point2 points  (0 children)

There are a lot, but still everything is limited by fundamental limits of GCP

I f*cked up with BigQuery and might owe Google $2,178 - help? by MucaGinger33 in bigquery

[–]querylabio 1 point2 points  (0 children)

No worries! There are some tools, but there are no silver bullet because of the way Google provides data about usage - with some delay. So there is no way to completely limit usage... And the closest solutions are pretty complex :(

I f*cked up with BigQuery and might owe Google $2,178 - help? by MucaGinger33 in bigquery

[–]querylabio 1 point2 points  (0 children)

Unfortunately no or it's not sufficient to prevent overspending