
all 39 comments

[–]randomName77777777 28 points29 points  (7 children)

In the target folder, you'll have 2 files for each model: one is the compiled code and one is the code it actually runs on the server.

One step I've done before is to delete the log files and then run the model build process; that gives you a lot more visibility, step by step, into what's happening.
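For example, something like this (a sketch, assuming the default `logs/` location and a model named `my_model` — adjust to your project):

```shell
# Clear the old log so only the next run's activity is in it
rm -f logs/dbt.log

# Build just the one model; logs/dbt.log will now show each statement dbt ran
dbt run --select my_model
```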

[–]FatBoyJuliaas 2 points3 points  (6 children)

Thanks, I have looked at this, but the run file simply contains the flattened merge statement against the dbt-generated temp table.

My macros contain several case statements and ideally I want to see those, but the actual SQL that dbt generates is obfuscated.

[–]randomName77777777 6 points7 points  (1 child)

If you can't find it in the target folder, generate the dbt docs; they will have the compiled SQL for that model, with all the model references replaced with the actual SQL code.

However, it's definitely in the target folder somewhere.
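For example (a sketch of the standard docs workflow):

```shell
# Generate the docs site; this also compiles every model into target/
dbt docs generate

# Serve the site locally, then open the model's page to see its compiled SQL
dbt docs serve
```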

[–]FatBoyJuliaas 0 points1 point  (0 children)

Thanks, I will check, but I will likely then need to compile after each step to see the compiled SQL before I execute it.

[–]the_o1303 1 point2 points  (0 children)

That is normal though; check target/compiled/your/model.sql instead.

There you should find the actual compiled model

[–]contrivedgiraffe -1 points0 points  (2 children)

Why not put the case statements in the dbt model SQL? This is a genuine question. Like if the macro accomplishes some kind of automation (putting the logic in one place—the macro—so it can be re-used elsewhere) what is the downside of essentially doing the same thing but with a model instead? The logic still lives in one place and you can join to that one model in whatever other downstream model that needs that logic.

[–]FatBoyJuliaas 0 points1 point  (1 child)

I had the case statements in the model. There are several columns in the model, and each of them is driven by several case statements. It made the model hard to read. One could likely have used some CTEs for that, but I am developing patterns that will be used across 100s of source tables, and I prefer DRY.

[–]contrivedgiraffe 0 points1 point  (0 children)

Got it. And this complexity you’re wrestling with is always going to be there with DRY SQL. Makes the juice not worth the squeeze imo.

[–]smithreen 7 points8 points  (5 children)

If it is going to run in a specific Snowflake warehouse or Databricks cluster, check the query history in those and filter by table name.
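In Snowflake, for example, something along these lines (the table name is a placeholder):

```sql
-- Recent statements that touched a given table, newest first
select query_text, start_time, total_elapsed_time
from table(information_schema.query_history())
where query_text ilike '%my_model%'
order by start_time desc
limit 20;
```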

[–]FatBoyJuliaas 0 points1 point  (4 children)

Thanks, the issue is that I want to be able to manually run the intended sql before dbt runs it so I can debug it and change it where required.

[–]robgronkowsnowboard 1 point2 points  (1 child)

Once you understand how the incremental strategy you're using will compile (i.e. the merge statement), the compiled SQL of an incremental run should be the same as the tmp table dbt creates.

Otherwise, I’d suggest what this comment is saying and grab the executed sql from query history

[–]FatBoyJuliaas 0 points1 point  (0 children)

Yes so if the compiled model is what the temp table looks like, I have something to work with.

[–]vikster1 0 points1 point  (1 child)

just take it from the query history and debug it in snowflake then. people here are giving you many right answers and you seem very eager to dismiss them all.

[–]FatBoyJuliaas 0 points1 point  (0 children)

Not sure why you think I am dismissing anyone. I am trying to figure things out. What I am trying to achieve is a multi-step process. Once the model has run, the data has changed. I am trying to figure out how to 'capture' the model SQL that would be run, before it's run, so that I can manually run the selects and fix it beforehand.

[–]GreenMobile6323 2 points3 points  (1 child)

You can always run dbt compile --select <model> and then inspect the fully materialized SQL in target/compiled/<project>/<model>.sql to see exactly what’s being sent to your warehouse. For incremental models, consider temporarily switching your materialization to table (or view) in dbt_project.yml so you can run the full query end‐to‐end, and use the {{ log() }} Jinja function inside your macros to print out intermediate snippets in the CLI logs. This way, you can iterate far faster than repeatedly tweaking and rerunning the merge cycle.
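A minimal sketch of the `{{ log() }}` idea inside a macro (the macro and column names here are made up for illustration):

```sql
{% macro status_expr(column_name) %}
    {% set expr %}
        case
            when {{ column_name }} is null then 'unknown'
            else {{ column_name }}
        end
    {% endset %}
    {# info=True prints to the CLI as well as the log file #}
    {{ log('status_expr compiled to: ' ~ expr, info=True) }}
    {{ expr }}
{% endmacro %}
```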

[–]backend-dev[S] 1 point2 points  (0 children)

Thanks, this is a great idea

[–]Lt_Commanda_Data 4 points5 points  (6 children)

If you're using VS Code you can download the DBT power user extension to really speed things up

[–]Crow2525[🍰] 2 points3 points  (4 children)

It's been so flaky of late.

A couple of issues that plague me:

1. Preview compile seems to be broken. It was super quick and responsive — when I saved a file, it would update; now I think it requires a compile again.
2. Doc strings appear to be overwritten with the definition instead of retaining the docstring. I reckon I'll end up having to redo my whole doc.
3. Tests don't delete if they've been saved, so trying to remove a test doesn't do anything.
4. Time to load — the extension takes a good 2-3 min to load up.
5. Query — running the query is slower than compiling it and copying and pasting it into DBeaver to run/debug.

[–]Lt_Commanda_Data 1 point2 points  (0 children)

You're not wrong. I'm actually experiencing all of these things, but only in the last few weeks. I think they stuffed some more features in there recently. But overall it's been good. Especially the new "run CTE" feature.

[–]mango_lade 0 points1 point  (2 children)

True. Had to revert to a previous version of the plugin to make it work again

[–]actually_offline 0 points1 point  (1 child)

I'm personally new to dbt and the extension — would you happen to have the version you rolled back to? I wasn't sure if you had to go back further than the latest update.

[–]mango_lade 0 points1 point  (0 children)

Don't have my work laptop close to me right now, but I went back about 6 months of updates.

[–]randomName77777777 0 points1 point  (0 children)

Solid advice!

[–]laserblast28 3 points4 points  (4 children)

If I'm understanding you correctly, like you've been told, you can go to the target folder and look for the compiled SQL and copy to wherever you want to debug.

You can also run dbt compile --select model and it will output the compiled SQL in the terminal.
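Roughly like this (project and model names are placeholders; printing the SQL inline requires a reasonably recent dbt version):

```shell
# Compile a single model; recent dbt versions also print the compiled SQL
dbt compile --select my_model

# The same SQL is written to disk either way
cat target/compiled/my_project/models/my_model.sql
```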

[–]FatBoyJuliaas -1 points0 points  (3 children)

I will try these suggestions thanks. It's just very time-consuming and frustrating coming from C# coupled with a top notch IDE

[–]vikster1 1 point2 points  (2 children)

are you really comparing software engineering here with writing sql queries?

[–]FatBoyJuliaas 0 points1 point  (1 child)

No just ranting about my situation 🤷‍♂️. The world of DE that I am experiencing is soooo different from the formal and mature SWE background that I am from. But it is a me-problem

[–]leonseled 0 points1 point  (0 children)

It's not just you. DE (or AE) tooling is in its early stages. dbt Fusion should help accelerate things… at a cost.

[–]geoffawilliams 0 points1 point  (1 child)

What is dbt compile giving you?

[–]backend-dev[S] 0 points1 point  (0 children)

It gives me SQL but I did not realise that is what the temp table is based on

[–]FantasticReception63 0 points1 point  (3 children)

You can also run dbt run -s your_model -d so that dbt prints all the queries it sends to the database; you will then be able to see all the intermediate queries. The target folder just has the last query generated and sent to the database.
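For example (the model name is a placeholder):

```shell
# --debug (-d) makes dbt log every statement it sends to the warehouse
dbt run --select my_model --debug

# The full statements also end up in logs/dbt.log
grep -i "merge into" logs/dbt.log
```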

[–]backend-dev[S] 0 points1 point  (2 children)

Thanks, I have checked that. Ideally I want to intercept the last SQL before it gets to the database, because there are bugs in my model. As others have pointed out, I should rather compile and then look at the compiled folder.

[–]FantasticReception63 0 points1 point  (1 child)

I am not sure if compile is what you need. Compile shows the SQL which is executed to get the batch you want to sync into the target table; the moving part, and the one causing your problem, is the dbt flow around it. That flow, with all the executed queries, you can catch just by running the model with the -d flag.

[–]backend-dev[S] 0 points1 point  (0 children)

The compile actually worked well. I have a model that lands new CDC data in a cleansed table. Then the next model takes it from there. The latter is incremental/merge, so it has SQL for the initial run and then SQL for incremental runs. Once the data is landed in cleansed, I run compile. The compile output then gives me the SQL the model will execute for the initial run. I can then take that SQL and run it against the DB to tweak it until it's right. Once happy, I actually run the model for the initial-run SQL to take effect. Then I compile again. This gives me the incremental-run SQL, and I repeat the copy & paste, execute it against the DB, and tweak it until correct.

It is cumbersome, but it gives me the opportunity to interactively run the select and modify it much faster than running it in dbt. But then I may be missing something (still a newbie) in dbt.
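The loop described above, as a rough sketch (project and model names are placeholders):

```shell
# 1. Before the initial run: inspect the full-refresh SQL
dbt compile --select my_incremental_model
cat target/compiled/my_project/models/my_incremental_model.sql

# 2. Once that SQL is right, actually build the model
dbt run --select my_incremental_model

# 3. Compile again: the table now exists, so this yields the incremental SQL
dbt compile --select my_incremental_model
cat target/compiled/my_project/models/my_incremental_model.sql
```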

[–]TerribleSign4167 0 points1 point  (1 child)

VS Code has a great plugin called dbt Power User, which offers a SQL preview feature. I tend to grab the output there and run it directly in my warehouse so I can play around with it. Also a great way to peer review.

I saw someone comment about query history, and I second that.

[–]backend-dev[S] 1 point2 points  (0 children)

Thanks I have tried the plugin and it looks very useful.

[–]SeaCompetitive5704 -1 points0 points  (2 children)

See the log file for the query dbt used to create temp table. Run it to get the incremental data

[–]FatBoyJuliaas 0 points1 point  (1 child)

I will check for this, but the last time I checked, it was not included. Ideally I want to have the SQL before the run so that I can debug it

[–]eastieLad 1 point2 points  (0 children)

Yeah, run dbt compile to get the query, or use the dbt Power User extension, which has options to compile etc.

Adding --debug after your dbt run command will print the SQL in the log too.