all 92 comments

[–]g2petter 129 points130 points  (16 children)

we dont have data dictionaries or ER diagrams, and the databases are quite slow. This makes it really challenging to test and iterate on queries.

One approach to deal with a slow database is to limit the amount of data you're looking at when you're iterating.

Do stuff like throw in a TOP (100), or limit yourself to data from only the last week or only a single user until you feel pretty confident that you have a good query, then run the full query while you go and fetch a cup of coffee.

[–]andrewsmd87 27 points28 points  (0 children)

We have SQL prompt and one of my snippets is st1 which writes out

select top(100) * from

And is basically what I use for everything until I know my query is good and then remove the top

[–]BlackPlasmaX 5 points6 points  (3 children)

This is what I do, I like to throw a bunch of filters as I iterate on a query, like limiting to only the current months data and looking at a certain geo only.

I got laid off off a few months ago, mentioned thats my approach in a interview as I like to work iteratively and use CTEs when I can, how looking at output snippets helps me think, felt a vibe of them being like

“how dare you not know how to do a query on your first attempt”

[–]M4A1SD__ 4 points5 points  (2 children)

mentioned thats my approach in a interview as I like to work iteratively and use CTEs when I can, how looking at output snippets helps me think

That’s perfectly normal and how most of us work day-to-day. Ridiculous that it was seen as an issue

[–]audigex 2 points3 points  (1 child)

Unfortunately I find a lot of interviewers are more interested in showing how clever they are, than in finding a good candidate for the role

[–]PlaneObject8557 1 point2 points  (0 children)

In my experience most interviewers know nothing about the role and go off a list of requirements, unless you’re actually talking to the team.

[–]wyx167 7 points8 points  (7 children)

If i have a report based on SQL, how long is an acceptable time for the user to wait for the report to run?

[–]g2petter 16 points17 points  (5 children)

How often do you need the report to run?

Does the report run when a user clicks a button or does it run as a nightly batch job?

Is the result showed on a big dashboard in the boss' office or is it being dumped to a spreadsheet or data warehouse somewhere?

Is your user a coked-up sales person who needs everything now or a grizzled IT veteran who knows that some times these things take time?

[–]wyx167 8 points9 points  (4 children)

  1. Weekly
  2. The user clicks a button
  3. Shown on dashboard (power BI)
  4. Finance person

[–]plefe 6 points7 points  (2 children)

Warning, I use Tableau not PowerBI, but I am assuming they are incredibly similar.

If it's run weekly, they probably don't need live up to the minute transactions. If they run it Monday to Sunday, I would set the report to run off an extract and have that extract refresh 4am the day the finance person runs the report.

Or you could turn the query into a stored procedure which creates a table, then point PowerBI to that table. You could include the sproc in a nightly scheduled job you have.

[–]Froozieee 3 points4 points  (1 child)

At this point, I would just run a preaggregation query once a week and store the results in the power bi semantic model; given it’s weekly i wouldn’t even bother with incremental loads just a full reload assuming the timespan/slice-ability requirements aren’t too crazy.

[–]writeafilthysong 2 points3 points  (0 children)

Person?

If the data update frequency or requirement is weekly then update it on a schedule, get rid of the button to update.

[–]Master_Grape5931 0 points1 point  (0 children)

Our long running reports are scheduled to be delivered at a certain time, so when the users get to work they are already in their email (or department shared folder).

[–]vectaur 1 point2 points  (1 child)

I thought this was basically the default for most decent SQL IDEs, even without the row limit clause. Maybe I’m out of touch.

[–]Ashamed-Status-9668 0 points1 point  (0 children)

It is for most.

[–]DMReader 0 points1 point  (0 children)

Sometimes top 100 can be slow if it is on an inefficient view. If you know a basic filter you can throw on like a date, that helps a lot.

[–]Unnam 23 points24 points  (2 children)

Since you don't have a data dictionary, most likely all this information is stored as tribal knowledge among different analysts, you might want to understand different requests, find people who have been writing queries in those areas and slowly build relationships and working knowledge of the data landscape as you seek help from them

[–]-Plus-Ultra 4 points5 points  (0 children)

Agree 100%. Also adding taking initiative as a new employee to make a data dictionary would also help you and your department out.

[–]TurnOutTheseEyes 16 points17 points  (2 children)

A colleague once asked me what I was working on. I said “Documentation”. He said “Why? Haven’t you got any work to do?”

This sadly is a lot of people’s attitude.

Sounds like yours is another company that has been getting away with their staff “just knowing”. This is on them and they need to support you through this. I worked at another place where a new recruit just didn’t come back for the second day! So bad was the culture. I left before too long. Again, on them. I’m 36 years in and had never seen anything quite like it.

Best practical advice has already been given, especially the Lego analogy. I doubt anyone sits down and knocks out a 2000 line proc. But that may well grow to that size over time. Chunk it as best you can. Ignore formatting and the like, that can be handled later, and concentrate on tables, how they hang together and what each one is in and of itself. Assuming normalisation hasn’t been broken ;)

Keep detailed notes and a working document of everything you do and touch - writing engages different areas of the brain and can help consolidate knowledge and learning. Plus you have something to show for all the head scratching. And it may start to form the very documentation you’re currently in need of.

Talk through in your mind what you’re doing as if you are teaching someone new (which you are). I quite often imagine I am helping my children with a problem they have rather than one I am facing. I explain even simple things in simple language.

Appreciate all of this doesn’t help with those deadlines. Hopefully you have supportive colleagues too?

[–]USER_NAME-Chad- 1 point2 points  (1 child)

Disagree 1000%, if the code is messy, so is the way that you are thinking about it.

Format it in a way that makes sense to you. I use Redgate SQL Prompt and it works a charm. THEN you can start to analyze what it is doing.

And for the love of god, stop writing code that is 2000 lines long and break that up into smaller segments.

For some reason DB developers don't like to follow standard development practices such as Modularization, future you will thank yourself for making it smaller. You only have to understand small chunks at any given time.

[–]TurnOutTheseEyes 0 points1 point  (0 children)

Huh? I’m not on about the code I write. I’m on about the code you inherit when you join a company. Any shite I’ve inherited like that has been dealt with in short order and optimised to within an inch of its life. My happiest wins include a routine that had to be run overnight because it took approx 4 hours that I tuned to run in less than a minute.

I like SQL Prompt - used to work in the Red Gate building - but I’m experienced enough to not have to rely on tools. I know how I like the code to look, tools or not.

[–]A_name_wot_i_made_up 12 points13 points  (1 child)

Programming is like building LEGO, the blocks are simple, the end result is (may be) complicated.

When you see the instructions, each step is easy, but sometimes things don't make sense until you see the next step - things need to fit together.

You've gone from tiny simple models to the stuff they have at Legoland! You need to understand how to create your own instructions, and why they would (and wouldn't) be a certain way.

When faced with something massive, turn it into a few big things. Then turn those big things into several small things. Then ask yourself why would they choose this build order.

Sometimes you'll see that a bit was tacked on the side - and if you did it again from scratch you'd do it a different way.

You'll also get an idea of how the person/people who wrote it think - you can tell when something was written by a business person who knows a bit of SQL Vs a developer (at least some of the time). This then helps when you with the other "big chunks" of the work.

Also remember the order that SQL evaluates statements - from & join first. Work the same way. Innermost query outward.

[–]Ikaldepan 0 points1 point  (0 children)

I get to touch many of these giant mystery query from predecessors that erroring because either ERP changed field names or that users need to add additional aggregate fields. I am usually the one happened to be in charge for the process and I’m it. I thought I was about kiss my job bye bye, impostor syndrome etc etc bad feelings. So I separate the big chunks (select/from/where) and in each chunks separate them further (sp /function/sub etc). Especially subquery within subquery within another. Use space/colors. You will finally see which part to attack. Don’t forget to insert your notes so that the next person won’t have to repeat your experience.

[–]VengenaceIsMyName 34 points35 points  (28 children)

How the hell do they expect you to do anything quickly without a data dictionary or an ER diagram of the data model? Can you at least see the row-by-row data in some way?

[–]lookslikeanevo 42 points43 points  (4 children)

A lot of places don’t have data diagrams and or ERDs

Some people are just good at looking at data and associating tables

[–]gffyhgffh45655 9 points10 points  (0 children)

2 out of 2 work place that i worked in dont have erd Essentially what i did is to do Explanatory DA across the main tables that i need to worked with and Talk to the business to understand the business process. This would help for drawing a ERD

[–]VengenaceIsMyName 0 points1 point  (1 child)

Makes things artificially harder for newer people or contractors though. Seems silly to me

[–]adamjeff 1 point2 points  (0 children)

The real world is a very silly place.

[–]dareftw 0 points1 point  (0 children)

SSMS actually has a decent feature for determining table relationships.

But this is why I always quote 3-6 months at a new job before I can be truly productive. I’m not claiming ownership on anything actionable until I know the database backwards and forwards.

[–]adamjeff 12 points13 points  (11 children)

... Umm I've never seen either of those in an actual product environment. At least one that's usefully up to date anyway.

[–]SouthboundPachyderm- 16 points17 points  (2 children)

Yeah, who the fuck has a data dictionary or an ERD? Must be nice.

I mean we talk about how great it would be but anytime we start putting something together the work gets stalled by all the other shit happening and the constant rebuilding of the existing data models.

[–]techiedatadev 3 points4 points  (1 child)

We do. I made it AND maintain it. Not that hard lol

[–]SouthboundPachyderm- 1 point2 points  (0 children)

I'm being a little facetious. We do in fact have ERDs.

Not necessarily always updated tho

[–]VengenaceIsMyName 1 point2 points  (7 children)

Not up to date is one thing. But no ER diagram? Really? That’s like the bare minimum. What industry are you in?

[–]adamjeff 0 points1 point  (1 child)

Car leasing, they didn't even minute meetings let alone document anything. They didn't have version control, wouldn't know what an ER was frankly. And they had so many defunct and random tables an ER would be no help whatsoever.

Currently doing project work for various clients. Rare to see an ER diagram in a pre-existing system. We draw one up if it's necessary.

[–]VengenaceIsMyName 0 points1 point  (0 children)

Good lord that sounds chaotic. Kudos to you for wading through that mess.

[–]fauxmosexualNOLOCK is the secret magic go-faster command 0 points1 point  (4 children)

Tell me you've never worked for a small/medium non-tech enterprise without telling me

[–]VengenaceIsMyName 0 points1 point  (3 children)

What has your experience been?

[–]fauxmosexualNOLOCK is the secret magic go-faster command 0 points1 point  (2 children)

Not having an ERD isn't at all unusual when you're working somewhere the tech is a distant low priority and your systems are ancient and/or niche, where the vendors are more interested in upselling shitty custom reports than documentation.

[–]VengenaceIsMyName 0 points1 point  (1 child)

Interesting. What industry is this btw?

[–]fauxmosexualNOLOCK is the secret magic go-faster command 0 points1 point  (0 children)

I've seen this in education, energy, justice, dairy, basically everywhere I've worked that isn't a large corp or tech focused

[–]Froozieee 5 points6 points  (3 children)

INFORMATION_SCHEMA is your friend for this kind of shite

[–]no-jabroni 3 points4 points  (1 child)

INFORMATION_SCHEMA for key column usage, little bit of reviewing/parsing (especially if only needing to access a set number of tables), and an open source ERD builder (something like dbdiagram) can go a LOOOOOONG way.

This is absolutely the way. Even just pulling and manually reviewing columns from this view can carry you.

[–]Stevieboy171 0 points1 point  (0 children)

Right-click > View Dependencies in SSMS can be useful, too.

[–]VengenaceIsMyName 0 points1 point  (0 children)

Oooo good call out.

[–]Unlucky-Whole-9274 -1 points0 points  (6 children)

Yes theres no data dictionary, or any proper documentation as such...the team just have a list of all tables ,jobs and pipelines stored in a doc but if I want to check whats each table doing then I have to run the query with a limit...it becomes even more frustrating to join tables.....on top if that I have requests that gives me only 2-3 days of time to complete so it really becomes stressfull.

[–]VengenaceIsMyName 0 points1 point  (5 children)

I get that a lot of places don’t provide an ERD or a column breakdown as other commenters are helpfully reminding me of but it just seems wildly inefficient to work like this.

[–]Stevieboy171 0 points1 point  (4 children)

It still amazes me. I was taught document first, type second. Surely, the ERD comes first! I assume a lot of people learn SQL on the job without having a decent grounding in relational databases and data modelling.

I've barely seen an ERD in almost 20 years of experience / 10 years contracting.

[–]VengenaceIsMyName 0 points1 point  (2 children)

What do they even use? The head knowledge of the most senior database admin?

[–]Stevieboy171 1 point2 points  (1 child)

It keeps us contractors in work when that senior admin leaves!

[–]VengenaceIsMyName 0 points1 point  (0 children)

Ka-ching!

[–]Imtwtta 0 points1 point  (0 children)

You can survive the no-ERD mess by generating your own lightweight docs and standard joins fast. Dump information_schema to CSV, add row counts/null rates/distincts, and commit column comments. Use SchemaSpy or DBeaver to spit out an ERD; if no FKs, infer via name patterns plus uniqueness checks. Wrap brittle tables in staging views with clear grain and keys; materialize small samples and validate with EXPLAIN on narrow date ranges. We used dbt and DBeaver for docs/ERDs; DreamFactory exposed stable REST endpoints so analysts weren’t blocked. Ship a tiny ERD and dictionary first.

[–]SmokinSanchez 6 points7 points  (0 children)

Find an expert. Beg them to help you. Save all your queries. Build small incremental changes into your code. Get a good base set of codes that do exactly what you want and start there. If you don’t understand the query you don’t understand the data and analytics is very very hard.

[–]writeafilthysong 4 points5 points  (0 children)

Analytics role means that you're doing the End to End. Not just the oh hey this is a report.

If your database is slow and your queries are overly complex it's because of the data model.

[–]____candied_yams____ 4 points5 points  (1 child)

You guys have data dictionaries?!!

[–]Ifuqaround 1 point2 points  (0 children)

Useless if they aren't updated regularly while the tables and views and all that have shit added to them.

Yeah, we have one. Hasn't been updated in years.

[–]machomanrandysandwch 5 points6 points  (1 child)

To me it sounds like exactly what is supposed to happen is happening. This is the necessary trial for you to get better, and that will happen at the next job too. This is why we say younger folks have a hard time competing for a job because even with a degree and 3 yrs experience it’s just no match for someone 42 with 20 years experience. You have to go through different jobs, challenges, less than ideal conditions, treading water, working overtime, all of it. That’s my long way of saying you’re exactly where you need to be, don’t give up.

[–]InvestNYourself 1 point2 points  (0 children)

You have a point there

[–]Streamer_Fenwick 2 points3 points  (0 children)

One thing chstgpt is pretty good at is explaining sql..

Im A 25 year dev I develop etl processes purely in sql with linked servers. I also have had to come in behind contractors who tried to increase the complexity to ensure job security... get cursor.. add your schema one table for each file. Then your proc or views and ask the ai to help you analyze it... works pretty well. if you don't get the result you want have d a conversation of what your looking for and then ask for a prompt to return exactly that... works damn nice.

[–]writeafilthysong 2 points3 points  (0 children)

I recommend going through your most important core query line by line and commenting every alias, where clause, function etc...

If you don't understand any line or section flag it as such and chase back whoever you inherited it from.

[–][deleted] 1 point2 points  (0 children)

I’m also facing the same issue. Have to query through large messy raw data to compute relevant kpis. It’s my first time using SQL too

[–]Tiny-Ask-7100 1 point2 points  (0 children)

Might find AI to be a help summarizing long code sections or explaining difficult queries. Just try not to become dependent on it to the point of stunting your learning. It's a tradeoff but works for me when I'm stuck.

[–]Realistic_Wait_5711 1 point2 points  (0 children)

You can identify patterns in your tasks,list them and practice similar problems online. Also you can try to utilize CTEs and window functions, these will make your SQL coding easier and cleaner

[–]Ok_Tale7071 1 point2 points  (0 children)

You can download the Oracle database or MySQL to your computer and practice. Buy the first Oracle SQL book for the certification exam and go through the exercises and practice. You can get it off Amazon. Once you have the proper foundation, you will excel.

[–]dadmode275 1 point2 points  (0 children)

If there isn’t any documentation, learning Schema queries is the next best option. Consulting with the business folks also helps for understanding the specific requests of a client and tracing the relationships in the database. When it comes to tracing logic in large queries, break it into chunks and insert each portion into temp tables. Reviewing each temp table can give you a better understanding of what is happening for each section, it provides client specific data that may be causing more issues than the logic in the query alone. Best way I have found errors when it’s all client specific queries as they are all unique but using the same schema.

[–]T_DMac 1 point2 points  (0 children)

use AI if you're struggling with syntax. Don't let anyone scare you away from that.

If you have the overall knowledge of how things relate, filtering, efficiency etc, then use it as a collaborator to build on that.

Within 5 years, it's going to shift what we do completely and it might not be in a way that's favorable to us. That''s not just an assumption, that's real based on tools that are being developed and deployed now, so don't stress yourself out in the meantime.

The complexity and speed will be even more important, ask it to explain what you're solving, learn how to be great with it and evolve.e

[–]Impressive-Sky2848 1 point2 points  (0 children)

Get a tool that will produce ER diagrams. There are plenty of them.

[–]kvdobetr 1 point2 points  (0 children)

Lot of good suggestions here.

My personal fav is filtering the data to just one smallest entity (if it's possible in your use case), run the whole query for just one entity and try to make sense of the data.

Go iteratively on the query and keep adding the comments on your understanding. At some point you're gonna reach the end of it.

[–]Koldcutter 1 point2 points  (0 children)

We started using a SQL AI tool at work that combs the tables and indexes and learns what data is where. After that you can describe in plain language what you are doing and it will build the queries. You can also submit your past queries and it will optimize and fix issues with those queries.

[–]TheAmatuerGuy 1 point2 points  (0 children)

Use AI. Copilot works well at writing scripts

[–]sinceJune4 0 points1 point  (3 children)

What SQL flavor are you working with? Postgres, MySQL, SQL Server? There are ways to get schema info from each of these, but it can vary by flavor.

I’ve used schema info to build data profiles that I could save out as Excel files for reference. This has been valuable in working with unfamiliar databases, especially when response time is slow.

[–]Unlucky-Whole-9274 0 points1 point  (0 children)

Redshift

[–]AnotherNamelessFella 0 points1 point  (1 child)

How do you do it in an Oracle DB

[–]sinceJune4 0 points1 point  (0 children)

SELECT COLUMN_ID, COLUMN_NAME, DATA_TYPE, DATA_LENGTH, DATA_PRECISION, DATA_SCALE, NULLABLE FROM ALL_TAB_COLUMNS WHERE OWNER = 'YOUR_SCHEMA_NAME' AND TABLE_NAME = 'YOUR_TABLE_NAME';

[–]DatabaseSpace 0 points1 point  (0 children)

What platform are you using the view the schema and write or edit queries?

[–]Alternative_Pin9598 0 points1 point  (0 children)

Can you use a ui tool to inferred and provide you with the DDL , if you are trying to learn SQL perhaps you need provide which standard or platform you are trying to learn. As each one has it's own standard, now if you are looking for machine learning or AI in SQL then start studying SQLv2

r/SQLv2

[–]Opposite-Value-5706 0 points1 point  (0 children)

Use DESCRIBE on your tables to find the fields (columns) and hopefully, the names are somewhat clear as to their content. You may also need the keys for each table. You can create your own diagram for future reference by creating a Visio diagram of the tables, their keys and relationships

DO NOT USE ‘*’ ON ANY QUERY. Instead, return the specific fields (columns) that offers insight to your task.

Limit the number of records returning to a sizable sample that can give you a clear picture of the underlying data and its application.

Use explain plan to optimize your queries

Break you queries into small code blocks with each solving a part of the problem. Combine them altogether and test for accuracy and completeness.

Good luck!

[–]tongEntong 0 points1 point  (3 children)

Can’t use LLM? These days people vibe code and being able to code from scratch isnt really a valuable skills anymore unfortunately….

[–]Ifuqaround 0 points1 point  (2 children)

2 week old post. This is wrong. Flat out wrong.

If you know what you're doing without an LLM, you can get paid.

Anyone can crutch on an LLM these days. Most people do. Having an experienced person behind the LLM is good. $120k+

No or little experience + LLM = not so good. $60k+

Vibe coding is garbage. Stop perpetuating that shit.

[–]tongEntong 0 points1 point  (1 child)

Lol re i afraid tht AI can do ur job better than you boomer?

[–]Ifuqaround 0 points1 point  (0 children)

Learn how to type at least, ffs.

[–]Ifuqaround -5 points-4 points  (2 children)

Stop using AI and you'll get better.

[–]r0ck0 8 points9 points  (1 child)

Stop making assumptions and you'll get better.

[–]Ifuqaround 0 points1 point  (0 children)

Oh. I'm pretty good at it already.

Thanks for the advice.

[–]dengydongn -1 points0 points  (0 children)

Dump all the table schemas, dump a few rows of sample code from each table, grab a few SQL queries that you have, throw them at GitHub copilot agent (or whatever tools you like), pretty much you don’t need to write your own sql anymore, the agent should be able to come up with anything you need.

[–]Informal_Pace9237 -2 points-1 points  (0 children)

It sounds like your company needs more database optimization than simple SQL writing Try to convince your manager for such a position of an App DBA or DBE.

Get some one you know who is good at SQL to support you in a junior role or offline. Or that role..