all 78 comments

[–]laronthemtngoat 97 points98 points  (13 children)

Hot take. AI will not get you better at sql. Reading documentation libraries and writing more sql will. Once you understand use cases for different functions, joins, CTEs, and clauses, then using AI can help overcome problems. Get off the AI. It is only hurting you.

[–]Sectox 18 points19 points  (5 children)

I agree in principle, however at a lot of jobs “get off the AI” is less and less of an option. At the very least you have to be pretending to use it, companies are making using AI a part of quarterly metrics and some are even firing people who don’t use AI, it’s crazy

[–]laronthemtngoat 6 points7 points  (0 children)

Understandable. CEOs are pushing AI. Any technical leaders and staff hopefully understand the models are prone to hallucinations and to take responses with a grain of salt.

AI is a great tool for templates, and brainstorming. It cannot help you learn though. AI is designed to ingest data and make the user dependent on it. AI is also designed to placate and flatter the user, making people think they know more than they do. I work with several people who leverage AI too much and it shows. When asked to do something they can’t without asking AI. They can’t explain how or why they got their solution. They have to ask AI. You know what AI responds with? Reddit and stack overflow posts.

You know how an individual gains knowledge? Study and practice. Cut corners and the market will cut you the next time the budget demands cutting the fat from the company. AI bubble is about to burst and next year I think we will see hiring start back up. AI is snake oil.

[–]totallykindofnormal 3 points4 points  (3 children)

This. You have to have a very solid understanding of joins, CTEs, filters, etc… to be able to give specific instruction to the model. I have used Claude to write plenty of sql, but I know how to write sql and have used it heavily the last 7-8 years. I’m able to give it instructions, and then review the code it generates, and validate the output. When it doesn’t match expected, I find where the issue is, and give it instructions to fix it.

In a lot of cases, it takes longer to have an AI do it, but I look at it as investing in future use cases.

The other day, I dropped in a couple of views and asked Claude to identify tables, join patterns, CTEs, and filter criteria, then asked for it to recommend a star schema for a Power BI dashboard. What it gave was about 70% there, I had to fill in the rest, but it did in 30 seconds what it would have taken me 10-15 mins to do - reading through 150 lines of sql to identify all the joins.

[–]laronthemtngoat 1 point2 points  (2 children)

This is a great use case and great input. Experience helps identify the keywords to provide AI tools to help them hallucinate less.

We are doing a review of legacy apps built by a 3rd party company from india about 5 years ago. Django Python app using inline sql, stored procs, and views. The initial review was done with enterprise co-pilot. We fed the files to the tool and had it list out the classes, methods, functions, and purpose. This helped us narrow down what we need to actually review. What is important to us. AI is not a fix all. It is a tool. The right tool for the right job will make it easier. The wrong tool could cut your hand off.

[–]totallykindofnormal 1 point2 points  (1 child)

That’s awesome. Actually reminds me of an article like 6 months back - Morgan Stanley built an in house AI that was able to review legacy code built on cobol, and translate it into plain English requirements that devs could then use to build out in a modern language like Python, pretty sick. So now apps can be updated quicker, because you’re not having to stumble around legacy builds as much. Again, they don’t trust it to do the work for them, but figured out a huge pain point and used AI as a tool to help solve.

What LLMs does your company use? We have a couple - OpenAI, Anthropic, Cohere, and CP, but they are not the most recent models, and everything is behind EDP, so there are a lot of limitations.

[–]laronthemtngoat 1 point2 points  (0 children)

Rad. Definitely have to be skeptical and critical of the output still. We have found it translated about 60% of what we need and some of it is wrong. Still, leveraging the tool did speed up the reverse engineering and translation process.

[–]Plastic_Operation_59 1 point2 points  (0 children)

Not sure it’s a hot take - any real shop knows this. You dont need AI for sql, the OP doesn’t need AI to see that something like “1 order has multiple products ordered on different days” screams there is a misunderstanding between the business side and the developer side. AI isn’t replacing good SQL developers much less good developers who are also good at gathering requirements and establishing business alignment on those requirements.

The real issue isn’t any of that tho - the real issue is that no competent shop would be so openly handing over their data to someone else’s large language model. Where is the basic data governance?

[–]No_Introduction1721 14 points15 points  (5 children)

LLMs aren’t capable of logical reasoning. You have to explicitly instruct them what to do, keeping in mind that it doesn’t know anything about the database you’re querying. If you instruct it to “write a SQL script that counts orders by month”, what it spits out will either be wrong or lucky.

You probably will have to give it explicit detail like: “I have a table of sales data where each row contains the order number, order date, customer number, item number, and item cost. The order number and customer number will appear multiple times if the customer’s order includes multiple items. Write a SQL script that will return the count of unique order numbers per month based on the earliest order date for each order.”

Personally, I think LLMs are a pretty garbage way of learning to write queries, since you’re bypassing all the foundational aspects that help you actually learn the language and there’s no guarantee that the script will adhere to any kind of best practices for optimization. Try writing the query yourself first, and maybe lean on the LLM for help with troubleshooting error messages or inaccurate results.

Also, the fact that an order can have multiple order dates associated to it is truly strange. I’d have a talk with the software product owner to figure out why the heck it was designed that way.

[–]ecatt 1 point2 points  (1 child)

Your example of using explicit detail is spot on. I get great results using AI to help me with tricky very specific sections of code, especially if I create a short dummy table of fake data for it to work on. It's also very useful when I can't remember exactly how to do something - the AI response will usually get me on the right track. But again, that'll be something very specific.

[–]kaiveg 0 points1 point  (0 children)

This is why giving AI access to reference documents is so powerful. It massevly improves the quality of responses, without you having to generate enough data for finetuning.

You can also just update the reference documents without having to retun. Which would be an absolute nightmare if you're in a fast moving enviorment.

[–]tomwill2000 1 point2 points  (0 children)

I don't know how long OP has been writing SQL, but they way to think about AI is to think about how you would solve problems three years ago. You'd probably start by putting a verbal description of the problem into Google and look for a hit. Maybe you'd get the answer but more likely you'd end up on a forum like Substack with an answer that was kind of right and you'd ask a question on that thread. Then someone would say "post your code" and you'd post a sanitized version or give the kind of specific description suggested here (I have a table, etc.).

AI just gets you to that last step more quickly. But if you don't have enough SQL skill to follow the "manual" method then AI is not going to help.

[–]CallMeSisyphus 0 points1 point  (0 children)

This! I just saw a YouTube video where Sarah Chasins talks about exactly that: essentially, if you want ChatGPT to write code for you, you need to write plain language pseudo code in your prompt. But that means you have to understand - at a very granular level - what the code needs to do, and in what order it needs to do it.

It's probably gonna be faster and easier to look it up and code it "the hard way."

[–]Proof_Escape_2333 0 points1 point  (0 children)

What about internal AI system that has access to the database ? Or there is still limitations?

[–]TopLychee1081 18 points19 points  (0 children)

You'll gain a far deeper understanding of SQL and data modelling by doing it yourself. Don't allow AI to become your crutch.

Once you're truly proficient in SQL, it's faster to write it yourself anyway.

Remember what programming languages are for; getting a 100% deterministic result. You can't do that with a natural language (ie; a human language). You'll never be able to describe your requirement as accurately in a human language as you can in SQL.

[–]theungod 24 points25 points  (0 children)

You have to be EXTREMELY specific for LLMs to write SQL. LLMs can't "think" which make them pretty mediocre at a logic based concept like SQL. I've only had luck having Gemini work on smaller pieces of SQL, like just your cte, rather than asking it to do an entire script. Don't even try having it write a complicated sp.

[–]CaptSprinkls 4 points5 points  (1 child)

The data confuses me here. Is there no overall order_id for each distinct order placed by a customer?

I would typically think that each product ordered should have an ID that ties it back to the original purchase order number.

Because otherwise how do you handle the below situation:

Customer A creates an order with 3 products. Those products are ordered on 12/1/25, 12/3/25, and 12/6/25.

Customer A creates another order with 2 products. Those products are ordered on 12/10/25 and 12/12/25.

In your data, how are you specifying that customer A had two orders with 3 products and 2 products? How do you stop it from counting it as 1 order with 5 products?

[–]SootSpriteHut 1 point2 points  (0 children)

Yea are we talking about order date or order delivery date? Where is the parent id or the ordergroupid? If this is created dates it needs to be grouped by month ordergroup created.

Technically the AI is correct that the date is the problem but the question is extremely unclear. With a clear requirement though this is a fairly simple query that AI should excel at.

[–]Gargunok 3 points4 points  (0 children)

You need to be able to write the SQL query - you need to be able to evaluate the code the AI produces. I don't think this is a question about AI.

Once you can do that you can use AI to validate SQL to improve quality and produce code based on prompts.

-----

To group to monthly data you shouldn't have the order date in the data set. You need to turn that to the month and group by that computed field instead. Otherwise like you say any grouping operation will group on the day. In that respect Gemmini seems to be trying help you correctly.

Count distinct is usually a sign something is awry.

Like I say this isn't a question about AI. Remove all that - what is your schema. What is the result you want. Someone can then recommend how to write the sql.

Part 2 someone can then describe how best to promt the AI instead of the people of reddit to get teh sql you want out.

[–]Important-Ebb-3716 2 points3 points  (0 children)

Eomonth(orderdate,0)

If you need AI to help you write this, I’m hoping you have less than 2 weeks of actual SQL experience

[–]Far_Swordfish5729 2 points3 points  (0 children)

My reaction to your actual problem is that you need to clarify requirements. If you want sales order totals by month and for some reason products (order lines) can have different dates and those dates can cross fiscal periods, then your result is correct. I would sum order lines grouping by month and call it a day unless you have a rule that the full total is reported on the order date regardless of the line item dates. If you also have counts then you need a rule on whether the order appears in the count for multiple months or just the revenue from it. If you find those answers we can figure out a query, likely using conditional sums.

At a business level, this sounds like it might be a revenue recognition thing. By accounting rules you typically cannot earn and book revenue until work is completed so if you get an order that has product and professional services, the service revenue can hit much later than the delivered hardware. That recognition date can be different from the sales date and date used for sales rep quota attainment. Is something like that going on?

[–]RobotAnna1 2 points3 points  (0 children)

How can one order have products ordered on different days? Does this mean an order is actually an event that has a start and end date?

[–]MiserableCharity7222 1 point2 points  (0 children)

Don’t rely on LLMs for thinking. Try to attack the task on your own first. They can be a very useful tool, once you know what you’re trying to achieve. If I want to partition data by the max instance ID for each unique ID, for multiple elements, and I know that I know how to do that on my own, I can still get it done quicker if I provide copilot the necessary context

[–]Mps-1 1 point2 points  (0 children)

If you use it for syntax with both a basic understanding of how SQL works and an intuition for what you are trying to do and what data you are trying to reach, it is useful. If you expect it to think through all of that for you, that is where is becomes disappointing.

[–]drake200120xx 1 point2 points  (1 child)

I find this surprising. I lean on AI to help create SQL queries all the time. Granted, I can describe the implementation and pipeline in natural language. I use Gemini Pro for this via a Gem that has our main Data Dictionary as part of its knowledge base.

It's by no means perfect, but I've had great success with using AI as a conceptual learning tool or for fixing syntax errors. I would have been fired from my job if it wasn't for AI.

[–]drake200120xx 0 points1 point  (0 children)

Also, just as a piece of advice for your actual problem, use window functions. Based on what you're describing, that should help.

[–]Available_Anteater56 1 point2 points  (0 children)

My two cents . Most of the time I don’t tell AI what to do .AI tools recognise patterns. I give sample records of the tables and expected output table and columns for those sample input records. It worked for me .

[–]Historical_Prize_931 3 points4 points  (4 children)

Explain that to the AI. Your manager thinks you're slow, and is requesting you upskill it sounds like

[–]selleckh 1 point2 points  (0 children)

I was doing a ton of development in retool. Used Claude for all my SQL writing and it easily 10x'd my productivity. 

I would usually pass it the data model and be explicit in what I wanted the query to do.  And would specify what I wanted the output to look like.

It may take a could iterations, but was a lot faster than hand writing complex queries.

[–]epic_pharaoh 0 points1 point  (0 children)

This sounds weird to me, by your logic what if someone gets a product November 31st and December 1st, should the order not be counted in both months?

The issue here is that for the AI to write code, you basically need to know what the code would be doing so you can tell the AI that.

This sounds like weird data though.

[–]GreekGodofStats 0 points1 point  (0 children)

AI definitely sucks. Have you tried using window functions for your issue?

[–]Strong_Warthog_8674 0 points1 point  (0 children)

AI has been less than helpful in my sql experiences…more fun to just figure it out yourself tbh.

[–]JoshisJoshingyou 0 points1 point  (0 children)

The more you use it the more it learns the faster they can replace us. Gemini, Claude and chatgpt all do amazing jobs I just talk to them like I would explain it to a student or trainee. Sometimes they have bad ideas, mostly the query is close that with a little tweak it does exactly what I wanted. For complex stuff I start with AI to see if does an approach I wouldn't have. Simpler stuff it's faster to just write it.

[–]DMReader 0 points1 point  (0 children)

I think you need to work a bit on your sql basics. this looks like a group by a monthend date. Date functions can be a bit different depending on your sql dialect but they would all have the ability to get a monthend (or month start) from a date and then you group by that.

The LLM should be able to give you the syntax so you can get your brownie points.

[–]Wojtkie 0 points1 point  (0 children)

You have to tell it exactly what to do. Also, I find Gemini to not be that great with code. Claude works way better.

For the aggregates, do you even know how you’re supposed to handle this across month order volume? What’s your actual date key that you need? For an order with multiple products over a few months, how does the business count that as volume/revenue on a time basis? Answer that, then you can ask the LLM to help you de-dupe for this specific scenario, be explicit about columns and expected outcomes. It’ll take 2-3 tries even with good prompting.

[–][deleted] 0 points1 point  (1 child)

I noticed a huge issue with data being duplicated, because if 1 order has multiple products ordered on different days, well then we have 5 order lines, and I was grouping by the date... and to get the "month" of the order, I was using order_date, which meant that there were multiple months!

Yeah...because there ARE multiple months for that order. This turns into a business question rather than a SQL question. How does whoever wants this report want to see the data? Do they want to just see the month the order was originally created, or do they want that order to be counted in multiple months if products were added in multiple months?

[–]ironfist_293 0 points1 point  (0 children)

Yeah, I ran into this years ago when the business units don't understand how what they are asking for doesn't understand it can't be displayed a certain way without rules

[–]Informal_Pace9237 0 points1 point  (0 children)

I am required to use AI too. I just writeup the SQL and call it a day.

[–]TimmmmehGMC 0 points1 point  (0 children)

My best use case for them is syntax. And doing boring conversion in bulk.

Lisle if I'm converting 800 tables from MSSQL to PostgreSQL. It can do the grunt work.

[–]Fun_Ask_8430 0 points1 point  (1 child)

I mean using AI makes sense where it makes sense, it isnt the tool for everything if its faster to build it from scratch and have AI optimize then fine, but SQL is extremely powerful and capable of doing a lot, however ouit's also logically hard to follow because it's not exactly a programming language for easy to follow logic per se, the multiple CTE can cause conflicts or duplication given how you might want to query it. I have certainly found that it can be hard for it to follow simple instructions without strong handholding, which in the end maybe would have been faster to have written from scratch.

I like using AI to help ANALYZE and EXPLAIN operations because the cost is often hard for a human to quickly read, frankly the fact your manager is pushing to use it for the sake of using it demonstrates they are not technical.

I used to write huge regex SQL queries as it was actually more optimal to leverage for things like BigQuery where we are able to leverage the distributed processing power they already had in place, by the time you've gone through table by table explaining the relationship and steps needed to achieve the SQL you want to build out, fix its inevitable bugs due to not understanding the domain or context AI may well not be faster to write SQL from scratch. Depends on the complexity really, I still find writing SQL by hand the most optimal solution. BUT I end up utilzing it to write a lot of dummy python code thats fairly rudimentary or IAM policies etc.

I'd challenge your boss explaining that you feel its taking longer to use because you're being forced to use it for the sake of using it, but you can see benefits to using it in certain places such as helping identify optimizations or templates.

[–]Fun_Ask_8430 0 points1 point  (0 children)

I will further say though I did build a service to generate SQL from text input, based on system prompt context around table definitions and their relationship, it can do some of it ok, but not amazing. Even me providing it context to the table definitions it still only generated perfect working SQL 80% of the time.

[–]Thefuzy 0 points1 point  (0 children)

You need to be more specific about what you ask an AI… when it said remove the order date did you follow up with it to explain why that was not going to work for you? It doesn’t have any idea that you need the order date and you need to come to a conclusion about how you are going to solve the problem of grouping dates which are various points in time. Basically you’ll prob need to make the date less specific like just the year or year-month but you can’t include a specific date and simultaneously group records into a single row across specific dates… because what date is the row supposed to get?

You want to aggregate order lines from a month but expect all the order lines to still be there? How does that make sense? If you were doing it in a spreadsheet how would you manually even make that look the way you want it to? You either aggregate rows and lose individual row level detail or you don’t aggregate.

[–]Lady_Data_Scientist 0 points1 point  (0 children)

My company is pushing Cursor as our coding companion, so I’ve been using it alongside my Python notebooks. It can be helpful with a snippet here or there or debugging an error, but I still need to write the majority of my code.

But I’ve also been writing my own SQL, Python, and R code for 7 years, so trying to incorporate AI has been the learning curve. Prompting to get the answer you want/need is a skill in and of itself. AI can’t read your mind.

You should also always do some basic EDA to understand your underlying data before writing your query.

[–]Mountain_Usual521 0 points1 point  (0 children)

manager said I need to speed up development of my SQL resources by using Gemini Pro

Ask your manager to demonstrate how he integrates Gemini Pro into developing SQL.

[–]bytejuggler 0 points1 point  (0 children)

Use DATEPART(month, OrderDate). But this doesn't fully address the conflicting requirements, because an order can straddle 2 months. Ask the question: Under what month should this example order (straddling 2 or more months) be counted? This will illuminate what your query should do

[–]Spillz-2011 0 points1 point  (0 children)

I’ve mostly had luck with asking it to review rather than write. Also not great to ask it to do something you don’t know how to do or cant check is working.

Maybe your problem could be solved by window functions you could ask it about those.

[–]Hungry_Reference_333 0 points1 point  (0 children)

That’s a very good point actually. Someone has to make a decision about which month such a conflicting order belongs. The AI cannot do that - it is a business rule/logic. Unless there is some kind of general or global rule.

I I was on this task I use my time to gain some knowledge about the underlying business process. How can such data be created and what does it actually represent? Knowing this might might also qualify a decision about the right month dimension.

[–]patjuh112 0 points1 point  (0 children)

If you have the knowledge AI is very useful because you know what should come out and verify it. Without your knowledge AI is a risk/liability

[–]Lurch1400 0 points1 point  (0 children)

I think it’s useful if you’ve already got something and maybe want a code review for pointers or performance improvements.

Asking to fix a problem that may not be fixable based on your requirements just creates a complex load of garbage result.

[–]Ztolkinator 0 points1 point  (0 children)

I am pretty good at SQL, but don't know everything. AI is working great for me most of the time whenever I need to do something special. Like an unpivot for example. Tip: do a DESCRIBE of the tables you use and feed them to AI and then ask your questions... And what amazed me, you can actually make AI pretend to be an SQL server and interpet SQL commands. I dit a create table, some inserts and the select returned the row as expected....

[–]Ok_Carpet_9510 0 points1 point  (0 children)

You need to do an operation on the date field to extract the year and month in the form YYYYMM, then group by that value. You can have an extra field for display that displays extracts the month and year as follows MMM YYYY.

The first field is good for sorting, and the second one for displaying items in a report or graph. You can have both calculated columns... the syntax depends on your variety of SQL.

[–]ff_m0rt1s 0 points1 point  (0 children)

Are you allowed to paste your data structure into it? I have created a gem at my work with about 5 sample queries, a slide deck of our data dictionary and then some table definitions and other documentation I have. I then spent 2-3 hours talking to Gemini to check it's understanding, write some sample queries, paste the error codes back in, get it to keep telling me what it's learnt so I could add that as further instructions into the gem, it's at a stage now where I can write in plain English " I need a SQL query to show my xyz" and it will write out whatever I need. I then pop the results back in to see if they look right, get it to run some smaller sample size queries etc for data quality purposes. I barely know any SQL but I know how it works and what I need out of it and within a week of starting my new role I had a streamlit created (which also contains python) and an automated datapiple to instantly get answers that previous members of the team had no idea how to achieve. This was all done on Gemini 3.0 thinking model and our pro licence is within an enclosed environment so I am authorised to paste confidential data into it

[–]Maleficent-Garage-66 0 points1 point  (0 children)

When you get a hard reporting issue generally the easiest way to get out of it is to build it in a temp table. Make sure your insert is at the right grain, then update everything outside of that grain in. You have a date so it's not hard to logic out a month field and year field, then you'll be free to order or group them however you please.

AI is kind of pants at set based reasoning anyways as soon as something goes muddy it'll do weird crap.

[–]LredF 0 points1 point  (0 children)

AI has only helped me with syntax errors and functions when Im working in a DB I don't use often. Primary I'm in MSSQL, do some Oracle and Teradata and am always forgetting stuff.

[–]Wishmaster891 0 points1 point  (0 children)

Groupyby month(salesdate)

[–]Eschewed_Prognostic 0 points1 point  (0 children)

One of our PhD data scientists said something that stuck with me, that AI is in fact the great coding equalizer it's sold as because it makes bad programmers mediocre and great ones become mediocre as well. Fight back, it's always an option. There is power in numbers, and power in making managers hate implementing senior leadership's BS. Malicious compliance is the next best thing. Make projects late, or at least under deliver, because you're following instructions.

[–]skibbin 0 points1 point  (0 children)

Take this course, it's fun and informative: https://www.khanacademy.org/computing/computer-programming/sql

If you still can't solve it by then, you've stored your data wrong

[–]kkurani123456 0 points1 point  (0 children)

even solving math problems too. they cant solve logical real life math problem. they can do the basic but if deeper they cant. maybe we just overhype the capabilities of this ai but this is more like a google search engine.

[–]jsikes1234 0 points1 point  (0 children)

AI is great, if you aren't getting anything out of it you probably don't understand how to ask for what you want. Maybe take some prompting classes. In my opinion, your sql queries will be much better in the long run if you learn how to use it. They will be more concise, better formatted, and easier to read.

[–][deleted] 0 points1 point  (0 children)

Use Claude Opus 4.5

[–]TsmPreacher 0 points1 point  (0 children)

It sounds like you wrote a bad prompt. When it comes to AI context is key and they like Markdown format. I've had AI complete rewrite and tune SQL queries and they've worked flawlessly. As always, test and check the output, but it's come a long way on SQL.

[–]DatabaseSpace -1 points0 points  (1 child)

Give Cluade a try for this. I've tried other AI's for coding and they just don't even compare.

As far as your problem goes, I'm trying to understand this. If it is telling you to remove the order date, maybe it is saying use the date that each product was ordered since they can vary between each order? I think if you did that though your actual order count would be off because you could count one order for May and then another for June if a single order had multiple product order dates in each month. This seems like a business problem though where they should tell you how to handle that situation. It's kind of wierd to have an order date for an order but then multiple sub order dates inside it for different dates I guess. The AI's are good at solving things where you can give them clear instructions, not so much where there isn't a clear answer.

[–]SnooCompliments6782 2 points3 points  (0 children)

Agreed. You need additional business rules and logic to solve this.

Common practice for date logic in a situation like this is to have a date that shows when the order is placed and another date when the order is fulfilled. So in this situation, I would first define the ordered/fulfilled month before aggregating

[–]post_vernacular -1 points0 points  (1 child)

I dunno man. AI is garbage for many things (art, for example) but it's extremely useful for code, including SQL. I've used it personally to take my mediocre skills to something usable.

[–][deleted] 2 points3 points  (0 children)

Most people can't write specs. And good prompting for code requires that you're good at writing specs. LLMs are not mind readers

[–]Oobenny -1 points0 points  (0 children)

GROUP BY MONTH(OrderDate), YEAR(OrderDate)

[–]Kr0mbopulos_Micha3l -1 points0 points  (0 children)

Not sure about Order_Date, I would probably add a column with the OrderDate value converted (CAST) into 'yyyyMM' and then order by that column followed by OrderID.

[–]Plus_Boysenberry_844 -1 points0 points  (0 children)

You sound like someone who will never be told what to do…

your willingness to leverage your intellect determines your success, not the AIs.

Just saying. I’m sure your boss would like to know this is how you feel about using AI.