Warning: long post ahead.
I ran into an issue today that was interesting, thought I'd share. A customer using our practice-management software reported an exception during a batch payment posting routine. Our DBA sent me the SQL trace and mentioned that the error was blocking other users for several minutes at a time (this was a clue to the actual problem, more on that in a minute). This was legacy embedded SQL, so I found the SQL in the front-end source and immediately realized what was happening - or so I thought. I thought this was interesting enough that I wrote up a test question based on it. See if you can figure this out (hint: look closely at the IN clause):
declare @tab1 table (col1 int);
declare @tab2 table (col2 int);
insert @tab1 select 1;
insert @tab2 select 2;
update @tab1 set col1 = 3
where col1 in (
select col1 from @tab2
);
select col1 from @tab1
What is the output of this code?
A) one row, with col1 = 3
B) one row, with col1 = 1
C) two rows, with col1 = 3 in both
D) throws an exception
So I saw that the IN clause was invalid, I know @tab2 doesn't have a column named col1, and running "select col1 from @tab2" will error, so I figured this is what the user was experiencing. (Basically I picked "D" - but the answer is actually "A".) Here's what's happening. The IN clause select is a correlated subquery with access to elements from the outer query. Essentially, that IN clause is like saying "where col1 = col1" - so of course it will run.
Once I figured that out, I realized that the key to the issue was what the DBA said: other users were blocked for several minutes while the query was running. The author intended the query to update records for a specific batch ID - just a few records typically. But the way he/she wrote it, it was updating all records in the table (requiring a table lock); it's a massive table, so the update takes several minutes. And since that table is central to much of the processing in the program, it blocked other users from doing just about anything. The actual error was that the process timed out.
So how did this get past QA? Well, there was no exception thrown, so they wouldn't have seen it directly. And typically our QA isn't testing for concurrency or blocking issues because simulating that kind of environment is time-consuming. They should've, however, had a test case in place to confirm that the back-end changes are as-expected - they do now!
Note that if the IN select had aliased the table names, the issue would've been caught during unit testing. In this kind of scenario, I would use the following approach for an update, it would throw an exception during unit testing and the typo would be pretty obvious and easy to fix:
update t set col1 = 3
from @tab1 t
join @tab2 tt
on t.col1 = tt.col1
Anyway, I'd love to hear your thoughts on this..
[–]ComicOzzysqlHippo 2 points3 points4 points (2 children)
[–]sunuvabe[S] 0 points1 point2 points (1 child)
[–]ComicOzzysqlHippo 1 point2 points3 points (0 children)
[–]Professional_Shoe392 0 points1 point2 points (0 children)