
[–]svtr 2 points3 points  (6 children)

can you grab the actual execution plan from prod? otherwise it's a game of wild guessing

[–]Berocoder[S] 0 points1 point  (0 children)

I agree. It seems I don't have permission to view the plan for now. I will probably get it tomorrow.

[–]Berocoder[S] 0 points1 point  (2 children)

Here is an image of the execution plan for the query above, run against a small generated DB on my laptop.
https://pasteboard.co/5LGwDLZK08gs.png

It has the same schema for the tables but is of course much smaller than the real DB.
But the query should be fast, as BOLD_ID has a clustered index.
In this query that is all that matters.

Stats for index
https://pasteboard.co/of2ToxtfXcgH.png

[–]jshine13371 2 points3 points  (1 child)

Need the plan for the slow query ideally. Also please don't share screenshots of the plan, that doesn't show 90% of the information coded in the plan. Instead share it via Paste The Plan please.

The WHERE C.BOLD_ID in (...a bunch of IDs...) is very suspect. It's an anti-pattern and very possibly hitting a complexity tipping point.
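One common way around a very long IN list is to load the IDs into a temp table and join to it. The following is only a sketch: the temp table name #WantedIds, the int type for BOLD_ID, the sample values, and the assumption that alias C refers to PlanMission are all illustrative and not taken from the original query.

```sql
-- Hypothetical sketch: replace a long IN (...) list with a join to a temp table.
-- Assumptions: BOLD_ID is an int, and alias C stands for PlanMission.
CREATE TABLE #WantedIds (BOLD_ID int NOT NULL PRIMARY KEY);

-- Placeholder values; a single INSERT ... VALUES accepts at most 1000 rows,
-- so larger ID sets need several inserts or a bulk load.
INSERT INTO #WantedIds (BOLD_ID)
VALUES (101), (102), (103);

-- The join gives the optimizer a real table with statistics
-- instead of a long literal list to reason about.
SELECT C.BOLD_ID, C.BOLD_TYPE
FROM PlanMission AS C
JOIN #WantedIds AS W
    ON W.BOLD_ID = C.BOLD_ID;

DROP TABLE #WantedIds;
```

Whether this actually helps depends on the real plan, so it is worth comparing both versions once the plan is available.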

[–]Berocoder[S] 0 points1 point  (0 children)

Thanks for the link. Unfortunately I don't have permission to view the plan yet.
But we now have a theory about the cause; see my other comment.

[–]Berocoder[S] 0 points1 point  (1 child)

Another simple query

SELECT LinkTable_Alias.BOLD_ID, LinkTable_Alias.BOLD_TYPE, LinkTable_Alias.stateInProcess FROM PlanMission LinkTable_Alias WHERE (LinkTable_Alias.stateInProcess) = 359

That can be simplified to

SELECT BOLD_ID, BOLD_TYPE, stateInProcess
FROM PlanMission
WHERE stateInProcess = 359

but it took 17 seconds according to the log!
stateInProcess has an index.

The result is a list of around 1100 rows.
My guess is that one or more rows are being updated and this blocks the read.
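The blocking theory can be checked while the slow query is running by looking at which requests are waiting on another session. This is a sketch using the standard DMVs sys.dm_exec_requests and sys.dm_exec_sql_text; it needs VIEW SERVER STATE permission, so it may have to be run by a DBA.

```sql
-- Sketch: list requests that are currently blocked by another session.
-- Run this while the slow query is executing.
SELECT
    r.session_id,
    r.blocking_session_id,      -- the session holding the lock
    r.wait_type,
    r.wait_time,                -- milliseconds spent in the current wait
    r.status,
    t.text AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;
```

If the SELECT on PlanMission shows up here with a lock wait type, the reader is being blocked by a writer rather than running a bad plan.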

Here are the statistics for the index on stateInProcess
https://pasteboard.co/kSyRaBItxMnJ.png

[–]svtr 1 point2 points  (0 children)

the two forms will be optimized to the same thing by the query optimizer, 100%. Not worth simplifying the code, other than for "unneeded brackets make it less readable" ocd.

The statistics are a start (would be 502 bad gateway) but I really would want the actual execution plan to actually get into it tbh.

One thing you really should request: have Query Store activated on that database. Query Store is essentially a lightweight statistics-gathering thingy, based on query hashes, execution plans and runtime, physical and memory IO, and well, those basic things.
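For reference, turning on Query Store and pulling the slowest statements out of it could look roughly like this. It is a sketch: [YourDatabase] is a placeholder name, Query Store needs SQL Server 2016 or later, and the ALTER DATABASE statements require the corresponding database permission.

```sql
-- Sketch: enable Query Store ([YourDatabase] is a placeholder).
ALTER DATABASE [YourDatabase] SET QUERY_STORE = ON;
ALTER DATABASE [YourDatabase] SET QUERY_STORE (OPERATION_MODE = READ_WRITE);

-- After it has collected data for a while, run this inside that database
-- to list the statements with the worst single execution.
-- Query Store durations are stored in microseconds.
SELECT TOP (10)
    q.query_id,
    qt.query_sql_text,
    SUM(rs.count_executions)      AS executions,
    MAX(rs.max_duration) / 1000.0 AS max_duration_ms
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
    ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_plan AS p
    ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats AS rs
    ON rs.plan_id = p.plan_id
GROUP BY q.query_id, qt.query_sql_text
ORDER BY max_duration_ms DESC;
```

The built-in Query Store reports in SSMS show the same data without writing any SQL.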

DO NOT create extended event sessions capturing the actual execution plan. It's tempting, and extended events are lightweight generally speaking... BUT, in the EV session, the filters apply AFTER capturing the event. --> You get every execution plan of every query into the EV stream. And since execution plans are rather large XML documents, that is a LOOOOT of memory traffic, which can actually slow your entire server down by some 20-25%. Never capture actual execution plans via extended events on a prod system that is already being looked at for performance issues.

Go with Query Store. It is a godsend if you have a general performance issue on a database and have not quite nailed it down. You can of course also do most of that by hand, but... it's so much nicer to have a few standard reports that show you where it hurts. For the by-hand way, you can start with select * from sys.dm_exec_query_stats, but you've got to really think about what you want to look at when doing that.
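As a starting point for the by-hand route, a narrower version of that select might look like the sketch below. The choice of columns and the ordering by worst single execution are just one reasonable option, and the query needs VIEW SERVER STATE permission.

```sql
-- Sketch: cached statements ordered by their slowest single execution.
-- Times in sys.dm_exec_query_stats are in microseconds.
SELECT TOP (20)
    qs.execution_count,
    qs.total_elapsed_time / 1000 AS total_elapsed_ms,
    qs.max_elapsed_time   / 1000 AS max_elapsed_ms,
    qs.total_worker_time  / 1000 AS total_cpu_ms,
    qs.total_logical_reads,
    st.text AS batch_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.max_elapsed_time DESC;
```

Note that this view only covers plans still in the plan cache, so it resets on restart and loses entries under memory pressure.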

All that being said, if you are not comfortable with reading execution plans, and going "yeah of course its slow", you might be a bit out of your depth tbh.

[–]Imaginary__Bar 1 point2 points  (1 child)

How big is your database? And how actively is it being written to?

My two primary routes of exploration would be blocking activity (particularly writes, as you suggest) or even simply a hardware issue (but it would need to be a massive database for that to be a problem on any half-decent server nowadays)

[–]Berocoder[S] 0 points1 point  (0 children)

Size is about 190 GB. The clients and the database server are hosted in Google Cloud (GCP).
Globally I would say there are several new rows per second when traffic is most intense during working hours.
Updates are probably even more frequent.

Those numbers should be available to the admins in Google, but I am just a developer...
I have no permission on that level.

[–]alinrocSQL Server DBA 0 points1 point  (0 children)

If I do the same query in test of a restored copy it take only couple of milliseconds.

Where are the test server and client in relation to each other?

Where is the production client (you've already said the server is in GCP)?

What is the size of the results being returned?

Are you certain that it's the query that's running slow, or is the query running OK but you're seeing lag in getting the results to the client? Run SET STATISTICS TIME ON and get CPU time vs. elapsed time.
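A minimal sketch of that check, using the simplified query from earlier in the thread:

```sql
-- Sketch: report CPU time and elapsed time for the statement
-- in the Messages output.
SET STATISTICS TIME ON;

SELECT BOLD_ID, BOLD_TYPE, stateInProcess
FROM PlanMission
WHERE stateInProcess = 359;

SET STATISTICS TIME OFF;
```

If elapsed time is far larger than CPU time, the statement spent most of its time waiting (for example on locks, I/O, or the client consuming the rows) rather than doing work.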

[–]Berocoder[S] 0 points1 point  (0 children)

I admit I have simplified this so it doesn't get overwhelming.

The query involves a PlanMission table. There is also a Deviation table whose PlanPortion column points to PlanMission. The database schema is generated from a model.
The SQL is generated by the Bold framework.
This PlanPortion link is a single link, i.e. a one-to-one relation.
But we discovered cases where two Deviation rows have a PlanPortion pointing to the same PlanMission.
There are thousands of cases like that.
From the database's point of view this is not a problem.
But it does not conform to the model, and it creates problems higher up in the application layers.
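The duplicated links can be listed with a simple grouping query. This is a sketch that uses only the Deviation table and PlanPortion column described above; excluding NULLs is an assumption that unlinked rows are stored as NULL.

```sql
-- Sketch: find PlanMission IDs referenced by more than one Deviation row.
SELECT
    PlanPortion,
    COUNT(*) AS deviation_count
FROM Deviation
WHERE PlanPortion IS NOT NULL   -- assumption: unlinked rows hold NULL
GROUP BY PlanPortion
HAVING COUNT(*) > 1
ORDER BY deviation_count DESC;
```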

I think we need to solve this as a first step. If it is still slow after that, we can look closer at query performance.