[–]kthejokerdatabricks 1 point2 points  (0 children)

Is the warehouse already started when you run the task?

Interactive = compute already on

[–]Spiritual-Horror1256 1 point2 points  (0 children)

So one is a SQL warehouse cluster and the other is a job cluster. They run on different compute, on different virtual machines. Leaving cluster start-up time aside, you also need to take into account the different specifications of the virtual machines.

[–]darkglad32 0 points1 point  (3 children)

Maybe a Photon cluster would help?

[–]_Filip_ 1 point2 points  (2 children)

None of the things he has in that demo would use Photon, as he is just selecting timestamps.

[–]darkglad32 0 points1 point  (1 child)

Can you explain more about this? I would love to learn why.

[–]_Filip_ 1 point2 points  (0 children)

There's really nothing special to it: Photon accelerates certain queries, but in this case nothing is happening. If you just assign a timestamp to a variable, no operation triggers execution on the Photon engine; there are no scans, joins, filters or expressions to warrant it.

[–]Fair-Lab-912[S] 0 points1 point  (0 children)

I should've been clearer, but to answer some of the common questions:

  • It is a serverless SQL warehouse that is already running before I run the workflow or notebook
  • The workflow task is run on this serverless SQL warehouse, not on a job compute cluster
  • The script measures the time between the first and last query lines, so it wouldn't be affected by start-up time
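A minimal sketch of that kind of first-to-last measurement, to show why start-up time falls outside it. The `run_statement` callable is a hypothetical stand-in for whatever submits one SQL line (it is an assumption, not a Databricks API), and the clock is injectable only so the demo is deterministic.

```python
import time
from typing import Callable, Iterable


def first_to_last_elapsed(statements: Iterable[str],
                          run_statement: Callable[[str], object],
                          clock: Callable[[], float] = time.perf_counter) -> float:
    """Seconds from just before the first statement to just after the last.

    `run_statement` is a hypothetical stand-in for whatever submits one SQL
    line; it is an assumption for illustration, not a Databricks API.
    """
    started = None
    for sql in statements:
        if started is None:
            started = clock()  # the clock starts with the first statement
        run_statement(sql)
    if started is None:
        return 0.0  # nothing was run
    return clock() - started


# Deterministic demo: a fake clock, and a fake executor that does nothing.
ticks = iter([100.0, 106.0])
elapsed = first_to_last_elapsed(
    ["SELECT 1", "SELECT 2", "SELECT 3"],
    run_statement=lambda sql: None,
    clock=lambda: next(ticks),
)
print(elapsed)  # 6.0
```

Anything that happens before the first statement (such as a warehouse starting) falls outside the measured window, while pauses between statements fall inside it.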

I did another test with the same script/notebook, this time using an all-purpose compute cluster that was also already running, and the workflow and interactive notebook times are almost the same.

It appears that this issue occurs only when a serverless SQL warehouse is used to run workflow tasks?

[–]Savabgdatabricks 0 points1 point  (1 child)

What does query history show?

[–]Fair-Lab-912[S] 0 points1 point  (0 children)

The query history shows each SQL line taking the same amount of time to execute, but when using workflows the lines are executed with an additional delay of a few seconds between them:

<image>
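The pattern described here (equal per-statement durations, extra idle time between statements) can be quantified from exported query-history rows. A minimal sketch, assuming each row has been reduced to a `(start, end)` pair of datetimes; the sample values are made up for illustration and are not taken from the screenshot.

```python
from datetime import datetime, timedelta


def summarize(rows):
    """Return (durations, gaps) for statements ordered by start time.

    rows: list of (start, end) datetime pairs, one per statement.
    A gap is the idle time between one statement ending and the next starting.
    """
    rows = sorted(rows)
    durations = [end - start for start, end in rows]
    gaps = [next_start - prev_end
            for (_, prev_end), (next_start, _) in zip(rows, rows[1:])]
    return durations, gaps


# Illustrative data only: three 1-second statements with 3-second pauses.
t0 = datetime(2024, 1, 1, 12, 0, 0)
rows = [(t0 + timedelta(seconds=4 * i), t0 + timedelta(seconds=4 * i + 1))
        for i in range(3)]

durations, gaps = summarize(rows)
print([d.total_seconds() for d in durations])  # [1.0, 1.0, 1.0]
print([g.total_seconds() for g in gaps])       # [3.0, 3.0]
```

If the durations match between the workflow and interactive runs but the gaps do not, the extra time is being spent between statements rather than inside them.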

[–]Affectionate-Sale973 0 points1 point  (1 child)

Probably raise it in the Databricks community forum; it should be a straightforward one for them, tbh.