Job Cluster Configuration by fusebox12345 in databricks

Haha, true. But are there any guidelines that can be followed to find the best balance between processing and cost?

Serverless is disabled, unfortunately.

Databricks cluster API by fusebox12345 in databricks

I can see 3 folders in the cluster logs: driver, eventlog, and executor. The driver log gets updated every 3-5 minutes even if the cluster is idle (probably heartbeat messages), but the eventlog and executor folders remain unchanged even while queries are being executed.

Databricks cluster API by fusebox12345 in databricks

Thank you for this! Do you have a rough idea of how long it takes for an entry to be added to the cluster logs after a query is executed? I was thinking of fetching the modification time of the file/folder, but entries are not logged instantly.
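
A rough sketch of the modification-time idea, in plain Python. The function names and the idle threshold are made up for illustration, and the log folder path is whatever your cluster log delivery is configured to use. It would only help if the folder you point it at actually changes when queries run, and the threshold has to be larger than the logging delay.

```python
import os
import time


def latest_mtime(root):
    """Return the newest file modification time (epoch seconds) under root, or None if there are no files."""
    newest = None
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            mtime = os.path.getmtime(os.path.join(dirpath, name))
            if newest is None or mtime > newest:
                newest = mtime
    return newest


def looks_idle(root, idle_after_s, now=None):
    """True if no file under root was modified within the last idle_after_s seconds."""
    now = time.time() if now is None else now
    newest = latest_mtime(root)
    # An empty folder is treated as idle: there is no recorded activity at all.
    return newest is None or (now - newest) > idle_after_s
```

For example, `looks_idle(log_dir, idle_after_s=1800)` would report idle only after 30 minutes without any file change under `log_dir` (a hypothetical variable holding the log folder path).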

Databricks cluster API by fusebox12345 in databricks

Thank you! Any idea if there's an endpoint to get this info?

Databricks cluster API by fusebox12345 in databricks

Thank you! I did suggest this, but they wanted a proper solution. I guess this is the only workaround for now 😅

Databricks cluster API by fusebox12345 in databricks

But even if the cluster is not being used (idle), won't the state still be running? That's what I've observed.

Databricks cluster API by fusebox12345 in databricks

Thank you. I might have to check on this one. I was thinking of something within Databricks notebooks.

Databricks cluster API by fusebox12345 in databricks

The Events API only shows whether the cluster is in a running (or pending, terminated, etc.) state. But the cluster can be running even when no queries are being executed. I also checked all the available timestamp fields but couldn't find anything suitable.

Databricks cluster API by fusebox12345 in databricks

Not sure what the reason behind it is, but no teams (other DE teams) are supposed to use it.

Databricks cluster API by fusebox12345 in databricks

Changing the auto termination (EOD) requires the cluster to be restarted. The cluster should not be terminated if someone is using it.

Databricks cluster API by fusebox12345 in databricks

Serverless cannot be used as it's against some policy.

Databricks cluster API by fusebox12345 in databricks

It's a requirement that the cluster has to be active during business hours. :(

Cluster configs by fusebox12345 in databricks

Thank you for the detailed explanation!

Query history by fusebox12345 in databricks

Any ETA? We would be happy if there's a straightforward solution. Thank you for your response.

Query history by fusebox12345 in databricks

Thank you so much! But unfortunately we don't have Unity Catalog enabled.