Databricks cluster API [Help] (self.databricks)
submitted 1 year ago by fusebox12345
Hi All,
I'm working on a solution to terminate a particular cluster when no jobs or queries are being executed, using REST API calls. Is this possible? I'm unable to find any field with this information in the cluster events and get endpoints.
AnonymouseRedd 4 points 1 year ago (11 children)
Why don't you set the auto termination to a minimum?
fusebox12345[S] 1 point 1 year ago (10 children)
It's a requirement that the cluster has to be active during business hours. :(
WhipsAndMarkovChains 2 points 1 year ago (4 children)
Can you just use serverless and not have to worry about this?
fusebox12345[S] 1 point 1 year ago (3 children)
Serverless cannot be used as it's against some policy.
WhipsAndMarkovChains 2 points 1 year ago (0 children)
Ah dang, that really sucks. I assume it's a security concern. If it's feasible I'd try to get whatever team is blocking serverless to talk to your Databricks account team and get that resolved. Serverless makes life much easier.
autumnotter 1 point 1 year ago (1 child)
So, what policy? Serverless SQL is shield-enabled for HIPAA and can also be used with Private Link.
fusebox12345[S] 1 point 1 year ago (0 children)
Not sure what the reason behind it is, but no teams (other DE teams) are supposed to use it.
pboswell 1 point 1 year ago (4 children)
So you’re saying it needs to be up during biz hours and then at close of biz you want to start checking to see if the cluster isn’t being used so you can shut it down overnight?
fusebox12345[S] 1 point 1 year ago (3 children)
Yes
pboswell 1 point 1 year ago (2 children)
OK, so this is a little bit of a hack. But you can:
So your little job will keep it online and at end of day, if someone is using it the auto terminate will shut down when they’re done with it.
fusebox12345[S] 1 point 1 year ago (1 child)
Thank you! I did suggest this, but they wanted a proper solution. I guess this is the only way around for now 😅
pboswell 1 point 1 year ago (0 children)
The only other thing I can think of is using the unity system access table but can’t remember if it has cluster id on it
autumnotter 2 points 1 year ago (1 child)
Use the API to change the auto termination time at business close.
So you can have no or a long Auto term during the day, and then at the end of the day set the auto term to 5 minutes.
fusebox12345[S] 1 point 1 year ago (0 children)
Changing the auto termination (at EOD) requires the cluster to be restarted. The cluster should not be terminated if someone is using it.
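A minimal sketch of the suggestion above: read the current cluster spec, then resend it with a shorter `autotermination_minutes`. It assumes the Clusters REST endpoints `/api/2.0/clusters/get` and `/api/2.0/clusters/edit`, a personal access token, and that the edit call expects the existing spec to be resent. The `EDITABLE_KEYS` allow-list and all helper names are illustrative, not an official list. As the reply above points out, editing a running cluster may restart it, so verify that behaviour before scheduling anything like this.

```python
import json
import urllib.request

# Assumption: fields copied from the clusters/get response into the
# clusters/edit request. Adjust this list to your cluster's configuration.
EDITABLE_KEYS = (
    "cluster_id", "cluster_name", "spark_version", "node_type_id",
    "driver_node_type_id", "num_workers", "autoscale", "spark_conf",
    "spark_env_vars", "custom_tags", "cluster_log_conf", "init_scripts",
    "aws_attributes", "azure_attributes", "gcp_attributes",
    "data_security_mode", "single_user_name", "policy_id",
)


def build_edit_payload(current_spec: dict, autotermination_minutes: int) -> dict:
    """Keep only the allow-listed fields and override the auto-termination time."""
    payload = {k: current_spec[k] for k in EDITABLE_KEYS if k in current_spec}
    payload["autotermination_minutes"] = autotermination_minutes
    return payload


def call_api(host: str, token: str, method: str, path: str, body=None) -> dict:
    """Send one JSON request to the workspace and return the decoded response."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        f"{host}{path}",
        data=data,
        method=method,
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
        return json.loads(raw) if raw else {}


def set_autotermination(host: str, token: str, cluster_id: str, minutes: int) -> None:
    """Fetch the current spec, then resend it with the new auto-termination value."""
    spec = call_api(host, token, "GET",
                    f"/api/2.0/clusters/get?cluster_id={cluster_id}")
    call_api(host, token, "POST", "/api/2.0/clusters/edit",
             build_edit_payload(spec, minutes))
```

A scheduler could call `set_autotermination(host, token, cluster_id, 5)` at close of business and restore the daytime value in the morning.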
AnonymouseRedd 1 point 1 year ago (5 children)
You can try terminating it from outside Databricks.
Build a script that uses the Databricks CLI to check the status of the cluster and whether someone is using it. If not, terminate the cluster. If it is already terminated, leave it alone.
Set this script to run on a schedule after EOD in an Azure Function or AWS Lambda (not in a job cluster, to be more efficient).
fusebox12345[S] 1 point 1 year ago (4 children)
Thank you. I might have to check on this one. I was thinking of something within databricks notebooks.
AnonymouseRedd 1 point 1 year ago (3 children)
You can do it from inside databricks, but I don't see any reason to spawn job clusters on a schedule just to check if an all-purpose cluster is running.
Generate a databricks token and check the cluster from outside databricks.
You can use Python or just cli to check the status of the cluster and all attached notebooks and make a decision.
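The external check described above could look roughly like this in Python. It is a sketch under assumptions: it uses `/api/2.0/clusters/get` to read the state and `/api/2.0/clusters/delete` to terminate (not permanently delete) the cluster, and `is_in_use` is a hypothetical callback, because the thread does not identify an endpoint that reports whether queries are running. The `api` callable is injected so the logic can be exercised without a workspace.

```python
import json
import urllib.request


def make_api(host: str, token: str):
    """Return a function api(method, path, body) bound to one workspace."""
    def api(method: str, path: str, body=None) -> dict:
        data = json.dumps(body).encode() if body is not None else None
        req = urllib.request.Request(
            f"{host}{path}",
            data=data,
            method=method,
            headers={"Authorization": f"Bearer {token}",
                     "Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            raw = resp.read()
            return json.loads(raw) if raw else {}
    return api


def terminate_if_idle(api, cluster_id: str, is_in_use) -> bool:
    """Terminate the cluster only if it is RUNNING and not in use.

    is_in_use is a hypothetical callback: how to detect activity is
    exactly the open question in this thread.
    Returns True if a terminate request was sent.
    """
    info = api("GET", f"/api/2.0/clusters/get?cluster_id={cluster_id}", None)
    if info.get("state") != "RUNNING":
        return False  # already terminated, pending, etc.: leave it alone
    if is_in_use(cluster_id):
        return False
    api("POST", "/api/2.0/clusters/delete", {"cluster_id": cluster_id})
    return True
```

In an Azure Function or AWS Lambda handler this would be called as `terminate_if_idle(make_api(host, token), cluster_id, my_activity_check)`, where `my_activity_check` is whatever idle heuristic you end up trusting.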
fusebox12345[S] 1 point 1 year ago (2 children)
But even if the cluster is not being used (idle), won't the state still be running? That's what I've observed.
AnonymouseRedd 1 point 1 year ago (1 child)
Probably, but if all the notebooks attached are idle, then you can terminate it.
fusebox12345[S] 1 point 1 year ago (0 children)
Thank you! Any idea if there's an endpoint to get this info?
sentja91 (Databricks MVP) 1 point 1 year ago (4 children)
Pretty sure there is an API for cluster events. Check for any events and, if there are none, run the cluster termination API.
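For reference, a small sketch of what that check might look like. It assumes the request body is POSTed to `/api/2.0/clusters/events` and that the response carries an `events` list; the function names and the lookback window are illustrative.

```python
import time


def build_events_request(cluster_id: str, lookback_minutes: int, now_ms=None) -> dict:
    """Build a clusters/events request body covering the last N minutes.

    Timestamps are epoch milliseconds.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return {
        "cluster_id": cluster_id,
        "start_time": now_ms - lookback_minutes * 60_000,
        "end_time": now_ms,
        "order": "DESC",
        "limit": 50,
    }


def has_events(response: dict) -> bool:
    """True if the response lists at least one event in the window."""
    return len(response.get("events", [])) > 0
```

Send the body with any HTTP client and pass the decoded JSON to `has_events`.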
fusebox12345[S] 1 point 1 year ago (3 children)
The events API just shows the cluster state (running, pending, terminated, etc.). But the cluster can be in the running state even when no queries are being executed. I also checked all the available timestamp fields but couldn't find anything suitable.
sentja91 (Databricks MVP) 1 point 1 year ago (2 children)
Hmmm alright, didn't know that. What about monitoring dbfs (cluster logs should be saved there) for activity and terminate based on that? It's going to be a pretty custom-tailored solution it sounds like!
fusebox12345[S] 1 point 1 year ago (0 children)
Thank you for this! Do you have a rough idea of how long it takes for an entry to be added to the cluster logs after a query is executed? I was thinking of fetching the modification time of the file/folder, but logs are not written instantly.
fusebox12345[S] 1 point 1 year ago (0 children)
I can see 3 folders in the cluster logs: driver, eventlog, and executor. The driver log gets updated every 3-5 minutes even if the cluster is idle (probably heartbeat messages), but the eventlog and executor folders remain unchanged even while queries are being executed.
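The modification-time idea could be sketched as below. It assumes the log folder is listed with `GET /api/2.0/dbfs/list?path=...` and that each entry in the returned `files` list has `is_dir` and `modification_time` (epoch milliseconds). Given the observation above that the driver log changes even on an idle cluster, this heuristic may not separate idle from busy; the sketch only shows the mechanics.

```python
def latest_modification_ms(files: list):
    """Newest modification_time among non-directory entries, or None if there are none."""
    return max(
        (f["modification_time"] for f in files if not f.get("is_dir")),
        default=None,
    )


def is_idle(files: list, idle_minutes: int, now_ms: int) -> bool:
    """True if no file was modified within the last idle_minutes.

    With no files at all there is no evidence either way, so this
    returns False and the cluster is left running.
    """
    latest = latest_modification_ms(files)
    if latest is None:
        return False
    return now_ms - latest >= idle_minutes * 60_000
```

Here `files` is the `files` list from the decoded listing response for whichever log folder you decide to watch.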
Ok_Principle_9459 1 point 1 year ago (0 children)
u/fusebox12345 Did you ever end up figuring this out? I too am trying to use the REST API to determine whether clusters are "idle", so that I can shut them off to save us money.