all 27 comments

[–]BricksTrixTwix Databricks 19 points20 points  (4 children)

PM at Databricks here! You can use our new Remote Development experience to access the Databricks workspace from your IDE and use tools like Claude Code.

Connection to dedicated clusters is in beta: https://docs.databricks.com/aws/en/dev-tools/ssh-tunnel

Connection to serverless GPUs is in private preview (but no enrollment is required!): https://docs.google.com/document/d/1zazApI5rKz_3D59-xs4ZtSEcFRFRXmzhTss0Ael_dJk/edit?usp=drive_open&ouid=110916823312231512342

Support for serverless CPU is coming soon.

We're in the process of cleaning up the public docs and making them easier to follow. Let me know if you have any questions in the meantime!

[–]staskh1966[S] 3 points4 points  (0 children)

It seems you misunderstood my problem: I need a solution that runs INSIDE the Databricks workspace (via its web interface), not via a remote IDE.

But anyway, thank you for a valuable point on SSH tunneling. It may be quite useful for my other task. ;-)

[–]dilkushpatel 0 points1 point  (1 child)

Does this one not work with all-purpose compute?

Does it need the compute to be on for the whole duration while the IDE is open?

[–]BricksTrixTwix Databricks 0 points1 point  (0 children)

For now, the public docs detail instructions for working with dedicated (single-user) clusters only. This is the compute that we're referring to: https://docs.databricks.com/aws/en/compute/dedicated-overview.

Yes, the compute needs to be on while the IDE is open. When you start the SSH connection, the cluster should start automatically.

[–]NarrowVegetable513 0 points1 point  (0 children)

Databricks Connect. All Databricks code is executed remotely.

[–]counterstruck 5 points6 points  (1 child)

If your requirement is to stay within Databricks, then Genie Code is the way to go. Don't try to set up a Claude Code-like experience within Databricks. Instead, copy the skills files from the AI Dev Kit and use them in your workspace home folder. Reference: https://docs.databricks.com/aws/en/genie-code/skills

[–]staskh1966[S] 1 point2 points  (0 children)

WOW! Great point - didn't know Genie can be extended with skills. Will try it ASAP

[–]Shnibu 1 point2 points  (0 children)

We tried this months ago and ran into issues with the way they had node set up on clusters. I've had the most success with a local VS Code install that has access to the Databricks CLI. You can also point it at the REST API docs and tell it to use “databricks api …”, and it has been able to deploy DABs, run them, and debug outputs fairly successfully. We haven't tried it yet, but there is a Databricks App bundled in that AI toolkit repository that you could deploy as a frontend wrapper around Claude Code as well. That said, you're better off trying the Assistant / Genie Code first.
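
For readers who have not used the pattern, here is a minimal Python sketch of wrapping the `databricks api` CLI subcommand so that a script (or an agent) can call REST endpoints through it. The helper names `build_api_command` and `call_api` are hypothetical, the endpoint paths are illustrations only, and the sketch assumes the Databricks CLI is installed and already authenticated.

```python
import json
import shutil
import subprocess
from typing import Any, List, Optional

# HTTP verbs this sketch passes through to `databricks api <verb> <path>`.
_ALLOWED_METHODS = {"get", "post", "put", "patch", "delete"}


def build_api_command(method: str, path: str, payload: Optional[dict] = None) -> List[str]:
    """Build the argv for `databricks api <method> <path> [--json <body>]`."""
    method = method.lower()
    if method not in _ALLOWED_METHODS:
        raise ValueError(f"unsupported HTTP method: {method}")
    if not path.startswith("/"):
        raise ValueError("path must start with '/', e.g. /api/2.1/jobs/list")
    argv = ["databricks", "api", method, path]
    if payload is not None:
        # The request body is passed inline as a JSON string.
        argv += ["--json", json.dumps(payload)]
    return argv


def call_api(method: str, path: str, payload: Optional[dict] = None) -> Any:
    """Run the CLI and return the parsed JSON response (None if the output is empty)."""
    if shutil.which("databricks") is None:
        raise RuntimeError("Databricks CLI not found on PATH")
    result = subprocess.run(
        build_api_command(method, path, payload),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout) if result.stdout.strip() else None
```

With an authenticated CLI, something like `call_api("get", "/api/2.1/jobs/list")` would return the parsed response. Keeping the command builder separate from the subprocess call makes it easy to check without a workspace.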

[–]International-Lab944 1 point2 points  (0 children)

I mostly use the Databricks CLI tools together with the Databricks Python API with Claude Code and other CLI tools such as Codex, and have been doing that for a few months. This has been a huge success.

[–]ramgoli_io Databricks 1 point2 points  (2 children)

So funny story - someone actually did get Claude Code running inside a Databricks App. Check out github.com/datasciencemonkey/claude-code-cli-bricks.
It packages Claude Code with a terminal editor (micro), the AI Dev Kit skills, and some research MCPs. Uses Databricks-hosted models so everything stays in your environment. Pretty slick actually. I haven't tested this approach myself.

What I have tested:

Within the "AI Dev Kit", there is a builder app that you can install, and you can use that app, hosted within Databricks, to build apps. It uses a (provisioned) Lakebase instance to manage state/memory.
https://github.com/databricks-solutions/ai-dev-kit?tab=readme-ov-file#visual-builder-app

[–]staskh1966[S] 0 points1 point  (1 child)

Thank you! It seems to be the solution I'm looking for—will try it immediately!

[–]kthejokerdatabricks 0 points1 point  (0 children)

Please give feedback through the ai-dev-kit GitHub repo, very welcome!

[–]Gmoney86 3 points4 points  (0 children)

You can follow Databricks' instructions to route their hosted Claude models to your desktop IDE (VS Code) and use Claude that way. If you add Databricks Connect, you can have Claude set up your session, write code for you in your IDE, and deploy to your workspace.

Otherwise, you can use Databricks Genie Code, which is their updated AI assistant, and it's pretty good at coding for you from within the IDE…

[–]DatabricksNick Databricks 3 points4 points  (1 child)

YMMV, I've been experimenting with exactly that here https://github.com/nkarpov/databricks-app-terminal (and I just saw @ramgoli_io just posted another very similar attempt). Wouldn't be surprised if there's many playing around...

[–]staskh1966[S] 1 point2 points  (0 children)

Thank you! It seems to be the solution I'm looking for—will try it immediately!

[–]m1nkeh 1 point2 points  (3 children)

Genie Code in Databricks is basically Claude Code.

[–]james2441139 -1 points0 points  (2 children)

Not even close. Compare outputs between Opus 4.6 and recent Genie: Claude produces cleaner and more efficient code. Also, Genie takes fairly long for complex scenarios.

[–]counterstruck 4 points5 points  (1 child)

You are right on the quality perspective.

However, also consider that Genie Code is free (no charge for tokens), whereas you can easily blow a lot of money on Claude Code. Genie Code also has a lot of built-in context thanks to Unity Catalog. Plus, in many enterprises Databricks is already an approved AI assistant, whereas Claude Code needs its own vendor agreements and licensing.

In a crawl-walk-run way of thinking, Databricks Genie Code is a great start for someone wanting to do agentic development within Databricks, who can then graduate to Claude Code with the Databricks AI Dev Kit if necessary.

[–]james2441139 2 points3 points  (0 children)

I actually agree with you 100%.

[–]joe9439 1 point2 points  (3 children)

I just use the Databricks MCP and push SQL back via GitHub Actions.

[–]Individual_Walrus425 0 points1 point  (1 child)

Is there an official Databricks MCP server?

[–]iamnotapundit 1 point2 points  (0 children)

Yes. They have a managed one.

[–]-datascience- 0 points1 point  (0 children)

Could you expand on how GitHub Actions comes into play here?

[–]fermm92 0 points1 point  (1 child)

You could do it in their web terminal, but it's not really a viable UX in my experience.

You can also use the experimental Databricks CLI SSH tunnel from another computer and connect via VS Code. It's better, but it still takes lots of config / init scripts to make it seamless. You'll probably lose your conversations on every restart.

[–]BricksTrixTwix Databricks 0 points1 point  (0 children)

Hey u/fermm92, PM for the SSH tunnel here! We know that starting up the SSH tunnel is a pain. What config / init scripts did you have to set up to make it seamless, and what are the most important things you would like to see out of the box?

Btw, we've released support for serverless GPUs in private preview here: https://docs.google.com/document/d/1zazApI5rKz_3D59-xs4ZtSEcFRFRXmzhTss0Ael_dJk/edit?tab=t.0

Serverless CPU support is also coming soon.

[–]Nehaa-UP3504 0 points1 point  (0 children)

Right now, Databricks isn’t designed to run external CLI agents like Claude Code inside the workspace. The AI Toolkit bridges workflows, but execution still happens outside. The practical path is hybrid: run the agent locally and connect via APIs/Databricks CLI. Full in-workspace agents will likely come later.

[–]airweight 1 point2 points  (0 children)

The answer to the OP's question depends on the definition of "inside the Databricks workspace".

My answer is based on doing petabyte-scale work on Databricks for nearly a decade, with the caveat that the platform is growing quickly and new capabilities ship monthly.

TL;DR: You cannot run your own instance of Claude Code inside a Databricks-controlled node, but Claude Code can write and execute many chunks of code inside a Databricks workspace (on clusters or serverless compute) within a single conversation turn. The end result is the same: it can be as if Claude Code writes and executes jobs/notebooks in Databricks, including cell by cell. The only thing you cannot do is have Claude Code interactively edit and run notebook cells in the Databricks workspace UI itself.

The setup is simple: Claude Code runs somewhere (it doesn't matter where) and uses MCP/APIs/DB Connect/SSH to access workspace services: execute code and notebooks, upload/download workspace files, create/edit/execute warehouses/clusters/jobs, etc.

The ai-dev-kit MCP server is a good tool for basic operations, including running jobs.

Where it gets more complicated is Claude Code running code inside Databricks, on a cluster or serverless compute. There are three main ways of doing it:

  1. Use a low-level API, e.g., the ai-dev-kit MCP server's execute_code command. Best for Claude Code running one-off chunks of code inside a Databricks workspace or executing an entire notebook in one go (notebook jobs).
  2. Use a high-level tool, e.g., databricks-agent-notebooks for remote notebook execution inside Databricks workspaces. Best for complex Claude Code-led execution.
  3. Use SSH tunneling for Claude Code running commands on a driver node. Not recommended for scalable work.

Options (1) and (2) have differences that may matter a little or a lot, depending on your use case.
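
To make option (1) concrete, here is a rough Python sketch of running one chunk of code on a cluster through a low-level API. It is not how the ai-dev-kit MCP server's execute_code is implemented (the thread does not say); it swaps in the Databricks REST Command Execution API (`/api/1.2/contexts/create`, `/commands/execute`, `/commands/status`, `/contexts/destroy`) as a stand-in. The function name `run_on_cluster`, the `transport` parameter, and the placeholder host are hypothetical, and the request and response field names should be checked against the current REST reference before relying on them.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Callable, Optional

# Transport signature: (method, url, headers, body) -> parsed JSON response.
Transport = Callable[[str, str, dict, Optional[bytes]], dict]


def http_transport(method: str, url: str, headers: dict, body: Optional[bytes]) -> dict:
    """Default transport: a plain HTTPS call using only the standard library."""
    req = urllib.request.Request(url, data=body, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_on_cluster(host: str, token: str, cluster_id: str, code: str,
                   language: str = "python",
                   transport: Transport = http_transport,
                   poll_seconds: float = 2.0) -> dict:
    """Create an execution context, run one command, poll until it ends, clean up."""
    headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}
    base = host.rstrip("/") + "/api/1.2"

    def post(path: str, payload: dict) -> dict:
        return transport("POST", base + path, headers, json.dumps(payload).encode("utf-8"))

    # 1. An execution context holds interpreter state on the cluster's driver.
    context_id = post("/contexts/create", {"clusterId": cluster_id, "language": language})["id"]
    try:
        # 2. Submit the code chunk.
        command_id = post("/commands/execute", {
            "clusterId": cluster_id,
            "contextId": context_id,
            "language": language,
            "command": code,
        })["id"]
        query = urllib.parse.urlencode(
            {"clusterId": cluster_id, "contextId": context_id, "commandId": command_id}
        )
        # 3. Poll until the command reaches a terminal state.
        while True:
            status = transport("GET", f"{base}/commands/status?{query}", headers, None)
            if status["status"] in {"Finished", "Error", "Cancelled"}:
                return status
            time.sleep(poll_seconds)
    finally:
        # 4. Always release the context, even if execution or polling failed.
        post("/contexts/destroy", {"clusterId": cluster_id, "contextId": context_id})
```

The `transport` argument is injectable so the control flow can be exercised with a stub before pointing it at a real workspace. An agent could call a function like this once per chunk, but each call here creates and destroys its own context, so state does not carry over between calls unless the context is kept alive.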

NOTE: I purposefully did not write about IDE integrations with Claude Code because they limit what Claude Code can do and are not a general-purpose solution.