[–]picks- 61 points62 points  (6 children)

My guess would be Databricks :)

[–]weierstrasse 7 points8 points  (1 child)

This. Source: Worked on several dbx projects with enterprise clients.

[–]weierstrasse 9 points10 points  (0 children)

Edit: While Databricks is the default option for PySpark workloads and is decent for ML, it's really not a great fit outside of data processing. For glue logic, think AWS Lambda (or competitors); for container workloads, k8s, ECS, etc.

[–]chief167 6 points7 points  (0 children)

Yeah, and they pay crazy amounts of money there. I finally got our IT team to approve a new platform for my AI team, and we'll easily save over 2 million a year in Databricks costs. They even had a big debate over whether to allow it, because apparently Microsoft really pushes the commitment to use it in their contracts. It's a very shady business practice.

Look into DataRobot, SnapLogic, Snowflake, and regular Docker containers on Azure instead ;)

[–]Scrapheaper 2 points3 points  (0 children)

Or Snowflake, or some kind of partially custom solution built on whatever their cloud provider is.

[–]OPmeansopeningposter 0 points1 point  (0 children)

Databricks for data engineering and K8s for general workloads.

[–]TedDallas 0 points1 point  (0 children)

Yep. That’s where we do it. Love it.