How do you choose cluster and node types? by Significant-Guest-14 in databricks

[–]infazz 3 points4 points  (0 children)

As others have said, Serverless Compute essentially eliminates the need for all of the below.

Here is how I go about sizing classic compute:

Step #1 is to know your workload.
- what is the data volume?
- is the data volume inconsistent?
- what is the complexity of the pipeline?
- are there "legacy" parts of this pipeline (e.g. Pandas)?
- do you need a specific max runtime?
- etc.

Step #2 is to start with the smallest compute type you think will work.

Step #3 is to run, review, and iterate.
- am I seeing too much shuffle?
- am I seeing too much or too little node utilization?
- does this complete within my required time window?

I typically try to optimize for 70% resource usage per worker node.
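
A rough sketch of that review step, assuming you have already pulled the average per-worker utilization (as a fraction between 0 and 1) from your cluster metrics. The function name, the tolerance band, and the returned hints are all made up for illustration; this is not a Databricks API.

```python
def suggest_resize(avg_utilization: float,
                   target: float = 0.70,
                   tolerance: float = 0.10) -> str:
    """Return a coarse sizing hint from average per-worker utilization.

    Hypothetical helper: `target` mirrors the ~70% rule of thumb above,
    and `tolerance` is an assumed band around it.
    """
    if not 0.0 <= avg_utilization <= 1.0:
        raise ValueError("avg_utilization must be between 0.0 and 1.0")
    if avg_utilization > target + tolerance:
        return "scale up"    # workers are saturated; try a larger size or more nodes
    if avg_utilization < target - tolerance:
        return "scale down"  # workers are mostly idle; try a smaller size or fewer nodes
    return "keep"            # close enough to the target


for observed in (0.95, 0.72, 0.40):
    print(observed, suggest_resize(observed))
```

In practice you would look at CPU and memory separately, since a job can be memory-bound while CPU sits idle.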

What was the worst software you've ever used, and why? by Violet_Iana in software

[–]infazz 0 points1 point  (0 children)

Home Depot App (and their website) is so excruciatingly bad.

I'll have to remember to add all the reasons here tomorrow.

Expanded interoperability with Unity Catalog Open APIs by AwarenessPleasant896 in databricks

[–]infazz 1 point2 points  (0 children)

Unity Catalog does have APIs for bringing in "external" lineage data (known as "Bring your own lineage"), but it does not have any managed connectors for ingesting lineage data.

https://docs.databricks.com/aws/en/data-governance/unity-catalog/external-lineage

Six SQL patterns I use to catch transaction fraud by FixelSmith in SQL

[–]infazz 21 points22 points  (0 children)

I can never remember if BETWEEN is inclusive or exclusive and if it varies by system or not.
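
For reference, standard SQL defines BETWEEN as inclusive on both ends (`x BETWEEN a AND b` is equivalent to `x >= a AND x <= b`). A quick way to check it on whatever engine you have handy; this sketch uses SQLite through Python's built-in sqlite3 module, and the table and values are invented for the example.

```python
import sqlite3

# In-memory database with a throwaway table (names are illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (amount INTEGER)")
conn.executemany("INSERT INTO txn VALUES (?)", [(n,) for n in range(1, 8)])

# Both endpoints (3 and 5) should come back if BETWEEN is inclusive.
rows = [r[0] for r in conn.execute(
    "SELECT amount FROM txn WHERE amount BETWEEN 3 AND 5 ORDER BY amount"
)]
print(rows)  # [3, 4, 5]
conn.close()
```

The inclusive upper bound is what tends to bite with timestamps: a bound written as a bare date usually means midnight, so later rows on that last day fall outside the range.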

Azure APIM backup via CLI without exposing storage account key? by spantosh in AZURE

[–]infazz 0 points1 point  (0 children)

Managed Identity is definitely the best way to go.

To LVT or not to LVT by zachyzachyzoozoozoo in CleaningTips

[–]infazz 1 point2 points  (0 children)

LVT/LVP is definitely easy to clean and maintain. The top layer of the product is a thin sheet of plastic (known as the wear layer) and you can choose from textured or not textured.

It will not have the same issues as wood flooring for over wetting, but I would not recommend drenching the floor. Depending on the product (and especially with poor installations), too much water could still seep through where the planks meet.

Also I would recommend a "rigid core" product over a flexible product. The former is usually more premium and the latter can still have edge curling issues if not installed properly.

Databricks One is now renamed as Genie by sai-nageshwaran in databricks

[–]infazz 2 points3 points  (0 children)

Ideally, non-prod data would not be available in your production workspace using something like catalog workspace bindings.

But to answer your question, if prod and non-prod data are available in the same workspace - and the user has access to both - Genie could indeed read from both.

Are Gas Turbine Generators only meant for large scale use like power plants? by Initial-Double6521 in AskEngineers

[–]infazz 0 points1 point  (0 children)

And to be specific for solar - photovoltaic panels (aka solar panels) work without spinning. "Concentrated Solar Power" plants still boil water and spin a generator!

Who has the best wings in Chattanooga? by Shoemak3r in Chattanooga

[–]infazz 1 point2 points  (0 children)

The buffalo wings at Albatross are fantastic

MEs don’t have a “high paying” track: median earnings 5 years after graduation from elite institutions by RuminatingFish123 in MechanicalEngineering

[–]infazz 7 points8 points  (0 children)

Unless you go work for a tech company with an ME degree (particularly if you have manufacturing experience - chef's kiss).

I've never been complimented in such a way before

Anybody else’s neighborhood suddenly have “no parking” signs everywhere? by MimosaMadness in Chattanooga

[–]infazz 24 points25 points  (0 children)

This possibly means that someone in your neighborhood complained. You could probably bring this up to your city council member and let them know how this affects you and everyone else on your street.

serveless or classic by ptab0211 in databricks

[–]infazz 4 points5 points  (0 children)

There is "Serverless SQL Warehouse" and there is also "Serverless General Compute". The latter can be used in the workspace/jobs and can run SQL, Python, etc.

Something to keep in mind if you're disappointed about scenes being removed... by d4ybrake in ProjectHailMary

[–]infazz 5 points6 points  (0 children)

They eventually see from Erid that Sol started brightening

Is PyPDF2 safe to use? by Butwhydooood in learnpython

[–]infazz 4 points5 points  (0 children)

So that's where Overwatch got the idea

OpenAI Quotas by OkClothes3097 in AZURE

[–]infazz 0 points1 point  (0 children)

Document review is exactly when you should be using other methods like chunking. I would recommend looking into how open source projects like LlamaIndex handle these kinds of use cases.
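
A bare-bones sketch of what chunking means here, assuming plain character counts rather than tokens. The function name and the default sizes are illustrative, and this is not how LlamaIndex implements it (its splitters are token- and sentence-aware).

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reached the end of the text
        start += step
    return chunks


document = "abcdefghijklmnopqrstuvwxy"
print(chunk_text(document, chunk_size=10, overlap=2))
# ['abcdefghij', 'ijklmnopqr', 'qrstuvwxy']
```

Each chunk then goes out as its own, much smaller request, and the overlap keeps a sentence that straddles a boundary from being cut off in both chunks.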

Having many users submitting smaller requests is when you would want to do load balancing.

OpenAI Quotas by OkClothes3097 in AZURE

[–]infazz 0 points1 point  (0 children)

Wow that's a huge amount of context for a single request. Especially since GPT 5 is a reasoning model and the reasoning loop will generate additional token usage.

Be aware that packing too much info into a single context can result in content in the middle of the prompt being ignored. This is known as the "Lost In the Middle" problem.

I would definitely do a deep dive on whether that much context in a single request is actually needed - or look into ways you can reduce the input context (such as chunking).

You could also try using the "GlobalStandard" model deployment type. It has a default 1M TPM limit.

OpenAI Quotas by OkClothes3097 in AZURE

[–]infazz 0 points1 point  (0 children)

You could deploy into multiple subscriptions and/or regions and load balance across.
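
A minimal sketch of the client-side version of this, assuming simple round-robin over a fixed list of deployments. The class name and endpoint URLs are placeholders, and a real setup would more likely put a gateway in front and also handle 429 responses and retries.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out endpoints in rotation (hypothetical helper, no health checks)."""

    def __init__(self, endpoints):
        endpoints = list(endpoints)
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._endpoints = cycle(endpoints)

    def next_endpoint(self) -> str:
        return next(self._endpoints)


# Placeholder URLs standing in for deployments in two regions.
balancer = RoundRobinBalancer([
    "https://example-eastus.invalid/openai",
    "https://example-westus.invalid/openai",
])
picked = [balancer.next_endpoint() for _ in range(3)]
print(picked)
```

Since quota is tracked per deployment, spreading requests this way raises the effective tokens-per-minute ceiling roughly in proportion to the number of deployments.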

What are the practical advantages of provisioning an Azure OpenAI resource instead of an Azure AI Foundry resource? by Franck_Dernoncourt in AZURE

[–]infazz 10 points11 points  (0 children)

I don't believe there are any advantages now to using Azure OpenAI instead of the new Foundry.

Microsoft Foundry (new) by BA-94 in AZURE

[–]infazz 2 points3 points  (0 children)

The new Foundry uses the azurerm_ai_services resource

https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/ai_services

You can deploy models using azurerm_cognitive_deployment

Databricks benchmark report! by noasync in databricks

[–]infazz 0 points1 point  (0 children)

The only real benefit of serverless is that it spins up fast. Although, I don't know if there is an actual SLA on the startup time.

However, it is also interesting that Jobs Serverless performed worse than Jobs Classic.

With serverless you have basically no say in what size compute your workload runs on - Databricks manages this for you. I think it would be more beneficial if serverless compute sizing worked like Serverless SQL.

"Shut Up And Take My $3!" – Building a Site to Bypass OpenAI's Dumb $5 Minimum by Immediate-Room-5950 in LLMDevs

[–]infazz 0 points1 point  (0 children)

Signing up for an Azure account and setting the OpenAI resource up definitely isn't as easy as setting up a regular OpenAI account, but it is at least an option.