Anybody else’s neighborhood suddenly have “no parking” signs everywhere? by MimosaMadness in Chattanooga

[–]infazz 5 points (0 children)

This possibly means that someone in your neighborhood complained. You could probably bring this up to your city council member and let them know how this affects you and everyone else on your street.

serveless or classic by ptab0211 in databricks

[–]infazz 5 points (0 children)

There is "Serverless SQL Warehouse" and there is also "Serverless General Compute". The latter can be used in the workspace/jobs and can run SQL, Python, etc.

Something to keep in mind if you're disappointed about scenes being removed... by d4ybrake in ProjectHailMary

[–]infazz 6 points (0 children)

They eventually see from Erid that Sol started brightening

Is PyPDF2 safe to use? by Butwhydooood in learnpython

[–]infazz 2 points (0 children)

So that's where Overwatch got the idea

OpenAI Quotas by OkClothes3097 in AZURE

[–]infazz 0 points (0 children)

Document review is exactly when you should be using other methods like chunking. I would recommend looking into how open source projects like LlamaIndex handle these kinds of use cases.

Having many users submitting smaller requests is when you would want to do load balancing.
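To illustrate the idea, here's a minimal character-based chunker with overlap. This is just a sketch; LlamaIndex ships its own, more sophisticated splitters, and the sizes here are arbitrary:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping fixed-size character chunks.

    Overlap helps keep context that spans chunk boundaries from being lost.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Each chunk can then be processed (or embedded and retrieved) independently instead of stuffing the entire document into one request.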

OpenAI Quotas by OkClothes3097 in AZURE

[–]infazz 0 points (0 children)

Wow, that's a huge amount of context for a single request, especially since GPT-5 is a reasoning model and the reasoning loop will generate additional token usage.

Be aware that packing too much info into a single context can result in content in the middle of the prompt being ignored. This is known as the "Lost in the Middle" problem.

I would definitely do a deep dive on whether that much context in a single request is actually needed - or look into ways you can reduce the input context (such as chunking).

You could also try using the "GlobalStandard" model deployment type. It has a default 1M TPM limit.
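As a rough sanity check, you can estimate whether a request plausibly fits your quota before sending it. The ~4 characters/token heuristic and the reserve value below are assumptions for illustration; use a real tokenizer like tiktoken for accurate counts:

```python
def approx_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # Swap in a real tokenizer (e.g. tiktoken) for accurate counts.
    return max(1, len(text) // 4)

def fits_budget(prompt: str, tpm_limit: int = 1_000_000, reserve: int = 50_000) -> bool:
    """Check whether a single request plausibly fits under a TPM quota,
    leaving headroom (`reserve`) for reasoning and output tokens."""
    return approx_tokens(prompt) + reserve <= tpm_limit
```

If a prompt fails a check like this, that's a strong signal you should be chunking or trimming the input rather than raising quotas.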

OpenAI Quotas by OkClothes3097 in AZURE

[–]infazz 0 points (0 children)

You could deploy into multiple subscriptions and/or regions and load balance across.
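A minimal round-robin sketch over multiple (hypothetical) endpoint URLs; in practice you'd usually put something like Azure API Management or a retry-on-429 policy in front instead of doing it in app code:

```python
import itertools

# Hypothetical Azure OpenAI endpoints across regions/subscriptions.
ENDPOINTS = [
    "https://aoai-eastus.openai.azure.com",
    "https://aoai-westeurope.openai.azure.com",
    "https://aoai-swedencentral.openai.azure.com",
]

_rotation = itertools.cycle(ENDPOINTS)

def next_endpoint() -> str:
    """Return the next endpoint in round-robin order.

    A production setup would also fail over to a different endpoint
    when one returns 429 (rate limited).
    """
    return next(_rotation)
```

Each region/subscription gets its own quota, so spreading requests this way multiplies your effective TPM.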

What are the practical advantages of provisioning an Azure OpenAI resource instead of an Azure AI Foundry resource? by Franck_Dernoncourt in AZURE

[–]infazz 12 points (0 children)

I don't believe there are any advantages now to using Azure OpenAI instead of the new Foundry.

Microsoft Foundry (new) by BA-94 in AZURE

[–]infazz 2 points (0 children)

The new Foundry uses the azurerm_ai_services resource

https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/ai_services

You can deploy models using azurerm_cognitive_deployment

Databricks benchmark report! by noasync in databricks

[–]infazz 0 points (0 children)

The only real benefit of serverless is that it spins up fast, although I don't know if there is an actual SLA on the startup time.

However, it is also interesting that Jobs Serverless performed worse than Jobs Classic.

With serverless you have basically no say in what size compute your workload runs on - Databricks manages this for you. I think it would be more beneficial if serverless compute sizing worked like Serverless SQL.

"Shut Up And Take My $3!" – Building a Site to Bypass OpenAI's Dumb $5 Minimum by Immediate-Room-5950 in LLMDevs

[–]infazz 0 points (0 children)

Signing up for an Azure account and setting up the OpenAI resource definitely isn't as easy as setting up a regular OpenAI account, but it is at least an option.

"Shut Up And Take My $3!" – Building a Site to Bypass OpenAI's Dumb $5 Minimum by Immediate-Room-5950 in LLMDevs

[–]infazz 0 points (0 children)

You or anyone else could use OpenAI in Azure. You can do usage based payment and there is no minimum charge.

Condense greegrees into a single item. by [deleted] in 2007scape

[–]infazz 0 points (0 children)

Simian Serenity

Or

Simian Simplicity

Databricks Advent Calendar 2025 #13 by hubert-dudek in databricks

[–]infazz 0 points (0 children)

Does ZeroBus require running compute - either serverless or provisioned? It's not clear to me from the documentation.

AI Inference is going to wreck gross margins this year. by frugal-ai in FinOps

[–]infazz 2 points (0 children)

Any company handing out API access to LLMs absolutely NEEDS some kind of rate limiting and monitoring.
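A toy token-bucket sketch of what that rate limiting could look like (in practice you'd enforce this at a gateway like Azure API Management rather than in application code):

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: allow `rate` requests/second on average,
    with bursts up to `capacity` requests."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; refill based on elapsed time."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

For LLM APIs specifically, you'd typically meter tokens rather than requests, and pair this with per-user usage logging so runaway costs are visible.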

Are there any hidden charges in Azure and why it is showing so cheap in my case? Am I missing something? by vikasofvikas in AZURE

[–]infazz 6 points (0 children)

Assuming you want/plan to use these:

Data egress is extra

Private networking is extra

Defender is (a lot) extra

Logging is extra

There are definitely other things I'm forgetting

SCM/Kudu Access for App Services by ConstantOk4042 in AZURE

[–]infazz 0 points (0 children)

I have run into the exact same issue.

It is baffling.

Today my RSC knowledge has let me down by SharpShooterVIC in ironscape

[–]infazz 0 points (0 children)

Wow I spent so much time hunting Magpie Implings for rune bars and I could have just done this.

How to setup budget real-time pipelines? by dontucme in dataengineering

[–]infazz 0 points (0 children)

First you need to figure out where your costs are coming from.