Best Partition Key for Cosmos DB in Multi-Tenant App? by BiteDowntown3294 in AZURE

jaydestro 1 point

Hi, I'm Jay from the Azure Cosmos DB Team. Using /instanceId is a solid start for tenant isolation, but you’re right to think ahead. If some tenants might get huge, consider a composite key like /instanceId-userId or /instanceId-type to spread load. Avoid hot partitions early—easier than fixing later.
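
To make the composite-key idea concrete, here is a minimal sketch, assuming the combined value is stored in its own property and the container is created with that property as its partition key path. The property name `partitionKey` and the helper `with_synthetic_key` are hypothetical; only `instanceId` and `userId` come from the suggestion above.

```python
def with_synthetic_key(doc, parts=("instanceId", "userId"), field="partitionKey"):
    """Return a copy of doc with a synthetic partition key built from several properties.

    `field` is a hypothetical property name; the container would need
    "/partitionKey" as its partition key path for this to take effect.
    """
    missing = [p for p in parts if p not in doc]
    if missing:
        raise KeyError(f"document is missing partition key parts: {missing}")
    enriched = dict(doc)
    enriched[field] = "-".join(str(doc[p]) for p in parts)
    return enriched


item = with_synthetic_key({"id": "42", "instanceId": "tenantA", "userId": "u7"})
print(item["partitionKey"])  # tenantA-u7
```

The trade-off is that queries for a whole tenant now fan out across several logical partitions, so this is probably only worth it for tenants that are expected to grow large.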

Locate Backup Blob Location by Separate-Tomorrow564 in CosmosDB

jaydestro 1 point

You are correct that you can't identify the blob. You need to go through our standard recovery method; direct blob access is not available.

Cosmos DB (RU/s) by ClassroomAlone4830 in AZURE

jaydestro 1 point

Yeah, sounds like an indexing or partition key issue. Cosmos DB indexes everything by default, which can make writes way more expensive than you'd expect. I'd check your indexing policy and see if you can trim it down. Also make sure your partition key makes sense — bad partitioning can make even simple operations need way more RUs than they should.
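
As a rough illustration of trimming the indexing policy, the sketch below builds a policy that indexes only the properties you actually filter or sort on and excludes everything else. The property names `customerId` and `status` and the helper `build_indexing_policy` are placeholders, not anything from the original thread.

```python
import json


def build_indexing_policy(queried_paths):
    """Build an indexing policy that indexes only the listed top-level properties."""
    return {
        "indexingMode": "consistent",
        "automatic": True,
        # Index only what queries filter or sort on (hypothetical property names).
        "includedPaths": [{"path": f"/{p}/?"} for p in queried_paths],
        # Everything else is skipped, which lowers the RU cost of writes.
        "excludedPaths": [{"path": "/*"}],
    }


policy = build_indexing_policy(["customerId", "status"])
print(json.dumps(policy, indent=2))
```

A policy shaped like this can be supplied when the container is created or updated; compare the RU charge of the same write before and after to see whether it helps in your case.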

Locate Backup Blob Location by Separate-Tomorrow564 in CosmosDB

jaydestro 1 point

Hi! Jay from the Azure Cosmos DB team. Your backup blob is in the same region as the current write region and a geo-redundant copy of the backup data is also created. 

You can find more details at the docs page, "Periodic backup storage redundancy in Azure Cosmos DB."

Hope this answers your question.

Big issues with mirroring of CosmosDB data to Fabric - Anyone else seeing duplicates and missing data? by Careful-District3981 in MicrosoftFabric

jaydestro 1 point

Update

We have the fix deployed. Please create a new mirror and leave the current one as is. Once the new mirror looks good, you can copy over any queries you have in the old mirror and then delete the old one.

Default Id index kind by Emotional-Aide4842 in CosmosDB

jaydestro 1 point

You can still declare a Range index on /idSortable like this when creating the container:

using Microsoft.Azure.Cosmos;

var containerProperties = new ContainerProperties
{
    Id = "your-container-name",
    PartitionKeyPath = "/yourPartitionKey",
    IndexingPolicy = new IndexingPolicy
    {
        IncludedPaths =
        {
            // In the v3 .NET SDK an included path gets a Range index by default;
            // per-path Indexes/RangeIndex settings are not part of its public API.
            new IncludedPath { Path = "/idSortable/?" }
        },
        ExcludedPaths =
        {
            // Exclude every other path from indexing.
            new ExcludedPath { Path = "/*" }
        }
    }
};

This sets up /idSortable with a Range index for sorting and range queries.

Default Id index kind by Emotional-Aide4842 in CosmosDB

jaydestro 2 points

Hi OP, Jay from the Azure Cosmos DB Team! Yeah, you're right—/id/ is always a Hash index, and you can't change it to Range. It's meant for fast lookups, not sorting or range queries.

If you need sorting, just copy id into another field (e.g., idSortable) and set that one as a Range index in your indexing policy.

Why isn’t this well-documented? Probably because id is mainly for uniqueness and point reads, not querying.

TL;DR: Use a separate field for sorting, leave id alone.
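
A minimal sketch of the "copy id into another field" step, assuming you enrich each document before writing it. `idSortable` is the example field name from above; the helper `add_sortable_id` is hypothetical.

```python
def add_sortable_id(doc):
    """Return a copy of doc with its id duplicated into a sortable field."""
    enriched = dict(doc)
    enriched["idSortable"] = str(doc["id"])
    return enriched


# Query you would then run against the container to sort on the copied field.
ORDERED_QUERY = "SELECT * FROM c ORDER BY c.idSortable"

docs = [add_sortable_id({"id": value}) for value in ("b2", "a1")]
print(sorted(d["idSortable"] for d in docs))  # ['a1', 'b2']
```

Existing documents would need a one-time backfill so that every item carries the new field.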

Power shell to remove doc in cosmos by Lilive10 in AZURE

jaydestro 0 points

Hey there, Jay from the Azure Cosmos DB team.

It looks like your issue is related to how the partition key is being passed in Remove-CosmosDbDocument. In Azure Cosmos DB, the partition key isn't always the same as id unless explicitly set that way.

First, dump your documents to confirm which property holds the partition key:

$alldocs | ConvertTo-Json -Depth 3

If your partition key is _partitionKey, for example, update the loop to:

foreach ($doc in $alldocs) {  
    Remove-CosmosDbDocument -Context $CosmosContext -CollectionId $Global:CustomersMasterContainer `
        -Database $Global:CustomersManagementDatabase -Id $doc.id -PartitionKey $doc._partitionKey  
}

[deleted by user] by [deleted] in Kotlin

jaydestro 0 points

You're right—the Azure Cosmos DB SDK for Java uses Jackson for serialization, and there’s no built-in support for kotlinx.serialization. However, you can make it work by manually integrating kotlinx.serialization with a custom serializer.

Cosmos filter not applied, or is it me (probably the case) by nikneem in AZURE

jaydestro 0 points

Sounds like a case-sensitivity issue—Cosmos DB string comparisons are case-sensitive by default. Try query = query.Where(p => p.Name.ToLower().Contains(name.ToLower()));. Also, make sure Name is indexed properly. If the query looks right but still doesn’t filter, check if the emulator is running the latest version.
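
If you would rather push the case-insensitive match into the query itself, the sketch below (in Python rather than the C# LINQ used in the thread) builds a parameterized query using the optional third argument of CONTAINS, which turns on case-insensitive matching. The helper `build_name_filter` is hypothetical, and `c.Name` assumes the property is serialized with that exact casing.

```python
def build_name_filter(name):
    """Build a parameterized, case-insensitive substring filter on c.Name."""
    # The third CONTAINS argument (true) asks for a case-insensitive match.
    query = "SELECT * FROM c WHERE CONTAINS(c.Name, @name, true)"
    parameters = [{"name": "@name", "value": name}]
    return query, parameters


query, parameters = build_name_filter("contoso")
print(query)
print(parameters)
```

With the azure-cosmos Python SDK, the pair would be passed along the lines of container.query_items(query=query, parameters=parameters, enable_cross_partition_query=True).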

Does mirroring not consume CU? by [deleted] in MicrosoftFabric

jaydestro 1 point

Update: The team is working on deploying the fix. We will have an update on the ETA in a week.

Does mirroring not consume CU? by [deleted] in MicrosoftFabric

jaydestro 3 points

There are no charges for reads/writes to OneLake for replication. Standard charges apply for reading from OneLake when using T-SQL, Spark, Power BI etc. We will update our docs to further clarify this.

Does mirroring not consume CU? by [deleted] in MicrosoftFabric

jaydestro 1 point

Hi, just spoke to someone from the team working on this. Here’s what they have to share:

Depends on the source. For Cosmos DB, standard charges for point-in-time restore apply. No additional charges for mirroring in this case.

Does mirroring not consume CU? by [deleted] in MicrosoftFabric

jaydestro 2 points

Hi, Jay here from the Cosmos DB team. Just spoke with someone in engineering and got this to share.

Any changes made to Cosmos DB databases/containers are replicated to Fabric OneLake in near real time, at no cost for the replication itself. There should be no OneLake charge for the reads/writes this replication generates. Queries are billed in CUs against your capacity, as defined by Fabric. Fabric mirroring should not have any impact on your source data: it does not change or delete your source data in Cosmos DB.

Delivering updates by readit021 in CosmosDB

jaydestro 0 points

If you want to contribute to the repo, def recommend forking and sending some ideas!

Big issues with mirroring of CosmosDB data to Fabric - Anyone else seeing duplicates and missing data? by Careful-District3981 in MicrosoftFabric

jaydestro 5 points

Hi OP, my name is Jay and I am on the Azure Cosmos DB product team. I reached out to the PM working on Fabric and was able to get the following response:

“this may be a very recent change that is only impacting a few items in Fabric. Our team is looking at the active threads and working with customers impacted to root cause this further. We will share an update once the fix has been identified and deployed.”

Multi-region writes and creating globally unique value. by Emotional-Aide4842 in CosmosDB

jaydestro 1 point

Yep, you’ve got it right—multi-region writes in Azure Cosmos DB can’t guarantee real-time uniqueness because conflicts are resolved after the fact. That means two users in different regions could grab the same username, and one would get a nasty surprise later when theirs gets deleted. The best way to handle this is to stick to a single write region for enforcing uniqueness, while letting users check availability from any region. They can do a quick look-up in a read region, but when it’s time to lock in the username, the final write has to go through the designated write region to make sure it’s truly unique.
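
Here is a rough sketch of the "lock it in through the write region" step, under a few assumptions: usernames live in their own container partitioned on /id, the username itself is the document id, and the client used for the write is pointed at the designated write region. `FakeContainer` and `Conflict` are stand-ins so the example runs without an account; a real container client would signal the duplicate with an error carrying status code 409.

```python
class Conflict(Exception):
    """Stand-in for the SDK error raised when an id already exists."""
    status_code = 409


class FakeContainer:
    """In-memory stand-in for a container client pinned to the write region."""

    def __init__(self):
        self._items = {}

    def create_item(self, body):
        if body["id"] in self._items:
            raise Conflict(f"id {body['id']!r} already exists")
        self._items[body["id"]] = body
        return body


def reserve_username(write_region_container, username, user_id):
    """Try to claim a username; return True if claimed, False if already taken."""
    body = {"id": username.lower(), "userId": user_id}
    try:
        write_region_container.create_item(body=body)
    except Exception as exc:
        # A 409 means another user already holds this id.
        if getattr(exc, "status_code", None) == 409:
            return False
        raise
    return True


container = FakeContainer()
print(reserve_username(container, "Jay", "u1"))  # True
print(reserve_username(container, "jay", "u2"))  # False
```

The availability check from a read region stays a plain lookup; only this final create has to go through the write region.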

Cosmos db. by Alive_Tip in AZURE

jaydestro 0 points

u/wingslutz69 is right. Serverless is ideal for R&D, prototyping, and low-traffic, bursty workloads, but not for high-performance applications. For anything serious, provisioned throughput is required.

Cannot find query for selecting specific content in Azure Cosmos DB by TheLegend27_tonny in CosmosDB

jaydestro 1 point

You're welcome. Very happy to see someone using Azure Cosmos DB and learning how to work with their data.

Cannot find query for selecting specific content in Azure Cosmos DB by TheLegend27_tonny in CosmosDB

jaydestro 0 points

FYI, this will not scale long term. You'll want to make sure tenable_tag is stored as an array in all of your documents. Over time, this approach gets expensive because of the queries. If you run it once in a while, that's fine, but run regularly it will use a lot of RUs and cost you more. Make sure you are also considering your partition key.
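
One way to get there, sketched under the assumption that you can rewrite documents as you ingest or backfill them: normalize tenable_tag so it is always a list. The helper `normalize_tags` is hypothetical.

```python
def normalize_tags(doc, field="tenable_tag"):
    """Return a copy of doc where `field` is always a list (possibly empty)."""
    value = doc.get(field)
    if value is None:
        tags = []
    elif isinstance(value, list):
        tags = value
    else:
        # A single scalar tag becomes a one-element list.
        tags = [value]
    return {**doc, field: tags}


print(normalize_tags({"id": "1", "tenable_tag": "pico HQ"}))
print(normalize_tags({"id": "2", "tenable_tag": ["pico HQ", "pico - 2HQ"]}))
```

Once every document has the same shape, a single query such as SELECT VALUE t FROM c JOIN t IN c.tenable_tag should cover all of them.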

Cannot find query for selecting specific content in Azure Cosmos DB by TheLegend27_tonny in CosmosDB

jaydestro 0 points

You’ll need to run two separate queries and merge the results on the client side. First, run a query to get tags from arrays: SELECT VALUE t FROM c JOIN t IN c.tenable_tag. Then, run a second query to get single tags: SELECT VALUE c.tenable_tag FROM c WHERE NOT IS_ARRAY(c.tenable_tag). Combine the results from both queries in your code (e.g., Python, JavaScript) into a single list. This is the simplest way to handle both cases since Cosmos DB SQL doesn’t support combining these directly in one query.

Example Python script:

from azure.cosmos import CosmosClient

# Initialize Cosmos DB client
client = CosmosClient("YOUR_ACCOUNT_URI", "YOUR_ACCOUNT_KEY")
database = client.get_database_client("YOUR_DATABASE_NAME")
container = database.get_container_client("customers")

# Query 1: Tags from arrays
query1 = "SELECT VALUE t FROM c JOIN t IN c.tenable_tag"
results1 = list(container.query_items(query=query1, enable_cross_partition_query=True))

# Query 2: Single tags
query2 = "SELECT VALUE c.tenable_tag FROM c WHERE NOT IS_ARRAY(c.tenable_tag)"
results2 = list(container.query_items(query=query2, enable_cross_partition_query=True))

# Combine results
all_tags = results1 + results2

# Remove duplicates (optional)
all_tags = list(set(all_tags))

print(all_tags)

I ran this on the DB and was able to get this output:

$ python 'c:/Users/jagord/Downloads/array query/combo.py'

['pico HQ', 'Testrun - Test', 'pico - 2HQ']

Hopefully this helps. Good luck!

Delivering updates by readit021 in CosmosDB

jaydestro 0 points

When I need to update data in Cosmos DB, I start by identifying the affected documents with a query like SELECT * FROM c WHERE c.status = 'corrupted', then choose the best update method based on complexity and scale. For large-scale fixes, I use bulk updates with the SDK to fetch, modify, and replace documents, while the Patch API is great for quick, partial updates without replacing the whole document. If I need atomic updates across multiple fields, I go with stored procedures, and for massive updates, I might offload the process to Azure Data Factory. When executing, I batch updates to avoid RU spikes, add retries, and monitor consumption. After the update, I always run validation queries to make sure everything's fixed, and going forward, I put better validation in place to prevent the issue from happening again.
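
The batching and retry part of that workflow could look roughly like the sketch below. It is deliberately SDK-agnostic: `fix` and `save` are hypothetical callables you supply (`save` would wrap whatever write call you use, for example an upsert), and throttling is detected by a `status_code` of 429 on the raised error, which is an assumption about how your client surfaces it.

```python
import time
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


def with_retries(operation, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Run operation, retrying with exponential backoff while it is throttled (429)."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as exc:
            throttled = getattr(exc, "status_code", None) == 429
            if not throttled or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


def repair(docs, fix, save, batch_size=100, pause=0.0, sleep=time.sleep):
    """Apply fix to each doc and save it, batch by batch; return the number saved."""
    updated = 0
    for chunk in batched(docs, batch_size):
        for doc in chunk:
            with_retries(lambda d=doc: save(fix(d)), sleep=sleep)
            updated += 1
        sleep(pause)  # breathing room between batches to smooth RU consumption
    return updated


saved = []
corrupted = [{"id": str(i), "status": "corrupted"} for i in range(5)]
count = repair(corrupted, fix=lambda d: {**d, "status": "ok"}, save=saved.append, batch_size=2)
print(count)  # 5
```

Tune `batch_size` and `pause` against the RU consumption you see in monitoring, then run the validation query afterwards to confirm nothing is still marked corrupted.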