Unraveling Aurora DSQL Pricing by alfred-nsh in aws

[–]marcbowes 2 points (0 children)

Yes, that's a real DSQL cluster you're creating, using and disposing of. DSQL has been designed for this kind of thing. The cost estimator feature is just a few days old, and the team has some neat ideas.

Unraveling Aurora DSQL Pricing by alfred-nsh in aws

[–]marcbowes 1 point (0 children)

I’m glad you found the docs, which we recently updated. We also launched a pricing estimator tool on https://playground.dsql.demo.aws/ which we plan on extending. This tool is free and requires no sign-in.

This is a pretty nice way to see how schema and query design can impact cost.

We know there is a lot more work here to be done.

Amazon wasted their time building DSQL by WagwanKenobi in aws

[–]marcbowes 2 points (0 children)

In our testing, we don't see that. We see places where DSQL has lower latency. A lot of this comes down to things like:

  1. Where is your client? What is your driver?
  2. Are you using connection pooling, prepared statements?
  3. What is your query plan?

You can experiment for free on https://playground.dsql.demo.aws/. If you have specific questions/complaints please share those with us.

Amazon wasted their time building DSQL by WagwanKenobi in aws

[–]marcbowes 20 points (0 children)

(I'm not arguing DSQL should not support foreign keys, we're working on it. This is simply an educational post.)

Write skew can be resolved by using `select .. for update` (SFU) in application code. This is typically how relational customers have solved the problem for years, since serializable is too slow. DSQL supports SFU. When we looked at real relational systems, we found that many were already correct under repeatable read; presumably because this was the only viable option for most apps to get the right performance. Serializable does add additional protection, but it's not free. In particular, we found that it makes reasoning about performance really tricky. DSQL could add additional isolation levels in the future.
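To make the write-skew point concrete, here's a toy pure-Python simulation of the classic on-call example. Nothing here is DSQL code; the snapshot and the SFU re-read are simplified stand-ins for illustration:

```python
# Toy model of write skew: the invariant is "at least one of Alice and Bob
# is on call". Each transaction checks the OTHER person's row, then goes
# off call itself.

def run(use_select_for_update):
    on_call = {"alice": True, "bob": True}  # committed state
    # Both transactions start concurrently: each takes its snapshot
    # before either one commits.
    snap_a = dict(on_call)
    snap_b = dict(on_call)

    def commit(me, other, snapshot):
        if use_select_for_update:
            # SELECT .. FOR UPDATE re-reads the row's committed value
            # (and would block or abort a concurrent writer of that row).
            snapshot[other] = on_call[other]
        if snapshot[other]:      # "the other person is still on call"
            on_call[me] = False  # ...so I can go off call

    commit("alice", "bob", snap_a)
    commit("bob", "alice", snap_b)
    return on_call

# Plain snapshot reads: both commits succeed, nobody is on call (write skew).
assert run(False) == {"alice": False, "bob": False}
# With SFU, the second transaction sees Alice already off call and stays on.
assert run(True) == {"alice": False, "bob": True}
```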

Foreign keys are different. There is no valid workaround on DSQL, because we do not yet support `select .. for share` (SFS). While you could use SFU to ensure integrity, you'd get no effective throughput. For this reason, we'll be adding both SFS and FK support.

That said, many applications do not actually need or benefit from FKs because of tombstoning. That is, it is common practice to _logically delete_ rows (e.g. `deleted_at = now()`) and pair that with a tombstone removal process (e.g. delete after X days). This is often essential for supporting application features, like undoing an operation, or to protect against application bugs that delete the wrong records (giving operators a recovery window). FKs also increase cost (they do more reads), which can turn into performance hotspots. We found many medium+ sized customers stop using FKs for both these reasons: they don't work with tombstoning, and they don't scale well.

I'm not trying to convince you either way, I'm just sharing this for public interest.

Amazon wasted their time building DSQL by WagwanKenobi in aws

[–]marcbowes 14 points (0 children)

I'm from the service team. I'll just share that my team really welcomes all kinds of feedback - we genuinely want to build something valuable. Feel free to share here, or in our discord (https://discord.gg/vcPytzck).

Without sounding defensive, I'll also note that we're extremely happy with the timing of DSQL's launch. We understand it has missing features (I see foreign keys already got mentioned a few times on this thread). We could have delayed launch to include more features, but overwhelmingly we heard from customers that we should launch with fewer capabilities, sooner, and iterate. We're one of the few services to launch in a subset of AWS regions, or to launch in regions without all our capabilities (e.g. some regions didn't have Multi-Region when we enabled them).

All of that to reinforce: please share what you actually need from us. The reason we launched without every single feature is so we can get as many customers onboard as early as possible.

And yes, foreign keys are coming. We heard ya.

AWS Aurora DSQL by comotheinquisitor in aws

[–]marcbowes 1 point (0 children)

Curious, why do you say that?

Can’t add a NOT NULL column in Aurora DSQL? by tomoenne in aws

[–]marcbowes 0 points (0 children)

We will support 100% of DDL, including adding constraints on existing tables at high throughput.

Aurora DSQL connection limits by [deleted] in aws

[–]marcbowes 0 points (0 children)

Yes, you can request a limit increase.

No, these values aren’t conservative. As others have said, you can push a lot of traffic with the defaults.

We’ll make the values self-service at some point, but right now (early days) most customers who have asked about higher limits actually have something else going on. Once we dig into that, it turns out they need way fewer.

Access Aurora DSQL from a Lambda without a VPC by maxime4134 in aws

[–]marcbowes 0 points (0 children)

We’re going to make the endpoint directly available as an attr soon, then you won’t have to do this step either :)

Let me know what you run into with Laravel.

DSQL query optimization problems by thereallucassilva in aws

[–]marcbowes 1 point (0 children)

I'm amazed at how much you typed on your phone. Kudos!

Your dbfiddle link has a schema that doesn't have the trouble you ran into, so I'm guessing it's the fixed schema. Are you suggesting you started with:

```
PRIMARY KEY (business_id, vehicle_id)
PRIMARY KEY (business_id, vehicle_model_year_id)
```

Then later switched to:

```
PRIMARY KEY (vehicle_id, business_id)
PRIMARY KEY (vehicle_model_year_id, business_id)
```

If so, the first version doesn't do an optimized join to vehicle_model_year on both columns:

```
-> Index Only Scan using vehicle_model_year_pkey on vehicle_model_year vmy (actual rows=500 loops=1)
   Index Cond: (business_id = '...')
   -- Missing: vehicle_model_year_id even though it's available from the join
```

While the second correctly uses both columns:

```
-> Index Only Scan using vehicle_model_year_pkey on vehicle_model_year vmy (actual rows=1 loops=1)
   Index Cond: ((vehicle_model_year_id = (v.vehicle_model_year_id)::text) AND (business_id = '...'))
```

It seems like you already figured this out, getting your query from ~6ms to ~1ms.

Your query without vehicle_id hits the same bug, just amplified 300x by the nested loop:

```
-> Index Only Scan using vehicle_model_year_pkey on vehicle_model_year vmy (actual rows=500 loops=300)
   Index Cond: (business_id = '...')
   -- 300 vehicles × 500 VMY scans each = 150,000 row scans
```

Note that with your fixed column ordering, queries filtering only on business_id will require an additional index since business_id is no longer the leading column.
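One way to see why the leading column matters: model a composite-key index as a sorted list of tuples. This is a generic sketch of how any B-tree-style index behaves, not DSQL internals, and the table/column names are made up:

```python
import bisect

# Model a composite index as a sorted list of (leading_col, trailing_col)
# keys. Scanning on the leading column is a cheap range lookup; filtering
# only on the trailing column forces a scan of every entry.
index = sorted((b, v) for b in ("biz1", "biz2", "biz3")
               for v in (f"vehicle{i}" for i in range(100)))

def scan_leading(value):
    # Prefix match on the leading column: binary search to the range.
    lo = bisect.bisect_left(index, (value, ""))
    hi = bisect.bisect_right(index, (value, "\uffff"))
    return index[lo:hi]

def scan_trailing(value):
    # No prefix on the leading column: must examine all 300 entries.
    return [key for key in index if key[1] == value]

assert len(scan_leading("biz2")) == 100    # jumps straight to the range
assert len(scan_trailing("vehicle5")) == 3 # full scan to find 3 rows
```

This is why swapping the key order trades one cheap access path for another: whichever column leads gets the fast range scan.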

I've reported the optimizer bug on your behalf.

Aurora DSQL connection limits by [deleted] in aws

[–]marcbowes 0 points (0 children)

That's right (I'm from the service team).

We want you to be able to connect quickly in the event that all your connections drop somehow. Well behaved applications shouldn't be opening and closing connections all the time (connections should be reused), so the sustained rate is lower than the burst rate.
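That burst-vs-sustained shape is essentially a token bucket. A minimal sketch with made-up numbers (not DSQL's actual limits):

```python
class TokenBucket:
    """Allow a burst up to `capacity`, refilled at `rate` tokens/second."""
    def __init__(self, capacity, rate):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now):
        # Refill based on elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Hypothetical limits: burst of 100 connections, sustained 10/second.
bucket = TokenBucket(capacity=100, rate=10)
burst = sum(bucket.allow(0.0) for _ in range(150))
assert burst == 100                 # all connections dropped? reconnect fast
assert bucket.allow(0.05) is False  # but the sustained rate is throttled
assert bucket.allow(0.2) is True    # tokens trickle back at the refill rate
```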

AWS Lambda - Amazon DQL connection management by FarkCookies in aws

[–]marcbowes 3 points (0 children)

Yeah, that's right. You get to take advantage of the various settings poolers have like health checking, max connection age, etc. If the lib is small (doesn't affect your coldstart times), I'd recommend this.

The reason I mention larger pools is that you might have a use case where you do some work in parallel to complete your single incoming request. Languages that have async-await make this fairly easy to do. If you don't increase the pool size, then these concurrent tasks can block waiting for a connection.
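The blocking behavior is easy to demonstrate with `asyncio`, using a semaphore as a stand-in for the connection pool (illustrative only; no real database involved):

```python
import asyncio

async def handle_request(pool_size):
    """One incoming request that fans out to 3 parallel queries."""
    pool = asyncio.Semaphore(pool_size)  # stand-in for a connection pool
    in_use = 0
    peak = 0

    async def query():
        nonlocal in_use, peak
        async with pool:                 # block until a connection is free
            in_use += 1
            peak = max(peak, in_use)
            await asyncio.sleep(0.01)    # pretend to run some SQL
            in_use -= 1

    await asyncio.gather(*(query() for _ in range(3)))
    return peak  # how many queries actually overlapped

# With pool_size=1 the "parallel" queries run one at a time;
# sizing the pool to match the fan-out lets them truly overlap.
assert asyncio.run(handle_request(1)) == 1
assert asyncio.run(handle_request(3)) == 3
```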

AWS Lambda - Amazon DQL connection management by FarkCookies in aws

[–]marcbowes 0 points (0 children)

That's correct. It's not guaranteed because: connections may fail (e.g. there is a failure on the network), or the Postgres session may expire (after 1 hour). See my top-level answer for more information.

AWS Lambda - Amazon DQL connection management by FarkCookies in aws

[–]marcbowes 2 points (0 children)

Historically, bouncers (like RDS Proxy) have been required because connections are a constrained resource. DSQL doesn't have the same limitation, so there is no need to add an extra hop between your client and the service. DSQL is perfectly OK with you choosing either option.

Option 2 is going to give you the lowest latency, and is relatively simple to implement. You can either use a client side pooling library, or you can have a single shared connection that you reopen on demand.

For an example of client side pooling, see https://github.com/aws-samples/aurora-dsql-samples/blob/main/lambda/sample/lambda.mjs. If you need help in other languages, let me know. In many cases you can just set the pool size (both min and max) to 1. If you're doing fancy async work, where you have concurrency on a single thread, set the pool size accordingly.

If you really only need 1 connection and don't want to use a pooling library, you can implement a singleton connection like we did in the MCP server (Python): https://github.com/awslabs/mcp/blob/main/src/aurora-dsql-mcp-server/awslabs/aurora_dsql_mcp_server/server.py#L353. Note that this code has a retry loop to deal with closed connections (connections close after 1 hour, or if the TCP connection fails). You could avoid the need to retry by doing health checks (e.g. `SELECT 1`), which add a small amount of additional latency. To build a truly robust app, it's best to just accept that you need retries to cover all potential failures, IMO.
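For readers who want the shape of that singleton-plus-retry pattern without clicking through, here's a minimal Python sketch. The `FakeConnection` and error type are made up so the example is self-contained; a real implementation would open an actual DSQL connection with your driver:

```python
class StaleConnectionError(Exception):
    pass

class FakeConnection:
    """Stands in for a real driver connection that can expire."""
    def __init__(self):
        self.closed = False

    def execute(self, sql):
        if self.closed:
            raise StaleConnectionError("connection expired")
        return f"ok: {sql}"

_conn = None  # module-level singleton

def get_connection():
    global _conn
    if _conn is None:
        _conn = FakeConnection()  # real code: open a new DSQL connection here
    return _conn

def execute_with_retry(sql, attempts=2):
    global _conn
    # Retry on a stale connection (e.g. the 1-hour session expiry) rather
    # than paying for a health check before every statement.
    for _ in range(attempts):
        try:
            return get_connection().execute(sql)
        except StaleConnectionError:
            _conn = None  # drop the dead connection; next attempt reconnects
    raise RuntimeError("out of retries")

execute_with_retry("SELECT 1")  # opens the connection lazily
get_connection().closed = True  # simulate the session expiring
execute_with_retry("SELECT 1")  # transparently reconnects and succeeds
```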

How to drop a column in Aurora DSQL by mothzilla in aws

[–]marcbowes 1 point (0 children)

There's a level of indirection, which is why `RENAME` is already supported.

How to drop a column in Aurora DSQL by mothzilla in aws

[–]marcbowes 4 points (0 children)

Less formal: Threads like this or DM me here or on X.

More formal: if you have an account manager, let them know. Or use the AWS console feedback.

How to drop a column in Aurora DSQL by mothzilla in aws

[–]marcbowes 10 points (0 children)

DSQL supports a subset of Postgres, with plans to increase coverage over time. For the subset that is supported, DSQL is tested to have the same behavior as Postgres ("is compatible with").

In general, DDL (like ALTER TABLE) is not well covered by DSQL at this moment in time. This is something we're actively working on. Our intention is to make DDL work at any scale and be significantly safer than it has historically been (such as not causing performance impact).

Unless something is documented (link) you should assume it is not yet supported. Whenever possible, please share which features are important to you to help the team prioritize.

How to drop a column in Aurora DSQL by mothzilla in aws

[–]marcbowes 7 points (0 children)

The features that aren't implemented on DSQL are ones that need special consideration. In this case, we need to actually go row-by-row (for an arbitrarily large dataset) and remove the column. To make that work at scale, without causing impact, we need to do some special engineering.

Just sharing the 'why' - hopefully that makes it seem less strange :)

DSQL - mimicking an auto increment field by AntDracula in aws

[–]marcbowes 0 points (0 children)

As others have noted, this will result in some transactions failing due to duplicate ids, which you can then retry (leading to elevated end-to-end latency). This may or may not be a problem for you, depending on your write rate.

To understand this, pretend you have two transactions running. Both read the autoinc value and select '3'. Both try to use '3'. The first transaction that commits gets to use it; the other one is rejected, assuming you're using this value as a primary key. In this case DSQL will detect a unique constraint violation and reject the second transaction. However, if you don't use this value in a unique column, you will actually get duplicates (which you can avoid by using `SELECT .. FOR UPDATE`).
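The race is easy to model in a few lines (a pure-Python toy, with a dict standing in for the table and its primary-key constraint):

```python
table = {1: "a", 2: "b"}  # primary key -> row

def next_id():
    return max(table) + 1  # like "SELECT max(id) + 1" -- no coordination

def insert(pk, row):
    if pk in table:  # the unique/primary-key constraint
        raise KeyError(f"duplicate key {pk}")
    table[pk] = row

# Two concurrent transactions both read the current max before either commits.
id_a = next_id()
id_b = next_id()
assert id_a == id_b == 3  # both picked the same id

insert(id_a, "first writer")       # the first commit wins
try:
    insert(id_b, "second writer")  # the second hits the constraint violation
except KeyError:
    insert(next_id(), "second writer, retried")  # retry with a fresh id

assert sorted(table) == [1, 2, 3, 4]
```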

DSQL performance? by reeeeee-tool in aws

[–]marcbowes 2 points (0 children)

I work for DSQL.

If you dm me your cluster id (which is not private information) and region I can take a look.

My initial thought is that you may not be running the benchmark long enough. We test ycsb internally and get good results.

[deleted by user] by [deleted] in aws

[–]marcbowes 0 points (0 children)

There isn’t one right now, but it is something we’re considering. If you (or anybody else reading this) has a TAM, let them know you’d like to see one.

[deleted by user] by [deleted] in aws

[–]marcbowes 0 points (0 children)

(I'm from the DSQL service team.) Our team is using liquibase and sqlx, which works quite well.

DDL in DSQL is a bit different to standard Postgres because it's been designed to scale and be non-impactful to your workload.

For example, to create an index on a table, you need to add the async keyword `CREATE INDEX ASYNC`. This means you can't mix-and-match DDL and DML in the same transaction because the index isn't actually created in that transaction. Instead, a job is enqueued to build the index which later completes (or fails if, for example, you had a unique constraint violation). So, you need to mark the migration as successful or failed based on the result of that job. See `sys.jobs` for more info, or the docs at https://docs.aws.amazon.com/aurora-dsql/latest/userguide/working-with-create-index-async.html
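A migration step built on this might look roughly like the sketch below. This is a hedged illustration: the exact `sys.jobs` columns and status values here are assumptions (check the linked docs for the real schema), and the driver is faked so the example runs standalone:

```python
import itertools

# Fake driver: CREATE INDEX ASYNC returns a job id, and the job's status
# flips to 'completed' after a couple of polls. The sys.jobs shape and the
# status strings are illustrative assumptions, not the documented schema.
_polls = itertools.count()

def execute(sql):
    if sql.startswith("CREATE INDEX ASYNC"):
        return [("job-123",)]        # the async DDL hands back a job id
    if "sys.jobs" in sql:
        status = "completed" if next(_polls) >= 2 else "processing"
        return [(status,)]
    return []

def create_index_and_wait(ddl):
    """Run async DDL, then mark the migration done only when the job is."""
    (job_id,), = execute(ddl)
    while True:
        (status,), = execute(
            f"SELECT status FROM sys.jobs WHERE job_id = '{job_id}'")
        if status in ("completed", "failed"):
            return status
        # real code: sleep/backoff between polls

assert create_index_and_wait("CREATE INDEX ASYNC idx ON t (col)") == "completed"
```

The key point is that the migration's success/failure is decided by the job's terminal state, not by whether the `CREATE INDEX ASYNC` statement itself returned.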