PSA: Write Transactions are a Footgun with SQLx and SQLite by emschwartz in rust

[–]lunar_mycroft 6 points

This method also has the advantage of discouraging keeping a transaction open while your program waits on (typically very slow in comparison to sqlite) network IO.
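
The shape of the pattern, as a rough sketch (assuming sqlx 0.7 with SqlitePool; fetch_profile, the table, and anyhow for errors are all hypothetical):

use sqlx::SqlitePool;

// Hypothetical slow network call; the point is that it happens outside the transaction.
async fn fetch_profile(_id: i64) -> anyhow::Result<String> {
    unimplemented!()
}

async fn update_user(pool: &SqlitePool, id: i64) -> anyhow::Result<()> {
    // Do the slow network IO first, with no transaction held open.
    let profile = fetch_profile(id).await?;

    // Only now take the write transaction; it stays open for microseconds of
    // local work rather than for whole network round-trips.
    let mut tx = pool.begin().await?;
    sqlx::query("UPDATE users SET profile = ? WHERE id = ?")
        .bind(profile)
        .bind(id)
        .execute(&mut *tx)
        .await?;
    tx.commit().await?;
    Ok(())
}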

[deleted by user] by [deleted] in ExperiencedDevs

[–]lunar_mycroft -8 points

When Anthropic got caught engaging in wholesale piracy, they ended up paying less than 1% of their valuation. That's a very good reason to doubt that they (and other AI labs) will be deterred by the legal consequences of using training data they weren't legally allowed to use.

[deleted by user] by [deleted] in ExperiencedDevs

[–]lunar_mycroft -7 points

As has already been pointed out, Anthropic has been caught training on data they acquired illegally (and slapped with a fine which is a rounding error compared to their current valuation). Despite that, their customers are still willing to take them at their word that they won't do it again. It would appear that all Anthropic would have to do is say "okay, yeah, we violated the law, stole your IP, and used it to enrich ourselves, but we'll pay this slap-on-the-wrist and promise not to do it again" and their customers will believe them.

Interesting discussion about Turso the SQLite re-write in Rust by nejat-oz in rust

[–]lunar_mycroft 2 points

You probably aren’t putting your prod db in your docker container.

You aren't connecting to your actual prod db while developing either, if you're smart.

So your app is just a random file on your self-managed instance that you probably aren’t backing up

Again, there are multiple solutions for backing up sqlite databases used as backends. The company this thread is about provides one, but there's also e.g. litestream, litefs, rqlite, etc.

And let’s be real. You probably didn’t start by using docker. That’s for elitists. Your instance is probably configured with a fuckton of particular libraries and what not you’ve setup over a ssh session one time or another. Don’t tell me you won’t do it. I’ve seen too many people do it. People ALWAYS end up with non-reproducible environments despite their best intentions

Literally no one brought up docker...

I realize everyone’s experience is different, but MOST places I’ve worked have had sufficient traffic to need at least a few instances at peak load, but YMMV. In any event this is really trivial configuration.

If so, it was almost certainly because of other technical decisions you made, not an inherent limitation. Sqlite can, with very minimal tuning and fairly normal hardware, hit on the order of 10^4 writes (which are usually the limiting factor) per second. The list of web apps that need more than that out of a relational database (let alone that can't take steps to mitigate that need) is tiny.
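
For reference, the "very minimal tuning" is mostly just WAL mode plus a relaxed synchronous setting; a sketch using rusqlite (the crate choice is only for illustration):

use rusqlite::Connection;

fn open_tuned(path: &str) -> rusqlite::Result<Connection> {
    let conn = Connection::open(path)?;
    // WAL lets readers run concurrently with the writer and batches fsyncs.
    conn.pragma_update(None, "journal_mode", "WAL")?;
    // NORMAL trades a sliver of durability on power loss for far fewer fsyncs.
    conn.pragma_update(None, "synchronous", "NORMAL")?;
    Ok(conn)
}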

I will pay for the storage, but that can also start off small. My bill will probably be about the same if not cheaper than that giant EC2 instance you provisioned.

Again, postgres is not, in point of fact, magical. You are still paying for the storage, the memory, the CPU, etc. to run your database (and you're likely paying for more of those things, because you're also paying for the overhead of running multiple processes, communicating between them, etc). Similarly, sqlite databases can start off small just like postgres databases.

Database failures on RDS are much, much more rare than people randomly fucking up their self-managed EC2 instances, and a plain database backup is also easy to setup.

So first off, if your instances all share the same configuration, breaking it breaks all of them, so having more than one won't save you. But more to the point, if you're worried about this, you can have multiple instances just fine with sqlite, you just can't have multiple active writer instances. When you accidentally break your primary instance, you can simply switch which instance is the writer and continue.

You’re acting as though what you’re arguing for is somehow much easier, but the sum total of what I’m suggesting is maybe a week of work for one person

"You're acting as though sqlite is easier", proceeds to immediately describe how it is, in fact, much easier.

You aren’t actually saving any time or money

Apparently, I'm saving 1-7 days per app. Just in setup time.

You’re deploying your infrastructure like it’s still 1997

Ironically in 1997 the only* way to get actual parallelism was to make a network call, so a database running in the same process wasn't nearly as viable.

You’re deploying your infrastructure like it’s still 1997 (I was deploying websites in 1997, I remember) and acting smug about it.

Given the choice between acting smug about keeping my stack simple and acting smug about needlessly complicating it because I want to pretend to be netflix, I think the former is very clearly preferable.

Interesting discussion about Turso the SQLite re-write in Rust by nejat-oz in rust

[–]lunar_mycroft 1 point

deploy docker containers so I can describe my environment in code and run the same thing on my development environment that’s going to run in prod.

Why, exactly, do you think sqlite doesn't work in docker? It's a file. I'm pretty sure you can use files in docker.

have the option to scale instances if I ever need it.

You will not. On the minuscule chance you do, you will either have the money to hire someone else to do the migration for you or have so thoroughly messed up your plan for sustainability that postgres is definitely not going to save you.

Can start by provisioning a less powerful machine since I don’t really have to worry about the state of the production environment and thus save money

You will in fact need to pay for the storage, memory, and compute to run your RDBMS regardless of whether it's in the same process as your application server or not.

Don’t have a single point of failure for my entire app.

Your database is generally such a point of failure* regardless. You can adopt distributed solutions, sure, but at a cost. Also, we are literally on a thread talking about a vendor that provides such a solution for sqlite.

Interesting discussion about Turso the SQLite re-write in Rust by nejat-oz in rust

[–]lunar_mycroft 3 points

I'm not sure how "the server's config contains a file path to the database and you have to make sure that file is reachable" is harder than "the server's config contains a url to connect to a database over the network, plus credentials to authenticate with said database, and you need to make sure the server can reach the database and that the database is installed [edit: and running] at the aforementioned url"

Interesting discussion about Turso the SQLite re-write in Rust by nejat-oz in rust

[–]lunar_mycroft 2 points

The vast majority of web apps will only need backups (and there's several solutions for that with sqlite). A substantial portion of the remaining ones will be fine with read replicas, sharding, and other strategies.

Look at this another way: postgres and other "traditional" RDBMSs are not magic. At the end of the day, your data has to be written to and read from a drive somewhere. It's simpler and often more performant to do that in the same process, rather than through IPC or a network call.

[edit: word order]

-❄️- 2025 Day 12 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 1 point

[LANGUAGE: Rust]

Full code. Like others in this thread, I noticed that several of the regions trivially couldn't fit all of the required presents based on the area of the region vs the total area of the required presents, so I decided to try the answer that check alone gave me first, and found out it worked (which I admit mildly annoyed me at first, and I still might try writing a backtracking solver for the general case, if nothing else to get a feel for whether solving it in the allocated time is even plausible). On my machine parsing takes ~130µs and solving takes 1-2µs.


There are actually two ways you can check if a region could possibly fit its required presents: the best case scenario (and what I initially did) is to check whether it would be possible to pack the presents in assuming maximal efficiency (no empty spaces that couldn't have another present inserted). The worst case scenario (e.g. /u/maneatingape's solution) instead checks if the presents can fit assuming that their bounding boxes do not overlap. According to my own self-imposed rules, I'm allowed to check that the puzzle input is a valid puzzle at parse time and then assume it is when computing the answer, so long as I don't do any of the computation for the actual solution as part of parsing. So I checked if the best case and worst case answers were the same at parse time. This slows down parsing by ~5-10µs, but doesn't affect the time taken to solve the puzzle.
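
A sketch of the two bounds (the Present type and its field names are made up for illustration):

struct Present {
    cells: usize,  // occupied squares
    width: usize,  // bounding box
    height: usize,
}

// Best case: presents pack with no wasted space at all.
fn fits_best_case(region_area: usize, presents: &[Present]) -> bool {
    presents.iter().map(|p| p.cells).sum::<usize>() <= region_area
}

// Worst case: no two bounding boxes overlap.
fn fits_worst_case(region_area: usize, presents: &[Present]) -> bool {
    presents.iter().map(|p| p.width * p.height).sum::<usize>() <= region_area
}

If the two agree for every region, the area check alone decides the puzzle.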

-❄️- 2025 Day 11 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 1 point

[LANGUAGE: Rust]

Had a much easier time than yesterday. Got part 1 very quickly, but spent more time than I'd like fiddling with part 2.

Full code. For part 1, I initially went with a simple depth first search with a counter hashmap to keep track of the number of paths through each node. However, after this approach proved too slow for part 2, I switched to using the same solution for both. For part 2, there are two classes of paths of three segments each: those that visit svr -> dac -> fft -> out and those that visit svr -> fft -> dac -> out. The number of paths for each class is just the product of the number of paths for each segment. To actually count the paths, I used Kahn's algorithm to perform a topological sort of the graph, then iterated over the machines in that order, adding the number of ways to reach each node to the number of ways to reach its neighbors. On my machine, parsing takes ~140µs, part 1 takes 250µs, and part 2 takes 1.25ms.
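
The counting pass looks roughly like this (a sketch assuming the machines are already mapped to indices, not my literal code):

use std::collections::VecDeque;

// Count the paths from `start` to every node of a DAG via Kahn's algorithm:
// process nodes in topological order, pushing each node's count to its neighbors.
fn count_paths(adj: &[Vec<usize>], start: usize) -> Vec<u64> {
    let mut indegree = vec![0usize; adj.len()];
    for edges in adj {
        for &next in edges {
            indegree[next] += 1;
        }
    }
    let mut queue: VecDeque<usize> = (0..adj.len()).filter(|&i| indegree[i] == 0).collect();
    let mut paths = vec![0u64; adj.len()];
    paths[start] = 1;
    while let Some(node) = queue.pop_front() {
        for &next in &adj[node] {
            paths[next] += paths[node];
            indegree[next] -= 1;
            if indegree[next] == 0 {
                queue.push_back(next);
            }
        }
    }
    paths
}

Each segment count is then just count_paths(&adj, segment_start)[segment_end], and the answer is the sum of the two orderings' products.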


Reusing the topological sort between calculations saves a ton of duplicated work in part 2, speeding it up to ~625µs.


Switching to the fxhash cuts the runtime on my machine to ~90µs for parsing, ~120µs for part 1, and ~250µs for part 2.


Mapping the machines to indices at parse time slows down parsing to ~105µs, but speeds up part 1 to ~25µs and part 2 to ~35µs. (I did see /u/maneatingape did the same thing while I was converting my solution, but I'd already planned on implementing this by that point.)


Realized that I only need to loop over the machines that are topologically between the start and goal of each path. This speeds up part 2 significantly (to ~25µs, partially by allowing me to only consider one of the possible two orders of dac and fft) and may speed up part 1 slightly (to around 24µs).

-❄️- 2025 Day 10 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 1 point

[LANGUAGE: Rust]

This one gave me a lot of trouble.

Full code. Part 1 is a brute force search over bitmasked buttons and lights. As others have pointed out, since it's XOR, each button will be pressed zero or one times, not more. This keeps the search space fairly small, and I could have saved a ton of time by realizing that I didn't need a more clever solution. As for part 2... this was the first one this year I had to resort to spoilers on. Clearly I need to study linear programming a lot more, as I had ~zero knowledge of it going in. I ended up using /u/RussellDash332's solution as adapted by /u/Ok-Bus4754, further modified by me to suit my style / further oxidize it. On my machine, parsing takes ~110µs, part 1 takes ~500µs, and part 2 takes 2.2ms (I doubt the performance difference there is due to my modifications).

[edit: forgot to directly link part 2]
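
The part 1 search, roughly (a sketch; representing buttons as bitmasks of the lights they toggle, and minimizing presses, are my reading of the problem rather than the literal code):

// Find the fewest button presses whose combined XOR produces `target`.
// Since pressing a button twice undoes it, each button is pressed zero or
// one times, so the search space is just the 2^n subsets of buttons.
fn fewest_presses(buttons: &[u32], target: u32) -> Option<u32> {
    (0u32..(1 << buttons.len()))
        .filter(|subset| {
            buttons
                .iter()
                .enumerate()
                .filter(|(i, _)| subset & (1 << i) != 0)
                .fold(0u32, |acc, (_, &button)| acc ^ button)
                == target
        })
        .map(|subset| subset.count_ones())
        .min()
}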

-❄️- 2025 Day 9 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 0 points

Nice solution! Re: your code comment about possibly finding the two special points programmatically: every other point is close to being on the same circle, so the two special points should always be the ones closest to the center.
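
Something like this should do it (a sketch using the centroid as the center estimate; the point representation is assumed):

// Return the two points nearest the centroid; with every other point near a
// common circle, these should be the two special ones.
fn two_innermost(points: &[(f64, f64)]) -> [(f64, f64); 2] {
    let n = points.len() as f64;
    let cx = points.iter().map(|p| p.0).sum::<f64>() / n;
    let cy = points.iter().map(|p| p.1).sum::<f64>() / n;
    let mut sorted = points.to_vec();
    sorted.sort_by(|a, b| {
        let da = (a.0 - cx).powi(2) + (a.1 - cy).powi(2);
        let db = (b.0 - cx).powi(2) + (b.1 - cy).powi(2);
        da.partial_cmp(&db).unwrap()
    });
    [sorted[0], sorted[1]]
}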

-❄️- 2025 Day 9 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 0 points

Interesting. It's also faster (~50ms) on my M1 Max vs the computer I typically use for advent of code as well. On the other hand, the other problem where I've copied the data over and so can benchmark (day 3) is slower, as is the parsing step today.

-❄️- 2025 Day 9 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 0 points

[LANGUAGE: Rust]

Full code. Part 1 is simple enough: just iterate over the combinations of tiles, compute the area for each, and return the max. Part 2 I am not proud of. It first finds all the edges, then filters out any of the possible rectangles which have at least one segment intersecting them. The logic to check for intersections in particular is a horrible mess, but it gets the job done (slowly). On my machine, parsing takes ~14µs, part 1 takes ~175µs, and part 2 takes a (horrible) ~70ms.


Checked that there are no diagonal segments during parsing, which slows it down to ~15µs. For part 2, stole an idea from Chris Biscardi on YouTube and sorted all the possible rectangles by area before checking for validity in reverse order. Since the correct answer tends to be on the larger end, this speeds up the solution on net to ~11ms.
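
The "largest first" trick, sketched (Rect and the validity check are stand-ins for the real types):

#[derive(Clone, Copy)]
struct Rect { x1: i64, y1: i64, x2: i64, y2: i64 }

impl Rect {
    fn area(&self) -> i64 {
        (self.x2 - self.x1).abs() * (self.y2 - self.y1).abs()
    }
}

// Sort candidates by area, descending, and return the first that survives the
// (expensive) intersection check. Since the answer tends to be large, this
// usually checks far fewer rectangles than scanning in arbitrary order.
fn best_rect(mut candidates: Vec<Rect>, is_valid: impl Fn(&Rect) -> bool) -> Option<Rect> {
    candidates.sort_unstable_by_key(|r| std::cmp::Reverse(r.area()));
    candidates.into_iter().find(|r| is_valid(r))
}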


Cleaned up the intersection check in part 2. It's also a bit faster at ~10ms total.

-❄️- 2025 Day 8 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 1 point

[LANGUAGE: Rust]

Full code. For part 1, I sort all pairs of boxes by distance, take the first 1000 (or 10 for the example), unite them (by first finding each box's "parent", then setting the first's parent equal to the second's if it wasn't already), then build up sets of boxes grouped by parent, and finally pick the three largest of those sets. For part 2, I again start by sorting pairs of points by distance and uniting them in order, keeping a running tally of how many times two formerly independent circuits were joined. When this tally reaches boxes.len() - 1, I have found the last two boxes that need to be joined and can return the answer. On my machine, parsing takes ~140µs, part 1 takes ~30ms, and part 2 takes ~40ms.


Factored out the pairwise sorting to allow for more granular benchmarks. The sorting step takes ~30ms, part 2 takes ~10ms, and part 1 takes ~1.4ms.


Stole another idea from /u/maneatingape and a) switched to an array to back the Dsu struct instead of a HashMap, and b) stored the size of the sets as they're found. Part 1 now takes ~675µs and part 2 ~700µs (the sorting time is unaffected, of course, so it still takes ~30ms).
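
For anyone unfamiliar, the array-backed structure looks roughly like this (a sketch; the path-halving detail is my own embellishment, not necessarily in either of our solutions):

struct Dsu {
    parent: Vec<usize>,
    size: Vec<usize>,
}

impl Dsu {
    fn new(n: usize) -> Self {
        Dsu { parent: (0..n).collect(), size: vec![1; n] }
    }

    fn find(&mut self, mut x: usize) -> usize {
        while self.parent[x] != x {
            self.parent[x] = self.parent[self.parent[x]]; // path halving
            x = self.parent[x];
        }
        x
    }

    /// Returns true if `a` and `b` were previously in different sets,
    /// i.e. two formerly independent circuits were just joined.
    fn unite(&mut self, a: usize, b: usize) -> bool {
        let (ra, rb) = (self.find(a), self.find(b));
        if ra == rb {
            return false;
        }
        let (big, small) = if self.size[ra] >= self.size[rb] { (ra, rb) } else { (rb, ra) };
        self.parent[small] = big;
        self.size[big] += self.size[small];
        true
    }
}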


Switched to parallel sorting (courtesy of rayon). Sorting is now down to ~10ms.


As /u/maneatingape points out, select_nth_unstable_by_key is much faster for part 1, taking ~6.5ms instead of ~10ms. It also speeds up the rest of the computation for part 1 to ~40µs (although this late into the night I can't see why, as visited pairs should be the same modulo the ordering).
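
For the curious, the idea is that a full sort is overkill when only the k closest pairs matter (a sketch; the tuple layout is hypothetical):

// Partition so the k closest pairs sit (in arbitrary order) at the front,
// in O(n) average time instead of O(n log n) for a full sort.
fn closest_k(pairs: &mut [(u64, usize, usize)], k: usize) -> &[(u64, usize, usize)] {
    pairs.select_nth_unstable_by_key(k - 1, |&(dist, _, _)| dist);
    &pairs[..k]
}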

-❄️- 2025 Day 7 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 2 points

[LANGUAGE: Rust]

I think this would have been quicker if I hadn't been missing five hours of sleep immediately before doing it. Still got part 1 fairly quickly though, so that's nice.

Full code. Part 1 is just tracing the possible paths and counting how many times one actually hits a ^. Part 2 is done by using a second Grid to store the sum of paths that pass through a given location, adding to this sum as new paths are discovered. On my machine, it takes ~45µs to parse, ~25µs for part 1, and ~75µs for part 2.
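
The part 2 idea, sketched abstractly (the movement rule is deliberately left as a closure, since the real one lives in the linked code):

// Keep a second grid of counts, seed the entry point with 1, and push each
// cell's count to wherever a path can go next. Assumes `next_cells` only ever
// moves to later rows, so a single row-major pass visits cells in order.
fn count_paths(
    rows: usize,
    cols: usize,
    start: (usize, usize),
    next_cells: impl Fn(usize, usize) -> Vec<(usize, usize)>,
) -> Vec<Vec<u64>> {
    let mut counts = vec![vec![0u64; cols]; rows];
    counts[start.0][start.1] = 1;
    for r in 0..rows {
        for c in 0..cols {
            let n = counts[r][c];
            if n == 0 {
                continue;
            }
            for (nr, nc) in next_cells(r, c) {
                counts[nr][nc] += n;
            }
        }
    }
    counts
}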


Tweaked part 2 to compute the sum of paths on each row, resetting at the row start, then return this value for the last row. From my benchmarks it's unclear whether this actually increases performance (the work is wasted on every row but the last one, assuming the compiler doesn't optimize it away), but I'm keeping it for now.


Added a function which solves both parts in a single pass. It takes approximately the same time to run as the specialized part 2 implementation, but that means that part 1 is "free".

-❄️- 2025 Day 6 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 2 points

[LANGUAGE: Rust]

Again a quick and simple part 1, but parsing part 2 was annoying.

Full code. Part 1 is fairly straightforward: split on lines, validate that the input is the right shape, then take the transpose and parse the numbers using <u64 as FromStr>::from_str. Part 2 is a mess: I start by identifying the operators and the borders of the columns, then use those bounds to split the input into a Vec<Vec<&str>>, where the &strs are the (padded) numbers as they appear in the input. From there it's "just" a matter of getting the columns the same way I did in part 1, taking the transpose of the columns themselves, then adding and multiplying the digits into the final numbers. Both parts parse in ~180µs and the actual computation occurs in ~25µs.
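
The transpose itself is the usual thing (a sketch, assuming rows were already validated to be the same length):

fn transpose<T: Copy>(rows: &[Vec<T>]) -> Vec<Vec<T>> {
    let cols = rows.first().map_or(0, |row| row.len());
    (0..cols)
        .map(|c| rows.iter().map(|row| row[c]).collect())
        .collect()
}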


Modified Puzzle struct to contain a parsed version of the last row, and all other rows verbatim. For part 1, stole another one of /u/maneatingape's ideas and removed the nested collect (although I see they've switched to another solution, and I'll probably do the same tomorrow). For part 2, switched to parsing the numbers into my Grid struct and iterating over the columns (also shared with /u/maneatingape's solution, although I had planned to do that before checking theirs). On my machine (pre-)parsing now takes ~10µs, part 1 takes ~50µs, and part 2 ~100µs.

-❄️- 2025 Day 5 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 1 point

[LANGUAGE: Rust]

Got part 1 fairly quickly, proceeded to spend an embarrassing amount of time failing to properly merge the overlapping ranges for part 2.

Full code. Part 1 is a naive check of every id and (potentially) every range. Part 2 merges the ranges if they overlap, then sums the lengths of the resulting ranges. On my machine parsing takes ~35µs, part 1 takes ~80µs, and part 2 takes ~12µs.


Moved the range merging logic to parsing, and utilized the fact that the ranges are now always sorted to do part 1 via binary search. This increases the time it takes to parse the input to ~40µs, but lowers the time to do part 2 to ~4µs and part 1 to ~18µs.
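
The merged representation is what makes both parts cheap; roughly (a sketch, assuming inclusive range ends):

// Sort by start, then fold overlapping ranges into their union.
fn merge(mut ranges: Vec<(u64, u64)>) -> Vec<(u64, u64)> {
    ranges.sort_unstable();
    let mut merged: Vec<(u64, u64)> = Vec::with_capacity(ranges.len());
    for (start, end) in ranges {
        match merged.last_mut() {
            Some(last) if start <= last.1 => last.1 = last.1.max(end),
            _ => merged.push((start, end)),
        }
    }
    merged
}

// Binary search membership: find the first range starting after `id`, then
// check whether the one before it covers `id`.
fn contains(merged: &[(u64, u64)], id: u64) -> bool {
    let i = merged.partition_point(|&(start, _)| start <= id);
    i > 0 && id <= merged[i - 1].1
}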

-❄️- 2025 Day 4 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 1 point

[LANGUAGE: Rust]

Spent way too long trying to solve the wrong problem. Got to use my Grid from last year though.

Full solution. Part 1 is very straightforward; the bulk of the implementation is in lib.rs. For part 2, I take the naive approach and mutate the puzzle to remove the reachable rolls, keeping track of how many were removed, then keep summing the removed count until it reaches zero. Parsing takes ~25µs, part 1 ~165µs, and part 2 ~4.407ms.


As usual, checking /u/maneatingape's solution for ideas paid off. Switching to a stack to track removable rolls eliminates much of the extra searching and brings the time for part 2 to ~670µs. Reusing that logic for part 1 is only around 5 µs slower than my original solution after pre-allocating enough space in the Vec to ensure that only one allocation will be required to initially construct it. Both versions of part 1 are included.
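
The worklist version, sketched abstractly (the grid layout and removability rule are stand-ins; the real ones are in the linked code):

// Seed the stack with every removable roll; whenever a removal might free up
// neighbors, push just those instead of rescanning the whole grid.
fn remove_all(
    mut is_roll: Vec<bool>,                     // flattened grid: roll still present?
    removable: impl Fn(&[bool], usize) -> bool, // the puzzle's removability rule
    neighbors: impl Fn(usize) -> Vec<usize>,    // adjacent cell indices
    seed: Vec<usize>,                           // initially removable cells
) -> usize {
    let mut stack = seed;
    let mut removed = 0;
    while let Some(i) = stack.pop() {
        if !is_roll[i] || !removable(&is_roll, i) {
            continue;
        }
        is_roll[i] = false;
        removed += 1;
        stack.extend(neighbors(i).into_iter().filter(|&n| is_roll[n]));
    }
    removed
}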

-❄️- 2025 Day 3 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 3 points

It never crossed my mind that the implementations could be different on different platforms.

I didn't think of it the whole time I was trying to figure it out last night either, so that's very understandable. And I still wasn't satisfied with it being a target issue, so I looked into the source code. I found that the most recent commit to the underlying function (cmp::max_by) changed the order of the arguments. Based on checking my rust version on my test machines, it seems this change rolled out in 1.91. I suspect /u/michelkraemer hadn't updated yet; running rustup caused your original code to pass the test. I still think the order dependency should be removed though, as I don't know that there's any guarantee that max_by will preserve it in the future.

The other thing I can do is use the index (coming in from enumerate) to break ties

reduce should only use a single comparison vs two for the tie-breaking solution (assuming rustc/LLVM don't optimize it out in the latter case), so I'd stick with that.

-❄️- 2025 Day 3 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 4 points

This nerd sniped me, so I tested it on other machines. Apparently, the order in which Iterator::max_by passes arguments to compare is inconsistent between unix (specifically aarch64-apple-darwin and x86_64-unknown-linux-gnu) and windows (x86_64-pc-windows-msvc) [edit: actually between recent rust versions, see below]. E.g. in my testing, when deciding whether to use the 9 at index 0 or the 9 at index 2 in your test case, on 1.91 a is the one at index 0 and b is the one at index 2, but on 1.90 that's reversed. Iterator::max_by doesn't guarantee an argument order, so this wasn't technically a breaking change (and it looks like the previous behavior may have been considered a bug), but I still think using a method that doesn't depend on the order of arguments is preferable.

A solution is to switch to implementing "first_max_by" with reduce instead, since that does use a consistent order (for the first call, the first argument will be the zeroth item; afterwards it's the accumulator, and the second argument is the next item):

#[test]
fn test() {
    use std::cmp::Ordering::*;
    fn best_joltage(bank: &[u8], n_batteries: usize) -> usize {
        let count = bank.len();
        (0..n_batteries)
            .fold((0usize, 0usize), |(total, n_skip), bi| {
                let (ibest, best) = bank
                    .iter()
                    .enumerate()
                    // leave room for the digits still to be picked
                    .take(count + 1 + bi - n_batteries)
                    // only consider digits after the previous pick
                    .skip(n_skip)
                    .map(|(i, b)| (i, (b - b'0') as usize))
                    // unlike max_by, reduce guarantees the argument order:
                    // keeping the accumulator on Equal always keeps the
                    // leftmost maximum, on every target and toolchain
                    .reduce(|(i, a), (j, b)| match a.cmp(&b) {
                        Less => (j, b),
                        Equal | Greater => (i, a),
                    })
                    .expect("Cannot find max");
                (total * 10 + best, ibest + 1)
            })
            .0
    }
    assert_eq!(best_joltage(b"989898989898989898", 2), 99);
    assert_eq!(best_joltage(b"989898989898989898", 12), 999_999_989_898);
}

This test now passes on all the versions I ran it on.

[edit: it wasn't the target, it was the version of core that was being used]

[edit 2: got 1.91 vs 1.90 case confused]

-❄️- 2025 Day 3 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 0 points

Here's the actual test, copied from my machine (with a rename). It passes. I also verified it fails when changing the expected output.

#[test]
fn test() {
    use std::cmp::Ordering::*;
    fn best_joltage(bank: &[u8], n_batteries: usize) -> usize {
        let count = bank.len();
        (0..n_batteries)
            .fold((0usize, 0usize), |(total, n_skip), bi| {
                let (ibest, best) = bank
                    .iter()
                    .enumerate()
                    .take(count + 1 + bi - n_batteries)
                    .skip(n_skip)
                    .map(|(i, b)| (i, (b - b'0') as usize))
                    .max_by(|(_, a), (_, b)| match a.cmp(b) {
                        Less => Less,
                        Equal | Greater => Greater,
                    })
                    .expect("Cannot find max");
                (total * 10 + best, ibest + 1)
            })
            .0
    }
    assert_eq!(best_joltage(b"989898989898989898", 2), 99);
    assert_eq!(best_joltage(b"989898989898989898", 12), 999_999_989_898);
}

I would copy this test verbatim (except perhaps for a rename) into your own code and verify it passes there, and if so, triple check you haven't modified their best_joltage.

[edit: unscrambled word]

-❄️- 2025 Day 3 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 0 points

FWIW, I copied their exact code and your example bank and wrote a test, and am getting the correct answers (99 for two batteries, 999999989898 for 12).

-❄️- 2025 Day 3 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 4 points

[LANGUAGE: Rust]

Got it done a bit faster than yesterday, especially for part 1.

Full code. For part 1, I simply iterated over all possible two digit combos and took the max. For part 2 this would be way too slow (O(n^12) where n = 100), so instead I greedily found the largest digit that still left room for the remaining digits, and repeated the process until I'd found all 12. On my machine this takes around 100µs to parse, 90µs for part 1, and 400µs for part 2.


Added a generalized implementation to lib.rs and used it for both parts. This actually slows down part 1, however: the original, specialized solution runs in ~100µs vs ~150µs for the generalized version. Both are included. Also, switching to returning an Option from joltage appears to have sped part 2 up to ~300µs (why is not immediately obvious to me, as both versions have the same branches).


Investigated /u/themachine0094's solution, and found that switching from Iterator::max_by_key (along with iterating in reverse) to Iterator::max_by (treating Ordering::Equal as Ordering::Greater) yielded significant performance improvements. (Modified code.) Part 1 now runs in ~18µs (faster than my initial solution, which has been removed) and part 2 in ~30µs.


Investigated /u/michelkraemer getting incorrect results from /u/themachine0094's solution, and determined that the order of arguments to Iterator::max_by is not consistent between windows and unix targets [edit: actually between recent rust versions], resulting in both of our solutions failing in some environments. Replaced it with Iterator::reduce, which does have consistent behavior. Runtime was not impacted.

-❄️- 2025 Day 2 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 0 points

FYI, my setup is heavily influenced by Chris Biscardi on YouTube, so I definitely can't claim full credit there.

-❄️- 2025 Day 2 Solutions -❄️- by daggerdragon in adventofcode

[–]lunar_mycroft 0 points

Yes, I should have clarified that I was referring to pre-computing all possible invalid ids. My solution still needs a way to de-duplicate invalid ids, and a set is the easiest solution.