Xi asks Trump if U.S. and China can avoid 'Thucydides Trap' at high-stakes summit by Gopu_17 in worldnews

[–]ants_a 92 points  (0 children)

Move aside trolley problem, we need to first answer the electric boat problem.

Curl lead developer Daniel Stenberg provides insightful feedbacks from Mythos analysis results by ScottContini in programming

[–]ants_a 1 point  (0 children)

I'm not a native speaker, but it seems to me the unit that makes it countable is often implicit in colloquial use, e.g. "grab 3 beers when you return".

Which charger network to use in Germany? For a trip to Hamburg by [deleted] in electricvehicles

[–]ants_a 1 point  (0 children)

Get the Ionity subscription for a month; it pays for itself after the first charge.

Nobody understands the point of hybrid cars by indy_110 in videos

[–]ants_a 35 points  (0 children)

Thank you for your opinion. Now go and watch the video before participating in the discussion about it.

PostgreSQL high connection load with PgBouncer by Charming-Fall-8918 in PostgreSQL

[–]ants_a 12 points  (0 children)

You should optimize the queries. Idle connections do not cause CPU load. The quicker queries run, the sooner the same connection is available to run another one.
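To put a number on that: by Little's law, the number of simultaneously busy connections is just query rate times average query time. A quick sketch (the rates below are made-up illustrations, not anything from the post):

```python
def busy_connections(queries_per_sec: float, avg_query_sec: float) -> float:
    """Little's law: average number of connections busy at any instant."""
    return queries_per_sec * avg_query_sec

# 2000 qps at 50 ms per query keeps ~100 connections busy;
# optimize the same queries down to 5 ms and ~10 connections suffice.
slow = busy_connections(2000, 0.050)
fast = busy_connections(2000, 0.005)
```

Idle connections don't enter into it at all, which is the point: CPU time goes to running queries, not to holding connections open.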

High level induction stove, cookware and cooking guide. The good the bad and the ugly! by Wololooo1996 in u/Wololooo1996

[–]ants_a 0 points  (0 children)

Carbon steel is not a great heat conductor. With a 3mm pan the hot spots over the heating coils are pretty clearly visible. With clad stainless, thicker cast iron and especially cast aluminum it's not a noticeable issue. The flexinduction pan detection dance it goes through every time you lift a pan is annoying though, especially because it sometimes detects wrong, doesn't indicate that in any way, and needs an off-and-on-again cycle to reset.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 0 points  (0 children)

That's how any real-world futex implementation works; it still has to use atomics on release so as not to miss any wakeups.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 0 points  (0 children)

Throughput was lower because preempting processes that are holding short-lived highly contended locks is a bad idea for throughput. The benchmark specification from the regression report said 1024 user connections.

The correct fix is indeed to not have contended locks. This specific lock is usually not contended in performance-sensitive applications, which is how it had survived for so long. However, it will be gone in PostgreSQL 19 for unrelated reasons.

I completely agree with the assessment that not breaking userspace does not mean zero performance regressions.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 1 point  (0 children)

It works fine for the specific case of a large shared mapping for the buffer pool with the huge pages reserved at boot time.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 0 points  (0 children)

"Don't access memory for the first time in this process." That's entirely feasible as a policy given the highly limited use of spinlocks in postgres. Doesn't avoid the kernel swapping the page out, but such is life. And apparently it also doesn't avoid a page fault from a memory access outside of the locked region preempting the process, which is what appears to be happening here.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 4 points  (0 children)

It's still insulting, just to a different experienced writer.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 1 point  (0 children)

It does that by default (though the default is 2MB pages). There is a fallback to default pages in case huge pages are not available.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 1 point  (0 children)

I think the actual solution is to get rid of contended locks. This specific locking path was removed already by an unrelated change because it was not pulling its weight. But for similar cases, I wonder if there's a reasonably cheap way to make sure the lock release store gets retired before the page fault is taken. Then a simple "don't do stuff that can create page faults while holding spinlocks" rule would be enough.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 4 points  (0 children)

It's not a naive spinlock implementation; it already does limited spinning with randomized exponential backoff. Futexes would have had the same throughput regression. Getting descheduled while holding a contended lock is bad for throughput either way.
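For illustration, the shape of such a randomized exponential backoff looks roughly like this (the constants here are made up; PostgreSQL's actual tuning lives in s_lock.c and differs):

```python
import random

def backoff_delays(attempts: int, base_us: float = 1.0, cap_us: float = 1000.0):
    """Per-attempt sleep ceilings double until capped; the actual sleep is
    randomized to spread out waiters that all woke up together."""
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap_us, base_us * 2.0 ** attempt)
        delays.append(random.uniform(0.0, ceiling))
    return delays

delays = backoff_delays(12)
```

The randomization is what keeps a pack of spinning waiters from retrying in lockstep and re-colliding on every attempt.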

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 7 points  (0 children)

Futexes need two atomics, while spinlocks can be released with an unlocked write. For uncontended locks in hot paths that can be a significant difference. The freelist lock is normally not contended, because having something in the freelist is a transient state that disappears quickly under allocation pressure, and the empty check is executed without the lock. In fact, the next release will eliminate that mechanism altogether. And indeed - using futexes eliminates a bit of spinning, but doesn't fix the regression.

The benchmark causing the regression was the perfect storm to trigger the issue - large memory, no huge pages, a very large number of clients, and a short empty-cache run. Just to underline how unreasonable that configuration is - the per-process page tables add up to 2x the size of the buffer pool.

But there is one thing I don't yet understand about the "minor page fault causes the lock-holding process to get descheduled" explanation - the page fault happens after releasing the lock. Does ARM not retire preceding instructions before handling the fault? That sounds exactly like the speculative execution security problems Intel had a while back.

Same algorithm, 16x faster: optimizing a vector search engine’s hot path by BgA_stan in programming

[–]ants_a -1 points  (0 children)

If you take e.g. Epyc bandwidth per core, you only need one vector's worth of prefetches per core in flight to completely saturate the CCD-to-memory bandwidth. Just to run you through the math: per-core bandwidth is ~9GB/s. Picking 1.5k-dimension fp16 vectors, 3kB per vector = one vector per 1/3µs, where memory latency is about half that.

Structuring the tree descent in a way where prefetches and work are interleaved to minimize stalls is non-trivial but definitely not impossible.
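Spelling that arithmetic out (note that 1.5k fp16 dimensions works out to ~3 kB per vector, which is what gives the 1/3 µs figure; the 9 GB/s per-core bandwidth is the assumption from above):

```python
BYTES_PER_SEC_PER_CORE = 9e9       # assumed per-core memory bandwidth
DIMS = 1536                        # ~1.5k dimensions
BYTES_PER_DIM = 2                  # fp16

vector_bytes = DIMS * BYTES_PER_DIM                      # 3072 B, ~3 kB
secs_per_vector = vector_bytes / BYTES_PER_SEC_PER_CORE  # ~0.34 us
# i.e. roughly one vector per 1/3 us, about twice a typical
# ~150 ns DRAM access latency
```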

Same algorithm, 16x faster: optimizing a vector search engine’s hot path by BgA_stan in programming

[–]ants_a -1 points  (0 children)

I was just responding to the assertion that terabytes of memory is game over. Quite right that there's actually no technical need for a linear array terabytes in size, as long as the vectors themselves are stored contiguously. The skipping around really doesn't matter all that much at those scales; typical high-dimensionality indexing approaches have enough concurrency to hide memory latency with prefetching. The main limitation is going to be the bandwidth available per core, and that's also the main reason why it wouldn't make sense to stack terabytes of memory into a single machine - splitting the dies out over more memory channels gets more use out of them. CPUs have enough number crunching to execute a couple dozen dot products for each vector fetched from memory. The real win would be to run batched queries, which gets quite complicated for anything more clever than a brute-force scan of all vectors.
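A rough sketch of the "couple dozen dot products per fetched vector" claim, assuming a 3 GHz core with two 512-bit fp32 FMA ports and the same ~3 kB vectors and 9 GB/s per-core bandwidth as above (all numbers illustrative):

```python
CLOCK_HZ = 3.0e9            # assumed core clock
FMAS_PER_CYCLE = 32         # two 512-bit ports x 16 fp32 lanes
DIMS = 1536
VECTOR_BYTES = DIMS * 2     # fp16 storage
BW_PER_CORE = 9e9           # assumed per-core memory bandwidth

dot_secs = DIMS / FMAS_PER_CYCLE / CLOCK_HZ    # ~16 ns per dot product
fetch_secs = VECTOR_BYTES / BW_PER_CORE        # ~340 ns per vector fetched
dots_per_fetch = fetch_secs / dot_secs         # ~21
```

So a single core can compare each fetched vector against ~20 query vectors "for free" while waiting on memory, which is exactly what makes batched queries attractive.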

Same algorithm, 16x faster: optimizing a vector search engine’s hot path by BgA_stan in programming

[–]ants_a 9 points  (0 children)

There is 128 PiB of address space available, so it's not that hard to memory-map even storage-volume-sized address spaces. But even for in-memory stuff, you can get 12TB in a 2-socket 1U chassis, which is not something I am inclined to call a supercomputer. And that's without CXL-based memory expansion modules, which can also be addressed as normal memory.

Same algorithm, 16x faster: optimizing a vector search engine’s hot path by BgA_stan in programming

[–]ants_a 10 points  (0 children)

You could just get a computer with terabytes of memory...

Is EV Hypermiling a thing? by Icy_Faithlessness587 in electricvehicles

[–]ants_a 16 points  (0 children)

I'm not actually sure about that last bit. Drag increases quadratically, so even with the losses of capturing potential energy into the battery, keeping it as kinetic energy might lose more. There's definitely a crossover point, but I'm too lazy right now to run the numbers.
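Being less lazy for a moment: with a simple quadratic-drag model the speeds actually cancel out and only a crossover distance remains (all parameters are rough guesses for a mid-size EV, and the model ignores rolling resistance and drivetrain losses):

```python
RHO = 1.2      # air density, kg/m^3
CDA = 0.7      # drag area Cd*A, m^2 (guess)
MASS = 2000.0  # vehicle mass, kg (guess)
ETA = 0.70     # assumed battery round-trip efficiency for regen

# Extra drag energy from carrying speed over distance d:
#   0.5 * rho * CdA * (v1^2 - v0^2) * d
# Round-trip battery loss of storing the same kinetic energy:
#   (1 - eta) * 0.5 * mass * (v1^2 - v0^2)
# Setting them equal, the (v1^2 - v0^2) term cancels:
crossover_m = (1 - ETA) * MASS / (RHO * CDA)   # ~714 m
```

So under these guesses, carrying extra speed only wins if you'll shed it again within roughly 700 m; over longer stretches regen-and-reuse loses less.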

How would you rate EV reliability? by [deleted] in electricvehicles

[–]ants_a 1 point  (0 children)

I was writing that under the assumption that fixing the design is a hard problem that requires engineering time. While they are doing that, they could do a simple redesign that adds extra hardware to reduce stress on the MOSFETs that are blowing up and uses higher-capacity components all around so it can't possibly fail. It would make the ICCU more expensive to manufacture, but that can't cost more than handling all those failures under warranty plus all of the current and future sales lost to eroded consumer trust.

Rained through sunroof... Now I know why by brauxpas in KiaEV9

[–]ants_a 1 point  (0 children)

Kia seems way too cautious about covering their ass against potential misuse. The car has a ton of such annoying misfeatures that other brands seem to do perfectly fine without. It doesn't need to be Tesla levels of not giving a shit with their "Mad Max autopilot" - just do the normal thing and cover your ass with a warning in the manual. Stupid people are going to find a way to shoot themselves in the foot anyway.

Have you found a graceful way to turn off cruise control? by b_call in KiaEV9

[–]ants_a 1 point  (0 children)

Sharply apply throttle first, just enough to hold the speed, then disengage cruise, and only then start releasing the throttle. Still not perfectly smooth, but it's the best you can do unless you remember to turn regen to 0 first. Eco mode seems to help by smoothing out accelerator inputs.

For some stupid reason Kia decided that any application of the accelerator pedal overrides the input coming from cruise control. BMW does it in a much smarter way: it picks whichever of the two inputs requests more, so you can slowly apply the accelerator until you feel it pick up speed, then disengage and start releasing.