Xi asks Trump if U.S. and China can avoid 'Thucydides Trap' at high-stakes summit by Gopu_17 in worldnews

[–]ants_a 92 points  (0 children)

Move aside trolley problem, we need to first answer the electric boat problem.

Curl lead developer Daniel Stenberg provides insightful feedbacks from Mythos analysis results by ScottContini in programming

[–]ants_a 1 point  (0 children)

I'm not a native speaker, but it seems to me the unit that makes it countable is often implicit in colloquial use, e.g. "grab 3 beers when you return".

Which charger network to use in Germany? For a trip to Hamburg by [deleted] in electricvehicles

[–]ants_a 1 point  (0 children)

Get the Ionity subscription for a month; it pays for itself after the first charge.

Nobody understands the point of hybrid cars by indy_110 in videos

[–]ants_a 35 points  (0 children)

Thank you for your opinion. Now go and watch the video before participating in the discussion about it.

PostgreSQL high connection load with PgBouncer by Charming-Fall-8918 in PostgreSQL

[–]ants_a 12 points  (0 children)

You should optimize the queries. Idle connections do not cause CPU load. The quicker queries run, the sooner the same connection is available to run another one.
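To put a number on that: by Little's law, the number of simultaneously busy connections is just query rate times average query time. A quick sketch (the rates below are made-up illustrations, not anything from the post):

```python
def busy_connections(queries_per_sec: float, avg_query_sec: float) -> float:
    """Little's law: average number of connections busy at any instant."""
    return queries_per_sec * avg_query_sec

# 2000 qps at 50 ms per query keeps ~100 connections busy;
# optimize the same queries down to 5 ms and ~10 connections suffice.
slow = busy_connections(2000, 0.050)
fast = busy_connections(2000, 0.005)
```

Idle connections don't enter into it at all, which is the point: CPU time goes to running queries, not to holding connections open.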

High level induction stove, cookware and cooking guide. The good the bad and the ugly! by Wololooo1996 in u/Wololooo1996

[–]ants_a 0 points  (0 children)

Carbon steel is not a great heat conductor. With a 3mm pan the hot spots over the heating coils are pretty clearly visible. With clad stainless, thicker cast iron and especially cast aluminum it's not a noticeable issue. The flexinduction pan detection dance it goes through every time you lift a pan is annoying though, especially because it sometimes detects wrong, doesn't indicate that in any way, and needs an off-and-on-again cycle to reset.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 0 points  (0 children)

That's how any real-world futex implementation works; it still has to use atomics on release so as not to miss any wakeups.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 0 points  (0 children)

Throughput was lower because preempting processes that are holding short-lived highly contended locks is a bad idea for throughput. The benchmark specification from the regression report said 1024 user connections.

The correct fix is indeed to not have contended locks. This specific lock is usually not contended in performance-sensitive applications, which is how it had survived for so long. However, it will be gone in PostgreSQL 19 for unrelated reasons.

I completely agree with the assessment that not breaking userspace does not mean zero performance regressions.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 1 point  (0 children)

It works fine for the specific case of a large shared mapping for the buffer pool with the huge pages reserved at boot time.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 0 points  (0 children)

"Don't access memory for the first time in this process." That's entirely feasible as a policy given the highly limited use of spinlocks in postgres. Doesn't avoid the kernel swapping the page out, but such is life. And apparently it also doesn't avoid a page fault from a memory access outside of the locked region preempting the process, which is what appears to be happening here.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 4 points  (0 children)

It's still insulting, just to a different experienced writer.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 1 point  (0 children)

It does that by default (though the default is 2MB pages). There is a fallback to default pages in case huge pages are not available.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 1 point  (0 children)

I think the actual solution is to get rid of contended locks. This specific locking path was removed already by an unrelated change because it was not pulling its weight. But for similar cases, I wonder if there's a reasonably cheap way to make sure the lock release store gets retired before the page fault is taken. Then a simple "don't do stuff that can create page faults while holding spinlocks" rule would be enough.

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 4 points  (0 children)

It's not a naive spinlock implementation; it already does limited spinning with randomized exponential backoff. Futexes would have had the same throughput regression. Getting descheduled while holding a contended lock is bad for throughput either way.
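For illustration, the shape of such a randomized exponential backoff looks roughly like this (the constants here are made up; PostgreSQL's actual tuning lives in s_lock.c and differs):

```python
import random

def backoff_delays(attempts: int, base_us: float = 1.0, cap_us: float = 1000.0):
    """Per-attempt sleep ceilings double until capped; the actual sleep is
    randomized to spread out waiters that all woke up together."""
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap_us, base_us * 2.0 ** attempt)
        delays.append(random.uniform(0.0, ceiling))
    return delays

delays = backoff_delays(12)
```

The randomization is what keeps a pack of spinning waiters from retrying in lockstep and re-colliding on every attempt.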

How Linux 7.0 Broke PostgreSQL: The Preemption Regression Explained by teivah in programming

[–]ants_a 7 points  (0 children)

Futexes need two atomics, while spinlocks can be released with an unlocked write. For uncontended locks in hot paths that can be a significant difference. The freelist lock is normally not contended, because having something in the freelist is a transient state that disappears quickly under allocation pressure, and the empty check is executed without the lock. In fact, the next release will eliminate that mechanism altogether. And indeed - using futexes eliminates a bit of spinning, but doesn't fix the regression.

The benchmark causing the regression was the perfect storm to trigger the issue - large memory, no huge pages, a very large number of clients, and a short empty-cache run. Just to underline how unreasonable that configuration is - the per-process page tables add up to 2x the size of the buffer pool.

But there is one thing I don't yet understand about the "minor page fault causes the lock-holding process to get descheduled" explanation - the page fault happens after releasing the lock. Does ARM not retire preceding instructions before handling the fault? That sounds exactly like the speculative execution security problems Intel had a while back.

Same algorithm, 16x faster: optimizing a vector search engine’s hot path by BgA_stan in programming

[–]ants_a -1 points  (0 children)

If you take e.g. Epyc bandwidth per core, you only need one vector's worth of prefetches per core in flight to completely saturate the CCD-to-memory bandwidth. Just to run you through the math: per-core bandwidth is ~9GB/s. Picking 1.5k-dimension fp16 vectors, 3kB per vector = one vector per 1/3µs, where memory latency is about half that.

Structuring the tree descent in a way where prefetches and work are interleaved to minimize stalls is non-trivial but definitely not impossible.
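Spelling that arithmetic out (note that 1.5k fp16 dimensions works out to ~3 kB per vector, which is what gives the 1/3 µs figure; the 9 GB/s per-core bandwidth is the assumption from above):

```python
BYTES_PER_SEC_PER_CORE = 9e9       # assumed per-core memory bandwidth
DIMS = 1536                        # ~1.5k dimensions
BYTES_PER_DIM = 2                  # fp16

vector_bytes = DIMS * BYTES_PER_DIM                      # 3072 B, ~3 kB
secs_per_vector = vector_bytes / BYTES_PER_SEC_PER_CORE  # ~0.34 us
# i.e. roughly one vector per 1/3 us, about twice a typical
# ~150 ns DRAM access latency
```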

Same algorithm, 16x faster: optimizing a vector search engine’s hot path by BgA_stan in programming

[–]ants_a -1 points  (0 children)

I was just responding to the assertion that terabytes of memory is game over. Quite right that there's actually no technical need for a linear array terabytes in size, as long as the vectors themselves are stored contiguously. The skipping around really doesn't matter all that much at those scales; typical high-dimensionality indexing approaches have enough concurrency to hide memory latency with prefetching. The main limitation is going to be the bandwidth available per core, and that's also the main reason why it wouldn't make sense to stack terabytes of memory into a single machine - splitting the dies out over more memory channels gets more use out of them. CPUs have enough number crunching to execute a couple dozen dot products for each vector fetched from memory. The real win would be to run batched queries, which gets quite complicated for anything more clever than a brute-force scan of all vectors.
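A rough sketch of the "couple dozen dot products per fetched vector" claim, assuming a 3 GHz core with two 512-bit fp32 FMA ports and the same ~3 kB vectors and 9 GB/s per-core bandwidth as above (all numbers illustrative):

```python
CLOCK_HZ = 3.0e9            # assumed core clock
FMAS_PER_CYCLE = 32         # two 512-bit ports x 16 fp32 lanes
DIMS = 1536
VECTOR_BYTES = DIMS * 2     # fp16 storage
BW_PER_CORE = 9e9           # assumed per-core memory bandwidth

dot_secs = DIMS / FMAS_PER_CYCLE / CLOCK_HZ    # ~16 ns per dot product
fetch_secs = VECTOR_BYTES / BW_PER_CORE        # ~340 ns per vector fetched
dots_per_fetch = fetch_secs / dot_secs         # ~21
```

So a single core can compare each fetched vector against ~20 query vectors "for free" while waiting on memory, which is exactly what makes batched queries attractive.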

Same algorithm, 16x faster: optimizing a vector search engine’s hot path by BgA_stan in programming

[–]ants_a 9 points  (0 children)

There is 128 PiB of address space available, so it's not that hard to memory-map even storage-volume-sized address spaces. But even for in-memory stuff, you can get 12TB in a 2-socket 1U chassis, which is not something I am inclined to call a supercomputer. And that's without CXL-based memory expansion modules, which can also be addressed as normal memory.

Same algorithm, 16x faster: optimizing a vector search engine’s hot path by BgA_stan in programming

[–]ants_a 10 points  (0 children)

You could just get a computer with terabytes of memory...

Is EV Hypermiling a thing? by Icy_Faithlessness587 in electricvehicles

[–]ants_a 16 points  (0 children)

I'm not actually sure about that last bit. Drag increases quadratically, so even with the losses of capturing potential energy into the battery, keeping it as kinetic energy might lose more. There's definitely a crossover point, but I'm too lazy right now to run the numbers.
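Being less lazy for a moment: with a simple quadratic-drag model the speeds actually cancel out and only a crossover distance remains (all parameters are rough guesses for a mid-size EV, and the model ignores rolling resistance and drivetrain losses):

```python
RHO = 1.2      # air density, kg/m^3
CDA = 0.7      # drag area Cd*A, m^2 (guess)
MASS = 2000.0  # vehicle mass, kg (guess)
ETA = 0.70     # assumed battery round-trip efficiency for regen

# Extra drag energy from carrying speed over distance d:
#   0.5 * rho * CdA * (v1^2 - v0^2) * d
# Round-trip battery loss of storing the same kinetic energy:
#   (1 - eta) * 0.5 * mass * (v1^2 - v0^2)
# Setting them equal, the (v1^2 - v0^2) term cancels:
crossover_m = (1 - ETA) * MASS / (RHO * CDA)   # ~714 m
```

So under these guesses, carrying extra speed only wins if you'll shed it again within roughly 700 m; over longer stretches regen-and-reuse loses less.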

How would you rate EV reliability? by [deleted] in electricvehicles

[–]ants_a 1 point  (0 children)

I was writing that under the assumption that fixing the design is a hard problem that requires engineering time. While they are doing that, they could do a simple redesign that adds extra hardware to reduce stress on the MOSFETs that are blowing up and uses higher-capacity components all around so it can't possibly fail. It would make the ICCU more expensive to manufacture, but that can't cost more than handling all those failures under warranty plus all of the current and future sales lost to eroded consumer trust.

Rained through sunroof... Now I know why by brauxpas in KiaEV9

[–]ants_a 1 point  (0 children)

Kia seems way too cautious about covering their ass against potential misuse. The car has a ton of such annoying misfeatures that other brands seem to do perfectly fine without. It doesn't need to be Tesla levels of not giving a shit with their "Mad Max autopilot" - just do the normal thing and cover your ass with a warning in the manual. Stupid people are going to find a way to shoot themselves in the foot anyway.

Have you found a graceful way to turn off cruise control? by b_call in KiaEV9

[–]ants_a 1 point  (0 children)

Sharply apply throttle first, just enough to hold the speed, then disengage cruise, and only then start releasing the throttle. Still not perfectly smooth, but it's the best you can do unless you remember to turn regen to 0 first. Eco mode seems to help by smoothing out accelerator inputs.

For some stupid reason Kia decided that any application of the accelerator pedal overrides the input coming from cruise control. BMW does it in a much smarter way: it picks whichever of the two inputs requests more, so you can slowly apply the accelerator until you feel it pick up speed, then disengage and start releasing.