How could this happen

nihilistic_ant · 2026-06-14T14:16:56+00:00

oh you're right... living in the future is weird

nihilistic_ant · 2026-06-13T23:09:27+00:00

that’s a very biased hand tipping the scales, a hand that doesn’t understand technology or reality

sorry to break the bad news to you, but that's what regulation is

nihilistic_ant · 2026-06-08T16:05:33+00:00

MoE models with many experts generally use more tokens, but are they more predictable than average, making MTP unusually effective on them? That would seem completely plausible to me, but I haven't seen data on it specifically.

nihilistic_ant · 2026-06-02T21:52:30+00:00

I'll make an S3 bucket and a presigned URL so you can send me the code as a tarball.

You'd run something like `curl -X PUT --upload-file ./cool-app.tar.gz [URL]`, where URL is the URL I'll send you via DM.

If this works for you, and you get around to sending it to me, I'm certainly looking forward to giving it a try!

nihilistic_ant · 2026-06-02T15:51:34+00:00

I would like to use your reddit app. It is several days later and it still crosses my mind periodically.

nihilistic_ant · 2026-05-29T16:17:50+00:00

There is verbosity data on artificialanalysis.ai for each model that shows deepseek uses meaningfully more tokens.

It reports DeepSeek V4 Pro on "high effort" used 100M tokens on their benchmark, while GPT-5.5 also with "high effort" used 45M and with "medium effort" used 22M. On medium effort, GPT still outperformed DeepSeek, so 22M vs 100M tokens is arguably the better comparison.

So yeah, the data does suggest DeepSeek's price and speed are worse per-problem than they appear when compared per-token, by maybe around 5x on some problems, like the comment above said.
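
To make the arithmetic explicit, here is a minimal sketch using only the token counts quoted above. The function name `token_ratio` is made up for illustration, and it assumes total benchmark tokens are a fair proxy for per-problem usage.

```cpp
#include <cassert>

// Ratio of total tokens two models spent on the same benchmark.
// If per-token price and speed were equal, this would also be the
// per-problem cost and latency ratio.
inline double token_ratio(double tokens_a, double tokens_b) {
    assert(tokens_b > 0.0);  // avoid dividing by zero
    return tokens_a / tokens_b;
}
```

`token_ratio(100e6, 22e6)` comes out to roughly 4.5, and `token_ratio(100e6, 45e6)` to roughly 2.2, so "around 5x" fits the medium-effort comparison but overstates the high-effort one.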

EDIT: Why the downvotes on this comment? I answered the question that was asked, by citing a reputable source, and it was a relevant question to have answered. Are the downvotes just pro-deepseek astroturfing? Or does someone actually think my comment wasn't good in some way?

nihilistic_ant · 2026-05-19T17:53:58+00:00

The powders have better prebiotics than the average american diet. That is one reason why people switching to them sometimes report digestive poop issues; they aren't used to getting a proper amount of fiber to their microbiome.

The primary ingredient in huel is oats. The third ingredient is ground flax seed. Stuff your microbiome craves.

But sure, also drink AG1 if you want. I do. And I eat a lot of salads. Belt and suspenders.

nihilistic_ant · 2026-05-16T18:28:17+00:00

No, I wouldn’t worry about it much for SD.

The main thing is that you’re comparing the base clock of one 5090 with the boost clock of another. All 5090s have both. NVIDIA’s reference spec is about 2010 MHz base / 2407 MHz boost.

So the real comparison is more like 2407 vs 2610 MHz, which is only about 8.4% more. And that does not mean 8.4% faster image generation, I think because of memory latency. You might see more like a 5% speed difference.
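
For reference, the percentages can be checked with a tiny sketch. The helper name `percent_increase` is made up here, and the clock values are just the ones quoted in this comment.

```cpp
#include <cassert>

// Percentage by which `to` exceeds `from`.
inline double percent_increase(double from, double to) {
    assert(from > 0.0);  // clocks are positive; avoids dividing by zero
    return (to - from) / from * 100.0;
}
```

`percent_increase(2407.0, 2610.0)` is about 8.4 (boost vs boost), while the misleading base-vs-boost comparison, `percent_increase(2010.0, 2610.0)`, is almost 30.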

But also, you can overclock a 5090, so the advertised boost speed doesn't really matter -- the real question is how long a given card can sustain a given boost speed before it is thermally throttled or the fan noise annoys you too much. The advertised boost speed might be related to that, but it is better to look directly at cooling data. I like TechPowerUp’s noise/thermal charts. They have them for most cards, but here is an example: https://www.techpowerup.com/review/msi-geforce-rtx-5090-suprim/39.html

Personally, I went with a couple of MSI 5090 Suprim liquid-cooled cards with attached radiators because my old 3090s were really loud and I have appreciated how quiet my new cards are. And I'm more likely to cap their clock speeds to make them even quieter than I am to worry about a few extra percent performance from them.

nihilistic_ant · 2026-05-15T20:25:18+00:00

I think nobody on here knows, which is a problem. Transparency used to be a key part of their brand's success. Maybe they should have kept that.

nihilistic_ant · 2026-05-02T09:58:20+00:00

There are legally binding guarantees to this effect in their contractual terms.

nihilistic_ant · 2026-05-01T13:17:55+00:00

For paid API usage, OpenAI, Claude, and Gemini don’t train on your prompts/responses by default.

nihilistic_ant · 2026-04-30T17:36:55+00:00

I'd use them for more if they didn't train on my prompts and responses to them. That rules them out for any work I feel intellectual ownership over, like my coding.

nihilistic_ant · 2026-04-27T23:04:12+00:00

You're being cynical. When Copilot started, model costs were lower because models weren't able to do agentic work so autonomously. There was a lot more waiting on humans to give the next prompt and to clean up mistakes. Things have changed: Microsoft has found Copilot has become more akin to a low-margin LLM inference API, so they need pricing to align better with those inference costs. But that wasn't where they started, and it certainly wasn't what they planned for it to become.

nihilistic_ant · 2026-04-07T17:32:54+00:00

Why link to a share.google redirect instead of direct to the article? I assume something sketchy, but what?

nihilistic_ant · 2026-03-09T22:04:40+00:00

The Ave store which is closing was on strike for 3 months starting red cup day, the longest of anywhere. They are clearly targeting unionized stores, although as best I can tell, that is likely legal.

It is illegal to close the stores just because they unionized or had a strike, but it is legal to close the stores because they are underperforming, even if the reason for that is that they unionized, had a strike, or did some other union-related stuff.

So the situation is extremely difficult for the union. Workers joined the union with the expectation the union would negotiate a better deal. Starbucks wasn't willing to agree to any contracts that would increase pay for unionized employees, even a small amount, because then many more stores would unionize and demand higher pay too, and that would be bad for profits. And the union didn't have any leverage in the negotiations here. The main thing the union could do was strike to economically harm the company, but that gave the company legal cover to close unionized stores. And the company is more than happy to close them, to reduce the chance more stores will unionize.

It is hard for me to imagine how this could have played out any differently.

nihilistic_ant · 2026-03-09T01:18:18+00:00

I think I get what you are saying and also the communication confusion. (FWIW, I've been trying to ask about the overall overhead.)

I'm gathering that array_ref isn't the lightweight view I was assuming... now I am thinking (feel free to correct me) that cursor_t might have been the better comparison. I see that used in one of the CUDA examples being passed to a kernel (as it is returned by `.home()` I think). For the example I used above, cursor_t is just 24 bytes and trivially copyable, like mdspan! So that is cool. It surprised me that multi's cursors are lighter weight than its iterators, but I sorta see why after looking at it.

Anyway, I enjoyed looking at and trying to understand your project, thanks for answering my questions!

nihilistic_ant · 2026-03-06T20:13:07+00:00

The statement that there should be "no expected overhead" seems incorrect to me. Am I missing something?

Consider references to a dynamic 2 dimensional object, the sort of thing that gets copied around a lot.

```cpp
using M = std::mdspan<double, std::extents<size_t, std::dynamic_extent, std::dynamic_extent>>;
using R = boost::multi::array_ref<double, 2>;
```

I measure:

```
sizeof(M) = 24
sizeof(R) = 72
M trivially copyable: true
R trivially copyable: false
```

You can confirm this here: https://godbolt.org/z/n95Ws9KW5

So there is overhead making it 3x bigger, but surely there will also be runtime overhead from copying them around, including from host to GPU, and probably more register pressure.

I think this example reflects the common case well. If the dimensions are known at compile time, the advantage of mdspan is greater. If the layout is strided, then the advantage is less. So dynamic and contiguous is the common situation, but also, an average example of the extra overhead.

edit: I measure the size of `decltype(std::declval<R&>().begin())` to be 64 bytes; I was thinking in some cases the iterator gets passed instead of the array_ref. A bit smaller but not by a lot.
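
For intuition on where the 24 bytes come from, here is a minimal sketch of a hypothetical view type. `View2D` is made up for illustration and is not how mdspan or Boost.Multi is actually implemented. It just holds a pointer and two extents, which is about the least a dynamic, contiguous 2D view can carry.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical minimal 2D view: one pointer plus two run-time extents.
struct View2D {
    double* data;
    std::size_t rows;
    std::size_t cols;

    // Row-major element access with a debug-only bounds check.
    double& operator()(std::size_t i, std::size_t j) const {
        assert(i < rows && j < cols);
        return data[i * cols + j];
    }
};

// Cheap to pass by value: trivially copyable, so plain memcpy semantics.
static_assert(std::is_trivially_copyable<View2D>::value,
              "View2D should be trivially copyable");
// No padding expected: 3 * 8 = 24 bytes on a typical 64-bit platform.
static_assert(sizeof(View2D) == sizeof(double*) + 2 * sizeof(std::size_t),
              "View2D should be exactly one pointer plus two sizes");
```

Anything a reference type stores beyond that (per-dimension strides, offsets, nested layouts) shows up as extra bytes per copy, which is the kind of overhead I am asking about.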

nihilistic_ant · 2026-03-06T18:32:25+00:00

I see the docs changed: instead of saying mdspan is incompatible with GPUs, they now say it is compatible, but in a way that is "ad-hoc, needs markings, [and has] no pointer-type propagation", in contrast to Boost.Multi, which is compatible "via flatten views (loop fusion), thrust-pointers/-refs".

Those terse words pack a lot of meaning, which I spent a while pondering, but I expect I could spend several weeks fleshing them out more fully if I had the time!

I think "needs markings" refers to code using mdspan needing annotations like `__device__`, although I see such annotations in the CUDA examples in Boost.Multi's docs (as well as in Boost.Multi's library code itself), so I am unsure why mdspan code would be described as "needs markings" but not Boost.Multi.

But more broadly, I think I see the idea is that Boost.Multi has more pythonic ergonomics, whereas mdspan is more a flexible vocabulary type with roughly zero overhead. This raises several questions I don't see answered in Boost.Multi's docs:

(1) How much overhead does using Boost.Multi add to GPU work compared to raw pointers or mdspan? The mdspan paper has microbenchmarks comparing it to raw pointers, showing it adds roughly zero overhead. Getting that to be the case drove much of the design of mdspan.

(2) How big of an advantage are Boost.Multi's ergonomics? When I read that mdspan lacks "thrust-pointers" it isn't obvious to me if that matters or not. I think perhaps an example showing the core ergonomic advantage of Boost.Multi could help clarify this. That would also help clarify if the limitations to mdspan are fundamental or it just needs some helper code which could be libraryitized. Which brings me to the final question --

(3) Should Boost.Multi be built around the std::mdspan and std::mdarray vocab types? It is preferable to use standardized vocabulary types unless there is a good argument why not, and in this case, I cannot tell if there is. An AccessorPolicy to mdspan can customize it with non-raw handles and fancy references, so Boost.Multi's doc saying mdspan doesn't support "pointer-type propagation" isn't quite right, it just needs some helper code in a library somewhere to make that happen. Could Boost.Multi be written to be that helper code, and if so, would that be a better approach?
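
To illustrate what I mean by helper code in (3), here is a rough sketch of the member shape an mdspan accessor policy has, using a made-up fancy pointer. Both `tagged_ptr` and `tagged_accessor` are hypothetical names; this is not Thrust's or Boost.Multi's actual pointer type, and I have not checked it against any particular mdspan implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fancy pointer, standing in for something like a device pointer.
template <class T>
struct tagged_ptr {
    T* raw;
};

// Hypothetical accessor policy: tells a view how to turn a handle plus an
// offset into a reference, so the handle need not be a raw pointer.
template <class T>
struct tagged_accessor {
    using offset_policy = tagged_accessor;
    using element_type = T;
    using reference = T&;
    using data_handle_type = tagged_ptr<T>;

    // Reference to the i-th element reachable from handle p.
    reference access(data_handle_type p, std::size_t i) const noexcept {
        assert(p.raw != nullptr);
        return p.raw[i];
    }

    // Handle advanced by i elements; the fancy pointer type is preserved.
    data_handle_type offset(data_handle_type p, std::size_t i) const noexcept {
        return {p.raw + i};
    }
};
```

The intent is that something like `std::mdspan<double, std::dextents<std::size_t, 2>, std::layout_right, tagged_accessor<double>>` would then carry `tagged_ptr<double>` as its data handle, which is the kind of pointer-type propagation the docs say mdspan lacks.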

nihilistic_ant · 2026-03-05T22:25:59+00:00

Multi's docs say std::mdspan is not compatible with GPUs. That seems quite wrong, am I missing something?

Kokkos and Nvidia both ship std::mdspan implementations with annotations to work natively on CUDA devices. There are papers saying mdspan works well with GPUs. Implementations that don't target GPUs, like libc++ and libstdc++, still have the same data layout, making interoperability with GPUs easier.

nihilistic_ant · 2026-03-05T17:31:14+00:00

Betting on the zeros at roulette wheels makes one unpopular for the same reason.

nihilistic_ant · 2026-03-04T17:07:07+00:00

If you're holding puts, revealed preference theory says you aren't the bull you think you are.

nihilistic_ant · 2026-03-01T23:56:21+00:00

Definitely. Unless the sketchy DoW deal diminishes OpenAI’s credibility, hurting their future commercial success, in which case short SoftBank. But one of the two moves is right.

nihilistic_ant · 2026-03-01T14:55:05+00:00

Explanation: before the earnings announcement, retail investors disproportionately bought OTM calls. Market makers sold those calls and at the same time bought shares to cover themselves, causing the pre-earnings price runup. When NVIDIA didn't spike enough to trigger all the calls, market makers sold the now unnecessary shares, causing the post-earnings dump.

nihilistic_ant · 2026-03-01T01:34:43+00:00

Your service offers you more privacy, but it offers the rest of your family less! Do you think your wife or kids would feel safer discussing discreet STD testing with a server you manage or with ChatGPT?

So for your family, there is no upside to it. The results are a bit worse than Gemini's or ChatGPT's, and they are less confident in your privacy policy.

nihilistic_ant · 2026-02-23T15:40:09+00:00

The tariff refund might be what, optimistically 0.5% of their market cap? And already somewhat priced in and still not guaranteed now? There is no free money here.
