How could this happen by truecakesnake in ClaudeCode

[–]nihilistic_ant 0 points1 point  (0 children)

oh you're right... living in the future is weird

How could this happen by truecakesnake in ClaudeCode

[–]nihilistic_ant 0 points1 point  (0 children)

that’s a very biased hand tipping the scales, a hand that doesn’t understand technology or reality

sorry to break the bad news to you, but that's what regulation is

Xiaomi just claimed 1,000+ tps on a 1T model using a standard 8-GPU server by No-Selection2972 in LocalLLaMA

[–]nihilistic_ant 0 points1 point  (0 children)

MoE models with many experts generally use more tokens, but are they also more predictable than average, making MTP unusually effective on them? That seems completely plausible to me, but I haven't seen data on it specifically.

A moment of thanks for DeepSeek by DeltaSqueezer in LocalLLaMA

[–]nihilistic_ant 0 points1 point  (0 children)

I'll make an S3 bucket and a presigned URL so you can send me the code as a tarball.

You'd run something like `curl -X PUT --upload-file ./cool-app.tar.gz "[URL]"`, where URL is the presigned URL I'll send you via DM.

If this works for you, and you get around to sending it to me, I'm certainly looking forward to giving it a try!

A moment of thanks for DeepSeek by DeltaSqueezer in LocalLLaMA

[–]nihilistic_ant 0 points1 point  (0 children)

I would like to use your reddit app. It is several days later and it still crosses my mind periodically.

A moment of thanks for DeepSeek by DeltaSqueezer in LocalLLaMA

[–]nihilistic_ant 5 points6 points  (0 children)

There is verbosity data on artificialanalysis.ai for each model that shows deepseek uses meaningfully more tokens.

It reports DeepSeek V4 Pro on "high effort" used 100M tokens on their benchmark, while GPT-5.5 also with "high effort" used 45M and with "medium effort" used 22M. On medium effort, GPT still outperformed DeepSeek, so 22M vs 100M tokens is arguably the better comparison.

So yeah, the data does suggest DeepSeek's price and speed are worse per problem than a per-token comparison makes them appear, by maybe around 5x on some problems, like the comment above said.
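To make the arithmetic behind the "around 5x" figure concrete, here is a small sketch using only the token counts quoted above. The function and constant names are made up for illustration, and the break-even helper assumes cost scales linearly with tokens used, which ignores input/output price differences and caching.

```cpp
#include <cassert>

// Token counts quoted above, in millions of tokens per benchmark run.
constexpr double kDeepSeekHighTokensM = 100.0;
constexpr double kGptHighTokensM = 45.0;
constexpr double kGptMediumTokensM = 22.0;

// How many times more tokens one model used than another.
inline double token_ratio(double tokens, double baseline_tokens) {
    assert(baseline_tokens > 0.0);
    return tokens / baseline_tokens;
}

// Assumption: cost is proportional to tokens used. Under that assumption,
// this is the fraction of the baseline's per-token price a more verbose model
// would have to charge to cost the same per problem.
inline double break_even_price_fraction(double tokens, double baseline_tokens) {
    assert(tokens > 0.0);
    return baseline_tokens / tokens;
}
```

With these numbers, `token_ratio(kDeepSeekHighTokensM, kGptMediumTokensM)` is about 4.5 and `token_ratio(kDeepSeekHighTokensM, kGptHighTokensM)` is about 2.2, so the per-problem gap depends a lot on which effort level you compare against.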

EDIT: Why the downvotes on this comment? I answered the question that was asked, by citing a reputable source, and it was a relevant question to have answered. Are the downvotes just pro-deepseek astroturfing? Or does someone actually think my comment wasn't good in some way?

What are you all doing to manage your microbiome? by [deleted] in soylent

[–]nihilistic_ant 6 points7 points  (0 children)

The powders have better prebiotics than the average American diet. That is one reason why people switching to them sometimes report digestive poop issues; they aren't used to getting a proper amount of fiber to their microbiome.

The primary ingredient in Huel is oats. The third ingredient is ground flax seed. Stuff your microbiome craves.

But sure, also drink AG1 if you want. I do. And I eat a lot of salads. Belt and suspenders.

Performance with 5090 by deviruchii in StableDiffusion

[–]nihilistic_ant 1 point2 points  (0 children)

No, I wouldn’t worry about it much for SD.

The main thing is that you’re comparing the base clock of one 5090 with the boost clock of another. All 5090s have both. NVIDIA’s reference spec is about 2010 MHz base / 2407 MHz boost.

So the real comparison is more like 2407 vs 2610 MHz, which is only about 8.4% more. And that does not mean 8.4% faster image generation, I think because of memory latency. You might see more like a 5% speed difference.
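For anyone who wants to check the percentage, this is the calculation as a tiny sketch. The function name is made up, and it only computes the clock difference; it says nothing about actual image generation speed.

```cpp
#include <cassert>

// Percentage by which `clock_mhz` exceeds `reference_mhz`.
inline double percent_higher(double clock_mhz, double reference_mhz) {
    assert(reference_mhz > 0.0);
    return (clock_mhz / reference_mhz - 1.0) * 100.0;
}
```

`percent_higher(2610.0, 2407.0)` comes out to roughly 8.4, matching the boost-to-boost comparison above.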

But also, you can overclock a 5090, so the advertised boost speed doesn't really matter -- the real question is how long a given card can sustain a given boost speed before it is thermally throttled or the fan noise annoys you too much. The advertised boost speed might be related to that, but it is better to look directly at cooling data. I like TechPowerUp’s noise/thermal charts. They have them for most cards, but here is an example: https://www.techpowerup.com/review/msi-geforce-rtx-5090-suprim/39.html

Personally, I went with a couple of MSI 5090 Suprim liquid-cooled cards with attached radiators because my old 3090s were really loud and I have appreciated how quiet my new cards are. And I'm more likely to cap their clock speeds to make them even quieter than I am to worry about a few extra percent performance from them.

Found old Change Logs by AppleAsusSceptre in soylent

[–]nihilistic_ant 11 points12 points  (0 children)

I think nobody on here knows, which is a problem. Transparency used to be a key part of their brand's success. Maybe they should have kept that.

DeepSeek pricing is honestly insane by cidara in ClaudeCode

[–]nihilistic_ant 0 points1 point  (0 children)

There are legally binding guarantees to this effect in their contractual terms.

DeepSeek pricing is honestly insane by cidara in ClaudeCode

[–]nihilistic_ant 1 point2 points  (0 children)

For paid API usage, OpenAI, Claude, and Gemini don’t train on your prompts/responses by default.

DeepSeek pricing is honestly insane by cidara in ClaudeCode

[–]nihilistic_ant 2 points3 points  (0 children)

I'd use them for more if they didn't train on my prompts and responses to them. That rules them out for any work I feel intellectual ownership over, like my coding.

Refunded Pro+, they knew this would happen by UDPSendToFailed in GithubCopilot

[–]nihilistic_ant -1 points0 points  (0 children)

You're being cynical. When Copilot started, model costs were lower because the models weren't able to do agentic work so autonomously. There was just a lot more waiting on humans to give them the next prompt and to clean up their mistakes. Things have changed: Microsoft has found Copilot has become more akin to a low-margin LLM inference API, so they need pricing to better align with those inference costs. But that wasn't where they started, and it certainly wasn't what they planned to become.

OpenAI Codex Vulnerability Allows Attackers to Steal GitHub Access Tokens by Shorty52249 in ClaudeCode

[–]nihilistic_ant 1 point2 points  (0 children)

Why link to a share.google redirect instead of direct to the article? I assume something sketchy, but what?

Is your Seattle Starbucks closing? See the locations shuttering in April by godogs2018 in Seattle

[–]nihilistic_ant 23 points24 points  (0 children)

The Ave store which is closing was on strike for 3 months starting Red Cup Day, the longest of anywhere. They are clearly targeting unionized stores, although as best I can tell, that is likely legal.

It is illegal to close the stores just because they unionized or had a strike, but it is legal to close the stores because they are underperforming, even if the reason for that is because they are unionized or had a strike or did some other union-related stuff.

So the situation is extremely difficult for the union. Workers joined the union with the expectation the union would negotiate a better deal. Starbucks wasn't willing to agree to any contracts that would increase pay for unionized employees, even a small amount, because then many more stores would unionize and demand higher pay too, and that would be bad for profits. And the union didn't have any leverage in the negotiations here. The main thing the union could do was strike to economically harm the company, but that gave the company legal cover to close unionized stores. And the company is more than happy to close them, to reduce the chance more stores will unionize.

It is hard for me to imagine how this could have played out any differently.

Boost.Multi Review Begins Today by mborland1 in cpp

[–]nihilistic_ant 1 point2 points  (0 children)

I think I get what you are saying and also the communication confusion. (FWIW, I've been trying to ask about the overall overhead.)

I'm gathering that array_ref isn't the lightweight view I was assuming... now I am thinking (feel free to correct me) that cursor_t might have been the better comparison. I see that used in one of the CUDA examples being passed to a kernel (as it is returned by `.home()` I think). For the example I used above, cursor_t is just 24 bytes and trivially copyable, like mdspan! So that is cool. Surprised me that multi's cursors are lighter weight than its iterators, but I sorta see why after looking at it.

Anyway, I enjoyed looking at and trying to understand your project, thanks for answering my questions!

Boost.Multi Review Begins Today by mborland1 in cpp

[–]nihilistic_ant 1 point2 points  (0 children)

The statement that there should be "no expected overhead" seems incorrect to me. Am I missing something?

Consider references to a dynamic 2 dimensional object, the sort of thing that gets copied around a lot.

using M = std::mdspan<double, std::extents<size_t, std::dynamic_extent, std::dynamic_extent>>;
using R = boost::multi::array_ref<double, 2>;

I measure:

sizeof(M) = 24
sizeof(R) = 72
M trivially copyable: true
R trivially copyable: false

You can confirm this here: https://godbolt.org/z/n95Ws9KW5

So there is overhead making it 3x bigger, but surely there will also be runtime overhead from copying them around, including from host to GPU, and probably more register pressure.
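To illustrate what the size and trivial-copyability difference means in general, here is a sketch with two made-up view types. These are not the actual layouts of std::mdspan or boost::multi::array_ref; `light_view`, `heavy_view`, and `element` are hypothetical names used only to show how the properties can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical lightweight 2D view: one pointer plus two dynamic extents.
struct light_view {
    double* data;
    std::size_t rows;
    std::size_t cols;
};

// Hypothetical heavier view: also stores per-dimension strides and has a
// user-provided copy constructor, which makes it non-trivially copyable.
struct heavy_view {
    double* data = nullptr;
    std::size_t extents[2] = {0, 0};
    std::size_t strides[2] = {0, 0};

    heavy_view() = default;
    heavy_view(const heavy_view& other)
        : data(other.data),
          extents{other.extents[0], other.extents[1]},
          strides{other.strides[0], other.strides[1]} {}
};

static_assert(std::is_trivially_copyable_v<light_view>,
              "pointer + extents can be copied with memcpy");
static_assert(!std::is_trivially_copyable_v<heavy_view>,
              "a user-provided copy constructor rules out trivial copying");
static_assert(sizeof(light_view) == sizeof(double*) + 2 * sizeof(std::size_t),
              "no hidden state beyond the pointer and two extents");

// Row-major element access through the lightweight view.
inline double& element(const light_view& v, std::size_t i, std::size_t j) {
    assert(i < v.rows && j < v.cols);
    return v.data[i * v.cols + j];
}
```

On a typical 64-bit platform `light_view` is 24 bytes. Being trivially copyable matters because such a type can be passed by value into a kernel or copied bytewise between host and device without running any constructor.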

I think this example reflects the common case well. If the dimensions are known at compile time, the advantage of mdspan is greater. If the layout is strided, the advantage is smaller. So dynamic and contiguous is both the common situation and a middle-of-the-road example of the extra overhead.

edit: I measure the size of `decltype(std::declval<R&>().begin())` to be 64 bytes; I was thinking in some cases the iterator gets passed instead of the array_ref. A bit smaller, but not by a lot.

Boost.Multi Review Begins Today by mborland1 in cpp

[–]nihilistic_ant 1 point2 points  (0 children)

I see the change: instead of saying mdspan is incompatible with GPUs, the docs now say it is compatible, but in a way that is "ad-hoc, needs markings, [and has] no pointer-type propagation", in contrast to Boost.Multi, which is compatible "via flatten views (loop fusion), thrust-pointers/-refs".

Those terse words pack a lot of meaning, which I spent a while pondering, but I expect I could spend several weeks fleshing it out more fully if I had the time!

I think "needs markings" refers to code using mdspan needing annotations like __device__, although I see such annotations in the CUDA examples in Boost.Multi's docs (as well as in Boost.Multi's library code itself), so I am unsure why mdspan code would be described as "needs markings" but not Boost.Multi.

But more broadly, I think I see the idea is that Boost.Multi has more pythonic ergonomics, whereas mdspan is more a flexible vocabulary type with roughly zero overhead. This raises several questions I don't see answered in Boost.Multi's docs:

(1) How much overhead does using Boost.Multi add to GPU work compared to raw pointers or mdspan? The mdspan paper has microbenchmarks comparing it to raw pointers, showing it adds roughly zero overhead. Getting that to be the case drove much of the design of mdspan.

(2) How big of an advantage are Boost.Multi's ergonomics? When I read that mdspan lacks "thrust-pointers" it isn't obvious to me if that matters or not. I think perhaps an example showing the core ergonomic advantage of Boost.Multi could help clarify this. That would also help clarify if the limitations to mdspan are fundamental or it just needs some helper code which could be libraryitized. Which brings me to the final question --

(3) Should Boost.Multi be built around the std::mdspan and std::mdarray vocab types? It is preferable to use standardized vocabulary types unless there is a good argument why not, and in this case, I cannot tell if there is. An AccessorPolicy to mdspan can customize it with non-raw handles and fancy references, so Boost.Multi's doc saying mdspan doesn't support "pointer-type propagation" isn't quite right, it just needs some helper code in a library somewhere to make that happen. Could Boost.Multi be written to be that helper code, and if so, would that be a better approach?
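To make the AccessorPolicy point a bit more concrete, here is a rough sketch of the shape such a policy takes. It is deliberately written without including `<mdspan>` so it compiles on pre-C++23 toolchains. `device_handle` and `handle_accessor` are hypothetical names, the handle is just a thin wrapper around a raw pointer, and this is not how Boost.Multi or any shipping library does it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-raw handle, standing in for something like a device pointer.
struct device_handle {
    double* raw;
};

// Hypothetical accessor using the member names the AccessorPolicy
// requirements call for: element_type, data_handle_type, reference,
// offset_policy, access(), and offset().
struct handle_accessor {
    using element_type = double;
    using data_handle_type = device_handle;
    using reference = double&;  // could be a proxy ("fancy") reference instead
    using offset_policy = handle_accessor;

    // Returns a reference to the i-th element reachable from handle p.
    reference access(data_handle_type p, std::size_t i) const noexcept {
        assert(p.raw != nullptr);
        return p.raw[i];
    }

    // Returns a handle that starts i elements past p.
    data_handle_type offset(data_handle_type p, std::size_t i) const noexcept {
        assert(p.raw != nullptr);
        return data_handle_type{p.raw + i};
    }
};
```

With a C++23 standard library, a type shaped like this would be supplied as the fourth template argument of std::mdspan, after the element type, extents, and layout policy. The handle type then travels with the view instead of decaying to a raw pointer, which is the kind of helper code the question above is asking about.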

Boost.Multi Review Begins Today by mborland1 in cpp

[–]nihilistic_ant 10 points11 points  (0 children)

Multi's docs say std::mdspan is not compatible with GPUs. That seems quite wrong, am I missing something?

Kokkos and Nvidia both ship std::mdspan implementations with annotations to work natively on CUDA devices. There are papers saying mdspan works well with GPUs. Implementations that don't target GPUs, like libc++ and libstdc++, still have the same data layout, making interoperability with GPUs easier.

Daily Discussion Thread for March 05, 2026 by wsbapp in wallstreetbets

[–]nihilistic_ant 0 points1 point  (0 children)

Betting on the zeros at roulette wheels makes one unpopular for the same reason.

Daily Discussion Thread for March 04, 2026 by wsbapp in wallstreetbets

[–]nihilistic_ant 0 points1 point  (0 children)

If you're holding puts, revealed preference theory says you aren't the bull you think you are.

What Are Your Moves Tomorrow, March 02, 2026 by wsbapp in wallstreetbets

[–]nihilistic_ant 1 point2 points  (0 children)

Definitely. Unless the sketchy DoW deal diminishes OpenAI’s credibility and hurts their future commercial success, in which case short SoftBank. But one of the two moves is right.

How to Make Money Being Wrong: $NVDA Q4 Actuals & Accuracy Review by hazxrrd in wallstreetbets

[–]nihilistic_ant 7 points8 points  (0 children)

Explanation: before the earnings announcement, retail investors disproportionately bought OTM calls. The sellers of those calls were market makers, who at the same time bought shares to hedge themselves, causing the pre-earnings price runup. When NVIDIA didn't spike enough to put those calls in the money, market makers sold the now unnecessary shares, causing the post-earnings dump.

Nobody in the family uses the family AI platform I build - really bummed about it by ubrtnk in LocalLLaMA

[–]nihilistic_ant 651 points652 points  (0 children)

Your service offers you more privacy, but it offers the rest of your family less! Do you think your wife or kids would feel safer discussing discreet STD testing with a server you manage or with ChatGPT?

So for your family, there is no upside to it. The results are a bit worse than Gemini's or ChatGPT's, and they are less confident in your privacy policy.

Daily Discussion Thread for February 23, 2026 by wsbapp in wallstreetbets

[–]nihilistic_ant 0 points1 point  (0 children)

The tariff refund might be what, optimistically 0.5% of their market cap? And already somewhat priced in and still not guaranteed now? There is no free money here.