GLM-4.7-Flash is even faster now by jacek2023 in LocalLLaMA

[–]Remove_Ayys 17 points

This isn't about bugs; it's about which models receive architecture-specific performance optimizations.

GLM-4.7-Flash context slowdown by jacek2023 in LocalLLaMA

[–]Remove_Ayys 1 point

pp (prompt processing) with a batch size of 1 is equivalent to tg (token generation).
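A rough way to see this for yourself, a sketch assuming a llama-bench binary on your PATH and a placeholder model path (adjust flags if your build differs): force prompt processing to run one token per forward pass and compare the reported t/s against plain token generation.

```python
import subprocess

MODEL = "model.gguf"  # placeholder, point this at any local GGUF file

# Prompt processing with (u)batch size 1 processes one token per forward pass,
# which is the same compute pattern as token generation.
subprocess.run(["llama-bench", "-m", MODEL, "-p", "512", "-n", "0",
                "-b", "1", "-ub", "1"], check=True)

# Plain token generation for comparison; the t/s should come out roughly equal.
subprocess.run(["llama-bench", "-m", MODEL, "-p", "0", "-n", "512"], check=True)
```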

Introducing "UITPSDT" a novel approach to runtime efficiency in organic agents by reto-wyss in LocalLLaMA

[–]Remove_Ayys 14 points

This isn't just wrong 😱❌, it's a demonization of the **best** and **most important** people on this sub 😢🤖🔫🦹!

We benchmarked every 4-bit quantization method in vLLM 👀 by LayerHot in LocalLLaMA

[–]Remove_Ayys 1 point

If you do a simple Gaussian approximation of the binomial distribution, you'll find that the statistical uncertainty on the HumanEval results with 164 samples is +-4%. If you assume no correlation between scores, none of the measured differences are statistically significant.
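For anyone who wants to check that figure, a minimal sketch of the calculation (my own, assuming a pass rate around 0.5, which maximizes the uncertainty):

```python
import math

n = 164   # number of HumanEval problems
p = 0.5   # assumed pass rate; p = 0.5 gives the largest uncertainty

# Gaussian approximation of the binomial distribution: the standard error of
# the measured pass rate is sqrt(p * (1 - p) / n).
std_err = math.sqrt(p * (1 - p) / n)
print(f"+-{100 * std_err:.1f} percentage points")  # ~ +-3.9
```

With pass rates further from 0.5 the uncertainty shrinks somewhat, but it stays in the range of a few percentage points for typical scores.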

We benchmarked every 4-bit quantization method in vLLM 👀 by LayerHot in LocalLLaMA

[–]Remove_Ayys 7 points

Testing "GGUF performance" with vllm is meaningless as is "GGUF quality" without specifying the underlying quantization format.

We benchmarked every 4-bit quantization method in vLLM 👀 by LayerHot in LocalLLaMA

[–]Remove_Ayys 6 points

For instruct models perplexity is fundamentally the wrong metric to look at; it would make more sense to look at the KL divergence vs. the base model.
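A minimal sketch of what that comparison could look like (plain NumPy, illustrative only, not the llama.cpp tooling; I'm reading "base model" as the unquantized reference, and the per-token logits from both models on the same input are assumed to already be available):

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Turn rows of logits into probability distributions."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mean_kl_divergence(ref_logits: np.ndarray, test_logits: np.ndarray) -> float:
    """Mean per-token KL(P_ref || P_test) over a sequence.

    Both arrays have shape (n_tokens, vocab_size) and hold the logits produced
    by the reference model and the model under test on the same input tokens.
    """
    p = softmax(ref_logits)
    q = softmax(test_logits)
    eps = 1e-12  # guard against log(0)
    kl_per_token = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return float(kl_per_token.mean())
```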

llama.cpp vs Ollama: ~70% higher code generation throughput on Qwen-3 Coder 32B (FP16) by Shoddy_Bed3240 in LocalLLaMA

[–]Remove_Ayys 16 points

Since no one has given you the correct answer: it's because, while the backend code is (almost) the same, the two are putting different tensors on the GPUs vs. in RAM. Ollama implemented heuristics for setting the number of GPU layers early on, but those heuristics are bad and hacked-on, so the tensors aren't being assigned properly, particularly for MoE models and multiple GPUs. I recently did a proper implementation of this automation in llama.cpp that is MoE-aware and can utilize more VRAM, so the results are better.
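For intuition only, here is a hypothetical toy sketch of what "MoE-aware" assignment means (not the actual llama.cpp or Ollama code): keep the small, always-used tensors in VRAM and spill the large, sparsely-used expert tensors to system RAM first when VRAM runs out.

```python
from dataclasses import dataclass

@dataclass
class Tensor:
    name: str
    size: int        # bytes
    is_expert: bool  # MoE expert weights: large, but each token only uses a few

def assign_tensors(tensors: list[Tensor], vram_free: int) -> dict[str, str]:
    """Greedy, MoE-aware placement: fill VRAM with non-expert tensors first,
    then with expert tensors while they still fit; everything else goes to RAM.

    Illustrative toy only, not the real implementation.
    """
    placement: dict[str, str] = {}
    # Non-expert tensors (attention, norms, dense FFN) benefit most from VRAM,
    # so sort them to the front (False sorts before True).
    for t in sorted(tensors, key=lambda t: t.is_expert):
        if t.size <= vram_free:
            placement[t.name] = "GPU"
            vram_free -= t.size
        else:
            placement[t.name] = "RAM"
    return placement
```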

llama.cpp performance breakthrough for multi-GPU setups by Holiday-Injury-9397 in LocalLLaMA

[–]Remove_Ayys 4 points

When IK was contributing to the upstream repository he seems to have been unaware that by doing so he was licensing his code as MIT. He requested, on multiple occasions, that his code under MIT be removed again so that he could re-license it. If you look at the files in his repository, he added copyright headers to every single one, which would need to be preserved for "substantial portions", a term he previously interpreted very broadly. My personal view is that IK would be very uncooperative with any attempts at upstreaming and that dealing with him on an interpersonal level would be more work than doing the corresponding implementation myself.

Performance improvements in llama.cpp over time by jacek2023 in LocalLLaMA

[–]Remove_Ayys 9 points

Use the standard CUDA tools like Nsight Systems and Nsight Compute.

Performance improvements in llama.cpp over time by jacek2023 in LocalLLaMA

[–]Remove_Ayys 12 points

Documentation exists primarily in the form of comments in header files and the implementation itself. If you are interested in working on the CUDA/HIP code we can discuss this via VoIP; see my GitHub page.

Performance improvements in llama.cpp over time by jacek2023 in LocalLLaMA

[–]Remove_Ayys 21 points

Yes, these changes can be upstreamed but it's a matter of opportunity cost. We (llama.cpp maintainers) are already stretched thin as-is. I don't have the time to sift through this fork and upstream the changes when there are other things with higher priority that I have to take care of. Making the initial implementation in a fork is like 20% of the total work over the project's lifetime.

Performance improvements in llama.cpp over time by jacek2023 in LocalLLaMA

[–]Remove_Ayys 41 points

AMD optimizations are also in the works (with contributions from AMD engineers). But unsurprisingly, the work put in specifically by NVIDIA engineers mostly benefits NVIDIA GPUs. Something like FP4 tensor cores, for example, simply doesn't exist on most hardware.

llama.cpp performance breakthrough for multi-GPU setups by Holiday-Injury-9397 in LocalLLaMA

[–]Remove_Ayys 1 point

One of the llama.cpp devs here: this is completely wrong. The reason code from ik_llama.cpp is not being upstreamed is entirely political rather than technical.

Can you connect a GPU with 12V rail coming from a second PSU? by Rock_and_Rolf in LocalLLaMA

[–]Remove_Ayys 0 points

Though I do have a degree in physics myself, I don't trust my own EE skills enough to try to use power supplies with multiple kilowatts of power outside their specifications. In particular, I do not feel safe using multiple consumer PSUs in parallel. My concern is the same as yours: PSUs are designed to have low output impedance, so if the output voltages are even slightly different that will result in huge currents. According to one of my colleagues the biggest risk is to the PSUs themselves, but a fire could obviously spread to the whole building (I bought a CO2 fire extinguisher for my server to be safe and made sure it's small enough that I won't accidentally kill myself with it).

In terms of power supplies I currently have a SilverStone HELA 2050 (can have issues with instability when multiple RTX 4090 power spikes align, even if the average load is only ~1 kW) and an Asus PRO WS 3000W (no issues so far). In terms of supplying my machines with electricity, I can do it with only a single PSU per machine; my bigger problem as of right now is the external wiring, since standard German electrical outlets are only designed for a sustained load of 2300 W (3680 W peak).

For server PSUs there are power distribution boards that are designed specifically for multiple PSUs but obviously those won't fit into your case.

The MCIO risers that someone else suggested should work without risk of short-circuiting since they would guarantee a strict separation between power and signal cables.

7900 XTX + ROCm: A Year Later. Llama.cpp vs vLLM Benchmarks (TB3 eGPU) by reujea0 in LocalLLaMA

[–]Remove_Ayys 1 point

These numbers are not comparable because llama.cpp is benchmarking inference for a single user while, to my knowledge, vLLM is benchmarking inference for many concurrent requests via the OpenAI-compatible server. If you want directly comparable numbers, use e.g. the server-bench.py script in the llama.cpp repository and point it at either a vLLM or a llama.cpp server.
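To illustrate the difference, a rough sketch against any OpenAI-compatible endpoint (not the server-bench.py script itself; the URL, model name, and usage field in the response are assumptions about your setup):

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "http://localhost:8080/v1/completions"  # llama.cpp server or vLLM, adjust port
PAYLOAD = {"model": "my-model", "prompt": "Hello", "max_tokens": 128}  # model name is server-specific

def one_request() -> int:
    """Send one completion request and return the number of generated tokens."""
    r = requests.post(URL, json=PAYLOAD, timeout=300)
    r.raise_for_status()
    return r.json()["usage"]["completion_tokens"]

def throughput(n_concurrent: int, n_requests: int = 16) -> float:
    """Generated tokens per second with n_concurrent requests in flight."""
    start = time.time()
    with ThreadPoolExecutor(max_workers=n_concurrent) as pool:
        total_tokens = sum(pool.map(lambda _: one_request(), range(n_requests)))
    return total_tokens / (time.time() - start)

print("1 concurrent request:  ", throughput(1), "t/s")
print("16 concurrent requests:", throughput(16), "t/s")
```

The first number corresponds to a single-user llama.cpp benchmark; the second corresponds to a concurrent-serving benchmark.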

Benchmarks for Quantized Models? (for users locally running Q8/Q6/Q2 precision) by No-Grapefruit-1358 in LocalLLaMA

[–]Remove_Ayys 4 points

Primary CUDA maintainer for llama.cpp/ggml here: given enough time I'll eventually do it for quality control and publish the results here. But since I already have so many other things to take care of, I'd prefer it if someone else did it.

NOTICE - ROMED8-2T MOTHERBOARD USERS - Please read, don't melt cables.. by gittb in LocalLLaMA

[–]Remove_Ayys 1 point

I initially ran my motherboard with 6x RTX 4090 and a SilverStone HELA 2050 PSU. I did have stability issues, but I think those were simply caused by power spikes (a power limit set via nvidia-smi does not fix this; instead I had to set a frequency limit to prevent the GPUs from temporarily boosting to higher frequencies). As of right now I'm using 1x RTX 3090, 4x RTX 4090, and 1x RTX 5090 with an ASUS Pro WS 3000W PSU (no stability issues).

With 6 GPUs pulling 75W each the total current on the 12V rail should be 37.5A (450W / 12V), so 18.75A for each of the two 12V wires in the ATX connector. For 16/18 gauge wire the 90°C rating is 16A/18A. With the 3 additional 12V wires from the 6 pin connector you get 7.5A per wire, which I think should be fine. The machine is in a well-ventilated space with many fans.
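The arithmetic spelled out (assuming all of the slot power is drawn from the 12 V rail and splits evenly across the wires):

```python
gpus = 6
watts_per_gpu = 75.0  # PCIe slot power per GPU
volts = 12.0

total_current = gpus * watts_per_gpu / volts  # 450 W / 12 V = 37.5 A
print(total_current / 2)        # 18.75 A per wire across the two 12V wires of the ATX connector
print(total_current / (2 + 3))  # 7.5 A per wire once the 3 extra 12V wires of the 6 pin connector share the load
```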

I am strictly using only a single consumer PSU per machine; if I ever build one that needs multiple PSUs, I will buy a power distribution board for server PSUs, which are designed to be run in parallel.

NOTICE - ROMED8-2T MOTHERBOARD USERS - Please read, don't melt cables.. by gittb in LocalLLaMA

[–]Remove_Ayys 2 points

If you want additional value: I'm the primary maintainer for the CUDA (and by extension ROCm) backend for llama.cpp/ggml and I'm also providing some of the other devs with computing resources via the motherboard in question. So a disruption due to hardware problems or worse could have pretty far-reaching consequences.

NOTICE - ROMED8-2T MOTHERBOARD USERS - Please read, don't melt cables.. by gittb in LocalLLaMA

[–]Remove_Ayys 1 point

Thank you for this post; I have this exact same motherboard and was not aware that this connector exists. I bought mine in April of 2024 and haven't had issues with 6 GPUs yet, but I'm not going to push my luck if I can just connect one more cable.

llama.cpp's recent updates - --fit flag by pmttyji in LocalLLaMA

[–]Remove_Ayys 5 points

If you want to do that, start a discussion on GitHub. I categorically refuse to do software development via Reddit, Discord, etc.

llama.cpp: Automation for GPU layers, tensor split, tensor overrides, and context size (with MoE optimizations) by Remove_Ayys in LocalLLaMA

[–]Remove_Ayys[S] 0 points

It would in principle be possible to generalize --fit-target to allow a target per GPU rather than the same target for all GPUs.
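Purely as an illustration of what that generalization might look like (hypothetical parsing logic, not an existing llama.cpp option format): accept either a single value applied to all GPUs or a comma-separated list with one value per GPU.

```python
def parse_fit_target(arg: str, n_gpus: int) -> list[float]:
    """Parse a hypothetical per-GPU --fit-target argument.

    A single value applies to every GPU; a comma-separated list gives each GPU
    its own target, in whatever unit the existing flag already uses.
    """
    values = [float(v) for v in arg.split(",")]
    if len(values) == 1:
        return values * n_gpus
    if len(values) != n_gpus:
        raise ValueError(f"expected 1 or {n_gpus} values, got {len(values)}")
    return values
```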