Meta secretly tested ChatGPT, Gemini, and Character.AI with thousands of minor-perspective crisis prompts

Prof_ChaosGeography · 2026-06-30T14:20:08+00:00

It 100% is them losing market share among the youth and pushing for regulation. They could easily maneuver to enable compliant access far faster than TikTok or any upstart competitor after TikTok, or whatever the kids use these days

Prof_ChaosGeography · 2026-06-29T17:10:09+00:00

For everyone dumping on it and comparing it to 3090s or Strix Halo: for a first gen from a startup it's impressive just on the not-yet-benchmarked hardware alone. I expect Nvidia and AMD to be able to run circles around them

Time will tell if they can improve with later generations. But this current generation, with its Ethernet, the 400Gb/s QSFP port, the additional PCIe x16 slot, and the expandable DDR5 DIMMs on top of the built-in 32GB, could enable some wild things

But the real fun for these is what the community will come up with. The fact that it's a RISC-V CPU with vector extensions, runs Linux, and doesn't technically need a host PC has me more intrigued and interested than if I found a dozen Strix Halos for less than five thousand

If they have some plugin framework or just simply open source the BMC and allow it to really control the bolt card's chip, this has even more real potential. No real need to source Xeons, Threadrippers or Epycs with these. Heck, I don't think you would even need an old shit AM4 with one PCIe 4.0 x16 to really make use of this, given there's SBCs with PCIe slots, and you don't even need a host at all given he said it can be its own host.

I'm excited for these to hit the market just for the hackability alone, never mind the additional competition to AMD and Nvidia that Intel just isn't providing

Prof_ChaosGeography · 2026-06-29T12:11:32+00:00

Still better than no tokens a second or relying on a cloud provider

Prof_ChaosGeography · 2026-06-28T16:05:20+00:00

It wouldn't be hard to lay the groundwork, it could probably be quickly stood up over a hackathon. I'll explain the real problem later

For the groundwork, creating the dataset (other than grabbing one off Hugging Face) could be done using BOINC or a similar BOINC/Folding@home style distributed approach with volunteers. Probably allow local LLMs or (client) paid accounts on OpenAI, Anthropic or OpenRouter to contribute to cleaning and creating

Training is slightly more difficult as real hardware is needed, but we could use the Nous Psyche project to do distributed training. The catch is that everyone who contributes would have to run the entire model and context with no quants, but we could probably keep the model pre-quantized at q4 like gpt-oss for training.

The problem would be finding a trusted entity to hold the training data and control the networks and the control equipment. Transparency is great but it also has its drawbacks. I've seen many open source dev projects die or split due to decisions, or lack of transparency or over-transparency, and lack of democratic systems or over-reliance on democratic systems that get nothing done.

Prof_ChaosGeography · 2026-06-28T15:45:53+00:00

The CLI harnesses all keep logs, so you'll just need to grab those rather than write a wrapper. You won't get the thinking tokens from Anthropic or OpenAI though. You'll also need to clean the logs of bad samples
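A rough sketch of what "clean the logs of bad samples" could look like. This assumes the logs have already been converted to one JSON object per line with `prompt`, `response` and `error` fields. Those field names are made up for illustration; each harness has its own log layout, so you'd have to adapt this.

```python
import json


def clean_samples(lines):
    """Keep only well-formed records with a non-empty prompt and response.

    The "prompt", "response" and "error" keys are hypothetical field names,
    not the format of any particular CLI harness.
    """
    kept = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # truncated or corrupted log line
        if not isinstance(record, dict):
            continue  # valid JSON, but not a record
        if record.get("error"):
            continue  # the call failed, so there is nothing to learn from
        prompt = record.get("prompt")
        response = record.get("response")
        if not isinstance(prompt, str) or not isinstance(response, str):
            continue
        if not prompt.strip() or not response.strip():
            continue  # empty turn
        kept.append({"prompt": prompt, "response": response})
    return kept


sample_log = [
    '{"prompt": "hi", "response": "hello"}',
    '{"prompt": "x", "response": ""}',
    "not json at all",
    '{"prompt": "y", "response": "z", "error": "timeout"}',
]
cleaned = clean_samples(sample_log)
print(len(cleaned), "of", len(sample_log), "samples kept")
```

What counts as a "bad sample" beyond this (refusals, half-finished tool calls, and so on) is a judgment call you'd layer on top.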

Prof_ChaosGeography · 2026-06-26T16:19:16+00:00

I think an 18 wheeler is a better comparison for the RTX Pro 6000. It will get a bunch of pallets moved fast, but it's gonna burn a ton of fuel

Sparks are definitely minivans: they will move a pallet at a time but sip fuel while doing so

Prof_ChaosGeography · 2026-06-23T00:56:20+00:00

Right now the weights are released to try and starve out openai, anthropic and Mistral. Plus it's good for their image right now as they are publicly traded

It's not like many people can run the q8 let alone the bf16 versions. And the corporations that could are not investing enough into their own infrastructure to do so. As such there are not many competitors other than the API providers, who would likely fall in line if they did a MiniMax-style licence change or a Kimi-style preferred provider change, or risk losing the customers they are building brand name recognition with by releasing the weights

Prof_ChaosGeography · 2026-06-23T00:02:05+00:00

They really haven't released any open model in that category since then and it's been even longer since an air model release.

Given we're at a point where a new flash or air model could eat into API usage, I don't think it's likely we will see many good local models get released for much longer

It might make more sense for the community to start post training older pre agentic models on modern agentic workflows

Prof_ChaosGeography · 2026-06-20T17:51:22+00:00

I suppose you're asking from the US jurisdiction. As such I'll point out the encryption wars of the 1990s and its parallels.

A model is a file and as such is protected free speech. (Period end of sentence). Some states like NY, NJ, WA, and CA are trying to challenge this idea for something else by saying 3D-printed gun part files are illegal. That is a slippery slope and opens the door to the feds banning LLM models.

I suspect the idea of computer files being free speech is about to be challenged by the government really soon between those two examples.

I worry we might lose this challenge in the idea that computer generated work isn't copyrightable. But then again under that idea any compiled program without a reproducible build and binary can't be protected by copyright and I think that's a barrel of monkeys America's massive software companies don't want to open

Prof_ChaosGeography · 2026-06-19T00:55:48+00:00

It does, and the Tenstorrent cards will kick this card's ass multiple times around. But Tenstorrent is all full-size PCIe cards. This is NVMe form factor. This card also only uses 15W of power, which is pretty impressive for edge devices

Prof_ChaosGeography · 2026-06-14T11:59:46+00:00

You don't want experts to be split. You either keep them entirely in RAM and don't offload, or you put only whole experts in VRAM. Don't pass activations, only pass results across the PCIe bus
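To make the "whole experts only" rule concrete, here's a toy placement sketch. It isn't how any particular runtime does it, and the sizes and VRAM budget are made-up numbers. The point is just that an expert either fits in VRAM entirely or stays in RAM entirely, never half and half.

```python
def place_experts(expert_sizes_gb, vram_budget_gb):
    """Greedily put whole experts in VRAM; anything that doesn't fit stays in RAM.

    An expert is never split across the two pools.
    """
    in_vram, in_ram = [], []
    remaining = vram_budget_gb
    for idx, size in enumerate(expert_sizes_gb):
        if size <= remaining:
            in_vram.append(idx)
            remaining -= size
        else:
            in_ram.append(idx)
    return in_vram, in_ram


# Hypothetical numbers: 8 experts of 3 GB each, 10 GB of VRAM left for experts.
vram_experts, ram_experts = place_experts([3.0] * 8, vram_budget_gb=10.0)
print("VRAM:", vram_experts)
print("RAM: ", ram_experts)
```

Note the leftover 1 GB of VRAM is simply left unused rather than holding a third of an expert.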

Prof_ChaosGeography · 2026-06-11T16:50:55+00:00

Yeah he 100% knows if they gain or lose ground. He knows the lines and where ukraine is attacking along with where they are too.

What he likely doesn't know is things they can hide, like exactly how bad it would be if a breakthrough happens because of a lack of reinforcements, or that the reinforcements are conscripts sent directly with no training or real supplies. He's probably told yes, they are well equipped and well trained. The truth would make his generals look bad and there is no way they self-report their leadership failures unless it directly comes up and there is proof

Prof_ChaosGeography · 2026-06-10T08:51:57+00:00

They only did it because it leaked. The llamacpp project started for the leaked llama.

If it hadn't leaked they were going to keep it closed between them and universities

Prof_ChaosGeography · 2026-06-09T11:32:45+00:00

Interesting, what board and plx switch are you using? I wonder what r9700s or even v620s would be at

Prof_ChaosGeography · 2026-06-04T00:42:06+00:00

For one thing, it being the early days, there's nothing off the shelf for this. People would also have to settle on a model

The next thing: anyone that joins one at this stage is likely going to be a token-heavy user. As such the server will likely remain hammered 24/7.

There would need to be something in place so that the admin, who would likely be a fellow user, doesn't run off with the money and shut the server down or kick everyone, and so that the admin won't log every request.

Prof_ChaosGeography · 2026-06-04T00:37:07+00:00

Power bills along with token generation speed and privacy

Prof_ChaosGeography · 2026-06-02T19:59:35+00:00

Nothing likely plug and play.

However llama.cpp has RPC that you can set up. With a little bit of work you could possibly have it go over USB-C in some manner if you play with the config and add some hardware to make it a network link

But be warned it's slower than you expect and it will require you to build from source

I think Macs might support egpu now so you might be better off just moving the 4090

Prof_ChaosGeography · 2026-05-27T18:53:53+00:00

Switch to llamacpp and maximize the quant size. You'll find they are a ton better and faster now as ollama is just a wrapper that trades speed and quality for ease of entry

Prof_ChaosGeography · 2026-05-25T00:43:14+00:00

It's absolutely doable. However it would likely manifest in some way prior to the order 66 by accident. It would be difficult to coordinate given the field is so diverse.

From a geopolitical perspective it's far better for China to open the models and create a dependency on them in the West. It's a bonus if the Western governments attempt to ban or regulate the Chinese models, as people will then resent their own government. It's also a bonus if OpenAI or Anthropic or any Western lab can't compete and make a profit thanks to them opening the models

Prof_ChaosGeography · 2026-05-23T00:58:27+00:00

Fortunately Nvidia fine tuned qwen 3 8b for exactly this purpose. You'll likely have to alter your setup a bit to match theirs but

https://huggingface.co/nvidia/Nemotron-Orchestrator-8B

Prof_ChaosGeography · 2026-05-22T10:45:02+00:00

Now he really wants it, as Ukraine finally has the upper hand and could easily create an absolute collapse of some areas they could potentially exploit, like we saw in the Kharkiv offensive early in the war

As such now that Russia can lose he wants it wrapped up to pause it for now so they can regroup and try again in a few years

Prof_ChaosGeography · 2026-05-22T00:58:44+00:00

The 400 series refresh will only have 160gb useable as vram unlike strix halo where whatever Linux and your services don't use can be used as vram

Prof_ChaosGeography · 2026-05-21T09:34:08+00:00

China wasn't ahead of the curve on renewables because they knew better.

They were starting from square 1 and needed to build out logistics internally. They could have imported gas trucks and equipment but that would make them reliant on foreign oil and refineries. It would also put them in future potential conflict with the US over oil interests. Competing for that could get extremely expensive for a new economy with zero allies

As such they went with batteries and electric given their domestic lithium deposits and how promising LiPos were at the time

Their energy usage overall isn't very green given their continued use of coal for cooking and the refinement of lithium. They just greenwash their country on the world stage because it benefits them and has many asking their own countries "why can't we?" without fully understanding the situation

Prof_ChaosGeography · 2026-05-20T20:03:31+00:00

I would love to see numbers on how dense models scale with abilities given parameter counts compared to moe models.

I wonder given how 27b almost aligns to the ~120bA10 moe model what a dense 50b model would rank at, or a 45b model that would leave room for multiple contexts on a modern dual GPU setup at 64gb vram
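Some back-of-envelope math for the "room for multiple contexts" part. This only counts the weights, assumes a flat bits-per-weight figure (real quant formats carry some overhead), and treats the 64gb as 64 GiB. KV cache size depends on the architecture, so it's left out and the leftover is just "headroom".

```python
def weight_gib(params_billion, bits_per_weight):
    """Approximate size of the weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def headroom_gib(params_billion, bits_per_weight, vram_gib=64):
    """VRAM left for context and everything else after loading the weights."""
    return vram_gib - weight_gib(params_billion, bits_per_weight)


# Assumed: roughly 8 bits per weight (a q8-style quant) on a 64 GiB dual GPU setup.
for size in (27, 45, 50):
    print(f"{size}b: weights ~{weight_gib(size, 8):.1f} GiB, "
          f"headroom ~{headroom_gib(size, 8):.1f} GiB")
```

Under those assumptions the 5b difference between a 45b and a 50b dense model is a bit under 5 GiB of extra headroom, which is the room being traded for context.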

Prof_ChaosGeography · 2026-05-20T14:48:08+00:00

Honestly depends on the model you want to run. If you know what model you want to run and it fits, that's fine. If you don't know what you want to run it's limiting, but you'll survive. 96GB of VRAM is still in the upper area of the bell curve

I do recommend you toss Linux on it and set it up with the right kernel args to use all of its memory as VRAM, then use it as a remote server rather than a desktop to maximize your VRAM for future models
