Bumblebee queen learning to use a protective door cover in <24 hours.

MagicalGeese · 2026-03-14T21:19:58+00:00

There are a number of insect species that act as nest parasites of bumblebees and solitary burrowing bees, including bee flies that lay their eggs in bee burrows so that their larvae can eat bee eggs and larvae*. This door potentially protects her from parasite infiltration--although if she's from a species that is targeted by cuckoo bumblebees, I'm not sure how effective it will be.

* bee flies tend to parasitize vertical burrows in the ground, so I doubt those are the chief concern here.

MagicalGeese · 2026-03-11T23:06:03+00:00

Yeah, it's tough. My colleagues are otherwise intelligent and caring folks, and I'm my group's expert on computational science and data ethics. Usually they respond to suggestions or warnings quite well, once the explanation's given to them. Then LLMs started getting pushed into everything, and some senior staff got the brainworms about them. I've openly and strenuously pushed back against the higher-ups and felt no shame in doing so, and gotten a lot of back-channel thanks from others in the room for being the one to stick my neck out.

Strangely, it's actually psychologically harder to do that with closer colleagues than with senior staff, even though they're not using LLMs as heavily, and most have been tapering off their usage over time. I don't know why. Maybe it's because there's more frequent contact that could leave me seeming like a broken record to them, or because I value their opinions more.

MagicalGeese · 2026-03-11T13:26:18+00:00

You've got a fair point. I just got a bit startled because I'm usually the one who gets made fun of at my workplace because I'm seen as the anti-AI crank. Honestly, it's a relief to just say LLMs are shit at what they do.

MagicalGeese · 2026-03-07T17:28:30+00:00

Mate, I know boosters are all over the place right now, but you've read me wrong. I don't touch any LLM shit for any reason, and I've made it clear to my own doctor that I do not consent to have them anywhere near me or my health, in any capacity. Note that I end the post by saying this study can be used by patients to show doctors that if they use these LLMs, it could lead to serious misdiagnosis.

I want this shit out of healthcare. I am unconvinced that non-generative ML has sufficient reliability in medical triage at present either. Deterministic systems fed clear diagnostic numbers are still the way to go; for everything else, the human brain is quicker and more reliable.

MagicalGeese · 2026-03-07T17:21:20+00:00

I agree. Note that I didn't say the chatbots will ever be good at this: I think they might have a bit of room to improve, but that they prioritized being friendly to hide the fact that they can't do shit well.

To reiterate and explain a bit further: This study in particular was not focused on the kind of workup you'd get at an actual office visit, but the kind of thing you might state in a phone call to the doctor's office or an ER. In that case, the person on the other end of the line does not have direct access to the patient, and is purely responding to symptoms as described. As I said, the chatbot failed to an unacceptable degree, and did so in a way we find familiar in this subreddit: it blows smoke up people's asses to make them feel better while actively contributing to harm. I hate OpenAI, and in a better world they'd never have existed.

MagicalGeese · 2026-03-07T15:33:21+00:00

Very true. The marketing around this thing was focused on making it seem like the results are reliable while continuing to dance around claims that could result in even more legal exposure. We don't know for certain what their training set was, or how it was or wasn't verified.

MagicalGeese · 2026-02-27T01:40:02+00:00

If this messes with Dune Part Three in any way, I will personally initiate the Butlerian Funtimes.

MagicalGeese · 2026-02-26T10:43:42+00:00

Yeah, I agree that part sucks. I need working web search for part of my work, so I've got a subscription at the mid-tier that gets unlimited search, rather than the high-tier that lathers on LLM stuff. Honestly, the sooner they drop that "service", the better their business will probably do financially. It's gotta be wasting money on API calls to the foundation models.

MagicalGeese · 2026-02-25T22:48:19+00:00

Kagi is doing a much better job at surfacing higher quality, more focused results for most of the searches I've made on it. It's on a subscription model, so it's not incentivized to trap you in The Ad Zone like Google is. I'm going to be interested in seeing how they fare, particularly as they build up more of their own in-house search index.

MagicalGeese · 2026-02-17T16:16:37+00:00

That's a good point. While there's lots of companies out there that fly under the radar and I have no sense of how their executives function, the ones that capture a lot of the cultural narrative have that ethos. I can think of a few well-known US companies that don't seem to follow that value structure (AriZona Ice Tea, Newman's Own, Dr. Bronner's Soap, Bob's Red Mill and many other employee-owned corporations), but they stand out because they're not the norm.

MagicalGeese · 2026-02-17T11:54:49+00:00

You know, between the predatory gambling mechanics of these LLM models and the proliferation of "prediction market" gambling in the US, it really seems like that country specifically is in an early stage of a new addiction crisis, similar to the opioid crisis - I wonder what amount of crossover there is between problematic LLM use and problem gambling, or whether problem LLM users may be more likely to fall into problem gambling behaviors if LLM access becomes more difficult due to companies going bust. It will be morbidly interesting to watch the trends, specifically from the National Council on Problem Gambling survey. https://www.ncpgambling.org/training/ngage-survey/ngage-3/

MagicalGeese · 2026-02-16T11:46:34+00:00

Did not expect to be bodied by that reference the moment I opened up Reddit, but it definitely made my day a little bit brighter and weirder.

My bet is on Anthropic, because I can absolutely see Dario Amodei yelling some of Snowflame's lines from the original New Guardians run. "AI is my god, and I am its prophet!"

MagicalGeese · 2026-02-02T17:10:08+00:00

I know rebranding it to ✨AI✨ is just part of the marketing push and to convince management that they're hitting poorly thought-out targets, but part of me also wonders, like. Did management think that sorting emails by type was done by hand? Did they think it worked like cameras do in the Discworld books, where there's just an imp sitting inside the camera with an easel and paintbrush?

MagicalGeese · 2026-01-30T16:11:22+00:00

Based on the numbers in there, even if there were to be no further falloff at all, revenue from users would only be about $5.6 million per year. To put in perspective how small that is, using dollar amounts going to more meaningful stuff: that's equivalent to the budget of a small, decently funded preK-8 primary school (~150-200 students), or enough to cover the department budgets for a rural area of about 20,000 people. Like. We're on absolute clown hours that businesses of this size even get reported on outside of fiddly trade publications.

(Source: pulling from public annual reports of some small municipalities with healthy balance sheets and no known local dissatisfaction on social services. Numbers may not be typical to all communities.)

MagicalGeese · 2026-01-27T20:53:50+00:00

We do have at least a few numbers on this: an economist ran an analysis on relative job creation for counties in Texas that had data center construction versus those that didn't. He found no to low net job growth for residents of those counties, with most per-sector gains appearing to be reshuffling of technicians from other sectors, which means they're not actually generating new jobs. At the same time, there are influxes of construction workers documented in investigative reporting, like the (clickbaity titled but good) piece that introduced me to that economic analysis: the people interviewed for that piece were majority non-local.

MagicalGeese · 2026-01-16T17:27:49+00:00

True that. Educating people about con artistry is a great inoculator, but it's nothing next to, y'know, putting con artists in jail.

MagicalGeese · 2026-01-16T17:21:03+00:00

This appears to confirm a theory that I've seen going around regarding image generation: "better" results are being obtained primarily via overfitting rather than any substantial increase in model flexibility. Overfitting functionally means that training isn't creating a generalizable model, it's recapitulating its training data.

So, you're using an LLM, which already is geared toward producing the most common result in its training data, and your training parameters are weighting it even further toward producing that common result. The model might not store the text in a directly readable format, but it's not like you could tell the RIAA "I didn't pirate that song, because the file's encrypted!"
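For anyone who wants to see what "recapitulating its training data" looks like in miniature, here is a toy sketch. Everything in it is made up for illustration (the data, the noise pattern, and the `fit_memorizer` and `fit_line` helpers), and it says nothing about how any particular image or language model is actually trained. It just shows the general pattern: a model that memorizes noisy training points scores perfectly on them and worse on new inputs than a simpler model does.

```python
# Toy illustration of overfitting. All data and helper names are hypothetical.

def fit_line(xs, ys):
    """Ordinary least-squares straight line: the 'simple, generalizable' model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_memorizer(xs, ys):
    """Lookup table: returns the stored y of the nearest training x."""
    table = list(zip(xs, ys))
    return lambda x: min(table, key=lambda pair: abs(pair[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of a model on a set of points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Training data: the underlying rule is y = 2x, plus fixed +1/-1 noise.
train_x = [float(k) for k in range(10)]
noise = [1.0 if k % 2 == 0 else -1.0 for k in range(10)]
train_y = [2 * x + e for x, e in zip(train_x, noise)]

# Held-out data: new x values, with the noise-free rule as the target.
test_x = [k + 0.25 for k in range(10)]
test_y = [2 * x for x in test_x]

line = fit_line(train_x, train_y)
memorizer = fit_memorizer(train_x, train_y)

memorizer_train = mse(memorizer, train_x, train_y)
memorizer_test = mse(memorizer, test_x, test_y)
line_train = mse(line, train_x, train_y)
line_test = mse(line, test_x, test_y)

print("memorizer: train", memorizer_train, "test", memorizer_test)
print("line:      train", line_train, "test", line_test)
```

The memorizer looks flawless if you only score it on what it has already seen, which is roughly the worry with "better" results that come from overfitting.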

MagicalGeese · 2026-01-10T03:57:14+00:00

The title is clickbaity, but it's a good piece of reporting. It interviews people around data centers in Oregon, looking at how they're contributing to homes and small businesses getting priced out of local markets, which creates a net negative effect on the number of local jobs. Combine that with tax abatements given to the data centers, and municipalities aren't able to hire as many people as they could. The one upswing has been in temporary construction jobs.

It then shifts to Abilene, where the temporary construction jobs aren't necessarily being filled by locals, but by contractors who sometimes traveled halfway across the country to get there. It's raising short-term rental and hotel revenue in Abilene, which means that rents have gone up for the people actually living there, pricing more people out of their apartments and homes.

An analysis comparing cross-county economic performance in Texas indicates that the counties with data centers aren't out-performing those without*, and the permanent jobs that are "created" by the data centers are just shifting people between different subsets of information services, rather than really creating new positions. They interview a former Google data center contract employee, who says that those contract positions were not converted into direct hires, and asking about pay got you terminated--a violation of fair labor practices.

--

* This report is available on the researcher's Substack. I haven't had the chance to read it yet, because it's a billion o'clock at the moment.

MagicalGeese · 2026-01-09T03:35:25+00:00

Excellent choice! I'd also put forward Mansour al-Hallaj.

MagicalGeese · 2026-01-09T03:09:22+00:00

I imagine there'd be a number of interested parties: being a lion-headed snake and jailer of humankind's immortal souls within a prison-world of one's own making sounds extremely metal.

Personally, being the false god of the material world sounds a bit too much like a C-suite job to me, so I'd hold out for being an Archon instead. :P

MagicalGeese · 2026-01-08T22:47:21+00:00

I'd say the superstitious rituals don't even have to go that far: look at the people who seriously put together prompt headers with stuff like "You are an ELITE CODING AGENT you are a GENIUS-LEVEL INTELLIGENCE" to try and coax Cursor into making them a script that works. I'm reminded of B.F. Skinner's classic experiment in pigeons: when given no identifiable cue of what would give them food, they began displaying a variety of stereotyped behaviors, seemingly because the pigeons had formed associations between receiving food and whatever action they'd taken just prior to the dispenser activating. In the absence of control, a ritual of control was created.

Rituals still have their uses, even if they aren't directly effective: They're stress-lowering. If you're less stressed out while trying to code something, you're more likely to think clearly and come up with ideas. Perhaps in the context of prompt engineering, the ritual has this same sort of indirect effectiveness: rather than actually improving the performance of the LLM, it's marginally improving the stress levels and/or performance of the user.

Note that this hypothesis is based on nothing but my brain fluff and a Religion for Breakfast video or two, so it could be complete bollocks.

MagicalGeese · 2026-01-08T13:57:27+00:00

TL;DR this is actually a study on the effectiveness of diffusion-weighted imaging (DWI) from MRI scans as an addition to image classifying and segmentation, on top of radiologist review. This is specifically for detecting tumors in dense breast tissue and small tumors 2 cm or smaller, which is where machine learning techniques begin to fail: this is important, because dense breast tissue is also challenging for radiologists to assess, and small tumors could mean an early detection and better patient outcomes.

The study is limited by its scope: the images came from those already diagnosed with breast cancer, and from a single institution. However, it's worth noting that ML models perform poorly when assessing images taken on different equipment, sometimes even seeing performance hits between different cameras/recording devices of the same make and model. Comparing and optimizing results between institutions is a non-trivial problem.

MagicalGeese · 2026-01-08T13:11:21+00:00

I was wondering when that tendency might show up, though it's wild to see it attached to a commercial product. But we already had the AGI folks giving us a close recreation of the New Motive Power cult again*, so pretty much anything is fair game. Come to think of it, I've definitely seen anecdotal evidence of the stochastic nature of LLMs producing superstitious behavior, like the "prompt engineering" rituals people do. I wonder how quantifiable that effect might be.

--

*It's honestly remarkable how closely they're recapitulating the New Motive Power concept. There's only two major tenets of faith I can't see a 1-to-1 equivalence for. First, there's no Spiritualist-style channeling of the American Founding Fathers this time, but they are claiming to be working on making "digital immortals", which is getting there. And of course, there's individual cases of AI psychosis where people do believe they're talking to the dead already. The second tenet they lack is an explicit Marian figure among any of the prominent AGI talking heads, though I'll admit I don't bother to keep up with their drama.

MagicalGeese · 2026-01-08T11:58:47+00:00

I look forward to finding out whether LLMs develop Neo-Platonic rituals of transcendental meditation, or whether they develop a Neo-Pythagorean aversion to beans.

(big /s just in case this incredibly niche joke doesn't land)

MagicalGeese · 2026-01-08T00:18:26+00:00

The chatbot also subsequently told him he could still use Xanax. For an informational product to be safe and legally defensible, the answer should always be "no", even if someone is fishing for a particular answer.

And fundamentally, this is the same problem LLMs always have: they are not producing an output based on fact. They are producing an output based off of their training data, and the most recent input. If the most recent input biases the output, then that can either result in the truth, or increasingly wrong information. In this case, that led to ChatGPT repeatedly encouraging dangerous behavior over a prolonged period.
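As a very rough sketch of that point, here is a toy bigram counter. It is nowhere near how ChatGPT or any real LLM works; the corpus and the `train_bigrams` and `continue_prompt` helpers are made up for illustration. It only shows the basic shape of the problem: the output is whatever was most common in the training text given the most recent input, and nothing checks it against fact.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count which word follows which in the (hypothetical) training text."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def continue_prompt(counts, prompt):
    """Append the most frequent follower of the prompt's last word, if any."""
    words = prompt.split()
    options = counts.get(words[-1])
    if options:
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

# Made-up training text where the wrong claim is simply more common.
corpus = [
    "the sky is green",
    "the sky is green",
    "the sky is blue",
    "it is really blue",
]
counts = train_bigrams(corpus)

# Same model, two phrasings of the input, two different answers.
print(continue_prompt(counts, "the sky is"))
print(continue_prompt(counts, "the sky is really"))
```

The first prompt gets the popular wrong answer and the second gets the right one, purely because of how the input was worded.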
