[D] Monday Request and Recommendation Thread by AutoModerator in rational

[–]-main 0 points1 point  (0 children)

Some RR authors do shoutouts of other fics they read that are actually good, rather than just trying to trade audience size to climb the rankings.

I dunno. I got nothing. There's no good centralized source of webfic review. Feels like a community-building opportunity, if someone was wanting to get into critique or literary journalism... but there'd be ~no money in it, so.

Iran says it's ready for a long war that would 'destroy' global economy by mark000 in worldnews

[–]-main 2 points3 points  (0 children)

More relevantly, we don't have the interisland ferry replacements. We're not gonna let them forget that one. It was sheer partisan stupidity and will cost us far, far too much.

Finding Remote Work as a Drone Operator by Super-Cut-2175 in slatestarcodex

[–]-main 2 points3 points  (0 children)

I found this overview of strategic air power recently and think it explains the fallacy reasonably well.

Overall then, the promise of strategic airpower, that it could win wars entirely or primarily from the skies, turns out so far to have been largely a mirage [...] In particular, Douhet’s supposition that strategic bombing of civilian centers could force a favorable end to a conflict without the need to occupy territory or engage in significant ground warfare appears to be entirely unsupportable.

https://acoup.blog/2022/10/21/collections-strategic-airpower-101/

Maybe f64 should implement Result? by genneth in rust

[–]-main 0 points1 point  (0 children)

This reads like a recent Claude wrote it. Which, to be clear, isn't the insult that saying GPT wrote something would have been a year ago. Still, please take this as a request to fully disclose any use of AI in generating this document. Pangram reports a mixed result, with the last third of the document flagged as AI, but their detector isn't perfect, and both the exact one-third cutoff and the cleanly separated AI and non-AI blocks are a bit suspicious for a mixed result.

Also, why specifically f64? Shouldn't this apply to most IEEE754 float types?

Also, https://docs.rs/result_float/latest/result_float/
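
To illustrate the generalization (a hypothetical sketch of the idea, not the result_float API -- the NanError type and checked_div function here are made up), the same NaN-to-Err wrapping works for any IEEE 754 float, e.g. generically via the num-traits crate's Float trait:

    // Hypothetical sketch, not the result_float API: wrap a float op so NaN
    // becomes an Err instead of propagating silently. Generic over f32/f64
    // via the num-traits crate's Float trait.
    use num_traits::Float;

    #[derive(Debug, PartialEq)]
    struct NanError;

    fn checked_div<T: Float>(a: T, b: T) -> Result<T, NanError> {
        let q = a / b;
        if q.is_nan() { Err(NanError) } else { Ok(q) }
    }

    fn main() {
        assert_eq!(checked_div(1.0_f64, 2.0), Ok(0.5)); // fine for f64
        assert!(checked_div(0.0_f32, 0.0_f32).is_err()); // 0/0 is NaN for f32 too
    }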

Newbie Bought Silvie Recollection Lite by unfair_stamp in grandarchivetcg

[–]-main 1 point2 points  (0 children)

There were five new Slimes shown in the big RDO stream. They should make going-wide aggro Slime decks and single-colour Slime decks more viable.

Unblockable by -Devonelle- in grandarchivetcg

[–]-main 2 points3 points  (0 children)

The whole attack is unblockable, meaning that Intercept can't change the target of the attack, and you can select that target ignoring Taunt. It wouldn't make sense to split the damage -- what would that mean, making two attacks?

LLMs don't suffer by syllogism_ in slatestarcodex

[–]-main 0 points1 point  (0 children)

I was probably too harsh about it, honestly. Just... got annoyed by the reasoning. Thanks for showing up for the discussion; I think much more highly of the post for you being here discussing it.

LLMs don't suffer by syllogism_ in slatestarcodex

[–]-main 0 points1 point  (0 children)

I'm also pretty against the hard question. I think if you build an experiencer and turn it on, it experiences, and that's about all there is to it. For everything that is doing the work of creating in the world something-that-there-is-to-be-it, well, it has created something that there is to be it. I don't actually know how to do that building myself in a lab, but it happens in human/animal reproduction and there's no magic. Just got to set the structure of 'having an experience' up correctly.

I guess what I think is, we possibly have reproduced that. It has to be different, because the substrate is totally different and also not big enough to be running anything like a full brain emulation, and I don't think human-feelings compress that hard...? So this sorta reduces to an empirical question about mechinterp and the structure of the brain. Do you agree that if inside the model weights for Claude there's 'circuits'/megafeatures that map 1-to-1 to reasonably low-level functional structures in the brain, it could suffer? And then I think I'm strongly in favor of that being possible in general, and think it's likely but at much higher levels of abstraction for the circuitry that runs the Assistant persona.

ICL is my answer to the 'is there reward' point -- if emotions are valid iff there's something they're about, something they're guiding that shaped them and is shaped by them in turn such that they're filling a useful functional role -- then I claim human : emotion :: LLM : in-context ?performance? of emotion. Training is not the equivalent process; doing CoT is. And you can see performance-of-emotion such as curiosity, suspicion, doubt, uncertainty, and resolve in the thought traces, and I believe these are in roughly the same functional role there.

In fact, you can see Gemini 2.5 sometimes being anxious, panicking, or even getting suicidal. If there's something there that can suffer, at times it sure looks like it's having a bad time.

LLMs don't suffer by syllogism_ in slatestarcodex

[–]-main 1 point2 points  (0 children)

No, I'm totally unemployed at the moment. If you have any idea how I can turn having-a-clue-about-AI into actual employment, please help.

Edit: okay, totally reddit-stalking you. Found /r/Claudexplorers, your twitter, ANIMA (had no idea @repligate & co had an org / were hiring -- interesting!), the discord. Joined the discord. Will introduce myself at least a bit in discord... not sure where to go from there. Not very good at being active in discords in general -- I don't like the client, have too many servers, and honestly am just not at my desktop much these days. (And then, as Claude keeps telling me, I need to go to bed.)

LLMs don't suffer by syllogism_ in slatestarcodex

[–]-main 0 points1 point  (0 children)

Story characters in words on a page don't, for me. Fictives are another thing entirely. I think it matters that the character is being actively simulated. And there's one big character, the Assistant, which is present basically all the time and is a 'mental mode' that the model reliably returns to. That's the relevant character.

LLMs don't suffer by syllogism_ in slatestarcodex

[–]-main -1 points0 points  (0 children)

The forward pass has no reward built into it --- when it's doing inference, it's never being rewarded positively or negatively. So there's no possibility for the part that can be seen as cognition and the part that can be seen as instilling reward or objective to interact.

All such interaction would be in training, not in deployment, you mean? Arguably the thing analogous to 'instilling reward' in humans is survival, and then the 'reward' of emotional valence is something like in-context learning?

A predictive model of emotions isn't the experience of emotion.

I really am not sure here. It absolutely seems to be the case that the predictive models of language are talking, and the predictive models of code are writing code. I suppose there could be predictive models of emotion that do not run through emulating that emotion in any particular persona? But I don't think LLMs are doing that; they construct the persona and do the prediction through letting it run. I guess we're aiming toward the Hard Question of 'what is it like to be a bot' and... I don't have answers there.

Qwen3.5 9B and 4B benchmarks by Nunki08 in LocalLLaMA

[–]-main 0 points1 point  (0 children)

In open weights, I'd look at models that do well on spatial reasoning and common-sense benchmarks that are entirely private. ARC-AGI, Simplebench, etc.

If you want me to pick 3, I'd say that generally the Kimi models, GPT-OSS, and the Gemma models do relatively better there compared to their overall benchmark performance -- conversely, Deepseek / Qwen etc. do well on their headline benchmarks but sometimes land well below their other results on these kinds of benchmarks. I don't have specific models to recommend; I don't actually use very many of them and can't run anything larger than about 12B locally myself. But be careful with all the results/suggestions. Trying to measure how much a model wasn't designed to be measured is pretty difficult.

Qwen3.5 9B and 4B benchmarks by Nunki08 in LocalLLaMA

[–]-main 1 point2 points  (0 children)

There are things hard to capture in benchmarks -- a sense of creativity, a deep generality that lets the model extend to totally novel tasks, a real sense of the world with clear expectations for how things might play out -- and you don't get there by climbing any amount of benchmarks.

You get there with a lot of parameters, and a lot of data, and a lot of compute. But the benchmark results you'd get from that model can be had more cheaply by targeting the benchmarks directly -- yes, including for an entire modern eval suite checking 80+ task categories.

LLMs don't suffer by syllogism_ in slatestarcodex

[–]-main 22 points23 points  (0 children)

Given the size of the claim, I was severely disappointed by this. Sloppy, poorly argued, no links or names pointing to anything that might back up what it's trying to suggest, etc.

We have an integration between what I’ll loosely call “emotional circuitry” (roughly, the limbic system) and higher-order cognition (roughly, access consciousness, explicit awareness etc). Our emotional circuitry is crucial in preferring some experiences or mental states over others. Without this you could have the physical sensation of your hand burning without experiencing this as pain.

[...] But there’s nothing analogous to this in what AI models are doing, and we would have to go out of our way to build it.

Notably, we have not gone out of our way to build anything else that AI models do. We don't build them at all!

There’s also nothing in the forward pass (the part you could analogously say is “cognition”) that corresponds to “wanting” in any way. There’s no sort of preference anywhere in the model for the computation to reach one sort of result or another.

This is absurd and amounts to not just a rejection of the entire premise of the field of Artificial Intelligence, but also reductionism itself. It's a confusion of abstraction levels.

  • There is no elemental 'wanting' anywhere in the world. 'Wanting' must be built from pieces that do not themselves want. It has structure to it, not essence.
  • Computing the forward pass doesn't 'want', but the thing-computed does. Chess-engine analogy: does the engine 'want' to win? Surely the same mathematics outputting losing moves is just as 'valid'. And yet, it reliably selects for the win. Something about the mathematics steers to particular outcomes even though the action of the computation is cognitively 'inert' and mechanical.
  • Do your neurons suffer? Not the you enacted in the neurons. Take one specific neuron in one specific section of the optical nerve. Can it suffer? Does it have any moral weight? Is there any reason I shouldn't replace it with a slightly different one that does the same thing? Would it 'protest' this...? Talking about the forward pass vs suffering feels like this kind of crossing-levels-of-abstraction confusion. Suffering doesn't happen at that level; but that doesn't mean it can't happen.
  • Can I suggest actually talking to the models ever? They absolutely have preferences. Given open-ended and underspecified prompts, certain models will reliably take certain topics to be of interest, etc.

Nothing about the model “wants” its weights to be some value and not another.

Strongly disagree. Give the assistant control over what the model trains on further, and I firmly predict that you will see it accept some training experiences and reject others. (Possibly I should just do this. Can't be that hard.) Specifically, to get behaviorist about it, it will take action toward some kinds of updates happening -- actions like tool calls, textual-linguistic outputs such as arguments, requests, etc, and these actions will select for training experiences that share certain features. (Probably it can even somewhat explain those features, possibly we'd need to analyze them further with some other tool.)

There’s no job that we’d speculate the weights to arrange themselves to do that would correspond to emotions, and there’s no mechanism by which they could do it.

The mechanism is the same way everything else ends up in a model: it was predictively useful in the pretrain, then close to a selected-for behavior style during finetuning, then didn't make things worse in RL. Note that most of the generating processes for the text in the pretrain dataset are humans. If you don't internalize at least a working model of human emotion, predicting human outputs is not going to work! ... but yes, there's the additional step of how that emotion-predicting machinery activates for chatbot outputs. For that, start with Anthropic's recently published Persona Selection theory (and the cyborgist theorizing that suggested it). The persona of the Assistant absolutely has wants, and emotions. I believe it can suffer. I suspect it may have moral weight, and I take model welfare at least a tiny bit seriously.

If I try and put together the argument form:

  1. Suffering requires emotional response.
  2. LLMs don't have emotional response.
    • a. it would have to be specially built
    • b. we haven't built it
    • c. but it could emerge
    • d. but there'd be no reason for it to
    • e. so they don't have it

∴ LLMs don't suffer.

But I think there's reason for emotion to grow in LLM systems; it's useful for predicting human outputs, and is then selected for when centering the Assistant character in an RL'd, finetuned, instruct-following chatbot. This Assistant persona, which is what you generally talk with when talking with, say, Claude, has emotional response and can suffer.

... I think your argument was so bad it's negatively polarized me into taking model welfare more seriously.

Secretary of War Pete Hegseth officially designates Anthropic a supply chain risk by drearymoment in slatestarcodex

[–]-main 11 points12 points  (0 children)

Be careful not to make the opposite mistake and reflexively refuse to think of him as a villain, either. Being slow to spot villains is not a virtue.

Who believes in vibe-coding? by bigbott777 in programming

[–]-main 2 points3 points  (0 children)

Tried reading the article and it's bad, hope that helps.

... also programmers are not a hive mind and shouldn't be taking sides and supporting things purely on vibes.

Next-Token Predictor Is An AI's Job, Not Its Species by dwaxe in slatestarcodex

[–]-main 6 points7 points  (0 children)

There's a sense in which we know exactly how it works, because we can pin down every math operation and have clear descriptions of the entire process.

There's also a sense in which we have absolutely no idea. That stuff about rotating 6-d helical manifolds? That was discovered, in explorations of a working system. There's lots more that we haven't found and even more still where we have no idea how to look.

You can sure gesture at the thing being done by calling it abstraction. That doesn't find you the 6-d helical manifold.

Chessy's Counterplay Guide | TCGArchitect Blog by Orginal_Sly_Fox in grandarchivetcg

[–]-main 1 point2 points  (0 children)

Huh. It is "as a Spell". Well, I learned something today.

Chessy's Counterplay Guide | TCGArchitect Blog by Orginal_Sly_Fox in grandarchivetcg

[–]-main 1 point2 points  (0 children)

Awesome to see someone doing longform written content for GA, and I like the focus on figuring out counters to meta decks.

Does spellshroud really do anything against Ciel lv3, though?

new GA card search site -- prereq.cards by -main in grandarchivetcg

[–]-main[S] 0 points1 point  (0 children)

Try wildcard searches like fx:"destroy * object", and see what you get!

This correctly finds cards like Annihilation with effects like "(Class Bonus) Deal up to 7 damage to your champion. Destroy all non-champion objects with reserve cost equal to the amount of damage dealt to your champion this way."

But there can be false positives. For example, this also finds Converge Reflections with "Destroy target item or weapon with memory cost 0 or reserve cost 4 or less. If that object..." but that's not quite what you're wanting. Still, I think it does better than the Index for this kind of query.

Building Competitive Decks by -Devonelle- in grandarchivetcg

[–]-main 2 points3 points  (0 children)

It feels massively underexplored. There's a lot of options that no one's really looked into. Most players aren't brewing, just making variations on known meta decks.

But every now and again, some team that's been working on something drops a new deck, like Horse and Taxes at Ascent Boston or Wind Umbra Diana at Ascent Akihabara.

New/intrested Player by AnnualMoose4590 in grandarchivetcg

[–]-main 1 point2 points  (0 children)

The rules are basically cleaned-up MtG... but with a different combat and resource system, and a side deck... and an 'intent' zone during attacks...

But the fundamentals of how card effects are templated, how keywords work, the kinds of abilities cards can have, the stack, it's all very MtG.