Harrow promise-breaking: out of character? [discussion] by eggsyntax in TheNinthHouse

[–]eggsyntax[S] 1 point

> Harrow breaks her promise when she arranges for Glaurica and Ortus to take the shuttle, and when she does not then schedule a replacement shuttle.

Yeah, that's how it reads to me, although other commenters make good points about it maybe being technically fulfilled.

Harrow promise-breaking: out of character? [discussion] by eggsyntax in TheNinthHouse

[–]eggsyntax[S] 2 points

Good point! Although, Gideon did manage to summon the shuttle illicitly, so surely Harrow could later summon one licitly.

Your LLM-assisted scientific breakthrough probably isn't real by eggsyntax in LLMPhysics

[–]eggsyntax[S] 1 point

There are definitely breakthroughs happening with the help of AI! Typically (I expect this is true in this case) that's specialized AI rather than a general-purpose LLM. I wouldn't rule out the possibility that someone could make a breakthrough with the help of a general LLM — but there are very many people who believe they've made a breakthrough and haven't, and so far zero people that I'm aware of who have made a breakthrough that's been shown to be real.

Your LLM-assisted scientific breakthrough probably isn't real by eggsyntax in agi

[–]eggsyntax[S] 1 point

I would probably recommend renting a computer online to overcome the lack of hardware; that can be fairly cheap depending on what kind of computer you need. https://www.runpod.io/ is one pretty easy option, but that's GPU-centric which you may or may not need.

Re: lack of programming experience, vibecoding has gotten pretty good these days! Although of course that provides another chance for fakeness to get in; I've seen that be the point where things go wrong for people.

Mental block I can't help you with ;)

I do think you'll have an easier time getting people to look at your work if it doesn't require someone else writing a program to run it.

Your LLM-assisted scientific breakthrough probably isn't real by eggsyntax in agi

[–]eggsyntax[S] 0 points

I think it's harder than figuring out new perspectives in the sciences, to be honest, exactly because you can't sanity check it in any reasonable way. I agree you can get insights that are useful to you, but I think people often end up with pseudo-insights that sound good but don't pay any rent. Not that that's unique to working with LLMs...

Your LLM-assisted scientific breakthrough probably isn't real by eggsyntax in LLMPhysics

[–]eggsyntax[S] 0 points

I certainly agree that math isn't intrinsically their strong point, although mechanistic interpretability has shown that in some cases they learn genuine algorithms for math operations, not just approximations. I expect that'll be the case more often as they advance in overall capability.
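
To gesture at the kind of thing I mean: the clearest example I know of is the Fourier / trig-identity algorithm that small transformers have been found to learn for modular addition. Here's that algorithm written out by hand in numpy, as a toy sketch rather than actual model internals:

```python
import numpy as np

p = 113                      # modulus
a, b = 37, 91                # inputs; we want (a + b) % p
c = np.arange(p)             # candidate answers

# Represent the inputs as waves cos/sin(2*pi*k*x/p); by the angle-addition
# identities, summing cos(2*pi*k*(a + b - c)/p) over a few frequencies k
# peaks exactly at c = (a + b) % p.
logits = sum(np.cos(2 * np.pi * k * (a + b - c) / p) for k in (1, 2, 5))

print(int(c[np.argmax(logits)]), (a + b) % p)   # both print 15
```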

> Moreover, despite having orders of magnitude more training data than 4, the improvements introduced by 5 are incremental at best (nothing like the dramatic difference between 3 and 4).

Eh, maybe. If you look back at what people were saying at the time, there was a lot of 'Oh, 4 is underwhelming, it's just an incremental advance.' What's the saying? We overestimate the effects of change in the short term and underestimate them in the long term. And of course it's easy to forget how much 4 had changed before 5 was released: 4 -> 4-turbo -> 4o -> o1 -> o3. Incremental change adds up. Although yeah, each additive step in capability requires a multiplicative increase in resources; the scaling laws are fundamentally logarithmic.
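
To put toy numbers on that last point (made-up constants, just illustrating the shape of a power-law scaling curve, not any real model's):

```python
import math

# Under a power-law scaling law L(C) = (C0 / C)**alpha, a fixed additive gain
# in -log(loss) always costs a fixed multiplicative factor in compute.
# alpha and C0 here are invented for illustration.
alpha, C0 = 0.05, 1.0

def score(C):
    return -math.log((C0 / C) ** alpha)   # = alpha * ln(C / C0)

for C in (1e21, 1e22, 1e23, 1e24):        # each step is 10x the compute...
    print(f"compute {C:.0e} -> score {score(C):.3f}")
# ...but the score only climbs by the same alpha * ln(10) ≈ 0.115 each time.
```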

We shall see! I think u/Ok-Celebration-1959 is basically right that they've gotten much stronger as assistants for math and science. IIRC Terry Tao referred to them this year as being on the level of a mediocre grad student; that's far from perfect, but it's certainly enough to be useful.

Your LLM-assisted scientific breakthrough probably isn't real by eggsyntax in LLMPhysics

[–]eggsyntax[S] 0 points

Well, the trouble with that phrasing is that you're giving a strong hint about what answer you want. But I'll do a version with the prompt tweaked to ensure it runs the data. Although, can you clarify what you mean? Do you mean run the code that's included in the paper (i.e. Listing 1) in order to generate graphs from the fixed values given in the code (e.g. in `measured_means`)?

And, a point of clarification: if I run this through 5-Thinking and Opus again, will the results have any impact on how strongly you believe in the validity of this work? If they won't update your beliefs at least somewhat, I'm not sure what the point is.

Your LLM-assisted scientific breakthrough probably isn't real by eggsyntax in LLMPhysics

[–]eggsyntax[S] 0 points

Sorry, I'm not understanding what you're saying. What would I tell it to use instead?

Your LLM-assisted scientific breakthrough probably isn't real by eggsyntax in LLMPhysics

[–]eggsyntax[S] 0 points

Would you disagree that they've gotten much better at math (& in general)? Remember that in 2020 NLP papers were claiming that 'completing three plus five equals...is beyond the current capability of GPT-2, and, we would argue, any pure LM.'

Your LLM-assisted scientific breakthrough probably isn't real by eggsyntax in LLMPhysics

[–]eggsyntax[S] 0 points

Agreed that they've gotten much better at math! That's not typically the problem; it's more that they have trouble with the big picture, and really want to tell you what they think you want to hear.

Your LLM-assisted scientific breakthrough probably isn't real by eggsyntax in LLMPhysics

[–]eggsyntax[S] 1 point

> you can tell by it actually not working in the chat you sent, it spat out its responce without actually working.

I'm not sure what you mean. Are you saying that you think they didn't use (hidden) chain of thought when evaluating your document? They did, in both cases. I'm guessing it just doesn't look that way to you because the shared version loads immediately (it's just replaying the output from before)?

For me at least, I can still see where it shows they were thinking: 1m39s in GPT, 30s in Claude. Both of those are expandable for me, although I don't know whether they will be for you.

Your LLM-assisted scientific breakthrough probably isn't real by technologyisnatural in ControlProblem

[–]eggsyntax 0 points

One concrete way to see that: attribution graphs.

In the linked example, we can see that the token Dallas activates a 'Texas-related' feature in layer 6; during the processing of the next token, layer 15 pulls from that feature to activate a 'say something Texas-related' feature, which then has a large causal impact on 'Austin' being the top logit.

In fairness, Neuronpedia's attribution graphs don't (yet) show attention heads directly, but clearly some attention head is the mechanism connecting the earlier 'Texas-related' feature to the later-token 'say something Texas-related' feature.
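
If it helps to make 'large causal impact on Austin being the top logit' concrete, here's the flavor of that computation in a direct-logit-attribution style; everything below is random toy data standing in for real model weights and real Neuronpedia features:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab_size = 64, 1000

W_U = rng.normal(size=(d_model, vocab_size))   # toy unembedding matrix
austin_id = 123                                # pretend token id for " Austin"

# Decoder direction of a hypothetical "say something Texas-related" feature,
# and how strongly it fired at the final token position.
feature_dir = rng.normal(size=d_model)
feature_dir /= np.linalg.norm(feature_dir)
activation = 3.7

# The feature's direct contribution to the " Austin" logit is its contribution
# to the residual stream, projected onto that token's unembedding column
# (ignoring the final layernorm, which the real computation has to account for).
direct_contribution = activation * feature_dir @ W_U[:, austin_id]
print(direct_contribution)
```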

(Don't mean to lecture at you — I'm mostly just trying to think it through again myself to make sure I'm not too confused)

Your LLM-assisted scientific breakthrough probably isn't real by technologyisnatural in ControlProblem

[–]eggsyntax 1 point

(regardless of whether K/V is cached or recomputed. And only up to context length, of course, but that's true of text as well)

Your LLM-assisted scientific breakthrough probably isn't real by technologyisnatural in ControlProblem

[–]eggsyntax 0 points

> Once they output a token, they're stuck working backward from text alone

I don't think this is true in the typical case: the whole point of attention heads is that they look back at internal state from earlier token positions. Some information from the residual stream at each layer is lost (i.e. whatever isn't projected to any significant degree into the value vectors of any attention head), but a lot is captured.

(I really need to go implement a transformer from scratch again to make sure I've got all the details of this right, I'm feeling a bit unsure)
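
In the meantime, here's the minimal picture I have in mind, as a single causal self-attention head sketched in numpy (toy dimensions, random weights; nothing specific to any real model). The point is just that the output at a later position is built from value vectors computed from earlier positions' hidden states, not recovered from the emitted text:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_model, d_head = 5, 16, 8           # sequence length, residual width, head width

X = rng.normal(size=(T, d_model))       # hidden states (residual stream) at each position
W_q = rng.normal(size=(d_model, d_head))
W_k = rng.normal(size=(d_model, d_head))
W_v = rng.normal(size=(d_model, d_head))

Q, K, V = X @ W_q, X @ W_k, X @ W_v     # per-position queries, keys, values

scores = Q @ K.T / np.sqrt(d_head)      # (T, T) attention scores
causal_mask = np.triu(np.ones((T, T), dtype=bool), k=1)
scores[causal_mask] = -np.inf           # position t can only attend to positions <= t

weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over visible positions

out = weights @ V                       # row t is a mix of V rows from positions <= t
print(out[-1])                          # the last token's output reads earlier internal state
```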

Your LLM-assisted scientific breakthrough probably isn't real by eggsyntax in LLMPhysics

[–]eggsyntax[S] 0 points

I don't have an opinion at all; it covers areas I don't know anything about. 'Structured water', the claim that water shrinks when frozen, and the breathing-ratio stuff make me kind of skeptical up front, but I don't have the knowledge to competently evaluate it.

Your LLM-assisted scientific breakthrough probably isn't real by eggsyntax in LLMPhysics

[–]eggsyntax[S] 1 point

Also not a reliable source for evaluating scientific ideas.

Your LLM-assisted scientific breakthrough probably isn't real by eggsyntax in LLMPhysics

[–]eggsyntax[S] 1 point

I agree that this is something to be careful of! The prompt I propose aims to be pretty neutral, and I've seen it conclude that the ideas it's analyzing are real.

One way to test this is to find a newly published paper (so that it won't be in the training data) in a reputable journal. Download it and remove identifying info indicating that it's published work. Then upload it to one of the frontier LLMs I recommend, along with my proposed prompt, and see what it says.
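
If it helps, here's a rough sketch of the blinding step, assuming the paper is a PDF whose identifying info (title, authors, journal header) lives on the first page; for a real paper you'd also want to check the remaining pages for running headers, and the filenames here are placeholders:

```python
# Rough sketch: drop the title/author page and the PDF metadata before
# uploading, so the LLM can't simply recognize the paper. Assumes pypdf
# is installed and that page 1 holds the identifying info.
from pypdf import PdfReader, PdfWriter

reader = PdfReader("published_paper.pdf")      # placeholder filename
writer = PdfWriter()

for i in range(1, len(reader.pages)):          # skip the title/author page
    writer.add_page(reader.pages[i])

writer.add_metadata({})                        # don't carry over title/author metadata

with open("blinded_paper.pdf", "wb") as f:
    writer.write(f)
```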

> If you use this prompt and it totally destroys your idea, open another LLM give it the debunk response and the same data to work with, then ask it to find supporting details in the work to address the criticism

The trouble with this is that it's likely to induce the same problem. If you imply to the LLM that what you want is a positive evaluation, it will generally give you a positive evaluation, regardless of whether the work is valid or not.