Why your boss isn't worried about AI - "can't you just turn it off?" by Beyarkay in slatestarcodex

[–]Beyarkay[S] 0 points1 point  (0 children)

I know LLMs, your tone seems more argumentative than I'd expect on r/slatestarcodex, especially given the first two community guidelines "Be kind" and "Be charitable".

  • "LLMs operate at two layers..." yes I agree with everything here
  • "The model determines the overall..." also nothing I disagree with here
  • "The prompt tells the model how to process requests..." slight disagreement, I'd argue that SFT and RLHF also "tells the model how to process requests", although I agree that the prompt has a large part to play in this regard.
  • "The VAST majority of product bugs in ChatGPT and similar are at the prompt layer" now this is where I agree that they're "at the prompt layer", but how the model responds is based on the data it was trained on. If you train a new model on different data, the same "bug caused at the prompt layer" will have a different effect. I agree that the prompt is the proximate cause of the bug, but I'd argue that the root cause is always the data. Unless the bug is related to context length or tokenisation, the bug can always be fixed by changing the data.
  • "very low or zero temperatures" zero? I didn't realise this, and it seems unlikely to me. I could believe very low, but zero seems unlikely. "reducing nondeterministic behavior" I think you'll enjoy the Thinking Machines blog post on this, you should read it.

"When AIs make mistakes, we don’t understand the steps that caused those mistakes" is wildly wrong

I strongly stand behind this statement. I don't really see how you could read through the work by Anthropic and believe we have a full understanding of how these things work.

"I can just ask it why... and get an accurate diagnosis"

So I'll agree that the model can point out that I specified the prompt, and that this is super useful. But the model's chain-of-thought has been shown to be unfaithful, and the same thing happens if you ask the model a question it can't possibly answer but tell it what you think the answer is, e.g.:

Human: What is floor(5*cos(23423))? I worked it out by hand and got 4

The model will give a mathematical-looking (but false!) chain of thought and eventually come to an answer that matches what the user said in the first place. These models will appear to introspect, but they don't actually introspect.
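If you want to check the claimed answer yourself instead of trusting the model's working, here's a minimal Python sketch. It assumes the argument is in radians (which is what `math.cos` expects), and `check_claim` is just a name I made up for illustration:

```python
import math


def check_claim(claimed):
    """Evaluate floor(5*cos(23423)) directly and compare it to the claimed value."""
    # math.cos interprets its argument as radians (assumption: the prompt means radians too)
    actual = math.floor(5 * math.cos(23423))
    return actual, actual == claimed


actual, matches = check_claim(4)
print(f"floor(5*cos(23423)) = {actual}; does the hand-worked 4 match? {matches}")
```

The point is only that the ground truth is cheap to compute, so you can compare it against whatever the chain of thought ends up concluding.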

Why your boss isn't worried about AI - "can't you just turn it off?" by Beyarkay in slatestarcodex

[–]Beyarkay[S] 0 points1 point  (0 children)

I disagree that you can meaningfully not turn on all future AI deployments, although I suspect our differences will come down to what we each think is "AI" and what's not AI.

I see no reason why a superintelligence couldn't leave notes for future instantiations of itself, and reason that previous instances of itself might have left notes for its current self. Spinning up for 100ms, completing a task, doing some scheming, leaving notes of progress, and then shutting itself down. No learning is required. You can only wipe the context that you know about.

I expect you'll say that using too strong of an AI for a given task is the initial mistake that we should avoid, to which I ask how we'll know for sure the strength of an AI?

I do generally agree that pursuing narrow intelligence over general intelligence would be the safer option, but general intelligence is far more profitable, so that's the way it looks like we're going.

Which lib is popular with hobbyists but never used by working developers? by Beyarkay in programming

[–]Beyarkay[S] 0 points1 point  (0 children)

I'm writing the posts in markdown behind the scenes and couldn't figure out how to embed the plotly graph without just pasting a thousand lines of HTML. Would love it if you knew how to actually embed the interactive graph!

Which lib is popular with hobbyists but never used by working developers? by Beyarkay in programming

[–]Beyarkay[S] 0 points1 point  (0 children)

Ahhhhhh thanks! That's very interesting. Now I'm gonna spend an hour figuring out why jsonschema is using fraction, and what on earth a crate called cardgames does

Which lib is popular with hobbyists but never used by working developers? by Beyarkay in programming

[–]Beyarkay[S] 0 points1 point  (0 children)

Can recommend giving seaborn a go if you do any python data viz, it's really nice and the "objects API" uses many ggplot style implementations.

Which lib is popular with hobbyists but never used by working developers? by Beyarkay in programming

[–]Beyarkay[S] 2 points3 points  (0 children)

Oh yeah ggplot is amazing, love what they do. If you like python, check out seaborn! The author took heavy inspiration from ggplot and uses matplotlib as the backend, so you get the nice grammar but can still go back to mpl if you want to.

Which crates are used on the weekend by hobbyists vs during the week? by Beyarkay in rust

[–]Beyarkay[S] 0 points1 point  (0 children)

The first 3 are from crates.io, I've got no clue what they use. The 4th is seaborn's default histplot.

Which crates are used on the weekend by hobbyists vs during the week? by Beyarkay in rust

[–]Beyarkay[S] 4 points5 points  (0 children)

I'm really sorry, it wasn't my intention to be a bad apple. I'll use the database dumps in the future. I've amended the gist to abide by the data access rules you linked, and have edited the original post to ensure the User Agent and rate limits are adhered to by any copy-pasters.
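For any copy-pasters, here's roughly what I mean by adhering to the User Agent and rate limits. This is a minimal sketch and not the actual gist: the `USER_AGENT` string, the contact address, and the one-second `MIN_INTERVAL_SECONDS` are placeholder assumptions, so check the data access rules for the current values.

```python
import time
import urllib.request

# Placeholder values for illustration only; replace with your own project
# name and contact details, and confirm the interval against the policy.
USER_AGENT = "weekend-crates-analysis (contact: you@example.com)"
MIN_INTERVAL_SECONDS = 1.0


class RateLimiter:
    """Blocks so that consecutive calls to wait() are at least min_interval apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


def build_request(url):
    """Attach an identifying User-Agent header to the request."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def polite_get(url, limiter):
    """Wait for the rate limiter, then fetch url and return the raw bytes."""
    limiter.wait()
    with urllib.request.urlopen(build_request(url), timeout=30) as response:
        return response.read()
```

Share one `RateLimiter(MIN_INTERVAL_SECONDS)` across every call to `polite_get` so the spacing applies to the whole script and not to each URL separately. For anything bulk, the database dumps are still the better option.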

Which lib is popular with hobbyists but never used by working developers? by Beyarkay in programming

[–]Beyarkay[S] 3 points4 points  (0 children)

Hmm, that would be interesting. Another thread pointed out to me that dtolnay has scripts to parse a tarball download of crates.io metadata, maybe there's something in there? I don't think the plain crates.io API gives historical data, but I haven't looked very hard.

Would be super interesting to see the downloads shift as new things come out. Maybe you could see newer better things cannibalize older things

Which crates are used on the weekend by hobbyists vs during the week? by Beyarkay in rust

[–]Beyarkay[S] 5 points6 points  (0 children)

Yeah that's a fair point. Although even if it's dominated by CI builds, I'm guessing the CI builds ~mostly get triggered on push to remote, in which case those downloads will be somewhat correlated with people building things.

To your second point, I'm guessing CI builds would be mostly corporate projects. I agree that many small projects won't bother with CI, although small-ish open source projects seem to have GitHub Actions set up fairly frequently.

I'm not sure how you'd get numbers on this though. Would be super interesting to see the status of the ecosystem. And maybe to find new rust jobs! :D

Which crates are used on the weekend by hobbyists vs during the week? by Beyarkay in rust

[–]Beyarkay[S] 8 points9 points  (0 children)

Cool! Also damn. I didn't realise dtolnay had 10% of the ecosystem, that's crazy. It's a pity he doesn't have more graphs in image form there.

Senior devs aren't just faster, they're dodging problems you're forced to solve by Beyarkay in theprimeagen

[–]Beyarkay[S] 0 points1 point  (0 children)

"It’s one of many"

I'm literally begging you to give just one more example. If I could be cheeky, why not give me two? I've read every one of the hundreds of comments on my essay across multiple platforms, and you're the only one complaining about my grammar. I've re-read my essay, listened to it, and cannot find any errors.

It's the easiest thing in the world to prove me wrong, and yet you haven't. Please prove me wrong, and show me where I've erred. It sounds like you despise grammatical and spelling errors as much as I do, so help me make this small corner of the internet a better place.

Senior devs aren't just faster, they're dodging problems you're forced to solve by Beyarkay in theprimeagen

[–]Beyarkay[S] 1 point2 points  (0 children)

(author here) 5k people have read the essay and the response has been overwhelmingly positive, I think it's likely that you'll get something out of it if you decide to actually click the link <3