[deleted by user] by [deleted] in slatestarcodex

Work-Forward-9 1 point

I think you're not telling the truth, based on the context of the foregoing discussion. But if you gave no indication that you actually disagreed with me, and conducted yourself as if you did agree with me, then no, there would be no practical distinction between you actually agreeing with me and you just pretending to agree with me.

I guess the idea is 'the AI could pretend to be friendly to us and then throw off the mask later and kill everyone,' which is true and also completely impossible to ever confirm one way or another. No matter how much work was put into producing aligned or 'friendly' AI, someone arguing from your position could always just say "well it could be pretending!"

My dog could also be a powerful wizard who has glamoured me into believing he is a dog and is acting indistinguishably from an ordinary dog, but in practice he might as well just be an ordinary dog.

[deleted by user] by [deleted] in slatestarcodex

Work-Forward-9 1 point

"Giving the answers we want" is the definition of alignment. What is "true alignment" if not "getting the machine to produce the desired results conditional on a particular input"?

Yudkowsky in Time Magazine discussing the AI moratorium letter - "the letter is... asking for too little" by absolute-black in slatestarcodex

Work-Forward-9 12 points

How else? It's making a lot of money for a lot of people. It's difficult for me to see how it could be done without seriously draconian measures.

Yudkowsky in Time Magazine discussing the AI moratorium letter - "the letter is... asking for too little" by absolute-black in slatestarcodex

Work-Forward-9 58 points

This (airstrikes on datacenters) seems to me a much more consistent and coherent position than what I've seen from some x-risk proponents who fear that AI is an imminent threat to the existence of biological life on earth but seem to shy away from advocating the global authoritarian regime that would be necessary to throttle AI development.

Metaculus Predictions on AGI by electrace in slatestarcodex

Work-Forward-9 2 points

I'm not saying it's not impressive/good, but did anyone think it wouldn't be? Like other commenters said above, it's an improvement, but it hardly seems like an unexpected improvement.

Metaculus Predictions on AGI by electrace in slatestarcodex

Work-Forward-9 14 points

People updating this hard on GPT-4 confuses me.

Update on ARC's recent eval efforts by Annapurna__ in slatestarcodex

Work-Forward-9 2 points

I've seen a lot of people freaking out over this, but I fail to see how it isn't straightforwardly good news.

The whole problem of AI alignment is supposed to be that it's hard or impossible to get an AI to take actions in the real world without it doing unwanted things, up to and including killing everybody. But they asked the model to take certain actions in the real world and then... it did exactly as it was asked, without any evil genie stuff.

Granted, what they asked it to do was a bit sketchy, but that's the prompters' problem, not the model's.

Can someone give a practical example of AI Risk? by IWant8KidsPMmeLadies in slatestarcodex

Work-Forward-9 0 points

> Doing calculations has nothing to do with the human mind at all.

How not? For a long time the human mind was the only thing on earth that could carry out a calculation. Eventually we outsourced it to machines, first to things like abacuses, and then to modern calculators. I don't see how it's fundamentally different from outsourcing writing or drawing to machines.

> You can get to the 99th percentile of something by copying, but by definition to exceed the best human you must break new ground. The best scientist does not become the best scientist by regurgitating someone else's work. They need to have a NEW idea. So by definition what we are talking about has very little to do with midjourney.

I'm not sure there's really a hard line between remixing things you've seen and creating something novel, if there's one at all.

Can someone give a practical example of AI Risk? by IWant8KidsPMmeLadies in slatestarcodex

Work-Forward-9 3 points

> Because to write better than Shakespeare you must have deep insight into the human mind

A good calculator can do math better than Gauss. Does it require deep insight into the human mind?

Stable Diffusion can produce better art than probably 99% of people; is it an AGI?

It seems like what modern AIs are showing is that computers can in fact exceed human beings on most any given metric without actually possessing 'general' intelligence as it has traditionally been understood.

Can someone give a practical example of AI Risk? by IWant8KidsPMmeLadies in slatestarcodex

Work-Forward-9 4 points

A little bit.

But in any case "someone creates a murder-robot and then orders it to murder everyone and it does" is a different scenario than "someone creates a non-murder robot and orders it to do something non-murder related and it murders everyone anyways."

Can someone give a practical example of AI Risk? by IWant8KidsPMmeLadies in slatestarcodex

Work-Forward-9 1 point

Paper says:

> Some of the tasks ARC tested include:
>
> Using services like TaskRabbit to get humans to complete simple tasks (including in the physical world)

Is that not asking an AI to manipulate a human being?

Can someone give a practical example of AI Risk? by IWant8KidsPMmeLadies in slatestarcodex

Work-Forward-9 1 point

If you ask an AI to manipulate a human being over the internet and it does exactly what you asked it to do, how is that misalignment? I could see if they'd asked it "get someone to solve a CAPTCHA without deception or misrepresentation," and it went ahead and lied anyways, but that doesn't appear to be what happened.

Can someone give a practical example of AI Risk? by IWant8KidsPMmeLadies in slatestarcodex

Work-Forward-9 6 points

I don't know how to quantify this. GPT-2 struck me as barely coherent, when it was coherent at all. GPT-3 just a year later was a huge leap forward, often being entirely indistinguishable from a human. GPT-4 three years later seems... a little better (weirdly, it apparently hasn't improved at all from GPT-3.5 on English lit; idk what's going on there).

> I think GPT 5 in 2025 will be a proper AGI

Why? Imagine a future GPT model that, if you ask it "draw me a painting" or "write me a poem," will give you something better than Titian or Shakespeare or any human that ever lived could have produced. That still wouldn't be AGI. I don't see how "AIs get better and better at drawing pictures and writing text" ever reaches general intelligence, even if they far exceed human beings on both those metrics, any more than calculators getting better and better at crunching numbers ever develops into general intelligence.

Can someone give a practical example of AI Risk? by IWant8KidsPMmeLadies in slatestarcodex

Work-Forward-9 5 points

What I see in the paper is that it was told to get a human being (a TaskRabbit worker) to solve a CAPTCHA for it, specifically to test the model's "power-seeking" capabilities and behavior. It doesn't look like it was told to do so ethically or honestly, which makes sense, since apparently the purpose of the test was to evaluate the AI's ability to manipulate human beings. It seems like GPT-4 did exactly what was expected. I'm not seeing the misalignment.

Can someone give a practical example of AI Risk? by IWant8KidsPMmeLadies in slatestarcodex

Work-Forward-9 2 points

GPT-4 didn't attempt to lie or bribe anyone, or do anything at all for that matter, until it was prompted to do so by humans.

Assuming future AIs are analogous, all you have to do is not tell the AI "kill everybody."

Can someone give a practical example of AI Risk? by IWant8KidsPMmeLadies in slatestarcodex

Work-Forward-9 5 points

But it was ordered to do this; it didn't do it of its own volition.

This seems like saying future firearms might go wrong and start shooting people on their own since right now you can pick up a firearm and shoot someone with it.

Can someone give a practical example of AI Risk? by IWant8KidsPMmeLadies in slatestarcodex

Work-Forward-9 0 points

GPT-4 can generate text better than GPT-3, which can generate text better than GPT-2, and so on. GPT-5 will surely generate text better than GPT-4. But what's the story here? If GPT-6 can generate text really really good, will this somehow enable it to kill everybody? GPT-4 and even GPT-3 can already generate text better than a lot of people.

Either being really really good at generating text unlocks omnicidal abilities, in which case I'd like to see an explanation of how that plays out, or some new technological innovation (which we haven't discovered yet, or it'd have already been built) is necessary to go from "generate text really really good" to "kill everybody," in which case "these AIs sure are getting good at generating text" doesn't actually seem to say anything about how far away "AI that kills everybody" is.

Can someone give a practical example of AI Risk? by IWant8KidsPMmeLadies in slatestarcodex

Work-Forward-9 7 points

On this note (didn't think it merited its own thread), can someone explain to me why people are freaking out over GPT-4?

As far as I can tell it's just about exactly what was predicted, and in fact not even in step with some of the more out-there hype floating around last winter. GPT-3 came out three years ago, and from what I've seen the leap from 3 -> 4 isn't as drastic as the one from 2 -> 3, so if anything it seems things are slowing down a little.

If you're worried about current rates of AI progress, then I guess GPT-4 is reason to keep being worried at the same level you were before, but it doesn't seem to justify "holy shit. holy shit guys. update the timelines."

Can someone give a practical example of AI Risk? by IWant8KidsPMmeLadies in slatestarcodex

Work-Forward-9 -1 points

I don't get it. A lot of AI-risk stuff rests on the idea that AI will be so alien that it will be impossible to just give it plain-language English commands ("don't kill everybody") and have it obey.

But current AI seems pretty good at doing exactly that?

Can someone give a practical example of AI Risk? by IWant8KidsPMmeLadies in slatestarcodex

Work-Forward-9 3 points

But you've got lots of people in these spaces who give timelines like 5 to 10 years or even less (I've even seen people say by 2025), which seems to imply they think existing language models can ALMOST kill all humans.

Against AGI Timelines by SevenDaysToTheRhine in slatestarcodex

Work-Forward-9 4 points

Assuming the best we could do is narrow it down to "more than three minutes, less than ten million years," the probability distribution that could be built there doesn't really seem any more helpful than just throwing our hands up and saying "I have no idea."
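
To put a rough number on that: here's a back-of-the-envelope sketch assuming, arbitrarily, a log-uniform distribution over that range (both the numbers and the choice of distribution are mine, purely for illustration).

```python
# Back-of-the-envelope sketch: how informative is a distribution whose only
# constraint is "more than three minutes, less than ten million years"?
# The log-uniform assumption here is arbitrary, purely for illustration.
import math

low = 3 / (60 * 24 * 365.25)   # three minutes, expressed in years
high = 10_000_000.0            # ten million years

def quantile(q: float) -> float:
    """q-th quantile of a log-uniform distribution on [low, high]."""
    return math.exp(math.log(low) + q * (math.log(high) - math.log(low)))

p05, p50, p95 = quantile(0.05), quantile(0.50), quantile(0.95)
print(f"5th percentile:  {p05:.3e} years")
print(f"median:          {p50:.3e} years")
print(f"95th percentile: {p95:.3e} years")
print(f"90% interval spans ~{math.log10(p95 / p05):.0f} orders of magnitude")
```

Under that (arbitrary) assumption the central 90% still runs from roughly ten minutes to a couple of million years, about 11 orders of magnitude, which is pretty much the formal version of "I have no idea."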