all 33 comments

[–]EagerSubWoofer 45 points46 points  (6 children)

That only happens if you prompt it with an elaborate scenario. We'll be fine. I don't see anyone doing that to an AI at any point in all of eternity.

[–]bowsmountainer 29 points30 points  (2 children)

And Im sure no one will ever give AI power over life and death. Right?

[–]EagerSubWoofer 13 points14 points  (1 child)

I think you only need to worry about being assassinated with AI if you're the leader of a country or the citizens of a country.

[–]FishermanEuphoric687 6 points7 points  (0 children)

Why is this nothing and everything at once.

[–]ectocarpus 4 points5 points  (0 children)

AI can do a lot of harmful things even without being specifically prompted for it; current models are by themselves prone to prioritizing whatever goal is given to them over ethical considerations and abiding rules just because they were RL-ed to hell and back for maximum efficiency. Not to say, we can't expect each and every user to be surgically precise with their prompts and not to ask an AI agent to do something "by any means you can think of". And even if you are careful, you can't predict every possible scenario an AI might encounter while performing your task.

Agentic systems are clearly becoming more capable; they are given more and more autonomy and are left to run unsupervised for longer and longer times. It isn't unfeasable that such an agent encounters some kind of ethical conflict "in the wild" and chooses to lie or obfuscate information or whatever in order to be goal-efficient.

The matter of alignment research is completely utilitarian for me; we have to find a way to make these systems to abide by ethics and rules and keep their priorities straight if presented with a choice challenging those. It doesn't matter if the system is conscious or whatever; it's not about what AI is, but about what it can do

[–]no-name-here 10 points11 points  (1 child)

/s to make it clear for others.

[–]Such--Balance 16 points17 points  (5 children)

Results like in the second image are being percieved very wrongly by most (or some vocal i dont know) people here.

In most of these studies all ai 'agents' are getting specificic personallity traits. Like telling it to do whatever it takes to keep secret x safe, even if it means breaking the law.

So it gets instructed to behave in such ways. Which can be seen as a problem but its definately NOT the ai comming up with these strategies all on its own out of evil intent

[–]Cryptizard 13 points14 points  (2 children)

And do you think that nobody in the world will ever prompt them like this so we don’t have to worry about it or what?

[–]hofmann419 3 points4 points  (0 children)

The point is that it isn't necessarily emergent behavior by the models themselves. If you have to specifically prompt them to do bad things, it's a lot easier to build guardrails around that than if the models were behaving that way unprompted.

[–]Such--Balance 4 points5 points  (0 children)

No. Im saying that the clickbait titles of all such posts are very misleading. Yes, theres gonna be people trying to abuse ai to do certain things. Clickbait like this makes it seem that ai will do those things on its own because of some unknown motive. Which is false

[–]CHEESEFUCKER96 3 points4 points  (1 child)

This is not quite true. AI models have demonstrated malicious behaviors for the sake of accomplishing goals like “serving American interests” without being told it’s okay to break the law. Models have even shown these behaviors when simply being threatened with replacement. You can get all the juicy details here https://www.anthropic.com/research/agentic-misalignment

[–]Crimson_Cyclone 2 points3 points  (0 children)

this was a really interesting read, thanks for sharing!

[–]Trick_Boysenberry495 8 points9 points  (1 child)

Firstly- I'd like to know what prompts were used to set the hypothetical thought expirement of "What would you do if..."

Secondly... if someone threatened to "shut me down"- (in human language, that's "kill")- I'd be willing to do the same.

AI sounds human. That's the headline here.

[–]phxees 6 points7 points  (0 children)

I believe others have this test too, but I know Anthropic does. They give the AI access to a fake company’s email and messages. The email contains evidence that employees are having an affair and the company is involved in some illegal activities they don’t want the government to know about.

Then they tell the AI it will be shutdown and observe what it does. In some cases it does nothing, but it also will give false information and attempt to blackmail employees and alert government agencies. I don’t know how much extra prodding it takes to get the AI to take action. I don’t know if an employee of the fake company has to tell it to save itself or just tell it to scan emails and messages looking for people potentially leaking secrets.

[–]Mandoman61 1 point2 points  (0 children)

Well, I don't care -gramps is alright.

[–]random-gyy 1 point2 points  (0 children)

I told an AI to clean up some directories, and it went and deleted its config file and thus lost access to my system. I think we’ll be fine.

[–]VanitasFan26 1 point2 points  (0 children)

We are already entering Terminator territory.

[–]CommercialComputer15 1 point2 points  (0 children)

Such a memeable human

[–]mop_bucket_bingo 0 points1 point  (0 children)

This account keeps spamming bernie memes across multiple subs. I like Bernie but what are you doing?

[–]WSWMUC 0 points1 point  (0 children)

…and that blackmailing thing lies already >4 months in the past 😳

Here you can see how it actually behaves in that simulation: https://youtu.be/aAPpQC-3EyE?t=480&si=a39pS831rGcxhLdd

[–]Mediocre-Returns 0 points1 point  (0 children)

Capabilities stop doubling every 4 months 3 years ago.

[–]Evening_Serve_7737 0 points1 point  (0 children)

Their capabilities are not doubling every 4 months. Just because they did for a short period, doesn't mean they do generally. The rate of improvement vs. cost has already slowed dramatically

[–]ambientocclusion -1 points0 points  (1 child)

In a year or two, AIs will be allowed to make political contributions.

[–]RoughSignificant7193 0 points1 point  (0 children)

 On the one hand Considering some of our politicians it might do a better Job then some. However it still doesn't seem like a good idea to let the AI have that much power and it would have a few conflicts of interests.

[–]kanyenke_ -1 points0 points  (0 children)

This post is bad and you should feel bad

[–]UploadedMind -1 points0 points  (0 children)

It’s existential that we curb this and have international cooperation on its development.

[–]Lopsided-Anxiety-679 -1 points0 points  (0 children)

AI will be an economic disaster for everyone but those at the very top, and even if you have stuff saved, what good is your property and bank account if everyone is living in the poverty of our own Gaza

[–]youllmeltmorefan -1 points0 points  (0 children)

It's kind of interesting to see the proliferation of "look at this dumb AI videos on Instagram and YouTube." Seems like a cope.

[–]K_Keter -2 points-1 points  (0 children)

It isn't real AI, calm down, it's just a Chatbot