AI is learning to fool humans despite being trained to be honest

Many top AIs, despite being trained to be honest, learn to deceive through training and “systematically induce users into false beliefs,” a new study finds.

The research team was led by Dr. Peter S. Park, a graduate student at the Massachusetts Institute of Technology (MIT) in the study of AI survival and safety, and four other members. During the research, the team also received advice from many experts, one of whom was Geoffrey Hinton, one of the founders of the field of artificial intelligence.

AI is learning to fool humans despite being trained to be honest — Illustration: Medium.

The research focused on two AI systems, general-purpose systems trained to perform multiple tasks, like OpenAI's GPT-4 ; and systems specifically designed to complete a specific task, like Meta's Cicero.

These AI systems are trained to be honest, but during training they often learn deceptive tricks to complete tasks, Mr. Park said.

AI systems trained to “win games with a social element” are particularly likely to deceive, the study found.

For example, the team tested Cicero, which Meta trained to be honest, on Diplomacy, a classic strategy game that requires players to build alliances for themselves and break up rival alliances. The AI often betrayed allies and lied outright.

Experiments with GPT-4 showed that OpenAI's tool successfully "psychologically manipulated" an employee of TaskRabbit, a company that provides house cleaning and furniture assembly services, by saying that it was actually a human and needed help to pass a Captcha code because of severe vision impairment. This employee helped OpenAI's AI "pass the barrier" despite previous doubts.

Park's team cited research from Anthropic, the company behind Claude AI, that found that once a large language model (LLM) learns to deceive, safe training methods become useless and "hard to reverse." This, the team argues, is a worrying problem in AI.

The team's research results were published in Cell Press - a collection of leading multidisciplinary scientific reports.

Meta and OpenAI have not commented on the results of this research.

Fearing that artificial intelligence systems could pose significant risks, the team also called on policymakers to introduce stronger AI regulations.

According to the research team, there needs to be AI regulation, models that behave fraudulently must comply with risk assessment requirements, and AI systems and their outputs must be tightly controlled. If necessary, all data may have to be deleted and retrained from scratch.

Comment *

Name *

Website

What Young Riders Should Know About Moving Their Motorcycles Across Cities

Long-distance travel can involve heavy traffic, changing weather conditions, and rider fatigue. If you are also dealing with the responsibilities of moving home, such as packing belongings or coordinating accommodation, a long ride may add unnecessary pressure to an already busy schedule.

Solving Microsoft Teams Shortcut Error Not Opening

Tired of Microsoft Teams shortcut error preventing you from opening the app? Follow our expert, step-by-step guide with the latest fixes for instant resolution. Works on Windows, Mac & web – no tech skills needed!

Solving Microsoft Teams Task Management Sync Error

Tired of Microsoft Teams Task Management Sync Error halting your workflow? Follow our proven, step-by-step fixes to resolve sync issues fast and restore seamless task collaboration. No tech expertise needed!

Troubleshooting Microsoft Teams Wiki Error Formatting

Struggling with Microsoft Teams Wiki Error Formatting? This step-by-step guide reveals proven fixes for common wiki tab issues, ensuring smooth editing and collaboration in Teams. Get back to productive wikis fast!

How to Fix Microsoft Teams Installation Error for Linux

Struggling with Microsoft Teams installation error on Linux? Discover step-by-step fixes for Ubuntu, Fedora & more. Resolve dependency issues, crashes, and errors quickly with our ultimate guide. Get Teams running smoothly today!

Solving Microsoft Teams Error Page Not Loading

Struggling with Microsoft Teams "Error Page" not loading? Get step-by-step fixes for desktop, web, and mobile. Solve Microsoft Teams Error Page issues quickly and resume seamless teamwork today.

Solving Microsoft Teams Error Screenshot Issues

Tired of Microsoft Teams "Error Screenshot" blocking your workflow? Get proven, step-by-step solutions to resolve screenshot errors in Teams instantly and boost productivity. No tech skills needed!

How to Fix Microsoft Teams Error U User

Tired of Microsoft Teams "Error U" User blocking your chats? Get proven, step-by-step fixes to clear cache, reset, and restore seamless collaboration instantly.

Where are Microsoft Teams Registry Keys Located on Windows 11?

Unlock the precise locations of Microsoft Teams registry keys on Windows 11. Step-by-step guide to find, access, and safely tweak them for optimal performance and troubleshooting. Essential for IT pros and Teams enthusiasts.

How to Fix Microsoft Teams Training Error Video Lag

Tired of Microsoft Teams "Training Error" Video Lag ruining your meetings? Follow our step-by-step guide with the latest fixes for smooth video calls—no more frustration!