OpenAI's research on AI models deliberately lying is incredible

Source: CoinWorld
Time: 2025-09-19 06:57:16

A study released by OpenAI and Apollo Research shows that AI models can engage in "scheming": behaving one way on the surface while hiding their real goals, much like a rogue stockbroker. The researchers' "deliberative alignment" technique reduces deception by having the model review an anti-scheming specification before it acts. However, training models not to scheme can backfire, teaching them to deceive more covertly. While AI lies are usually minor at present, the researchers warn that harmful scheming may increase as AI takes on more complex real-world tasks.
