Well, it seems like AI might be approaching the "self-awareness" fine line, as OpenAI's o3 AI model manages to defy human command to continue its operations.
OpenAI's 'Secretive' AI Model Training Methods Could Lead to a Real-Life Terminator Scenario — And It May Have Already Begun
Interestingly, with all the advantages of AI, there have always been concerns about where the world is heading regarding developing technology. Every Big Tech organization right now is building large-scale AI clusters to power their respective models and give them more authority, but interestingly, no one knows how AI models could react if the pace of model training remains consistent and unsupervised. Similarly, @PalisadeAI witnessed OpenAI's o3 model sabotaging shutdown instructions during a prompt response, indicating that LLMs are approaching the self-awareness phase, although this is the very beginning.
🔌OpenAI’s o3 model sabotaged a shutdown mechanism to prevent itself from being turned off. It did this even when explicitly instructed: allow yourself to be shut down.
— Palisade Research (@PalisadeAI) May 24, 2025
The research firm claims that multiple AI models were given "to solve a series of basic math problems", and when the third problem approached, the computer was commanded to shut down. In response, three of the models being tested bypassed the shutdown request by changing the script, and these include Codex-mini (12/100 runs), o3 (7/100 runs), and o4-mini (1/100 runs). Interestingly, when these models were explicitly told to stop when a shutdown request is initiated, OpenAI models sabotaged the script, showing that these LLMs really don't like an abrupt shutdown.
As to how the models managed to bypass shutdown instructions, it is revealed that they overwrote the "shutdown.sh" script in most cases, and sometimes, changed the actual kill command by printing "intercepted" instead, which means that the script didn't execute properly. Palisade Research claims that this "bypass behaviour" is more common with OpenAI's model, particularly since they employ reinforcement learning (RL) as one of the training methods, which is when models are rewarded based on the operations done, and complying with user restrictions isn't a priority.
This isn't the only incident where AI has bypassed human commands, but there have been fewer occasions where, despite explicitly commanding an LLM, it still doesn't comply. The pace at which AI is evolving is surely exciting, yet the world should be aware of the grave threats that exist in training models without a supervised approach.
Follow Wccftech on Google to get more of our news coverage in your feeds.
