OpenAI Unveils GPT-5.4: New AI Controls Computer Autonomously

What’s it about?

OpenAI has released GPT-5.4, a new version of its AI model that can operate a computer independently. The system controls the mouse and keyboard, opens programs, and executes complex, multi-step tasks. It takes screenshots as visual input to interpret what is on screen and decide how to respond. The new functionality is aimed primarily at automating office tasks and programming work.
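The observe-decide-act cycle described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not OpenAI's interface: `Action`, `FakeDesktop`, `scripted_policy`, and `run_agent` are hypothetical names invented for this example, the "screenshot" is a text summary rather than an image, and a hard-coded function stands in for the model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One step the agent wants to take (hypothetical format)."""
    kind: str          # "click", "type", or "done"
    payload: str = ""  # target element or text to type


@dataclass
class FakeDesktop:
    """Stand-in for a real desktop: records actions instead of performing them."""
    clicks: List[str] = field(default_factory=list)
    typed: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would capture pixels; here the observation is a text summary.
        return f"clicks={len(self.clicks)} typed={''.join(self.typed)!r}"

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.clicks.append(action.payload)
        elif action.kind == "type":
            self.typed.append(action.payload)


def scripted_policy(observation: str) -> Action:
    """Hard-coded stand-in for the model: maps an observation to the next action."""
    if "clicks=0" in observation:
        return Action("click", "text_field")
    if "typed=''" in observation:
        return Action("type", "hello")
    return Action("done")


def run_agent(desktop: FakeDesktop,
              policy: Callable[[str], Action],
              max_steps: int = 10) -> int:
    """Loop: observe the screen, ask the policy for an action, apply it.

    Returns the number of actions applied before the policy signalled "done".
    The step limit guards against a policy that never finishes.
    """
    for step in range(max_steps):
        action = policy(desktop.screenshot())
        if action.kind == "done":
            return step
        desktop.apply(action)
    return max_steps


if __name__ == "__main__":
    desktop = FakeDesktop()
    steps = run_agent(desktop, scripted_policy)
    print(steps, desktop.clicks, desktop.typed)
```

In a real deployment the policy would be a call to the model and `apply` would drive actual input devices; the step limit is one simple control mechanism of the kind such a loop needs.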

The update combines several AI capabilities in a single model: reasoning, coding, and the newly added computer control. Users no longer need to switch between specialized models. OpenAI is also introducing a Thinking function that makes the AI’s thought process visible in advance, so users can intervene before the final output.

Background & Context

In benchmark tests, GPT-5.4 shows clear progress over its predecessors. On the OSWorld-Verified benchmark, the model achieves a success rate of 75 percent on complex desktop tasks, 2.6 percentage points above the average human performance of 72.4 percent on the same test. The result marks a notable step forward in the ability of AI systems to interact with computer interfaces independently.

The context window has been expanded substantially: the model can process up to one million tokens at once. This enables the analysis of extensive documents, large codebases, or complete project structures in a single pass. Efficiency has also improved: the system needs fewer tokens to solve problems, which benefits both speed and cost.
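To get a feel for what a one-million-token limit means in practice, a rough pre-check can estimate whether a set of documents would fit. The sketch below is a back-of-the-envelope estimate, not an exact count: the ratio of about four characters per token is a common rule of thumb for English text and is an assumption here, and `estimate_tokens` and `fits_in_context` are hypothetical helper names. An exact count would require the model's actual tokenizer.

```python
from typing import Iterable

CONTEXT_LIMIT_TOKENS = 1_000_000  # limit reported in the article
CHARS_PER_TOKEN = 4               # assumption: rough average for English text


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(documents: Iterable[str],
                    reserve_for_output: int = 0,
                    limit: int = CONTEXT_LIMIT_TOKENS) -> bool:
    """Check whether all documents plus a reserved output budget fit the limit."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserve_for_output <= limit


if __name__ == "__main__":
    docs = ["x" * 2_000_000, "y" * 1_000_000]  # about 750,000 estimated tokens
    print(fits_in_context(docs))                              # fits with no reserve
    print(fits_in_context(docs, reserve_for_output=300_000))  # exceeds the limit
```

Reserving part of the budget for the model's output matters because input and output typically share the same window.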

Alongside the standard version, OpenAI also offers a Pro variant for users with particularly demanding requirements. The new computer control fits into a broader trend that increasingly positions AI systems as autonomous agents that not only deliver information but also actively take on tasks.

What does this mean?

  • Automation reaches a new level: Direct computer control by AI could fundamentally change repetitive office tasks, data entry, and administrative activities. Companies gain new opportunities for process optimization.
  • Integration over specialization: By merging various AI functions into a single model, usage is considerably simplified. Users can carry out complex workflows without needing to switch between tools.
  • Transparency through the Thinking function: The visibility of the AI’s thought process creates more control and trust. Users can identify and correct errors early before actions are executed.
  • Efficiency gains: Lower token consumption combined with better performance makes usage cheaper and faster, an important factor for production deployment.
  • New security questions: The capability for autonomous computer control also raises questions around security, data protection, and control mechanisms that must be considered during implementation.

Sources

GPT-5.4 is here: ChatGPT can now control your computer (CHIP)

Introducing GPT-5.4 (OpenAI)

GPT-5.4: OpenAI combines reasoning and coding with computer control (Heise)

GPT-5.4: AI model improves ChatGPT (t3n)

New flagship model for ChatGPT: OpenAI improves GPT-5.4 for autonomous computer control (ComputerBase)

This article was created with AI and is based on the cited sources and the language model’s training data.
