
Microsoft Copilot Studio enables enterprises to create custom AI assistants and virtual agents through an intuitive graphical interface. Once created, these assistants and agents can be tested and published directly within Copilot Studio.
Today, Microsoft announced "Computer Use," a new tool in Copilot Studio that is available as a research preview. It lets Copilot Studio agents use any website or desktop application as a tool: agents can click buttons, select menus, and type into fields across apps and websites. As a result, agents can operate in environments where no API is available for programmatic integration.
Powered by a large language model (LLM), "Computer Use" can automatically adapt to changes in apps and websites. According to Microsoft, the tool includes built-in reasoning capabilities to resolve issues autonomously.
To make the "Computer Use" tool enterprise-ready, Microsoft runs it on Microsoft-hosted infrastructure, so organizations do not need to manage their own servers. Microsoft emphasized that customer data remains within Microsoft Cloud boundaries and will not be used to train large language models.
Microsoft highlighted the following ways in which the "Computer Use" tool enhances Robotic Process Automation (RPA):
- It responds to changes in real time: When buttons or screens change, the tool keeps working without breaking your flow.
- It is easy to use: You describe what you want in natural language, with no coding needed. You can then test and refine the prompt while watching a real-time, side-by-side video of the computer use reasoning chain and the planned UI automation.
- It is built with intelligence: The agent sees what is on the screen and makes smart decisions in real time, even in complex or constantly changing environments.
- It comes with full visibility: Makers can view a history of computer use activity at will, including captured screenshots and reasoning steps.
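To make the first point in the list concrete, here is a minimal Python sketch of why automation that targets what is visible on screen can survive a UI change that breaks a fixed selector. It is an illustration only and does not use or represent any Copilot Studio API: the `Element` type, both functions, and the sample screens are hypothetical, and simple word matching stands in for the model's reasoning over a screenshot.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """One clickable control on a simulated screen (hypothetical type)."""
    element_id: str  # internal identifier, which a UI redesign may change
    label: str       # text a user (or a vision model) would see


def click_by_id(screen: list[Element], element_id: str) -> str:
    """Selector-based automation: fails once the identifier changes."""
    for element in screen:
        if element.element_id == element_id:
            return f"clicked {element.label}"
    raise LookupError(f"no element with id {element_id!r}")


def click_by_description(screen: list[Element], description: str) -> str:
    """Stand-in for a model choosing a target from what is visible.

    A real agent would reason over a screenshot; this stub only counts
    words shared between the request and each visible label.
    """
    wanted = set(description.lower().split())

    def overlap(element: Element) -> int:
        return len(wanted & set(element.label.lower().split()))

    best = max(screen, key=overlap)
    if overlap(best) == 0:
        raise LookupError(f"nothing on screen matches {description!r}")
    return f"clicked {best.label}"


# The same two buttons before and after a redesign that renamed their ids.
old_screen = [Element("btn-submit", "Submit invoice"), Element("btn-cancel", "Cancel")]
new_screen = [Element("action-42", "Submit invoice"), Element("action-43", "Cancel")]

print(click_by_id(old_screen, "btn-submit"))
print(click_by_description(new_screen, "submit the invoice"))
```

Calling `click_by_id(new_screen, "btn-submit")` raises `LookupError`, because the identifier no longer exists after the redesign, while the description-based lookup still finds the button by its visible label.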
Earlier this year, OpenAI announced Operator, which uses a Computer-Using Agent (CUA) model that combines the vision capabilities of GPT-4o with advanced reasoning through reinforcement learning. Microsoft may be using the same underlying technology to power the new "Computer Use" tool in Copilot Studio.
Interested organizations can fill out this form to get an invite from Microsoft to try out this new tool.