open-codex-computer-use
open-codex-computer-use is an open-source “Computer Use” MCP server that lets AI agents run desktop GUI actions on macOS, Linux, and Windows.
What is open-codex-computer-use?
Open Computer Use (open-codex-computer-use) is an open-source “Computer Use” service wrapped as an MCP (Model Context Protocol) server. It lets an AI agent or any MCP client run GUI computer actions on macOS, Linux, and Windows.
The project was inspired by OpenAI’s Codex Computer Use. It implements non-intrusive “CUA” (computer use automation) behavior on top of system Accessibility APIs, then exposes that capability through MCP so different agent clients can drive it.
Key Features
- MCP server wrapper for computer actions: Provides an MCP endpoint so MCP clients can request GUI actions.
- Cross-platform computer use (macOS, Linux, Windows): Designed to run computer automation across desktop operating systems.
- Accessibility-based automation: Uses Accessibility as the underlying mechanism for non-intrusive CUA behavior.
- CLI-style “tool calling” interface: Supports commands to list apps, query app state (e.g., by app name), and perform actions like key presses.
- Onboarding and permission checks: Includes a doctor command to check permissions and show onboarding behavior when required access is missing.
How to Use open-codex-computer-use
- Install it on your machine and make it available to your agent/client.
  - Install into Codex by writing to ~/.codex/config.toml and running:
    open-computer-use install-codex-mcp
  - Or add it to your MCP client manually using an MCP JSON config:
    { "mcpServers": { "open-computer-use": { "command": "open-computer-use", "args": ["mcp"] } } }
- Grant required permissions.
  - On macOS, you must run it once and grant Accessibility and Screen Recording.
  - On Windows and Linux, the page states these extra steps are not needed.
- Use it via MCP tool calls.
  - Example: list apps
    open-computer-use call list_apps
  - Example: get app state for TextEdit
    open-computer-use call get_app_state --args '{"app":"TextEdit"}'
  - Example: run multiple steps in one process (reusing element_index state), with sleep between successful operations:
    open-computer-use call --calls '[{"tool":"get_app_state","args":{"app":"TextEdit"}},{"tool":"press_key","args":{"app":"TextEdit","key":"Return"}}]'
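The batched example above can also be assembled programmatically instead of hand-writing the JSON string. The following is a minimal Python sketch, not part of the project: the tool names and argument keys come from the examples above, while the helper names `build_batch_command` and `run_batch` are hypothetical, and it assumes the `open-computer-use` binary is installed and on PATH.

```python
import json
import subprocess


def build_batch_command(steps):
    """Return the argv for one `open-computer-use call --calls` invocation.

    `steps` is a list of (tool, args) pairs, in the order they should run.
    """
    calls = [{"tool": tool, "args": args} for tool, args in steps]
    # Compact separators keep the payload on one line, like the CLI example.
    payload = json.dumps(calls, separators=(",", ":"))
    return ["open-computer-use", "call", "--calls", payload]


def run_batch(steps):
    """Run the batch. Assumes `open-computer-use` is installed and on PATH."""
    return subprocess.run(build_batch_command(steps), capture_output=True, text=True)


# Same two steps as the TextEdit example: read state, then press Return.
steps = [
    ("get_app_state", {"app": "TextEdit"}),
    ("press_key", {"app": "TextEdit", "key": "Return"}),
]
command = build_batch_command(steps)
print(command)
```

Passing the command as an argument list rather than a shell string avoids having to quote the JSON payload for the shell.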
Use Cases
- Driving a local desktop app from an AI agent (MCP client workflow): An agent uses MCP tool calls to inspect application state and trigger GUI actions on macOS/Linux/Windows.
- Reproducing “Codex-style” computer use behavior across clients: The repository notes that “open-computer-use” is used as Computer Use in Codex App and Codex CLI, matching the official experience.
- Validating and troubleshooting permissions: Use open-computer-use doctor to check whether required access is missing and to understand onboarding prompts.
- Batching a short GUI interaction sequence: Run a multi-step action sequence in one process so intermediate state (like element_index) can be reused between steps.
- Platform-specific testing: The repository includes demos showing Computer Use on Linux and integration with Gemini CLI via MCP.
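For the permission-troubleshooting use case, a script can run the doctor command before attempting any GUI actions. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes that `open-computer-use doctor` exits with a non-zero status when required access is missing, which the source does not state and should be verified locally.

```python
import shutil
import subprocess


def doctor_command():
    """Return the argv for the permission check named in the text."""
    return ["open-computer-use", "doctor"]


def preflight(run=subprocess.run, which=shutil.which):
    """Return (ok, message) describing whether setup looks usable.

    ASSUMPTION: `doctor` signals missing access through a non-zero exit
    code. The source does not document its exit codes.
    """
    if which("open-computer-use") is None:
        return False, "open-computer-use not found on PATH"
    result = run(doctor_command(), capture_output=True, text=True)
    message = (result.stdout or result.stderr).strip()
    return result.returncode == 0, message
```

The `run` and `which` parameters exist only so the check can be exercised without the real binary; in normal use, call `preflight()` with no arguments.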
FAQ
- What does “wrapped as MCP” mean here? The project exposes its computer use capability through an MCP server interface, so an MCP client can call tools to perform GUI actions.
- Do I need to grant permissions? The page states that on macOS you need to run it once and grant Accessibility and Screen Recording; Windows and Linux do not need this step.
- How do I connect it to my agent? You can install it into a specific client (e.g., Codex) using provided install commands, or configure it manually via an MCP JSON config under mcpServers.
- Can I call individual tools or run sequences? Yes. The page shows examples for single tool calls (like list_apps and get_app_state) and multi-step sequences via open-computer-use call --calls or --calls-file.
- Is there a built-in way to check setup health? Yes. The repository includes open-computer-use doctor for permission checking.
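For the manual route under mcpServers, the entry can be merged into an existing client config without disturbing other servers. This is a small Python sketch under stated assumptions: the server entry is copied from the MCP JSON config shown earlier, while the `add_server` helper and the `other-server` entry are hypothetical, and where a given client stores its config file is not covered here.

```python
import json


def add_server(config, name="open-computer-use"):
    """Return a copy of `config` with the server registered under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    # Entry copied from the manual MCP JSON config in the text.
    servers[name] = {"command": "open-computer-use", "args": ["mcp"]}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config that already lists another MCP server.
existing = {"mcpServers": {"other-server": {"command": "other", "args": []}}}
merged = add_server(existing)
print(json.dumps(merged, indent=2))
```

The function returns a new dictionary rather than editing the input, so the original config can be kept as a backup before writing the result to disk.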
Alternatives
- open-browser-use (browser-focused alternative): The repository points to “open-browser-use” if you’re interested in browser use rather than desktop GUI automation.
- Other MCP server integrations for computer/browser automation: If you already standardize on MCP, look for alternative MCP servers that expose GUI automation tools, and compare them by the OSes and automation backends they support.
- In-process automation libraries (non-MCP): Instead of MCP, some setups call desktop automation APIs/libraries directly within a single app/agent runtime; this requires tighter integration with the agent's code rather than going through a separate MCP server.
Alternatives
Codex Plugins
Use Codex Plugins to bundle skills, app integrations, and MCP servers into reusable workflows—extending Codex access to tools like Gmail, Drive, and Slack.
AakarDev AI
AakarDev AI is a powerful platform that simplifies the development of AI applications with seamless vector database integration, enabling rapid deployment and scalability.
Arduino VENTUNO Q
Arduino VENTUNO Q is an edge AI computer for robotics, combining AI inference hardware and a microcontroller for deterministic control. Arduino App Lab-ready.
Devin
Devin is an AI coding agent that helps software teams complete code migrations and large refactoring by running subtasks in parallel.
Ably Chat
Ably Chat is a chat API and SDKs for building custom realtime chat apps, with reactions, presence, and message edit/delete.
Whirr
Whirr is a quiet macOS menu bar app that mirrors Claude Code agent activity to your Mac’s notch—so you can glance without watching the screen.