OpenAI Codex

OpenAI's agentic coding suite: an open-source Rust CLI with three approval modes and a macOS/Windows desktop app with ambient background coding — the primary competitor to Claude Code.

Two distinct products share the Codex name: the CLI (terminal agent, open-source, self-hosted) and the Codex app (desktop agent with cloud-parallel task execution). Both are powered by codex-1.


The CLI

What It Is

A lightweight terminal coding agent released in April 2025. Originally written in TypeScript, it was rewritten in Rust in June 2025, and 95.6% of the codebase is now Rust. The rewrite significantly reduced startup time and resource usage. Apache 2.0 licensed.

npm install -g @openai/codex
codex

GitHub: openai/codex — 75k+ stars, 14.5M monthly npm downloads, 400+ contributors as of early 2026.

Approval Modes

Three modes define how much the agent can do without stopping for confirmation:

Mode | File edits | Shell commands | When to use
suggest | Requires approval | Requires approval | Default; safest for production machines
auto-edit | Auto-applied | Requires approval | Day-to-day development
full-auto | Auto-applied | Auto-applied | Isolated containers, disposable environments

Switch mid-session with /mode — no restart required. full-auto is the equivalent of Claude Code's bypassPermissions and carries the same risks outside an isolated environment.
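
The mode table above can be read as a small decision matrix. The sketch below is only an illustration of that matrix, not Codex's implementation: the names `Mode`, `Action`, `AUTO_APPROVED`, and `requires_approval` are hypothetical and do not come from the Codex codebase.

```python
from enum import Enum


class Mode(Enum):
    SUGGEST = "suggest"
    AUTO_EDIT = "auto-edit"
    FULL_AUTO = "full-auto"


class Action(Enum):
    FILE_EDIT = "file_edit"
    SHELL_COMMAND = "shell_command"


# Actions each mode applies without asking, transcribed from the mode table.
# (Hypothetical structure for illustration only.)
AUTO_APPROVED = {
    Mode.SUGGEST: set(),
    Mode.AUTO_EDIT: {Action.FILE_EDIT},
    Mode.FULL_AUTO: {Action.FILE_EDIT, Action.SHELL_COMMAND},
}


def requires_approval(mode: Mode, action: Action) -> bool:
    """Return True when the agent should pause and ask the user first."""
    return action not in AUTO_APPROVED[mode]


if __name__ == "__main__":
    # Print the full matrix, one line per (mode, action) pair.
    for mode in Mode:
        for action in Action:
            verdict = "ask" if requires_approval(mode, action) else "auto"
            print(f"{mode.value:10} {action.value:14} {verdict}")
```

Modelling the modes as sets of auto-approved actions makes the ordering visible: each mode's set contains the previous one, so moving from suggest to full-auto only ever removes confirmation prompts.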

Sandboxing

The sandbox is the boundary that lets Codex act autonomously without unrestricted machine access. In the Auto preset, Codex can read files, make edits, and run commands inside the working directory. For Codex cloud tasks, internet access is configurable: either full access or a domain allow-list.
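
To make the two boundaries concrete, here is a minimal sketch of the kind of checks a sandbox of this shape implies: a path check that confines access to the working directory, and a host check against a domain allow-list. It is an assumption-laden illustration, not how Codex enforces its sandbox; `is_inside_workdir` and `is_host_allowed` are hypothetical names.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_inside_workdir(target: str, workdir: str) -> bool:
    """True if target resolves to workdir itself or a path beneath it."""
    root = Path(workdir).resolve()
    # Joining an absolute target replaces root, so "/etc/passwd" stays outside.
    resolved = (root / target).resolve()
    return resolved == root or root in resolved.parents


def is_host_allowed(url: str, allow_list: set[str]) -> bool:
    """True if the URL's host is an allowed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allow_list)


if __name__ == "__main__":
    print(is_inside_workdir("src/main.rs", "/tmp/project"))
    print(is_inside_workdir("../other/file", "/tmp/project"))
    print(is_host_allowed("https://api.github.com/repos", {"github.com"}))
```

Resolving paths before comparing them matters: a plain string-prefix check would accept `../` traversals, and matching hosts on a leading dot avoids treating `evilgithub.com` as a subdomain of `github.com`.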

MCP Support

Codex CLI supports MCP (Model Context Protocol) servers, using the same tool connectivity layer as Claude Code and other modern coding agents.

AGENTS.md

Codex reads AGENTS.md from the repo root for project-level instructions — the direct counterpart to Claude Code's CLAUDE.md. Both files serve the same purpose: tell the agent how the project is structured, what commands to run for tests and lint, and what conventions to follow.

AGENTS.md is now an open standard stewarded by the Agentic AI Foundation under the Linux Foundation, supported by 60+ tools including Cursor, GitHub Copilot, Windsurf, Aider, Gemini CLI, Devin, and JetBrains Junie. Teams often maintain both: AGENTS.md for universal agent rules, CLAUDE.md importing AGENTS.md and adding Claude-specific enhancements.
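
As a rough picture of what "reads AGENTS.md from the repo root" involves, the sketch below walks up from a starting directory to find the repository root and loads the instructions file from there. It assumes the root is the nearest ancestor containing a `.git` entry; how Codex actually locates the file is not stated here, and `find_repo_root` and `load_agent_instructions` are hypothetical helpers.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Walk upward from start until a directory containing .git is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / ".git").exists():
            return candidate
    return None


def load_agent_instructions(
    start: Path, filename: str = "AGENTS.md"
) -> Optional[str]:
    """Return the instructions file at the repo root, or None if absent."""
    root = find_repo_root(start)
    if root is None:
        return None
    path = root / filename
    return path.read_text(encoding="utf-8") if path.is_file() else None


if __name__ == "__main__":
    text = load_agent_instructions(Path.cwd())
    print(text if text is not None else "No AGENTS.md found at the repo root.")
```

The same helper can be pointed at `CLAUDE.md` through the `filename` parameter, which mirrors how teams keeping both files treat them as two views of one set of project rules.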


The Codex App

A desktop application launched February 2026 on macOS; Windows support followed in March 2026.

Ambient Coding

The defining feature: Codex can work in the background while you do other things. Multiple agents run in parallel, each using computer-use capabilities (seeing, clicking, and typing with its own cursor) without interfering with your active session. Agents can operate on ongoing and repeatable work, remember preferences, and learn from previous actions.

Developer Workflow Features

  • PR review: addresses GitHub review comments, reviews open pull requests
  • Task assignment from GitHub issues
  • Multiple terminal tabs
  • Remote devbox connections via SSH (alpha)
  • In-app browser for iterating on frontend designs
  • Parallel agent dispatch from a single command centre UI

Cloud Parallel Execution

Tasks are dispatched to isolated cloud containers. Each container has internet access (configurable), code execution, and git. This allows dispatching many tasks simultaneously and collecting results — a structural differentiator from Claude Code's local-first model.
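
The dispatch-and-collect pattern described above can be sketched locally with a thread pool. This is an analogy under stated assumptions, not the Codex cloud API: `run_task` is a hypothetical stand-in for running one task in its own container, and `dispatch_all` is an illustrative name.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_task(task: str) -> str:
    """Hypothetical stand-in for running one task in an isolated container."""
    return f"done: {task}"


def dispatch_all(tasks: list[str], max_parallel: int = 4) -> dict[str, str]:
    """Submit every task up front, then collect results as they finish."""
    results: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {pool.submit(run_task, t): t for t in tasks}
        for future in as_completed(futures):
            task = futures[future]
            try:
                results[task] = future.result()
            except Exception as exc:
                # One failing task should not sink the rest of the batch.
                results[task] = f"failed: {exc}"
    return results


if __name__ == "__main__":
    for task, outcome in dispatch_all(["fix lint", "add tests"]).items():
        print(f"{task}: {outcome}")
```

Because each task runs in isolation, results arrive in completion order rather than submission order, which is why the sketch keys them by task instead of relying on position.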


The Model: codex-1

codex-1 is a version of o3 optimised specifically for software engineering tasks. Key characteristics:

  • Context window: 192,000 tokens
  • SWE-bench Verified: 72.1% (surpasses o3-high at 71.7%; 85% with up to 8 retries)
  • First-attempt accuracy on software engineering tasks: ~37%; climbs to 70.2% with retries
  • Fine-tuned to produce clean diffs, follow repo conventions, and prefer verifiable actions
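
One way to read the retry figures is to compare them with what fully independent attempts would give. The sketch below is a back-of-envelope model, not how SWE-bench is scored: it assumes each attempt succeeds independently with the single-attempt rate, and `pass_within_k` is a hypothetical helper.

```python
def pass_within_k(p_single: float, k: int) -> float:
    """Chance of at least one success in k independent attempts."""
    return 1.0 - (1.0 - p_single) ** k


if __name__ == "__main__":
    # SWE-bench Verified figures quoted above: 72.1% single, 85% with retries.
    independent = pass_within_k(0.721, 8)
    print(f"independent-attempt estimate for 8 tries: {independent:.4f}")
    print("reported with up to 8 retries:            0.8500")
```

Under the independence assumption, eight attempts at 72.1% would succeed almost always, so the reported 85% suggests that failures are concentrated in a subset of tasks that remain hard on every attempt.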

The CLI also supports other OpenAI models via the --model flag.


Comparison: Codex CLI vs Claude Code

Dimension | Codex CLI | Claude Code
Implementation | Rust (open-source, Apache 2.0) | TypeScript (proprietary)
Default model | codex-1 (o3-based, SWE-bench 72.1%) | Claude Sonnet 4.6 (SWE-bench ~60%)
Approval modes | suggest / auto-edit / full-auto | default / bypassPermissions
Project config file | AGENTS.md (open standard, 60+ tools) | CLAUDE.md (Anthropic-specific)
Multi-agent | Cloud parallel containers | Local subagents + worktrees
MCP support | Yes | Yes (native host)
Hooks system | No equivalent | PreToolUse / PostToolUse / Stop
Skills / slash commands | Agent Skills (SKILL.md pattern) | Skills via ~/.claude/plugins/
/ultrareview equivalent | None | Yes (multi-agent cloud review)
Desktop app | Yes (macOS + Windows, Feb/Mar 2026) | No standalone app (IDE extensions only)
Ambient background coding | Yes (computer use, parallel agents) | No
Self-hostable | Yes (CLI is fully local) | Yes (CLI is local)

When to Choose Codex

  • You need cloud-parallel task execution across many isolated containers simultaneously.
  • You want an open-source, auditable Rust binary with a permissive licence.
  • Your team uses AGENTS.md as the cross-tool standard and needs Codex to read it natively.
  • full-auto mode inside CI or ephemeral containers is the target workflow.

When to Choose Claude Code

  • You need hooks for automated lint, test, or guardrail enforcement on every tool call.
  • Multi-agent coordination with strict file ownership and worktree isolation matters.
  • The broader MCP ecosystem (native host, first-party servers) is important.
  • /ultrareview parallel code review from multiple angles is valuable.

Key Facts

  • CLI released: April 2025 (TypeScript); rewritten in Rust: June 2025
  • Codex app (macOS): February 2026; Windows: March 2026
  • GitHub stars: 75k+ (as of early 2026)
  • Model: codex-1 — o3 fine-tuned for software engineering, 192k context window
  • SWE-bench Verified: 72.1% (first attempt); 85% (up to 8 retries)
  • Licence: Apache 2.0
  • AGENTS.md: open standard, Linux Foundation, 60+ tools
  • Approval modes: suggest (default) → auto-edit → full-auto
  • /mode command switches approval mode mid-session without restart

Open Questions

  • How does Codex compare to Claude Code on multi-file refactors in practice? SWE-bench scores may not capture real-world complexity.
  • Do cloud-isolated containers in Codex's remote execution mode share state across tasks in the same session?
  • What's the latency difference between Codex CLI and Claude Code for a typical 20-file codebase task?