Graduating to the terminal

May 30, 2026 AI

[Image: a code editor showing a pyproject.toml file next to an OpenCode terminal session]

This is the first post in a series on the AI engineering toolkit. It runs from terminal-curious to fairly advanced and has no planned end. The early posts will be simple, maybe too simple if you have done this for a while, but the difficulty ramps up as they go, and we will hopefully catch everyone at their level along the way. Later on it gets into running several agents at once, and the messier ground past that, like working out whether any of it actually helps, running AI in CI or production, and keeping it from doing real damage. The terminal is just the doorway, and the real benefit is further on, when the AI runs on its own and you are no longer ferrying it through every step.

I assume you have used one of the chat-box assistants for coding and want more out of it, so I am not going to define what they are.

The chat box stops being enough

You have a browser tab open with an assistant or the desktop version of it. You paste a function in, and you paste the fix back into your editor. That holds up for a while. Then the task grows, and you are pasting three files so the thing has enough to go on, while it quietly guesses about the two you did not think to include. The session turns into you shuttling text between a tab and your editor, and bits get lost on the way.

Most of that friction has nothing to do with the model. A chat box only ever sees what you hand it, so it cannot open the file sitting next to the one you pasted, or run your tests, or look at the error your build actually threw. You can wire connectors into the desktop apps so they reach into a folder or call a tool, which helps, though it is something you set up and then have to keep an eye on. Either way you are still the one doing the fetching and carrying.

A terminal agent drops that step. It runs in the directory where your code already lives, opens the files it needs, runs commands and reads what they print back, and edits in place. You give it the outcome you want, then read over what it changed. Running in the terminal does not mean giving up your editor: each of these agents ships an extension that wires it into your IDE session, so you can hand it open files or highlight a block as a reference and review its edits as side-by-side diffs, without the cramped prompt editing and history scrolling a bare terminal puts on you. More on that setup later. The rest of the series assumes you are working this way, with the agent inside the repo instead of a browser tab.

Three tools lead the field. Claude Code from Anthropic and OpenAI’s Codex are the single-vendor agents, each tied to its maker’s models. OpenCode is the odd one out: it is model-agnostic, so you point it at whatever provider you want. I am not going to pit them against each other. They are implementations of the same idea, and I will say things once where they agree and pull them apart where they really differ. The examples later lean on OpenCode, but little of it depends on which you picked.

Why the terminal, concretely

By default the agent works on your files as they sit on disk, uncommitted edits and untracked scratch files included, in the directory you started it from. There is nothing to grant it and no connector to point at a folder. A desktop app can read your files through a filesystem MCP server, but only in the directory you set up ahead of time, which is always a step behind where you are actually working.

It runs your commands too, the tests and the linter and the build, and reads what they print and acts on it without you pasting anything back. Its reach is the whole project. A chat app only knows what you typed at it, so you paste files in and paste them again every time they change, while the agent reads them itself and keeps them around for the session. All of this happens inside the setup you already run, your own shell and git and virtualenv, so beyond the agent itself there is nothing new to install.

None of these are things only a terminal can do. The terminal’s pull is less friction and a bit more depth. It works through your real git history and shows every change as a diff you can read and undo before it sticks, and it keeps a whole task in one place. The chat apps are still fine for a one-off question, or a snippet you would rather keep away from your repo. I will come back to the desktop app once at the end, then drop it.

One thing falls out of the agent being a plain command: you can run it in a container, Docker or Podman, walled off from the rest of your machine. That is what makes it reasonable to give it a longer leash with fewer approval prompts, since the worst it can do stops at the container wall, and along the way you can cut its network access or pin an exact toolchain. All three run in a container the same way. Claude Code ships a reference devcontainer and OpenCode publishes an image. The how and why of that is a later post on permissions and sandboxing.
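To make the container idea concrete before that later post, here is a minimal sketch that only assembles a docker run command line; it does not start anything. The image name, the /workspace mount point, and the function name are my own placeholders, not something any of the three tools prescribes. One caveat I am assuming matters for most setups: with a hosted model the agent still has to reach its provider, so switching the network off entirely only makes sense with a locally served model, and real setups tend to restrict outbound traffic instead.

```python
import shlex
from pathlib import Path


def container_command(project_dir, image, agent_cmd, network=None):
    """Build a `docker run` argument list that starts an agent inside a container.

    Only the project directory is mounted, so writes stop at that folder.
    `image` is whatever image you built or pulled (hypothetical here).
    `network="none"` passes Docker's --network none, which cuts all network
    access, including access to a hosted model provider.
    """
    project = Path(project_dir).resolve()
    cmd = [
        "docker", "run", "--rm", "-it",
        "-v", f"{project}:/workspace",  # mount the repo, nothing else
        "-w", "/workspace",             # start the agent in the repo root
    ]
    if network is not None:
        cmd += ["--network", network]
    cmd.append(image)
    cmd += list(agent_cmd)
    return cmd


# Print the command instead of running it, so it can be inspected first.
example = container_command(".", "example/agent-image:latest", ["opencode"], network="none")
print(shlex.join(example))
```

Swapping "docker" for "podman" in the list should work for the flags used here, since Podman accepts the same ones.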

Installing Claude Code

Claude Code is tied to your Anthropic account. The free Claude.ai plan does not include it, so you need a Claude Pro, Max, Team, or Enterprise subscription, or an Anthropic Console account with API credits. It also works through Amazon Bedrock, Google Vertex AI, and Microsoft Foundry if that is how your company buys access.

On macOS, Linux, or WSL, the recommended install is one line:

curl -fsSL https://claude.ai/install.sh | bash

This drops a self-contained binary that updates itself in the background. If you would rather go through a package manager, Homebrew has a cask (brew install --cask claude-code), and Windows has WinGet (winget install Anthropic.ClaudeCode). There is also an npm package, @anthropic-ai/claude-code, which wants Node.js 18 or newer. I would reach for the npm path only if you already manage everything through npm, since the native installer handles its own runtime.

The command is claude. Drop into a project and start it:

cd your-project
claude

On first run it opens a browser to log you in. Credentials are stored locally, so you do this once. If you ever need to switch accounts from inside a session, /login does it, and /help lists the rest of the slash commands.

Installing Codex

Codex is OpenAI’s agent, and like Claude Code it is tied to one vendor. It runs OpenAI’s GPT-5-Codex models, reached through a paid ChatGPT plan (Plus, Pro, Business, and up) or an OpenAI API key. Unlike Claude Code it is open source, a Rust codebase under Apache-2.0 that ships updates at a furious pace.

On macOS, Linux, or WSL, one line installs it:

curl -fsSL https://chatgpt.com/codex/install.sh | sh

There is also npm install -g @openai/codex and a Homebrew cask (brew install --cask codex). The command is codex. Run it in your project and sign in with your ChatGPT account on first launch, or set OPENAI_API_KEY.

cd your-project
codex

Where it diverges from the other two is worth a note. The sandbox is enforced by the operating system itself (Seatbelt on macOS, Landlock and seccomp on Linux), so even before you change a setting it can edit inside your workspace but has to ask before it reaches the network or writes anywhere else. That is a harder boundary than a prompt you can click straight through. For undo it leans on the sandbox and on plain git, where Claude Code keeps its own checkpoints and OpenCode has /undo. And it reads AGENTS.md, the shared instruction file most agents now use, while Claude Code sticks to its own CLAUDE.md and ignores AGENTS.md. A later post digs into both.

Installing OpenCode

OpenCode is model-agnostic. It ships with no model access of its own, which means there is one extra step: you bring a provider. That is the price of not being tied to a single vendor.

One thing about the name, since it will look odd otherwise. The project used to live at sst/opencode and now lives at anomalyco/opencode. It is the same project. GitHub redirects the old links, and the docs at opencode.ai reflect the new home. If you see both names referenced around the web, they point at the same tool.

The recommended install on macOS, Linux, or WSL is again one line:

curl -fsSL https://opencode.ai/install | bash

There is broad package-manager coverage too: npm install -g opencode-ai, a Homebrew tap (brew install anomalyco/tap/opencode), Chocolatey and Scoop on Windows, pacman on Arch, and Mise. Pick whichever fits how you already install things. One note on updates: OpenCode’s install-script binary auto-applies patch releases on startup and prompts you before minor and major ones. A package-manager install does not auto-update for any of the three tools, though, so when you go that route you bump it yourself.

The command is opencode:

cd your-project
opencode

The first launch lands you in the TUI with no provider connected, so nothing works yet. Connect one:

/connect

This opens a picker. If you already have a provider account, pick it (Anthropic, OpenAI, Amazon Bedrock, and others), and paste the API key when asked. Keys are stored under ~/.local/share/opencode/. If you do not want to wrangle provider accounts, there are two easy ways in. OpenCode Zen is a managed, pay-per-use option with a curated set of models. And if you already pay for GitHub Copilot, connecting it is a device-code flow with no new key to manage.

Then pick a model and let it read your project:

/models     # choose from your connected providers
/init       # analyze the codebase, write an AGENTS.md

/init writes a file called AGENTS.md that captures what the agent learned about your project. Commit it. That file is the subject of one of the next posts, so I will leave it there for now.

Inside your editor

A terminal does not have to be a separate window. Most editors have one built in, so running claude, codex, or opencode there behaves exactly like a standalone terminal, and the agent comes along wherever you work. All three also reach further into the editor than a bare terminal does, in different ways depending on whether you are in VS Code or Neovim.

In VS Code and its forks like Cursor and Windsurf, all three run in the integrated terminal, the panel you toggle with Ctrl+` (the same key on a Mac). Beyond that, each ships an extension, and they take different shapes. Claude Code’s official extension adds a graphical panel with side-by-side diffs you approve before they land, @-mentions of files and line ranges, and plan review in a real document. Anthropic calls it the recommended way to use Claude Code there, and it also installs on the forks through the Open VSX registry. OpenCode goes the other way: running opencode in the integrated terminal installs its extension automatically, and instead of a separate panel it keeps the TUI and wires the editor in with a quick-launch shortcut (Cmd+Esc on a Mac, Ctrl+Esc elsewhere), selection and open-tab sharing, and a shortcut to drop file references like @File#L37-42. Codex has its own official extension for the same editors.

Neovim gets no first-party plugin from any of the three, but the community has filled the gap. For OpenCode I use NickvanDyke/opencode.nvim, which connects to a running opencode and shares editor context like the current buffer, selection, and diagnostics. For Claude Code there are two common approaches: coder/claudecode.nvim reimplements the same WebSocket protocol the official VS Code extension speaks, so you get native diffs and selection sharing, while greggh/claude-code.nvim takes the lighter route of toggling the claude CLI in a terminal split and reloading buffers it edits. Codex has no official plugin either, so you run it in a terminal split or pick up a community wrapper.

This is not an exhaustive list. Other editors have their own integrations too, JetBrains IDEs, Zed, and Emacs among them, so it is worth checking whether yours does before you reach for a separate window.
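Whichever editor terminal you end up in, the agent has to be on that shell's PATH, and integrated terminals do not always inherit the same PATH as your login shell. A small sketch for checking, assuming only the three command names used in this post; the function name and the injectable lookup are mine.

```python
import shutil

AGENT_COMMANDS = ("claude", "codex", "opencode")


def installed_agents(which=shutil.which):
    """Map each agent command to its resolved path, or None if it is not on PATH.

    `which` is injectable so the lookup can be swapped out in tests.
    """
    return {name: which(name) for name in AGENT_COMMANDS}


for name, path in installed_agents().items():
    print(f"{name}: {path or 'not found on PATH'}")
```

Run it from the editor's terminal and from a standalone one; if the answers differ, the editor is starting its shell with a different environment.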

A first session that earns its keep

The install is the boring part. The agent earns its place the first time it does something you would otherwise have done by hand. Examples lean Python in this series, since that is the common tongue in the AI world, and I will use conda for environments. The flow is portable enough if you prefer something else.

Take a small project with a failing test. Set up the environment first, the ordinary way:

cd your-project
conda create -n demo python=3.12 -y
conda activate demo
pip install -e ".[dev]"
pytest

Say one test fails with a KeyError and you do not yet know why. In the chat-box world you would start copying: the test, the traceback, the function under test, probably the wrong files. Instead, start the agent in the same directory and describe the outcome you want:

pytest is failing with a KeyError in test_parse_config. Find the cause and fix it. Run the tests to confirm.

It opens the test, traces it to the function under test, and reads the code, then runs pytest itself, hits the same traceback you would have, changes a line, and runs the suite again to check it goes green. You did not paste anything in. It went and found what it needed on its own.
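For a feel of what such a fix tends to look like, here is an invented stand-in for the project. The function bodies, the "timeout" key, and the default value are all hypothetical; only the test name comes from the prompt above. The buggy version assumes a key that an otherwise valid config may leave out, and the fix is the kind of one-line change described.

```python
DEFAULT_TIMEOUT = 30  # hypothetical default, chosen only for this illustration


def parse_config_buggy(raw: dict) -> dict:
    # Indexing assumes "timeout" is always present; a config without it raises KeyError.
    return {"name": raw["name"], "timeout": int(raw["timeout"])}


def parse_config(raw: dict) -> dict:
    # The one-line fix: treat "timeout" as optional and fall back to a default.
    return {"name": raw["name"], "timeout": int(raw.get("timeout", DEFAULT_TIMEOUT))}


def test_parse_config():
    # The case that failed before the fix: a config with no "timeout" key.
    assert parse_config({"name": "demo"}) == {"name": "demo", "timeout": DEFAULT_TIMEOUT}
    # An explicit value still wins over the default.
    assert parse_config({"name": "demo", "timeout": "5"})["timeout"] == 5
```

Whether a default is the right fix, as opposed to rejecting the config with a clearer error, is exactly the sort of call worth checking in the diff.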

Two habits pay off early. One is to read the reasoning it prints and not skip straight to the diff. When it starts going in circles (“the user wants A, but there is B, so maybe C, wait”), that usually means your request was underspecified, or the code has a knot in it worth untangling first. The code it lands on might compile fine even then. The other is to keep an eye out for changes you did not ask for, an unrelated function quietly rewritten, some pattern nobody requested, the sort of thing that is easy to wave through and a pain to unpick later.
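The second habit can be partly mechanised. A rough sketch, assuming you know up front which paths the task should touch: list what changed since the last commit and flag anything outside those paths. The function names and the example prefixes are mine. Note that git diff --name-only HEAD covers tracked files only, so brand-new untracked files will not show up here.

```python
import subprocess


def changed_files(repo="."):
    """Tracked files that differ from the last commit, staged or not."""
    out = subprocess.run(
        ["git", "diff", "--name-only", "HEAD"],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def unexpected_changes(changed, expected_prefixes):
    """Return the changed paths that fall outside every expected prefix."""
    prefixes = tuple(expected_prefixes)
    return [path for path in changed if not path.startswith(prefixes)]


# Example with a hand-written list, so nothing here needs a repository.
sample = ["src/config.py", "tests/test_config.py", "src/billing/invoice.py"]
print(unexpected_changes(sample, ["src/config.py", "tests/"]))
```

In a real repository you would pass changed_files() in place of the sample list. A flagged file is not necessarily wrong, it is just the place to read the diff most carefully.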

Look before it leaps

Get into the habit of making it plan before it touches anything. All three have a read-only mode where it can poke around and lay out an approach without editing a file, and you read that plan, push back on whatever is off, and only then let it start writing. It is cheap insurance against an agent that sets off confidently in the wrong direction. For anything big or vague, go one step earlier and just think out loud with it in that read-only mode before you even ask for a plan, because the back-and-forth tends to drag up constraints and dead ends you would never have spelled out yourself. There are ways to go even further along this line, for example with spec-driven development, which will be the subject of its own post.

The mechanics differ. In Claude Code, Shift+Tab cycles through permission modes, including a plan mode that holds off on edits. In OpenCode, Tab toggles between build mode, the default where it edits, and plan mode, which is read-only. Codex gets there with a /plan command for drafting the approach and a read-only mode you set from /permissions that keeps it consultative until you approve.

For a one-line fix it is overkill. When you are not sure the agent has understood the task, though, a quick plan pass saves you from reviewing a big wrong change after the fact. Planning gets its own post further on.

When something goes wrong

It will, and two of the three hand you a built-in way back. OpenCode has /undo and /redo for the last batch of changes, and Claude Code keeps checkpoints you can rewind to. Codex went the other way and dropped its experimental undo, on the theory that you should be on git anyway, and its OS sandbox already caps how much a bad run can wreck. Whichever you are on, you can get back to where you started. The permission rules that decide what the agent may do without asking are a topic for a later post.

For now, keep it dull: stay in a git repo, commit before you hand it anything big, and look over what it changed before committing again. It is fast and usually right. That “usually” is why I keep each commit small enough to throw away.
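The "commit before you hand it anything big" part is easy to check mechanically. A minimal sketch, assuming you want a clean working tree before starting a large task; the function names are mine. It relies on git status --porcelain, which prints one line per modified, staged, or untracked path and nothing at all when the tree is clean (ignored files aside).

```python
import subprocess


def porcelain_status(repo="."):
    """Raw output of `git status --porcelain` for the given repository."""
    return subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout


def is_clean(porcelain: str) -> bool:
    """True when the porcelain listing is empty, i.e. nothing is left to commit."""
    return porcelain.strip() == ""


def dirty_entries(porcelain: str):
    """The individual status lines, one per path that still needs attention."""
    return [line for line in porcelain.splitlines() if line.strip()]


# Example on canned output, so it runs without a repository.
sample = " M src/app.py\n?? scratch.py\n"
print(is_clean(sample), dirty_entries(sample))
```

In practice you would call is_clean(porcelain_status()) and commit or stash whatever it reports before starting the agent, so that everything the agent changes is one git restore away from gone.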

It runs without you

What you build driving the agent by hand is what a pipeline runs later, with nobody sitting there. The config you tuned interactively (the AGENTS.md and the permission rules, both of which we will come back to) is exactly what an unattended job reads, so none of it gets redone.

Some tools promise the same from outside the terminal. Claude Cowork can run multi-step workflows on its own, but it is pitched at administrative and business work, the sort that lives in documents and inboxes, and it has no notion of git. For code that rules it out: git is how you track every change and keep the project maintainable over time. An agent that cannot touch it has no business in a codebase you live with.

In CI the honest alternative is calling the model’s API straight from a script. You can, but then you own the loop yourself, feeding it files and reading back what broke. Pointing Playwright at a hosted chat is another route, though a headless browser trips bot checks and breaks every time the page shifts, so it is permanent maintenance. A terminal agent is already that loop, behind the one contract every pipeline understands: arguments in, an exit code out.

This works today. The same tool you ran interactively takes the same prompt with the interaction stripped out:

# Claude Code, headless
claude -p "summarize the changes in this branch" --output-format json

# OpenCode, headless
git diff main | opencode run "review these changes and flag anything risky"

# Codex, headless
codex exec "summarize the changes in this branch"
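The "arguments in, an exit code out" contract means a pipeline step can treat any of these like any other command. Here is a small sketch of such a wrapper. The function names and the timeout are my own choices, and I make no assumption about which fields the JSON output contains, only that it may or may not parse.

```python
import json
import subprocess
import sys


def run_headless(cmd, timeout=600):
    """Run a command non-interactively and return (exit_code, stdout).

    `cmd` is an argument list, for example one of the headless invocations
    above split into its parts.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    if proc.returncode != 0:
        # Surface the tool's own error text in the job log.
        print(proc.stderr, file=sys.stderr)
    return proc.returncode, proc.stdout


def parse_json_output(stdout):
    """Parse stdout as JSON, returning None when it is not valid JSON."""
    try:
        return json.loads(stdout)
    except json.JSONDecodeError:
        return None
```

A CI step would call run_headless with the agent's argument list, fail the job on a non-zero exit code, and hand the parsed output to whatever posts the summary.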

All three also ship a GitHub Action that runs the agent against a pull request, so the gap between “I ran this in my terminal” and “this runs on every PR” is not large. That is its own post later. For now, the practice you put in by hand carries straight over to running it unattended.

Conclusions

It all comes down to where the agent sits. Put it where your code actually lives and it can read and change files and run things itself, without you ferrying text back and forth. Once it can run your tests and see the result, it has something concrete to check its work against, and most of what comes later in the series builds on that.

Claude Code is the lowest-friction start if you already pay Anthropic, and Codex is the same deal in the OpenAI world, both tied to their maker’s models. OpenCode asks for one more step, connecting a provider, and gives back model flexibility and a plugin system. That is why I lean on it as the default from here on, and later posts will go into the reasons. Whichever you pick does not bind you, since the concepts map onto all three and I will call out the places they diverge.

The desktop and web apps still have a place for a fast question, or a snippet you do not want anywhere near your repo. That is the last you will hear of them here. From here on we stay in the terminal, and the next thing worth doing is teaching the agent about your project, starting with the instruction files that stop it from forgetting your conventions. Get that groundwork in and the path opens up: from there we graduate not just to the terminal but to AI that runs on its own, which is where the real benefit is.