De-composer

Stealing Composer's Hands, Keeping Its Brain

TL;DR: Composer 2.5 is cheap but only inside Cursor, and Cursor's API does not accept arbitrary Claude-style tools. It accepts MCP servers. So CLIProxyAPI dynamically presents Claude Code's local tools as fake MCP servers for the Composer run, lets the official @cursor/sdk handle the Cursor side, then routes each fake-MCP tool call back to Claude Code on my machine.

By Jeff Nash · May 28, 2026 · 14 min read

jeffnash/CLIProxyAPI

The Setup

The Cheap Agent That's Stuck in Cursor

Hey, Jeff here. If you've been following agentic coding lately, you've probably heard about Cursor Composer 2.5. It's a fine-tune of Moonshot's Kimi K2.5, the same family as the last Composer release, only this time Cursor was not shy about saying so.

Composer 2 dropped without telling anyone Kimi was under the hood; people found the model IDs in API traffic, the internet did what the internet does, and Cursor eventually issued a classic, deer-in-the-headlights "we totally didn't mean to do anything, but we can understand why you might think that" mea culpa. Composer 2.5 skips that whole rigmarole and names the Kimi K2.5 checkpoint right in the post.

The reason people got worked up, in a good way for once, is that this felt less like a leaderboard fine-tune with a press release stapled to it. Cursor openly admitting Kimi is the base model is only half the story; the other half is that a lot of people think Cursor's fine-tuning makes Composer feel like a different beast altogether. As someone who has downloaded dozens of user-uploaded post-trained GGUFs from HuggingFace like they were WinAmp skins in 2002 and been relatively disappointed, it's encouraging to see how different this model feels from Kimi. Composer 2.5 was trained on xAI's Colossus, speculative reports say Cursor allocated something like 85% of the compute budget to RL and synthetic coding tasks, and this is where that post-training story starts to feel real from the user side.

Chinese AI labs keep dropping models with stats that say they are one hair behind Opus or GPT, and the models are impressive, especially for the price, but the user reviews tell a messier story than the obligatory benchmark chart on every lab's release announcement. Setting aside the fact that they inexplicably highlight the numbers in their own column even when those numbers aren't the highest, which insults MY intelligence, no one has really cared about benchmaxxed scores for a good while now. People still feel that little sigh of relief when their Claude or Codex quota resets and they can finally hand the ugly ticket back to a model they trust, which is a pretty good tell that the cheaper models have not quite crossed over for a lot of real workflows.

Composer 2.5 is interesting because, for me at least, it interrupts that reflex. Cursor seems to have cracked some of that feel here, and they did not make you pay frontier prices for it: the standard tier is fifty cents per million input tokens and two-fifty per million output, roughly an order of magnitude cheaper than Opus-class pricing and often cheaper than plain API access to the models people keep comparing it against. More importantly, it is useful in the boring ways that matter; it is fast, cheap enough to leave running, and willing to grind through long agent work where a benchmark-good cheap model starts sounding plausible while quietly losing the plot.

The 262k context window is not impressive on paper anymore, especially in a world where everyone is yelling about million-token ceilings, but Composer 2.5 is better than the spec sheet makes it look. The win is behavioral: sustained tool use, decent compaction behavior, good effort calibration, and less of that cheap-model habit where it wanders off right before the boring part gets hard. I would not call it Opus tier, but for workflows with a lot of interacting pieces, where a cheaper model can waste the whole afternoon by missing one relationship three steps back, it is a real tool for the job.

So for a minute there, agentic coding was cheap again, and that is why I cared enough to chase this instead of just opening Cursor and moving on with my life. Composer 2.5 was not a model I wanted to benchmark for a day and forget about. It was one of those rare cheap models that actually changes what I am willing to run, and I wanted it in my actual harnesses, inside Claude Code, inside the terminal workflows I already use, with the local tools and MCP servers I already trust.

The catch is structural. As of right now, Composer 2.5 is only available inside Cursor's own products: the desktop app and the CLI. They are not doing the whole "paste this base URL anywhere, put in your API key, and go have fun" thing. Cursor was absolutely a first mover in agentic coding, and plenty of people still love the app, but plenty of people also moved back to VS Code, or just to a terminal with Claude Code, Codex, OpenCode, Pi, pick your poison.

That is the annoying part: you can love Composer 2.5 in Cursor and still live in Claude Code for everything else. Suddenly you are paying Cursor for the subscription, then reaching for Opus or Codex pricing again whenever you leave Cursor, because Cursor's best agent is trapped in Cursor's interface. If the model were mediocre, I would not have bothered. Since it is fast, cheap, and weirdly good at the exact long-running agent work I care about, I wanted the subscription I already pay for to work where I actually code.

This is exactly the sort of gap my CLIProxyAPI fork exists to close. CLIProxyAPI is a Go proxy server with one job: take whatever pile of AI subscriptions and OAuth logins you already pay for, Claude Code, Codex, Gemini, and the rest of the circus, and expose them behind one OpenAI- or Anthropic-shaped API on your machine. Instead of every CLI inventing its own auth circus, you point your tools at localhost, pass a key your config trusts, and say which model you want.

My fork exists because I wanted more providers and a hosted route. I added Copilot OAuth back when that was the best deal around, then Railway deployment with auth bundling so you can log in locally once and hit the proxy from a laptop, iPad, or whatever happens to be in front of you. I also added pass-through models for random OpenAI- or Anthropic-compatible endpoints and first-class Chutes support, which is basically serverless compute for open-source models; the fork fetches Chutes' model list, normalizes the names, and automatically exposes explicit chutes-* routes so I do not have to hand-maintain whatever model happens to be cheap and useful this week. Apparently my hobby is collecting AI billing relationships and pretending that is infrastructure.

The next feature was obvious: Cursor Composer via your Cursor subscription and your Cursor API key. Same theme, one base URL, many providers, many tools, now with Cursor. The version I wanted was not the sketchy path where I reverse-engineer Cursor's private protobuf and OAuth dance and pretend to be their app. It was unofficial routing with official SDK runtime: let the published @cursor/sdk talk to Cursor, then make Claude Code's tools show up in the shape Composer knows how to call.

That got the model talking, but not working. Cursor's SDK can run the Composer agent loop, but Cursor's side does not accept arbitrary Claude-style tools. It accepts MCP servers. So CLIProxyAPI has to dynamically present Claude Code's local tools as fake MCP servers, keep the SDK run alive across tool calls, and route every shell command, file read, edit, grep, and local MCP call back to the machine where Claude Code already lives.

It works now. Several wrong designs, one embarrassing filesystem leak, and a few test failures later, Composer can sit inside Claude Code without turning back into a chat box that talks about edits instead of making them.

The Squeeze

This Should Have Been Glue Code

I gave myself two rules before I started, both of them reasonable in isolation and annoying the second they met each other. The first rule was that I would not call Cursor's private API myself. I wanted Cursor's own published SDK to own auth, streaming, recall, and the protocol details I did not want to clone, because reverse-engineering a private endpoint and dressing myself up as the IDE is exactly the kind of cleverness that ages into a support burden and, depending on how spicy you get with it, a terms-of-service problem.

Rule two was that every tool had to run on the user's machine. If Composer wants to inspect or change the repo, that work needs to happen where Claude Code is running, against the user's files and configured tools. A proxy can translate and route; it cannot become the computer the agent is operating.

Use the official SDK

Cursor's code does the Cursor talking. My proxy does not clone private APIs, forge an IDE, or guess at protocol details.

Keep the tools local

Claude Code remains the execution environment, including the user's repo and MCP setup.

The catch: the official SDK is a full agent runtime, and its default tool loop runs wherever the Node process happens to live. The clean path was also the trap: the right way to talk to Cursor came with the wrong place to run tools.

The Graveyard

Every Obvious Path Broke a Rule

The box below is the short version of what failed.

Five forks, each killed by a rule

Speak Cursor's wire protocol myself. This meant rebuilding undocumented protobuf and Connect framing, then personally calling the private endpoint I was trying to avoid.

Use a hosted Composer proxy. This fixed nothing I cared about, because now a random backend sat between me and Cursor.

Run the official SDK in agent mode. This was the right direction until the tools started running on the server's filesystem.

Suppress SDK tools entirely. This was safe in the same way removing the engine from a car is safe. No filesystem leak, no useful agent loop.

Pretend to be the Cursor IDE. This pushed me back toward forged identity headers, the line I was trying not to cross.

Running the official SDK in agent mode is the failure that mattered. The @cursor/sdk package was the obvious way to talk to Cursor without cloning their private API, because it already knew how to authenticate and manage a real Composer run. Then I ran it on a Railway box, pointed Claude Code at it, and Composer cheerfully told me it was operating inside a remote container with HOME=/root, not where my repo lives unless my laptop has been doing some very ambitious lying.

After the Railway pwd output, I was debugging tool execution location, not provider wiring. Every read and shell call had to run on my laptop through Claude Code.

The Turning Point

Why Does It Think It's Running on the Server?

Claude Code was running on my laptop, the Node sidecar was running on Railway, and the official Cursor SDK was doing the Composer run when I asked the agent something ordinary about my repo. Composer answered like a helpful little server process: its working directory was /app, its home directory was /root, and /home/jmn did not exist.

composer @ Railway (the wrong machine)

User: what's your working directory and home dir?

Tool call: shell("pwd; echo HOME=$HOME; ls /home/jmn")
/app
HOME=/root
ls: cannot access '/home/jmn': No such file or directory

Composer: I'm running on a remote container, so I can't see your local files. My home directory is /root.

It wasn't hallucinating. It was telling the truth, which was worse.

One config line caused it: Agent.create({ local: { cwd: WORKDIR } }). In the SDK's world, "local" means "run the agent's tools in this process's local filesystem," and since the process lived on Railway, the agent's local filesystem was the Railway container. Those were real commands against Railway's disk, not hallucinated paths.

That is an easy mistake to dismiss as a config bug until you watch the agent act on it. A coding agent with tools pointed at the wrong filesystem is worse than a chat bot, because it will happily grep and narrate a directory tree that is not your repo. Composer understood the request well enough; I had accidentally handed it the wrong machine.

Which machine runs the command

Bad config (/app): Composer SDK -> Railway shell, files, edits. Everything is technically local, just local to the wrong machine.

Desired shape (/home/jmn): Composer SDK -> turn -> Claude Code tools + MCPs. The model can run upstream; the file operations still need the local repo.

One config, two failures

Wrong filesystem. Read, shell, and edit calls touch the container, not the user's repo.

Wrong tool inventory. Claude Code's tools and MCPs are ignored while the SDK advertises its own local executor.

Removing the local: option just swings the pendulum too far the other way, because now the SDK no longer has the local tool loop I need. I had to intercept tool dispatch and send each call back to the client before the server touched disk.

The Body

The Smart Part Was Keeping Claude Code's Tools

The Railway leak left one requirement: tool calls leave the SDK process and finish on the machine where Claude Code runs.

I did not want Composer as a cheaper text box. I wanted it inside Claude Code with the same repo tools and MCP servers I already run locally. The annoying constraint is that Cursor Composer does not accept arbitrary Claude-style tools over the API; it accepts MCP servers, so CLIProxyAPI has to make Claude Code's tools look like MCP servers for the duration of the run.

That is why the Node sidecar exists in the first place. CLIProxyAPI stays the local API facade; the sidecar is the little SDK host whose job is to let Cursor's own code handle auth, recall, streaming, and the protocol furniture I do not want to clone by hand.

Composer's view: everything looks like MCP. Cursor accepts MCP servers, not arbitrary harness tools, so Composer sees mcp.claude.read, mcp.claude.edit, mcp.claude.shell, mcp.claude.grep, mcp.local.github, and mcp.local.browser. An outbound call looks like mcp.claude.shell.run.

CLIProxyAPI: mints fake MCP servers.

Advertise: turn Claude tools into MCP-shaped server entries.
Ticket: map MCP call ids back to parked Composer runs.
Dispatch: send real work to Claude Code, not the bridge disk.

Claude Code's machine: real tools execute against the local repo ($ shell, edit, test), the GitHub MCP, and the browser MCP.

The small lie is the useful part: CLIProxyAPI makes Claude Code's real tools look like MCP servers because that is the only tool shape Cursor Composer knows how to call. Composer thinks it called an MCP server; Claude Code actually ran the shell command, file edit, grep, or local MCP tool.

Composer still needs tools in its own shape. CLIProxyAPI builds a fake MCP catalog from Claude Code's current tool inventory, including file reads, edits, shell, grep, and any real MCP servers already configured locally. When Composer calls one of those fake MCP tools, the SDK run pauses until Claude Code returns the result.

These are MCP-shaped entries, not a little zoo of spawned MCP subprocesses. Composer sees server-shaped tools because that is the contract it understands; the bridge maps those calls back to Claude Code's existing tool inventory, which is the whole reason the same local MCP setup keeps working.
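The catalog step described above can be sketched roughly as follows. This is a minimal illustration, not the fork's actual code: the `ClientTool`, `FakeMcpTool`, and `FakeMcpServer` shapes are hypothetical, the `claude` and `local.<name>` server labels only echo the diagram, and the `mcp__<server>__<tool>` naming for client-side MCP tools is an assumption about what the harness sends.

```typescript
// Hypothetical shapes for illustration; not taken from @cursor/sdk or Claude Code.
interface ClientTool {
  name: string; // e.g. "Read", "Bash", or "mcp__github__create_issue" (assumed convention)
  description: string;
  inputSchema: Record<string, unknown>; // JSON Schema as sent by the client
}

interface FakeMcpTool {
  name: string; // tool name as advertised to Composer
  description: string;
  inputSchema: Record<string, unknown>;
  clientToolName: string; // the real client tool this entry dispatches to
}

interface FakeMcpServer {
  server: string;
  tools: FakeMcpTool[];
}

// Group the client's tool inventory into MCP-shaped server entries.
// Nothing is spawned here: the result is plain data plus a way back to the real tool.
function buildFakeMcpCatalog(tools: ClientTool[]): FakeMcpServer[] {
  const byServer = new Map<string, FakeMcpTool[]>();
  for (const tool of tools) {
    const parts = tool.name.split("__");
    const isClientMcp = parts[0] === "mcp" && parts.length >= 3;
    const server = isClientMcp ? `local.${parts[1]}` : "claude";
    const name = isClientMcp ? parts.slice(2).join("__") : tool.name.toLowerCase();
    const list = byServer.get(server) ?? [];
    list.push({
      name,
      description: tool.description,
      inputSchema: tool.inputSchema,
      clientToolName: tool.name,
    });
    byServer.set(server, list);
  }
  return Array.from(byServer.entries()).map(([server, serverTools]) => ({
    server,
    tools: serverTools,
  }));
}

// Map a call Composer made against the fake catalog back to the real client tool.
function resolveClientTool(
  catalog: FakeMcpServer[],
  server: string,
  tool: string,
): string | undefined {
  return catalog
    .find((s) => s.server === server)
    ?.tools.find((t) => t.name === tool)?.clientToolName;
}
```

Keeping `clientToolName` on every entry is the point of the sketch: the name Composer sees can be whatever the MCP shape needs, as long as the bridge can always recover the tool Claude Code actually registered.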

Composer also needs to be told that repo paths come from the client harness, not from the directory next to the Node process on Railway. That sounds like a small prompt detail until you watch an agent confidently explain a directory tree from the wrong machine.

System prompt: tell the model which machine owns the tools

Defaults to SDK host (/tmp/bridge): Composer guesses from the SDK host and starts narrating the wrong filesystem.

Uses client tools: the system prompt sends repo and MCP work through Claude Code.

Claim ticket queue: a paused run gets a receipt, then the same run wakes up

Tool call: tool_read_1 is issued and claimed.
Parked run: Composer waits.
Later: the matching tool_result resumes that parked SDK run.

The bridge does not replay a fake transcript and hope the model remembers the same thought. It hands the result back to the live run that asked for it.
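One way to picture the claim ticket is a map from tool call id to the resolver of a pending promise. The sketch below is an assumption about the shape, not the bridge's real code; `ParkedCalls` and `ToolResult` are hypothetical names.

```typescript
// Hypothetical result payload carried by the client's tool_result.
interface ToolResult {
  toolCallId: string;
  content: string;
}

// A parked call is just a promise whose resolver is filed under the tool call id.
class ParkedCalls {
  private readonly waiting = new Map<string, (result: ToolResult) => void>();

  // Called from the SDK-side hook: the run awaits this promise and so stays paused.
  park(toolCallId: string): Promise<ToolResult> {
    if (this.waiting.has(toolCallId)) {
      throw new Error(`duplicate tool call id: ${toolCallId}`);
    }
    return new Promise<ToolResult>((resolve) => {
      this.waiting.set(toolCallId, resolve);
    });
  }

  // Called when a later client request carries the matching tool_result.
  // Returns false when no run is waiting on that id, so the caller can reject.
  resume(result: ToolResult): boolean {
    const resolve = this.waiting.get(result.toolCallId);
    if (resolve === undefined) return false;
    this.waiting.delete(result.toolCallId);
    resolve(result);
    return true;
  }

  get size(): number {
    return this.waiting.size;
  }
}
```

Because the awaited promise lives inside the original run, resolving it continues that run's own stack rather than starting a fresh one. A real bridge would presumably also want a timeout and an abort path so abandoned tickets do not pile up.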

Where each part lives: Cursor upstream, fake MCP in the middle, tools local. Nothing runs on the bridge by accident.

Claude Code: shell, read, edit, grep, MCP, abort, resume. The repo and MCP setup stay on the user's machine.

CLIProxy: fake MCP registry, tool id map, reject fallthrough.

Cursor Composer: @cursor/sdk handles auth, recall, and the model stream. Cursor only talks through the SDK path it already expects.

The boxes are boring on purpose. Claude Code owns the tools, the Go proxy translates requests, the Node sidecar owns the Cursor SDK run, and Cursor only hears from its own SDK. The useful seam is where the SDK's unary and streaming tool paths both end up at the same executor shape, with stable protobuf oneof cases like shellArgs, grepArgs, and mcpArgs surviving minification well enough to anchor the redirect.

The bridge hooks that SDK tool path, emits a tool_call to CLIProxy, parks the Composer run, and resumes it when the client sends the matching result. The important bit is that the thing Cursor sees is MCP-shaped, while the thing that actually executes is still Claude Code's normal local tool. If the hook is missing, the call refuses instead of quietly falling back to the SDK's native executor.

Result shape took more care than the diagram admits. Cursor serializes these tool messages through protobuf, and some of the useful fields sit inside nested oneof cases, which means a plain JSON object can look right in a log and still lose the exact branch the SDK expects when it goes back into the run. The bridge has to rebuild the proper message shape before resuming Composer, which feels too fussy until the agent hangs forever because the one field it needed dissolved in transit.
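To make the oneof problem concrete, here is a small sketch. The `{ case, value }` layout mirrors how protobuf-es models a oneof in TypeScript; whether the SDK's messages use exactly that layout is an assumption, and the result field names (`shellResult`, `grepResult`, `mcpResult`) are placeholders I made up to pair with the `shellArgs`, `grepArgs`, and `mcpArgs` cases mentioned above.

```typescript
// Hypothetical message shape; field names are placeholders, not the SDK's.
type ResultBranch =
  | { case: "shellResult"; value: { stdout: string; exitCode: number } }
  | { case: "grepResult"; value: { matches: string[] } }
  | { case: "mcpResult"; value: { content: string } };

interface ToolResultMessage {
  callId: string;
  result: ResultBranch;
}

type ArgsCase = "shellArgs" | "grepArgs" | "mcpArgs";

// The client sends back a flat JSON payload with no branch tag. The bridge
// remembers which args case the call was issued with and rebuilds the tagged
// branch from that, rejecting payloads that do not fit instead of guessing.
function rebuildResult(
  callId: string,
  argsCase: ArgsCase,
  payload: Record<string, unknown>,
): ToolResultMessage {
  switch (argsCase) {
    case "shellArgs": {
      const stdout = payload["stdout"];
      const exitCode = payload["exitCode"];
      if (typeof stdout !== "string" || typeof exitCode !== "number") {
        throw new Error(`malformed shell result for ${callId}`);
      }
      return { callId, result: { case: "shellResult", value: { stdout, exitCode } } };
    }
    case "grepArgs": {
      const matches = payload["matches"];
      if (!Array.isArray(matches)) {
        throw new Error(`malformed grep result for ${callId}`);
      }
      return {
        callId,
        result: { case: "grepResult", value: { matches: matches.map((m) => String(m)) } },
      };
    }
    case "mcpArgs": {
      const content = payload["content"];
      if (typeof content !== "string") {
        throw new Error(`malformed MCP result for ${callId}`);
      }
      return { callId, result: { case: "mcpResult", value: { content } } };
    }
  }
}
```

The flat payload `{ stdout, exitCode }` looks complete in a log, but nothing in it says which branch it belongs to. Storing the args case alongside the parked call is what lets the bridge put the tag back.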

What the bridge actually does

Server-side tool dispatch never runs locally on the bridge: forward to the client, return a safe stub, or error. The SDK emits roughly thirty server-side message shapes, so every case has to be handled explicitly. Some get boring synthetic answers, like a neutral request context that reveals nothing about the bridge host. The rest route out or fail closed, because one missed case is all it takes for the server to become the computer doing the work.
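The forward, stub, or error rule can be sketched as a lookup that fails closed. This is an illustration under assumptions: the three forwarded case names come from the text above, while `requestContextArgs` and the `Disposition` type are hypothetical stand-ins for the roughly thirty real message shapes.

```typescript
type Disposition =
  | { kind: "forward"; callId: string } // send to the client harness
  | { kind: "stub"; value: Readonly<Record<string, unknown>> } // safe synthetic answer
  | { kind: "error"; reason: string }; // refuse; never run on the bridge

interface ServerSideMessage {
  case: string;
  callId: string;
}

// Cases that must execute on the client's machine.
const FORWARDED_CASES: ReadonlySet<string> = new Set(["shellArgs", "grepArgs", "mcpArgs"]);

// Cases answered with a neutral value that reveals nothing about the bridge host.
// "requestContextArgs" is a hypothetical name for illustration.
const STUBBED_CASES: ReadonlyMap<string, Readonly<Record<string, unknown>>> = new Map<
  string,
  Readonly<Record<string, unknown>>
>([["requestContextArgs", {}]]);

// There is deliberately no branch that executes anything locally:
// a case nobody listed ends in an error, not in the SDK's native executor.
function dispose(msg: ServerSideMessage): Disposition {
  if (FORWARDED_CASES.has(msg.case)) {
    return { kind: "forward", callId: msg.callId };
  }
  const stub = STUBBED_CASES.get(msg.case);
  if (stub !== undefined) {
    return { kind: "stub", value: stub };
  }
  return { kind: "error", reason: `unhandled server-side case: ${msg.case}` };
}
```

The design choice worth copying is the default: a new message shape in a future SDK release produces a loud error instead of a quiet local execution.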

CLIProxy fronts a Composer run in Cursor's SDK, fabricates the MCP surface Composer is willing to use, and lets Claude Code execute every tool locally. Because the hook itself is process-global, the sidecar carries the current session through the async chain with AsyncLocalStorage; without that, two conversations can share one hook and suddenly a tool result lands in the wrong paused run.
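Here is a minimal sketch of the AsyncLocalStorage pattern, using Node's real `node:async_hooks` API. The `Session` shape, `runInSession`, `globalToolHook`, and `handleTurn` are hypothetical names; the actual sidecar's hook is the patched SDK dispatch, not this stand-in.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical per-conversation state.
interface Session {
  id: string;
}

const sessionStore = new AsyncLocalStorage<Session>();

// Everything started inside fn, including awaited continuations,
// sees this session when it asks the store.
function runInSession<T>(session: Session, fn: () => T): T {
  return sessionStore.run(session, fn);
}

// Stand-in for the process-global hook. It takes no session argument,
// so the only way it can know the conversation is through the store.
function globalToolHook(toolName: string): string {
  const session = sessionStore.getStore();
  if (session === undefined) {
    throw new Error("tool hook called outside a session; refusing to dispatch");
  }
  return `${session.id}:${toolName}`;
}

// The session survives awaits, so interleaved conversations stay separate.
async function handleTurn(session: Session, toolName: string): Promise<string> {
  return runInSession(session, async () => {
    await Promise.resolve(); // simulate async work before the tool call
    return globalToolHook(toolName);
  });
}
```

Running `Promise.all([handleTurn({ id: "conv-a" }, "shell"), handleTurn({ id: "conv-b" }, "grep")])` keeps each tool call attached to its own conversation even though both pass through the same global function. The throw on a missing store matters as much as the lookup: a hook that cannot identify its session should not guess.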

Stateful run inside stateless requests

User turn: Claude Code asks Composer.
Tool call: Composer pauses mid-run.
Local work: Claude Code runs tools.
Resume: same run continues.

A stateless Anthropic-shaped endpoint can front a stateful SDK run if the bridge keeps the run parked between turns. A request can end with tool calls, the run can sit there paused, and the next request can carry the tool results back into the same Composer run instead of starting over from a reconstructed transcript.

/v1/messages wants one shot per request; the SDK agent can pause for tools. The bridge stores the live run id so the next request continues the same Composer run instead of faking a new transcript.
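The request-level decision might look something like the sketch below. The `tool_result` block with a `tool_use_id` field follows the Anthropic Messages format; the `Plan` type, `planRequest`, and the map from tool id to parked run id are hypothetical, and follow-up text turns in an existing conversation are left out to keep it short.

```typescript
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface Message {
  role: "user" | "assistant";
  content: ContentBlock[];
}

type Plan =
  | { action: "start" } // no tool results: begin a new Composer run
  | { action: "resume"; runId: string; toolUseIds: string[] }
  | { action: "reject"; reason: string };

// Decide whether an incoming request starts a run or wakes a parked one.
function planRequest(
  messages: Message[],
  parkedRunByToolId: ReadonlyMap<string, string>,
): Plan {
  const last = messages[messages.length - 1];
  if (last === undefined || last.role !== "user") {
    return { action: "reject", reason: "last message must come from the user" };
  }
  const toolUseIds: string[] = [];
  for (const block of last.content) {
    if (block.type === "tool_result") toolUseIds.push(block.tool_use_id);
  }
  if (toolUseIds.length === 0) return { action: "start" };

  let runId: string | undefined;
  for (const id of toolUseIds) {
    const owner = parkedRunByToolId.get(id);
    if (owner === undefined) {
      // No live run is waiting on this id. Rebuilding a transcript here
      // would be exactly the fake replay the bridge is trying to avoid.
      return { action: "reject", reason: `no parked run for tool id ${id}` };
    }
    if (runId !== undefined && runId !== owner) {
      return { action: "reject", reason: "tool results belong to different runs" };
    }
    runId = owner;
  }
  if (runId === undefined) return { action: "start" };
  return { action: "resume", runId, toolUseIds };
}
```

Rejecting an orphaned `tool_result` is a judgment call in this sketch; the alternative is to start over from the transcript, which trades a visible error for a model that may not remember why it asked.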

Implementation Notes

The Stuff That Burned Real Time

The broad architecture was the clean part. The time went into little mismatches that are boring to describe until one of them leaves an agent run hanging forever or quietly routes a tool toward the wrong executor. I am putting them here as notes instead of making you walk through every commit, because if you have ever debugged glue code between two agent runtimes you already know the shape of the misery.

I wired the seam into the wrong build


The SDK ships ESM and CJS builds. I was validating one while importing the other, so the seam never fired and nothing looked broken. Now the bridge loads the intended build deliberately and refuses to boot if the seam is not present.
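A boot-time check in that spirit could be as small as the sketch below. It assumes the caller has already read the source text of the same build file it is about to import; the function names are hypothetical, and the anchor strings are the oneof case names mentioned earlier in this post.

```typescript
// Anchor strings expected to survive minification in the SDK bundle.
const SEAM_ANCHORS: readonly string[] = ["shellArgs", "grepArgs", "mcpArgs"];

// Report which anchors are absent from the bundle text.
function findMissingAnchors(bundleSource: string, anchors: readonly string[]): string[] {
  return anchors.filter((anchor) => !bundleSource.includes(anchor));
}

// Refuse to boot when the bundle that will actually be loaded lacks the seam.
// buildLabel ("esm" or "cjs") only exists to make the error message useful.
function assertSeamPresent(
  buildLabel: string,
  bundleSource: string,
  anchors: readonly string[] = SEAM_ANCHORS,
): void {
  const missing = findMissingAnchors(bundleSource, anchors);
  if (missing.length > 0) {
    throw new Error(
      `seam anchors missing from ${buildLabel} build: ${missing.join(", ")}; refusing to boot`,
    );
  }
}
```

The check only helps if it inspects the build the process really imports. Validating the ESM file while Node resolves the CJS one reproduces the original bug with an extra green check on top.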

Tool ids got sanitized out from under me

call: tool:read.1 -> rewrite -> return: tool_read_1

Claude rewrites unsupported tool-id characters, so an id with a colon or dot came back under a different name and missed the parked promise. The fix was to sanitize before storing, not after.
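Sanitizing before storing can be sketched like this. The allowed character set (letters, digits, underscore, hyphen) is my assumption about what the client accepts, chosen because it reproduces the `tool:read.1` to `tool_read_1` rewrite shown above; `PendingById` is a hypothetical helper.

```typescript
// Replace every character outside the assumed allowed set with an underscore.
function sanitizeToolId(id: string): string {
  return id.replace(/[^a-zA-Z0-9_-]/g, "_");
}

// Store and look up pending calls under the sanitized id only, so the key
// matches whatever comes back from the client after its own rewrite.
class PendingById<T> {
  private readonly pending = new Map<string, T>();

  // Returns the id that should be advertised to the client.
  store(rawId: string, value: T): string {
    const key = sanitizeToolId(rawId);
    if (this.pending.has(key)) {
      // Two raw ids can collapse to one key (for example "a:b" and "a.b").
      throw new Error(`tool id collision after sanitizing: ${key}`);
    }
    this.pending.set(key, value);
    return key;
  }

  // Removes and returns the pending value, or undefined if none matches.
  take(idFromClient: string): T | undefined {
    const key = sanitizeToolId(idFromClient);
    const value = this.pending.get(key);
    this.pending.delete(key);
    return value;
  }
}
```

Sanitizing on both `store` and `take` makes the lookup indifferent to whether the client rewrote the id or passed it through untouched.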

A punctuation mark almost corrupted the seam

U+2014 -> ASCII only

The seam lives inside generated SDK output, so injected guard strings stay ASCII. One fancy dash in the wrong place could quietly corrupt the exact line that decides where tools run, so there is an assertAscii guard now.
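The guard itself is tiny. The name `assertAscii` comes from the post; the signature and error format below are my guess at a reasonable shape, not a copy of the real one.

```typescript
// Throw if text contains any character outside 7-bit ASCII, naming the
// offending code point and position so the bad string is easy to find.
function assertAscii(label: string, text: string): void {
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code > 0x7f) {
      const hex = code.toString(16).toUpperCase().padStart(4, "0");
      throw new Error(`${label}: non-ASCII character U+${hex} at index ${i}`);
    }
  }
}
```

Calling it on every injected guard string before patching, for example `assertAscii("dispatch guard", guardSource)`, turns a silent corruption into a startup failure. Writing test inputs with escapes such as `"\u2014"` keeps the test file itself ASCII.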

Most of the time went to ESM vs CJS loading the wrong bundle, tool ids getting rewritten, and streaming vs non-streaming returning different shapes. Any of those can route a tool to the wrong place without throwing.

Making It Real

The Test Has to Prove It Can Fail

The bad failure mode is simple: miss one tool case and the server starts executing it. The safety layer catches that, so if the bridge hook is missing, tools reject; if the SDK no longer exposes the expected dispatch shape, the sidecar refuses to boot instead of guessing and hoping nobody notices.

The self-test took longer than I wanted because the first version was lying to me. It pinged the bridge hooks directly, which proved that the hooks existed, and proved nothing about whether the SDK's rewritten hook ever called them. That is a green check that tests the wrong thing.

The version I trust exercises the actual rewritten hook in the SDK bundle. It turns routing off so native execution is reachable, then turns routing on so tools go to the client. A test that has never failed for the real risk does not count.

Positive control vs. real run

CI installs a fresh @cursor/sdk, runs the hook check with routing off so native execution is reachable, then with routing on. If the bad path never fails, the test is worthless.
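The positive-control idea reduces to a check like the one below. It is a sketch with a fake `Dispatch` function standing in for the rewritten SDK hook; in the real self-test the probe would go through the actual bundle, which this example does not attempt.

```typescript
type Executor = "native" | "client";

// Stand-in for "send one tool call through the patched SDK path and
// report where it ended up". routingEnabled toggles the bridge redirect.
type Dispatch = (routingEnabled: boolean) => Executor;

interface SelfTestResult {
  ok: boolean;
  reason?: string;
}

function selfTest(dispatch: Dispatch): SelfTestResult {
  // Control: with routing off, the probe must be able to observe native
  // execution. If it cannot, a pass with routing on proves nothing.
  if (dispatch(false) !== "native") {
    return { ok: false, reason: "control never reached the native executor; probe is blind" };
  }
  // Real check: with routing on, the same probe must land on the client.
  if (dispatch(true) !== "client") {
    return { ok: false, reason: "routing is on but the tool reached the native executor" };
  }
  return { ok: true };
}
```

A probe that pings the bridge hooks directly behaves like `() => "client"` here: it reports success in both modes, so the control step flags it as blind. That is the failure the first version of the self-test could not show.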

The False Summit

It Worked, Which Was Not the Same as Being Done

For a few hours I thought I was done: live API, tools routing to the client, tidy commit. Review and edge cases disagreed.

✓ feat(cursor): Composer via @cursor/sdk with client-side tool execution (green)

Session identity was keyed too loosely, so separate conversations could merge.

Some upstream failures came back as clean empty successes, a 200 status code lie.

Streaming and non-streaming paths had drifted, so one path preserved data the other dropped.

The first self-test proved the hook existed, not that the seam ever reached it.

I had been calling it done when CI went green and the demo path worked. The streaming path and the non-streaming path had diverged, and two conversations could share a session key if they started with the same message.

I made server-side execution a test failure you have to see before the suite passes, and I diffed the two response paths until they matched.

Patching the SDK hook took an afternoon; trusting it took longer.

The Verdict

None of this is new computer science. CLIProxy fronts a Composer run in Cursor's SDK, turns Claude Code's local tools into fake MCP servers because Cursor only accepts MCP-shaped tools, and sends the actual execution back to Claude Code. That split is the only way I found to keep one server process from both talking to Cursor and touching the filesystem.

I care that Cursor traffic stays on their SDK, tools stay on my machine, and nothing in the middle executes shell or file ops.

This depends on an unsupported hook in @cursor/sdk. The adapter is small so a version bump is a contained fix, and CI has to prove it can catch server-side execution before we call it safe. The self-tests have one job: if the SDK bundle moves the dispatch anchor, the bridge fails before a tool quietly runs on the sidecar. Cursor should expose this as a supported mode eventually; for now it is the bridge I wanted badly enough to build.

If you already run my CLIProxy fork, you add the Node sidecar and a Cursor key, then point Claude Code at Composer. Setup is fiddly but the failure modes are obvious when you hit them.

-- Jeff

Appendix