Hey, Jeff here. If you've been following agentic coding lately, you've probably heard about Cursor Composer 2.5. It's a fine-tune of Moonshot's Kimi K2.5, the same family as the last Composer release, only this time Cursor was not shy about saying so.
Composer 2 dropped without telling anyone Kimi was under the hood; people found the model IDs in API traffic, the internet did what the internet does, and Cursor eventually issued a classic, deer-in-the-headlights "we totally didn't mean to do anything, but we can understand why you might think that" mea culpa. Composer 2.5 skips that whole rigmarole and says Kimi K2.5 checkpoint right in the post.
The reason people got worked up, in a good way for once, is that this felt less like a leaderboard fine-tune with a press release stapled to it. Cursor openly admitting Kimi is the base model is only half the story; the other half is that a lot of people think Cursor's fine-tuning makes Composer feel like a different beast altogether. As someone who has downloaded dozens of user-uploaded post-trained GGUFs from HuggingFace like they were WinAmp skins in 2002 and been relatively disappointed, it's encouraging to see how different this model feels from Kimi. Composer 2.5 was trained on xAI's Colossus, speculative reports say Cursor allocated something like 85% of the compute budget to RL and synthetic coding tasks, and this is where that post-training story starts to feel real from the user side.
Chinese AI labs keep dropping models with stats that say they are one hair behind Opus or GPT, and the models are impressive, especially for the price, but the user reviews tell a messier story than the obligatory benchmark chart on every lab's release announcement. Setting aside the fact that they inexplicably highlight the numbers in their own column even when it isn't the highest, which insults MY intelligence, no one has really cared about benchmaxxed scores for a good while now. People still feel that little sigh of relief when their Claude or Codex quota resets and they can finally hand the ugly ticket back to a model they trust, which is a pretty good tell that the cheaper models have not quite crossed over for a lot of real workflows.
Composer 2.5 is interesting because, for me at least, it interrupts that reflex. Cursor seems to have cracked some of that feel here, and they did not make you pay frontier prices for it: the standard tier is fifty cents per million input tokens and two-fifty per million output, roughly an order of magnitude cheaper than Opus-class pricing and often cheaper than plain API access to the models people keep comparing it against. More importantly, it is useful in the boring ways that matter; it is fast, cheap enough to leave running, and willing to grind through long agent work where a benchmark-good cheap model starts sounding plausible while quietly losing the plot.
The 262k context window is not impressive on paper anymore, especially in a world where everyone is yelling about million-token ceilings, but Composer 2.5 is better than the spec sheet makes it look. The win is behavioral: sustained tool use, decent compaction behavior, good effort calibration, and less of that cheap-model habit where it wanders off right before the boring part gets hard. I would not call it Opus tier, but for workflows with a lot of interacting pieces, where a cheaper model can waste the whole afternoon by missing one relationship three steps back, it is a real tool for the job.
So for a minute there, agentic coding was cheap again, and that is why I cared enough to chase this instead of just opening Cursor and moving on with my life. Composer 2.5 was not a model I wanted to benchmark for a day and forget about. It was one of those rare cheap models that actually changes what I am willing to run, and I wanted it in my actual harnesses, inside Claude Code, inside the terminal workflows I already use, with the local tools and MCP servers I already trust.
The catch is structural. As of right now Composer 2.5 is only available in Cursor, their desktop app and their CLI. They are not doing the whole "paste this base URL anywhere, put in your API key, and go have fun" thing. Cursor was absolutely a first mover in agentic coding, and plenty of people still love the app, but plenty of people also moved back to VS Code, or just to a terminal with Claude Code, Codex, OpenCode, Pi, pick your poison.
That is the annoying part: you can love Composer 2.5 in Cursor and still live in Claude Code for everything else. Suddenly you are paying Cursor for the subscription, then reaching for Opus or Codex pricing again whenever you leave Cursor, because Cursor's best agent is trapped in Cursor's interface. If the model were mediocre, I would not have bothered. Since it is fast, cheap, and weirdly good at the exact long-running agent work I care about, I wanted the subscription I already pay for to work where I actually code.
This is exactly the sort of gap my CLIProxyAPI fork exists to close. CLIProxyAPI is a Go proxy server with one job: take whatever pile of AI subscriptions and OAuth logins you already pay for, Claude Code, Codex, Gemini, and the rest of the circus, and expose them behind one OpenAI- or Anthropic-shaped API on your machine. Instead of every CLI inventing its own auth circus, you point your tools at localhost, pass a key your config trusts, and say which model you want.
My fork exists because I wanted more providers and a hosted route. I added Copilot OAuth back when that was the best deal around, then Railway deployment with auth bundling so you can log in locally once and hit the proxy from a laptop, iPad, or whatever happens to be in front of you. I also added pass-through models for random OpenAI- or Anthropic-compatible endpoints and first-class Chutes support, which is basically serverless compute for open-source models; the fork fetches Chutes' model list, normalizes the names, and automatically exposes explicit chutes-* routes so I do not have to hand-maintain whatever model happens to be cheap and useful this week. Apparently my hobby is collecting AI billing relationships and pretending that is infrastructure.
The next feature was obvious: Cursor Composer via your Cursor subscription and your Cursor API key. Same theme, one base URL, many providers, many tools, now with Cursor. The version I wanted was not the sketchy path where I reverse-engineer Cursor's private protobuf and OAuth dance and pretend to be their app. It was unofficial routing with official SDK runtime: let the published @cursor/sdk talk to Cursor, then make Claude Code's tools show up in the shape Composer knows how to call.
That got the model talking, but not working. Cursor's SDK can run the Composer agent loop, but Cursor's side does not accept arbitrary Claude-style tools. It accepts MCP servers. So CLIProxyAPI has to dynamically present Claude Code's local tools as fake MCP servers, keep the SDK run alive across tool calls, and route every shell command, file read, edit, grep, and local MCP call back to the machine where Claude Code already lives.
It works now. Several wrong designs, one embarrassing filesystem leak, and a few test failures later, Composer can sit inside Claude Code without turning back into a chat box that talks about edits instead of making them.