GitHub Copilot: The One Everyone Has But Nobody Evaluates

GitHub Copilot is the AI coding assistant with the largest installed base, the deepest distribution advantage, and the least amount of honest evaluation relative to how many developers use it. It ships in VS Code by default. Most developers who "use AI for coding" mean they have Copilot on. The honest verdict: Copilot's inline autocomplete is still the best in the industry, but the rest of the product — Chat, Workspace, the agent features — lags behind Cursor and Claude Code by a margin that matters.

What It Actually Does

Copilot's core strength has always been inline autocomplete, and in 2026 it remains the benchmark. You're typing, a gray suggestion appears, you hit Tab, and the code is there. The latency is low enough that it doesn't break your flow. The suggestions are contextually aware — they read the file you're in, the imports you've declared, the patterns in adjacent functions. For the specific task of "I know what I want to type and Copilot types it faster," nothing else matches this experience.

I tested Copilot, Cursor Tab, and Windsurf's Supercomplete side by side across a week of TypeScript development. Copilot's suggestions arrived faster and required fewer corrections. The margin wasn't enormous, but in a tool you interact with hundreds of times per day, small margins compound. Copilot's autocomplete acceptance rate — the percentage of suggestions I actually used — was roughly 35-40%, which aligns with GitHub's own published research numbers. That sounds low until you realize the cost of a rejected suggestion is a single keypress to dismiss it. The cost of an accepted suggestion is potentially dozens of keystrokes saved.

Copilot Chat is the in-editor conversational interface. You ask it to explain code, suggest fixes, write tests, or generate functions. It works. It's fine. It is not as capable as Cursor's Composer for multi-file generation, and it's not as deep as Claude Code for architectural reasoning. Chat occupies an awkward middle ground: more capable than autocomplete, less capable than the purpose-built generation tools. In practice, I used Chat most for "explain this function" and "write a unit test for this" — tasks where the scope is narrow enough that Chat's limitations don't surface.

The model situation is a constraint. Copilot runs primarily on GPT-4o with limited model selection. Cursor lets you switch between Claude, GPT, and Gemini. Claude Code runs on Claude Sonnet and Opus. Copilot's model backend is whatever GitHub decides to ship, and you have less control over which model handles your request. For most autocomplete tasks, this doesn't matter — GPT-4o is fast and good enough. For complex generation tasks, the inability to switch to Claude for better reasoning is a real limitation.

Copilot Workspace is GitHub's agent play, and it's the feature that most clearly reveals the gap between Copilot's ambitions and its current capabilities. The pitch: you write an issue description, Workspace reads your repository, proposes a plan, generates the code changes, and opens a pull request. In practice, Workspace produces usable output for straightforward, well-defined changes — "add a 404 page," "update the API response format to include timestamps," "write tests for this module." For changes that require understanding complex interactions between components, Workspace's proposals range from incomplete to confidently wrong. Per community reports on r/github and Hacker News, the common experience is that Workspace gets you 60-70% of the way on simple tasks and 20-30% of the way on complex ones. Those complex-task numbers aren't good enough to justify the workflow change.

Enterprise features are where Copilot has no real competition. Code referencing filters that flag suggestions matching public repositories. IP indemnity from Microsoft. Organization-wide policy controls. Audit logs. SSO integration. If you're making a purchasing decision for a company with more than 50 developers, Copilot Enterprise is the only option that doesn't require your security team to write a novel-length exception document. This isn't a technical advantage — it's a procurement advantage, and in enterprise software, procurement advantages are real advantages.

What The Demo Makes You Think

GitHub's Copilot demos emphasize Workspace heavily, because Workspace is the feature that differentiates Copilot from "really good autocomplete." The demos show Workspace reading an issue, understanding the codebase, and generating a clean PR. What they show is real — it can do that, on certain issues, in certain repositories. What the demos don't convey is how narrow that capability window is.

The Workspace demos use repositories with clear structure, comprehensive README files, and well-defined issues. A real-world monorepo with implicit conventions, undocumented architectural decisions, and issues that say "fix the login thing, you know what I mean" produces very different results. Workspace doesn't understand your team's unwritten rules. It doesn't know that the data access layer uses a particular pattern because of a database migration three years ago that nobody documented. It generates code that is technically correct and structurally wrong.

There's a subtler perception gap around autocomplete itself. Because Copilot's autocomplete is so smooth, it creates the impression that you're coding faster than you actually are. You are typing faster — that part is real. But typing speed was never the bottleneck in software development. Reading code, understanding requirements, making design decisions, debugging — these tasks dominate a developer's day, and Copilot's autocomplete doesn't touch them. The productivity gain from autocomplete is real but smaller than it feels. GitHub's published studies suggest a 25-30% reduction in task completion time for specific coding tasks. That's valuable. It's not the revolution the marketing implies.

The pricing structure reinforces the perception that Copilot is a default choice rather than an evaluated one. At $10/month for Individual, it's cheap enough that most developers subscribe without calculating whether the value exceeds the cost. At $19/month for Business and $39/month for Enterprise, the per-seat pricing scales linearly while the value proposition shifts from individual productivity to organizational compliance features. Many teams pay for Copilot Business because it's the path of least resistance, not because they've evaluated it against alternatives.
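At the organizational level, "linear per-seat pricing" compounds quickly. A quick sketch using the list prices from the text and a hypothetical 50-seat team:

```typescript
// Annual cost per seat and per team at each tier (list prices from the text).
const tiers: Record<string, number> = {
  Individual: 10,
  Business: 19,
  Enterprise: 39,
}; // USD per seat per month
const seats = 50; // hypothetical team size

for (const [name, monthly] of Object.entries(tiers)) {
  const perSeatYear = monthly * 12;
  console.log(
    `${name}: $${perSeatYear}/seat/yr, $${perSeatYear * seats}/yr for ${seats} seats`
  );
}
// Business at 50 seats: $11,400/yr — enough to justify an actual evaluation
```

A five-figure annual line item is past the threshold where "path of least resistance" should be the deciding factor.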

What's Coming (And Whether To Wait)

GitHub is investing heavily in closing the gap with Cursor on generation and agent capabilities. Workspace is shipping updates regularly. The integration with GitHub's broader platform — Issues, PRs, Actions, code review — gives Copilot a surface area that standalone tools can't match. If Workspace reaches the point where it reliably handles medium-complexity issues, the workflow of "write an issue, get a PR, review it" becomes genuinely powerful.

The model question hangs over everything. Microsoft's relationship with OpenAI gives Copilot access to GPT models, but the trend in AI coding has been toward Claude for reasoning-heavy tasks. Whether GitHub opens Copilot to Claude or remains GPT-only will meaningfully affect its competitive position on generation quality. As of this writing, GitHub has made no public commitment either way.

The leapfrog risk cuts both ways for Copilot, and both ways it is low. It won't become the best generation tool overnight, but it also won't lose its autocomplete advantage or its enterprise distribution. Copilot's position is stable in a way that smaller competitors' positions are not.

Should you wait for improvements before subscribing? No — but also, maybe don't subscribe at all if you're already paying for Cursor. Copilot's autocomplete advantage over Cursor Tab is real but small. If you're already in Cursor's ecosystem, adding Copilot creates autocomplete conflicts (they fight over Tab suggestions) and doesn't provide enough incremental value to justify the cost. If you're in standard VS Code and don't want to switch editors, Copilot is the obvious choice and the current version delivers on its core promise.
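If you do end up with both installed, the conflict is resolvable by turning off one side's inline suggestions. A minimal sketch for settings.json, assuming the setting names current in recent VS Code releases (check your version's Copilot extension docs if they've changed):

```jsonc
// settings.json — disable Copilot inline completions so they don't
// compete with another tool's Tab suggestions.
{
  "github.copilot.enable": {
    "*": false // turn off Copilot completions for all languages
  }
}
```

The per-language map also lets you keep Copilot on for some file types and off for others, which is a reasonable compromise during an evaluation period.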

The Verdict

GitHub Copilot is the right tool for two groups: developers who want the best autocomplete experience without changing their editor, and enterprises that need compliance-grade AI coding tools. For the first group, the $10/month Individual tier delivers clear value. For the second group, Enterprise at $39/month is the only game in town.

Copilot is not the right tool for developers who want the best AI code generation available. Cursor's Composer is better for multi-file generation. Claude Code is better for complex refactoring. The gap between Copilot and these tools on generation tasks is wide enough to matter for daily work.

The distribution advantage is Copilot's moat and its trap. Because it's already installed, most developers never evaluate whether they should be using something else. The autocomplete is good enough to feel productive. The generation features are mediocre enough to make you think that's all AI coding can do. If you're using Copilot and haven't tried Cursor or Claude Code, you don't know what you're missing. Whether what you're missing is worth the additional cost and workflow change is a personal calculation — but you should at least make that calculation instead of assuming the default is the best.


Updated March 2026. This article is part of the Code Generation & Vibe Coding series at CustomClanker.