
AI-Assisted Coding

$ claude "refactor the auth module"
Reading 9 files…
auth.py — simplified token logic
tests/test_auth.py — 6 new cases added
README.md — updated
3 files changed, 94 insertions(+), 71 deletions(-)

This is my attempt to make sense of the AI-assisted coding landscape — written to educate and as a reference for anyone trying to navigate this space without the hype.

AI Coding Glossary

The Evolution of Code Assistance

Paper Manuals & Reference Books — The original developer companion. You'd thumb through language references, API docs or O'Reilly books to find the right method signature.

Online IDE Documentation & Language References — As the web matured, documentation moved online. IDEs integrated this so you could hover over a symbol and get inline docs.

Copying from Stack Overflow, GitHub & Open Source — Arguably the most impactful "tool" of the mid-2000s to 2010s developer toolkit. For coding capability, today's AI models are heavily dependent on the content of Stack Overflow and GitHub.

Early Code Completion (2000s IDEs) — Tools like early Eclipse, Visual Studio and IntelliJ began offering basic autocomplete, typically triggered by typing a dot after an object. Useful for reducing typos in method names but rarely capable of suggesting whole patterns or intent.

Autocompletion Before AI — Jedi & Rope — Python's dynamic typing made static autocompletion genuinely hard. Early tools like Rope and later Jedi tackled this with deep static analysis.

Autocompletion — Using AI — The landscape shifted significantly when GitHub launched GitHub Copilot in 2021, generating entire functions and suggesting multi-line logic from natural language comments rather than simply ranking symbol candidates.

Copying and Pasting into ChatGPT (Pre-Agentic) — When ChatGPT launched publicly in late 2022 it changed developer workflows overnight. For the first time you could have a conversation about your code, ask follow-up questions, request a refactor or say "that didn't work, here's the error."

Agentic Coding Tools — Tools like Claude Code operate at a fundamentally different level. Rather than completing tokens or ranking symbols, they understand intent across an entire codebase. You describe what you want in plain language, and they can generate, refactor, debug and explain across multiple files simultaneously. Used via the CLI, Claude Code can run commands, read outputs and iterate — behaving less like a tool and more like a junior engineer pair-programming alongside you.

The Near Future — Agents with Persistent Context — The next step is likely persistent, project-aware agents that maintain a living model of your codebase between sessions. Think less "answer my question" and more "autonomous collaborator" that files its own PRs, writes tests as code changes and flags when a new feature conflicts with an architectural decision made six months ago.

How AI Coding Assistants Work

The Spectrum of Assistance

It helps to think of AI coding assistants as sitting on a spectrum from ambient/reactive at one end to agentic at the other.

At the reactive end, tools like GitHub Copilot sit close to your cursor. They observe what you're typing, infer your intent from the surrounding code and comments, and suggest completions inline. The interaction model is passive: you write, it suggests, you accept or ignore.

At the agentic end, tools like Claude Code take a fundamentally different approach. Rather than predicting the next token of your code, they reason about goals, decompose tasks into steps, invoke tools, observe the results and decide what to do next.

How Agentic Tools Work

[Diagram: the agentic coding loop]

The engine underneath an agentic coding assistant is an LLM equipped with a set of tools it can call. When you give Claude Code a task, it doesn't just generate text — it generates decisions about what actions to take. These might include reading a file, running a shell command, searching the codebase for a pattern or editing a file. Each action produces a result that is fed back into the model's context, informing the next decision.
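The loop described above can be sketched in a few lines. This is a toy illustration, not how any particular assistant is implemented: the tool names, the shape of the model's decisions and the `llm` callable are all hypothetical stand-ins. Real tools add permission prompts, streaming, structured tool schemas and context management.

```python
# A minimal sketch of an agentic loop: the model picks an action,
# the result is fed back into its context, and it decides again.
import subprocess
from pathlib import Path

def read_file(path: str) -> str:
    """Tool: return a file's contents."""
    return Path(path).read_text()

def run_command(cmd: str) -> str:
    """Tool: run a shell command and capture its output."""
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr

TOOLS = {"read_file": read_file, "run_command": run_command}

def agent_loop(task: str, llm, max_steps: int = 10) -> str:
    """Feed each tool result back into the model until it declares done."""
    context = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        # llm is a hypothetical callable returning either
        # {"tool": ..., "args": {...}} or {"done": "summary"}
        decision = llm(context)
        if "done" in decision:
            return decision["done"]
        output = TOOLS[decision["tool"]](**decision["args"])
        context.append({"role": "tool", "content": output})
    return "step limit reached"
```

The essential property is the feedback edge: each tool result re-enters the context, so the model can react to what actually happened rather than what it predicted would happen.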

Project-specific context can be injected into this loop at initialisation. Claude Code, for instance, reads a CLAUDE.md file from your project root at the start of every session, shaping every decision it makes.

AI-Native IDEs

Tools like Cursor (cursor.com) replace the editor itself rather than operating in the terminal or as an extension. Built as a fork of VS Code, Cursor embeds AI as a first-class participant in how you navigate, write and refactor code.

The Role of the Context Window

Every LLM-based tool is fundamentally constrained by its context window. A large repository can contain millions of tokens' worth of code. The model can't read all of it at once. Agentic tools therefore need strategies for deciding what to include in the context at any moment — semantic search, LSP integration to understand code structure, and explicit user-directed context attachment (the @filename pattern).
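The selection problem can be illustrated with a deliberately naive heuristic: score each file against the query, then greedily fill a token budget. Real tools use embeddings and language-server data rather than keyword overlap, and proper tokenisers rather than word counts; this is only a sketch of the shape of the problem.

```python
# Toy context selection: rank files by keyword overlap with the query,
# then keep only what fits in a fixed token budget.
def select_context(files: dict[str, str], query: str, budget: int = 1000) -> list[str]:
    """files maps path -> contents; returns the paths chosen for the prompt."""
    words = set(query.lower().split())
    ranked = sorted(
        files,
        key=lambda p: -len(words & set(files[p].lower().split())),
    )
    chosen, used = [], 0
    for path in ranked:
        cost = len(files[path].split())  # crude stand-in for a token count
        if used + cost <= budget:
            chosen.append(path)
            used += cost
    return chosen
```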

Model Context Protocol

MCP is an open standard for connecting AI agents to external data sources. With MCP, a coding assistant isn't limited to your local files — it can be connected to your Jira board, your Slack workspace, your Google Drive or any custom internal tooling that exposes an MCP server.

This shifts the agent's effective context from "what's in this repository" to "what's in your entire development environment."
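On the wire, MCP is built on JSON-RPC 2.0: the client asks a server to invoke one of its advertised tools with a `tools/call` request. The tool name and arguments below are hypothetical, but the envelope shows the general shape of such a request.

```python
# Sketch of an MCP tools/call request (JSON-RPC 2.0 envelope).
# "search_tickets" is a hypothetical tool an MCP server might expose.
import json

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_tickets",
        "arguments": {"query": "login bug"},
    },
}
wire = json.dumps(request)  # what actually crosses the transport
```

Because the protocol is standardised, the assistant doesn't need bespoke integration code for each service: anything that speaks MCP looks like another set of callable tools.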

How to Use an AI Coding Assistant

Starting with the Right Mental Model

Context is everything. Context costs tokens. Tokens cost money.

AI coding assistants operate within a finite amount of text they can "see" at one time. The assistant doesn't remember last session's decisions. It doesn't know your project's conventions, your team's opinions or the architectural choices made six months ago — unless you tell it.

Customising Your Assistant

In Claude Code, you can create a CLAUDE.md file in the root of your project. This file is read at the start of every session and can contain anything you'd want a new developer to know before touching your codebase.
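A minimal sketch of what such a file might contain — the project details here are entirely illustrative:

```markdown
# CLAUDE.md

## Project overview
FastAPI service backed by Postgres; business logic lives in app/services/.

## Conventions
- Python 3.12, type hints everywhere, ruff for linting.
- Tests go in tests/, mirroring the package layout.

## Commands
- Run tests: pytest -q
- Start locally: docker compose up
```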

Think of it as your project's onboarding document for the AI. The difference between a session that starts with this context and one without is significant.

Custom commands and slash commands are another underused feature. Claude Code lets you define custom /commands that encapsulate common workflows like running your test suite or triggering a code review prompt against your own standards.
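In Claude Code, a custom command is a markdown file under .claude/commands/ — a file named review.md becomes available as /review. The prompt text below is illustrative:

```markdown
# .claude/commands/review.md
Review the changes on the current branch against our standards:
- Check for missing tests and unhandled error paths.
- Flag any function over 50 lines.
- Suggest docstrings where they are missing.
```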

The Token Economy: Where People Go Wrong

Planning as a First-Class Activity

Before starting a significant task, it's worth spending a few minutes writing down in plain language what you're trying to achieve, what constraints apply and what a successful outcome looks like.

Some developers maintain a suite of markdown files that together form a kind of living project memory: an ARCHITECTURE.md covering high-level design decisions, a CONVENTIONS.md for style and naming rules, a DECISIONS.md logging why certain approaches were chosen and a PROGRESS.md tracking what's done and what's next.
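An entry in such a DECISIONS.md might look like the following (the decision itself is illustrative); the point is that the "why" is captured where the assistant can read it next session:

```markdown
# DECISIONS.md

## 2025-03-10: Enforce tenant isolation in the database, not the app
Context: multi-tenant data isolation.
Decision: use Postgres row-level security rather than app-level filters.
Consequence: every new table needs an RLS policy before it ships.
```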

Asking Well

The Landscape

AI Coding Tools

Tool — Description
Claude Code — Agentic coding tool that lives in your terminal. Uses Anthropic models. Reads, edits and reasons across entire codebases.
Cursor — AI-native code editor (VS Code fork). Composer mode handles multi-file edits autonomously.
Lovable — Browser-based platform that turns natural language prompts into full-stack React/Supabase apps.
Gemini CLI — Open source terminal coding agent using Google's Gemini models, with MCP and Google Search integration.
GitHub Copilot — Integrated into VS Code, JetBrains and the GitHub ecosystem. Covers inline completions, chat, agent mode and PR summaries.
Devin — Autonomous software engineer that operates independently via Slack or a VS Code-style interface, spawning its own environment to plan, code, test and open PRs.
Aider — Open source, model-agnostic terminal coding agent. Stages git changes and writes commit messages automatically.
Cline — Open source, model-agnostic VS Code extension. Shows diffs inline and requires explicit approval before running terminal commands.

Models that perform well at coding

Several benchmarks track coding capability; the most widely cited is SWE-bench, which measures whether a model can resolve real GitHub issues end to end. Leaderboard positions shift with each model release, so check current results rather than relying on a snapshot.

Building in a Safety Net

Vibe coding and automated code generation introduce the risk of unmaintainable, non-working or insecure code. This is where automated tooling becomes not just useful but essential.

Quick to run and free: static analysis

Linting tools have been around for years, are generally free and can run in seconds. Including them in a pre-commit hook means you can't forget to run them.
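A pre-commit configuration for a Python project might look like this, using the ruff hook as an example (pin `rev` to a real release before using):

```yaml
# .pre-commit-config.yaml: run ruff on every commit
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.0   # pin to a current release
    hooks:
      - id: ruff
```

Once installed with `pre-commit install`, the linter runs on staged files automatically, so AI-generated code gets checked before it ever reaches a commit.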

Quick to run: unit testing

Unit testing is non-negotiable in modern software development, and AI tools have made high coverage levels more achievable than ever. Feed the AI your function signatures, docstrings and intent, then ask it to surface edge cases, boundary conditions and failure modes you might not have considered.
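The kind of output you want from that prompt looks like the tests below: boundary values, malformed input and the happy path. The function and cases here are illustrative, not from any particular codebase.

```python
# Edge-case tests of the kind an assistant might suggest for a small parser.
def parse_port(value: str) -> int:
    """Parse a TCP port, rejecting anything outside 1-65535."""
    port = int(value.strip())
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port

def test_valid_port_with_whitespace():
    assert parse_port(" 8080 ") == 8080

def test_zero_rejected():
    try:
        parse_port("0")
        assert False, "expected ValueError"
    except ValueError:
        pass

def test_non_numeric_rejected():
    try:
        parse_port("http")
        assert False, "expected ValueError"
    except ValueError:
        pass
```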

A few rules worth following: review every generated test before committing it, keep tests deterministic, and treat coverage numbers as a floor rather than proof of correctness.

Slower to run: AI PR reviewers

CodeRabbit is the current market leader in purpose-built AI code review. It installs as a GitHub or GitLab app, runs automatically on every pull request and leaves line-by-line comments with severity rankings. Free for open source.

GitHub Copilot Code Review was added in April 2025 and is bundled into existing Copilot subscriptions. Shallower than CodeRabbit but costs nothing extra if you're already paying for Copilot.

SAST Platforms

SonarQube / SonarCloud combines code quality and security into a single dashboard. The Community Edition is free and self-hosted.

Snyk Code uses data-flow analysis to catch things like second-order SQL injection where tainted data passes through multiple functions before hitting a sink. It also bundles dependency scanning, container scanning and IaC analysis.

Dependency security

Dependabot is free, built into GitHub and opens automatic pull requests for vulnerable dependencies. Enable it if you haven't. For more serious supply chain concerns, Socket.dev analyses the behaviour of npm and PyPI packages and can detect packages that exfiltrate data at install time.

Integration and end-to-end tests

Integration tests verify that your code works correctly with the real systems it depends on. Docker Compose and testcontainers-python make it straightforward to run a real Postgres instance, Redis cache or message queue locally or in CI.
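For the Docker Compose route, a throwaway database service for local and CI test runs can be as small as this (credentials are placeholders for a disposable test instance):

```yaml
# docker-compose.yml: a disposable Postgres for integration tests
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: test
      POSTGRES_PASSWORD: test
      POSTGRES_DB: app_test
    ports:
      - "5432:5432"
```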

For end-to-end tests, Playwright is the current tool of choice. It supports Python, JavaScript and TypeScript, runs headlessly in CI and has a codegen feature that records browser interactions and outputs test code automatically.

Written by Daniel Ball, founder of Westsmith