AI Agents: What They Are and How to Build Them

What follows is my attempt to explain what is meant by the term "AI agent", list out some examples and resources and separate the buzzword from the facts.

For simplicity, this document only refers to LLM-powered agents using technology and concepts from 2023 onwards. The real history and facts about AI agents goes back decades and incorporates a lot of different underlying technologies and techniques (I had to draw the line somewhere).

Basic definition of an AI agent

An AI agent is an autonomous program designed to perform specific tasks by understanding its environment, reasoning about it, and taking actions to achieve a goal. AI agents often have roles to define their purpose and responsibilities in a multi-agent system or as part of a larger application.

At the core of any AI Agent is a Large Language Model. This enables the agent to converse in natural language and transform between unstructured and structured text.

Here are some examples of what an AI agent or Agentic-based system will do:

A personal assistant bot that checks the weather or books a meeting
A web-crawling bot gathering information for market research
A financial bot deciding on stock investments based on historical data

Other terms in the context of AI agents

Role - Defines the identity or perspective the AI adopts in the interaction.
Instructions - Guidelines or constraints provided to the AI to shape its behavior or responses.
Message - Content exchanged between the AI and the user or system during the interaction
Agentic - Means a system can operate independently, making decisions based on data analysis and predefined goals
Model - The underlying LLM used by the agent
No Code - No Code agent systems use drag and drop plus text boxes for configuration instead of code (usually Python). Relevance AI would be an example of such a system
RAG - Retrieval-Augmented Generation. A technique for enhancing the accuracy and reliability of generative AI models with information from specific and relevant data sources
ReAct - Reasoning and Action
Multi-agent - A system using multiple specialised agents to collaborate on a desired outcome such as summarising today's news into an email
Task-Oriented Agent - Executes specific tasks based on given inputs, like fetching data, summarizing content, or scheduling events
Collaborative Agent - Works with other agents or users to solve problems or complete objectives.
Exploratory Agent - Collects and analyzes data to identify trends or new insights.
Decision-Making Agent - Uses reasoning or ML models to make informed decisions.
Interactive Agent - Communicates with users or systems, often using natural language.

What is the difference between a chatbot and an agent?

Where chatbots largely follow rules-based dialogues and are limited to answering predefined questions, AI agents can reason and ground answers in relevant knowledge and content. Chatbots, unlike agents, need extensive training on hundreds of conversations to be able to understand natural-language requests, making agents significantly quicker and easier to implement and launch.

AI agents can understand context, learn from interactions, and adapt their behavior to achieve specific goals. Unlike simpler systems, AI agents can handle ambiguity, make autonomous decisions, and execute multi-step plans to solve complex problems, making them suitable for more challenging and open-ended tasks.

No Code vs Code

No-code AI systems are ideal for users without programming expertise, offering intuitive drag-and-drop interfaces and pre-built components to create AI solutions quickly. They reduce development time and are great for prototyping or simple applications like chatbots or data analysis. However, they can be limited in customization, flexibility, and scalability for complex or unique use cases. In contrast, coded AI systems provide full control, customization, and the ability to optimize performance, making them suitable for advanced or highly specific requirements. While they require programming skills and more development time, coded solutions are more adaptable and capable of handling complex tasks effectively.

Python-configurable agent systems

List of agent systems that can be configured with Python (other language interfaces may be available).

System name	Official docs	Code samples
Phidata	Phidata introduction	Phidata cookbook
OpenAI Swarm	OpenAI Swarm docs	OpenAI Swarm samples
LangGraph	LangGraph docs	LangGraph samples
Microsoft Autogen	Autogen docs	Autogen samples
CrewAI	CrewAI docs	CrewAI Code samples
Vertex AI	Vertex AI Code docs	Vertex AI Code samples
Langflow	Langflow docs	Langflow repo

No Code agent systems

List of No-Code agent systems. These don't require coding skills but may be more expensive to run, less flexible and in some cases slower to execute. However, you may end up with a good-enough solution with a lot less effort, especially for individual or small scale deployment.

Do you need a framework to build AI agents?

No, in fact you may find that you are better off not using a framework. This is because you are relying on someone else's code to perform actions under the cover like passing data between agents. However, the concepts used in agent frameworks are useful to know, and it is worth familiarising yourself with the terminology and ways of working prescribed by frameworks.

A framework is a quick way to get you up and running though, so for a small project such as a personal assistant, it may make sense rather than coding your own framework.

This article describes how you set up a framework from scratch.

AI Agent approach vs traditional coding approach

For a basic use case, traditional procedural Python code is often the better choice due to its simplicity, speed of implementation, and ease of debugging. Tasks like processing data files, automating repetitive actions, or performing straightforward calculations are efficiently handled with well-defined, step-by-step code. It's cost-effective and ideal for problems with clear rules and predictable inputs.

An AI agent system, while powerful, is unnecessary for most basic tasks. It's better suited for complex, dynamic scenarios requiring adaptability, such as natural language processing or pattern recognition. AI introduces added complexity, requiring more time, resources, and expertise. For simple use cases, the overhead of an AI system outweighs its benefits.

Local hosted vs cloud hosted

For implementing an AI agent system, locally hosted solutions are ideal when data privacy, control, and low latency are priorities. Hosting locally gives you full control over the environment, allowing customization and operation without reliance on external networks. This approach is cost-effective for small-scale setups with minimal infrastructure needs, and it avoids ongoing subscription fees. However, it may require higher upfront hardware investment, and scaling up for large workloads can be challenging.

In contrast, cloud-hosted solutions excel in scalability, ease of setup, and access to powerful resources like GPUs or TPUs. They're ideal for dynamic or large-scale applications where rapid scaling or collaboration is required. Cloud providers handle maintenance, updates, and high availability, saving time and effort. However, recurring costs can add up, and reliance on internet connectivity and third-party providers can introduce security concerns and latency, depending on the use case.

Resources

Written by Daniel Ball, founder of Westsmith