Machine Learning Models

Introduction

In this article, we will demystify the term model and explore the evolving landscape of machine learning tools. We will also cover the difference between classical models and modern LLMs and show you the tradeoffs between running models locally, in the cloud or via an API.

Picking the right ML tool for the job is where we want to get to at the end, instead of plugging everything into a ChatGPT terminal. If you're a software developer, system architect or tech-curious builder wondering how to integrate ML wisely, this is for you.

There is also a heavy focus on examples in Python. If you don't know Python, do still read on. Hopefully you'll get inspired to learn it as well as discovering the underpinnings of modern AI.

Resources / TLDR

Communities

Kaggle — The largest AI and ML community
Hugging Face — The platform where the machine learning community collaborates on models, datasets and applications

Tools

Open Router Models — Access any hosted LLM model from one interface
Github Models — Find and experiment with AI models for free
Jupyter — ML engineers and data scientists use this to write Python code that interfaces with ML models
PyTorch, TensorFlow and JAX — The backbone of modern AI
scikit-learn — Python library for building most types of ML model

Learning

W3Schools Machine Learning course
Google Machine Learning Education
StatQuest YouTube channel — The best beginner-friendly ML YouTube channel
Free ML courses from Harvard, IBM and FreeCodeCamp

What is a Model?

In Machine Learning, a model refers to a mathematical construct trained to make decisions or predictions based on input data. Once you've trained or created a model, you can use it to create new data or predictions from an input it's never seen before.

GPT, Gemini and Claude are examples of Large Language Models and are perhaps the most well-known type of ML model by the general public. However, these are very extreme examples and represent the high end in a spectrum of model complexity.

Before AI became known for chatbots and image generators, professionals who dealt with data (such as data scientists) used, and still do use, a variety of machine learning models to make predictions, detect patterns or sort information. These models were usually small, focused and trained on structured data like spreadsheets or databases.

Types of ML Models

1. Linear Models — The Straight-Line Thinker

Linear Model

Draws a line or curve through data points to spot trends and make predictions.

Used for: Forecasting sales, predicting prices
Strength: Simple, fast, easy to interpret
Weakness: Can't handle complex relationships

2. Decision Trees — The Flowchart Brain

Decision Tree

Asks a series of yes/no questions to make a decision.

Used for: Loan approval, medical diagnoses
Strength: Easy to understand and explain
Weakness: Can overfit and make decisions that don't generalise

3. Random Forests — The Crowd of Flowcharts

Random Forest

Builds many decision trees and combines their answers to improve accuracy.

Used for: Risk scoring, product recommendations
Strength: More accurate and robust
Weakness: Harder to explain decisions

4. Clustering Models — The Natural Group Finder

Clustering Models

Groups similar things together without knowing the labels ahead of time.

Used for: Customer segments, user behaviour patterns
Strength: Great for discovery
Weakness: Can be sensitive to noise or unclear groups

5. Naive Bayes — The Probability Calculator

Bayes

Makes predictions based on how likely something is, given past data.

Used for: Spam filters, topic classification
Strength: Very fast
Weakness: Can oversimplify complex problems

6. Support Vector Machines (SVMs) — The Border Drawer

SVM

Draws the best dividing line between different categories in your data.

Used for: Image classification, face detection
Strength: Precise with clean data
Weakness: Not great with lots of messy or overlapping data

7. Neural Networks — The Brain-Inspired Pattern Learner

Neural Network

Mathematical models inspired by biological neural networks, consisting of interconnected nodes organised in layers that process and transform input data.

Used for: Pattern recognition, classification, prediction
Strength: Can learn complex relationships
Weakness: Need careful tuning, can be unstable

8. Deep Learning — The Advanced Pattern Master

Deep Learning

Deep learning refers to neural networks with many layers. These additional layers allow the network to learn increasingly complex features from data automatically. LLM models such as GPT and Gemini fall into this category.

Used for: Computer vision, language models, speech recognition
Strength: Learns complex patterns automatically, state-of-the-art performance
Weakness: Needs massive data/compute, complex to train, black box behaviour

Summary

Model Type	Example Use Case	Can it Handle Complex Data?	Needs Lots of Data?	Easy to Understand?
Linear Model	Predicting house prices	No	No	Yes
Decision Tree	Loan approval	Some	No	Yes
Random Forest	Fraud detection	Yes	Medium	Kind of
Clustering	Market segmentation	Some	Medium	Sometimes
Naive Bayes	Spam detection	No	No	Yes
SVM	Face detection	Yes	Medium	No
Neural Network	Voice or image recognition	Yes	Yes	No
Deep Learning (Transformers, CNNs)	Language, vision, etc.	Yes	Yes (lots)	Very hard

Choosing the Right Model for the Job

If your data is structured (tables, numbers, categories):

Use classical ML models like Logistic Regression, Decision Trees / Random Forests or XGBoost / LightGBM.

Predicting churn from customer data
Scoring leads in a CRM
Classifying transactions as fraud or not

Pros	Cons
✅ Fast	❌ Not great for messy or unstructured input
✅ Explainable
✅ Can run locally

If your input is text and the output is a simple label:

Use smaller NLP models (not full LLMs) like RoBERTa, DistilBERT or fastText.

Categorising support tickets
Sentiment analysis
Spam detection

Pros	Cons
✅ Lightweight and fast	❌ Doesn't generate language, just classifies
✅ More accurate than rule-based approaches

If you're working with images or video:

Use vision models like ResNet / MobileNet / EfficientNet (classification), YOLO / Detectron2 (object detection) or CLIP / BLIP (image + text tasks).

Classifying product images
Detecting objects in video feeds
Matching images to text descriptions

Pros	Cons
✅ Purpose-built and efficient	❌ Needs labelled image data to train
✅ Can run on phones or edge devices

If your input is audio or speech:

Use audio models like Whisper (speech-to-text), wav2vec2 (speech recognition) or TTS models.

Transcribing calls or meetings
Building voice-controlled interfaces
Generating spoken output from text

Pros	Cons
✅ Good open-source options available	❌ Audio data can be large and tricky to process
✅ Works well offline

If you need language generation, summarisation or reasoning:

Use large language models like GPT / Claude / Gemini via commercial APIs or LLaMA / Mistral / Phi as open-source options.

Summarising a legal document
Explaining code
Writing email drafts or documentation
Chatbots with memory and logic

Pros	Cons
✅ Extremely powerful	❌ Can be expensive
✅ Very general-purpose	❌ May hallucinate
	❌ Overkill for small classification tasks

Decision Table

Task Type	Recommended Model Type	Example Tool
Predict from tabular data	Decision Tree	XGBoost, LightGBM
Classify short texts	NLP	DistilBERT, fastText
Summarise/generate text	LLM	GPT, Claude, Mistral
Understand images	CNN	YOLO, ResNet, BLIP
Transcribe speech	ASR	Whisper
Group similar users	K-means Clustering	Scikit-learn
Detect sentiment in reviews	NLP	RoBERTa
Write SEO blog posts	LLM	GPT, Claude

Why LLMs Aren't Always the Answer

Large Language Models are incredibly capable. They can summarise, classify, generate, reason and write code. Given that power, it's no surprise that many developers now reach for LLMs as the default tool for every ML problem.

But just because you can use an LLM doesn't mean you should.

Why Everyone's Using LLMs for Everything

Low barrier to entry — You don't need to collect data, train anything or understand ML theory. Just write a prompt and get results.
One tool for many tasks — You can classify sentiment, summarise articles, translate languages and chat, all from the same API.
Faster prototyping — Especially for startups and small teams, LLMs let you get a working product today.

The Problems with Treating LLMs as a Catch-All

Wasteful overhead — You're using a billion-parameter model to do what a 5MB model could have done. Classifying tweets as positive or negative? A fine-tuned BERT could do it faster and cheaper.
Scaling costs — An LLM call might cost fractions of a cent, but multiply that by millions of users and you're bleeding money. Traditional models are nearly free to run once deployed.
Latency — Even the fastest LLMs are slower than traditional models. A call to a hosted LLM takes 200ms–1s+. A local scikit-learn model returns results in milliseconds.
Loss of specialisation — LLMs are generalists. A fine-tuned fraud detection model trained on your data will almost always beat an LLM trying to "reason" its way to a result.
Skills atrophy — When LLMs become a catch-all, developers stop learning about classical ML, statistics and feature engineering. That's dangerous in regulated, high-stakes or performance-sensitive environments.

A Better Approach: Use LLMs for What They're Great At

Use LLMs for language understanding, generation and reasoning. Use traditional models when you want speed, predictable output, privacy, simplicity or cost-efficiency.

A good architecture might look like:

Use LLMs at the edge to route or clean messy data
Pass that to a lightweight classifier or ranking model
Return a response that's fast, traceable and explainable

How to Access and Run Models

Commercial APIs

The easiest route is to use models via commercial APIs. Anthropic's Claude, OpenAI's GPT and Google's Gemini are the main options.

✅ Fastest time to value
✅ Extremely powerful models
⚠️ Black-box
⚠️ Pay-per-use, costs can scale fast

Hosting Locally

Running smaller models like Phi or Gemma on a laptop is increasingly feasible via tools like Ollama or LM Studio.

✅ Full privacy
✅ Free after setup
⚠️ Limited power

Hourly Cloud Compute

Platforms like RunPod and LambdaLabs let you spin up a GPU machine by the hour.

✅ On-demand power
✅ No hardware investment
⚠️ Pay-as-you-go can become expensive

Enterprise ML Platforms

Platforms like Amazon SageMaker and MLFlow provide integrated environments for model development, deployment and management at enterprise scale.

✅ End-to-end ML workflow management
✅ Built-in security and governance
⚠️ Requires enterprise licensing

Acquiring Models from Hugging Face

Hugging Face hosts a wide range of machine learning models, especially those built with PyTorch, TensorFlow and JAX. All models are free or open source, but you will need to provide the compute resource to run them.

Model Type	Hosted on Hugging Face?	Notes
Transformers (LLMs)	✅ Yes	Hugging Face's core focus (e.g. GPT-style, BERT, LLaMA)
CNNs for vision	✅ Yes	Models like ResNet, YOLO and CLIP
Audio models	✅ Yes	Whisper, wav2vec2, TTS
Small/efficient LMs (SLMs)	✅ Yes	e.g. DistilBERT, TinyLLaMA, Phi-3
Embeddings / vector models	✅ Yes	Sentence Transformers
Classical ML via sklearn	⚠️ Limited	A few examples, mostly for education
XGBoost / LightGBM	⚠️ Rare	Not commonly hosted
Rule-based or statistical models	🚫 Not really	Usually too simple to share as models

Written by Daniel Ball, founder of Westsmith