
Machine Learning Models

Introduction

In this article, we will demystify the term "model" and explore the evolving landscape of machine learning tools. We will also cover the difference between classical models and modern LLMs, and weigh the trade-offs between running models locally, in the cloud, or via an API.

The goal is to help you pick the right ML tool for each job, rather than plugging everything into a ChatGPT prompt. If you're a software developer, system architect or tech-curious builder wondering how to integrate ML wisely, this article is for you.

There is also a heavy focus on examples in Python. If you don't know Python, do still read on: hopefully you'll be inspired to learn it while discovering the underpinnings of modern AI.

Resources / TLDR

Communities

Tools

Learning

What is a Model?

In machine learning, a model is a mathematical construct trained to make decisions or predictions based on input data. Once trained, a model can produce predictions or new data from inputs it has never seen before.

GPT, Gemini and Claude are examples of Large Language Models (LLMs), perhaps the type of ML model best known to the general public. However, they are extreme examples, sitting at the high end of a spectrum of model complexity.

Before AI became known for chatbots and image generators, professionals who dealt with data (such as data scientists) used, and still do use, a variety of machine learning models to make predictions, detect patterns or sort information. These models were usually small, focused and trained on structured data like spreadsheets or databases.

Types of ML Models

1. Linear Models — The Straight-Line Thinker


Draws a line or curve through data points to spot trends and make predictions.
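A minimal sketch of this idea, using scikit-learn and made-up house-price numbers that follow the line price = 2 × size + 1 exactly:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data following price = 2 * size + 1 exactly (sizes in m^2)
sizes = np.array([[50], [60], [80], [100]])
prices = 2 * sizes.ravel() + 1

# Fit a straight line through the points
model = LinearRegression().fit(sizes, prices)

# Predict a size the model has never seen: 2 * 70 + 1 = 141
predicted = model.predict([[70]])[0]
```

Because the toy data is perfectly linear, the model recovers the slope (2) and intercept (1) exactly; with real, noisy data it finds the best-fitting line instead.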


2. Decision Trees — The Flowchart Brain


Asks a series of yes/no questions to make a decision.
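As a sketch, here is a tiny decision tree trained on hypothetical loan data (the features and threshold are invented for illustration):

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical applicants: [monthly_income_k, has_existing_debt]
applicants = [[20, 1], [35, 1], [50, 0], [80, 0]]
approved   = [0, 0, 1, 1]  # 1 = loan approved

# The tree learns yes/no questions (e.g. "is income above some threshold?")
tree = DecisionTreeClassifier(random_state=0).fit(applicants, approved)

# A new applicant with good income and no debt
decision = tree.predict([[60, 0]])[0]
```

You can inspect the learned questions with `sklearn.tree.export_text(tree)`, which is what makes trees so easy to explain.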


3. Random Forests — The Crowd of Flowcharts


Builds many decision trees and combines their answers to improve accuracy.
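A minimal sketch with made-up fraud data, showing many trees voting on one answer:

```python
from sklearn.ensemble import RandomForestClassifier

# Made-up transactions: [amount, is_foreign] -> 1 = fraud
transactions = [[12, 0], [25, 0], [30, 0], [870, 1], [950, 1], [990, 1]]
is_fraud     = [0, 0, 0, 1, 1, 1]

# 100 trees, each trained on a random resample of the data, vote together
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(transactions, is_fraud)

flag = forest.predict([[900, 1]])[0]  # majority vote across all trees
```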


4. Clustering Models — The Natural Group Finder


Groups similar things together without knowing the labels ahead of time.
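A sketch using k-means on two obvious (made-up) blobs of points. Note we never tell it which point belongs where, only how many groups to find:

```python
import numpy as np
from sklearn.cluster import KMeans

# Two obvious blobs of points, with no labels supplied
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                   [8.0, 8.0], [8.1, 7.9], [7.9, 8.2]])

# Ask for 2 clusters; the model discovers the grouping on its own
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
```

The first three points end up sharing one label and the last three the other, though which cluster gets which number is arbitrary.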


5. Naive Bayes — The Probability Calculator


Makes predictions based on how likely something is, given past data.
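A sketch of the classic use case, spam detection, with a handful of made-up messages. The model learns how likely each word is in spam versus not-spam ("ham"):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts  = ["win free money now", "cheap pills win a prize",
          "meeting moved to noon", "lunch with the team tomorrow"]
labels = ["spam", "spam", "ham", "ham"]

# Turn each message into word counts, then learn per-word probabilities
vec = CountVectorizer()
clf = MultinomialNB().fit(vec.fit_transform(texts), labels)

verdict = clf.predict(vec.transform(["free prize money"]))[0]
```

"free", "prize" and "money" only ever appeared in spam messages, so the probabilities point firmly to "spam".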


6. Support Vector Machines (SVMs) — The Border Drawer


Draws the best dividing line between different categories in your data.
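A sketch with two made-up clusters of 2-D points; the SVM finds the dividing line with the widest margin between them:

```python
from sklearn.svm import SVC

# Two toy categories, clearly separated in 2-D feature space
X = [[0, 0], [0, 1], [1, 0],   # category 0
     [3, 3], [3, 4], [4, 3]]   # category 1
y = [0, 0, 0, 1, 1, 1]

# A linear kernel draws a straight dividing line with the widest margin
svm = SVC(kernel="linear").fit(X, y)

side = svm.predict([[3.5, 3.5]])[0]  # which side of the line?
```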


7. Neural Networks — The Brain-Inspired Pattern Learner


Mathematical models inspired by biological neural networks, consisting of interconnected nodes organised in layers that process and transform input data.
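To make "interconnected nodes organised in layers" concrete, here is a hand-built forward pass through a tiny two-layer network in NumPy. The weights are hand-picked for illustration; in a real network, training would learn them:

```python
import numpy as np

def relu(x):
    # A common activation function: pass positives through, zero out negatives
    return np.maximum(0, x)

# Hand-picked weights for a 2-input, 2-hidden-node, 1-output network
W1 = np.array([[1.0, -1.0],
               [1.0, -1.0]])   # input layer -> hidden layer
b1 = np.array([0.0, 0.5])
W2 = np.array([[1.0],
               [1.0]])         # hidden layer -> output
b2 = np.array([-0.5])

def forward(x):
    hidden = relu(x @ W1 + b1)   # each hidden node combines all inputs
    return hidden @ W2 + b2      # the output node combines all hidden nodes

out = forward(np.array([1.0, 0.0]))
```

Each layer is just a matrix multiply plus a non-linearity; deep learning stacks many of these.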


8. Deep Learning — The Advanced Pattern Master


Deep learning refers to neural networks with many layers. These additional layers allow the network to learn increasingly complex features from data automatically. LLM models such as GPT and Gemini fall into this category.

Summary

| Model Type | Example Use Case | Can it Handle Complex Data? | Needs Lots of Data? | Easy to Understand? |
| --- | --- | --- | --- | --- |
| Linear Model | Predicting house prices | No | No | Yes |
| Decision Tree | Loan approval | Some | No | Yes |
| Random Forest | Fraud detection | Yes | Medium | Kind of |
| Clustering | Market segmentation | Some | Medium | Sometimes |
| Naive Bayes | Spam detection | No | No | Yes |
| SVM | Face detection | Yes | Medium | No |
| Neural Network | Voice or image recognition | Yes | Yes | No |
| Deep Learning (Transformers, CNNs) | Language, vision, etc. | Yes | Yes (lots) | Very hard |

Choosing the Right Model for the Job

If your data is structured (tables, numbers, categories):

Use classical ML models like Logistic Regression, Decision Trees / Random Forests or XGBoost / LightGBM.

| Pros | Cons |
| --- | --- |
| ✅ Fast | ❌ Not great for messy or unstructured input |
| ✅ Explainable | |
| ✅ Can run locally | |

If your input is text and the output is a simple label:

Use smaller NLP models (not full LLMs) like RoBERTa, DistilBERT or fastText.

| Pros | Cons |
| --- | --- |
| ✅ Lightweight and fast | ❌ Doesn't generate language, just classifies |
| ✅ More accurate than rule-based approaches | |

If you're working with images or video:

Use vision models like ResNet / MobileNet / EfficientNet (classification), YOLO / Detectron2 (object detection) or CLIP / BLIP (image + text tasks).

| Pros | Cons |
| --- | --- |
| ✅ Purpose-built and efficient | ❌ Needs labelled image data to train |
| ✅ Can run on phones or edge devices | |
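The building block behind most of these vision models is the convolution. As a rough sketch, here is a hand-rolled 2-D convolution in NumPy, sliding a tiny hand-picked kernel over a made-up "image" to detect a vertical edge (real models like ResNet learn thousands of such kernels):

```python
import numpy as np

# A 5x5 "image": left half dark (0), right half bright (1)
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# A 1x2 kernel that responds where brightness jumps from left to right
kernel = np.array([[-1.0, 1.0]])

def convolve(img, k):
    kh, kw = k.shape
    h, w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((h, w))
    # Slide the kernel over every position and sum the element-wise products
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

edges = convolve(image, kernel)  # non-zero only at the dark->bright boundary
```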

If your input is audio or speech:

Use audio models like Whisper (speech-to-text), wav2vec2 (speech recognition) or TTS models.

| Pros | Cons |
| --- | --- |
| ✅ Good open-source options available | ❌ Audio data can be large and tricky to process |
| ✅ Works well offline | |

If you need language generation, summarisation or reasoning:

Use large language models like GPT / Claude / Gemini via commercial APIs or LLaMA / Mistral / Phi as open-source options.

| Pros | Cons |
| --- | --- |
| ✅ Extremely powerful | ❌ Can be expensive |
| ✅ Very general-purpose | ❌ May hallucinate |
| | ❌ Overkill for small classification tasks |

Decision Table

| Task Type | Recommended Model Type | Example Tool |
| --- | --- | --- |
| Predict from tabular data | Decision Tree | XGBoost, LightGBM |
| Classify short texts | NLP | DistilBERT, fastText |
| Summarise/generate text | LLM | GPT, Claude, Mistral |
| Understand images | CNN | YOLO, ResNet, BLIP |
| Transcribe speech | ASR | Whisper |
| Group similar users | K-means Clustering | Scikit-learn |
| Detect sentiment in reviews | NLP | RoBERTa |
| Write SEO blog posts | LLM | GPT, Claude |

Why LLMs Aren't Always the Answer

Large Language Models are incredibly capable. They can summarise, classify, generate, reason and write code. Given that power, it's no surprise that many developers now reach for LLMs as the default tool for every ML problem.

But just because you can use an LLM doesn't mean you should.

Why Everyone's Using LLMs for Everything

The Problems with Treating LLMs as a Catch-All

A Better Approach: Use LLMs for What They're Great At

Use LLMs for language understanding, generation and reasoning. Use traditional models when you want speed, predictable output, privacy, simplicity or cost-efficiency.

A good architecture might look like:

  1. Use LLMs at the edge to route or clean messy data
  2. Pass that to a lightweight classifier or ranking model
  3. Return a response that's fast, traceable and explainable
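The three steps above can be sketched as a hybrid pipeline. Here the LLM call is a stand-in stub (a real system would call an LLM API to normalise the input), and the routing model is a lightweight scikit-learn classifier trained on tiny, made-up support tickets:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Step 1 stand-in: a placeholder for an LLM call that cleans messy input
# (hypothetical; swap in a real API call in production)
def llm_clean(text: str) -> str:
    return text.lower().strip().replace("!!!", "")

# Step 2: a lightweight classifier trained on made-up routing data
tickets = ["refund my order", "refund please", "app crashes on login",
           "login error after update"]
routes  = ["billing", "billing", "bug", "bug"]

vec = TfidfVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(tickets), routes)

def route(raw: str) -> str:
    cleaned = llm_clean(raw)                          # step 1: clean at the edge
    return clf.predict(vec.transform([cleaned]))[0]   # steps 2-3: fast, traceable
```

The expensive, unpredictable model only touches the messy edge of the system; the decision itself comes from a model you can inspect and rerun deterministically.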

How to Access and Run Models

Commercial APIs

The easiest route is to use models via commercial APIs. Anthropic's Claude, OpenAI's GPT and Google's Gemini are the main options.

Hosting Locally

Running smaller models like Phi or Gemma on a laptop is increasingly feasible via tools like Ollama or LM Studio.

Hourly Cloud Compute

Platforms like RunPod and LambdaLabs let you spin up a GPU machine by the hour.

Enterprise ML Platforms

Platforms like Amazon SageMaker and MLflow provide integrated environments for model development, deployment and management at enterprise scale.

Acquiring Models from Hugging Face

Hugging Face hosts a wide range of machine learning models, especially those built with PyTorch, TensorFlow and JAX. Models are free to download (though licences vary and some are gated), but you will need to provide the compute resources to run them.

| Model Type | Hosted on Hugging Face? | Notes |
| --- | --- | --- |
| Transformers (LLMs) | ✅ Yes | Hugging Face's core focus (e.g. GPT-style, BERT, LLaMA) |
| CNNs for vision | ✅ Yes | Models like ResNet, YOLO and CLIP |
| Audio models | ✅ Yes | Whisper, wav2vec2, TTS |
| Small/efficient LMs (SLMs) | ✅ Yes | e.g. DistilBERT, TinyLLaMA, Phi-3 |
| Embeddings / vector models | ✅ Yes | Sentence Transformers |
| Classical ML via sklearn | ⚠️ Limited | A few examples, mostly for education |
| XGBoost / LightGBM | ⚠️ Rare | Not commonly hosted |
| Rule-based or statistical models | 🚫 Not really | Usually too simple to share as models |

Written by Daniel Ball, founder of Westsmith