Understanding AI
Module 5 · ~20 min

Machine Learning in Plain English

How machines learn from examples — no math required, just pattern recognition you already understand.

The doctor who saved thousands of lives — without examining a single patient

In 2017, a team at Stanford built a system that performed comparably to a panel of board-certified dermatologists on a held-out test set (Esteva et al., Nature, 2017) — results reflect a specific benchmark, not all dermatological conditions. Nobody programmed rules like "if the mole is brown and asymmetric, it's melanoma." Instead, they showed the system 130,000 photos of skin conditions — each labelled by a doctor as "cancer" or "not cancer" — and the system figured out the rules on its own.

That's machine learning. Not programmed rules. Not magic. A system that improves by looking at examples.

You already do this every day. When you were a toddler, nobody gave you a rulebook for identifying dogs. You saw hundreds of dogs — big ones, small ones, fluffy ones, scraggly ones — and your brain built its own internal "dog detector." That's exactly what machine learning does, just with math instead of neurons.

What "learning" actually means for a machine

When humans say a machine "learns," it sounds like the computer is sitting in a classroom taking notes. It's not. Here's what's really happening:

Learning = adjusting based on feedback.

Imagine you're throwing darts blindfolded. After each throw, someone tells you "too far left" or "a bit high." You adjust your next throw. After 100 throws, you're landing near the bullseye — not because someone gave you a formula for throwing, but because you adjusted based on feedback, again and again.

That's machine learning in four words: predict, check, adjust, repeat.

The machine starts with random guesses. Each time it gets feedback ("wrong!" or "close!"), it tweaks its internal settings — tiny numerical dials called parameters — to get a little closer to the right answer. After thousands or millions of rounds, those dials are tuned well enough that the system gives good answers on examples it's never seen before.
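
The whole loop fits in a few lines. Here's a toy sketch (not any production algorithm): a machine with one dial must discover a hidden rule — multiply by 2 — purely from feedback on its guesses. All values are invented for illustration.

```python
# Toy "one-dial machine": learn a hidden multiplier purely from feedback.
# The hidden rule, inputs, and learning rate are all illustrative.
hidden_multiplier = 2.0   # the pattern the machine must discover
dial = 0.0                # the machine's single parameter, starting badly
learning_rate = 0.1       # how big each adjustment is

for step in range(100):
    x = [1.0, 2.0, 3.0][step % 3]                 # an example input
    prediction = dial * x                         # 1. predict
    error = hidden_multiplier * x - prediction    # 2. check: how wrong?
    dial += learning_rate * error / x             # 3. adjust the dial
                                                  # 4. repeat
print(round(dial, 3))  # → 2.0: the dial has been tuned to the hidden pattern
```

After 100 rounds of feedback, the dial lands on the hidden rule — nobody ever told the machine "multiply by 2."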

💭 You're Probably Wondering…

There Are No Dumb Questions

"If machine learning is just adjusting dials, why is everyone making such a big deal about it?"

Because the number of dials is enormous. Modern AI systems have billions of parameters. No human could tune that many dials by hand. The breakthrough isn't the concept (adjust based on feedback) — it's the scale. When you have enough data and enough dials, the system starts finding patterns humans never could.

"Does the machine 'understand' what it's learning?"

No — not the way you understand things. It finds statistical patterns. A spam detector doesn't understand what spam is. It notices that emails containing "free money" and "click now" are usually labelled as spam, so it learns to flag emails with those patterns. Understanding vs. pattern-matching is one of the biggest debates in AI today.

Supervised learning: the flashcard method

Remember studying with flashcards? One side has the question, the other has the answer. You look at the question, guess the answer, flip the card to check, and adjust your thinking for next time.

Supervised learning works exactly like flashcards. You give the machine a set of examples where you already know the right answers (the "labels"), and the machine uses that feedback to learn patterns.

| Flashcard analogy | Supervised learning term |
|---|---|
| The question on the front | Input data (features) |
| The answer on the back | Label (target) |
| Your stack of flashcards | Training dataset |
| You checking your answer | Loss calculation (how wrong was I?) |
| You adjusting your thinking | Parameter update |

Real examples of supervised learning:

  • Email spam filter: Thousands of emails, each labelled "spam" or "not spam." The model learns which words, patterns, and senders predict spam.
  • House price prediction: Thousands of houses with features (square footage, neighbourhood, bedrooms) and known sale prices. The model learns which features drive the price up or down.
  • Medical diagnosis: Thousands of X-rays labelled "fracture" or "no fracture" by radiologists. The model learns what fracture patterns look like.

The key requirement: you need labelled data. Someone (usually a human) has to provide the correct answer for every training example. That's often the hardest and most expensive part of the whole process.
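
The flashcard idea can be sketched in code with a deliberately naive spam filter that "trains" by counting which words appear under each label. The emails and labels below are invented; real filters learn from thousands of examples with far more sophisticated models.

```python
# A deliberately naive supervised "spam filter": learn from labelled
# flashcard-style examples by counting words. All emails are invented.
from collections import Counter

training = [
    ("free money click now", "spam"),
    ("meeting moved to 3pm", "not spam"),
    ("click now to win free prize", "spam"),
    ("lunch tomorrow?", "not spam"),
]

# "Training": count how often each word appears under each label.
word_counts = {"spam": Counter(), "not spam": Counter()}
for text, label in training:
    word_counts[label].update(text.split())

def predict(text):
    # Score each label by how familiar the email's words are under it.
    scores = {label: sum(counts[w] for w in text.split())
              for label, counts in word_counts.items()}
    return max(scores, key=scores.get)

print(predict("free prize click"))  # → spam
print(predict("meeting at lunch"))  # → not spam
```

Notice that the labels did all the work: the only "knowledge" in the system came from a human marking each training email as spam or not.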

⚡

Supervised or Not? (25 XP)

For each scenario, answer yes (supervised) or no (not supervised):

  1. You give a model 10,000 movie reviews, each rated 1-5 stars by the reviewer. The model learns to predict star ratings for new reviews.
  2. You give a model 50,000 customer purchase records with no labels. The model groups customers into clusters based on buying patterns.
  3. You show a model 100,000 photos of cats and dogs, each labelled "cat" or "dog." The model learns to classify new photos.
  4. A robot learns to walk by trying random movements and getting a score based on how far it travels.

Unsupervised learning: sorting a messy drawer

You open your junk drawer. It's chaos — batteries, pens, receipts, rubber bands, paper clips, keys. Nobody told you the categories. Nobody labelled anything. But within minutes, you've sorted everything into natural groups: "writing stuff," "fasteners," "paper things," "electronics."

That's unsupervised learning. The machine gets data with no labels — no right answers — and finds patterns, groups, and structure on its own.

| Supervised learning | Unsupervised learning |
|---|---|
| Flashcards with answers on the back | Sorting a drawer with no labels |
| "Here's the right answer, learn from it" | "Here's a pile of data, find the patterns" |
| Needs labelled data (expensive!) | Works with unlabelled data |
| Predicts specific outcomes | Discovers hidden structure |
| Email spam detection, price prediction | Customer segmentation, anomaly detection |

Supervised learning

  • You label the training data
  • Model learns: input → known output
  • Use when you know what you want
  • Examples: spam filter, image classification

Unsupervised learning

  • No labels needed
  • Model discovers structure itself
  • Use when you don't know categories yet
  • Examples: customer segmentation, anomaly detection

Real examples of unsupervised learning:

  • Customer segmentation: A retailer feeds purchase data into a model. Nobody says "this is a budget shopper" or "this is a luxury buyer." The model discovers natural clusters: "people who buy organic food and yoga mats" vs. "people who buy bulk items and diapers."
  • Anomaly detection: A bank feeds millions of normal transactions into a model. The model learns what "normal" looks like. Anything that doesn't fit the pattern gets flagged as potentially fraudulent — without anyone ever labelling individual transactions as "fraud."
  • Topic discovery: You dump 100,000 news articles into a model. Nobody labels the topics. The model discovers that certain words cluster together: "election," "candidate," "polls" form one cluster; "touchdown," "quarterback," "playoff" form another.
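
One classic unsupervised algorithm, k-means, fits in plain Python. It alternates between assigning each point to its nearest cluster centre and moving each centre to the average of its points. The customer data (monthly spend, visits per month) and the choice of k=2 below are invented for illustration.

```python
# Minimal k-means sketch: grouping customers by (monthly spend, visits)
# with NO labels. The data and k=2 are illustrative assumptions.
customers = [(20, 2), (25, 3), (22, 2),      # one natural group...
             (90, 12), (95, 15), (88, 11)]   # ...and another

def kmeans(points, k=2, steps=10):
    centroids = points[:k]                   # naive init: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(steps):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: (p[0] - centroids[i][0]) ** 2
                                                + (p[1] - centroids[i][1]) ** 2)
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [(sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
                     if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters

low, high = kmeans(customers)
print(low)   # the three low-spend customers found each other
print(high)  # so did the three high-spend customers
```

Nobody told the algorithm "budget shopper" or "big spender" — the groups emerged from the structure of the data itself.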

💭 You're Probably Wondering…

There Are No Dumb Questions

"If unsupervised learning doesn't need labels, why would anyone bother with supervised learning?"

Because supervised learning is far more precise when you need a specific answer. If you need to know "is this email spam?" — that's a yes/no question with a definite right answer. Supervised learning nails that. Unsupervised learning would just group emails into clusters and say "these emails look similar" — which isn't the same as telling you which ones are spam.

"Is there anything in between?"

Yes! Semi-supervised learning uses a small amount of labelled data and a large amount of unlabelled data. Imagine you label 100 emails as spam/not-spam, then let the model figure out the remaining 10,000 by using patterns from those 100 examples. It's a practical middle ground when labelling is expensive.

⚡

Pick the Right Approach (25 XP)

For each scenario, choose supervised or unsupervised:

  1. A streaming service wants to group its movies into genre clusters based on viewer behavior (not the studio's genre tags).
  2. A hospital wants to predict whether a patient will be readmitted within 30 days (they have 5 years of historical admission data with outcomes).
  3. A cybersecurity firm wants to detect unusual network traffic that doesn't match any known attack pattern.

Reinforcement learning: learning through trial and error

Think back to the first time you played a video game you'd never seen before. Nobody explained the rules. You pressed buttons, watched what happened, and gradually learned: "jumping on that thing gave me points, falling in the lava ended the level." After an hour, you were good at the game — not because someone taught you, but because you tried things, got feedback, and adjusted.

That's reinforcement learning (RL). The model — called an agent — takes actions in an environment, receives a reward signal (positive or negative), and learns to maximise reward over time.

| RL term | What it means | Game analogy |
|---|---|---|
| Agent | The model making decisions | The player |
| Environment | The world the agent acts in | The game world |
| Action | A choice the agent can make | Jump, run, shoot |
| Reward | Feedback on how good the action was | Points gained or health lost |
| Policy | The strategy the agent develops | "Always jump over the red things" |

Real examples of reinforcement learning:

  • AlphaGo: DeepMind's system played millions of Go games against itself, rewarded for wins and penalised for losses. It discovered strategies no human player had considered in 2,500 years of the game (Silver et al., Nature, 2016).
  • Robot arms: Warehouse robots learn to pick objects by trying different grip angles — rewarded when the object is picked up cleanly, penalised when it drops.
  • Ad bidding: Ad platforms learn which bid to place for each impression by observing which bids led to conversions, updating in real time.

💭 You're Probably Wondering…

There Are No Dumb Questions

"How is RL different from supervised learning — both involve feedback?"

In supervised learning, you get feedback on every example AND you already know the right answer: "you predicted $280K; it should have been $350K." In reinforcement learning, there often is no right answer to copy — just an outcome that was good or bad, sometimes long after the action. A chess agent doesn't learn from a teacher's answer key; it plays out the whole game and then works backward to figure out which moves led to the win or loss.

The key requirement is a reward signal — a clear definition of what "good" and "bad" outcomes look like. RL is powerful but harder to set up than the other two, which is why it's reserved for specific problems: games, robotics, and sequential decision-making.
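
Here is a minimal sketch of one classic RL algorithm, Q-learning, on an invented toy problem: an agent in a five-cell corridor is rewarded for reaching the rightmost cell and pays a small cost for every step. The rewards and hyperparameters are illustrative, not tuned.

```python
# Minimal Q-learning sketch: an agent learns to walk right down a
# five-cell corridor. States, rewards, and hyperparameters are invented.
import random

random.seed(0)
n_states = 5                  # cells 0..4; cell 4 is the goal
actions = (-1, +1)            # step left or step right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.2  # learning rate, discount, exploration

for episode in range(200):
    s = 0
    while s != n_states - 1:
        # Explore occasionally; otherwise exploit the best-known action.
        if random.random() < epsilon:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), n_states - 1)
        reward = 1.0 if s_next == n_states - 1 else -0.1  # goal pays, steps cost
        best_next = max(Q[(s_next, b)] for b in actions)
        # Q-learning update: nudge this value toward reward + discounted future.
        Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
        s = s_next

# The learned policy: the best-known action in each non-goal state.
policy = [max(actions, key=lambda act: Q[(s, act)]) for s in range(n_states - 1)]
print(policy)  # the agent has learned to always step right
```

No one ever tells the agent "go right" — the policy emerges from hundreds of episodes of rewards and penalties, exactly the trial-and-error loop described above.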

The training loop: how a machine actually learns

Let's zoom in on that predict-check-adjust-repeat cycle with a concrete example. Say you're building a model to predict house prices.

After seeing thousands of houses, the model's internal dials are tuned so well that it can predict reasonable prices for houses it's never seen. That's the whole trick.

Key terms in the training loop:

| Term | What it means | Analogy |
|---|---|---|
| Epoch | One full pass through all training data | Reading the entire textbook once |
| Batch | A small group of examples processed together | Studying one chapter at a time |
| Loss | How wrong the model is (lower = better) | Your error score on a practice test |
| Learning rate | How big each adjustment is | How drastically you change your approach after a mistake — too big and you overcorrect, too small and you learn too slowly |
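
Here's that vocabulary attached to a concrete loop. The model is a one-dial linear fit (price ≈ rate × square metres); the house data below is invented and generated at exactly 3,000 per square metre, so we know what the dial should converge to.

```python
# The training loop for a one-dial house-price model (price ≈ rate * m²).
# The data is invented, generated at exactly 3,000 per square metre.
houses = [(50, 150_000), (80, 240_000), (100, 300_000), (120, 360_000)]

rate = 0.0               # the model's single parameter
learning_rate = 0.00001  # how big each adjustment is
batch_size = 2           # examples processed together per update

for epoch in range(200):                          # epoch: one pass over all data
    for i in range(0, len(houses), batch_size):   # batch: a small group at a time
        batch = houses[i:i + batch_size]
        grad = 0.0
        for sqm, price in batch:
            predicted = rate * sqm                # 1. predict
            error = predicted - price             # 2. check (loss grows with error)
            grad += 2 * error * sqm / len(batch)  # slope of mean-squared-error loss
        rate -= learning_rate * grad              # 3. adjust  4. repeat

print(round(rate))  # → 3000: the dial has found the price per square metre
```

Try making `learning_rate` ten times bigger: the updates overshoot and the dial never settles — the "too drastic" failure mode from the table above.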

⚡

Hands-On: Be the Training Loop (50 XP)

You are the machine learning model. Your job: predict whether a student will pass or fail a class based on two features — hours studied per week and classes attended (out of 30). Here are your training examples:

| Student | Hours/week | Classes attended | Outcome |
|---------|-----------|------------------|---------|
| A | 15 | 28 | Pass |
| B | 2 | 10 | Fail |
| C | 8 | 25 | Pass |
| D | 1 | 5 | Fail |
| E | 10 | 20 | Pass |
| F | 3 | 8 | Fail |

**Your tasks:**

1. Look at the pattern. What rough rule would you create to predict pass/fail? (Example: "If hours > ___ AND classes > ___, predict Pass")
2. Now test your rule on these new students. Predict pass or fail:

| Student | Hours/week | Classes attended | Your prediction |
|---------|-----------|------------------|-----------------|
| G | 12 | 22 | ? |
| H | 4 | 15 | ? |
| I | 6 | 26 | ? |

3. If Student H actually PASSED, how would you adjust your rule?

_Hint: Look at each feature independently first — what values do passing students tend to have? What do failing students tend to have? Then try combining both features with AND. Before you use your rule to predict G, H, and I, check it against all six training examples first. Student I will test whether your rule handles edge cases — and adjusting your rule when it misfires is exactly what real ML training does._

Overfitting: the A+ student who fails the real exam

Here's a scenario you've definitely seen: a student memorises every answer in the practice test. Word for word. They score 100% on practice tests. Then they take the real exam — different questions, same concepts — and they bomb it. They memorised the answers instead of understanding the concepts.

That's overfitting. The model learned the training data too well. It memorised the specific examples instead of learning general patterns.

| What it looks like | Memorising (overfitting) | Understanding (good fit) |
|---|---|---|
| Performance on training data | Perfect (99%+) | Very good (92%) |
| Performance on new data | Terrible (60%) | Still very good (89%) |
| What the model learned | "Example #4,721 has answer B" | "Asymmetric moles with irregular borders tend to be cancerous" |
| Real-world usefulness | Useless | Useful |

How do you prevent overfitting? The same way you'd advise that student:

  1. Use a test set. Hold back 20% of your data. Train on 80%, test on 20%. If the model does great on training data but poorly on test data, it's overfitting.
  2. Get more data. It's harder to memorise 1,000,000 examples than 100.
  3. Simplify the model. A model with too many parameters relative to the data can memorise everything. Reduce complexity.
  4. Regularisation. Technical term for adding a penalty when the model's parameters get too extreme — like telling the student "you can study the practice tests, but you can't bring notes into the exam."
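
You can watch remedy #1 — the held-out test set — expose overfitting with two deliberately extreme models: one that memorises the training set, and one that learns a single general rule. The pass/fail data (hours studied) is invented.

```python
# Overfitting in miniature: a memoriser vs a one-rule model, both judged
# on held-out data. The pass/fail data (hours studied) is invented.
train = [(15, "pass"), (2, "fail"), (8, "pass"),
         (1, "fail"), (10, "pass"), (3, "fail")]
test = [(12, "pass"), (4, "fail"), (9, "pass")]   # the "real exam"

# Model 1: memorises every training example, learns no general pattern.
memory = dict(train)
def memoriser(hours):
    return memory.get(hours, "fail")   # unseen input -> blind guess

# Model 2: one general rule learned from the same data.
def rule(hours):
    return "pass" if hours >= 5 else "fail"

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

print(accuracy(memoriser, train), accuracy(memoriser, test))  # perfect, then poor
print(accuracy(rule, train), accuracy(rule, test))            # strong on both
```

The memoriser scores 100% on training data and collapses on the test set — a huge gap. The rule holds up on both — a small gap. That gap is the one number to ask about.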

[Chart: Overfitting — performance on training data keeps improving while performance on validation data degrades]

💭 You're Probably Wondering…

There Are No Dumb Questions

"How do I know if a model is overfitting if I'm not a data scientist?"

Ask one question: "What's the gap between training accuracy and test accuracy?" If training accuracy is 98% but test accuracy is 72%, the model is overfitting — it memorised the training data. A healthy model has a small gap (e.g., 95% training, 92% test).

"Is overfitting always bad? What about underfitting?"

Underfitting is the opposite problem — the model is too simple to capture the patterns, like a lazy rule that predicts "Pass" for every student no matter their features. You want the sweet spot in between: complex enough to capture real patterns, simple enough to generalise to new data.

⚡

Overfit or Underfit? (25 XP)

For each scenario, choose overfitting, underfitting, or good fit:

  1. A spam detector gets 99% accuracy on training emails but only 55% on new emails.
  2. A house price model predicts every house at $300,000 regardless of features. Training error: high. Test error: high.
  3. A medical imaging model gets 94% accuracy on training X-rays and 91% on new X-rays it's never seen.
  4. A recommendation engine memorises every user's exact purchase history but can't predict what *new* users will buy.

Putting it all together: the machine learning pipeline

Here's what the full process looks like from start to finish (a typical pipeline):

  1. Collect the data — gather raw examples from the real world.
  2. Clean the data — fix errors, remove duplicates, handle missing values.
  3. Label and split the data — add the right answers and hold back a test set.
  4. Choose a model — pick an approach suited to the problem.
  5. Train the model — run the predict-check-adjust-repeat loop.
  6. Evaluate — check performance on the held-out test set.
  7. Deploy and monitor — put the model to work and watch its performance over time.

Most people think machine learning is step 5 — the training. In reality, steps 1-3 (getting good data) take 80% of the time. Data scientists have a saying: "garbage in, garbage out." The fanciest algorithm in the world can't learn from bad data.

Supervised Learning — you teach with labelled examples. "This email is spam. This one isn't."
Unsupervised Learning — you give unlabelled data and ask: "find the patterns yourself."
Reinforcement Learning — the model takes actions, gets rewards or penalties, and learns to maximise reward. How AlphaGo learned to play Go.

Back to the Stanford skin cancer system

The Stanford team never wrote a single rule that said "asymmetric border means melanoma." They showed the system 130,000 labelled photos and let it find the patterns itself — the same predict-check-adjust-repeat loop you worked through in the training exercise. What emerged were subtle correlations across millions of pixels that no dermatologist could consciously articulate, let alone program by hand. The system wasn't smarter than a doctor; it had just seen more examples, and it tuned millions of parameters until the patterns clicked. That's the core insight of this module: machine learning doesn't replace human expertise — it scales the signal buried inside labelled human judgments. The 130,000 photos were the dermatologists, distilled into weights.

Key takeaways

  • Machine learning = adjusting based on feedback. A model makes predictions, checks them against known answers, adjusts its internal settings, and repeats — thousands of times.
  • Supervised learning = flashcards. You need labelled data (question + answer pairs). Great for prediction tasks with clear right/wrong answers.
  • Unsupervised learning = sorting a drawer. No labels needed. The model finds patterns and groups on its own. Great for exploration and anomaly detection.
  • The training loop is predict, check, adjust, repeat. Everything else — epochs, batches, loss, learning rate — is just details of that loop.
  • Overfitting = memorising instead of understanding. Always check the gap between training and test accuracy. If it's big, the model memorised the data.
  • Data quality matters more than algorithm choice. 80% of the work is getting clean, representative data. The model is only as good as what you feed it.

?

Knowledge Check

1. A model achieves 99% accuracy on its training dataset but only 58% on a held-out test dataset. What is the most likely problem, and what is the standard first remedy?

2. A retailer wants to discover natural customer segments from purchase history without predefined categories. Which type of machine learning should they use?

3. In the machine learning training loop, what does the 'loss' metric represent?

4. Why does supervised learning require labelled data, and why is that often the biggest bottleneck in building an ML system?
