Reinforcement Learning
So far in this course, we have worked with supervised and unsupervised learning techniques. In both cases, models learn from existing data.
Reinforcement Learning is different. Instead of learning from a fixed dataset, the model learns by interacting with an environment.
This lesson introduces Reinforcement Learning (RL) in a simple and intuitive way, without jumping into complex mathematics.
What Is Reinforcement Learning?
Reinforcement Learning is a learning approach where an agent takes actions inside an environment and learns from the results of those actions.
The agent receives rewards or penalties based on its actions. Over time, it learns which actions lead to better outcomes.
Unlike supervised learning, there is no correct answer given in advance. The agent must discover the best strategy on its own.
Core Components of Reinforcement Learning
To understand reinforcement learning, we must understand its main building blocks.
The agent is the learner or decision maker. It could be a robot, a software program, or a game player.
The environment is everything the agent interacts with. This could be a game board, a road, or a simulation.
An action is any move the agent can take.
A reward is feedback from the environment that tells the agent how good or bad its action was.
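The four building blocks above can be sketched as a short interaction loop. This is only an illustration: the guessing game, the class names, and the reward values of +1 and -1 are hypothetical choices made for this example, and the agent here picks actions at random rather than learning.

```python
import random


class GuessingGameEnvironment:
    """Hypothetical environment: hides a number from 1 to 3 on every round."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def step(self, action):
        # The environment responds to an action with a reward.
        hidden = self.rng.randint(1, 3)
        return 1 if action == hidden else -1


class RandomAgent:
    """Hypothetical agent: the decision maker (it does not learn yet)."""

    ACTIONS = [1, 2, 3]  # every move the agent is allowed to make

    def __init__(self, seed=1):
        self.rng = random.Random(seed)

    def choose_action(self):
        return self.rng.choice(self.ACTIONS)


def interact(agent, env, rounds=10):
    """Run the agent-environment loop and record (action, reward) pairs."""
    history = []
    for _ in range(rounds):
        action = agent.choose_action()  # the agent acts
        reward = env.step(action)       # the environment gives feedback
        history.append((action, reward))
    return history


if __name__ == "__main__":
    for action, reward in interact(RandomAgent(), GuessingGameEnvironment()):
        print(f"action={action} reward={reward}")
```

A learning agent would use the recorded rewards to change how it chooses actions; this sketch only shows how the pieces connect.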
How Learning Happens in RL
Reinforcement learning happens through trial and error.
The agent starts with little or no knowledge. It explores different actions, sometimes making mistakes.
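One common way to picture this kind of trial and error is an "epsilon-greedy" rule: most of the time the agent repeats the action that has worked best so far, and occasionally it tries a random one. The sketch below assumes a hypothetical two-choice game with hidden win rates. The function name, the win rates, and the value of epsilon are illustrative assumptions, not part of the lesson.

```python
import random


def run_bandit(true_win_rates, steps=2000, epsilon=0.1, seed=0):
    """Learn by trial and error which action pays off most often."""
    rng = random.Random(seed)
    n_actions = len(true_win_rates)
    estimates = [0.0] * n_actions  # the agent starts with no knowledge
    counts = [0] * n_actions       # how often each action has been tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action, even if it may be a mistake.
            action = rng.randrange(n_actions)
        else:
            # Exploit: repeat the action with the best estimate so far.
            action = max(range(n_actions), key=lambda a: estimates[a])

        # The environment pays 1 with the action's hidden win rate, else 0.
        reward = 1.0 if rng.random() < true_win_rates[action] else 0.0

        # Update the running average reward for the chosen action.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_bandit([0.2, 0.8])
    print("estimated values:", estimates)
    print("times each action was tried:", counts)
```

After enough steps the agent should be choosing the second action far more often, because its estimated value ends up higher.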
Over many interactions, the agent learns a policy: a strategy that tells it what action to take in each situation.
The goal is to maximize total reward over time, not just immediate reward.
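A small worked example may help show the difference between immediate and total reward. The reward sequences below are made up for illustration. The `discount` parameter is a standard RL idea that this lesson does not cover: it makes later rewards count slightly less than earlier ones, and the default of 1.0 gives a plain sum.

```python
def total_reward(rewards, discount=1.0):
    """Add up rewards over time; later rewards are scaled by discount**step."""
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (discount ** step) * reward
    return total


# Two hypothetical strategies over four time steps.
grab_now = [5, 0, 0, 0]        # best immediate reward, nothing afterwards
wait_for_more = [0, 0, 0, 10]  # nothing at first, larger reward at the end

if __name__ == "__main__":
    print(total_reward(grab_now), total_reward(wait_for_more))
    print(total_reward(grab_now, 0.9), total_reward(wait_for_more, 0.9))
```

With a plain sum, waiting earns 10 against 5, even though grabbing the reward looks better at the first step. With a discount of 0.9, waiting is worth about 7.29, which is still more than 5.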
Real-World Example
Consider a self-driving car.
The agent is the driving algorithm. The environment is the road and traffic.
Actions include accelerating, braking, and turning.
Safe driving earns positive rewards. Crashes or traffic violations give negative rewards.
By learning from experience, the car improves its driving behavior.
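The reward signal described above could be written as a small function. This is a toy sketch, not how a real driving system is built: the function name, its parameters, and every number in it are hypothetical values chosen only to show positive and negative rewards side by side.

```python
def driving_reward(crashed, broke_traffic_rule, distance_travelled_m):
    """Toy reward for one time step of driving (all values are assumptions)."""
    if crashed:
        return -100.0  # a crash is the worst outcome

    # Safe progress earns a small positive reward per metre.
    reward = 0.1 * distance_travelled_m

    if broke_traffic_rule:
        reward -= 10.0  # a violation is penalised even without a crash

    return reward


if __name__ == "__main__":
    print(driving_reward(False, False, 10.0))  # safe driving
    print(driving_reward(False, True, 10.0))   # traffic violation
    print(driving_reward(True, False, 10.0))   # crash
```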
Reinforcement Learning vs Other ML Types
In supervised learning, the model learns from labeled examples.
In unsupervised learning, the model discovers patterns in data.
In reinforcement learning, the model learns from the consequences of its own actions.
This makes reinforcement learning especially useful for decision-making problems.
Simple Reinforcement Learning Example (Conceptual)
Imagine a robot in a maze.
Each step gives a small negative reward, which encourages the robot to reach the exit in as few steps as possible.
Reaching the exit gives a large positive reward.
Over time, the robot learns the shortest path to the exit.
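The maze idea can be tried out with tabular Q-learning, one well-known RL algorithm that the lesson does not name. The sketch below is a minimal version under stated assumptions: the maze layout, the reward values of -1 and +10, and the settings alpha, gamma, and epsilon are all hypothetical choices for this example.

```python
import random

# Hypothetical maze: S = start, E = exit, # = wall, . = free cell.
MAZE = [
    "S.#.",
    ".##.",
    "....",
    "#.#E",
]
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
STEP_REWARD = -1.0   # small penalty for every step
EXIT_REWARD = 10.0   # large reward for reaching the exit


def find(symbol):
    """Return the (row, column) of the first cell holding the symbol."""
    for r, row in enumerate(MAZE):
        for c, cell in enumerate(row):
            if cell == symbol:
                return (r, c)


def step(state, action):
    """Move the robot; walls and edges leave it where it was."""
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    if not (0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])) or MAZE[r][c] == "#":
        r, c = state
    if MAZE[r][c] == "E":
        return (r, c), EXIT_REWARD, True
    return (r, c), STEP_REWARD, False


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of values for each (state, action) pair."""
    rng = random.Random(seed)
    q = {}
    start = find("S")
    for _ in range(episodes):
        state, done, steps = start, False, 0
        while not done and steps < 200:
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))  # explore
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            # Nudge the old estimate toward reward + discounted future value.
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state, steps = next_state, steps + 1
    return q


def greedy_path(q, max_steps=50):
    """Follow the best-known action from the start and list visited cells."""
    state = find("S")
    path = [state]
    for _ in range(max_steps):
        action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

In this layout the only shortest route takes six moves, so after training the printed path should contain seven cells, from the start to the exit.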
Why Reinforcement Learning Is Powerful
Reinforcement learning can solve problems where rules are too complex to program manually.
It is widely used in:
• Robotics
• Game playing (chess, Go)
• Recommendation systems
• Resource optimization
Mini Practice
Think of a daily activity that could be framed as a reinforcement learning problem.
Identify the agent, actions, environment, and rewards.
Exercises
Exercise 1:
What is the role of reward in reinforcement learning?
Exercise 2:
Why is reinforcement learning called trial-and-error learning?
Quick Quiz
Q1. Does reinforcement learning require labeled data?
In the next lesson, we build the foundation for deep learning by learning about Neural Networks.