Computer Vision Lesson 44 – Instance Segmentation | Dataplexa

Instance Segmentation – Separating Individual Objects

So far, you have learned how semantic segmentation labels every pixel. But it has one important limitation.

It does not distinguish between different instances of the same class.

Instance segmentation solves this problem by answering a more precise question:

Which pixel belongs to which object?

Why Semantic Segmentation Is Not Enough

Imagine an image with five people.

Semantic segmentation will label all of them as:

“person”

But it will not tell you:

Which pixels belong to person 1
Which pixels belong to person 2

For many real-world problems, this is not sufficient.
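The difference can be seen on a toy label map. The sketch below is only an illustration: the grid, the class IDs (0 for background, 1 for person) and the helper name `count_pixels` are assumptions made for this example, and real pipelines would normally use array libraries rather than nested lists.

```python
# Toy 4x6 label maps. Class IDs (0 = background, 1 = person) are illustrative.
semantic_map = [
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]

# Same scene, but each pixel stores a per-object ID instead (0 = background).
instance_map = [
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
]


def count_pixels(label_map, value):
    """Count how many pixels in a 2D list carry the given value."""
    return sum(row.count(value) for row in label_map)


# Semantic view: one undivided pool of "person" pixels.
person_pixels = count_pixels(semantic_map, 1)

# Instance view: the same pixels, split by object.
pixels_per_person = {
    obj_id: count_pixels(instance_map, obj_id) for obj_id in (1, 2)
}

print(person_pixels)      # 12
print(pixels_per_person)  # {1: 6, 2: 6}
```

The semantic map can report that 12 pixels are "person", but only the instance map can say that they split into two people of 6 pixels each.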

What Is Instance Segmentation?

Instance segmentation assigns:

A class label
A unique object identity
A pixel-accurate mask

Each object is treated as a separate instance, even if multiple objects belong to the same class.
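One way to picture these three pieces is as a small record per object. The sketch below is a hypothetical representation, not the data structure of any particular library: the `Instance` class, its field names and the `to_semantic` helper are all assumptions for illustration. It also shows that merging the masks of one class gives back a semantic mask, at the cost of losing object identity.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """One segmented object. Field names are hypothetical, not a library API."""
    class_label: str   # e.g. "person"
    instance_id: int   # unique within the image
    mask: list         # 2D list of 0/1, same size as the image


# Two objects of the same class in a toy 2x4 image.
person_a = Instance("person", 1, [[1, 1, 0, 0],
                                  [1, 1, 0, 0]])
person_b = Instance("person", 2, [[0, 0, 0, 1],
                                  [0, 0, 0, 1]])
instances = [person_a, person_b]


def to_semantic(instances, class_label, height, width):
    """Merge all masks of one class into a single class mask (identity is lost)."""
    merged = [[0] * width for _ in range(height)]
    for inst in instances:
        if inst.class_label != class_label:
            continue
        for r in range(height):
            for c in range(width):
                if inst.mask[r][c]:
                    merged[r][c] = 1
    return merged


person_mask = to_semantic(instances, "person", 2, 4)
print(person_mask)  # [[1, 1, 0, 1], [1, 1, 0, 1]]
```

Going from instances to a semantic mask is easy. Going the other way is not, because the merged mask no longer records which pixels came from which object.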

Semantic vs Instance Segmentation

Aspect	Semantic Segmentation	Instance Segmentation
Class labels	Yes	Yes
Object identity	No	Yes
Separates same-class objects	No	Yes
Mask per object	No	Yes

How Instance Segmentation Thinks

Instance segmentation combines ideas from:

Object detection
Semantic segmentation

Conceptually, it works in three steps:

Find objects (bounding boxes)
Classify each object
Create a pixel-level mask for each object

This makes it more complex than semantic segmentation.
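The three steps can be sketched on a toy grid. This is not how a trained model works: the "find objects" step here is a plain flood fill over equal-valued neighboring pixels, and the class is read directly from the pixel value. The grid, `CLASS_NAMES`, the function name `segment_instances` and the box format (top, left, bottom, right, inclusive) are all assumptions chosen for illustration.

```python
from collections import deque

# Toy 4x6 grid: 0 = background, 1 = car, 2 = person (IDs are illustrative).
CLASS_NAMES = {1: "car", 2: "person"}
image = [
    [1, 1, 0, 0, 2, 0],
    [1, 1, 0, 0, 2, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
]


def segment_instances(grid):
    """Group 4-connected pixels of equal value into objects (toy stand-in)."""
    height, width = len(grid), len(grid[0])
    seen = [[False] * width for _ in range(height)]
    results = []
    for r0 in range(height):
        for c0 in range(width):
            if grid[r0][c0] == 0 or seen[r0][c0]:
                continue
            value = grid[r0][c0]
            # Step 1: find the object by flood-filling from a seed pixel.
            pixels, queue = [], deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and not seen[nr][nc] and grid[nr][nc] == value):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            rows = [p[0] for p in pixels]
            cols = [p[1] for p in pixels]
            box = (min(rows), min(cols), max(rows), max(cols))
            # Step 2: classify it (here the pixel value already encodes the class).
            label = CLASS_NAMES[value]
            # Step 3: build a full-size binary mask for this object only.
            mask = [[0] * width for _ in range(height)]
            for r, c in pixels:
                mask[r][c] = 1
            results.append({"box": box, "label": label, "mask": mask})
    return results


objects = segment_instances(image)
for obj in objects:
    print(obj["label"], obj["box"])
```

On this grid the sketch finds two cars and one person, each with its own box and mask. Note that this toy grouping would merge two same-class objects that touch, which is one reason real models have to learn object boundaries rather than rely on connectivity.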

Why Instance Segmentation Is Harder

Instance segmentation must cope with:

Overlapping objects
Objects touching each other
Different sizes and shapes

The model must understand:

What is foreground vs background
Where one object ends and another begins
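The "where one object ends and another begins" problem can be made concrete with two same-class objects that touch. In the sketch below, the maps and the helper name `boundary_pairs` are assumptions for illustration. The helper lists neighboring pixel pairs whose nonzero labels differ, which is the only place an object-to-object boundary can show up in a label map.

```python
# Two same-class objects that touch. Semantic view: one blob of 1s.
semantic_map = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]

# Instance view of the same scene: object 1 on the left, object 2 on the right.
instance_map = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
]


def boundary_pairs(label_map):
    """Return adjacent pixel pairs (right/down) whose nonzero labels differ."""
    height, width = len(label_map), len(label_map[0])
    pairs = []
    for r in range(height):
        for c in range(width):
            for nr, nc in ((r, c + 1), (r + 1, c)):  # right and down neighbors
                if nr < height and nc < width:
                    a, b = label_map[r][c], label_map[nr][nc]
                    if a and b and a != b:
                        pairs.append(((r, c), (nr, nc)))
    return pairs


print(boundary_pairs(semantic_map))  # []
print(boundary_pairs(instance_map))  # [((0, 1), (0, 2)), ((1, 1), (1, 2))]
```

The semantic map contains no boundary at all between the two objects, so nothing downstream can recover it. The instance map records exactly where the first object stops and the second starts.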

Real-World Example

Consider a street scene:

10 cars
5 pedestrians
2 bicycles

Semantic segmentation gives:

Car pixels
Person pixels
Bicycle pixels

Instance segmentation gives:

Car #1, Car #2, …
Person #1, Person #2, …
Bicycle #1, Bicycle #2
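The practical consequence is that instance-level output can be counted and named. The sketch below assumes the per-object class labels for the street scene are already available as a plain list. That list is made up for illustration and does not come from a real model.

```python
from collections import Counter

# Hypothetical per-object class labels for the street scene above.
instance_labels = ["car"] * 10 + ["person"] * 5 + ["bicycle"] * 2

# Semantic segmentation only tells us which classes are present.
classes_present = sorted(set(instance_labels))

# Instance segmentation lets us count and name individual objects.
counts = Counter(instance_labels)
names = [
    f"{label.capitalize()} #{i}"
    for label, n in counts.items()
    for i in range(1, n + 1)
]

print(classes_present)       # ['bicycle', 'car', 'person']
print(dict(counts))          # {'car': 10, 'person': 5, 'bicycle': 2}
print(names[0], names[-1])   # Car #1 Bicycle #2
```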

Why Instance Segmentation Matters

Many applications require object-level understanding:

Autonomous driving
Robotics manipulation
Medical image analysis
Video tracking

Without per-object masks, these systems cannot reliably tell neighboring objects of the same class apart.

Instance Segmentation vs Object Detection

Object detection draws boxes.

Instance segmentation goes further.

Feature	Object Detection	Instance Segmentation
Bounding boxes	Yes	Yes
Pixel-level masks	No	Yes
Object separation	Box-level only	Pixel-level
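A mask carries strictly more information than a box: the box can always be computed from the mask, but not the other way round. The sketch below is a minimal illustration. The helper name `mask_to_box` and the box format (top, left, bottom, right, inclusive) are assumptions for this example.

```python
def mask_to_box(mask):
    """Tightest box around a binary mask as (top, left, bottom, right), inclusive."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        return None  # empty mask: no object
    return (rows[0], cols[0], rows[-1], cols[-1])


# Toy diagonal object in a 4x5 image.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
]

box = mask_to_box(mask)
top, left, bottom, right = box
box_area = (bottom - top + 1) * (right - left + 1)
mask_area = sum(sum(row) for row in mask)

print(box)                  # (1, 1, 3, 3)
print(box_area, mask_area)  # 9 5
```

Here the box covers 9 pixels while the object itself covers only 5, so almost half of the box is background. That gap is what pixel-level masks remove.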

Popular Models for Instance Segmentation

Several architectures and toolkits are widely used:

Mask R-CNN
YOLACT
Models built with the Detectron framework (a toolkit rather than a single architecture)

Among these, Mask R-CNN became the most influential.

That is why the next lesson focuses entirely on it.

How Output Looks Conceptually

The output of instance segmentation includes:

Bounding box
Class label
Binary mask for each object

Each object has its own mask, independent of others.

Common Mistakes Beginners Make

Confusing instance segmentation with semantic segmentation
Thinking bounding boxes are enough
Ignoring overlapping objects

Understanding this distinction is critical for interviews and real projects.

Practice Questions

Q1. Why can’t semantic segmentation separate multiple people in an image?

Because it assigns the same class label to all pixels without object identity.

Q2. What additional information does instance segmentation provide?

It provides a separate pixel-level mask for each individual object.

Q3. Which task combines object detection and segmentation?

Instance segmentation.

Mini Assignment

Think of a supermarket shelf image.

Why is instance segmentation better than detection?
Why is semantic segmentation insufficient?

Answer conceptually.

Quick Recap

Semantic segmentation labels pixels by class
Instance segmentation separates objects of the same class
Each object gets its own mask
Used in autonomous driving, robotics, medicine
Foundation for understanding Mask R-CNN

Next lesson: Mask R-CNN – Instance Segmentation in Practice.
