AI Course
Lesson 85: Object Detection
Object detection is one of the most important tasks in computer vision. It answers two questions at the same time: what is present in an image and where it is located.
Unlike image classification, which assigns a label to the image as a whole, object detection can identify multiple objects in the same image and locates each one with a bounding box.
Real-World Connection
Object detection powers many technologies you use daily. Self-driving cars detect pedestrians and vehicles, security systems identify intruders, retail stores track customer movement, and mobile apps recognize objects through the camera.
Systems that need to locate objects in real time, and then track them, usually rely on object detection as a first step.
What Is Object Detection?
Object detection is a computer vision technique that identifies objects in an image and marks their positions using bounding boxes.
- Detects multiple objects in a single image
- Assigns a class label to each object
- Returns the coordinates of each object
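The three points above can be pictured as a small data record. The sketch below is only an illustration: the Detection class, its field names, and the sample values are hypothetical and do not come from any particular library.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One detected object (hypothetical container, for illustration only)."""
    label: str                      # class label, e.g. "person"
    confidence: float               # score between 0 and 1
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels

    def area(self) -> int:
        """Area of the bounding box in pixels."""
        x1, y1, x2, y2 = self.box
        return max(0, x2 - x1) * max(0, y2 - y1)


# A single image can yield several detections (made-up values).
detections: List[Detection] = [
    Detection("person", 0.92, (40, 30, 140, 330)),
    Detection("car", 0.81, (200, 180, 520, 360)),
]

for det in detections:
    print(det.label, det.confidence, det.box, det.area())
```

Each entry carries a label, a score, and the two corners of its box, which is roughly the information a detector returns for every object it finds.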
How Object Detection Works
Most object detection systems follow these steps:
- Analyze the image using convolutional layers
- Extract features like edges, shapes, and textures
- Predict object class probabilities
- Predict bounding box coordinates
Modern detectors use deep learning models trained on large labeled datasets.
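The last two steps can be sketched in plain Python. This is a simplified illustration, not how any specific detector is implemented: it assumes the network has already produced raw class scores and a box with coordinates normalized to the range 0 to 1. The class names, scores, and the helper names softmax and to_pixels are made up for the example.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def to_pixels(norm_box: Tuple[float, float, float, float],
              width: int, height: int) -> Tuple[int, int, int, int]:
    """Scale a normalized (x1, y1, x2, y2) box to pixel coordinates."""
    x1, y1, x2, y2 = norm_box
    return (int(x1 * width), int(y1 * height),
            int(x2 * width), int(y2 * height))


class_names = ["background", "person", "car"]  # hypothetical classes
raw_scores = [0.1, 2.0, 0.5]                   # made-up network outputs

probs = softmax(raw_scores)
best = max(range(len(probs)), key=probs.__getitem__)
print(class_names[best], round(probs[best], 3))

# A box covering the lower middle of a 640x480 image.
print(to_pixels((0.25, 0.5, 0.75, 1.0), width=640, height=480))
```

The class with the highest probability becomes the label, and the scaled box gives the coordinates used for drawing.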
Popular Object Detection Models
- YOLO (You Only Look Once) – fast and real-time detection
- SSD (Single Shot MultiBox Detector) – balance between speed and accuracy
- Faster R-CNN – high accuracy, slower inference
Simple Object Detection Example
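Below is a basic example that loads a pre-trained object detection model through OpenCV's dnn module. It assumes the model files deploy.prototxt and mobilenet_iter_73000.caffemodel, along with a test image named image.jpg, are in the working directory.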
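import cv2
import numpy as np

# Load a pre-trained Caffe model (network definition + trained weights).
net = cv2.dnn.readNetFromCaffe(
    "deploy.prototxt",
    "mobilenet_iter_73000.caffemodel"
)

image = cv2.imread("image.jpg")
if image is None:
    raise FileNotFoundError("Could not read image.jpg")
(h, w) = image.shape[:2]

# Resize to 300x300 and scale pixel values as (pixel - 127.5) * 0.007843.
blob = cv2.dnn.blobFromImage(
    image, 0.007843, (300, 300), 127.5
)
net.setInput(blob)
detections = net.forward()

# Loop over every detection returned by the network.
for i in range(detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > 0.5:
        # Box coordinates are normalized, so scale them to the image size.
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (x1, y1, x2, y2) = box.astype("int")
        cv2.rectangle(image, (int(x1), int(y1)), (int(x2), int(y2)),
                      (0, 255, 0), 2)

cv2.imshow("Detected Objects", image)
cv2.waitKey(0)
cv2.destroyAllWindows()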
What This Code Is Doing
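The model processes the image in one forward pass and predicts bounding boxes around objects it recognizes. In this example, each detection holds a class ID at index 1, a confidence score at index 2, and normalized box coordinates at indices 3 to 6, which the code scales by the image width and height. The confidence score indicates how sure the model is.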
Bounding boxes are drawn only when the confidence exceeds a chosen threshold.
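To see how the threshold changes the result, here is a minimal sketch that filters a list of made-up detections. The row layout (class_id, confidence, x1, y1, x2, y2) and the helper name filter_by_confidence are assumptions chosen for this illustration.

```python
from typing import List, Tuple

# Each row mimics one detection: (class_id, confidence, x1, y1, x2, y2).
Row = Tuple[int, float, float, float, float, float]


def filter_by_confidence(rows: List[Row], threshold: float = 0.5) -> List[Row]:
    """Keep only detections whose confidence is above the threshold."""
    return [row for row in rows if row[1] > threshold]


# Made-up detections with normalized box coordinates.
rows: List[Row] = [
    (15, 0.91, 0.10, 0.20, 0.40, 0.90),
    (7, 0.48, 0.50, 0.55, 0.95, 0.85),
    (12, 0.67, 0.30, 0.60, 0.55, 0.95),
]

for threshold in (0.3, 0.5, 0.8):
    kept = filter_by_confidence(rows, threshold)
    print(f"threshold={threshold}: kept {len(kept)} of {len(rows)}")
```

A lower threshold keeps more boxes but lets in more false detections. A higher threshold is stricter but may miss real objects.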
Understanding the Output
The output image displays green rectangles around detected objects. Each rectangle represents an object along with its estimated location.
In real applications, labels such as “person” or “car” are also displayed.
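A label can be built from the class ID that accompanies each detection. The sketch below assumes the model follows the class order commonly used by MobileNet-SSD models trained on the PASCAL VOC dataset. If your model was trained on different classes, the list must be replaced. The helper names label_text and label_origin are hypothetical.

```python
from typing import Tuple

# Assumption: PASCAL VOC class order, as commonly used with MobileNet-SSD.
CLASSES = [
    "background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus",
    "car", "cat", "chair", "cow", "diningtable", "dog", "horse",
    "motorbike", "person", "pottedplant", "sheep", "sofa", "train",
    "tvmonitor",
]


def label_text(class_id: int, confidence: float) -> str:
    """Build a caption such as 'person: 91%' for one detection."""
    name = CLASSES[class_id] if 0 <= class_id < len(CLASSES) else "unknown"
    return f"{name}: {confidence * 100:.0f}%"


def label_origin(x1: int, y1: int, margin: int = 10) -> Tuple[int, int]:
    """Put the text above the box, or inside it when the box is near the top."""
    y = y1 - margin if y1 > 2 * margin else y1 + 2 * margin
    return (x1, y)


print(label_text(15, 0.912), label_origin(50, 100))
print(label_text(7, 0.5), label_origin(50, 5))

# Inside the detection loop, the caption could then be drawn with OpenCV:
# cv2.putText(image, label_text(class_id, confidence), label_origin(x1, y1),
#             cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
```

In the example code, the class ID would be read as int(detections[0, 0, i, 1]).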
Challenges in Object Detection
- Objects of different sizes
- Overlapping objects
- Lighting variations
- Real-time performance constraints
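Overlap between two boxes is usually measured with intersection over union (IoU), and many detectors apply non-maximum suppression (NMS) to drop duplicate boxes that cover the same object. The lesson does not cover these techniques, so the following is only a minimal sketch with made-up boxes. The function names and the 0.5 threshold are choices made for this illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, between 0 and 1."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Return indices of boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one separate box (made-up values).
boxes = [(10, 10, 110, 110), (20, 20, 120, 120), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]

print(round(iou(boxes[0], boxes[1]), 3))
print(nms(boxes, scores))
```

Here the second box overlaps the first by more than the threshold and has a lower score, so it is suppressed, while the separate third box is kept.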
Why Object Detection Matters in AI
Object detection allows machines to understand visual scenes. Knowing which objects are present and where they are gives autonomous systems the information they need to make decisions.
Most advanced AI vision systems combine object detection with tracking and segmentation.
Practice Questions
Practice 1: What is used to mark object locations in detection?
Practice 2: Name a real-time object detection model.
Practice 3: What score measures how sure a detection is?
Quick Quiz
Quiz 1: Which task finds objects and their locations?
Quiz 2: Which model processes the image in a single pass?
Quiz 3: Which library is used in the example code?
Coming up next: Object Tracking — following detected objects across video frames.