Computer Vision Lesson 3 – Pixels & Images | Dataplexa

Pixels and Images

Before we talk about edges, objects, faces, or deep learning models, we must clearly understand what an image actually is for a computer.

Humans see images as meaningful scenes — people, roads, text, objects. A computer does not see any of that. It only sees numbers arranged in a grid.

This lesson builds the most important foundation in Computer Vision: pixels and how images are represented internally. If this is clear, everything later (OpenCV, CNNs, YOLO, segmentation) becomes much easier.

What Is a Pixel?

A pixel (picture element) is the smallest unit of an image. Each pixel stores information about intensity or color at a specific location.

Think of an image as a chessboard:

Each square = one pixel
Each pixel has a value
All pixels together form the image

If you zoom into any digital image enough, you will eventually see small square blocks. Those blocks are pixels.

Image as a Grid of Numbers

From a computer’s point of view, a grayscale image is just a matrix (a 2D array) of numbers.

For example, a grayscale image can be written as:

Example: 5 × 5 Grayscale Image

[
 [  0,  30,  80, 120, 255 ],
 [ 10,  50,  90, 140, 230 ],
 [ 20,  60, 100, 150, 200 ],
 [ 30,  80, 130, 180, 160 ],
 [ 40, 100, 160, 200, 120 ]
]

Each number represents the brightness of a pixel. The position of the number tells the computer where that pixel is located.
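To make the "grid of numbers" idea concrete, here is a minimal sketch that stores the 5 × 5 example as plain Python lists and looks up pixels by row and column. The helper name pixel_at is made up for illustration; real pipelines usually hold the same data in an array library.

```python
# The 5 x 5 grayscale example stored as a list of rows (no libraries needed).
image = [
    [0, 30, 80, 120, 255],
    [10, 50, 90, 140, 230],
    [20, 60, 100, 150, 200],
    [30, 80, 130, 180, 160],
    [40, 100, 160, 200, 120],
]

height = len(image)     # number of rows
width = len(image[0])   # number of columns


def pixel_at(img, row, col):
    """Return the brightness at (row, col); row 0 is the top, col 0 is the left."""
    return img[row][col]


top_right = pixel_at(image, 0, 4)  # brightest pixel in the example
center = pixel_at(image, 2, 2)     # pixel in the middle of the grid

print(height, width, top_right, center)
```

Note that the row index comes first and the column index second, which is the usual convention when an image is treated as a matrix.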

Pixel Intensity Values (Grayscale Images)

In a grayscale image:

Pixel values usually range from 0 to 255 (8 bits per pixel, so 256 possible levels)
0 → pure black
255 → pure white
Values in between → shades of gray

Pixel Value	Meaning
0	Black
50	Dark gray
128	Medium gray
200	Light gray
255	White

So when a computer processes a grayscale image, it is really processing these numbers.
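As a small illustration of "processing these numbers", the sketch below inverts a grayscale image by replacing every value v with 255 - v, so black becomes white and dark grays become light grays. It assumes 8-bit values in the 0 to 255 range; the function name invert is chosen here for illustration.

```python
def invert(img):
    """Return a new image where each 8-bit value v is replaced by 255 - v."""
    return [[255 - v for v in row] for row in img]


# One row holding the five sample values from the table above.
sample = [[0, 50, 128, 200, 255]]
inverted = invert(sample)

print(inverted)
```

Nothing about the operation is "visual": it is plain arithmetic applied to every number in the grid.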

Color Images: More Than One Number per Pixel

Color images are slightly more complex. Instead of one number per pixel, we usually have three numbers.

The most common format is RGB:

R → Red channel
G → Green channel
B → Blue channel

Each channel, like a grayscale pixel, usually holds a value from 0 to 255.

Example: One RGB Pixel

( R = 255, G = 0, B = 0 ) → Red
( R = 0, G = 255, B = 0 ) → Green
( R = 0, G = 0, B = 255 ) → Blue
( R = 255, G = 255, B = 255 ) → White

So a color image is actually a 3D array: height × width × 3.
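The height × width × 3 layout can be sketched with nested lists: rows contain pixels, and each pixel is an (R, G, B) triple. The helpers shape and channel are hypothetical names used only in this example.

```python
# The four example pixels from above, as (R, G, B) triples.
RED = (255, 0, 0)
GREEN = (0, 255, 0)
BLUE = (0, 0, 255)
WHITE = (255, 255, 255)

# A tiny color image: 2 rows x 2 columns x 3 channels.
color_image = [
    [RED, GREEN],
    [BLUE, WHITE],
]


def shape(img):
    """Return (height, width, channels) of a nested-list color image."""
    return (len(img), len(img[0]), len(img[0][0]))


def channel(img, index):
    """Extract one channel (0 = R, 1 = G, 2 = B) as a 2D grid of numbers."""
    return [[pixel[index] for pixel in row] for row in img]


red_channel = channel(color_image, 0)

print(shape(color_image))
print(red_channel)
```

Each extracted channel is itself a grayscale-style grid, so a color image can be thought of as three stacked matrices.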

Image Dimensions Explained

When you hear something like:

Image size = 640 × 480

It means:

640 pixels wide
480 pixels tall

For a color image:

640 × 480 × 3 values

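That is 307,200 pixels and 921,600 individual values for a single image. Note that when the image is stored as an array, the shape is usually written rows first: 480 × 640 × 3 (height × width × channels).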
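The arithmetic behind these counts is easy to check. The sketch below also estimates raw memory use, assuming one byte per value (8-bit channels) and no compression; that assumption is not stated in the lesson.

```python
width, height, channels = 640, 480, 3

pixels = width * height       # number of pixel locations
values = pixels * channels    # one number per channel per pixel

# Assumption: each value is stored in one byte (8-bit channels, uncompressed).
raw_bytes = values * 1
raw_kib = raw_bytes / 1024

print(pixels, values, raw_kib)
```

Even a modest 640 × 480 color image therefore holds close to a million numbers.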

Why Pixel Understanding Is Critical in CV

Computer Vision operations ultimately work by reading and manipulating pixel values:

Blurring → averaging pixel values
Edge detection → comparing neighboring pixels
Thresholding → comparing each pixel's intensity against a cutoff value
CNNs → learning patterns from pixel neighborhoods

If pixels change, the image changes. This is why preprocessing matters so much.
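Three of the operations listed above can be sketched in a few lines each on the 5 × 5 grayscale example. These are simplified illustrations, not the implementations a library would use: the function names, the cutoff of 128, and the integer rounding are choices made for this example.

```python
image = [
    [0, 30, 80, 120, 255],
    [10, 50, 90, 140, 230],
    [20, 60, 100, 150, 200],
    [30, 80, 130, 180, 160],
    [40, 100, 160, 200, 120],
]


def threshold(img, cutoff):
    """Thresholding: values at or above the cutoff become 255, the rest 0."""
    return [[255 if v >= cutoff else 0 for v in row] for row in img]


def mean_3x3(img, row, col):
    """Blurring: average of the 3 x 3 neighborhood around an interior pixel."""
    total = sum(
        img[r][c]
        for r in range(row - 1, row + 2)
        for c in range(col - 1, col + 2)
    )
    return total // 9  # integer division keeps the result a whole number


def horizontal_diff(img):
    """Edge hint: difference between each pixel and its right-hand neighbor."""
    return [[row[c + 1] - row[c] for c in range(len(row) - 1)] for row in img]


binary = threshold(image, 128)
blurred_center = mean_3x3(image, 2, 2)
diffs = horizontal_diff(image)

print(binary[0])
print(blurred_center)
print(diffs[0])
```

In the first row, the largest difference sits between 120 and 255, which is exactly the kind of sharp jump an edge detector looks for.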

Real-Life Analogy

Think of an image like a large Excel sheet:

Each cell = one pixel
Each cell contains a number
Formulas (algorithms) operate on those numbers

Computer Vision is essentially applied mathematics on pixel grids.

Practice Questions

Q1. What is a pixel?

A pixel is the smallest unit of an image that stores intensity or color information.

Q2. How many values does one RGB pixel have?

Three values: Red, Green, and Blue.

Q3. Why do computers see images as matrices?

Because images are stored as grids of numeric pixel values that algorithms can process.

Quick Quiz

Q1. In grayscale images, what range do pixel values usually fall in?

0 to 255.

Q2. A color image is represented as which data structure?

A 3D array (height × width × channels).

Key Takeaways

Images are numerical grids, not visual scenes
Pixels store intensity or color values
Grayscale images use one value per pixel
Color images usually use RGB (three values per pixel)
CV algorithms work by reading and manipulating pixel values

In the next lesson, we will see how images are stored and represented in different formats, and why representation matters in Computer Vision pipelines.
