Computer Vision Lesson 4 – Image Representation | Dataplexa

Image Representation

In the previous lesson, you learned that computers see images as grids of pixel values. Now we go one level deeper and answer an important question:

How exactly are images represented, stored, and interpreted inside a computer system?

This lesson builds the bridge between raw pixels and real Computer Vision operations. Understanding image representation is critical before working with OpenCV, filters, CNNs, or deep learning models.


Why Image Representation Matters

The same image can be represented in different ways, and each representation is useful for a specific purpose.

  • Some representations are better for processing speed
  • Some preserve color information better
  • Some are optimized for storage or compression

If the representation is wrong or misunderstood, algorithms can fail outright or produce poor results.


Grayscale Image Representation

A grayscale image is the simplest form of image representation.

  • Each pixel is represented by one number
  • Typical range: 0 to 255
  • Stored as a 2D matrix

Conceptual Grayscale Representation

Image (5 × 5):

[
 [  0,  40,  90, 150, 255 ],
 [ 20,  60, 110, 170, 240 ],
 [ 30,  80, 130, 190, 210 ],
 [ 50, 100, 160, 200, 180 ],
 [ 70, 120, 170, 220, 160 ]
]
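The conceptual matrix above maps directly onto a NumPy array, the standard in-memory representation used by OpenCV and most Python CV libraries. A minimal sketch:

```python
import numpy as np

# The same 5 x 5 grayscale image as a 2D array of 8-bit intensities.
gray = np.array([
    [  0,  40,  90, 150, 255],
    [ 20,  60, 110, 170, 240],
    [ 30,  80, 130, 190, 210],
    [ 50, 100, 160, 200, 180],
    [ 70, 120, 170, 220, 160],
], dtype=np.uint8)

print(gray.shape)   # (5, 5): one value per pixel, no channel axis
print(gray[0, 4])   # 255: the brightest pixel, top-right corner
```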
  

This format is widely used in:

  • Edge detection
  • Thresholding
  • Shape analysis
  • Medical imaging

Color Image Representation (RGB Model)

Color images are represented using multiple channels. The most common format is RGB.

  • Each pixel has 3 values: Red, Green, Blue
  • Each value ranges from 0 to 255
  • Stored as a 3D matrix

RGB Pixel Representation

Pixel at (x, y):

[ R, G, B ]

Example:
[ 120, 200, 80 ]
  

So an image of size H × W becomes:

  • H × W × 3 numeric values

This representation preserves full color information but requires more memory.
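In NumPy terms, a color image is a 3D array of shape (H, W, 3). A small sketch (the 2 × 2 image here is made up for illustration; the last pixel reuses the example value above):

```python
import numpy as np

# A tiny 2 x 2 RGB image: each pixel holds [R, G, B].
img = np.array([
    [[255,   0,   0], [  0, 255,   0]],   # a red pixel, a green pixel
    [[  0,   0, 255], [120, 200,  80]],   # a blue pixel, the example pixel
], dtype=np.uint8)

print(img.shape)   # (2, 2, 3): height x width x channels
print(img.size)    # 12 numeric values in total, i.e. H * W * 3
```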


Channel-wise Representation

Instead of viewing an RGB image as one unit, a computer often separates it into three independent channels.

  • Red channel → intensity of red color
  • Green channel → intensity of green color
  • Blue channel → intensity of blue color

Each channel itself is a grayscale image.

This separation allows algorithms to:

  • Process specific color components
  • Detect patterns hidden in a single channel
  • Improve accuracy in vision tasks
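This separation is just slicing along the last axis of the array. A sketch using NumPy (one practical caution: OpenCV's `imread` returns channels in BGR order, not RGB, so real code may need a reordering step first):

```python
import numpy as np

# A synthetic 4 x 4 RGB image with constant channel values.
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[..., 0] = 200   # red channel
img[..., 1] = 100   # green channel
img[..., 2] = 50    # blue channel

# Each slice is itself a 2D grayscale image.
r = img[:, :, 0]
g = img[:, :, 1]
b = img[:, :, 2]

print(r.shape)                                     # (4, 4)
print(int(r[0, 0]), int(g[0, 0]), int(b[0, 0]))   # 200 100 50
```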

Image Representation vs Image Format

Many beginners confuse image representation with image file formats. They are not the same.

Concept                  Meaning
Image Representation     How pixel data is arranged in memory (arrays, channels)
Image Format             How the image is stored on disk (JPEG, PNG, BMP)
Once an image is loaded into memory, every format is decoded into the same kind of numeric array.


Common Image Data Types

Pixel values are stored using specific data types:

  • uint8 → most common (integer values 0–255)
  • float32 → normalized images (0.0–1.0)
  • int16 / int32 → rarely used directly; mainly for intermediate results that can be negative (e.g., gradients)

Deep learning models often require normalized float values, while classic CV uses uint8.
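A short sketch of the two common dtypes, including the wrap-around pitfall that makes arithmetic on uint8 values risky:

```python
import numpy as np

pix = np.array([250, 10], dtype=np.uint8)

# uint8 arithmetic wraps around past 255 instead of clipping:
print(pix + 10)                       # [ 4 20]  (250 + 10 overflows to 4)

# Converting to float32 first preserves the true values:
print(pix.astype(np.float32) + 10)    # [260.  20.]
```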


Why Normalization Is Often Required

Pixel values are often rescaled from the integer range to a floating-point range, typically by dividing by 255:

  • 0–255 → 0.0–1.0

This process is called normalization.

Normalization helps:

  • Improve numerical stability
  • Speed up learning in neural networks
  • Prevent dominance of large values
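The conversion itself is a single division. A sketch of normalizing and then converting back for display or saving:

```python
import numpy as np

img = np.array([[0, 128, 255]], dtype=np.uint8)

# Normalize: uint8 [0, 255] -> float32 [0.0, 1.0]
norm = img.astype(np.float32) / 255.0
print(norm)   # values now lie in [0.0, 1.0]

# Denormalize: float32 [0.0, 1.0] -> uint8 [0, 255]
back = (norm * 255.0).round().astype(np.uint8)
print(back)   # [[  0 128 255]]
```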

Real-World Analogy

Think of image representation like this:

  • The image file is like a ZIP file
  • Image representation is the unzipped data
  • Algorithms work only on unzipped numeric data

No algorithm works directly on “pictures” — only on numbers.


Practice Questions

Q1. How is a grayscale image represented in memory?

As a 2D matrix where each value represents pixel intensity.

Q2. Why are RGB images considered 3D arrays?

Because they have height, width, and three color channels.

Q3. What happens to image formats after loading into memory?

All formats are converted into numeric pixel arrays.

Quick Quiz

Q1. Which representation is best for edge detection?

Grayscale image representation.

Q2. What does normalization usually convert pixel values into?

Values between 0.0 and 1.0.

Key Takeaways

  • Images are stored as numeric arrays in memory
  • Grayscale images use one value per pixel
  • Color images use multiple channels (RGB)
  • Image representation is different from file format
  • All CV algorithms operate on numeric pixel representations

In the next lesson, we will explore image transformations — how resizing, rotating, and scaling images affect pixel representation.