Pandas Lesson 12 – Indexes | Dataplexa

Working with Indexes in Pandas

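In Pandas, an index holds the labels that identify the rows of a DataFrame. While it may look like just row numbers, the index plays a critical role in data selection, alignment, and performance. Pandas does not require index values to be unique, but a unique index makes each label point to exactly one row.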

In this lesson, you will learn how to view, set, reset, and work effectively with indexes.


Loading the Dataset

We continue with the same dataset used in previous lessons.

import pandas as pd

df = pd.read_csv("dataplexa_pandas_sales.csv")

Understanding the Default Index

By default, Pandas assigns an integer-based index starting from 0.

df.head()

The numbers on the left side of the DataFrame are the index values.


Viewing Index Information

You can inspect the index directly using:

df.index

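For the default index, this shows a RangeIndex with its start, stop, and step values. For other index types, it shows the labels, their dtype, and the index name if one is set.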
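The default index described above can be inspected on a small in-memory frame. This is a minimal sketch: the data and the amount column are hypothetical stand-ins for the lesson's CSV, which is not reproduced here.

```python
import pandas as pd

# Hypothetical stand-in for the sales data; values and the "amount" column are assumptions.
df = pd.DataFrame({
    "order_id": [1003, 1001, 1002],
    "amount": [250.0, 120.5, 80.0],
})

idx = df.index
print(idx)                  # RangeIndex(start=0, stop=3, step=1)
print(type(idx).__name__)   # RangeIndex
print(len(idx))             # 3
```

A RangeIndex stores only its start, stop, and step rather than every label, which is why the default index is cheap to create.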


Setting a Column as Index

Often, a column like an order ID or date is better suited as an index.

Example: Set order_id as the index.

df.set_index("order_id", inplace=True)

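Now each row is labeled by its order_id. The labels identify rows uniquely only if the order_id values themselves are unique; set_index does not check this by default.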
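As a rough illustration of the step above, the sketch below sets order_id as the index on a small hypothetical frame and then checks whether the labels are unique. Only order_id comes from the lesson; the amount column and all values are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for the sales data.
df = pd.DataFrame({
    "order_id": [1003, 1001, 1002],
    "amount": [250.0, 120.5, 80.0],
})

# Without inplace=True, set_index returns a new DataFrame and leaves df unchanged.
indexed = df.set_index("order_id")

print(indexed.index.name)        # order_id
print(indexed.index.is_unique)   # True
print(list(indexed.columns))     # ['amount']
print("order_id" in df.columns)  # True (the original frame still has the column)
```

Assigning the result to a new name instead of using inplace=True keeps the original frame available, which can make it easier to undo a step while experimenting.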


Accessing Rows Using the Index

Once a column becomes the index, you can access rows directly.

df.loc[1005]

This retrieves the row whose index label equals 1005. If no row has that label, Pandas raises a KeyError.
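The difference between label-based and position-based access can be sketched as follows. The frame is hypothetical, so the example looks up label 1001 rather than the lesson's 1005.

```python
import pandas as pd

# Hypothetical stand-in for the sales data, indexed by order_id.
df = pd.DataFrame({
    "order_id": [1003, 1001, 1002],
    "amount": [250.0, 120.5, 80.0],
}).set_index("order_id")

row = df.loc[1001]      # label lookup: the row whose index label is 1001
print(row["amount"])    # 120.5

first = df.iloc[0]      # position lookup: the first row, whatever its label
print(first.name)       # 1003

# A label that is not in the index raises KeyError.
try:
    df.loc[9999]
    missing = False
except KeyError:
    missing = True
print(missing)          # True
```

With an integer index such as order_id, .loc[1001] and .iloc[1001] mean different things, so it helps to choose the accessor deliberately.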


Resetting the Index

If you no longer want a custom index, you can reset it back to default.

df.reset_index(inplace=True)

The old index becomes a regular column again, and the rows receive a fresh integer index starting from 0.
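A minimal sketch of the round trip, using a hypothetical frame in place of the lesson's dataset:

```python
import pandas as pd

# Hypothetical stand-in for the sales data.
df = pd.DataFrame({
    "order_id": [1003, 1001, 1002],
    "amount": [250.0, 120.5, 80.0],
})

indexed = df.set_index("order_id")
restored = indexed.reset_index()   # order_id moves back into the columns

print(list(restored.columns))      # ['order_id', 'amount']
print(restored.index)              # RangeIndex(start=0, stop=3, step=1)
```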


Dropping the Index While Resetting

Sometimes you don’t need the old index at all.

df.reset_index(drop=True, inplace=True)

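This discards the old index values instead of keeping them as a column. The DataFrame still has an index: the default integer index starting from 0.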
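The effect of drop=True can be seen on a small hypothetical frame (the amount column and all values are assumptions):

```python
import pandas as pd

# Hypothetical stand-in for the sales data, indexed by order_id.
indexed = pd.DataFrame({
    "order_id": [1003, 1001, 1002],
    "amount": [250.0, 120.5, 80.0],
}).set_index("order_id")

dropped = indexed.reset_index(drop=True)

print(list(dropped.columns))   # ['amount'] (order_id is gone, not moved to a column)
print(list(dropped.index))     # [0, 1, 2]
```

Because the order_id values are lost here, drop=True is best reserved for indexes that carry no information you still need.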


Renaming the Index

You can give the index a meaningful name.

df.index.name = "row_number"
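As a sketch of two ways to name an index: assigning to index.name changes the frame in place, while rename_axis returns a renamed copy. The data and the name line_number are hypothetical.

```python
import pandas as pd

# Hypothetical data with the default integer index.
df = pd.DataFrame({"amount": [250.0, 120.5, 80.0]})

df.index.name = "row_number"                # modifies df directly
print(df.index.name)                        # row_number

renamed = df.rename_axis("line_number")     # returns a new DataFrame
print(renamed.index.name)                   # line_number
print(df.index.name)                        # row_number (df itself is unchanged)
```

The index name appears as the column header if the index is later moved back with reset_index, so a descriptive name pays off there.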

Sorting by Index

Rows can be sorted by their index labels, just as they can be sorted by column values.

df.sort_index(inplace=True)

This is especially useful for time-series or ID-based data.
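A short sketch of sorting by index on a hypothetical frame whose order IDs start out of order:

```python
import pandas as pd

# Hypothetical stand-in for the sales data, indexed by order_id.
indexed = pd.DataFrame({
    "order_id": [1003, 1001, 1002],
    "amount": [250.0, 120.5, 80.0],
}).set_index("order_id")

ascending = indexed.sort_index()                   # returns a sorted copy
descending = indexed.sort_index(ascending=False)

print(list(ascending.index))                       # [1001, 1002, 1003]
print(list(descending.index))                      # [1003, 1002, 1001]
print(ascending.index.is_monotonic_increasing)     # True
```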


Why Indexes Matter

  • Faster label-based lookups, especially when the index is unique or sorted
  • Cleaner row identification
  • Automatic alignment of rows by label in arithmetic and index-based joins
  • Essential for time-series analysis
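The alignment point in the list above can be illustrated with two small Series. The region labels and numbers are invented for this sketch and do not come from the lesson's dataset.

```python
import pandas as pd

# Hypothetical sales totals keyed by region; the two Series share only "south".
q1 = pd.Series([100, 200], index=["north", "south"])
q2 = pd.Series([50, 70], index=["south", "east"])

# Addition matches values by index label, not by position.
total = q1 + q2
print(total["south"])             # 250.0
print(int(total.isna().sum()))    # 2 ("north" and "east" have no partner, so NaN)

# fill_value treats a missing partner as 0 instead of producing NaN.
filled = q1.add(q2, fill_value=0)
print(filled["north"])            # 100.0
print(filled["east"])             # 70.0
```

Had the values been matched by position instead, 100 would have been added to 50, silently mixing north with south.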

Common Index Mistakes

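  • Using non-unique values as the index when each label is expected to identify a single row
  • Forgetting to reset the index after filtering, which leaves gaps in the row labels
  • Dropping important columns unintentionally, for example by calling reset_index(drop=True) while the index holds data you still need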
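The first two mistakes in the list above can be reproduced on a small hypothetical frame in which order 1001 appears twice (all values are assumptions):

```python
import pandas as pd

# Hypothetical data with a duplicated order_id.
df = pd.DataFrame({
    "order_id": [1001, 1001, 1002],
    "amount": [120.5, 30.0, 80.0],
})

# Non-unique index: .loc returns every row that carries the label.
dup = df.set_index("order_id")
print(dup.index.is_unique)     # False
print(len(dup.loc[1001]))      # 2

# Filtering keeps the original labels, so the index now has a gap.
big = df[df["amount"] > 50]
print(list(big.index))         # [0, 2]

# Resetting with drop=True renumbers the rows; the old labels here carry no data.
big = big.reset_index(drop=True)
print(list(big.index))         # [0, 1]
```

Checking index.is_unique right after set_index is a cheap way to catch the first problem early.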

Practice Exercise

Using the dataset:

  • Set an ID column as the index
  • Access a row using .loc
  • Reset the index
  • Rename the index

What’s Next?

In the next lesson, you will learn how to detect and handle duplicate data to keep datasets clean and accurate.