Pandas Lesson 9 – Sorting | Dataplexa

Sorting and Ranking Data in Pandas

After cleaning the dataset, the next step is organizing the data in a meaningful order.

Sorting and ranking help you quickly identify top values, lowest values, trends, and priorities.


Why Sorting and Ranking Matter

Sorting and ranking allow you to:

  • Find highest and lowest values
  • Identify top-performing products or regions
  • Prepare data for reporting and visualization
  • Make comparisons easier

Loading the Dataset

We continue using the cleaned dataset from previous lessons.

import pandas as pd

df = pd.read_csv("dataplexa_pandas_sales.csv")

Sorting Data by One Column

To sort data by a single column, use the sort_values() method.

Example: Sort rows by sales in ascending order.

df.sort_values(by="sales")

This places the lowest sales values at the top.


Sorting in Descending Order

To view the highest values first, set ascending=False.

df.sort_values(by="sales", ascending=False)

This is commonly used to find top-performing records.


Sorting by Multiple Columns

Sometimes one column is not enough. You can sort using multiple columns.

Example: Sort by region first, then by sales within each region.

df.sort_values(by=["region", "sales"], ascending=[True, False])

This groups regions alphabetically and sorts sales from highest to lowest inside each group.


Sorting Rows by Index

You can also sort rows using the DataFrame index.

df.sort_index()

Index sorting is useful after filtering or merging datasets.


Ranking Data

Ranking assigns a numerical rank to values, based on their size.

Example: Rank sales values from lowest to highest.

df["sales_rank"] = df["sales"].rank()

Each row receives a rank number.


Ranking in Descending Order

To rank highest values as number one:

df["sales_rank_desc"] = df["sales"].rank(ascending=False)

This is useful for leaderboards and performance reports.


Handling Ties in Ranking

When two values are equal, Pandas offers different ranking methods.

  • average – average rank (default)
  • min – lowest rank
  • max – highest rank
  • dense – no gaps in ranking

Example using dense ranking:

df["dense_rank"] = df["sales"].rank(method="dense", ascending=False)

Selecting Top Records

After sorting, you often want only the top rows.

Example: Top 5 sales records.

df.sort_values(by="sales", ascending=False).head(5)

This is commonly used in dashboards and reports.


Practice Exercise

Using the dataset:

  • Sort sales in descending order
  • Find the top 5 highest sales
  • Create a ranking column for sales
  • Use dense ranking to handle ties

What’s Next?

In the next lesson, you will learn how to rename columns to make datasets cleaner and more readable.