Pandas Lesson 5 – Selecting Data | Dataplexa

Selecting Data in Pandas

After exploring a DataFrame and understanding its structure, the next essential skill is selecting specific data.

In real-world analysis, you rarely work with the entire dataset at once. Instead, you select required columns, rows, or combinations of both.


Why Data Selection Matters

Selecting data allows you to:

  • Focus only on relevant information
  • Perform calculations on specific columns
  • Filter records for analysis and reporting
  • Prepare data for cleaning or visualization

Pandas provides multiple ways to select data, each useful in different situations.


Loading the Dataset

We will continue using the same dataset from previous lessons.

import pandas as pd

df = pd.read_csv("dataplexa_pandas_sales.csv")

Selecting a Single Column

The simplest way to select data is by choosing one column.

You can select a column using square brackets:

df["Product"]

This returns a Pandas Series containing only the values from the selected column.
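
As a minimal sketch of what the returned object looks like, the example below builds a small stand-in DataFrame and selects one column from it. The rows are invented for illustration and are not the contents of dataplexa_pandas_sales.csv; only the column names come from this lesson.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's CSV; the values are invented.
df = pd.DataFrame({
    "Product": ["Laptop", "Mouse", "Monitor"],
    "Sales": [1200.0, 25.0, 300.0],
    "Quantity": [1, 2, 1],
})

# Square brackets with a single column name return a Series.
product = df["Product"]

print(type(product).__name__)  # Series
print(product.name)            # Product
```

The Series keeps the row index of the original DataFrame, so each value can still be traced back to its row.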


Selecting Multiple Columns

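To select multiple columns, pass a list of column names inside the square brackets. The outer brackets perform the selection, and the inner brackets create the list.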

df[["Product", "Sales", "Quantity"]]

The result is a new DataFrame with only those columns.
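
The sketch below uses an invented stand-in DataFrame rather than the lesson's CSV. It shows two details that are easy to miss: the columns come back in the order given in the list, and a list with only one name still returns a DataFrame rather than a Series.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's CSV; the values are invented.
df = pd.DataFrame({
    "Product": ["Laptop", "Mouse", "Monitor"],
    "Sales": [1200.0, 25.0, 300.0],
    "Quantity": [1, 2, 1],
})

# The result follows the order of the list, not the order in df.
subset = df[["Sales", "Product"]]
print(list(subset.columns))  # ['Sales', 'Product']

# A one-element list still produces a DataFrame with a single column.
one_col_frame = df[["Sales"]]
print(type(one_col_frame).__name__)  # DataFrame
```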


Selecting Rows by Index

Sometimes you need specific rows instead of columns.

To select rows by their index position, use iloc.

df.iloc[0]

This selects the first row of the DataFrame and returns it as a Series whose index is the column names.

To select multiple rows:

df.iloc[0:5]

This returns the first five rows (positions 0 through 4). As with Python slices, the end position 5 is not included.
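
To make the return types concrete, here is a small sketch on an invented six-row DataFrame, which is an assumption standing in for the real dataset. A single integer returns one row as a Series, a slice returns a DataFrame, and a negative position counts from the end.

```python
import pandas as pd

# Hypothetical six-row stand-in; the values are invented.
df = pd.DataFrame({
    "Product": ["A", "B", "C", "D", "E", "F"],
    "Sales": [10, 20, 30, 40, 50, 60],
})

first_row = df.iloc[0]     # one row, returned as a Series
first_five = df.iloc[0:5]  # positions 0, 1, 2, 3, 4 as a DataFrame
last_row = df.iloc[-1]     # negative positions count from the end

print(first_row["Product"])  # A
print(len(first_five))       # 5
print(last_row["Product"])   # F
```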


Selecting Rows and Columns Together

You can combine row and column selection using iloc.

df.iloc[0:5, 1:4]

This selects:

  • Rows at positions 0 through 4 (the end position, 5, is excluded)
  • Columns at positions 1 through 3 (the end position, 4, is excluded)
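
The sketch below applies the same slice to an invented DataFrame. The column order of the real CSV is not stated in this lesson, so the layout here, including the OrderID and Region columns, is purely hypothetical.

```python
import pandas as pd

# Hypothetical layout; OrderID and Region are invented column names.
df = pd.DataFrame({
    "OrderID": [101, 102, 103, 104, 105, 106],
    "Product": ["A", "B", "C", "D", "E", "F"],
    "Sales": [10, 20, 30, 40, 50, 60],
    "Quantity": [1, 2, 3, 4, 5, 6],
    "Region": ["N", "S", "E", "W", "N", "S"],
})

# Rows at positions 0-4 and columns at positions 1-3.
block = df.iloc[0:5, 1:4]

print(block.shape)          # (5, 3)
print(list(block.columns))  # ['Product', 'Sales', 'Quantity']
```

Because iloc depends on column order, the same slice returns different columns if the file layout changes. Selecting by name avoids that risk.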

Selecting Data by Labels

Another powerful method is loc, which selects by row and column labels instead of integer positions.

df.loc[0, "Sales"]

This selects the value from the Sales column in the row labeled 0. With the default integer index, that is the first row.

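To select multiple rows and columns, pass a label slice and a list of column names. Unlike iloc, a loc slice includes its end label, so 0:4 returns the five rows labeled 0 through 4: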

df.loc[0:4, ["Product", "Sales"]]
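
One detail worth checking by hand is how many rows a loc slice returns. The sketch below uses an invented six-row DataFrame with the default integer index, and compares the label slice 0:4 with the position slice 0:4.

```python
import pandas as pd

# Hypothetical stand-in with the default index 0..5; the values are invented.
df = pd.DataFrame({
    "Product": ["Laptop", "Mouse", "Monitor", "Keyboard", "Webcam", "Headset"],
    "Sales": [1200, 25, 300, 45, 60, 80],
    "Quantity": [1, 2, 1, 3, 1, 2],
})

# A single row label and a single column label return one value.
first_sale = df.loc[0, "Sales"]

# loc includes the end label: rows labeled 0, 1, 2, 3 and 4.
by_label = df.loc[0:4, ["Product", "Sales"]]

# iloc excludes the end position: rows at positions 0, 1, 2 and 3.
by_position = df.iloc[0:4]

print(first_sale)                        # 1200
print(len(by_label), len(by_position))   # 5 4
```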

Difference Between loc and iloc

  • loc uses labels (column names, index labels), and its slices include the end label
  • iloc uses integer positions, and its slices exclude the end position

Understanding this difference is critical for accurate data selection.
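
The difference only becomes visible when the index labels do not match the row positions. The following sketch assumes a hypothetical DataFrame whose index labels are 10, 20 and 30, which is the kind of index you might see after setting a custom index. The data is invented.

```python
import pandas as pd

# Hypothetical frame whose labels (10, 20, 30) differ from positions (0, 1, 2).
df = pd.DataFrame(
    {"Product": ["Laptop", "Mouse", "Monitor"], "Sales": [1200, 25, 300]},
    index=[10, 20, 30],
)

by_label = df.loc[10, "Sales"]  # row labeled 10, column named Sales
by_position = df.iloc[0, 1]     # first row, second column

# No row is labeled 0 here, so loc raises a KeyError.
try:
    df.loc[0, "Sales"]
    label_zero_exists = True
except KeyError:
    label_zero_exists = False

print(by_label, by_position)  # 1200 1200
print(label_zero_exists)      # False
```

Both lookups reach the same cell, but only iloc[0] keeps meaning "the first row" regardless of how the index is labeled.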


Practice Exercise

Exercise

Using the dataset:

  • Select the Sales column
  • Select Product and Quantity columns
  • Select the first 10 rows
  • Select rows 5 to 10 and only the Sales column

What’s Next?

Now that you know how to select data, the next lesson will focus on filtering data using conditions, which allows you to work with only specific records.