Pandas Lesson 4 – Exploring DataFrames | Dataplexa

Exploring DataFrames

Once data is loaded into Pandas, the next critical step is understanding what the data contains.

In this lesson, you will learn how to explore a DataFrame to understand its structure, columns, data types, and overall quality.


Why Data Exploration Is Important

Before performing analysis or cleaning, you must first understand the data you are working with.

Exploring data helps you:

  • Identify available columns
  • Understand data types
  • Detect missing or incorrect values
  • Avoid mistakes during analysis

Loading the Dataset

Start by loading the dataset that you downloaded from the Dataplexa datasets page.

import pandas as pd

df = pd.read_csv("dataplexa_pandas_sales.csv")

Once loaded, the data is stored in a DataFrame called df.
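
As a quick sanity check after loading, it can help to look at the first few rows. The sketch below is illustrative only: the columns of the real sales file are not listed in this lesson, so it reads a small in-memory CSV with hypothetical columns (order_id, region, units, price) in place of dataplexa_pandas_sales.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for dataplexa_pandas_sales.csv.
# The real file's columns and values may differ.
csv_text = """order_id,region,units,price
1,North,3,19.99
2,South,5,5.50
3,East,2,12.00
"""

# pd.read_csv accepts a file path or a file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

# head() returns the first rows (5 by default) for a quick visual check.
print(df.head())
```

With the real file, you would keep the original pd.read_csv("dataplexa_pandas_sales.csv") call and simply add df.head() after it.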


Viewing Column Names

To see all column names in the DataFrame, use:

df.columns

This helps you understand what information is available and how each column is named.
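
A minimal sketch of working with column names, assuming a small DataFrame with made-up columns (order_id, region, units) rather than the actual sales dataset:

```python
import pandas as pd

# Hypothetical sales data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "order_id": [1, 2, 3],
    "region": ["North", "South", "East"],
    "units": [3, 5, 2],
})

# df.columns is a pandas Index object, not a plain Python list.
print(df.columns)

# Convert it to a list when ordinary list behaviour is needed.
column_names = df.columns.tolist()
print(column_names)

# Membership tests work directly on the Index.
has_region = "region" in df.columns
print(has_region)
```

Checking for a column by name before using it is a simple way to catch typos or unexpected file layouts early.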


Checking Data Types

Each column in a DataFrame has a data type (dtype), such as integer, floating-point number, text, or date.

To check data types, use:

df.dtypes

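This is important because many operations depend on correct data types. For example, arithmetic on a numeric column that was read as text will fail or give unexpected results.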
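
To illustrate why data types matter, the sketch below uses a hypothetical DataFrame in which dates are stored as text. The column names (order_id, price, order_date) are assumptions, and the exact dtype shown for text columns depends on your pandas version (commonly object, or a string dtype in newer releases).

```python
import pandas as pd

# Hypothetical data; order_date is deliberately stored as plain text.
df = pd.DataFrame({
    "order_id": [1, 2, 3],
    "price": [19.99, 5.50, 12.00],
    "order_date": ["2024-01-05", "2024-01-06", "2024-01-07"],
})

# One dtype per column, returned as a Series indexed by column name.
print(df.dtypes)

# Text that looks like a date is not a datetime until it is converted.
parsed_dates = pd.to_datetime(df["order_date"])
print(parsed_dates.dtype)

# Date-specific operations become available after conversion.
years = parsed_dates.dt.year.tolist()
print(years)
```

Converting columns is covered properly in the cleaning lessons; the point here is only that df.dtypes tells you when a conversion may be needed.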


Basic Information About the DataFrame

The info() method provides a quick summary of the dataset.

df.info()

It shows:

  • Number of rows and columns
  • Column names
  • Data types
  • Non-null value counts
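
One detail that often surprises beginners is that info() prints its summary instead of returning it. The sketch below, which uses a hypothetical two-column DataFrame, shows the default behaviour and one way to capture the text through the buf parameter.

```python
import io

import pandas as pd

# Hypothetical data with one missing value in the region column.
df = pd.DataFrame({
    "region": ["North", "South", None],
    "units": [3, 5, 2],
})

# info() prints the summary and returns None.
result = df.info()

# To keep the summary as a string, write it to a buffer instead.
buffer = io.StringIO()
df.info(buf=buffer)
summary = buffer.getvalue()
print(summary)
```

In the printed summary, the non-null count for region is lower than the number of rows, which is an early hint that the column contains missing values.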

Statistical Summary of Data

For numeric columns, Pandas can generate summary statistics using describe().

df.describe()

This includes:

  • Count
  • Mean
  • Standard deviation
  • Minimum and maximum values
  • Quartiles (25%, 50%, 75%)

This helps you quickly understand distributions and ranges.
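
A small worked example, assuming a hypothetical DataFrame with one text column and one numeric column, shows what describe() returns and how to read individual values from it:

```python
import pandas as pd

# Hypothetical data; only the units column is numeric.
df = pd.DataFrame({
    "region": ["North", "South", "East", "North"],
    "units": [2, 4, 6, 8],
})

# By default, describe() summarises numeric columns only.
stats = df.describe()
print(stats)

# The result is itself a DataFrame, so single values can be looked up.
mean_units = stats.loc["mean", "units"]
median_units = stats.loc["50%", "units"]
print(mean_units, median_units)

# include="all" adds non-numeric columns to the summary as well.
all_stats = df.describe(include="all")
print(all_stats)
```

Because the region column holds text, it is left out of the default summary and only appears when include="all" is passed.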


Checking for Missing Values

Missing data is common in real-world datasets.

To check how many missing values exist in each column:

df.isnull().sum()

This allows you to decide whether data needs cleaning, which you will learn in later lessons.
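
The expression works in two steps: isnull() marks each cell as True or False, and sum() counts the True values per column. A minimal sketch with hypothetical data:

```python
import pandas as pd

# Hypothetical data with missing entries in both columns.
df = pd.DataFrame({
    "region": ["North", None, "East"],
    "units": [3, None, None],
})

# Step 1: a DataFrame of booleans, True where a value is missing.
missing_mask = df.isnull()
print(missing_mask)

# Step 2: summing booleans counts the True values in each column.
missing_per_column = missing_mask.sum()
print(missing_per_column)

# Summing again gives the total number of missing cells.
total_missing = int(missing_per_column.sum())
print(total_missing)
```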


Understanding Dataset Shape

To confirm how large the dataset is, use:

df.shape

This returns a tuple of two values, in the order (rows, columns):

  • Total number of rows
  • Total number of columns
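
Note that shape is an attribute, not a method, so it is written without parentheses. The sketch below uses a hypothetical DataFrame and unpacks the two values into separate variables:

```python
import pandas as pd

# Hypothetical data: 3 rows and 2 columns.
df = pd.DataFrame({
    "region": ["North", "South", "East"],
    "units": [3, 5, 2],
})

# shape is a (rows, columns) tuple and can be unpacked directly.
n_rows, n_cols = df.shape
print(f"{n_rows} rows, {n_cols} columns")
```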

Practice Exercise

Using the dataset:

  • List all column names
  • Check data types of each column
  • Display basic information using info()
  • Generate summary statistics using describe()

What’s Next?

Now that you understand the structure of your dataset, the next lesson will teach you how to select specific columns and rows from a DataFrame.