Series and DataFrame Basics
In this lesson, you will learn the two most important building blocks of Pandas: Series and DataFrames. Nearly every operation you perform in Pandas works on one of these two structures.
Understanding them clearly will make all future lessons easier, especially when working with real datasets.
What is a Pandas Series?
A Series is a one-dimensional, labeled data structure in Pandas. It holds a single column of values, and each value has a label; together, these labels form the index.
You can think of a Series as:
- A column in a spreadsheet
- A single variable with indexed values
- A labeled NumPy array
Creating a Series
You can create a Series from a Python list:
import pandas as pd
sales = pd.Series([1200, 1500, 1800, 1100])
print(sales)
Because no index was specified, each value automatically receives an integer label, starting from 0.
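To make the default index more concrete, here is a minimal sketch that inspects the Series created above. The variable name first_sale is only illustrative and is not part of the lesson.

```python
import pandas as pd

sales = pd.Series([1200, 1500, 1800, 1100])

# With no index given, pandas uses the integer labels 0, 1, 2, 3
print(sales.index)

# The values themselves can be viewed as a NumPy array
print(sales.to_numpy())

# Look up one value by its index label
first_sale = sales[0]  # illustrative name
print(first_sale)
```

This also shows why a Series can be thought of as a labeled NumPy array: the values and the labels are stored side by side.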
Series with Custom Index
Index labels can be meaningful names instead of integers. This is very useful in real data analysis, because you can look up values by name.
sales = pd.Series(
    [1200, 1500, 1800],
    index=["January", "February", "March"]
)
print(sales)
Now each value is associated with a specific month.
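As a short sketch of what a custom index makes possible, the example below looks up values by month name. The variable names february_sales and selected_months are illustrative only.

```python
import pandas as pd

sales = pd.Series(
    [1200, 1500, 1800],
    index=["January", "February", "March"]
)

# Look up a single value by its label
february_sales = sales["February"]
print(february_sales)

# Pass a list of labels to select several values; the result is a new Series
selected_months = sales[["January", "March"]]
print(selected_months)
```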
What is a Pandas DataFrame?
A DataFrame is a two-dimensional data structure with rows and columns.
This is the most commonly used Pandas object and closely resembles:
- An Excel sheet
- A SQL table
- A CSV file loaded into memory
Creating a DataFrame from a Dictionary
A very common way to create a DataFrame is from a dictionary of lists. Each key becomes a column name, and each list supplies that column's values, so all lists must have the same length.
data = {
    "Product": ["Laptop", "Phone", "Tablet"],
    "Quantity": [10, 25, 15],
    "Price": [800, 500, 300]
}
df = pd.DataFrame(data)
print(df)
Each row represents a record, and each column represents a variable.
Understanding Rows and Columns
In a DataFrame:
- Rows represent observations (records)
- Columns represent features (attributes)
This structure makes DataFrames ideal for real-world datasets such as sales, customer data, or transaction logs.
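The difference between rows and columns can be sketched with the product data from this lesson. Note that .loc, a label-based accessor, is not introduced in this lesson; it is used here only to pull out one row, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Product": ["Laptop", "Phone", "Tablet"],
    "Quantity": [10, 25, 15],
    "Price": [800, 500, 300],
})

# One row (an observation): a Series indexed by the column names
first_row = df.loc[0]
print(first_row)

# One column (a feature): a Series indexed by the row labels
quantities = df["Quantity"]
print(quantities)
```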
Inspecting DataFrame Structure
Once a DataFrame is created or loaded, you should inspect its structure.
df.shape
This returns a tuple of the form (number of rows, number of columns). For the DataFrame above, the result is (3, 3).
df.columns
This returns an Index object containing all column names in the DataFrame.
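Here is a small self-contained sketch of both inspection attributes, using the product data from earlier in the lesson. The names n_rows, n_cols, and column_names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Product": ["Laptop", "Phone", "Tablet"],
    "Quantity": [10, 25, 15],
    "Price": [800, 500, 300],
})

# shape is a (rows, columns) tuple, so it can be unpacked
n_rows, n_cols = df.shape
print(df.shape)  # (3, 3)

# columns is an Index object; convert it to a list for a plain view
column_names = list(df.columns)
print(column_names)
```

In a script, wrap these attributes in print() as shown; in an interactive notebook, typing df.shape on its own line displays the value.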
Accessing Columns from a DataFrame
You can access a column as a Series using its name:
df["Price"]
This returns a Series containing all values from the Price column.
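To confirm that a selected column really is a Series, you can check its type and call a Series method on it. This is a minimal sketch; the variable names price and total_price are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Product": ["Laptop", "Phone", "Tablet"],
    "Quantity": [10, 25, 15],
    "Price": [800, 500, 300],
})

price = df["Price"]

# The selected column is a pandas Series named after the column
print(type(price))
print(price.name)

# Series methods such as sum() work on it directly
total_price = price.sum()
print(total_price)
```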
Using the Course Dataset
In upcoming lessons, you will load the course dataset (CSV file) and apply these same concepts to real data.
Every column in the dataset will behave like a Series, and the entire file will be represented as a DataFrame.
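As a preview, the sketch below shows the same idea with CSV data. The course dataset itself is not shown in this lesson, so the CSV text here is a made-up stand-in, read from an in-memory string rather than from a file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the course CSV file (contents are made up)
csv_text = """Product,Quantity,Price
Laptop,10,800
Phone,25,500
"""

# read_csv accepts a file path or a file-like object such as StringIO
df = pd.read_csv(io.StringIO(csv_text))

# The whole file becomes a DataFrame, and each column is a Series
print(type(df))
print(type(df["Quantity"]))
```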
Practice Exercise
Exercise
Create a DataFrame that contains:
- Product Name
- Units Sold
- Unit Price
Then print the DataFrame and access one column as a Series.
Expected Outcome
You should be able to:
- Create a DataFrame
- Understand rows vs columns
- Extract a column as a Series
What’s Next?
In the next lesson, you will learn how to read real data files such as CSV files into Pandas DataFrames using the dataset you downloaded.