Adding and Removing Columns in Pandas
In real-world data analysis, datasets rarely remain static. You often need to create new columns, modify existing ones, or remove unnecessary columns.
In this lesson, you will learn how to add, calculate, and delete columns safely using Pandas.
Loading the Dataset
We continue with the dataset from the previous lessons.
import pandas as pd
df = pd.read_csv("dataplexa_pandas_sales.csv")
Viewing Current Columns
Before making changes, always inspect the existing columns.
df.columns
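As a minimal sketch of this inspection step, the example below uses a small made-up DataFrame in place of the CSV file (the values are invented for illustration). It shows the column labels as a plain list and the data type of each column.

```python
import pandas as pd

# Hypothetical stand-in for dataplexa_pandas_sales.csv; the values are invented.
df = pd.DataFrame({"quantity": [2, 1, 5], "price": [100.0, 250.0, 40.0]})

column_names = df.columns.tolist()  # column labels as a plain Python list
print(column_names)
print(df.dtypes)  # data type of each column
```

Checking the data types early helps confirm that columns you plan to calculate with are numeric.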
Adding a New Column with a Fixed Value
You can add a column by assigning a value directly.
Example: Add a column that labels all rows as "Online" sales.
df["sales_channel"] = "Online"
This value is applied to every row in the dataset.
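To see this on a small scale, here is a sketch that assumes a made-up three-row DataFrame instead of the real dataset. The single string is repeated for every row.

```python
import pandas as pd

# Hypothetical stand-in for the sales data; the values are invented.
df = pd.DataFrame({"quantity": [2, 1, 5], "price": [100.0, 250.0, 40.0]})

# A single (scalar) value is repeated for every row.
df["sales_channel"] = "Online"
print(df)
```

Note that the same syntax replaces the column if the name already exists, so check df.columns first when in doubt.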
Adding a Column Using Calculations
One of the most powerful features of Pandas is creating columns using existing data.
Example: Calculate total revenue using quantity and price.
df["total_revenue"] = df["quantity"] * df["price"]
This creates a new column where each row contains a calculated value.
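The sketch below illustrates the row-by-row calculation on a small made-up DataFrame (the values are invented). It also shows assign(), which is one alternative when you want a new DataFrame rather than a change to the existing one.

```python
import pandas as pd

# Hypothetical stand-in for the sales data; the values are invented.
df = pd.DataFrame({"quantity": [2, 1, 5], "price": [100.0, 250.0, 40.0]})

# assign() returns a new DataFrame with the extra column; df itself is unchanged.
preview = df.assign(total_revenue=df["quantity"] * df["price"])
print(preview)

# Bracket assignment modifies df directly. The multiplication is element-wise:
# each row gets its own quantity multiplied by its own price.
df["total_revenue"] = df["quantity"] * df["price"]
print(df)
```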
Adding a Conditional Column
You can create columns based on conditions.
Example: Mark high-value sales.
df["high_value_sale"] = df["total_revenue"] > 500
This produces True or False values for each row.
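As a rough illustration, the sketch below applies the same comparison to invented revenue values. The sale_tier column and its labels are hypothetical additions that show one way to turn the True/False result into text with numpy.where.

```python
import numpy as np
import pandas as pd

# Hypothetical revenue values, invented for illustration.
df = pd.DataFrame({"total_revenue": [200.0, 750.0, 1200.0]})

# The comparison runs on every row and yields a boolean column.
df["high_value_sale"] = df["total_revenue"] > 500

# np.where picks the first label where the condition is True, the second otherwise.
# "sale_tier" is a hypothetical column name, not part of the lesson's dataset.
df["sale_tier"] = np.where(df["high_value_sale"], "high", "standard")
print(df)
```

Because the comparison uses >, a sale of exactly 500 would not be marked as high value.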
Inserting a Column at a Specific Position
Sometimes column order matters, especially for reports.
Example: Insert a column at position 2 (the third column, since positions start at 0).
df.insert(2, "discount", 0)
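This shifts the later columns to the right instead of overwriting them. If a column with the same name already exists, insert() raises a ValueError by default.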
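The following sketch shows the resulting column order on a small made-up DataFrame (order_id and all values are assumptions for illustration). It also shows what happens when the column name is already taken.

```python
import pandas as pd

# Hypothetical stand-in for the sales data; column names and values are invented.
df = pd.DataFrame({"order_id": [1, 2], "quantity": [2, 1], "price": [100.0, 250.0]})

# Position 2 is the third slot, so "discount" lands between quantity and price.
df.insert(2, "discount", 0)
print(df.columns.tolist())

# By default insert() refuses to add a name that already exists.
insert_refused = False
try:
    df.insert(0, "discount", 0)
except ValueError as exc:
    insert_refused = True
    print("insert refused:", exc)
```

Unlike bracket assignment, insert() modifies the DataFrame directly and returns nothing, so do not assign its result back to df.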
Removing a Column
To remove a column, use the drop() method.
Example: Remove the discount column.
df.drop(columns=["discount"], inplace=True)
With inplace=True, the column is removed from the DataFrame in memory. The CSV file on disk is not changed.
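By default, drop() returns a new DataFrame and leaves the original alone. The sketch below demonstrates this on a made-up DataFrame (values invented), keeping the result under a different name so both versions can be compared.

```python
import pandas as pd

# Hypothetical stand-in for the sales data; the values are invented.
df = pd.DataFrame({"quantity": [2, 1], "price": [100.0, 250.0], "discount": [0, 0]})

# Without inplace=True, drop() returns a new DataFrame; df keeps all its columns.
trimmed = df.drop(columns=["discount"])

print(df.columns.tolist())       # still includes "discount"
print(trimmed.columns.tolist())  # "discount" is gone
```

Writing df = df.drop(columns=["discount"]) has the same end result as using inplace=True.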
Removing Multiple Columns
You can remove multiple columns at once.
df.drop(columns=["sales_channel", "high_value_sale"], inplace=True)
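When a list contains a name that does not exist, drop() raises a KeyError by default. The sketch below, using a made-up one-row DataFrame and a deliberately wrong name (not_a_column), shows that behavior and the errors="ignore" option that skips missing names.

```python
import pandas as pd

# Hypothetical stand-in for the sales data; the values are invented.
df = pd.DataFrame({
    "quantity": [2],
    "price": [100.0],
    "sales_channel": ["Online"],
    "high_value_sale": [False],
})

# "not_a_column" is a deliberately wrong name used to trigger the error.
missing_raised = False
try:
    df.drop(columns=["sales_channel", "not_a_column"], inplace=True)
except KeyError as exc:
    missing_raised = True
    print("drop failed:", exc)

# errors="ignore" drops the names that exist and silently skips the rest.
df.drop(
    columns=["sales_channel", "high_value_sale", "not_a_column"],
    errors="ignore",
    inplace=True,
)
print(df.columns.tolist())
```

Use errors="ignore" with care, since it also hides genuine typos in column names.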
Checking the Final Structure
Always verify changes after modifying columns.
df.head()
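One way to make this check more precise is to compare the column names before and after your changes. This is only a sketch on a made-up DataFrame (values invented), not a required step.

```python
import pandas as pd

# Hypothetical stand-in for the sales data; the values are invented.
df = pd.DataFrame({"quantity": [2, 1], "price": [100.0, 250.0], "discount": [0, 0]})

before = set(df.columns)

df["total_revenue"] = df["quantity"] * df["price"]
df.drop(columns=["discount"], inplace=True)

after = set(df.columns)
added = sorted(after - before)    # columns that are new
removed = sorted(before - after)  # columns that no longer exist

print("added:", added)
print("removed:", removed)
print(df.head())
```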
Common Mistakes to Avoid
- Overwriting existing columns accidentally
- Using incorrect column names
- Forgetting inplace=True (or not assigning the result of drop() back to df), so the column is never actually removed
- Dropping columns before confirming they are not needed
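A few small habits guard against these mistakes. The sketch below is one possible approach on a made-up DataFrame; add_column_safely is a hypothetical helper written for this example, not a Pandas function.

```python
import pandas as pd

# Hypothetical stand-in for the sales data; the values are invented.
df = pd.DataFrame({"quantity": [2, 1], "price": [100.0, 250.0]})


def add_column_safely(frame, name, values):
    """Hypothetical helper: add a column but refuse to overwrite an existing one."""
    if name in frame.columns:
        raise ValueError(f"column {name!r} already exists")
    frame[name] = values


# Confirm the column names you rely on are spelled correctly and present.
required = ["quantity", "price"]
missing = [name for name in required if name not in df.columns]
print("missing columns:", missing)

add_column_safely(df, "total_revenue", df["quantity"] * df["price"])

# Keep a copy before dropping, in case the column turns out to be needed.
backup = df.copy()
df = df.drop(columns=["price"])
print(df.columns.tolist())
```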
Practice Exercise
Using the dataset:
- Add a column that calculates total sales value
- Add a column that marks sales above a threshold
- Remove one unnecessary column
- Verify the final column list
What’s Next?
In the next lesson, you will learn how to work with indexes and understand how Pandas uses them internally.