Pandas Lesson 11 – Columns | Dataplexa

Adding and Removing Columns in Pandas

In real-world data analysis, datasets rarely remain static. You often need to create new columns, modify existing ones, or remove unnecessary columns.

In this lesson, you will learn how to add, calculate, and delete columns safely using Pandas.


Loading the Dataset

We continue with the same sales dataset used in previous lessons.

import pandas as pd

df = pd.read_csv("dataplexa_pandas_sales.csv")

Viewing Current Columns

Before making changes, always inspect the existing columns.

df.columns
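
As a minimal sketch of what this inspection looks like, the snippet below builds a small stand-in DataFrame rather than reading the lesson's CSV. The column names order_id, quantity, and price are assumptions for illustration; the real file may contain different columns.

```python
import pandas as pd

# Hypothetical stand-in for the sales dataset; the real columns may differ.
df = pd.DataFrame({
    "order_id": [1, 2],
    "quantity": [2, 5],
    "price": [100.0, 120.0],
})

# df.columns is an Index object; tolist() gives a plain Python list of names.
column_names = df.columns.tolist()
print(column_names)  # ['order_id', 'quantity', 'price']

# dtypes shows the data type of each column, which matters before calculating.
print(df.dtypes)
```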

Adding a New Column with a Fixed Value

You can add a column by assigning a value directly.

Example: Add a column that labels all rows as "Online" sales.

df["sales_channel"] = "Online"

Pandas broadcasts this single value to every row in the DataFrame.
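
The following sketch shows the effect on a tiny DataFrame. The sample rows are made up for illustration and only assume that the dataset has quantity and price columns.

```python
import pandas as pd

# Hypothetical sample rows standing in for the sales dataset.
df = pd.DataFrame({"quantity": [2, 5, 1], "price": [100.0, 120.0, 40.0]})

# A single (scalar) value on the right-hand side is repeated for every row.
df["sales_channel"] = "Online"

print(df["sales_channel"].tolist())  # ['Online', 'Online', 'Online']
```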


Adding a Column Using Calculations

One of the most powerful features of Pandas is creating columns using existing data.

Example: Calculate total revenue using quantity and price.

df["total_revenue"] = df["quantity"] * df["price"]

The multiplication is applied row by row, so each row of total_revenue holds that row's quantity multiplied by its price.
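
To make the row-by-row behaviour concrete, here is a small sketch using made-up values in place of the real dataset:

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's values will differ.
df = pd.DataFrame({"quantity": [2, 5, 1], "price": [100.0, 120.0, 40.0]})

# Element-wise multiplication: row 0 is 2 * 100.0, row 1 is 5 * 120.0, and so on.
df["total_revenue"] = df["quantity"] * df["price"]

print(df["total_revenue"].tolist())  # [200.0, 600.0, 40.0]
```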


Adding a Conditional Column

You can create columns based on conditions.

Example: Mark high-value sales.

df["high_value_sale"] = df["total_revenue"] > 500

The comparison produces a boolean column: True where total_revenue is greater than 500 and False otherwise.
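
A short sketch of the result, again on made-up rows. Because True counts as 1 when summed, the same column can also be used to count matching rows.

```python
import pandas as pd

# Hypothetical sample rows standing in for the sales dataset.
df = pd.DataFrame({"quantity": [2, 5, 1], "price": [100.0, 120.0, 40.0]})
df["total_revenue"] = df["quantity"] * df["price"]

# Each row is compared with the threshold independently.
df["high_value_sale"] = df["total_revenue"] > 500

print(df["high_value_sale"].tolist())  # [False, True, False]

# Summing a boolean column counts the True values.
high_value_count = int(df["high_value_sale"].sum())
print(high_value_count)  # 1
```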


Inserting a Column at a Specific Position

Sometimes column order matters, especially for reports.

Example: Insert a column at position 2, which makes it the third column because positions start at 0.

df.insert(2, "discount", 0)

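Existing columns from position 2 onward shift one place to the right; nothing is overwritten. If a column named discount already exists, insert() raises a ValueError instead of replacing it.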
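
The sketch below illustrates both behaviours of insert() on a small stand-in DataFrame. The order_id column is an assumption added so that there are enough columns to show the shift.

```python
import pandas as pd

# Hypothetical columns; the real dataset may be laid out differently.
df = pd.DataFrame({
    "order_id": [1, 2],
    "quantity": [2, 5],
    "price": [100.0, 120.0],
})

# insert() modifies df directly and returns None.
df.insert(2, "discount", 0)
print(df.columns.tolist())  # ['order_id', 'quantity', 'discount', 'price']

# Inserting a name that already exists is rejected rather than overwritten.
duplicate_rejected = False
try:
    df.insert(0, "discount", 0)
except ValueError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```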


Removing a Column

To remove a column, use the drop() method.

Example: Remove the discount column.

df.drop(columns=["discount"], inplace=True)

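With inplace=True, the column is removed from df itself rather than from a copy. Only the DataFrame in memory changes; the CSV file on disk is not modified.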
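
As a sketch of the alternative to inplace=True, the example below shows that drop() returns a new DataFrame by default and leaves the original alone until the result is assigned back. The sample data is made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows standing in for the sales dataset.
df = pd.DataFrame({
    "quantity": [2, 5],
    "price": [100.0, 120.0],
    "discount": [0, 0],
})

# Without inplace=True, drop() returns a new DataFrame and df is unchanged.
trimmed = df.drop(columns=["discount"])
print("discount" in df.columns)       # True
print("discount" in trimmed.columns)  # False

# Assigning the result back has the same effect as inplace=True.
df = df.drop(columns=["discount"])
print(df.columns.tolist())  # ['quantity', 'price']
```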


Removing Multiple Columns

You can remove multiple columns at once.

df.drop(columns=["sales_channel", "high_value_sale"], inplace=True)
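
The sketch below runs a multi-column drop on made-up data. It also shows the errors="ignore" option of drop(), which skips names that are not present instead of raising a KeyError; whether silently skipping is desirable depends on your workflow.

```python
import pandas as pd

# Hypothetical sample rows standing in for the sales dataset.
df = pd.DataFrame({
    "quantity": [2, 5],
    "price": [100.0, 120.0],
    "sales_channel": ["Online", "Online"],
    "high_value_sale": [False, True],
})

to_remove = ["sales_channel", "high_value_sale"]

# All listed columns are removed in a single call.
df = df.drop(columns=to_remove)
print(df.columns.tolist())  # ['quantity', 'price']

# Repeating the drop would normally raise KeyError because the columns are gone.
# errors="ignore" skips missing names, so this call leaves df as it is.
df = df.drop(columns=to_remove, errors="ignore")
print(df.columns.tolist())  # ['quantity', 'price']
```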

Checking the Final Structure

Always verify changes after modifying columns.

df.head()

Common Mistakes to Avoid

  • Overwriting existing columns accidentally
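  • Misspelling column names (names are case-sensitive)
  • Calling drop() without inplace=True and without assigning the result back, which leaves the original DataFrame unchanged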
  • Dropping columns before confirming they are not needed
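
A few of these mistakes can be caught with simple checks. The sketch below is one possible approach on made-up data; the misspelled name "pirce" is deliberate.

```python
import pandas as pd

# Hypothetical sample rows standing in for the sales dataset.
df = pd.DataFrame({"quantity": [2, 5], "price": [100.0, 120.0]})

# Check whether a name is already taken before assigning to it.
new_column = "price"
would_overwrite = new_column in df.columns
if would_overwrite:
    print(f"'{new_column}' already exists; assigning to it would overwrite the data")

# A misspelled name raises KeyError when it is read or dropped.
misspelling_caught = False
try:
    df.drop(columns=["pirce"])
except KeyError:
    misspelling_caught = True
    print("No column named 'pirce'; available:", df.columns.tolist())
```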

Practice Exercise

Using the dataset:

  • Add a column that calculates total sales value
  • Add a column that marks sales above a threshold
  • Remove one unnecessary column
  • Verify the final column list

What’s Next?

In the next lesson, you will learn how to work with indexes and understand how Pandas uses them internally.