Applying Functions in Pandas
In real-world data analysis, you often need to transform, calculate, or customize values inside a DataFrame.
Pandas provides powerful ways to apply functions to rows, columns, and individual values.
In this lesson, you will learn how to use apply(), applymap(), and map().
Loading the Dataset
We continue using the same dataset throughout this course.
import pandas as pd
df = pd.read_csv("dataplexa_pandas_sales.csv")
Why Applying Functions Is Important
Applying functions allows you to:
- Perform calculations on columns
- Create derived values
- Standardize or clean data
- Apply business logic to datasets
Applying a Function to a Column
Suppose we want to calculate a 10% tax on the total_amount column.
df["tax"] = df["total_amount"].apply(lambda x: x * 0.10)
This creates a new column where each value is 10% of the total amount.
Using apply() on Multiple Columns
You can also apply a function row by row, for example to calculate the final price including tax.
df["final_price"] = df.apply(
    lambda row: row["total_amount"] + row["tax"],
    axis=1
)
Here:
- axis=1 means a row-wise operation
- Each row is passed to the function
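To see what a row-wise call does, here is a minimal sketch on a tiny hand-made DataFrame. The sample values and the add_tax helper are assumptions for illustration and are not part of the course dataset.

```python
import pandas as pd

# Hypothetical sample rows standing in for the course dataset.
sample = pd.DataFrame({
    "total_amount": [100.0, 250.0],
    "tax": [10.0, 25.0],
})

# With axis=1, the function receives one row at a time as a Series,
# so individual columns are read by label.
def add_tax(row):
    return row["total_amount"] + row["tax"]

sample["final_price"] = sample.apply(add_tax, axis=1)

# For simple arithmetic like this, plain column addition gives the same values.
final_price_columns = sample["total_amount"] + sample["tax"]
```

Row-wise apply() is most useful when the logic needs several columns and cannot be written as simple column arithmetic.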
Creating Custom Functions
Instead of lambda functions, you can define your own reusable functions.
def discount(amount):
    if amount > 500:
        return amount * 0.90
    else:
        return amount
df["discounted_price"] = df["total_amount"].apply(discount)
This applies a 10% discount only for orders greater than 500.
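A reusable function can also take extra parameters, which apply() forwards as keyword arguments. The sketch below assumes a hypothetical discount_with_threshold function and made-up amounts; the threshold and rate parameters are illustrative and do not come from the lesson.

```python
import pandas as pd

# Hypothetical variant of discount() with a configurable threshold and rate.
def discount_with_threshold(amount, threshold=500, rate=0.10):
    if amount > threshold:
        return amount * (1 - rate)
    return amount

# Made-up order amounts; 500 sits exactly on the default threshold.
amounts = pd.Series([200.0, 500.0, 800.0])

# Uses the defaults: only amounts strictly greater than 500 are discounted.
default_rule = amounts.apply(discount_with_threshold)

# Extra keyword arguments are passed through to the function.
stricter_rule = amounts.apply(discount_with_threshold, threshold=700, rate=0.20)
```

Because the comparison is strictly greater than, an order of exactly 500 is left unchanged under the default rule.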
Applying Functions to All Values
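If you want to apply a function to every value in the DataFrame, use applymap().
In pandas 2.1 and later, applymap() is deprecated in favor of DataFrame.map(), which is used the same way.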
df_numeric = df.select_dtypes(include="number")
df_numeric.applymap(lambda x: round(x, 2))
This returns a new DataFrame with all numeric values rounded to two decimals; assign the result to a variable if you want to keep it.
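The element-wise method has a different name depending on the pandas version, so a version-tolerant sketch might look like the following. The sample DataFrame is an assumption for illustration.

```python
import pandas as pd

# Hypothetical numeric data standing in for df_numeric.
sample = pd.DataFrame({"price": [10.456, 20.123], "qty": [1.0, 2.0]})

# DataFrame.map is the newer name (pandas 2.1+); older versions only have applymap.
elementwise = sample.map if hasattr(sample, "map") else sample.applymap
rounded = elementwise(lambda x: round(x, 2))

# For plain rounding, the built-in round() method avoids one Python call per value.
rounded_builtin = sample.round(2)
```

Both approaches return a new DataFrame and leave the original unchanged.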
Using map() for Value Mapping
The map() function is useful for replacing values based on a mapping.
Example: converting payment types into readable labels.
payment_map = {
    "CC": "Credit Card",
    "DC": "Debit Card",
    "COD": "Cash on Delivery"
}
df["payment_method"] = df["payment_code"].map(payment_map)
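One detail worth knowing: when map() is given a dictionary, any value without a matching key becomes NaN. The sketch below uses made-up codes, including a hypothetical "UPI" code that is deliberately missing from the mapping, and shows one possible way to fall back to the original code.

```python
import pandas as pd

payment_map = {
    "CC": "Credit Card",
    "DC": "Debit Card",
    "COD": "Cash on Delivery",
}

# Hypothetical codes; "UPI" has no entry in payment_map.
codes = pd.Series(["CC", "UPI", "COD"])

# Codes missing from the dictionary are mapped to NaN.
mapped = codes.map(payment_map)

# One option: keep the original code wherever no label was found.
labelled = mapped.fillna(codes)
```

Checking mapped.isna().sum() after a mapping is a quick way to spot codes the dictionary does not cover.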
Handling Missing Values During apply()
apply() passes missing values (NaN) to your function like any other value, so handle them explicitly to avoid errors or unexpected NaN results.
df["safe_total"] = df["total_amount"].apply(
    lambda x: 0 if pd.isna(x) else x
)
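When the only goal is to replace missing values with a constant, the built-in fillna() method does the same job without a custom function. A minimal sketch, assuming a small made-up Series in place of the total_amount column:

```python
import numpy as np
import pandas as pd

# Hypothetical totals with one missing value.
totals = pd.Series([120.0, np.nan, 80.0])

# The apply() approach from the lesson.
via_apply = totals.apply(lambda x: 0 if pd.isna(x) else x)

# The built-in equivalent for a constant replacement.
via_fillna = totals.fillna(0)
```

The apply() form is still the better fit when the function does more than substitute a default, for example when it skips a calculation for missing inputs.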
Performance Considerations
- apply() is flexible but slower than vectorized operations
- Vectorized operations are faster when possible
- Use apply() only when logic cannot be vectorized
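As a rough illustration of these points, the tax and discount calculations from this lesson can be written without apply(). This is a sketch on made-up amounts; using numpy.where for the conditional discount is one possible vectorized approach, not the only one.

```python
import numpy as np
import pandas as pd

# Hypothetical order amounts.
sample = pd.DataFrame({"total_amount": [200.0, 800.0, 500.0]})

# apply() calls the lambda once per value.
tax_apply = sample["total_amount"].apply(lambda x: x * 0.10)

# The vectorized form multiplies the whole column in one operation.
tax_vectorized = sample["total_amount"] * 0.10

# Conditional logic can often be vectorized too: discount orders above 500.
sample["discounted_price"] = np.where(
    sample["total_amount"] > 500,
    sample["total_amount"] * 0.90,
    sample["total_amount"],
)
```

On a few rows the difference is negligible; it tends to matter as the number of rows grows.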
Practice Exercise
Using the dataset:
- Create a new column with 5% tax
- Apply a discount for large orders
- Round numeric values
- Map coded values to readable labels
What’s Next?
In the next lesson, you will learn how to perform string operations for cleaning and transforming text data.