Data Frames in R
In this lesson, you will learn about data frames in R. Data frames are one of the most important data structures in R and are widely used in real-world data analysis.
Most tabular datasets you work with in R are stored as data frames, so understanding them clearly is a key step toward becoming confident with R.
What Is a Data Frame?
A data frame is a table-like structure that stores data in rows and columns.
Different columns can hold different data types, but all values within a single column must share the same type.
You can think of a data frame as similar to a spreadsheet or a table in a database.
Why Data Frames Are Important
Data frames allow you to store structured datasets where each row represents an observation and each column represents a variable.
This layout makes the data straightforward to filter, summarize, analyze, and visualize.
Creating a Data Frame
Data frames are created using the data.frame() function.
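Each column is defined as a vector, and the vectors should all have the same length. R only recycles a shorter vector when its length divides the longest one evenly; otherwise data.frame() raises an error.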
data <- data.frame(
  id = c(1, 2, 3),
  score = c(85, 90, 88),
  passed = c(TRUE, TRUE, TRUE)
)
data
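Here, the data frame combines two numeric columns (id and score) with a logical column (passed). Numbers written without an L suffix, such as c(1, 2, 3), are stored as numeric rather than integer.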
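To check which type each column actually holds, one option is to ask R for the class of every column. The sketch below is illustrative only: `example_df` is a hypothetical name, and the `L` suffix on the id values is an assumption added here to show how an integer column would be requested.

```r
# Hypothetical data frame modeled on the lesson's example
example_df <- data.frame(
  id = c(1L, 2L, 3L),           # the L suffix asks for integer storage
  score = c(85, 90, 88),        # plain numbers are stored as numeric (double)
  passed = c(TRUE, TRUE, TRUE)  # logical values
)

# Collect the class of every column into a named character vector
col_types <- sapply(example_df, class)
print(col_types)
```

Running the same check on a data frame built with plain c(1, 2, 3) would report numeric for that column instead of integer.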
Viewing a Data Frame
You can print a data frame directly to view its contents.
For larger datasets, R provides functions to preview the data.
head(data)
tail(data)
By default, head() shows the first six rows and tail() shows the last six rows of the data frame.
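Both functions accept an n argument when a different number of rows is wanted. A minimal sketch, using a hypothetical ten-row data frame called `example_df` with made-up scores:

```r
# Hypothetical ten-row data frame (values invented for illustration)
example_df <- data.frame(
  id = 1:10,
  score = c(85, 90, 88, 72, 95, 60, 78, 83, 91, 67)
)

# Ask for a specific number of rows instead of the default
first_three <- head(example_df, n = 3)
last_two <- tail(example_df, n = 2)

print(first_three)
print(last_two)
```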
Checking Structure of a Data Frame
The str() function displays the structure of a data frame.
It shows column names, data types, and example values.
str(data)
This is very helpful when exploring a new dataset.
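Alongside str(), a few base R helpers report individual facts about a data frame, which can be handy when only the size or the column names are needed. This is a small sketch with a hypothetical `example_df`; the helper functions shown (nrow(), ncol(), names()) are not covered in the lesson text above.

```r
# Hypothetical data frame modeled on the lesson's example
example_df <- data.frame(
  id = c(1, 2, 3),
  score = c(85, 90, 88),
  passed = c(TRUE, TRUE, TRUE)
)

n_rows <- nrow(example_df)      # number of observations
n_cols <- ncol(example_df)      # number of variables
col_names <- names(example_df)  # column names as a character vector

print(n_rows)
print(n_cols)
print(col_names)
```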
Accessing Columns in a Data Frame
You can access a column by writing the data frame name, the $ operator, and the column name.
This returns all values stored in that column.
data$score
This method is commonly used for analysis and calculations.
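Because the $ operator returns an ordinary vector, the result can be passed straight to summary functions. A brief sketch, assuming a hypothetical `example_df` with the same score values as the lesson's example:

```r
# Hypothetical data frame with the lesson's score values
example_df <- data.frame(
  id = c(1, 2, 3),
  score = c(85, 90, 88)
)

# The extracted column behaves like any numeric vector
avg_score <- mean(example_df$score)
top_score <- max(example_df$score)

# Double brackets with a quoted name return the same vector
same_column <- example_df[["score"]]

print(avg_score)
print(top_score)
```

The double-bracket form is useful when the column name is stored in a variable rather than typed directly.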
Accessing Rows and Columns by Index
You can access specific rows and columns using row and column numbers.
The format is [row, column].
data[1, ]   # first row, all columns
data[, 2]   # second column, all rows
data[2, 2]  # value in row 2, column 2
This allows precise selection of data.
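One detail worth knowing is that the shape of the result depends on what is selected: a row comes back as a data frame, while a single column comes back as a plain vector unless drop = FALSE is supplied. The sketch below illustrates this with a hypothetical `example_df`.

```r
# Hypothetical data frame modeled on the lesson's example
example_df <- data.frame(
  id = c(1, 2, 3),
  score = c(85, 90, 88),
  passed = c(TRUE, TRUE, TRUE)
)

first_row <- example_df[1, ]               # one-row data frame
score_vec <- example_df[, 2]               # plain numeric vector
score_df <- example_df[, 2, drop = FALSE]  # one-column data frame
single_value <- example_df[2, 2]           # the value 90

print(class(first_row))
print(class(score_vec))
print(class(score_df))
```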
Adding a New Column
You can add a new column to a data frame easily.
The new column must have one value for each existing row, or a single value that R repeats for every row.
data$grade <- c("A", "A", "A")
data
The assignment modifies data directly, so the new column is available immediately.
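A new column can also be computed from an existing one, or filled with a single repeated value. The following is a sketch only: `example_df`, `bonus_score`, and `term` are hypothetical names, and the five-point bonus is an invented rule for illustration.

```r
# Hypothetical data frame with the lesson's score values
example_df <- data.frame(
  id = c(1, 2, 3),
  score = c(85, 90, 88)
)

# Computed column: arithmetic is applied to every row
example_df$bonus_score <- example_df$score + 5

# Single value: R repeats it for each row
example_df$term <- "Spring"

print(example_df)
```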
Adding a New Row
New rows can be added using the rbind() function.
The new row must have the same column names as the data frame, with values of compatible types.
new_row <- data.frame(id = 4, score = 92, passed = TRUE, grade = "A")
data <- rbind(data, new_row)
data
This is useful when appending new records.
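For data frames, rbind() lines columns up by name, so the order of columns in the new row does not have to match, but the names do. The sketch below shows both cases with a hypothetical `example_df`; the misnamed column `marks` is invented to trigger the error, and tryCatch() is used here only so the example keeps running.

```r
# Hypothetical two-row data frame
example_df <- data.frame(id = c(1, 2), score = c(85, 90))

# Same column names in a different order: columns are matched by name
reordered_row <- data.frame(score = 92, id = 3)
combined <- rbind(example_df, reordered_row)
print(combined)

# A misnamed column ("marks" instead of "score") makes rbind() fail
result <- tryCatch(
  rbind(example_df, data.frame(id = 4, marks = 70)),
  error = function(e) "names do not match"
)
print(result)
```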
Removing Columns or Rows
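You can remove a column by assigning NULL to it. Rows are removed by selecting only the rows you want to keep, for example with a negative index such as data[-1, ].
Removing unneeded columns and rows helps clean and simplify datasets.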
data$passed <- NULL
data
This removes the passed column from the data frame.
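Rows can be dropped in a similar spirit by keeping only the rows that are wanted. A minimal sketch with a hypothetical `example_df`; the 90-point cutoff is an invented threshold for illustration.

```r
# Hypothetical four-row data frame
example_df <- data.frame(
  id = c(1, 2, 3, 4),
  score = c(85, 90, 88, 92)
)

# A negative index drops that row and keeps the rest
without_second <- example_df[-2, ]

# A logical condition keeps only the rows where it is TRUE
high_scores <- example_df[example_df$score >= 90, ]

print(without_second)
print(high_scores)
```

Neither line changes `example_df` itself; the result has to be assigned back to the original name to make the removal permanent.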
Why Data Frames Are Used Everywhere
Many R functions and packages are designed to work with data frames.
They are the foundation for data cleaning, visualization, and modeling.
📝 Practice Exercises
Exercise 1
Create a data frame with three columns: name, age, and score.
Exercise 2
Display the structure of the data frame.
Exercise 3
Add a new column called status.
Exercise 4
Add a new row to the data frame.
✅ Practice Answers
Answer 1
df <- data.frame(
  name = c("Alex", "Jordan", "Taylor"),
  age = c(25, 30, 28),
  score = c(80, 90, 85)
)
df
Answer 2
str(df)
Answer 3
df$status <- "Active"  # a single value is repeated for every row
df
Answer 4
df <- rbind(df, data.frame(name = "Morgan", age = 26, score = 88, status = "Active"))
df
What’s Next?
Now that you understand data frames, the next lesson will focus on basic operations.
You will learn how to perform calculations and transformations on data stored in R.