Data Frames | Dataplexa

Data Frames in R

In this lesson, you will learn about data frames in R. Data frames are one of the most important data structures in R and are widely used in real-world data analysis.

Most tabular datasets you work with in R will be stored as data frames, so understanding them clearly is a key step toward becoming confident with R.


What Is a Data Frame?

A data frame is a table-like structure that stores data in rows and columns.

Each column can contain a different data type, but all values in a column must be of the same type.

You can think of a data frame as similar to a spreadsheet or a table in a database.
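As a small illustration of the "one type per column" rule, the sketch below builds a tiny data frame and asks R for the class of each column. The names `people`, `name`, `age`, `member`, and `mixed` are made up for this example. The second part shows what happens when types are mixed inside a single vector: R converts every value to one common type.

```r
# Hypothetical example data; all names here are illustrative only.
people <- data.frame(
  name = c("Ana", "Ben", "Cy"),
  age = c(31, 45, 27),
  member = c(TRUE, FALSE, TRUE)
)

# Each column has exactly one type, but the types differ between columns.
col_types <- sapply(people, class)
print(col_types)

# Mixing types inside one vector forces a common type.
# Here the number and the logical value are both converted to text.
mixed <- c(1, "two", TRUE)
print(class(mixed))
```

Note that the class reported for the `name` column depends on the R version: recent versions keep text as character, while versions before 4.0 convert it to a factor by default.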


Why Data Frames Are Important

Data frames allow you to store structured datasets where each row represents an observation and each column represents a variable.

They are easy to analyze, filter, summarize, and visualize.
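To make "filter" and "summarize" concrete, here is a minimal sketch using a hypothetical `results` data frame (the name and the values are invented for illustration). It keeps only the rows that meet a condition and then computes the average of one variable.

```r
# Hypothetical data: each row is one observation, each column one variable.
results <- data.frame(
  id = c(1, 2, 3, 4),
  score = c(85, 90, 88, 72)
)

# Filter: keep the rows whose score is at least 85.
high <- results[results$score >= 85, ]
print(high)

# Summarize: average of the score column.
avg <- mean(results$score)
print(avg)
```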


Creating a Data Frame

Data frames are created using the data.frame() function.

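Each column is defined as a vector. The vectors should all have the same length; R recycles a shorter vector only when its length divides the longest one evenly, and otherwise raises an error.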

data <- data.frame(
  id = c(1, 2, 3),
  score = c(85, 90, 88),
  passed = c(TRUE, TRUE, TRUE)
)

data

Here, the data frame contains two numeric columns (id and score) and one logical column (passed). Numbers written as 1, 2, 3 are stored as numeric values, not integers.
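To check which types R actually assigned, you can ask for the class of each column. The sketch below rebuilds the same data frame, then shows one way to get a true integer column (the `L` suffix) and what happens when column lengths cannot be reconciled. The names `data_int` and `mismatch` are hypothetical and used only for this illustration.

```r
data <- data.frame(
  id = c(1, 2, 3),
  score = c(85, 90, 88),
  passed = c(TRUE, TRUE, TRUE)
)

# Class of every column in the data frame.
print(sapply(data, class))

# The L suffix creates integer values instead of the default numeric type.
data_int <- data.frame(id = c(1L, 2L, 3L), score = c(85, 90, 88))
print(class(data_int$id))

# Columns of length 3 and 2 cannot be combined, so data.frame() stops
# with an error. tryCatch() captures the message instead of halting.
mismatch <- tryCatch(
  data.frame(id = c(1, 2, 3), score = c(85, 90)),
  error = function(e) conditionMessage(e)
)
print(mismatch)
```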


Viewing a Data Frame

You can print a data frame directly to view its contents.

For larger datasets, R provides functions to preview the data.

head(data)
tail(data)

These functions show the first or last six rows by default. Pass the n argument to change that, for example head(data, n = 10).
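With only three rows, the example data frame is too small to show the effect of previewing. The sketch below uses a hypothetical 20-row data frame called `big` (an invented name) to show how `head()` and `tail()` limit the output.

```r
# Hypothetical larger data frame with 20 rows.
big <- data.frame(id = 1:20, value = (1:20)^2)

# Default preview: the first six rows.
print(head(big))

# The n argument controls how many rows are returned.
first_three <- head(big, n = 3)
last_two <- tail(big, n = 2)

print(first_three)
print(last_two)
```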


Checking Structure of a Data Frame

The str() function displays the structure of a data frame.

It shows column names, data types, and example values.

str(data)

This is very helpful when exploring a new dataset.
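Alongside `str()`, a few other base R functions are commonly used for a first look at a dataset. The sketch below rebuilds the lesson's example data frame and prints its dimensions, column names, and a per-column summary.

```r
data <- data.frame(
  id = c(1, 2, 3),
  score = c(85, 90, 88),
  passed = c(TRUE, TRUE, TRUE)
)

# Number of rows and columns, in that order.
print(dim(data))

# Column names.
print(names(data))

# Basic statistics for each column.
print(summary(data))
```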


Accessing Columns in a Data Frame

You can access a column using the $ operator followed by the column name.

This returns all values stored in that column as a vector.

data$score

This method is commonly used for analysis and calculations.
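As a sketch of the kind of calculation this enables, the example below extracts the score column in two equivalent ways and computes simple statistics on it. The variable names `scores`, `same_scores`, `avg_score`, and `top_score` are hypothetical.

```r
data <- data.frame(
  id = c(1, 2, 3),
  score = c(85, 90, 88),
  passed = c(TRUE, TRUE, TRUE)
)

# Two equivalent ways to extract a column as a vector.
scores <- data$score
same_scores <- data[["score"]]

# Because the result is an ordinary vector, vector functions apply directly.
avg_score <- mean(data$score)
top_score <- max(data$score)

print(avg_score)
print(top_score)
```

The double-bracket form is handy when the column name is stored in a variable, since `$` only accepts a literal name.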


Accessing Rows and Columns by Index

You can access specific rows and columns using row and column numbers.

The format is [row, column].

data[1, ]    # first row, all columns
data[, 2]    # second column, all rows
data[2, 2]   # value in row 2, column 2

This allows precise selection of data.
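The same bracket notation also accepts column names, which is often easier to read than positions. The sketch below shows this, along with the `drop = FALSE` argument that keeps a single-column selection as a data frame instead of simplifying it to a vector. The variable names `cell`, `two_cols`, `as_vector`, and `one_col_df` are hypothetical.

```r
data <- data.frame(
  id = c(1, 2, 3),
  score = c(85, 90, 88),
  passed = c(TRUE, TRUE, TRUE)
)

# Select by column name instead of position.
cell <- data[2, "score"]
two_cols <- data[, c("id", "score")]

# Selecting one column returns a plain vector by default.
as_vector <- data[, "score"]

# drop = FALSE keeps the result as a one-column data frame.
one_col_df <- data[, "score", drop = FALSE]

print(cell)
print(two_cols)
print(is.data.frame(as_vector))
print(is.data.frame(one_col_df))
```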


Adding a New Column

You can add a new column to a data frame easily.

The new column must supply either one value per existing row or a single value, which R repeats for every row.

data$grade <- c("A", "A", "A")
data

After the assignment, data contains the new grade column alongside the original ones.
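New columns are often computed from existing ones. The sketch below adds a derived column, a constant column, and then shows what happens when the number of values does not fit the number of rows. The column names `percent`, `source`, and `bad` and the variable `outcome` are hypothetical.

```r
data <- data.frame(
  id = c(1, 2, 3),
  score = c(85, 90, 88)
)

# A column computed from an existing column.
data$percent <- data$score / 100

# A single value is repeated for every row.
data$source <- "exam"

# Two values do not fit three rows, so the assignment fails.
# tryCatch() turns the error into a simple status string.
outcome <- tryCatch(
  {
    data$bad <- c(1, 2)
    "ok"
  },
  error = function(e) "error"
)

print(data)
print(outcome)
```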


Adding a New Row

New rows can be added using the rbind() function.

The new row must have the same column names as the existing data frame, with values of compatible types.

new_row <- data.frame(id = 4, score = 92, passed = TRUE, grade = "A")
data <- rbind(data, new_row)

data

This is useful when appending new records.
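To show what "matching the structure" means in practice, the sketch below appends one well-formed row and then tries a row whose column names differ. The names `bad_row`, `points`, and `outcome` are hypothetical and exist only for this illustration.

```r
data <- data.frame(
  id = c(1, 2, 3),
  score = c(85, 90, 88)
)

# A row with the same column names is appended without trouble.
new_row <- data.frame(id = 4, score = 92)
data <- rbind(data, new_row)
print(data)

# A row with a different column name (points instead of score) is rejected.
bad_row <- data.frame(id = 5, points = 70)
outcome <- tryCatch(
  {
    rbind(data, bad_row)
    "ok"
  },
  error = function(e) "error"
)
print(outcome)
```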


Removing Columns or Rows

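You can remove a column by assigning NULL to it. You can also drop rows or columns by selecting only what you want to keep, or by using negative indices such as data[-1, ] to leave out the first row.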

This helps clean and simplify datasets.

data$passed <- NULL
data

This removes the passed column; the remaining columns are unchanged.
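Removing rows works through indexing instead of NULL assignment. The sketch below shows three common patterns on a hypothetical four-row data frame: dropping a row by position, keeping only the rows that meet a condition, and keeping only selected columns. The variable names `without_first`, `kept`, and `only_id_score` are invented for this example.

```r
data <- data.frame(
  id = c(1, 2, 3, 4),
  score = c(85, 90, 88, 92),
  grade = c("A", "A", "A", "A")
)

# A negative row index drops that row.
without_first <- data[-1, ]

# A logical condition keeps only the rows where it is TRUE.
kept <- data[data$score > 86, ]

# Listing column names keeps those columns and drops the rest.
only_id_score <- data[, c("id", "score")]

print(without_first)
print(kept)
print(only_id_score)
```

None of these expressions change `data` itself; assign the result back to `data` if you want the removal to persist.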


Why Data Frames Are Used Everywhere

Many R functions and packages for data analysis are designed to take data frames as input.

They are the foundation for data cleaning, visualization, and modeling.


📝 Practice Exercises


Exercise 1

Create a data frame with three columns: name, age, and score.

Exercise 2

Display the structure of the data frame.

Exercise 3

Add a new column called status.

Exercise 4

Add a new row to the data frame.


✅ Practice Answers


Answer 1

df <- data.frame(
  name = c("Alex", "Jordan", "Taylor"),
  age = c(25, 30, 28),
  score = c(80, 90, 85)
)

df

Answer 2

str(df)

Answer 3

df$status <- "Active"
df

Answer 4

df <- rbind(df, data.frame(name = "Morgan", age = 26, score = 88, status = "Active"))
df

What’s Next?

Now that you understand data frames, the next lesson will focus on basic operations.

You will learn how to perform calculations and transformations on data stored in R.