Correlation & Regression in R
Correlation and regression are statistical techniques used to understand relationships between variables.
They help answer questions like whether variables move together and how one variable can be used to predict another.
What Is Correlation?
Correlation measures the strength and direction of a relationship between two numeric variables.
It tells us whether the variables tend to move in the same direction (both rising or both falling) or in opposite directions.
Correlation Coefficient
The correlation coefficient ranges between -1 and +1.
- +1 → Perfect positive relationship
- 0 → No linear relationship
- -1 → Perfect negative relationship
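To make the coefficient less abstract, the sketch below computes Pearson's r by hand as the covariance divided by the product of the two standard deviations, then compares it with cor(). The vectors hours and score are made-up illustration data, not part of the lesson.

```r
# Hypothetical illustration data (assumed for this sketch)
hours <- c(1, 2, 3, 4, 5)
score <- c(52, 58, 61, 70, 74)

# Pearson's r = covariance / (sd of first variable * sd of second variable)
r_manual <- cov(hours, score) / (sd(hours) * sd(score))

r_manual
cor(hours, score)  # should agree with r_manual
```

Because the result is scaled by both standard deviations, it has no units and always falls between -1 and +1.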
Calculating Correlation in R
R provides the cor() function to calculate correlation.
Both vectors must be numeric and of the same length.
x <- c(10, 20, 30, 40, 50)
y <- c(15, 25, 35, 45, 55)
cor(x, y)
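Real data often contains missing values. As a small sketch, assuming two made-up vectors x_na and y_na, the example below shows that cor() returns NA by default when any value is missing, and that the use argument can restrict the calculation to complete pairs.

```r
# Hypothetical vectors with one missing value (assumed for this sketch)
x_na <- c(10, 20, NA, 40, 50)
y_na <- c(12, 27, 33, 41, 58)

# By default, a missing value makes the result NA
cor(x_na, y_na)

# use = "complete.obs" drops any pair that contains an NA before computing
cor(x_na, y_na, use = "complete.obs")
```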
Types of Correlation
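There are different methods to calculate correlation, depending on the data and the kind of relationship you expect.
- Pearson – Strength of a linear relationship (default)
- Spearman – Rank-based; captures monotonic relationships, even when they are not linear
- Kendall – Rank-based; compares concordant and discordant pairs, often used for ordinal data or small samples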
cor(x, y, method = "spearman")
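The difference between the methods shows up when a relationship is monotonic but not linear. The sketch below uses made-up vectors u and v, where v always increases with u but along a curve.

```r
# Hypothetical data: v rises with u, but not along a straight line
u <- 1:10
v <- u^3

pearson  <- cor(u, v)                       # less than 1: the pattern is curved
spearman <- cor(u, v, method = "spearman")  # 1: the ranks agree perfectly
kendall  <- cor(u, v, method = "kendall")   # 1: every pair is concordant

c(pearson = pearson, spearman = spearman, kendall = kendall)
```

The rank-based methods only ask whether the ordering is preserved, so they report a perfect relationship here, while Pearson is penalized for the curvature.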
What Is Regression?
Regression models how a dependent (response) variable changes as an independent (predictor) variable changes.
It is commonly used for prediction and trend analysis.
Simple Linear Regression
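Simple linear regression models the relationship between one independent variable and one dependent variable.
The fitted equation is a straight line: y = intercept + slope × x.
In lm(), the formula y ~ x reads "y modeled by x", so the dependent variable goes on the left.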
model <- lm(y ~ x)
model
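To show where the two coefficients come from, the following sketch reproduces them by hand using the standard least-squares formulas: the slope is cov(x, y) / var(x), and the intercept is mean(y) minus the slope times mean(x). The vectors ad_spend and sales are made-up illustration data.

```r
# Hypothetical illustration data (assumed for this sketch)
ad_spend <- c(10, 20, 30, 40, 50)
sales    <- c(18, 24, 37, 41, 56)

fit <- lm(sales ~ ad_spend)
coef(fit)  # intercept and slope estimated by lm()

# The same values from the least-squares formulas
slope_manual     <- cov(ad_spend, sales) / var(ad_spend)
intercept_manual <- mean(sales) - slope_manual * mean(ad_spend)

c(intercept = intercept_manual, slope = slope_manual)
```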
Understanding Regression Output
The regression model provides important information such as:
- Intercept
- Slope
- Residuals
The slope is the estimated change in the dependent variable for each one-unit increase in the independent variable.
# Note: x and y above lie exactly on a line, so R may warn about an "essentially perfect fit"
summary(model)
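Individual pieces of the output can also be pulled out directly rather than read off the printed summary. This is a minimal sketch using a made-up data frame called ads; the accessor functions coef(), fitted(), and residuals() are part of base R.

```r
# Hypothetical illustration data (assumed for this sketch)
ads <- data.frame(
  ad_spend = c(10, 20, 30, 40, 50),
  sales    = c(18, 24, 37, 41, 56)
)

fit <- lm(sales ~ ad_spend, data = ads)

coef(fit)               # intercept and slope
fitted(fit)             # values predicted by the line for each row
residuals(fit)          # observed sales minus fitted sales
summary(fit)$r.squared  # share of the variance in sales explained by the model
```

Each residual is the vertical distance between an observed point and the fitted line, so fitted values plus residuals give back the original data.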
Making Predictions
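Regression models can be used to predict new values. The new values must be supplied as a data frame whose column name matches the predictor used in the model (x here).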
This is one of the most practical uses of regression analysis.
new_data <- data.frame(x = c(60, 70))
predict(model, new_data)
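Under the hood, a prediction from a simple linear model is just intercept + slope × new value. The sketch below checks this against predict() using made-up data (ads and new_ads are assumptions for illustration), and also requests a prediction interval to show the uncertainty around each estimate.

```r
# Hypothetical illustration data (assumed for this sketch)
ads <- data.frame(
  ad_spend = c(10, 20, 30, 40, 50),
  sales    = c(18, 24, 37, 41, 56)
)
fit <- lm(sales ~ ad_spend, data = ads)

# The column name must match the predictor in the formula
new_ads <- data.frame(ad_spend = c(60, 70))

pred_auto   <- predict(fit, newdata = new_ads)
pred_manual <- coef(fit)[1] + coef(fit)[2] * new_ads$ad_spend

pred_auto
pred_manual  # same numbers as pred_auto

# Lower and upper bounds for each predicted value (95% by default)
predict(fit, newdata = new_ads, interval = "prediction")
```

Both new values lie outside the range of the observed data, so these predictions assume the straight-line pattern continues to hold there.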
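Difference Between Correlation and Regression
- Correlation measures the strength and direction of a relationship
- Regression fits an equation that describes how one variable changes with another
- Correlation treats both variables the same way and does not produce predictions
- Regression distinguishes a dependent variable from an independent variable and supports prediction
- Neither technique proves cause and effect on its own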
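The two techniques are closely linked in the simple linear case. As a sketch on made-up data (ad_spend and sales are assumptions for illustration), the slope equals the correlation scaled by the ratio of standard deviations, and R-squared equals the correlation squared.

```r
# Hypothetical illustration data (assumed for this sketch)
ad_spend <- c(10, 20, 30, 40, 50)
sales    <- c(18, 24, 37, 41, 56)

r   <- cor(ad_spend, sales)
fit <- lm(sales ~ ad_spend)

# Slope = r * sd(y) / sd(x)
slope_from_r <- r * sd(sales) / sd(ad_spend)
all.equal(unname(coef(fit)[2]), slope_from_r)

# R-squared of the model = r squared
all.equal(summary(fit)$r.squared, r^2)

# Correlation is symmetric; the regression slope is not
cor(sales, ad_spend)             # same as r
coef(lm(ad_spend ~ sales))[2]    # differs from the slope of sales ~ ad_spend
```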
Why This Matters
Correlation and regression are used widely in data analysis, business forecasting, research, and machine learning.
They form the foundation for more advanced statistical models.
📝 Practice Exercises
Exercise 1
Calculate the correlation between two numeric vectors.
Exercise 2
Create a simple linear regression model.
Exercise 3
Display the summary of a regression model.
Exercise 4
Use a regression model to predict new values.
✅ Practice Answers
Answer 1
a <- c(2, 4, 6, 8)
b <- c(3, 6, 9, 12)
cor(a, b)
Answer 2
model <- lm(b ~ a)
model
Answer 3
summary(model)
Answer 4
new_points <- data.frame(a = c(10, 12))
predict(model, new_points)
What’s Next?
In the next lesson, you will explore Time Series Analysis, which focuses on data collected over time.