Mathematics Lesson 72 – Hypothesis Testing Basics | Dataplexa

Hypothesis Testing Basics

In real life and in data analysis, we often need to make decisions based on limited information.

Hypothesis testing is a statistical framework that helps us decide whether an assumption about a population is reasonable based on sample data.

These ideas come up in school exams, competitive exams, research, business analytics, data science, and machine learning.


Why Hypothesis Testing Is Needed

We usually cannot observe an entire population. Instead, we take a sample and try to draw conclusions.

But samples vary due to randomness. Hypothesis testing helps us decide whether observed differences are real or just due to chance.

Without a formal test, it is hard to tell a genuine effect from random noise, and decisions drift toward guesswork.


What Is a Statistical Hypothesis?

A statistical hypothesis is a statement about a population parameter, such as mean, proportion, or variance.

It is something we want to test using data.

Hypotheses should be stated before analyzing the data, so that the data cannot influence which claim is tested.


Two Types of Hypotheses

Hypothesis testing always involves two competing statements:

  • Null hypothesis (H₀)
  • Alternative hypothesis (H₁ or Hₐ)

Understanding the difference between these two is critical for exams and applications.


Null Hypothesis (H₀)

The null hypothesis represents the default or existing belief.

It usually states that:

  • There is no effect
  • There is no difference
  • The status quo is true

In hypothesis testing, we assume H₀ is true initially.


Alternative Hypothesis (H₁)

The alternative hypothesis represents what we want to investigate.

It states that:

  • There is an effect
  • There is a difference
  • The claim differs from the null

The data support H₁ only indirectly: when the sample is sufficiently inconsistent with H₀, we reject H₀ in favor of H₁.


Example: Simple Hypothesis Setup

Suppose a company claims that the average lifetime of a battery is 10 hours.

We can define:

  • H₀: Mean battery life = 10 hours
  • H₁: Mean battery life ≠ 10 hours

We then collect data to test this claim.


Types of Alternative Hypotheses

Alternative hypotheses can be:

  • Two-tailed (≠)
  • Right-tailed (>)
  • Left-tailed (<)

The choice depends on the research question.


Two-Tailed Test

A two-tailed test checks for differences in both directions.

Example:

  • H₀: μ = 50
  • H₁: μ ≠ 50

We are interested in both higher and lower values.


One-Tailed Test

A one-tailed test checks for deviation in only one direction.

Examples:

  • H₁: μ > 50 (right-tailed)
  • H₁: μ < 50 (left-tailed)

One-tailed tests are used when only one direction of deviation is of interest. The direction must be chosen before looking at the data.


Test Statistic (Core Idea)

A test statistic is a numerical value calculated from sample data.

It measures how far the sample result is from what we expect under the null hypothesis.

Common test statistics include the z-statistic and the t-statistic.
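As a minimal sketch, the z-statistic for a sample mean can be computed as below. The function name `z_statistic` and the numbers (sample mean 9.7, σ = 1.0, n = 25) are hypothetical values chosen for illustration around the battery example, and the population standard deviation is assumed to be known.

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """Distance between the sample mean and the H0 mean, in standard errors."""
    standard_error = sigma / math.sqrt(n)  # assumes sigma is known
    return (sample_mean - mu0) / standard_error

# Hypothetical numbers for the battery example: H0 says the mean is 10 hours.
z = z_statistic(sample_mean=9.7, mu0=10.0, sigma=1.0, n=25)
print(round(z, 2))  # about -1.5: the sample mean sits 1.5 standard errors below 10
```

When σ is unknown and estimated from the sample, a t-statistic is normally used instead.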


Sampling Distribution Under H₀

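When H₀ is true and the assumptions of the test hold, the test statistic follows a known distribution (for example, the standard normal distribution for a z-statistic).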

This allows us to compute probabilities and make decisions objectively.

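For tests about a mean, this idea is closely linked to the Central Limit Theorem, which says that sample means are approximately normally distributed when the sample is large.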


Significance Level (α)

The significance level, denoted by α, is the maximum probability of rejecting H₀ when it is actually true.

Common values of α are:

  • 0.05 (5%)
  • 0.01 (1%)

This value is chosen before testing.


Meaning of Significance Level

If α = 0.05, we are willing to accept a 5% risk of making a wrong rejection.

Smaller α means stricter evidence is required.

This balance is important in real-world decisions.


p-Value (Very Important)

The p-value is the probability of obtaining a result at least as extreme as the observed one, assuming the null hypothesis is true.

It measures the strength of evidence against H₀.

Lower p-value → stronger evidence against H₀.
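The sketch below shows one way to turn a z-statistic into a p-value for each type of alternative hypothesis. It assumes the statistic is standard normal under H₀. The function name `p_value_from_z` and the input z = −1.5 are illustrative choices, not part of the lesson.

```python
from statistics import NormalDist

def p_value_from_z(z, tail="two"):
    """p-value for a z-statistic, assuming a standard normal distribution under H0."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":    # H1: mu != mu0, extreme in either direction
        return 2 * (1 - std_normal.cdf(abs(z)))
    if tail == "right":  # H1: mu > mu0, extreme means large z
        return 1 - std_normal.cdf(z)
    if tail == "left":   # H1: mu < mu0, extreme means small z
        return std_normal.cdf(z)
    raise ValueError("tail must be 'two', 'right', or 'left'")

p_two = p_value_from_z(-1.5, "two")
p_left = p_value_from_z(-1.5, "left")
print(round(p_two, 4), round(p_left, 4))  # 0.1336 0.0668
```

For the same z, the two-tailed p-value is twice the one-tailed value in the matching direction. This is one reason the type of test has to be fixed in advance.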


Decision Rule Using p-Value

The decision rule is:

  • If p-value ≤ α → Reject H₀
  • If p-value > α → Fail to reject H₀

We never say “accept H₀”; we say “fail to reject H₀”, because a large p-value shows only a lack of evidence against H₀, not proof that it is true.
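The decision rule is simple enough to write as a small function. This is only a sketch; the name `decide` and the returned strings are illustrative.

```python
def decide(p_value, alpha=0.05):
    """Apply the p-value decision rule at significance level alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    # Note the wording: we never return "accept H0".
    return "reject H0" if p_value <= alpha else "fail to reject H0"

print(decide(0.03))              # reject H0
print(decide(0.20))              # fail to reject H0
print(decide(0.03, alpha=0.01))  # fail to reject H0
```

The last call shows that the same p-value can lead to a different decision under a stricter α.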


Critical Region Concept

The critical region is the set of values of the test statistic that lead to rejection of H₀.

It depends on:

  • Significance level
  • Type of test (one-tailed or two-tailed)

This concept is often tested in exams.
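For a z-test, the critical region can be sketched with the inverse normal CDF from Python's standard library. The helper names `critical_values` and `in_critical_region` are hypothetical, and the sketch assumes the test statistic is standard normal under H₀.

```python
from statistics import NormalDist

def critical_values(alpha, tail="two"):
    """Return (lower, upper); H0 is rejected when z <= lower or z >= upper."""
    std_normal = NormalDist()
    if tail == "two":    # alpha is split equally between both tails
        cutoff = std_normal.inv_cdf(1 - alpha / 2)
        return (-cutoff, cutoff)
    if tail == "right":  # all of alpha sits in the upper tail
        return (float("-inf"), std_normal.inv_cdf(1 - alpha))
    if tail == "left":   # all of alpha sits in the lower tail
        return (std_normal.inv_cdf(alpha), float("inf"))
    raise ValueError("tail must be 'two', 'right', or 'left'")

def in_critical_region(z, alpha, tail="two"):
    lower, upper = critical_values(alpha, tail)
    return z <= lower or z >= upper

lower, upper = critical_values(0.05, "two")
print(round(lower, 2), round(upper, 2))          # -1.96 1.96
print(in_critical_region(1.7, 0.05, "two"))      # False
print(in_critical_region(1.7, 0.05, "right"))    # True
```

The same z = 1.7 falls outside the two-tailed region but inside the right-tailed one, because a one-tailed test puts all of α in a single tail.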


Errors in Hypothesis Testing

Because decisions are based on samples, errors are possible.

There are two types of errors:

  • Type I error
  • Type II error

Understanding them is critical.


Type I Error

A Type I error occurs when we reject H₀ even though it is true.

Probability of Type I error = α.

This is a false alarm.
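A small simulation can make this concrete. The sketch below repeatedly draws samples from a population where H₀ really is true and counts how often a two-tailed z-test rejects anyway. The population values (mean 50, standard deviation 10), the sample size, the number of trials, and the seed are all arbitrary choices for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

def type_one_error_rate(alpha=0.05, n=30, trials=5000, seed=0):
    """Fraction of simulated samples that reject H0 even though H0 is true."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    mu0, sigma = 50.0, 10.0     # H0 is true: data really come from mean mu0
    cutoff = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
        if abs(z) >= cutoff:    # two-tailed rejection = false alarm here
            rejections += 1
    return rejections / trials

rate = type_one_error_rate()
print(rate)  # should land close to alpha = 0.05
```

Every rejection in this simulation is a Type I error, so the observed rate should stay near the chosen α.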


Type II Error

A Type II error occurs when we fail to reject H₀ even though it is false.

This means missing a real effect.

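The probability of a Type II error is denoted β, and 1 − β is called the power of the test. Reducing β while keeping α fixed usually requires a larger sample.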
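The effect of sample size on the Type II error can be sketched for a two-tailed z-test. The function name `type_two_error` and all numbers (H₀ mean 50, true mean 52, σ = 10) are hypothetical, and σ is assumed to be known.

```python
import math
from statistics import NormalDist

def type_two_error(true_mean, mu0, sigma, n, alpha=0.05):
    """Probability of failing to reject H0: mu = mu0 when the mean is true_mean."""
    std_normal = NormalDist()
    cutoff = std_normal.inv_cdf(1 - alpha / 2)
    # Under the true mean, z is normal with this mean and standard deviation 1.
    shift = (true_mean - mu0) / (sigma / math.sqrt(n))
    # We fail to reject whenever z lands between -cutoff and +cutoff.
    return std_normal.cdf(cutoff - shift) - std_normal.cdf(-cutoff - shift)

betas = {}
for n in (10, 40, 160):
    betas[n] = type_two_error(true_mean=52, mu0=50, sigma=10, n=n)
    print(n, round(betas[n], 3))  # the printed value shrinks as n grows
```

With the effect size and α held fixed, a larger sample shrinks the standard error, so the same true difference produces a larger z and is missed less often.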


Hypothesis Testing in Real Life

Examples include:

  • Medical drug effectiveness testing
  • Quality control in manufacturing
  • A/B testing in products

Decisions must balance risk and evidence.


Hypothesis Testing in Business

Businesses use hypothesis testing to:

  • Compare marketing strategies
  • Test pricing changes
  • Evaluate process improvements

Data-driven decisions reduce guesswork.


Hypothesis Testing in Data Science

In data science:

  • A/B testing is hypothesis testing
  • Feature impact is tested statistically

p-values help decide whether results are statistically significant; whether they are meaningful in practice is a separate question.
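An A/B test on conversion rates is often analysed with a two-proportion z-test. The sketch below shows one common version that uses a pooled proportion. The function name `two_proportion_z_test` and the visitor and conversion counts are made up for illustration, and the normal approximation assumes reasonably large samples.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-tailed z-test of H0: both variants have the same conversion rate."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # shared rate assumed under H0
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 200 of 2000 visitors convert on A, 250 of 2000 on B.
z, p_value = two_proportion_z_test(200, 2000, 250, 2000)
print(round(z, 2), round(p_value, 4))
```

Here H₀ says the two variants convert at the same rate, and a p-value at or below the chosen α would lead us to reject it.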


Hypothesis Testing in Machine Learning

Machine learning uses hypothesis testing in:

  • Model comparison
  • Validation of improvements
  • Feature significance analysis

It supports evidence-based model selection.


Common Mistakes to Avoid

  • Confusing the p-value with the probability that H₀ is true
  • Changing α after seeing data
  • Ignoring practical significance

Statistical significance does not always mean practical importance.
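The sketch below illustrates the gap between statistical and practical significance. The same tiny difference (a sample mean of 50.1 against an H₀ mean of 50, with σ = 10) is tested at two sample sizes. All numbers are hypothetical, and σ is assumed to be known.

```python
import math
from statistics import NormalDist

def two_tailed_p(sample_mean, mu0, sigma, n):
    """Two-tailed p-value of a z-test for the mean with known sigma."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# The observed difference is 0.1 in both cases; only the sample size changes.
p_small_n = two_tailed_p(50.1, 50.0, 10.0, n=100)
p_large_n = two_tailed_p(50.1, 50.0, 10.0, n=100_000)
print(round(p_small_n, 3))  # well above 0.05
print(round(p_large_n, 4))  # well below 0.05
```

With a large enough sample, even a difference of 0.1 becomes statistically significant. Whether a difference that small matters is a practical judgment that the p-value cannot make.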


Practice Questions

Q1. What does the null hypothesis represent?

The default assumption or status quo

Q2. When do we reject H₀ using p-value?

When p-value is less than or equal to α

Q3. What is a Type I error?

Rejecting a true null hypothesis

Quick Quiz

Q1. Does failing to reject H₀ mean H₀ is true?

No

Q2. Is α chosen before or after data collection?

Before

Quick Recap

  • Hypothesis testing is a decision-making framework
  • H₀ represents no effect; H₁ represents the effect or difference under investigation
  • p-value measures evidence against H₀
  • α controls Type I error
  • Used widely in exams, business, data science, and machine learning

With hypothesis testing basics understood, you are now ready to study Type I and Type II Errors in deeper detail.