Stat Trek

Teach yourself statistics

How to Test Statistical Hypotheses

This lesson describes a general procedure that can be used to test statistical hypotheses.

How to Conduct Hypothesis Tests

All hypothesis tests follow the same basic procedure. The researcher states the hypotheses to be tested, formulates an analysis plan, analyzes sample data according to the plan, and rejects or fails to reject the null hypothesis, based on the results of the analysis.

  • State the hypotheses. Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false; and vice versa.
  • Formulate an analysis plan. The analysis plan describes how to use sample data to decide whether to reject the null hypothesis. It should specify the following elements.
    • Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
    • Test method. Typically, the test method involves a test statistic and a sampling distribution. Computed from sample data, the test statistic might be a mean score, proportion, difference between means, difference between proportions, z-score, t statistic, chi-square, etc. Given a test statistic and its sampling distribution, a researcher can assess probabilities associated with the test statistic. If the probability of a test statistic at least as extreme as the one observed (the P-value) is less than the significance level, the null hypothesis is rejected.
  • Analyze sample data. Using sample data, perform computations called for in the analysis plan.
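    • Test statistic. When the null hypothesis involves a mean or proportion, use one of the following equations to compute the test statistic. Use the first when the standard deviation of the statistic can be computed from known population values, and the second when it must be estimated from sample data.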

      Test statistic = (Statistic - Parameter) / (Standard deviation of statistic)

      Test statistic = (Statistic - Parameter) / (Standard error of statistic)

      where Parameter is the value appearing in the null hypothesis, and Statistic is the point estimate of Parameter. As part of the analysis, you may need to compute the standard deviation or standard error of the statistic. Previously, we presented common formulas for the standard deviation and standard error. When the parameter in the null hypothesis involves categorical data, you may use a chi-square statistic as the test statistic. Instructions for computing a chi-square test statistic are presented in the lesson on the chi-square goodness of fit test.
    • P-value. The P-value is the probability of observing a sample statistic at least as extreme as the test statistic, assuming the null hypothesis is true.
  • Interpret the results. If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.
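
The steps above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a two-tailed one-sample z-test with a known population standard deviation. The function name one_sample_z_test and all of the numbers (a sample mean of 103, a null-hypothesis mean of 100, a population standard deviation of 15, and a sample size of 100) are hypothetical and are not taken from this lesson.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, null_mean, pop_sd, n, alpha=0.05):
    """Two-tailed one-sample z-test (hypothetical helper for illustration)."""
    # Standard deviation of the sample mean, from a known population value.
    sd_of_statistic = pop_sd / sqrt(n)
    # Test statistic = (Statistic - Parameter) / (Standard deviation of statistic)
    z = (sample_mean - null_mean) / sd_of_statistic
    # P-value: probability of a z-score at least this extreme, in either
    # direction, assuming the null hypothesis is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Reject the null hypothesis when the P-value is below the significance level.
    return z, p_value, p_value < alpha


# Assumed example data: H0 says the mean is 100; the alternative says it is not.
z, p_value, reject = one_sample_z_test(sample_mean=103, null_mean=100, pop_sd=15, n=100)
print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
# prints: z = 2.00, P-value = 0.0455, reject H0: True
```

With these assumed numbers the P-value (about 0.0455) is below the 0.05 significance level, so the null hypothesis is rejected. At a significance level of 0.01, the same data would not lead to rejection.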

Applications of the General Hypothesis Testing Procedure

The next few lessons show how to apply the general hypothesis testing procedure to different kinds of statistical problems.

At this point, don't worry if the general procedure for testing hypotheses seems a little bit unclear. The procedure will be clearer as you see it applied in the next few lessons.

Test Your Understanding

Problem 1

In hypothesis testing, which of the following statements is always true?

I. The P-value is greater than the significance level.
II. The P-value is computed from the significance level.
III. The P-value is the parameter in the null hypothesis.
IV. The P-value is a test statistic.
V. The P-value is a probability.

(A) I only
(B) II only
(C) III only
(D) IV only
(E) V only


The correct answer is (E). The P-value is the probability of observing a sample statistic at least as extreme as the test statistic, assuming the null hypothesis is true. It can be greater than the significance level, but it can also be smaller than the significance level. It is not computed from the significance level, it is not the parameter in the null hypothesis, and it is not a test statistic.
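
A short sketch may help make this answer concrete. It assumes a two-tailed test whose test statistic is a z-score of 1.5. Both the helper name two_tailed_p_value and the z-score are hypothetical values chosen for illustration. The P-value is computed without reference to any significance level, and it is then compared against several levels.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """P-value for a z-score under a two-tailed test (hypothetical helper)."""
    # The significance level does not appear anywhere in this computation.
    return 2 * (1 - NormalDist().cdf(abs(z)))


p_value = two_tailed_p_value(1.5)  # assumed test statistic
print(f"P-value = {p_value:.4f}")
# prints: P-value = 0.1336

# The same P-value can be above one significance level and below another.
decisions = {alpha: p_value < alpha for alpha in (0.01, 0.05, 0.10, 0.20)}
for alpha, reject in decisions.items():
    print(f"alpha = {alpha:.2f}: reject H0 = {reject}")
```

Here the P-value is a probability between 0 and 1. It is larger than the significance levels 0.01, 0.05, and 0.10, but smaller than 0.20, so statement I is not always true.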