Hypothesis Test for Regression Slope
This lesson describes how to conduct a hypothesis test to determine
whether there is a significant linear relationship between
an independent variable X and a dependent variable
Y.
The test focuses on the slope of the regression line
Y = Β0 + Β1X
where Β0 is a constant,
Β1 is the slope (also called the regression coefficient),
X is the value of the independent variable, and Y is the
value of the dependent variable.
If we find that the slope of the regression line is significantly different
from zero, we will conclude that there is a significant relationship
between the independent and dependent variables.
Test Requirements
The approach described in this lesson is valid whenever the
standard requirements for simple linear regression are met.
- The dependent variable Y has a linear relationship
to the independent variable X.
- For each value of X, the probability distribution of Y has the
same standard deviation σ.
- For any given value of X,
  - The Y values are independent.
  - The Y values are roughly normally distributed (i.e., symmetric and
    unimodal). A little skewness is ok if the sample size is large.
Previously, we described
how to verify that regression requirements are met.
The test procedure consists of four steps: (1) state the hypotheses,
(2) formulate an analysis plan, (3) analyze sample data, and
(4) interpret results.
State the Hypotheses
If there is a significant linear relationship between the independent
variable X and the dependent variable
Y, the slope will not equal zero.
Ho: Β1 = 0
Ha: Β1 ≠ 0
The null hypothesis states that the slope is equal to zero,
and the alternative hypothesis states that the slope is not equal
to zero.
Formulate an Analysis Plan
The analysis plan describes
how to use sample data to decide whether to reject the null
hypothesis. The plan should specify the following elements.
- Significance level. Often, researchers choose significance levels
equal to 0.01, 0.05, or 0.10; but any value between 0 and
1 can be used.
- Test method. Use a linear regression t-test (described in the
next section)
to determine whether the slope of the regression line differs
significantly from zero.
Analyze Sample Data
Using sample data, find the
standard error of the slope, the slope of the regression line, the
degrees of freedom, the
test statistic, and the P-value associated with the test statistic.
The approach described in this section is illustrated in the
sample problem at the end of this lesson.
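When only raw data are available, rather than regression output, the quantities listed above can be computed directly. The sketch below uses the standard least-squares formulas for simple linear regression, which this lesson does not spell out; the function name and the five data points are hypothetical and serve only as an illustration. The P-value is left out here because it requires the t distribution.

```python
import math

def slope_test_quantities(x, y):
    """Return slope, intercept, SE of the slope, DF, and t for a simple regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products around the means
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # slope of the regression line
    b0 = y_bar - b1 * x_bar      # intercept (the constant)
    # Sum of squared residuals around the fitted line
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2                   # degrees of freedom
    se = math.sqrt(sse / df) / math.sqrt(sxx)  # standard error of the slope
    t = b1 / se                  # test statistic for Ho: slope = 0
    return {"b1": b1, "b0": b0, "se": se, "df": df, "t": t}

# Hypothetical data, not taken from the lesson
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [2.0, 4.1, 5.9, 8.2, 9.8]
result = slope_test_quantities(x_values, y_values)
print(result)
```

With real data, the same quantities normally come straight from the regression output of a statistics package, as in the sample problem.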
Interpret Results
If the sample findings are unlikely, given
the null hypothesis, the researcher rejects the null hypothesis.
Typically, this involves comparing the P-value to the
significance level,
and rejecting the null hypothesis when the P-value is less than
the significance level.
Test Your Understanding
Problem
The local utility company surveys 101 randomly selected
customers. For each survey participant, the company collects
the following: annual electric bill (in dollars) and home size
(in square feet). Output from a regression analysis
appears below.
Regression equation:
Annual bill = 0.55 * Home size + 15
Predictor | Coef | SE Coef | T | P
Constant | 15 | 3 | 5.0 | 0.00
Home size | 0.55 | 0.24 | 2.29 | 0.01
Is there a significant linear relationship between annual bill and
home size? Use a 0.05 level of significance.
Solution
The solution to this problem takes four steps:
(1) state the hypotheses, (2) formulate an analysis plan,
(3) analyze sample data, and (4) interpret results.
We work through those steps below:
- State the hypotheses. The first step is to state the
null hypothesis and an alternative hypothesis.
Ho: The slope of the regression line is equal to zero.
Ha: The slope of the regression line is not equal to zero.
If the relationship between home size and electric bill is
significant, the slope will not equal zero.
- Formulate an analysis plan. For this analysis,
the significance level is 0.05. Using sample data, we will
conduct a linear regression t-test
to determine whether the slope of the regression line differs
significantly from zero.
- Analyze sample data. To apply the linear
regression t-test to sample data, we require the
standard error of the slope, the slope of the regression
line, the degrees of freedom,
the t statistic, and the P-value of the t statistic.
We get the slope (b1) and the standard error (SE)
from the regression output.
b1 = 0.55
SE = 0.24
We compute the degrees of freedom and the t statistic,
using the following equations.
DF = n - 2 = 101 - 2 = 99
t = b1/SE = 0.55/0.24 = 2.29
where
DF is the degrees of freedom,
n is the number of observations in the sample,
b1 is the slope of the regression line, and
SE is the standard error of the slope.
Based on the t statistic and the degrees of freedom, we determine the
P-value. The P-value is the probability that a t statistic
having 99 degrees of freedom is more extreme than 2.29.
Since this is a two-tailed test, "more extreme" means greater than 2.29
or less than -2.29.
We use the t Distribution Calculator
to find P(t > 2.29) = 0.0121 and P(t < -2.29) = 0.0121.
Therefore, the P-value is 0.0121 + 0.0121 or 0.0242.
- Interpret results. Since the P-value (0.0242) is
less than the significance level (0.05), we reject the
null hypothesis and conclude that there is a significant linear
relationship between annual bill and home size.
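The arithmetic in this solution can be checked with a short script. The sketch below is one possible approach, not the method behind the t Distribution Calculator: it integrates the t density numerically with Simpson's rule, using only the Python standard library. The function names and the number of integration steps are assumptions made for illustration.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed P-value: 1 minus the area under the density on [-|t|, |t|]."""
    a = abs(t_stat)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_a = total * h / 3
    # The density is symmetric, so the central area is twice the area on [0, |t|]
    return 1 - 2 * area_zero_to_a

# Values from the regression output in the problem
n = 101
b1 = 0.55
se = 0.24
alpha = 0.05

df = n - 2                   # 99 degrees of freedom
t_stat = round(b1 / se, 2)   # rounded to two decimals, as in the solution
p_value = two_tailed_p(t_stat, df)
reject = p_value < alpha

print(f"DF = {df}, t = {t_stat}, P-value = {p_value:.4f}, reject Ho: {reject}")
```

The computed P-value should come out close to 0.024, in line with the calculator result quoted above, and it falls below the 0.05 significance level.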
Note: If you use this approach on an exam, you may also want to mention
that this approach is only appropriate when the
standard requirements for simple linear regression are satisfied.