Hypothesis Test of a Proportion (Small Sample)
This lesson explains how to test a hypothesis about a proportion when a simple random sample has fewer than 10 successes or 10 failures - a situation that often occurs with small samples. (In a previous lesson, we showed how to conduct a hypothesis test for a proportion when a simple random sample includes at least 10 successes and 10 failures.)
Key Steps
The approach described in this lesson is appropriate, as long as the sample includes at least one success and one failure. The key steps are:
- Formulate the hypotheses to be tested. This means stating the null hypothesis and the alternative hypothesis.
- Determine the sampling distribution of the proportion. If the sample proportion is the outcome of a binomial experiment, the sampling distribution will be binomial. If it is the outcome of a hypergeometric experiment, the sampling distribution will be hypergeometric.
- Specify the significance level. (Researchers often set the significance level equal to 0.05 or 0.01, although other values may be used.)
- Based on the hypotheses, the sampling distribution, and the significance level, define the region of acceptance.
- Test the null hypothesis. If the sample proportion falls within the region of acceptance, do not reject the null hypothesis; otherwise, reject the null hypothesis.
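The last two steps can be sketched in code. The following is a minimal sketch, not a definitive implementation; it assumes a one-tailed test in which small counts lead to rejection, and that the sampling distribution is supplied as a list of probabilities indexed by the number of successes. The function name lower_tail_rejection_region and the coin-flip data are hypothetical, chosen only for illustration.

```python
def lower_tail_rejection_region(pmf, alpha):
    """Find the lower-tail region of rejection for a discrete distribution.

    pmf[k] is the probability of observing k successes when the null
    hypothesis is true. Returns (critical, attained), where counts
    0..critical form the region of rejection and `attained` is the
    actual significance level, which never exceeds alpha. If no count
    can be rejected, critical is None and attained is 0.0.
    """
    cumulative = 0.0
    critical = None
    for k, p in enumerate(pmf):
        # Stop as soon as adding the next count would exceed alpha.
        if cumulative + p > alpha:
            break
        cumulative += p
        critical = k
    return critical, cumulative


# Illustrative data (an assumption): number of heads in 5 fair coin flips.
fair_coin_pmf = [1 / 32, 5 / 32, 10 / 32, 10 / 32, 5 / 32, 1 / 32]
critical, attained = lower_tail_rejection_region(fair_coin_pmf, alpha=0.20)
print(critical, attained)
```

Any count above the returned critical value falls in the region of acceptance. Because the distribution is discrete, the attained significance level is usually smaller than the level requested.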
The following examples illustrate how to test hypotheses with small samples. The first example involves a binomial experiment, and the second involves a hypergeometric experiment.
Example 1: Sampling With Replacement
Suppose an urn contains 30 marbles. Some marbles are red, and the rest are green. A researcher hypothesizes that the urn contains 15 or more red marbles. The researcher randomly samples five marbles, with replacement, from the urn. Two of the selected marbles are red, and three are green. Based on the sample results, should the researcher reject the null hypothesis? Use a significance level of 0.20.
Solution: There are five steps in conducting a hypothesis test, as described in the previous section. We work through each of the five steps below:
- Formulate hypotheses. The first step is to state the null hypothesis and an alternative hypothesis.
Null hypothesis: P >= 0.50
Alternative hypothesis: P < 0.50
Note that these hypotheses constitute a one-tailed test. The null hypothesis will be rejected only if the sample proportion is too small.
- Determine sampling distribution. Since we sampled with replacement, the sample proportion can be considered an outcome of a binomial experiment. Based on the null hypothesis, we assume that at least 15 of 30 marbles are red. For the test, we use the boundary case of exactly 15 red marbles, so the true population proportion is assumed to be 15/30 or 0.50.
Given those inputs (a binomial distribution where the true population proportion is equal to 0.50), the sampling distribution of the proportion can be determined. It appears in the table below, which shows individual probabilities for single events and cumulative probabilities for multiple events. (Elsewhere on this website, we showed how to compute binomial probabilities that form the body of the table.)
| Number of red marbles in sample | Sample proportion | Probability | Cumulative probability |
| --- | --- | --- | --- |
| 0 | 0.0 | 0.03125 | 0.03125 |
| 1 | 0.2 | 0.15625 | 0.1875 |
| 2 | 0.4 | 0.3125 | 0.5 |
| 3 | 0.6 | 0.3125 | 0.8125 |
| 4 | 0.8 | 0.15625 | 0.96875 |
| 5 | 1.0 | 0.03125 | 1.00 |

- Specify significance level. The significance level was set at 0.20. (This means that the probability of making a Type I error is 0.20, assuming that the null hypothesis is true.)
- Define the region of acceptance. From the sampling distribution (see above table), we see that it is not possible to define a region of acceptance for which the significance level is exactly 0.20.
However, we can define a region of acceptance for which the significance level would be no more than 0.20. From the table, we see that if the true population proportion is equal to 0.50, we would be fairly unlikely to pick 0 or 1 red marbles in our sample of 5 marbles. The probability of selecting 0 or 1 red marbles would be 0.1875. Therefore, if we let the significance level equal 0.1875, we can define the region of rejection as any sampled outcome that includes only 0 or 1 red marbles (i.e., a sample proportion equal to 0 or 0.20). We can define the region of acceptance as any sampled outcome that includes at least 2 red marbles. This is equivalent to a sample proportion that is greater than or equal to 0.40.
- Test the null hypothesis. Since the sample proportion (0.40) is within the region of acceptance, we cannot reject the null hypothesis.
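The five steps of Example 1 can be reproduced with a short script. This is a minimal sketch that uses only the Python standard library (math.comb requires Python 3.8 or later); the variable names are assumptions made for illustration, and the inputs are the values stated in the example.

```python
from math import comb

n = 5          # sample size
p0 = 0.50      # population proportion assumed under the null hypothesis
alpha = 0.20   # requested significance level
observed = 2   # red marbles in the sample

# Binomial probability of k red marbles in n draws with replacement.
pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]

# Cumulative probabilities P(X <= k).
cdf = []
running = 0.0
for p in pmf:
    running += p
    cdf.append(running)

# Region of rejection: the lowest counts whose cumulative
# probability does not exceed alpha.
rejected = [k for k in range(n + 1) if cdf[k] <= alpha]
critical = max(rejected) if rejected else None
reject_null = critical is not None and observed <= critical

for k in range(n + 1):
    print(f"{k}  {k / n:.1f}  {pmf[k]:.5f}  {cdf[k]:.5f}")
print("Largest count in region of rejection:", critical)
print("Reject null hypothesis:", reject_null)
```

The printed rows should match the table above, with a region of rejection of 0 or 1 red marbles and no rejection of the null hypothesis for the observed count of 2.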
Example 2: Sampling Without Replacement
The Acme Advertising company has 25 clients. Account executives at Acme claim that 80 percent of these clients are very satisfied with the service they receive. To test that claim, Acme's CEO commissions a survey of 10 clients. Survey participants are randomly sampled, without replacement, from the client population. Six of the ten sampled clients (i.e., 60 percent) say that they are very satisfied. Based on the sample results, should the CEO accept or reject the hypothesis that 80 percent of Acme's clients are very satisfied? Use a significance level of 0.10.
Solution: There are five steps in conducting a hypothesis test, as described in the previous section. We work through each of the five steps below:
- Formulate hypotheses. The first step is to state the null hypothesis and an alternative hypothesis.
Null hypothesis: P >= 0.80
Alternative hypothesis: P < 0.80
Note that these hypotheses constitute a one-tailed test. The null hypothesis will be rejected only if the sample proportion is too small.
- Determine sampling distribution. Since we sampled without replacement, the sample proportion can be considered an outcome of a hypergeometric experiment. Based on the null hypothesis, we assume that at least 80 percent of the 25 clients are very satisfied. For the test, we use the boundary case of exactly 80 percent (i.e., 20 clients).
Given those inputs (a hypergeometric distribution where 20 of 25 clients are very satisfied), the sampling distribution of the proportion can be determined. It appears in the table below, which shows individual probabilities for single events and cumulative probabilities for multiple events. (Elsewhere on this website, we showed how to compute hypergeometric probabilities that form the body of the table.)
| Number of satisfied clients in sample | Sample proportion | Probability | Cumulative probability |
| --- | --- | --- | --- |
| 4 or less | 0.4 or less | 0.00 | 0.00 |
| 5 | 0.5 | 0.00474 | 0.00474 |
| 6 | 0.6 | 0.05929 | 0.06403 |
| 7 | 0.7 | 0.23715 | 0.30119 |
| 8 | 0.8 | 0.38538 | 0.68656 |
| 9 | 0.9 | 0.25692 | 0.94348 |
| 10 | 1.0 | 0.05652 | 1.00 |

- Specify significance level. The significance level was set at 0.10. (This means that the probability of making a Type I error is 0.10, assuming that the null hypothesis is true.)
- Define the region of acceptance. From the sampling distribution (see above table), we see that it is not possible to define a region of acceptance for which the significance level is exactly 0.10.
However, we can define a region of acceptance for which the significance level would be no more than 0.10. From the table, we see that if the true proportion of very satisfied clients is equal to 0.80, we would be unlikely to have fewer than 7 very satisfied clients in our sample. The probability of having 6 or fewer very satisfied clients in the sample would be 0.064. Therefore, if we let the significance level equal 0.064, we can define the region of rejection as any sampled outcome that includes 6 or fewer very satisfied clients. We can define the region of acceptance as any sampled outcome that includes 7 or more very satisfied clients. This is equivalent to a sample proportion that is greater than or equal to 0.70.
- Test the null hypothesis. Since the sample proportion (0.60) is outside the region of acceptance, we reject the null hypothesis at the 0.064 level of significance.
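Example 2 can be checked the same way. The sketch below is one possible implementation using only the Python standard library (math.comb requires Python 3.8 or later). The helper name hypergeom_pmf and the variable names are assumptions introduced for illustration; the inputs are the values stated in the example.

```python
from math import comb

N = 25         # population size (all clients)
K = 20         # very satisfied clients assumed under the null hypothesis
n = 10         # sample size
alpha = 0.10   # requested significance level
observed = 6   # very satisfied clients in the sample


def hypergeom_pmf(k, N, K, n):
    """Probability of k successes in n draws without replacement."""
    # Counts outside the feasible range have probability zero.
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


pmf = [hypergeom_pmf(k, N, K, n) for k in range(n + 1)]

# Cumulative probabilities P(X <= k).
cdf = []
running = 0.0
for p in pmf:
    running += p
    cdf.append(running)

# Region of rejection: the lowest counts whose cumulative
# probability does not exceed alpha.
rejected = [k for k in range(n + 1) if cdf[k] <= alpha]
critical = max(rejected) if rejected else None
reject_null = critical is not None and observed <= critical

for k in range(n + 1):
    print(f"{k:2d}  {k / n:.1f}  {pmf[k]:.5f}  {cdf[k]:.5f}")
print("Largest count in region of rejection:", critical)
print("Reject null hypothesis:", reject_null)
```

With these inputs, the region of rejection is 6 or fewer very satisfied clients, so the observed count of 6 leads to rejection of the null hypothesis.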