Stat Trek

Teach yourself statistics

How to Choose Between Normal and t Distribution

Choosing between a normal distribution and a t distribution for statistical analysis depends on the nature of the data. Factors to consider include the type of statistic (e.g., proportion, sample mean), sample size, and knowledge of population variance. By following these guidelines, you can select the appropriate distribution for your analysis.

When to Use the Normal Distribution

The normal distribution can be used for statistical analysis of sample proportions and sample means. However, certain conditions need to be met. Here's when it is safe to use the normal distribution.

Sample Proportions

Use the normal distribution with a sample proportion when all of the following conditions are true:

  • Population size (N) is at least 10 times sample size (n).
  • The sampling method is simple random sampling.
  • n * p ≥ 10, where p is the sample proportion.
  • n * (1 - p) ≥ 10.

Note: When the sample proportion p equals 0.5, the last two conditions require at least 20 observations for the sampling distribution to be approximately normal. When p is farther from 0.5 (closer to 0 or 1), more observations are required.
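The four conditions above can be checked mechanically. The following is a minimal sketch, not a definitive implementation; the function name and its parameters are hypothetical, and whether the sampling method was simple random sampling is something the analyst must supply.

```python
def normal_ok_for_proportion(n, p, population_size, simple_random_sample=True):
    """Return True when all four conditions for a sample proportion hold.

    n: sample size
    p: sample proportion (between 0 and 1)
    population_size: N, the size of the population
    simple_random_sample: analyst's judgment about the sampling method
    """
    return (
        simple_random_sample
        and population_size >= 10 * n   # N is at least 10 times n
        and n * p >= 10                 # enough expected "successes"
        and n * (1 - p) >= 10           # enough expected "failures"
    )


# With p = 0.5, 20 observations are just enough; 19 are not.
print(normal_ok_for_proportion(20, 0.5, 1000))
print(normal_ok_for_proportion(19, 0.5, 1000))
```

With a more extreme proportion such as p = 0.05, the same check fails at n = 100 and passes only once n reaches a few hundred.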

Sample Means

Use the normal distribution with a sample mean when both of the following conditions are true:

  • Sample size is large. When the sample size is large (n ≥ 30), the central limit theorem ensures the sampling distribution of a mean is approximately normal, even if the population distribution is not.
  • Population variance is known. If the population standard deviation (σ) is known and sample size is large, the test statistic follows a normal distribution.
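When both conditions hold, a confidence interval for the mean can be built from the standard normal distribution. The sketch below uses only the Python standard library; the function name `z_interval` and the numbers in the usage line are made up for illustration.

```python
import math
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for a mean when sigma is known and n is large."""
    # Critical value from the standard normal distribution,
    # e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sigma / math.sqrt(n)  # z times the standard error
    return sample_mean - margin, sample_mean + margin


# Hypothetical data: mean 100, known sigma 15, sample of 36.
low, high = z_interval(100, 15, 36)
print(round(low, 2), round(high, 2))
```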

When to Use the t Distribution

Use the t distribution with a sample mean when any of the following conditions are true:

  • The population distribution is normal.
  • The sample size is at least 15, and the sample has no outliers and no skewness.
  • The sample size is at least 30, and the sample has little skewness and no outliers.
  • The sample size is greater than 30.

The t distribution has two advantages over the normal distribution.

  • When sample size is small (n < 30), the distribution of the sample mean is not well approximated by the normal distribution. The t distribution more accurately represents the distribution of the mean.
  • If the population standard deviation is unknown and you are estimating it with the sample standard deviation (s), the standardized sample mean (the t statistic) follows a t distribution, not a normal distribution.

If you use the t distribution, you will have to specify degrees of freedom. Guidelines for calculating degrees of freedom are described at https://stattrek.com/statistics/degrees-of-freedom.
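As an illustration, the one-sample case can be sketched as follows. The sketch assumes a single sample tested against a hypothesized mean, where degrees of freedom equal n - 1; the function name `one_sample_t` and the data are hypothetical, and other designs use different degrees-of-freedom formulas.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for a one-sample test."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)            # sample standard deviation (n - 1 denominator)
    standard_error = s / math.sqrt(n)
    t_stat = (mean - mu0) / standard_error  # standardized sample mean
    return t_stat, n - 1                    # one-sample case: df = n - 1


# Made-up data, tested against a hypothesized mean of 4.
t_stat, df = one_sample_t([2, 4, 4, 4, 5, 5, 7, 9], mu0=4)
print(t_stat, df)
```

The resulting statistic is compared against a t distribution with the returned degrees of freedom, for example with a t table or statistical software.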

Other Considerations

When the population distribution is not heavily skewed and does not have outliers, the t distribution is often the safest choice.

  • Robustness. If the sample size is large, the normal and t distributions give nearly identical results, so the choice between them becomes less critical.
  • Software defaults. Some statistical software tools automatically choose the t distribution when the population variance is unknown, regardless of sample size.
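The robustness point can be checked numerically by comparing critical values. The sketch below is a standard-library-only approximation written for illustration: it integrates the t density with Simpson's rule and inverts it by bisection. All function names are hypothetical, and in practice a statistics package would normally supply these quantiles.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """Approximate CDF: Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, df):
    """Approximate quantile by bisection on the CDF."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Two-sided 95% critical values: t approaches z as df grows.
z = NormalDist().inv_cdf(0.975)
for df in (5, 30, 1000):
    print(df, round(t_quantile(0.975, df), 3), round(z, 3))
```

For small degrees of freedom the t critical value is noticeably larger than the normal one, which widens confidence intervals; the gap shrinks steadily as degrees of freedom increase.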

If the population is characterized by heavy skewness and/or extreme outliers, a larger sample size will be needed to ensure that the sampling distribution of the mean is approximately normal. If the sample is not large enough, consider alternative methods of analysis (e.g., non-parametric tests, transformations of the data).