Teach yourself statistics

Student's t-Distribution

The t-distribution (also known as Student's t-distribution) is a probability distribution that is used to estimate population parameters when the sample size is small and/or when the population variance is unknown.

Why Use the t-Distribution?

According to the central limit theorem, the sampling distribution of a sample mean will follow a normal distribution, as long as the sample size is sufficiently large. Therefore, when we know the standard deviation of the population, we can compute a z-score, and use the normal distribution to evaluate probabilities with the sample mean.

But sample sizes are sometimes small, and often we do not know the standard deviation of the population. When either of these problems occurs, statisticians rely on the distribution of the t statistic (also known as the t score), whose values are given by:

t = [ x - μ ] / [ s / sqrt( n ) ]

where x is the sample mean, μ is the population mean, s is the standard deviation of the sample, and n is the sample size. The distribution of the t statistic is called the t-distribution or Student's t-distribution.
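
As a minimal sketch, the formula above can be computed directly in Python. The function name t_statistic and its parameter names are illustrative choices rather than anything defined in this lesson, and the sample values in the usage line are hypothetical.

```python
import math

def t_statistic(sample_mean, population_mean, sample_sd, n):
    """Return t = (x - mu) / (s / sqrt(n))."""
    if n < 2:
        raise ValueError("n must be at least 2 to estimate a standard deviation")
    standard_error = sample_sd / math.sqrt(n)  # s / sqrt(n)
    return (sample_mean - population_mean) / standard_error

# Hypothetical sample: mean 52 from n = 16 observations, s = 4, compared with mu = 50.
print(t_statistic(52, 50, 4, 16))  # 2.0
```

The denominator, s / sqrt( n ), is the estimated standard error of the mean, so the t statistic expresses how many standard errors the sample mean lies from the population mean.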

The t-distribution allows us to conduct statistical analyses on certain data sets that are not appropriate for analysis using the normal distribution.

Degrees of Freedom

There are actually many different t-distributions. The particular form of the t-distribution is determined by its degrees of freedom. The degrees of freedom refers to the number of independent observations in a set of data.

When estimating a mean score or a proportion from a single sample, the number of independent observations is equal to the sample size minus one. Hence, the distribution of the t statistic from samples of size 8 would be described by a t-distribution having 8 - 1 or 7 degrees of freedom. Similarly, the distribution of the t statistic from samples of size 16 would be described by a t-distribution having 16 - 1 or 15 degrees of freedom.

For other applications, the degrees of freedom may be calculated differently. We will describe those computations as they come up.

Properties of the t-Distribution

The t-distribution has the following properties:

The mean of the distribution is equal to 0.
The variance is equal to v / ( v - 2 ), where v is the degrees of freedom (see the previous section) and v > 2.
The variance is always greater than 1, although it is close to 1 when there are many degrees of freedom.
Like the normal distribution, the t-distribution is bell-shaped and symmetric around the mean.
Though they are similar in shape, the t-distribution has a lower peak and heavier tails than the standard normal distribution.
As the number of degrees of freedom increases, the t-distribution more closely resembles the normal distribution.
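
The properties listed above can be checked numerically. The sketch below uses only the Python standard library; the helper names t_variance and t_pdf are illustrative assumptions, and the density is written from the standard formula for the t-distribution rather than taken from this lesson.

```python
import math
from statistics import NormalDist

def t_variance(df):
    """Variance of the t-distribution, v / (v - 2); the formula requires v > 2."""
    if df <= 2:
        raise ValueError("this formula requires more than 2 degrees of freedom")
    return df / (df - 2)

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

normal_peak = NormalDist().pdf(0)  # height of the standard normal curve at 0

# Columns: degrees of freedom, variance, peak height of t, peak height of normal.
for df in (3, 5, 30, 100):
    print(df, round(t_variance(df), 4), round(t_pdf(0, df), 4), round(normal_peak, 4))
```

Reading down the printed columns, the variance falls toward 1 and the peak of the t curve rises toward the peak of the normal curve as the degrees of freedom increase.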

When to Use the t-Distribution

The t-distribution can be used to analyze a sample mean when the population distribution is approximately normal (i.e., bell-shaped). It is reasonable to assume that the sampling distribution of a mean will be bell-shaped when any of the following conditions apply.

The population distribution is normal.
The sample size is at least 15, and the sample has no outliers and exhibits no skewness.
The sample size is at least 30, and the sample has little skewness and no outliers.
The sample size is greater than 30.

Note: The conditions listed above are guidelines. If the population is characterized by heavy skewness and/or extreme outliers, a larger sample size will be needed.

Probability and Student's t-Distribution

When a sample of size n is drawn from a population having a normal (or nearly normal) distribution, the sample mean can be transformed into a t statistic, using the equation presented at the beginning of this lesson. We repeat that equation below:

t = [ x - μ ] / [ s / sqrt( n ) ]

where x is the sample mean, μ is the population mean, s is the standard deviation of the sample, n is the sample size, and degrees of freedom are equal to n - 1.

The t statistic produced by this transformation can be associated with a unique cumulative probability. This cumulative probability represents the likelihood of finding a sample mean less than or equal to x, given a random sample of size n.

To find the probability associated with a t statistic, use a t-distribution table (found in the appendix of most introductory statistics texts), a graphing calculator, or an online t-distribution calculator, like Stat Trek's T Distribution Calculator.
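
If none of those tools is at hand, the cumulative probability can also be approximated numerically. The sketch below, written with only the Python standard library, integrates the t density with Simpson's rule. The function names and the steps parameter are illustrative assumptions, and a dedicated statistics library would normally be preferable in practice.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) by integrating the density over [0, |t|]."""
    if steps < 2 or steps % 2:
        raise ValueError("Simpson's rule needs a positive, even number of steps")
    if t == 0:
        return 0.5  # the distribution is symmetric about 0
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability between 0 and |t|
    return 0.5 + area if t > 0 else 0.5 - area

# Hypothetical inputs: a t statistic of 1.5 with 10 degrees of freedom.
print(t_cdf(1.5, 10))
print(t_cdf(0.0, 10))  # 0.5
```

Because the distribution is symmetric about zero, the sketch only integrates from 0 to |t| and then adds the result to, or subtracts it from, 0.5.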

t-Distribution Calculator

The t-distribution Calculator solves common statistics problems, based on the t-distribution. The calculator computes cumulative probabilities, based on simple inputs. Clear instructions guide you to an accurate solution, quickly and easily. If anything is unclear, frequently-asked questions and sample problems provide straightforward explanations. The calculator is free. It can be found in the Stat Trek main menu under the Stat Tools tab.

Notation and t Statistics

Statisticians use t_α to represent the t statistic that has a cumulative probability of (1 - α). For example, suppose we were interested in the t statistic having a cumulative probability of 0.95. In this example, α would be equal to (1 - 0.95) or 0.05. We would refer to the t statistic as t_0.05.

Of course, the value of t_0.05 depends on the number of degrees of freedom. For example, with 2 degrees of freedom, t_0.05 is equal to 2.92; but with 20 degrees of freedom, t_0.05 is equal to 1.725.

Note: Because the t-distribution is symmetric about a mean of zero, the following is true.

t_α = -t_{1 - α} and t_{1 - α} = -t_α

Thus, if t_0.05 = 2.92, then t_0.95 = -2.92.
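
The values quoted above can be reproduced with a short numerical search. The sketch below is one possible approach, not the method used by any particular table or calculator: it approximates the cumulative probability with Simpson's rule and then finds t_α by bisection. All function names, the search bracket of -100 to 100, and the tolerance are illustrative assumptions.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df):
    """Cumulative probability P(T <= t), via Simpson's rule on [0, |t|]."""
    upper = abs(t)
    steps = 2 * max(1, math.ceil(upper / 0.01))  # even count, step width <= 0.005
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area

def t_critical(alpha, df, lo=-100.0, hi=100.0, tol=1e-7):
    """Bisection search for t_alpha, the value with cumulative probability 1 - alpha."""
    target = 1.0 - alpha
    if not t_cdf(lo, df) < target < t_cdf(hi, df):
        raise ValueError("t_alpha lies outside the assumed search bracket")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid  # cumulative probability too small: move right
        else:
            hi = mid  # cumulative probability large enough: move left
    return (lo + hi) / 2

print(round(t_critical(0.05, 2), 2))   # 2.92
print(round(t_critical(0.05, 20), 3))  # 1.725
print(round(t_critical(0.95, 20), 3))  # -1.725
```

The last two lines illustrate the symmetry rule for 20 degrees of freedom: t_0.95 is the negative of t_0.05.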

Test Your Understanding

Problem 1

Acme Corporation manufactures light bulbs. The CEO claims that an average Acme light bulb lasts 300 days. A researcher randomly selects 15 bulbs for testing. The sampled bulbs last an average of 290 days, with a standard deviation of 50 days. If the CEO's claim were true, what is the probability that 15 randomly selected bulbs would have an average life of no more than 290 days?

Note: There are two ways to solve this problem, using the T Distribution Calculator. Both approaches are presented below. Solution A is the traditional approach. It requires you to compute the t statistic, based on data presented in the problem description. Then, you use the t-distribution Calculator to find the probability. Solution B is easier. You simply enter the problem data into the t-distribution Calculator. The calculator computes a t statistic "behind the scenes", and displays the probability. Both approaches come up with exactly the same answer.

Solution A

The first thing we need to do is compute the t statistic, based on the following equation:

t = [ x - μ ] / [ s / sqrt( n ) ]
t = ( 290 - 300 ) / [ 50 / sqrt( 15) ]
t = -10 / 12.909945 = -0.7745966

where x is the sample mean, μ is the population mean, s is the standard deviation of the sample, and n is the sample size.

Now, we are ready to use the T Distribution Calculator. Since we know the t statistic, we select "t score" from the Random Variable dropdown box. Then, we enter the following data:

The degrees of freedom are equal to 15 - 1 = 14.
The t statistic is equal to -0.7745966.

The calculator displays the cumulative probability: 0.22573.

Screenshot of t-distribution Calculator

Hence, if the true bulb life were 300 days, there is about a 22.6% chance that the average bulb life for 15 randomly selected bulbs would be less than or equal to 290 days.

Solution B:

This time, we will not compute the t statistic manually; the T Distribution Calculator will do that work for us. We select "mean score" from the Random Variable dropdown box. Then, we enter the following data:

The degrees of freedom are equal to 15 - 1 = 14.
Assuming the CEO's claim is true, the population mean equals 300.
The sample mean equals 290.
The standard deviation of the sample is 50.

The calculator displays the cumulative probability: 0.22573.

Screenshot of t-distribution Calculator

Hence, if the CEO's claim were true, there is about a 22.6% chance that the average life of 15 randomly selected bulbs would be no more than 290 days.
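
One way to build intuition for this result is a small simulation. The sketch below repeatedly draws 15 bulb lifetimes from a population with a mean of 300 days, computes the t statistic for each sample, and counts how often it is at or below the observed value. It rests on assumptions not stated in the problem: lifetimes are taken to be normally distributed, and the population standard deviation of 50 is an arbitrary choice (the distribution of the t statistic does not depend on it). The function name, trial count, and seed are also illustrative.

```python
import math
import random

def simulate_probability(pop_mean, n, t_observed, trials=20000, seed=1, pop_sd=50.0):
    """Estimate P(t <= t_observed) by sampling n values from a normal population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        mean = sum(sample) / n
        # Sample standard deviation, with n - 1 in the denominator.
        sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
        t = (mean - pop_mean) / (sd / math.sqrt(n))
        if t <= t_observed:
            hits += 1
    return hits / trials

t_observed = (290 - 300) / (50 / math.sqrt(15))  # about -0.7746
estimate = simulate_probability(pop_mean=300, n=15, t_observed=t_observed)
print(round(estimate, 3))  # should land near 0.226; exact digits depend on the seed
```

The simulated proportion is only an estimate, so it will differ slightly from the calculator's 0.22573 and will vary with the seed and the number of trials.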

Problem 2

Suppose scores on an IQ test are normally distributed, with a population mean of 100. Suppose 20 people are randomly selected and tested. The standard deviation in the sample group is 15. What is the probability that the average test score in the sample group will be at most 110?

Solution:

To solve this problem, we will not compute the t statistic; the T Distribution Calculator will do that work for us. We select "mean score" from the Random Variable dropdown box. Then, we enter the following data:

The degrees of freedom are equal to 20 - 1 = 19.
The population mean equals 100.
The sample mean equals 110.
The standard deviation of the sample is 15.

We enter these values into the T Distribution Calculator.

Screenshot of t-distribution Calculator

The calculator displays the cumulative probability: 0.99616. Hence, there is a 99.6% chance that the sample average will be no greater than 110.
