Populations and Samples
The study of statistics revolves around the study of data sets.
This lesson describes two important types of data sets:
populations and samples.
Along the way, we'll introduce simple random sampling, the main method
used in this tutorial to select samples.
Population vs Sample
The main difference between a population and sample has to do with how
observations are assigned to the data set.
- A population includes all of the elements from a set of data.
- A sample consists of one or more observations drawn from the
population.
Depending on the sampling method, a sample can have fewer observations than
the population, the same number of observations, or more observations.
More than one sample can be derived from the same population.
Other differences have to do with nomenclature, notation, and computations.
For example,
- A measurable characteristic of a population, such as a mean or
standard deviation, is called a parameter; but a measurable
characteristic of a sample is called a statistic.
- We will see in future lessons that the mean of a
population is denoted by the symbol μ; but the mean of a sample is denoted
by the symbol x̄.
- We will also learn in future lessons
that the formula for the standard deviation of a population is different from
the formula for the standard deviation of a sample.
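The parameter/statistic distinction can be illustrated with a short Python sketch. The data values below are hypothetical, chosen only for illustration, and the sample is picked by hand rather than at random. The sketch assumes the standard library's `statistics` module, where `pstdev` uses the population formula (dividing by N) and `stdev` uses the sample formula (dividing by n - 1).

```python
import statistics

# Hypothetical population of N = 5 values (illustrative data, not from the lesson).
population = [2, 4, 4, 6, 9]

# Parameters: computed from every element of the population.
mu = statistics.mean(population)        # population mean (mu)
sigma = statistics.pstdev(population)   # population standard deviation, divides by N

# A sample taken from that population (picked by hand here, for illustration only).
sample = [2, 4, 9]

# Statistics: computed from the sample alone.
x_bar = statistics.mean(sample)         # sample mean (x-bar)
s = statistics.stdev(sample)            # sample standard deviation, divides by n - 1

print("parameters:", mu, sigma)
print("statistics:", x_bar, s)
```

Applying both functions to the same list gives different results, which is one way to see that the two standard deviation formulas are not interchangeable.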
What is Simple Random Sampling?
A sampling method is a procedure for selecting sample elements from a population.
Simple random sampling refers to a sampling method that has the
following properties.
- The population consists of N objects.
- The sample consists of n objects.
- All possible samples of n objects are equally likely to occur.
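To make the third property concrete, the following sketch lists every possible sample for a tiny case. It assumes sampling without replacement with order ignored, and uses a hypothetical population of N = 4 labeled objects with n = 2. Under those assumptions, each possible sample has the same probability, 1 divided by the number of possible samples.

```python
from itertools import combinations
from math import comb

# Hypothetical population of N = 4 labeled objects (illustrative only).
population = ["A", "B", "C", "D"]
n = 2  # sample size

# Every possible unordered sample of n distinct objects.
all_samples = list(combinations(population, n))

# Number of possible samples: "N choose n".
num_samples = comb(len(population), n)

# Under simple random sampling, each sample is equally likely.
probability_each = 1 / num_samples

for sample in all_samples:
    print(sample, probability_each)
```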
An important benefit of simple random sampling is that it allows researchers to use
statistical methods to analyze sample results. For example, given a simple random
sample, researchers can use statistical methods to define a
confidence interval around a sample mean. These inferential methods are
generally not valid when non-random sampling methods are used.
There are many ways to obtain a simple random sample. One way would be the
lottery method. Each of the N population members is assigned a unique
number. The numbers are placed in a bowl and thoroughly mixed. Then, a
blind-folded researcher selects n numbers. Population members having the
selected numbers are included in the sample.
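The lottery method can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function name `lottery_sample`, the `seed` parameter, and the member names are all hypothetical. Shuffling the numbers stands in for mixing the bowl, and taking the first n numbers stands in for the blindfolded draw.

```python
import random

def lottery_sample(population, n, seed=None):
    """Draw n members by the lottery method (hypothetical helper)."""
    if n > len(population):
        raise ValueError("cannot draw more numbers than are in the bowl")
    rng = random.Random(seed)  # seed is optional; it only makes runs repeatable

    # Assign each of the N population members a unique number, 1 through N.
    numbered = {number: member for number, member in enumerate(population, start=1)}

    bowl = list(numbered)   # the numbers placed in the bowl
    rng.shuffle(bowl)       # thoroughly mix the numbers
    drawn = bowl[:n]        # select n numbers without looking

    # Members holding the selected numbers form the sample.
    return [numbered[number] for number in drawn]

# Hypothetical population of N = 6 members.
population = ["Ann", "Bob", "Cy", "Dee", "Eve", "Flo"]
sample = lottery_sample(population, n=3, seed=42)
print(sample)
```

Because drawn numbers are set aside rather than returned to the bowl, no member can appear in the sample twice.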
Random Number Generator
In practice, the lottery method described above can be cumbersome, particularly
with large sample sizes. As an alternative, use Stat Trek's Random Number
Generator. With the Random Number Generator, you can select up to 1000 random
numbers quickly and easily. The Random Number Generator can be found in the Stat Trek
main menu under the Stat Tools tab.
Sampling With Replacement and Without Replacement
Suppose we use the lottery method described above to select a simple random
sample. After we pick a number from the bowl, we can put the number aside or we
can put it back into the bowl. If we put the number back in the bowl, it may be
selected more than once; if we put it aside, it can be selected only one time.
When a population element can be selected more than one time, we are
sampling with replacement.
When a population element can be selected only
one time, we are sampling without replacement.
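The two approaches can be compared in Python's standard `random` module, where `sample` draws without replacement and `choices` draws with replacement. The bowl contents and the fixed seed below are assumptions made for illustration.

```python
import random

rng = random.Random(0)  # fixed seed (an arbitrary choice) so runs are repeatable

# Hypothetical bowl holding N = 3 numbers.
bowl = [1, 2, 3]

# Without replacement: each number can be selected only one time,
# so the sample size cannot exceed N.
without_replacement = rng.sample(bowl, k=3)

# With replacement: a number goes back in the bowl after each draw,
# so it may be selected more than once and the sample size can exceed N.
with_replacement = rng.choices(bowl, k=5)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```

Requesting more than 3 numbers from `rng.sample` here would raise a `ValueError`, whereas `rng.choices` accepts any sample size.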
Test Your Understanding
Problem 1
Which of the following statements are true?
I. The mean of a population is denoted by x̄.
II. Sample size is never bigger than population size.
III. The population mean is a statistic.
(A) I only.
(B) II only.
(C) III only.
(D) All of the above.
(E) None of the above.
Solution
The correct answer is (E), none of the above.
The mean of a population is denoted by μ, not x̄. When
sampling with replacement, sample size can be greater than population size. And
the population mean is a parameter; the sample mean is a statistic.