Statistics Notation
This web page describes how symbols are used on the Stat Trek web
site to represent numbers, variables, parameters, statistics, etc.
Capitalization
In general, capital letters refer to population attributes
(i.e., parameters); and lower-case letters refer to sample attributes
(i.e., statistics). For example,
- P refers to a population proportion;
and p, to a sample proportion.
- X refers to a set of population elements; and
x, to a set of sample elements.
- N refers to population size; and
n, to sample size.
Greek vs. Roman Letters
Like capital letters, Greek letters refer to population attributes.
Their sample counterparts, however, are usually Roman letters.
For example,
- μ refers to a population mean;
and x̄, to a sample mean.
- σ refers to the standard deviation of a population; and
s, to the standard deviation of a sample.
Population Parameters
By convention, specific symbols represent certain population parameters.
For example,
- μ refers to a population mean.
- σ refers to the standard deviation of a population.
- σ² refers to the variance of a population.
- P refers to the proportion of population elements
that have a particular attribute.
- Q refers to the proportion of population elements
that do not have a particular attribute, so
Q = 1 - P.
- ρ is the population correlation coefficient, based on all of
the elements from a population.
- N is the number of elements in a population.
Sample Statistics
By convention, specific symbols represent certain sample statistics.
For example,
- x̄ refers to a sample mean.
- s refers to the standard deviation of a sample.
- s² refers to the variance of a sample.
- p refers to the proportion of sample elements
that have a particular attribute.
- q refers to the proportion of sample elements
that do not have a particular attribute, so
q = 1 - p.
- r is the sample correlation coefficient, based on all of
the elements from a sample.
- n is the number of elements in a sample.
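As a rough illustration of how these symbols map onto a computation, the sketch below uses Python's standard `statistics` module. The data values and the attribute being counted (x > 4) are made up for the example; the same numbers are also treated as a full population to contrast s² with σ².

```python
import statistics

# Hypothetical data, invented for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(sample)                     # n: number of elements in the sample
x_bar = statistics.mean(sample)     # x̄: sample mean
s2 = statistics.variance(sample)    # s²: sample variance (divides by n - 1)
s = statistics.stdev(sample)        # s: sample standard deviation

# p: proportion of sample elements with the (assumed) attribute x > 4
p = sum(1 for x in sample if x > 4) / n
q = 1 - p                           # q: proportion without the attribute

# If the same values were the entire population, the population
# functions (which divide by N) would apply instead.
sigma2 = statistics.pvariance(sample)   # σ²
sigma = statistics.pstdev(sample)       # σ

print(n, x_bar, s2, s, p, q, sigma2, sigma)
```

Note that s² and σ² differ for the same data because the sample version divides by n - 1 rather than N.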
Simple Linear Regression
- Β₀ is the intercept constant in a
population regression line.
- Β₁ is the regression coefficient (i.e., slope)
in a population regression line.
- R² refers to the coefficient of determination.
- b₀ is the intercept constant in a sample
regression line.
- b₁ refers to the regression coefficient in a
sample regression line (i.e., the slope).
- sb₁ refers to the
standard error of the slope of a regression line.
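The sample quantities above can be sketched with ordinary least squares. The formulas used here (slope as Sxy / Sxx, and the slope's standard error as sqrt(SSE / (n - 2)) / sqrt(Sxx)) are the usual textbook ones rather than anything stated on this page, and the paired data are hypothetical.

```python
import math

# Hypothetical paired observations, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products around the means
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx              # slope of the sample regression line
b0 = y_bar - b1 * x_bar     # intercept of the sample regression line

# Sum of squared residuals around the fitted line
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

r2 = 1 - sse / syy                                # coefficient of determination
sb1 = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)   # standard error of the slope

print(b0, b1, r2, sb1)
```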
Probability
Counting
- n! refers to the
factorial value of n.
- nPr refers to the number of
permutations of n things taken r
at a time.
- nCr refers to the number of
combinations of n things taken r
at a time.
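For a quick check of these counting symbols, Python's standard `math` module provides matching functions (`math.perm` and `math.comb` require Python 3.8 or later). The values n = 5 and r = 2 are arbitrary choices for the example.

```python
import math

n, r = 5, 2  # arbitrary example values

n_factorial = math.factorial(n)  # n!  = 5 * 4 * 3 * 2 * 1
n_p_r = math.perm(n, r)          # nPr = n! / (n - r)!
n_c_r = math.comb(n, r)          # nCr = n! / (r! * (n - r)!)

print(n_factorial, n_p_r, n_c_r)  # 120 20 10
```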
Set Theory
Hypothesis Testing
Random Variables
- Z or z refers to a
standardized score, also known as a z-score.
- zα refers to the
standardized score that has a cumulative probability
equal to 1 - α.
- tα refers to the
t statistic that has a cumulative probability
equal to 1 - α.
- fα refers to an
f statistic that has a cumulative probability
equal to 1 - α.
- fα(v₁, v₂) is an
f statistic with a cumulative probability
of 1 - α, and v₁ and
v₂ degrees of freedom.
- Χ² refers to a chi-square statistic.
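To make the "cumulative probability equal to 1 - α" wording concrete, here is a minimal sketch for zα using `statistics.NormalDist` from the Python standard library. The z-score formula (x - μ) / σ is the standard definition rather than something spelled out above, and the numbers are hypothetical. The standard library has no t or f distribution, so tα and fα would need a third-party package and are not shown.

```python
from statistics import NormalDist

# Hypothetical values, invented for illustration only.
mu, sigma = 100, 15   # assumed population mean and standard deviation
x = 130               # a raw score

z = (x - mu) / sigma  # standardized score (z-score)

alpha = 0.05
standard_normal = NormalDist()  # mean 0, standard deviation 1

# z_alpha is the score whose cumulative probability equals 1 - alpha.
z_alpha = standard_normal.inv_cdf(1 - alpha)

# Round trip: the cumulative probability at z_alpha is 1 - alpha again.
cumulative = standard_normal.cdf(z_alpha)

print(z, round(z_alpha, 4), round(cumulative, 4))
```

With α = 0.05, zα comes out near 1.645, the familiar one-tailed critical value.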
Special Symbols
Throughout the site, certain symbols have special meanings.
For example,
- Σ is the summation symbol, used to compute sums over
a range of values.
- Σx or Σxᵢ refers
to the sum of a set of n observations. Thus,
Σxᵢ = Σx =
x₁ + x₂ + . . . +
xₙ.
- sqrt refers to the square root function. Thus,
sqrt(4) = 2 and sqrt(25) = 5.
- Var(X) refers to the variance of the random variable X.
- SD(X) refers to the standard deviation of the random variable
X.
- SE refers to the
standard error of a statistic.
- ME refers to the
margin of error.
- DF refers to the
degrees of freedom.
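The summation and square-root symbols translate directly into code. The short sketch below uses made-up observations; the sum of squared deviations is included only as one more example of Σ applied to an expression.

```python
import math

# Hypothetical observations x1 ... xn, invented for illustration only.
x = [3, 5, 7, 9]
n = len(x)

sum_x = sum(x)      # Σx = x1 + x2 + ... + xn
x_bar = sum_x / n   # mean of the observations

# Σ applied to an expression: sum of squared deviations from the mean
sum_sq_dev = sum((xi - x_bar) ** 2 for xi in x)

root_4 = math.sqrt(4)    # sqrt(4) = 2
root_25 = math.sqrt(25)  # sqrt(25) = 5

print(sum_x, x_bar, sum_sq_dev, root_4, root_25)
```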