Part 1: Understanding Data

Name
- mean
Translation
- the mean is the sum of all data values divided by the number
of values
Function
- the mean provides a measure of the center (also called "central
tendency" and "location") of a data set of quantitative
values
Appropriateness
- the mean is appropriate for quantitative (interval or ratio)
variables when one wants all values to have a "voice"
- the mean may not be appropriate for highly skewed distributions
or for distributions with outliers
Relationships
- the mean is directly related to the data values (raising any
value raises the mean)
- the mean is inversely related to the number of observations
(n) (if the sum remains the same)
Other
- the mean is the most commonly used measure of center (central
tendency, location) for quantitative data
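
The translation above can be sketched in a few lines of Python. This is
only an illustration: the function name and the sample scores are made
up for the example. The last line swaps one value for an outlier to show
why the mean may not suit distributions with outliers.

```python
def mean(values):
    """Sum of all data values divided by the number of values."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


scores = [2, 4, 4, 5, 10]           # hypothetical data
print(mean(scores))                 # 5.0

# Replacing the largest value with an outlier pulls the mean upward.
print(mean([2, 4, 4, 5, 100]))      # 23.0
```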

Name
- variance
Translation
- the variance is [the sum of the squared differences between
each data value and the mean] divided by [the number of data
values]
Function
- the variance provides a measure of the spread (variation)
of a quantitative variable
Appropriateness
- the variance is used for quantitative (interval or ratio)
variables
Relationships
- the variance is directly related to the differences between
the mean and each data value
- the variance is inversely related to the number of observations
(if the sum of squared differences remains the same)
Other
- the variance is not very useful as a descriptive statistic
because it is in squared units of the original measurement
- the variance is influenced by outliers because every value
in the data set influences the variance
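
A minimal sketch of the translation above, assuming the divisor is the
number of data values exactly as stated (the population form). Many
texts divide by n - 1 when the data are a sample. The function name and
the data are hypothetical.

```python
def variance(values):
    """Sum of squared differences from the mean, divided by the count."""
    n = len(values)
    if n == 0:
        raise ValueError("variance requires at least one value")
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / n


data = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical data, mean is 5
print(variance(data))               # 4.0 (in squared units)
```

Python's standard library offers the same calculation as
statistics.pvariance, and the n - 1 version as statistics.variance.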

Name
- standard deviation
Translation
- the standard deviation is the square root of [[the sum of
the squared differences between each data value and the mean]
divided by [the number of values]]
Function
- the standard deviation provides a measure of the spread (variation)
of a quantitative variable
Appropriateness
- the standard deviation is used for quantitative (interval
or ratio) variables
Relationships
- the standard deviation is directly related to the differences
between the mean and each data value
- the standard deviation is inversely related to the number
of observations (if the sum of squared differences remains the same)
Other
- the standard deviation is useful as a descriptive statistic
because it is expressed in the original units of the measurement
(unlike the variance)
- the standard deviation is influenced by outliers because
it is affected by every data value in the data set
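
As a sketch, the standard deviation is just the square root of the
variance calculation, which returns the result to the original units.
The function name and the data (imagined here as heights in cm) are
assumptions for illustration, and the divisor is the number of values
as in the translation above.

```python
import math


def std_dev(values):
    """Square root of the average squared difference from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("std_dev requires at least one value")
    m = sum(values) / n
    return math.sqrt(sum((x - m) ** 2 for x in values) / n)


heights_cm = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data
print(std_dev(heights_cm))              # 2.0 (cm, not cm squared)
```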

Name
- interquartile range (IQR)
Translation
- IQR is the difference between the third quartile (75th percentile)
and the first quartile (25th percentile)
Function
- IQR is a measure of the spread (variability) of the data
values in a distribution
Appropriateness
- IQR is used for quantitative data and can also be applied to
ordinal data
- IQR is resistant to extreme values and outliers
Relationships
- IQR is directly related to the difference between the first
and third quartile
- IQR is directly related to the spread of the data values
(the more spread out, the higher the IQR)
Other
- IQR is an appropriate measure of spread when the median (50th
percentile) is used as the measure of center
- the interval from the first quartile to the third quartile contains
the middle 50% of the data values, with the median inside it
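
A minimal sketch of the IQR, using statistics.quantiles from Python's
standard library (Python 3.8 or later). Quartile conventions differ
between textbooks and software; the "inclusive" method used here is one
choice among several, so other tools may give slightly different
quartiles. The function name and the data are hypothetical.

```python
import statistics


def iqr(values):
    """Third quartile minus first quartile."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]          # hypothetical data
print(iqr(data))                            # 4.0 (Q3 = 7.0, Q1 = 3.0)

# An extreme value stretches the range but leaves the IQR unchanged.
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 900]
print(max(with_outlier) - min(with_outlier))  # 899
print(iqr(with_outlier))                      # 4.0
```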

Name
- z-score
Translation
- a z-score is [the difference between a data value
and the mean of a distribution] divided by [the standard deviation
of the distribution]
Function
- a z-score compares a value to that of a group (distribution)
of values
- a z-score tells how many standard deviations a given
data value is above (positive z-score) or below (negative z-score)
the mean
- a z-score is used to find areas under the normal curve
Appropriateness
- z-scores are used for quantitative (interval or ratio)
data
Relationships
- a z-score is directly related to the difference between
the given value and the mean
- a z-score is inversely related to the standard deviation
Other
- while z-scores are used with the normal distribution,
the use of z-scores does not imply that a distribution
is normal
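
The z-score translation can be sketched as follows. The function name
and the data are hypothetical, and the standard deviation divides by the
number of values. The final line uses statistics.NormalDist (Python 3.8
or later) to find the area under the standard normal curve to the left
of a z-score; that area is only meaningful if the distribution is
roughly normal.

```python
import math
from statistics import NormalDist


def z_score(value, values):
    """(value - mean) / standard deviation of the distribution."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in values) / n)
    if sd == 0:
        raise ValueError("z-score is undefined when all values are equal")
    return (value - m) / sd


data = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical data: mean 5, SD 2
print(z_score(9, data))             # 2.0  (two SDs above the mean)
print(z_score(2, data))             # -1.5 (one and a half SDs below)

# Area to the left of z = 2.0 under the standard normal curve.
print(NormalDist().cdf(2.0))        # about 0.977
```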