Descriptive and Multivariate Statistics
Statistics are numbers that help you make sense of and use the data collected by your survey.
Statistics in Snap can be used to:
- Summarise the data – Summary Statistics with Counts and percents
- Describe the distribution of data values – Descriptive Statistics with Means, Medians and Modes
- Make inferences about the population based on the sample data – Inferential Statistics, including Confidence Intervals and Significance tests – z, t, u, and chi-squared
Further statistical techniques can also be used to explore your data for patterns of responses. These Exploratory statistical techniques include Factor Analysis and Cluster Analysis.
For your reference, here are a few key points about the various descriptive statistics in Snap:
1. Measures of the typical, central or most common value in a distribution
Mode - The mode of a distribution is the most frequent or most popular item. If two values tie for the mode, Snap will choose the lower.
Median - The midpoint of the distribution, i.e. the value 50% of the way through the ordered values. To calculate the median, the items of the distribution are arranged in order of magnitude, starting with either the smallest or the largest. Then, if the number of items is odd, the median is the value of the middle item. If the number of items is even, the median is the mean of the two middle items.
Mean - This is often called the average, and is defined as the sum of the items divided by the number of items. Missing values are excluded from the calculation.
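As a minimal sketch of the three definitions above, the following Python uses only the standard library. The response values are hypothetical, `None` stands in for a missing value, and `lowest_mode` is an illustrative helper written for this example, not part of Snap.

```python
from collections import Counter
from statistics import mean, median


def lowest_mode(values):
    """Return the most frequent value; if several tie, return the lowest."""
    counts = Counter(values)
    top_count = max(counts.values())
    return min(v for v, c in counts.items() if c == top_count)


# Hypothetical survey responses; None marks a missing value.
responses = [3, 5, None, 2, 5, 3, 4, None]

# Missing values are dropped before any statistic is calculated.
valid = [r for r in responses if r is not None]

mode_value = lowest_mode(valid)   # 3 and 5 both occur twice, so the lower (3) is chosen
median_value = median(valid)      # six items, so the mean of the two middle items
mean_value = mean(valid)          # sum of the items divided by the number of items

print(mode_value, median_value, round(mean_value, 2))
```

With these values the two middle items of the ordered list are 3 and 4, so the median is 3.5, and the mean is 22 divided by 6.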
2. Measures of the spread of values in a distribution
1st Quartile – The value 25% of the way through the ordered values.
3rd Quartile – The value 75% of the way through the ordered values.
Minimum – The minimum is the smallest value of the distribution.
Maximum – The maximum is the largest value of the distribution.
Range – The range shows the spread of the distribution and is calculated by subtracting the smallest value (minimum) from the largest value (maximum).
Standard Deviation – The standard deviation is a measure of the spread of values in a distribution. It gives an indication of how much the values deviate from the mean. Thus, a distribution with a large range will typically have a larger standard deviation than one with a small range.
Variance – The variance is another measure of the spread of values in a distribution and is calculated as the square of the standard deviation.
Snap calculates the standard deviation and variance by assuming the data represents a sample rather than an entire population, i.e. the divisor is n-1 rather than n.
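The measures of spread can be sketched in Python as follows. The data values are hypothetical. In the standard library, `statistics.stdev` and `statistics.variance` use the sample form (divisor n-1), while `pstdev` and `pvariance` use the population form (divisor n). Quartile definitions differ between packages; the `inclusive` method used here is one common interpolation rule and is an assumption, so it may not match Snap's results exactly.

```python
from statistics import pvariance, quantiles, stdev, variance

# Hypothetical numeric responses.
values = [2, 4, 4, 5, 7, 9, 11]

minimum = min(values)
maximum = max(values)
value_range = maximum - minimum          # largest value minus smallest value

# Cut points at 25%, 50% and 75% of the ordered values.
# The interpolation method is an assumption made for this sketch.
q1, q2, q3 = quantiles(values, n=4, method="inclusive")

sample_var = variance(values)            # sum of squared deviations / (n - 1)
sample_sd = stdev(values)                # square root of the sample variance
population_var = pvariance(values)       # same sum divided by n, shown for comparison

print(minimum, maximum, value_range)
print(q1, q3)
print(sample_var, round(sample_sd, 3), round(population_var, 3))
```

Here the squared deviations from the mean (6) sum to 60, so the sample variance is 60 / 6 = 10, whereas dividing by n = 7 would give a smaller figure.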
3. Measures of the shape of a distribution of values
Skewness – A distribution that is not symmetrical but has more cases toward one end of the distribution than the other is called skewed.
In a skewed distribution, the measures of central tendency (mean, median and mode) can differ considerably. If the mean is larger than both the median (the middle value) and the mode (the most frequently occurring value), the sample is said to be positively skewed.
If the mean is smaller than both the median and the mode, the sample is said to be negatively skewed.
Kurtosis – Kurtosis also gives an indication of the shape of a distribution, in the form of the extent to which, for a given standard deviation, the data clusters around a central point. A positive value for kurtosis indicates a distribution that has more data points closer to the mean than a normal distribution. A negative value for kurtosis indicates a flatter centre, giving a more widely dispersed distribution.
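To make the shape measures concrete, here is a minimal Python sketch using the simple moment-based definitions of skewness and excess kurtosis (kurtosis minus 3, so that a normal distribution scores zero). These formulas are an assumption chosen for illustration; statistical packages often apply small-sample corrections, so Snap's figures may differ. The function name `moment_shape` and both data sets are hypothetical.

```python
from statistics import fmean


def moment_shape(values):
    """Return (skewness, excess kurtosis) from the central moments."""
    n = len(values)
    m = fmean(values)
    m2 = sum((x - m) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n   # third central moment
    m4 = sum((x - m) ** 4 for x in values) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis


# A long tail of high values pulls the mean above the median and mode.
right_tailed = [1, 1, 2, 2, 3, 10]
# Evenly spread values: symmetrical and flat.
evenly_spread = [1, 2, 3, 4, 5]

skew_r, kurt_r = moment_shape(right_tailed)
skew_e, kurt_e = moment_shape(evenly_spread)

print(round(skew_r, 3), round(kurt_r, 3))
print(round(skew_e, 3), round(kurt_e, 3))
```

The right-tailed data gives a positive skewness, while the evenly spread data gives zero skewness and a negative excess kurtosis, reflecting its flat shape.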