Every survey contains some form of error. Even a complete census of all known members of a population is subject to random error or potential measurement error. There are two major forms of error that might be encountered in a survey: random error and systematic error.
Random error occurs when a particular sample is not representative of the population of interest due to random variation. It can be expressed as the difference between the sample results and the true results. Even if all aspects of the sample are executed properly, the results are still subject to a certain amount of error because of random, chance variation.
A systematic error occurs when something is wrong with the technique being used or when an instrument is not calibrated correctly. The resulting bias runs throughout the sample and, unlike random error, does not shrink as the sample gets larger.
Calculation of sampling error (also called standard error) is based on the standard deviation of the sample: the greater the sample standard deviation, the greater the sampling error. The sampling error is also related to the sample size: the greater the sample size, the smaller the sampling error. For a sample mean, the standard error is the sample standard deviation divided by the square root of the sample size. This error cannot be avoided, only reduced by increasing the sample size.
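The relationship between standard deviation, sample size, and standard error can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `standard_error` and the data values are hypothetical and were invented for this example.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    return statistics.stdev(sample) / math.sqrt(n)


# Hypothetical measurements, invented for illustration only.
small_sample = [2, 4, 4, 4, 5, 5, 7, 9]
# The same values repeated four times: similar spread, four times the size.
large_sample = small_sample * 4

se_small = standard_error(small_sample)
se_large = standard_error(large_sample)

print(f"n = {len(small_sample)}: standard error = {se_small:.3f}")
print(f"n = {len(large_sample)}: standard error = {se_large:.3f}")
```

With a similar spread and four times as many values, the standard error comes out at roughly half its earlier size, because it shrinks with the square root of the sample size.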
It is possible to estimate the range of random error at a particular level of confidence. Suppose we surveyed 500 people and found that 65% of them said that vanilla is their favorite ice cream. For a sample of 500, the sampling error is about 4 percentage points at the 95% confidence level. This means that we can expect our sample result to be within 4 percentage points of the actual figure for the population; in other words, the actual figure could be as high as 69% or as low as 61%. As sample size increases, sampling error decreases: it is about 10 percentage points for a sample of 100 and about 3 percentage points for a sample of 1000.
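The figures quoted for samples of 100, 500, and 1000 can be approximated with the usual normal-approximation formula for a proportion. The sketch below assumes a 95% confidence level (z of 1.96) and the most conservative proportion of 0.5; both are assumptions of this example, and `margin_of_error` is a hypothetical helper name.

```python
import math

Z_95 = 1.96  # assumed z-score for a 95% confidence level


def margin_of_error(n, p=0.5, z=Z_95):
    """Approximate margin of error for a proportion p in a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


for n in (100, 500, 1000):
    # Report the margin in percentage points.
    print(f"n = {n:4d}: about {100 * margin_of_error(n):.1f} percentage points")

# Margin for the ice cream survey itself (65% of 500 respondents).
vanilla_margin = margin_of_error(500, p=0.65)
print(f"vanilla survey: about {100 * vanilla_margin:.1f} percentage points")
```

Under these assumptions the results are close to 9.8, 4.4, and 3.1 percentage points, which round to the 10, 4, and 3 quoted in the text.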
When the area of the standard normal curve is divided into sections by standard error above and below the mean, the area in each section is a known quantity. The areas above and below the mean can be added together to get the probability of obtaining a value within (plus or minus) a given number of standard errors. There is about a 68% chance of a value falling within one standard error of the mean, about a 95% chance within two standard errors, and about a 99.7% chance that it will be within three.
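These areas can be computed directly from the error function in Python's standard library. The sketch below is one way to do it; `prob_within` is a hypothetical helper name chosen for this example.

```python
import math


def prob_within(k):
    """Probability that a standard normal value lies within k standard errors of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard error(s): {100 * prob_within(k):.1f}%")
```

The three probabilities come out at roughly 68.3%, 95.4%, and 99.7%.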
Suppose a normal distribution has a mean of 3.75 (the highest point on the graph below) and a standard deviation of 0.25. Then about 68% of the values will fall between 3.5 and 4.0, as shown below.
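The same example can be checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library, with the mean and standard deviation taken from the text; the variable names are illustrative only.

```python
from statistics import NormalDist

# Normal distribution from the example: mean 3.75, standard deviation 0.25.
dist = NormalDist(mu=3.75, sigma=0.25)

# Share of values within one standard deviation of the mean (3.5 to 4.0).
share_one_sd = dist.cdf(4.0) - dist.cdf(3.5)

# Share of values within two standard deviations (3.25 to 4.25).
share_two_sd = dist.cdf(4.25) - dist.cdf(3.25)

print(f"between 3.50 and 4.00: {100 * share_one_sd:.1f}%")
print(f"between 3.25 and 4.25: {100 * share_two_sd:.1f}%")
```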
Significance levels are used when two sets of data are being compared. The significance level is the likelihood of obtaining a particular result by chance rather than due to a truly significant difference in the two sets of data. It is the complement of the confidence level: a significance level of 0.05 corresponds to a confidence level of 95%. The smaller the significance level, the more stringent the test, and the less likely it is that a reported difference is due to chance alone. Common significance levels are 0.05 (1 chance in 20), 0.01 (1 chance in 100) and 0.001 (1 chance in 1000).
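As an illustration of comparing two sets of data against these thresholds, the sketch below runs a two-proportion z-test, which is one common technique and not necessarily the one a given study would use. The survey counts are hypothetical, and `two_proportion_p_value` is a helper name invented for this example.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value for the difference between two sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # combined proportion, assuming no true difference
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("proportions are all 0 or all 1; the test is undefined")
    z = (p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical surveys: 325 of 500 people prefer vanilla in one town,
# 290 of 500 in another. These counts are invented for illustration.
p_value = two_proportion_p_value(325, 500, 290, 500)

for level in (0.05, 0.01, 0.001):
    verdict = "significant" if p_value < level else "not significant"
    print(f"level {level}: {verdict} (p = {p_value:.4f})")
```

For these made-up counts the p-value falls between 0.01 and 0.05, so the difference would pass the 0.05 test but not the more stringent 0.01 or 0.001 tests.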
"Q&A: Measurement errors" is a series of questions and answers about statistics written for teachers and students. The questions are ones that students might ask while studying statistics. Teachers can use this Q&A to gain additional knowledge about statistics, or use it in the classroom as outlined below.
• An engagement activity. Use selected questions to start a discussion.
• An inquiry tool. Use selected questions and answers to help students generate questions. Propose a question, such as "What is sampling error?" Have students read the answer to the question and write down 3–5 questions they would like answered as a result of reading the material.
• A source of information. Students can use the questions and answers as part of their research on statistics.
• A form of review. Use the questions as a review at the end of a unit on statistics.
• A follow-up. Have students read the questions and answers to gain additional information about statistics following a related activity.
Online Exploration: Galaxy Hunter