A box and whisker plot is a way of summarizing a set of data measured on an interval scale. It is often used in exploratory data analysis to show the shape of the distribution, its central value, and its variability. The plot shows the most extreme values in the data set (the minimum and maximum, at the ends of the lines, or "whiskers"), the lower and upper quartiles (the edges of the box), and the median (the line through the box). (NOTE: The whiskers do not have to reach the extremes. They may be adjusted to mark a certain fraction of the data: for example, they could be set at the 5% and 95% points instead of at the minimum and maximum values.)
A box plot, as it is often called, is especially helpful for indicating whether a distribution is skewed and whether there are any unusual observations (outliers) in the data set. Box and whisker plots are also very useful when large numbers of observations are involved and when two or more data sets are being compared.
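The five values that a box and whisker plot displays can be computed directly. The sketch below is one way to do it in Python, using the standard library's `statistics.quantiles`. The sample data are made up for illustration. The rule used to flag outliers (values more than 1.5 box lengths beyond the box) is a common convention assumed here, not something stated in the text above, and different quartile methods can give slightly different box edges.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    values = sorted(data)
    # With n=4, quantiles() returns the three cut points that split
    # the data into quarters: lower quartile, median, upper quartile.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return values[0], q1, median, q3, values[-1]

def possible_outliers(data):
    """Flag values far outside the box (assumed 1.5 x IQR convention)."""
    _, q1, _, q3, _ = five_number_summary(data)
    box_length = q3 - q1  # interquartile range (IQR)
    low_fence = q1 - 1.5 * box_length
    high_fence = q3 + 1.5 * box_length
    return [x for x in data if x < low_fence or x > high_fence]

# Made-up sample data, for illustration only.
sample = [2, 4, 4, 5, 7, 8, 9, 11, 12, 30]

print(five_number_summary(sample))
print(possible_outliers(sample))
```

In this sample the median sits closer to the lower edge of the box than to the upper edge, and one value lies far above the rest, which is the kind of skew and unusual observation a box plot makes visible.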
A frequency table is a way of summarizing a set of data. It is a record of how often each value (or set of values) of the variable in question occurs. It may be enhanced by adding the percentage of observations that falls into each category. A frequency table is used to summarize categorical data, whether nominal or ordinal. It may also be used to summarize continuous data once the data set has been divided up into sensible groups.
Example: Suppose that in 30 shots at a target, a marksman makes the scores shown at left, in the graphic below. The frequencies of the different scores are summarized at right, in the graphic.
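A frequency table like this one can be built with a few lines of Python. In the sketch below, the 30 scores are made up for illustration and are not the marksman's actual scores from the graphic; the function name `frequency_table` is likewise just a label chosen for this example.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, frequency, percentage), sorted by value."""
    counts = Counter(values)  # how often each value occurs
    total = len(values)
    return [(value, counts[value], 100 * counts[value] / total)
            for value in sorted(counts)]

# Hypothetical scores for 30 shots (NOT the data in the graphic).
scores = [5, 4, 3, 4, 2, 3, 4, 5, 3, 1,
          4, 3, 4, 2, 5, 3, 4, 4, 3, 2,
          3, 5, 4, 1, 3, 4, 2, 5, 3, 4]

print("Score  Frequency  Percent")
for value, frequency, percent in frequency_table(scores):
    print(f"{value:>5}  {frequency:>9}  {percent:>6.1f}%")
```

The frequencies always add up to the number of observations (here, 30), and the percentages add up to 100.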
A normal distribution is a symmetric, bell-shaped curve that extends to infinity in both directions. The high point of the curve sits at the mean. Examples of normal distributions are shown below. Notice that they differ in how spread out they are but the area under each curve is the same. If the total area under the curve is defined to be 1, then multiplying by 100 gives 100%: any value you name is certain to fall somewhere in the distribution.
Because half the area under the curve is below the mean and half is above the mean, there is a 50% chance that a randomly chosen value will be above the mean and the same chance that it will be below it. More generally, the area under the normal curve over any range of values equals the probability of randomly drawing a value in that range. The area is greatest in the middle, where the "hump" is, and thins out toward the tails.
(Based on graphic from http://davidmlane.com/hyperstat/normal_distribution.html)
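The link between area and probability described above can be checked numerically. The sketch below uses the standard formula for the area under a normal curve to the left of a value, written with Python's built-in `math.erf`. The mean of 100 and standard deviation of 15 are arbitrary values picked for illustration, and the function names are hypothetical.

```python
import math

def area_below(x, mean=0.0, sd=1.0):
    """Area under the normal curve to the left of x (a probability)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def area_between(low, high, mean=0.0, sd=1.0):
    """Probability of randomly drawing a value between low and high."""
    return area_below(high, mean, sd) - area_below(low, mean, sd)

# Arbitrary example curve: mean 100, standard deviation 15.
mean, sd = 100.0, 15.0

# Half the area lies below the mean.
print(area_below(mean, mean, sd))

# The whole curve, from minus infinity to plus infinity, has area 1.
print(area_between(-math.inf, math.inf, mean, sd))

# Area in the middle "hump", within one standard deviation of the mean.
print(area_between(mean - sd, mean + sd, mean, sd))

# Area in a tail, more than two standard deviations above the mean.
print(area_between(mean + 2 * sd, math.inf, mean, sd))
```

Changing `sd` makes the curve wider or narrower, but the total area stays 1 and the area below the mean stays one half.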
"Q&A: Ways to display data" is a series of questions and answers about statistics written for teachers and students. The questions are ones that students might ask while studying statistics. Teachers can use this Q&A to gain additional knowledge about statistics, or use it in the classroom as outlined below.
• An engagement activity. Use selected questions to start a discussion.
• An inquiry tool. Use selected questions and answers to help students generate questions. Propose a question, such as "What is a normal distribution?" Have students read the answer to the question and write down 3–5 questions they would like answered as a result of reading the material.
• A source of information. Students can use the questions and answers as part of their research on statistics.
• A form of review. Use the questions as a review at the end of a unit on statistics.
• A follow-up. Have students read the questions and answers to gain additional information about statistics following a related activity.
Online Exploration: Galaxy Hunter