Sampling

The population size is the number of individuals in the group to be studied. This may be the number of people in a city, or the number of people who buy new cars. Often you will not know the exact population size, and that is not a problem. The mathematics of probability shows that the size of the population has almost no effect on the precision of a sample, unless the sample exceeds a few percent of the total population you are examining. This means that a sample of 500 people is about as useful for examining the opinions of a state of 15,000,000 as it is for a city of 100,000.

For this reason, The Survey System ignores the population size when it is "large" or unknown. A large population is referred to as infinite, while a small population is considered finite. Population size is only likely to be a factor when you work with a relatively small, known, finite group of people (e.g., the members of an association).
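
The claim that population size barely matters can be checked with a small calculation. The sketch below uses the standard margin-of-error formula for a proportion, with a finite population correction. The function name `margin_of_error`, the 95% confidence value `z=1.96`, and the worst-case proportion `p=0.5` are assumptions chosen for illustration and do not come from the text above.

```python
import math

def margin_of_error(sample_size, population_size=None, z=1.96, p=0.5):
    """Approximate margin of error for a surveyed proportion.

    z=1.96 (roughly 95% confidence) and p=0.5 (worst case) are
    illustrative assumptions, not values taken from the text.
    """
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    if population_size is not None:
        # Finite population correction: only noticeable when the sample
        # is a sizeable fraction of the population.
        correction = math.sqrt(
            (population_size - sample_size) / (population_size - 1)
        )
        standard_error *= correction
    return z * standard_error

# Same sample of 500 people, three very different population sizes.
for population in (15_000_000, 100_000, 1_000):
    error = margin_of_error(500, population)
    print(f"population {population:>10,}: +/- {error * 100:.2f} points")
```

Under these assumptions, the state and the city give nearly the same margin of error. Only the population of 1,000, where the sample is half of the group, shows a clear difference.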

It is essential to use the correct sample size to accurately represent the population. Choosing a sample size that is too small may not give an accurate representation of the population distribution. Too large a sample size is wasteful and sometimes impossible to complete.

For example, suppose you want to change something in a school with a population of 500 students, and you decide to survey the school but ask only ten people. Is this truly representative of the school community? No. Ten people are not enough to accurately represent the school. Now suppose you tried to ask every person in the school. This is often difficult to accomplish and can be unnecessary. In this case, a sample of 23 should be enough to represent the population. A reasonable sample size depends on the population size and on how much sampling error can be tolerated.
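
The link between population size, tolerated error, and sample size can be sketched with the common sample-size formula for a proportion, adjusted for a finite population. The function name `required_sample_size` and the defaults `z=1.96` and `p=0.5` are assumptions for illustration. The text does not say which confidence level or margin of error lies behind its figure of 23.

```python
import math

def required_sample_size(population_size, margin, z=1.96, p=0.5):
    """Sample size needed for a given margin of error (as a fraction).

    z=1.96 (roughly 95% confidence) and p=0.5 (worst case) are
    illustrative assumptions, not values taken from the text.
    """
    # Sample size for an effectively infinite population.
    unlimited = (z ** 2) * p * (1 - p) / margin ** 2
    # Adjust downward because the population is finite.
    adjusted = unlimited / (1 + (unlimited - 1) / population_size)
    return math.ceil(adjusted)

# A school of 500 students at several tolerated margins of error.
for margin in (0.20, 0.10, 0.05):
    size = required_sample_size(500, margin)
    print(f"margin +/- {margin:.0%}: sample of {size}")
```

With these assumed defaults, a margin of plus or minus 20 percentage points happens to give 23 for a school of 500. Tighter margins call for much larger samples, which is why the tolerated sampling error has to be decided before the sample size.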

A simple random sample is formed by assigning each member of the population a number and then selecting from these numbers at random. One way to make the selection random is to use a random number table or to let a computer generate a series of random numbers.

Each member of the population is assigned a unique number, or perhaps a number is already assigned to each member, such as a social security number or telephone number. The members of the population chosen for the sample will be those whose numbers are identical to the ones extracted from the random number table (or computer), in succession, until the desired sample size is reached.
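
The computer-generated version of this procedure can be sketched in a few lines using Python's standard `random` module. The function name `simple_random_sample` and the optional `seed` parameter are hypothetical, added only so the example can be repeated.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick sample_size distinct member numbers from 1..population_size."""
    rng = random.Random(seed)  # seed is optional, for repeatable runs
    member_numbers = range(1, population_size + 1)
    # random.sample draws without replacement, so no member is picked twice.
    return rng.sample(member_numbers, sample_size)

# Choose 23 students from a school of 500 numbered students.
chosen = simple_random_sample(500, 23, seed=1)
print(sorted(chosen))
```

Every member has the same chance of being selected, and the drawing stops once the desired sample size is reached.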

The simple random sample requires less knowledge about the population than other techniques, but it has two major drawbacks. First, if the population is large, a great deal of time must be spent listing and numbering the members. Second, a simple random sample will not adequately represent many population attributes (characteristics) unless the sample is relatively large.

That is, if you want a sample that is representative of the population in its distribution of gender, age, and economic status, a simple random sample must be very large to ensure that all of these distributions match those of the population.

**Systematic sampling** — Similar to simple random sampling, but instead of selecting random numbers from tables, you move through a list (sample frame) picking every nth name. For example, pick every 10th name from an alphabetical list of students enrolled in a school.
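
As a rough sketch, picking every nth name is a single slice in Python. The function name `systematic_sample` is hypothetical, and the random starting position is an assumption added here (a common practice) rather than something the description above requires.

```python
import random

def systematic_sample(frame, step, seed=None):
    """Take every step-th entry of the list, from a random starting point."""
    rng = random.Random(seed)
    # Assumption: start somewhere in the first `step` entries so that
    # every name on the list has a chance of being selected.
    start = rng.randrange(step)
    return frame[start::step]

# An alphabetical list of 100 made-up student names, sampled every 10th.
students = [f"Student {number:03d}" for number in range(1, 101)]
print(systematic_sample(students, 10, seed=1))
```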

**Random route sampling** — Used in market research surveys, mainly for sampling households, shops, garages, and other premises in urban areas. A starting address is randomly selected and, taking alternate left- and right-hand turns at road junctions, every nth address is selected.

**Stratified sampling** — All people in the sampling frame are divided into "strata" (groups or categories). Within each stratum, a simple random sample or systematic sample is selected. For example, a politician wishes to poll his/her constituents regarding taxation. The constituents are broken into income brackets and then each bracket is polled.
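
A minimal sketch of the idea follows: group the sampling frame into strata, then draw a simple random sample inside each one. The function name `stratified_sample`, the `stratum_of` callback, and sampling the same fraction from every stratum are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(people, stratum_of, fraction, seed=None):
    """Draw a simple random sample of the given fraction from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in people:
        strata[stratum_of(person)].append(person)
    sample = {}
    for name, members in strata.items():
        # Assumption: take at least one person from every stratum.
        count = max(1, round(len(members) * fraction))
        sample[name] = rng.sample(members, count)
    return sample

# 100 made-up constituents, each tagged with an income bracket.
constituents = [
    (number, "low" if number < 60 else "middle" if number < 90 else "high")
    for number in range(100)
]
polled = stratified_sample(constituents, lambda c: c[1], 0.10, seed=1)
for bracket, members in polled.items():
    print(bracket, len(members))
```

Because each bracket is sampled separately, every bracket is guaranteed to appear in the poll in proportion to its size.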

**Cluster or area random sampling** — In cluster sampling, the population is divided into clusters (usually along geographic boundaries), the clusters are randomly sampled, and all units within the sampled cluster are measured. For example, a survey of town governments that will require going to the towns personally could be done by using county boundaries as the clusters and randomly selecting five counties. All the town governments in these selected counties would then be measured.
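
The county example can be sketched as follows. The function name `cluster_sample` and the made-up county and town names are hypothetical. The key point is that only the clusters are chosen at random, and then every unit inside a chosen cluster is measured.

```python
import random

def cluster_sample(clusters, number_of_clusters, seed=None):
    """Randomly choose clusters, then keep every unit in each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), number_of_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# 12 made-up counties, each containing 3 made-up town governments.
counties = {
    f"County {c}": [f"Town {c}-{t}" for t in range(1, 4)]
    for c in range(1, 13)
}
selected_counties, towns_to_visit = cluster_sample(counties, 5, seed=1)
print(selected_counties)
print(len(towns_to_visit), "town governments to visit")
```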

**Multi-stage cluster sampling** — As the name implies, this involves drawing samples in several stages. The first stage is a cluster sample (as described above), and then another sample is taken from within the selected clusters. For example, a face-to-face survey of the residents of a state could be done by first selecting a sample of counties and then drawing another sample, such as a systematic sample, of the residents of those selected counties. Because interviewers travel only to the selected counties, the cost of interviewing is reduced.
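
A two-stage version of the state survey might look like the sketch below: counties are chosen at random first, and then every nth resident is taken within each chosen county. The function name `multistage_sample`, the made-up data, and the random starting position within each county are assumptions for illustration.

```python
import random

def multistage_sample(residents_by_county, number_of_counties, step, seed=None):
    """Stage 1: random counties. Stage 2: every step-th resident in each."""
    rng = random.Random(seed)
    counties = rng.sample(sorted(residents_by_county), number_of_counties)
    sample = {}
    for county in counties:
        residents = residents_by_county[county]
        # Assumption: a random start within the first `step` residents.
        start = rng.randrange(step)
        sample[county] = residents[start::step]
    return sample

# 10 made-up counties with 100 made-up residents each.
state = {
    f"County {c}": [f"Resident {c}-{r}" for r in range(1, 101)]
    for c in range(1, 11)
}
interviews = multistage_sample(state, 3, 20, seed=1)
for county, residents in interviews.items():
    print(county, len(residents))
```

Unlike plain cluster sampling, not everyone in a selected county is interviewed, so the workload can be kept small in both stages.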

There are many other methods of sampling that are more advanced.

Description

**"Q&A: Sampling"** is a series of questions and answers about statistics written for teachers and students. The questions are ones that students might ask while studying statistics. Teachers can use this Q&A to gain additional knowledge about statistics, or use it in the classroom as outlined below.

FORMATS AVAILABLE:

Printer-friendly web page

GRADES:

Adaptable, at teacher's discretion

How to use in the classroom

**• An engagement activity.** Use selected questions to start a discussion.

**• An inquiry tool.** Use selected questions and answers to help students generate questions. Propose a question, such as "What is population size?" Have students read the answer to the question and write down 3–5 questions they would like answered as a result of reading the material.

**• A source of information.** Students can use the questions and answers as part of their research on statistics.

**• A form of review.** Use the questions as a review at the end of a unit on statistics.

**• A follow-up.** Have students read the questions and answers to gain additional information about statistics following a related activity.

Related materials

Online Exploration: Galaxy Hunter