Sample size calculation

Descriptive studies

Descriptive studies are usually undertaken using surveys. This requires some understanding of the target population, i.e., the entire group of individuals to which researchers wish to generalize the results. For example, you might be undertaking a survey of adults (20 years and over) in South Australia (survey population of approximately 1,200,000), but wish to be able to generalize the results to all adults in Australia (approximately 16,800,000), the target population.

 

Surveys

Sample sizes for surveys are based on accuracy: in other words, if you are trying to measure a rate (or proportion), how accurate do you want the estimate to be? For example, you might want to estimate the proportion (p) of adults in Australia who are married. You think it is probably around 50%, and you would like to get an estimate within ± 1% accuracy; that is, the true value is likely to fall within the range 49% to 51% with 95% confidence. This range is in fact the 95% Confidence Interval for a sample proportion (p) of 0.5, and we use it to determine the required sample size.

The formula for the 95% Confidence Interval (CI) for p is:

95% CI (p) = p ± 1.96 × SE (p)

where SE (p) is the standard error of the proportion, SE (p) = √(p(1 − p)/n). The formula for SE (p) has the square root of n, the sample size, in the denominator. In other words, as the sample size gets bigger, SE (p) gets smaller, the 95% CI (p) gets narrower, and we get a more accurate estimate. So to increase accuracy, we simply need to increase the sample size.
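To make the relationship between sample size and interval width concrete, here is a minimal Python sketch. It assumes the usual normal approximation, with 1.96 as the multiplier for 95% confidence; the function names are illustrative only.

```python
import math

Z_95 = 1.96  # normal critical value assumed for 95% confidence


def standard_error(p, n):
    """Standard error of a sample proportion p based on n responses."""
    return math.sqrt(p * (1 - p) / n)


def confidence_interval(p, n, z=Z_95):
    """Normal-approximation confidence interval for a proportion."""
    margin = z * standard_error(p, n)
    return p - margin, p + margin


# The interval around p = 0.5 narrows as the sample size grows.
for n in (100, 1_000, 10_000):
    low, high = confidence_interval(0.5, n)
    print(f"n = {n:>6}: SE = {standard_error(0.5, n):.4f}, "
          f"95% CI = {low:.3f} to {high:.3f}")
```

With p = 0.5, the interval shrinks from roughly ± 10% at n = 100 to roughly ± 1% at n = 10,000. Because n sits under a square root, a tenfold gain in accuracy costs about a hundredfold increase in sample size.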

For the above example, with an expected proportion of 50%, a large target population, and a required accuracy of ± 1%, Figure 1 shows the required sample size is about 6,500 responses out of a survey population of 1,200,000 adults in South Australia.
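As a rough check on this kind of calculation, the confidence interval formula can be rearranged to solve for n. The sketch below assumes the normal approximation (z = 1.96 for 95% confidence) and an optional finite population correction; the function name `required_sample_size` and the choice of correction are assumptions for illustration, not the method behind Figure 1.

```python
import math


def required_sample_size(p, margin, population=None, z=1.96):
    """Responses needed to estimate proportion p within +/- margin.

    p          -- expected proportion (0.5 is the most conservative choice)
    margin     -- required accuracy as a fraction, e.g. 0.01 for +/- 1%
    population -- size of the survey population; None treats it as very large
    z          -- normal critical value (1.96 assumed for 95% confidence)
    """
    # Solve margin = z * sqrt(p * (1 - p) / n) for n.
    n = z ** 2 * p * (1 - p) / margin ** 2
    if population is not None:
        # Finite population correction (an assumed, commonly used form).
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


print(required_sample_size(0.5, 0.01))                        # very large population
print(required_sample_size(0.5, 0.01, population=1_200_000))  # South Australia example
print(required_sample_size(0.5, 0.05, population=1_200_000))  # relaxed to +/- 5%
```

Under these assumptions the sketch gives roughly 9,600 responses for ± 1% accuracy (about 9,500 after correcting for 1,200,000 adults), which is higher than the value read from Figure 1, so the figure may rest on different settings, such as a lower confidence level.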

Figure 1: Required sample size

In fact, the required sample size depends heavily on the expected response to each individual question in a survey. The required sample size is at a maximum when 50% of respondents say “yes” and 50% say “no”. For this reason, we tend to use 50% in the sample size formula, since it gives the largest sample size that any question could require.
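The reason 50% is the worst case is that the sample size is proportional to p × (1 − p), which peaks at p = 0.5. A small sketch, assuming a very large population, ± 5% accuracy and z = 1.96 (all illustrative choices), shows the pattern:

```python
def sample_size(p, margin=0.05, z=1.96):
    """Unrounded sample size for a very large population (normal approximation)."""
    return z ** 2 * p * (1 - p) / margin ** 2


# Sweep the expected proportion; the requirement peaks at 50%.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"expected proportion {p:.0%}: about {sample_size(p):.0f} responses")
```

The requirement is symmetric around 50%: an expected proportion of 30% needs the same sample size as one of 70%.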

Some points to remember are:

  • Higher accuracy requirements, e.g., ±1%, need a larger sample size than lower ones, e.g., ±5%
  • Any deviation from an expected prevalence estimate of 50% requires a smaller sample size
  • The bigger the target population, the bigger the required sample size, with the increase tailing off after about 3,000.
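The effect of population size in the last point can be sketched as follows. This assumes an expected proportion of 50%, z = 1.96, and a commonly used finite population correction; the helper name `size_for_population` is hypothetical.

```python
def size_for_population(population, margin, p=0.5, z=1.96):
    """Unrounded sample size with an assumed finite population correction."""
    n0 = z ** 2 * p * (1 - p) / margin ** 2  # size for a very large population
    return n0 / (1 + (n0 - 1) / population)


# Compare a loose and a tight accuracy requirement across population sizes.
populations = (500, 3_000, 10_000, 100_000, 1_000_000)
for margin in (0.05, 0.01):
    print(f"accuracy +/- {margin:.0%}")
    for population in populations:
        n = size_for_population(population, margin)
        print(f"  population {population:>9,}: about {n:,.0f} responses")
```

At ± 5% the required size barely moves once the population passes a few thousand, whereas at ± 1% it keeps climbing for much longer, so where the curve flattens depends on the accuracy you ask for.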

 A website that allows you to undertake these calculations online can be found here:

https://www.surveymonkey.com/mp/sample-size-calculator/