This blog post is the final in a series of three sampling-focused posts.
The first two posts in this series describe commonly used research sampling strategies and provide some guidance on how to choose from this range of sampling methods. Here we delve further into the sampling world and address sample sizes for qualitative research and evaluation projects. Specifically, we address the often-asked question: How many in-depth interviews/focus groups do I need to conduct for my study?
Within the qualitative literature (and community of practice), the concept of “saturation” – the point when incoming data produce little or no new information – is the well-accepted standard by which sample sizes for qualitative inquiry are determined (Guest et al. 2006; Guest and MacQueen 2008). There’s just one small problem with this: saturation, by definition, can be determined only during or after data analysis. And most of us need to justify our sample sizes (to funders, ethics committees, etc.) before collecting data!
Until relatively recently, researchers and evaluators had to rely on rules of thumb or personal experience to estimate how many qualitative data collection events they needed for a study; empirical data to support these sample sizes were virtually non-existent. This began to change a little over a decade ago. Morgan and colleagues (2002) decided to plot (and publish!) the number of new concepts identified in successive interviews across four datasets. They found that almost no new concepts emerged after 20 interviews. Extrapolating from their data, we see that the first five to six in-depth interviews produced the majority of new data, and approximately 80% to 92% of concepts were identified within the first 10 interviews.
Building on this work, Guest et al. (2006) conducted a systematic inductive thematic analysis of 60 in-depth interviews among female sex workers in West Africa. Of the 114 themes identified in the entire dataset, 80 (70%) turned up in the first six interviews, and 100 themes (92%) were identified within the first 12 interviews (Figure 1). Additionally, those 100 themes comprised 97% of the most common (highest prevalence) themes, indicating that the “big ones” were evident early on.
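To make the idea of tracking thematic discovery concrete, here is a minimal sketch of how the share of themes identified after each successive interview could be computed. The function name `cumulative_theme_coverage` and the coded interviews are hypothetical and purely illustrative; they are not data or code from Guest et al. (2006) or any other study.

```python
from typing import List, Set


def cumulative_theme_coverage(themes_per_interview: List[Set[str]]) -> List[float]:
    """Return, after each interview, the share of all themes seen so far.

    The denominator is the total number of distinct themes in the whole
    dataset, so the final value is always 1.0 for a non-empty dataset.
    """
    all_themes = set().union(*themes_per_interview)
    if not all_themes:
        return [0.0 for _ in themes_per_interview]
    seen: Set[str] = set()
    coverage = []
    for themes in themes_per_interview:
        seen |= themes  # add any themes first mentioned in this interview
        coverage.append(len(seen) / len(all_themes))
    return coverage


# Hypothetical coded interviews (illustrative only, not from any study).
interviews = [
    {"cost", "stigma", "distance"},
    {"cost", "trust"},
    {"stigma", "family"},
    {"cost", "distance"},
    {"trust", "privacy"},
]
coverage = cumulative_theme_coverage(interviews)
for number, share in enumerate(coverage, start=1):
    print(f"After interview {number}: {share:.0%} of themes identified")
```

Note that this kind of curve can only be drawn once the full dataset has been coded, which is exactly why saturation is hard to justify before data collection.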
Since Guest et al.’s publication in 2006, other researchers have confirmed that 6-12 interviews seem to be a sweet spot for the number of qualitative interviews needed to reach saturation. We provide the following table as a summary.
| Study authors | Saturation definition | Findings |
|---|---|---|
| Morgan and colleagues (2002) | Not defined | |
| Guest et al. (2006) | The proportion of identified themes at a given point in analysis divided by the total number of themes identified in that analysis | |
| Francis et al. (2010) (gated) | The point, after conducting 10 interviews, when three additional interviews yield no new themes | |
| Coenen et al. (2012) (gated) | The point at which linking concepts from two consecutive focus groups or individual interviews reveals no additional second-level categories | |
| Hagaman and Wutich (2016) (gated) | The number of interviews required to identify the most common themes in a total of three interviews | |
| Namey et al. (2016) | The proportion of identified themes at a given point in analysis divided by the total number of themes identified in that analysis | |
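The Francis et al. (2010) definition reads like a stopping rule, so it lends itself to a small sketch. The code below is one possible reading of that definition, not the authors' own procedure: it assumes themes are recorded as a set per interview, and the names `stopping_point`, `initial_sample`, and `stopping_criterion` are hypothetical.

```python
from typing import List, Optional, Set


def stopping_point(themes_per_interview: List[Set[str]],
                   initial_sample: int = 10,
                   stopping_criterion: int = 3) -> Optional[int]:
    """Return the 1-based interview number at which the rule is met.

    After `initial_sample` interviews, stop once `stopping_criterion`
    consecutive interviews add no new themes. Returns None if the
    rule is never met in the data provided.
    """
    seen: Set[str] = set()
    run_without_new = 0
    for number, themes in enumerate(themes_per_interview, start=1):
        new_themes = themes - seen
        seen |= themes
        if number <= initial_sample:
            continue  # the initial interviews only build the theme list
        run_without_new = 0 if new_themes else run_without_new + 1
        if run_without_new >= stopping_criterion:
            return number
    return None


# Hypothetical data: 10 initial interviews, then 4 more (illustrative only).
initial = [{"a", "b"}, {"b", "c"}] + [{"a", "c"}] * 8
later = [{"d"}, {"a"}, {"b"}, {"c"}]  # interview 11 adds a new theme, "d"
print(stopping_point(initial + later))
```

In this made-up example the new theme in interview 11 resets the count, so the rule is only satisfied after three further interviews with nothing new.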
“But what about focus groups?” you ask. An empirically based study by Coenen et al. (2012) (gated) found that five focus groups were enough to reach saturation for their inductive thematic analysis. In a recent methodological study (gated), we followed an approach similar to that used by Guest et al. (2006) and monitored thematic discovery and code creation after each of 40 focus groups conducted among African-American men in North Carolina on the topic of health-seeking behavior. We found that the majority of themes were identified within the first focus group, and nearly all of the important (read: most frequently expressed) themes were discovered within the first three focus groups (Figure 2).
These data from our study suggest that, in a study with a relatively homogeneous population and a semi-structured guide, a sample size of two to three focus groups will likely capture about 80% of themes on a topic, including those most broadly shared. As few as three to six focus groups are likely enough to identify 90% of important themes.
Note that these sample sizes, for both interviews and focus groups, apply per sub-population of interest. Note too that thematic saturation will vary based on a number of factors (keep watch for a future blog post) and sample size should be adjusted accordingly.
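Because the ranges apply per sub-population, the total number of data collection events scales with the number of sub-populations in a study. The following is a rough planning sketch under that assumption only; it uses the 6-12 interview and three-to-six focus group ranges mentioned in this post, the helper `planned_range` is hypothetical, and it does not adjust for the other factors that affect saturation.

```python
from typing import Tuple

# Per-sub-population ranges taken from the text of this post.
INTERVIEWS_PER_SUBPOP = (6, 12)
FOCUS_GROUPS_PER_SUBPOP = (3, 6)


def planned_range(per_subpop: Tuple[int, int], sub_populations: int) -> Tuple[int, int]:
    """Scale a (low, high) per-sub-population range to the whole study."""
    if sub_populations < 1:
        raise ValueError("sub_populations must be at least 1")
    low, high = per_subpop
    return low * sub_populations, high * sub_populations


# Hypothetical study with three sub-populations of interest.
interview_total = planned_range(INTERVIEWS_PER_SUBPOP, 3)
focus_group_total = planned_range(FOCUS_GROUPS_PER_SUBPOP, 3)
print(f"Interviews: {interview_total[0]} to {interview_total[1]}")
print(f"Focus groups: {focus_group_total[0]} to {focus_group_total[1]}")
```

Treat the result as a starting estimate for a proposal rather than a guarantee of saturation.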
Use this catchy poem to remember how many in-depth interviews or focus groups you need.
*per sub-population of interest