

Updated and expanded to cover recent advances in the field. Further Reading expanded.

Updated on 01 Oct 2013.

PRINTED FROM the Encyclopedia of Social Work, accessed online. (c) National Association of Social Workers and Oxford University Press USA, 2016. All Rights Reserved. Under the terms of the applicable license agreement governing use of the Encyclopedia of Social Work accessed online, an authorized individual user may print out a PDF of a single article for personal use, only (for details see Privacy Policy).



Abstract and Keywords

Meta-analysis is widely used in the social, behavioral, and medical sciences to combine results of multiple studies and produce relevant information for clinical practice and social policy. It is most often used to synthesize quantitative data on treatment effects, but it has many potential applications. Meta-analysis includes a set of techniques for quantitative data synthesis that can (and should) be performed in the context of systematic efforts to minimize bias at each step in the research review process (see Systematic Reviews). Without careful efforts to eliminate bias, meta-analysis can lead to wrong conclusions.

Keywords: evidence, empirical, research, literature reviews, quantitative data, systematic reviews

Meta-analysis is the quantitative synthesis of results of multiple studies. It can estimate trends, assess variations across studies, and correct for errors and bias in a body of research. This is important given the rapid accumulation of evidence on many topics relevant for social work, inconsistencies across studies, and well-known limitations of traditional research-review methods. A systematic review uses replicable procedures to minimize bias in research synthesis; meta-analysis is usually one of the last steps in this process (Littell, Corcoran, & Pillai, 2008). Meta-analysis has distinct advantages over other approaches to research synthesis, and it has limitations.


Karl Pearson conducted the first quantitative synthesis in 1904, when he computed an average correlation from 11 studies of a vaccine against typhoid. Methods for quantitative synthesis appeared in statistical texts and articles in the 1930s, but they were rarely used until the late 1970s, when several teams produced large meta-analyses on psychotherapy (Smith & Glass, 1977), class size (Glass & Smith, 1978), interpersonal expectancy effects (Rosenthal & Rubin, 1979), and the validity of employment tests by race (Hunter, Schmidt, & Hunter, 1979). Books on the conceptual, theoretical, and statistical foundations of meta-analysis appeared in the 1980s and 1990s (Cooper & Hedges, 1994; Hedges & Olkin, 1985; Light & Pillemer, 1984). There have been many recent advances in the science of research synthesis, including developments in information retrieval (Hammerstrøm, Wade, & Jørgensen, 2010), systematic-review methods (Cooper, Hedges, & Valentine, 2009; Higgins & Green, 2011; Shadish & Myers, 2004), and statistics for meta-analysis (Becker, Hedges, & Pigott, 2004; Borenstein, Hedges, Higgins, & Rothstein, 2009; Hedges & Pigott, 2001).


Meta-analysis can combine many forms of quantitative data and address diverse research questions. For example, Ahnert, Pinquart, and Lamb (2006) synthesized results of 40 studies to estimate the proportion of children who had secure attachments to nonparental caregivers. Syntheses have assessed the strength of associations between attitudes and behavior (Glasman & Albarracin, 2006), sensation-seeking and alcohol use (Hittner & Swickert, 2006), and interpersonal stress and psychosocial health in youth (Clarke, 2006). Several thousand meta-analyses have been conducted on effects of social, behavioral, educational, and medical interventions (see http://www.cochrane.org and http://www.campbellcollaboration.org). In addition to assessing main effects, meta-analysis can explore variations. For instance, Wilson, Lipsey, and Soydan (2003) compared the effects of mainstream programs for juvenile delinquency for minority versus majority youths. Quantitative methods have been used to synthesize information on diagnostic accuracy (for example, misdiagnosis of conversion symptoms; Stone et al., 2005) and the prognostic performance of tests.

Issues That Affect the Design and Interpretation of Meta-Analysis

Credible meta-analyses are designed to minimize error and bias in the synthesis of results across multiple studies. Errors and biases can arise in the original studies, in the dissemination of study results, and in the review process itself. Primary studies may systematically overestimate or underestimate effects, because of design and implementation issues that leave them vulnerable to threats to validity (Shadish, Cook, & Campbell, 2002). Confirmation bias (the tendency to support favored hypotheses and ignore evidence to the contrary) can arise in the reporting, publication, and dissemination of results of original studies. Statistically significant results are more likely to be reported, published, and cited (Dickersin, 2005; Dwan et al., 2008; Song et al., 2009), making these results more readily available than other, equally valid findings. Thus, meta-analyses that are based only on published studies are likely to produce inaccurate results.

The integration of results from multiple studies is a complex task not easily performed with “cognitive algebra.” The conclusions of narrative reviews can be influenced by trivial properties of research reports (Bushman & Wells, 2001). Several quantitative approaches to research synthesis have been developed and tested. “Vote counting” (tallying the number of studies that provide evidence for and against a hypothesis) relies on tests of significance, and it can lead to the wrong conclusions (Carlton & Strawderman, 1996). Meta-analysis provides more reliable estimates of results across studies.

A research synthesis is vulnerable to bias when the sample of studies is restricted to published reports, when reviewers fail to consider variations in study qualities, and when results are reported selectively. For these reasons, meta-analyses are often lodged in systematic reviews (Littell & Shlonsky, 2011).

Systematic-Review Methods

Systematic reviews are conducted in phases that are parallel to the steps in primary research (Cooper, 1998). Objectives and methods are laid out in advance. Reviewers specify the study designs, populations, interventions, comparisons, and outcome measures that will be included and excluded; this limits reviewers’ freedom to select studies on the basis of their results, or on some other basis (Higgins & Green, 2011; Littell et al., 2008).

Diverse sources and strategies are used to locate potentially relevant studies (Hammerstrøm et al., 2010). In addition to keyword searches of several electronic databases, hand searching of relevant journals may be needed to find eligible studies that are not properly indexed (Hopewell, Clarke, Lefebvre, & Scherer, 2006). Reviewers make special attempts to locate relevant “gray literature” (unpublished and hard-to-find studies), in order to minimize the “file drawer problem” (Hopewell, McDonald, Clarke, & Egger, 2006; Rothstein, Sutton, & Bornstein, 2005).

Key decisions are made by independent raters who compare notes, resolve differences, and document reasons for their decisions (Higgins & Green, 2011). Raters extract data from study reports onto paper or electronic coding forms to capture information about treatment characteristics, settings, participants, study design and implementation characteristics, data-collection procedures, measures, raw data, statistical results, and the coding process itself. These data are then available for use in the analysis of results.

Reviewers assess methodological characteristics of primary studies, because study design and implementation issues affect the credibility of results (Glazerman, Levy, & Myers, 2002; Kunz & Oxman, 1998; Schulz, Chalmers, Hayes, & Altman, 1995; Schulz & Grimes, 2002; Shadish & Ragsdale, 1996). There are many approaches to study quality assessment (Jüni, Altman, & Egger, 2001; Jüni, Witschi, Bloch, & Egger, 1999). Some focus on overall design features; others emphasize threats to validity or vulnerability to certain types of bias (Higgins & Green, 2011). There is, however, general consensus among methodologists and meta-analysts that study qualities should be assessed individually rather than being summed into total study-quality scores (Shadish & Myers, 2004; Wells & Littell, 2009). The impact of specific study qualities can then be examined in the analysis.

Understanding Meta-Analysis

Meta-analysis includes an array of statistical methods and techniques. Results of primary studies are converted to common metrics, called effect sizes, before they are pooled across studies. The term effect size (ES) refers to a class of statistics that represent the direction and strength of a relationship between two variables. ES metrics include the odds ratio, risk ratio, correlation coefficient, standardized mean difference (SMD), and standardized mean gain score (Borenstein et al., 2009; Lipsey & Wilson, 2001).

The most common ES for continuous data, the SMD, is the difference between the means of two groups, divided by their pooled standard deviation. When group means and standard deviations are not available, SMDs can be calculated or estimated from a variety of other statistics (Lipsey & Wilson, 2001). Since SMDs are upwardly biased when based on small samples (Hedges, 1981), meta-analysts use a correction for small sample bias, known as Hedges’ g. Similar corrections are available for odds ratios and correlation coefficients. Other adjustments can be made to handle outliers, compensate for restrictions in range, and adjust for unreliable measures (Hunter & Schmidt, 2004).
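These two calculations can be sketched in a few lines of Python. The group means, standard deviations, and sample sizes below are hypothetical, and the sketch uses the common approximation J = 1 − 3/(4N − 9) for the small-sample correction factor:

```python
import math

def smd(mean1, mean2, sd1, sd2, n1, n2):
    """Standardized mean difference: the difference between two group
    means divided by their pooled standard deviation."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    return (mean1 - mean2) / pooled_sd

def hedges_g(d, n1, n2):
    """Hedges' g: shrink a raw SMD to correct its upward small-sample
    bias, using the approximation J = 1 - 3 / (4N - 9)."""
    return d * (1 - 3 / (4 * (n1 + n2) - 9))

d = smd(12.0, 10.0, 4.0, 4.0, 20, 20)  # 0.5
g = hedges_g(d, 20, 20)                # about 0.49, slightly shrunk toward zero
```

With two groups of 20, the correction is small; with very small samples it becomes substantial.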

Depending on the review’s central questions, it may or may not make sense to pool results across studies with different sample characteristics, types of treatments, or outcome measures. Meta-analysis can produce an overall (average) ES estimate that accounts for differences in sample size and variance: study-level ESs are weighted with inverse-variance methods, so that larger studies and those with more precise estimates contribute more to the overall average than smaller, less precise ones.

Several statistical models are available for pooling data. Fixed-effect models assume that all studies provide estimates of the same population ES and that any differences between studies are due to chance (sampling error). Random-effects models assume that there are other sources of variation that are not taken into account. Mixed-effects models assume that some of the variation in the ES distribution is systematic (and can be accounted for by moderators) and that some of the variation is random.
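The contrast between fixed-effect and random-effects pooling can be sketched in Python, assuming each study's ES and sampling variance are already in hand. The effect sizes below are hypothetical, and the between-study variance uses the common DerSimonian–Laird method-of-moments estimator:

```python
def fixed_effect(es, var):
    """Fixed-effect model: inverse-variance weighted mean, so larger and
    more precise studies contribute more to the pooled estimate."""
    w = [1.0 / v for v in var]
    pooled = sum(wi * e for wi, e in zip(w, es)) / sum(w)
    se = (1.0 / sum(w)) ** 0.5
    return pooled, se

def dersimonian_laird_tau2(es, var):
    """Method-of-moments estimate of the between-study variance tau^2."""
    w = [1.0 / v for v in var]
    pooled, _ = fixed_effect(es, var)
    q = sum(wi * (e - pooled) ** 2 for wi, e in zip(w, es))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    return max(0.0, (q - (len(es) - 1)) / c)

def random_effects(es, var):
    """Random-effects model: add tau^2 to each study's variance, which
    flattens the weights relative to the fixed-effect model."""
    tau2 = dersimonian_laird_tau2(es, var)
    return fixed_effect(es, [v + tau2 for v in var])

es = [0.5, 0.2, 0.8]      # hypothetical study effect sizes
var = [0.04, 0.01, 0.09]  # and their sampling variances
fe, _ = fixed_effect(es, var)    # pulled toward the precise low-ES study
re, _ = random_effects(es, var)  # sits closer to the unweighted mean
```

Note how the random-effects estimate gives relatively more weight to the smaller studies, a typical consequence of adding tau² to every study's variance.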

Forest plots provide graphic displays of the ES distribution on a given outcome in a set of studies, showing point estimates and confidence intervals. Pooled estimates and results of homogeneity tests are often reported on forest plots.

Homogeneity analysis is used to determine whether a mean ES is representative of the distribution of data from a set of studies. It compares the observed variance in effects across studies with the variance that would be expected because of sampling error.
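In practice this comparison is usually made with Cochran's Q statistic and the derived I² index. A minimal sketch, with hypothetical effect sizes and variances:

```python
def cochran_q(es, var):
    """Cochran's Q: weighted sum of squared deviations of study effects
    from the fixed-effect (inverse-variance weighted) mean."""
    w = [1.0 / v for v in var]
    pooled = sum(wi * e for wi, e in zip(w, es)) / sum(w)
    return sum(wi * (e - pooled) ** 2 for wi, e in zip(w, es))

def i_squared(es, var):
    """I^2: percentage of observed variation in excess of what sampling
    error alone would produce (0 when Q is at or below its df)."""
    q = cochran_q(es, var)
    df = len(es) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

q = cochran_q([0.5, 0.2, 0.8], [0.04, 0.01, 0.09])   # compare to chi-square, df = 2
i2 = i_squared([0.5, 0.2, 0.8], [0.04, 0.01, 0.09])  # roughly 58% of variation is excess
```

Under the null hypothesis of homogeneity, Q follows a chi-square distribution with k − 1 degrees of freedom, where k is the number of studies.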

Subgroup analysis is performed by partitioning the sample and calculating average ES for subgroups. Meta-analysts caution against the use of many subgroup analyses, because this can become akin to “fishing” for significant results in primary studies.

There are two primary approaches to moderator analysis. The first is an analog to the analysis of variance, in which the moderator is a categorical variable. The average ES is calculated for each category (or subgroup), and tests of significance are used to assess between-group differences. For example, one can use this approach to see whether effects differ for randomized experiments versus nonrandomized studies, younger versus older children, or shorter- versus longer-term treatments. The second method uses weighted multiple-regression analysis to assess the potential impact of one or more continuous moderators on the ES; this is sometimes called meta-regression.
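The analog-to-ANOVA approach can be sketched as follows. The moderator (randomized vs. nonrandomized designs) and all effect sizes are hypothetical; the between-groups statistic is compared to a chi-square distribution:

```python
def pooled(es, var):
    """Inverse-variance weighted mean and total weight for one subgroup."""
    w = [1.0 / v for v in var]
    return sum(wi * e for wi, e in zip(w, es)) / sum(w), sum(w)

def q_between(groups):
    """Analog-to-ANOVA statistic: weighted spread of subgroup means around
    the grand mean; compare to chi-square with (#groups - 1) df."""
    means = [pooled(es, var) for es, var in groups]
    grand = sum(m * w for m, w in means) / sum(w for _, w in means)
    return sum(w * (m - grand) ** 2 for m, w in means)

# hypothetical moderator: study design
randomized = ([0.3, 0.4], [0.02, 0.03])
nonrandomized = ([0.7, 0.9], [0.05, 0.08])
qb = q_between([randomized, nonrandomized])  # exceeds 3.84, the 5% cutoff at 1 df
```

Here the nonrandomized studies show larger effects, and the between-groups statistic suggests the difference is unlikely to be due to sampling error alone.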

Funnel plots are used to identify possible biases in a distribution of ES. In a bivariate scatterplot, study effect sizes are plotted on the horizontal axis, with their standard errors (or inverse variances) on the vertical axis. In the absence of publication bias, the plot will resemble an inverted funnel. Asymmetry in the funnel plot indicates that results may be biased, perhaps by the systematic exclusion of studies with negative or null results (Egger, Smith, Schneider, & Minder, 1997). Several methods are used to identify and correct for publication bias (Rothstein et al., 2005).
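The idea behind regression-based asymmetry tests such as Egger's can be sketched as follows: regress each standardized effect (ES divided by its standard error) on its precision (1/SE); an intercept far from zero signals funnel-plot asymmetry. The data below are hypothetical:

```python
def egger_intercept(es, se):
    """Simplified Egger asymmetry test: ordinary least-squares regression
    of es/se on 1/se; return the intercept."""
    y = [e / s for e, s in zip(es, se)]
    x = [1.0 / s for s in se]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx

se = [0.1, 0.2, 0.3, 0.4]                  # smaller studies have larger SEs
symmetric = egger_intercept([0.4] * 4, se)  # constant true effect: intercept near 0
asymmetric = egger_intercept([0.4, 0.5, 0.6, 0.7], se)  # small-study effects: > 0
```

In the asymmetric example, the smallest studies report the largest effects, the pattern expected when null results from small studies go unpublished, and the intercept moves away from zero.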

Sensitivity analysis is used to determine whether findings are robust (consistent) under different assumptions. It is often used to explore the potential impact of outliers or missing data on overall results.

Most statistical software programs can perform basic meta-analysis (Lipsey & Wilson, 2001). The Cochrane Collaboration offers a no-cost, downloadable program called Review Manager (RevMan) that includes routines for computing, weighting, and pooling the most common ES metrics; it also produces forest plots and funnel plots (http://ims.cochrane.org/revman). Stand-alone meta-analysis programs, such as Comprehensive Meta-Analysis, have additional capabilities for moderator and sensitivity analyses.

Newer developments in network meta-analysis (also known as multiple-treatments or multiple-comparisons meta-analysis) permit the use of both indirect and direct comparisons to assess the relative effects of different treatments for the same condition (Salanti, 2012). For example, Cipriani and colleagues (2009) were able to use network meta-analysis to rank the relative benefits and acceptability of 12 anti-depressants for clients with major depression, even though no primary studies had been conducted with all 12 drugs.

Reporting Guidelines

Moher, Liberati, Tetzlaff, Altman, and The PRISMA Group (2009) developed the PRISMA statement (Preferred Reporting Items for Systematic reviews and Meta-analyses) to guide reporting on systematic reviews and meta-analyses. The statement includes a checklist of items and a flow diagram that should be used to describe how studies were identified, screened, and selected for the review. Additional guidelines are available for meta-analyses of observational studies (Stroup et al., 2000), individual participant data (Riley, Lambert, & Abo-Zaid, 2010), and equity issues (Welch, Petticrew, Tugwell, Moher, & O’Neill, 2012).


Meta-analysis imposes discipline on the process of research synthesis and offers more transparency than traditional narrative methods. It provides an efficient way to summarize the results of a large number of studies, and it can lead to the discovery and exploration of important variations across studies. Compared with available alternatives (narrative synthesis and vote counting), meta-analysis offers more accurate summaries of quantitative data (Bushman & Wells, 2001; Carlton & Strawderman, 1996; Cooper & Rosenthal, 1980).


Meta-analysis requires considerable effort and expertise. In the hands of analysts who are unaware of important substantive issues, meta-analysis can become a mechanical exercise. Conceptual issues are especially important in the lumping and splitting decisions that go into meta-analysis: absent a strong theoretical rationale, pooling results across different types of treatments, samples, or outcomes produces results that are not meaningful. Finally, a meta-analysis of a biased sample of studies may lead to the wrong conclusions, and meta-analysis of very weak studies will produce unreliable results.

Current Status

Meta-analysis is widely used in the social sciences, especially in psychology and education (Petticrew & Roberts, 2006), and it is the standard for synthesizing the results of clinical trials in medicine. Several governmental and nonprofit organizations sponsor or produce systematic reviews and meta-analyses. Of particular relevance for social work are the international, interdisciplinary Cochrane Collaboration and the Campbell Collaboration, which synthesize studies on health care and social care, respectively. Building on advances in the science of research synthesis, these groups produce guidelines for systematic reviews and meta-analysis.

Implications for Social Work Practice and Policy

Meta-analysis can add rigor and transparency to ongoing efforts to synthesize the growing bodies of empirical research that are relevant for social work. Thus, it has an important role in the development of knowledge for social work and human services.


References

Ahnert, L., Pinquart, M., & Lamb, M. E. (2006). Security of children’s relationships with nonparental care providers: A meta-analysis. Child Development, 77, 664–679.

Becker, B. J., Hedges, L., & Pigott, T. D. (2004). Campbell Collaboration statistical analysis policy brief. Retrieved June 7, 2012, from http://www.campbellcollaboration.org/artman2/uploads/1/C2_Statistical_Analysis_Policy_Brief-2.pdf

Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. West Sussex, UK: John Wiley & Sons.

Bushman, B. J., & Wells, G. L. (2001). Narrative impressions of literature: The availability bias and the corrective properties of meta-analytic approaches. Personality and Social Psychology Bulletin, 27, 1123–1130.

Carlton, P. L., & Strawderman, W. E. (1996). Evaluating cumulated research I: The inadequacy of traditional methods. Biological Psychiatry, 39, 65–72.

Cipriani, A., Furukawa, T. A., Salanti, G., Geddes, J. R., Higgins, J. P. T., Churchill, R., et al. (2009). Comparative efficacy and acceptability of 12 new-generation antidepressants: A multiple-treatments meta-analysis. The Lancet, 373, 746–758.

Clarke, A. T. (2006). Coping with interpersonal stress and psychosocial health among children and adolescents: A meta-analysis. Journal of Youth and Adolescence, 35, 11–24.

Cooper, H. (1998). Synthesizing research (3rd ed.). Thousand Oaks, CA: Sage.

Cooper, H., & Hedges, L. (1994). Handbook of research synthesis. New York: Russell Sage Foundation.

Cooper, H. M., Hedges, L. V., & Valentine, J. C. (2009). The handbook of research synthesis and meta-analysis (2nd ed.). New York: Russell Sage Foundation.

Cooper, H. M., & Rosenthal, R. (1980). Statistical versus traditional procedures for summarizing research findings. Psychological Bulletin, 87(3), 442–449.

Dickersin, K. (2005). Publication bias: Recognizing the problem, understanding its origins and scope, and preventing harm. In H. R. Rothstein, A. J. Sutton, & M. Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment, and adjustments. Chichester, UK: Wiley.

Dwan, K., Altman, D. G., Arnaiz, J. A., Bloom, J., Chan, A.-W., Cronin, E., et al. (2008). Systematic review of the empirical evidence of study publication bias and outcome reporting bias. PLoS ONE, 3(8), e3081. doi:10.1371/journal.pone.0003081

Egger, M., Smith, G. D., Schneider, M., & Minder, C. (1997). Bias in meta-analysis detected by a simple, graphical test. British Medical Journal, 315, 629–634.

Glasman, L. R., & Albarracin, D. (2006). Forming attitudes that predict future behavior: A meta-analysis of the attitude–behavior relation. Psychological Bulletin, 132, 778–822.

Glass, G. V., & Smith, M. K. (1978). Meta-analysis of research on the relationship of class size and achievement. Educational Evaluation and Policy Analysis, 1, 2–16.

Glazerman, S., Levy, D. M., & Myers, D. (2002). Nonexperimental replications of social experiments: A systematic review. Princeton, NJ: Mathematica Policy Research.

Hammerstrøm, K., Wade, A., & Jørgensen, A.-M. K. (2010). Searching for studies: A guide to information retrieval for Campbell systematic reviews. Retrieved January 15, 2013, from http://www.campbellcollaboration.org/resources/research/new_information_retrieval_guide.php

Hedges, L. V. (1981). Distribution theory for Glass’s estimator of effect size and related estimators. Journal of Educational Statistics, 6(2), 107–128.

Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.

Hedges, L. V., & Pigott, T. D. (2001). The power of statistical tests in meta-analysis. Psychological Methods, 6(3), 203–217.

Higgins, J. P. T., & Green, S. (Eds.) (2011). Cochrane handbook for systematic reviews of interventions, Version 5.1.0. Available from http://handbook.cochrane.org/

Hittner, J. B., & Swickert, R. (2006). Sensation seeking and alcohol use: A meta-analytic review. Addictive Behaviors, 31, 1383–1401.

Hopewell, S., Clarke, M., Lefebvre, C., & Scherer, R. (2006). Hand searching versus electronic searching to identify reports of randomized trials. In The Cochrane database of systematic reviews, Issue 4. Chichester, UK: Wiley.

Hopewell, S., McDonald, S., Clarke, M., & Egger, M. (2006). Grey literature in meta-analyses of randomized trials of health care interventions. In The Cochrane database of systematic reviews, Issue 2. Chichester, UK: Wiley.

Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and bias in research findings (2nd ed.). Thousand Oaks, CA: Sage Publications.

Hunter, J. E., Schmidt, F. L., & Hunter, R. (1979). Differential validity of employment tests by race: A comprehensive review and analysis. Psychological Bulletin, 86, 721–735.

Jüni, P., Altman, D. G., & Egger, M. (2001). Assessing the quality of controlled clinical trials. British Medical Journal, 323(7303), 42–46.

Jüni, P., Witschi, A., Bloch, R., & Egger, M. (1999). The hazards of scoring the quality of clinical trials for meta-analysis. Journal of the American Medical Association, 282, 1054–1060.

Kunz, R., & Oxman, A. D. (1998). The unpredictability paradox: Review of empirical comparisons of randomised and nonrandomised clinical trials. British Medical Journal, 317(7167), 1185–1190.

Light, R. J., & Pillemer, D. B. (1984). Summing up: The science of reviewing research. Cambridge, MA: Harvard University Press.

Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage Publications.

Littell, J. H., Corcoran, J., & Pillai, V. (2008). Systematic reviews and meta-analysis. New York: Oxford University Press.

Littell, J., & Shlonsky, A. (2011). Making sense of meta-analysis: A critique of “effectiveness of long-term psychodynamic psychotherapy.” Clinical Social Work Journal, 39, 340–346.

Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & The PRISMA Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLoS Medicine, 6(7), e1000097. doi:10.1371/journal.pmed.1000097

Petticrew, M., & Roberts, H. (2006). Systematic reviews in the social sciences: A practical guide. Oxford, UK: Blackwell.

Riley, R. D., Lambert, P. C., & Abo-Zaid, G. (2010). Meta-analysis of individual participant data: Rationale, conduct, and reporting. British Medical Journal, 340, c221.

Rosenthal, R., & Rubin, D. B. (1979). Interpersonal expectancy effects: The first 345 studies. Behavioral and Brain Sciences, 3, 377–386.

Rothstein, H., Sutton, A. J., & Borenstein, M. (Eds.). (2005). Publication bias in meta-analysis: Prevention, assessment, and adjustments. Chichester, UK: Wiley.

Salanti, G. (2012). Indirect and mixed-treatment comparison, network, or multiple-treatments meta-analysis: Many names, many benefits, many concerns for the next generation evidence synthesis tool. Research Synthesis Methods, 3, 80–97.

Schulz, K. F., Chalmers, I., Hayes, R. J., & Altman, D. G. (1995). Empirical evidence of bias: Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. Journal of the American Medical Association, 273, 408–412.

Schulz, K. F., & Grimes, D. A. (2002). Allocation concealment in randomised trials: Defending against deciphering. The Lancet, 359, 614–618.

Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin.

Shadish, W. R., & Myers, D. (2004). Campbell Collaboration research design policy brief. Retrieved from http://www.campbellcollaboration.org/artman2/uploads/1/C2_Research_Design_Policy_Brief-2.pdf

Shadish, W. R., & Ragsdale, K. (1996). Random versus nonrandom assignment in controlled experiments: Do you get the same answer? Journal of Consulting and Clinical Psychology, 64(6), 1290–1306.

Smith, M. L., & Glass, G. V. (1977). Meta-analysis of psychotherapy outcome studies. American Psychologist, 32, 752–760.

Song, F., Parekh-Bhurke, S., Hooper, L., Loke, Y., Ryder, J., Sutton, A., et al. (2009). Extent of publication bias in different categories of research cohorts: A meta-analysis of empirical studies. BMC Medical Research Methodology, 9(1), 79.

Stone, J., Smyth, R., Carson, A., Lewis, S., Prescott, R., Warlow, C., et al. (2005). Systematic review of misdiagnosis of conversion symptoms and “hysteria.” British Medical Journal, 332, 989–994.

Stroup, D. F., Berlin, J. A., Morton, S. C., Olkin, I., Williamson, G. D., Rennie, D., et al.; Meta-analysis of Observational Studies in Epidemiology (MOOSE) Group. (2000). Meta-analysis of observational studies in epidemiology: A proposal for reporting. Journal of the American Medical Association, 283(15), 2008–2012.

Welch, V., Petticrew, M., Tugwell, P., Moher, D., O’Neill, J., et al. (2012). PRISMA-Equity 2012 extension: Reporting guidelines for systematic reviews with a focus on health equity. PLoS Medicine, 9(10), e1001333.

Wells, K., & Littell, J. H. (2009). Study quality assessment in systematic reviews of research on intervention effects. Research on Social Work Practice, 19(1), 52–62.

Wilson, S. J., Lipsey, M. W., & Soydan, H. (2003). Are mainstream programs for juvenile delinquency less effective with minority than majority youth? A meta-analysis of outcomes research. Research on Social Work Practice, 13, 3–26.