Bulletin of the World Health Organization

Accounting for model uncertainty in estimating global burden of disease

David M Vock a, Elizabeth A Atchison b, Julie M Legler c, David RJ McClure d, Jamie C Carlyle e, Elysia N Jeavons f & Anthony H Burton g

a. Department of Statistics, North Carolina State University, 2311 Stinson Drive (Campus Box 8203), Raleigh, NC, 27695-8203, United States of America (USA).
b. Feinberg School of Medicine, Northwestern University, Chicago, USA.
c. Department of Mathematics, Statistics and Computer Science, St Olaf College, Northfield, USA.
d. Department of Statistical Science, Duke University, Durham, USA.
e. Carver College of Medicine, University of Iowa, Iowa City, USA.
f. Department of Health Sciences Research, Mayo Clinic, Rochester, USA.
g. World Health Organization, Geneva, Switzerland.

Correspondence to David M Vock (e-mail: dmvock@ncsu.edu).

(Submitted: 21 October 2009 – Revised version received: 14 July 2010 – Accepted: 01 October 2010 – : 04 November 2010.)

Bulletin of the World Health Organization 2011;89:112-120. doi: 10.2471/BLT.09.073577

Introduction

Because vital registration and surveillance data on etiology-specific causes of death are frequently unavailable, incomplete or inaccurate, statistical models are commonly used to estimate burden of disease. Mathers et al. argue that while such estimates have a degree of uncertainty, they are far more useful for policy-makers than fragmented and unsynthesized data.1 A key mission of the World Health Organization (WHO) is to disseminate coherent and complete disease burden estimates, and quantifying uncertainty is one of WHO’s five principles of data dissemination.2 Accurately reporting uncertainty is essential so that policy-makers can determine whether an estimate with a given level of uncertainty is appropriate for a given purpose. Furthermore, quantifying uncertainty is important to establish global health priorities and make accurate cross-national comparisons.1

Estimates of disease burden based on statistical models involve uncertainty not only because of sampling variability but also because of the choice of potential covariates and model structure. Researchers typically select a single model structure and subset of predictors and proceed to analyse the data as if the correct model structure and relevant covariates were known. Throughout this paper, we use model uncertainty to refer to the uncertainty that results from both the structure of the model and the covariates included in the model. Failing to account for model uncertainty leads to standard error calculations that underestimate the overall uncertainty of the results. While previous uncertainty measures of global burden of disease have described the uncertainty generated by the sampling variance in the underlying data, to the best of our knowledge no one to date has attempted to describe the uncertainty associated with the choice of the model used to estimate disease burden. We suggest using Bayesian model averaging to account for model uncertainty when using statistical modelling to make disease burden predictions.35

We illustrate the effects of failing to account for model uncertainty on the estimated global burden of child deaths from rotavirus infection, a major cause of childhood diarrhoea.6 According to WHO, in 2004 an estimated 527 000 (confidence interval, CI: 475 000–580 000) deaths from rotavirus infection occurred worldwide in children younger than 5 years.7,8 The full analytic framework for deriving the WHO estimate is described in Parashar et al.8 and summarized in this paper. The WHO estimate and previous estimates in the literature were computed by multiplying each country’s estimated number of child deaths from diarrhoea by each country’s estimated proportion of diarrhoea deaths attributable to rotavirus and then summing the product over all countries.711 The number of child deaths from diarrhoea in each country is routinely reported by WHO. Although no data exist on the proportion of these deaths that is caused by rotavirus, this proportion can be estimated from 76 studies across the world that have reported it for particular regions within each country. But because the proportion of child diarrhoea deaths attributed to rotavirus varies considerably across countries, using the sample average of all 76 studies to estimate this proportion for any given country would be inappropriate. Instead, to account for the heterogeneity in the proportion of diarrhoea deaths WHO has employed a statistical model using covariates thought to be associated with that proportion.

The WHO model divided the Member States, which at the time numbered 192, into five strata based on under-5 mortality and region (Table 1). The proportion of diarrhoea deaths caused by rotavirus infection was estimated for countries within each stratum based on a DerSimonian and Laird random-effects meta-analysis of studies from countries in the stratum.12 However, other reasonable covariates besides geographical area and under-5 mortality could be used to predict the proportion, as noted by Parashar et al.8

This paper explores how estimates of the number of child deaths from rotavirus diarrhoea and their uncertainty change when different model structures and covariates are included in the model used to estimate the proportion of diarrhoea deaths caused by rotavirus. Furthermore, we show how Bayesian model averaging can be used to account for model uncertainty.

Methods

The WHO model is based on the 76 prospective observational studies previously mentioned, all of which lasted for at least one year and were conducted between 1990 and 2004. Each study reported data on the proportion of diarrhoea hospitalizations among children less than 5 years of age in whom rotavirus was detected, and this proportion was used as a proxy for the proportion of diarrhoea deaths caused by rotavirus. For the purposes of this illustrative example, we have assumed that the proxy is valid.7,8 The inverse of the sample variance of this proportion was the only factor used to weight the studies, although measures of study quality could have been easily incorporated into the models. All other data used to model the proportion of diarrhoea deaths caused by rotavirus were gathered through the WHO Statistical Information System.

Strata models

Table 2 shows the variables used to stratify countries and believed to be associated with the proportion of diarrhoea deaths caused by rotavirus based on consultation with expert WHO staff, including a co-author, and a review of the existing literature on estimates of the global burden of rotavirus infection.611 Although other relevant covariates certainly exist, to predict the proportion of diarrhoea deaths caused by rotavirus in each country we were restricted to using only those covariates that were available for all countries. Since WHO initially estimated the burden of rotavirus deaths in children by dividing countries into five strata based on under-5 mortality and geography, we compared the WHO estimate with the results obtained with models in which countries were stratified according to the criteria listed in Table 1. We chose these criteria so that the five strata would have approximately equal ranges of the covariate.

We performed a DerSimonian and Laird random-effects meta-analysis to estimate the mean proportion of diarrhoea deaths in children caused by rotavirus in each stratum and its standard error.12 More specifically, for each study j in stratum i we assumed that


(1)


where Pij is the proportion of diarrhoea deaths due to rotavirus, μi is the mean proportion across stratum i, δij is the between-study random effect and eij is the within-study random error. We assumed that both δij and eij were independent and normally distributed. For consistency with the WHO model, we used the user-contributed package, rmeta, in the statistical computing language R (R Project for Statistical Computing, Vienna, Austria) to fit these models. For consistency with other studies of disease burden, we used Monte Carlo methods in all the models with which we estimated the global disease burden and its CI.7,8 We drew 10 000 parameter estimates from their sampling distributions and used them to estimate the disease burden. The mean was used as the global burden estimate and percentiles 2.5 and 97.5 were used as the CI bounds.

For comparison purposes only, the estimated proportion of diarrhoea deaths caused by rotavirus was determined using a random effects meta-analysis with all 76 studies grouped into a single stratum.

Regression models

While stratifying the study data does reduce some of the between-study variability, van Houwelingen et al. have noted the importance of exploring all sources of study heterogeneity by including study-level covariates.15 Including additional predictors can also account for the fact that the 76 studies may not represent a random sample of diarrhoea deaths in children, a critical assumption in meta-analysis.16 Because few study-level covariates were consistently reported, we used country-level covariates instead, just as we used country-level covariates to stratify the studies.

To include continuous country-level covariates and still separate within-study error from between-study error, a mixed-effects model or a meta-regression model was fitted using maximum likelihood in SAS version 9.1 (SAS Institute, Cary, United States of America).15 In principle, the model is quite similar to the one described by Equation 1, only instead of just including indicator variables for each stratum we allowed continuous covariates to be included and assumed that the variance in the between-study error was the same across all studies. Specifically, we assumed that


(2)


where Pj is the proportion of diarrhoea deaths due to rotavirus in study j, xj is the vector of country-level covariates, β is the vector of country-level parameters, δj is the between-study random effect and ej is the within-study error. As before, both δj and ej were assumed to be independent and normally distributed.


We again considered models with the covariates listed in Table 2. To derive a set of models we used a combination of expert knowledge and numerical methods. Among the eight variables considered for inclusion, models with every combination of one, two and three variables were fit and the Bayesian information criterion (BIC) was tabulated for each model.17 BIC is a measure of how well the model fits the data but penalizes models with a larger number of covariates; lower BICs indicate better fits.

Model averaging

To account for model uncertainty, we averaged across several models rather than choosing a single model’s results. The BIC can be used to empirically approximate the posterior probability of the model, which represents the likelihood that the model is correct.3,18 These posterior model probabilities can be used to derive an overall averaged point estimate and a standard error that includes model uncertainty.35 To obtain the model-averaged point estimate, E(θ|p), of the number of child deaths caused by rotavirus θ, we averaged the estimates from each of the different models, E(θ|Mk,p), weighted by how strongly the data supported each model or the posterior probability, P(θ|Mk,p):

(3)


where p is the vector of study data, K is the total number of models considered, and Mk is model k.

The model-averaged estimated variance of the burden estimate, Var(θ|p), is given by:

(4)
where Var(θ|p,Mk) is the variance of the estimate of the number of global childhood deaths from rotavirus infection in model k. The model-averaged variance takes into account both the variability in each model’s estimate due to sampling variability (second term) and the variability between the different models or model uncertainty (first term).

To compare the quality of the strata models to that of the meta-regression models, the stratification models were refit using maximum likelihood to obtain a valid BIC, which only slightly alters the standard error estimates. Because one stratum in the WHO model had only one study assigned to it, WHO used some studies assigned to other strata to also estimate the proportion in the stratum with one study.8 A statistical comparison with other models that did not use those studies twice is invalid. A conceptually similar but slightly different stratification scheme was used for comparisons and is referred to as the WHO alternative model (Table 1).

Results

Stratification models

The one-stratum model, which was constructed for comparative purposes only, produced a point estimate of 664 000 child deaths (95% CI: 608 000–721 000). This is substantially greater than the WHO’s estimate of 527 000 (CI: 475 000–580 000). Among the models with five strata, large variability was noted in the estimated number of rotavirus deaths among children (Fig. 1, Table 3). Point estimates ranged from 526 000 to 630 000 deaths. The stratification methods that used mortality rates other than under-5 mortality yielded point estimates similar to each other and ranging from 526 000 to 537 000 deaths. All but one of these strata models yielded predictions greater than the WHO estimate. The lengths of the model-based CIs were fairly consistent between models. Among the stratification models, the alternative WHO model best fit the data according to the BIC.

Fig. 1. Point estimatesa and Monte Carlo 95% confidence intervals (CIs) for the models developed from stratifying countries into five stratab based on the variables listed
DTP3, third dose of diphtheria–tetanus–pertussis vaccine; GNI, gross national income; log, logarithm; WHO, World Health Organization.
a The vertical corresponds to the WHO estimate.
b Cut-off values for each stratum are presented inTable 1.
c “ WHO” refers to the stratification method used by the WHO.
d “ WHO alternative” refers to the WHO stratification method with a slight adjustment (Table  1).
e The model average estimate averages across all the models here except for the WHO model.

Meta-regression models

Within the meta-regression models, the model that included the infant mortality rate and the natural logarithm of gross national income per capita was found to best fit the data according to the BIC (Table 4). The 20 models with the lowest BIC scores showed little difference in BIC scores. In spite of the models’ comparable predictive capacity, differences in the estimated number of rotavirus deaths in children were large, having ranged from 492 000 to 588 000 (Fig. 2). Two of the point estimates of those 20 models exceeded the limits of the 95% CI of the WHO model. In fact, the CIs yielded by the models produced by meta-regression were all wider than those produced by WHO and by most of the other stratification models. The stratification models impose a stronger assumption that the mean proportion is constant over all countries in a particular stratum, which leads to narrower CIs. The BIC scores for the meta-regression models were all considerably lower than the stratification models, including the alternative WHO model. This suggests that meta-regression models can achieve improved prediction over stratification models.

Fig. 2. Point estimatesa and Monte Carlo 95% confidence intervals (CIs) for the 20 models with the lowest Bayesian information criterion (BIC) developed from meta-regressionb
Fig. 2. Point estimates<sup>a</sup> and Monte Carlo 95% confidence intervals (CIs) for the 20 models with the lowest Bayesian information criterion (BIC) developed from meta-regression<sup>b</sup>
a The vertical line corresponds to the World Health Organization estimate.
b Based on the covariates in Table 2.
c The model average estimate averages across all the models here.

Model-averaged results

The model-averaged point estimates of the global burden of rotavirus deaths in children for each of the two broad methods (five strata and meta-regression) were similar to each other and to the WHO estimate, but the CIs were all considerably wider than the WHO’s. The model-averaged estimate of the number of deaths among all stratification models with five strata, 530 000 deaths, was similar to the point estimates of the individual models that stratified by mortality. The 95% CI interval, 473 000 to 586 000 deaths, was moderately wider than the CI yielded by the WHO model. Among the top 20 meta-regression models, the model-averaged estimate of the number of rotavirus deaths in children, 540 000, was slightly larger than the WHO estimate, and the 95% CI: 443 000 to 638 000 deaths, was far wider (195 000 versus 105 000 deaths). Averaging across all the models considered gave an estimated number of childhood rotavirus deaths of 541 000 with a 95% CI of 442 000 to 640 000 deaths, almost identical to the model-averaged results of the meta-regression models.

Discussion

The confidence limits of each model developed only account for variability in the data. Repeating the process of conducting 76 studies would yield slightly different results owing to sampling variability, which is captured by each model’s standard errors. However, the estimates between the different models exhibit substantial variability that is not taken into account by the model-based standard errors. Even after limiting the models considered to only the stratification models and the 20 regression models with the lowest BIC, the estimated number of rotavirus deaths in children in 2004 ranged from 492 000 to 664 000. This variability exists within each particular class of models.

Widely varying predictions between different models does not automatically suggest that one must account for model uncertainty because the quality of the models or their posterior probabilities should also be considered. Several poor models will yield highly variable predictions in the burden estimates, but because the model-average point estimate (Equation 3) and Equation 4 are weighted by the model quality, , models poorly supported by the data do not practically affect the model average point estimate or increase the model uncertainty. Our concern arises when good fitting models lead to variable predictions.

For example, among the meta-regression models, all 20 of the models with the lowest BIC scores were strongly supported by the data. Even the 20 best models yielded significant variability (95 000 deaths) in the estimated global burden. In this case, choosing a single set of covariates is clearly inadequate because many models that are well supported by the data give vastly different estimates of the number of rotavirus deaths in children. However, the overall model-averaged estimate of the global burden of diarrhoea deaths caused by rotavirus among children and its uncertainty are quite similar to the model average and uncertainty from the meta-regression models, since the strata models are poorly supported by the data. The data also suggest that limiting consideration to just stratification models (i.e. to a single type of model structure) is a poor decision since the data more strongly support the meta-regression models.

The results suggest that the uncertainty captured due to sampling variability is only a portion of the overall uncertainty in the estimate. Model uncertainty represents another source of variability that must be accounted for in the model. Model averaging over a broad class of models in the case of the disease burden estimates for childhood diarrhoea rotavirus deaths increases the width of the 95% CI from 105 000 to 198 000 deaths. By not accounting for model uncertainty, the WHO model substantially understates the level of uncertainty of its estimates.

Besides the model selected by WHO, many others could also produce reasonable results. The alternative models we selected are only a subset of all possible models. From a statistical perspective, we could have allowed the covariates to vary nonlinearly with the response or allowed some parameters to be estimated at the regional level (hierarchical model). For the rotavirus example presented in this paper, we lacked enough data to reliably estimate parameters at the regional level or to entertain nonlinear functional forms. From an epidemiologic perspective, we could have considered using several different methods, including a natural history model, to estimate the number of rotavirus deaths. Different epidemiologic approaches would have probably increased the model uncertainty substantially. Unfortunately, the data available to us did not allow us to consider other modelling approaches in this application. For other disease burden estimates, a broader class of models could be considered, but the underlying message remains the same: different model choices result in variable burden estimates and model averaging provides an attractive way of accounting for model uncertainty.

The methods used in this study to develop new measures for quantifying the uncertainty in the estimated global burden of childhood deaths from rotavirus diarrhoea illustrates the importance of accounting for model uncertainty and are applicable to other disease burden estimations. Since the accuracy of the data needed to inform different policy decisions will vary, failure to accurately describe the uncertainty surrounding disease burden estimates may lead to inappropriate uses of the data. WHO needs to publish disease burden estimates that fully describe the uncertainty so that its Member States can accurately prioritize global health problems and spend public health money in a cost-effective way. Furthermore, by overestimating certainty in disease burden estimates, epidemiologists downplay the need for improvements in disease surveillance and statistical capacity. Whenever a statistical model is used to estimate disease burden, we recommend developing multiple candidate models and assessing the quality of the models using fit statistics such as BIC or Akaike’s information criterion or, in the Bayesian context, posterior model probability.5,19 If a large amount of variability exists between model results that are strongly supported by the data, then model averaging should be used to account for model uncertainty.


Funding:

United States National Science Foundation Grant No. 0354308.

Competing interests:

None declared.

References

Share