The world health report

Chapter 2

Overview of risk assessment methods

The overall aim of the analyses reported here was to obtain reliable and comparable estimates of attributable and avoidable burden of disease and injury, for selected risk factors. More specifically, the objectives were to estimate, by age, sex and region, for selected risk factors:

  • attributable burden of disease and injury for 2000, compared to the theoretical minimum;
  • avoidable burden of disease and injury in 2010, 2020 and 2030, for a standardized range of reductions in risk factors.

Standard WHO age groups were chosen (0--4, 5--14, 15--24, 25--44, 45--59, 60--69, 70--79, and 80+ years) and epidemiological subregions were based on WHO regions, subdivided by mortality patterns (see the List of Member States by WHO Region and mortality stratum).

The methodology involved calculating population attributable risk, or where multi-level data were available, potential impact fractions. These measures estimate the proportional reduction in disease burden resulting from a specific change in the distribution of a risk factor. The potential impact fraction (PIF) is given by the following equation:

$$\mathrm{PIF} = \frac{\sum_{i=1}^{n} P_i \,(RR_i - 1)}{\sum_{i=1}^{n} P_i \,(RR_i - 1) + 1}$$

where RR is the relative risk at a given exposure level, P is the population level or distribution of exposure, and n is the maximum exposure level.

Potential impact fractions require three main categories of data input, as summarized in Figure 2.5. The relationship between these key input variables and the basic methodology involved in calculating and applying population attributable fractions is summarized in Figure 2.6. It is clear from Figure 2.6 that risk factors that are more prevalent or that affect common diseases can be responsible for a greater attributable burden than other risk factors that have much higher relative risks.
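
As a minimal sketch of this calculation, the Python function below computes an attributable fraction from exposure prevalences and relative risks. It assumes the form PIF = sum(P_i (RR_i - 1)) / (sum(P_i (RR_i - 1)) + 1), with the unexposed group as the reference. The function name and all input values are hypothetical and are not taken from the report. They are chosen only to illustrate how a common exposure with a modest relative risk can account for more burden than a rare exposure with a high relative risk.

```python
def attributable_fraction(prevalences, relative_risks):
    """Attributable fraction for n exposure levels (assumed form, see lead-in).

    prevalences[i] is the share of the population at exposure level i and
    relative_risks[i] is the relative risk at that level, both measured
    against an unexposed reference group.
    """
    if len(prevalences) != len(relative_risks):
        raise ValueError("one relative risk is needed per exposure level")
    if any(p < 0 for p in prevalences) or sum(prevalences) > 1 + 1e-9:
        raise ValueError("prevalences must be non-negative and sum to at most 1")
    # Excess risk contributed by the exposed levels.
    excess = sum(p * (rr - 1.0) for p, rr in zip(prevalences, relative_risks))
    return excess / (excess + 1.0)


# Hypothetical inputs for illustration only.
common_modest = attributable_fraction([0.50], [2.0])   # half exposed, RR = 2
rare_strong = attributable_fraction([0.01], [10.0])    # 1% exposed, RR = 10
print(f"common exposure, modest risk: {common_modest:.3f}")
print(f"rare exposure, strong risk:   {rare_strong:.3f}")
```

With these illustrative inputs the common exposure accounts for about a third of the burden and the rare one for less than a tenth, despite its fivefold higher relative risk.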

Choosing and defining risks to health

The risk factors assessed in this report were chosen with the following considerations in mind.

  • Potential global impact: likely to be among leading causes of disease burden as a result of high prevalence and/or large increases in risk for major types of death and disability.
  • High likelihood of causality.
  • Potential modifiability.
  • Neither too specific nor too broad (environmental hazards as a whole, for example, would be too broad).
  • Availability of reasonably complete data on risk factor distributions and risk factor--disease relationships.

There is unavoidably an arbitrary component to any choice of risk factors for assessment, as time and resource constraints will always operate and trade-offs will be required. For example, some factors, such as global warming, for which data are substantially incomplete may nonetheless be of such potential importance that they should be included and their impact estimated based on possible scenarios and theoretical models. These trade-offs should be made clear when the data sources, methods and results are reported in detail, including estimation of uncertainty.

Clearly, one risk factor can lead to many outcomes, and one outcome can be caused by many risk factors. For each possible risk factor--burden relationship, a systematic and documented assessment of causality was performed. Many approaches have been proposed for the assessment of causality. One that is widely known and reasonably well accepted is the set of "standards" proposed by Hill (29). These are not indisputable rules for causation, and Hill emphasized that they should not be taken directly as a score. It is, however, widely agreed that a judgement of causality should be increasingly confident with the accumulation of satisfied standards including the following.

  • Temporality -- Cause must precede effect in time.
  • Strength -- Strong associations that are credible are more likely to be causal than weak associations, because if a strong association were wholly to result from some other factor, then it is more likely that other factor would be apparent. But a weak association does not rule out a causal connection.
  • Consistency -- Repeated observations of associations in different populations under different circumstances increase a belief that they are causal. But some effects are produced by their causes only under specific circumstances.
  • Biological gradient -- Presence of a dose--response curve suggests causality, although some causal associations do have a threshold, and for others the dose--response can arise from confounding factors.
  • Plausibility -- Biological plausibility is relevant, but can be subjective and is based on current level of knowledge and beliefs.
  • Experimental evidence -- Experimental evidence, in which some groups differ only with respect to the risk factor of interest, provides powerful evidence of causation. But evidence from human experiments is often not available.

Systematic assessments of causality, along with the other criteria listed above, led to the inclusion in this report of a number of risks to health and affected outcomes, which are discussed in Chapter 4.

Estimating current risk factor levels and choosing counterfactuals

Risk factor levels in the population are the first main data input in estimating potential impact fractions. Extensive searches were required to estimate risk factor levels by the 224 age, sex and country groups used as the basis for analysis, particularly for data in economically developing countries. For all risk factors, there was a need to extrapolate data to some age, sex and country groups for which direct information was not available. Wherever possible, this extrapolation was based on generalizing from a particular subgroup that had similar health, demographic, socioeconomic or other relevant indicators.

The theoretical minimum was chosen as the counterfactual for all risk factors. For risk factors for which zero is not possible (for example, cholesterol), the theoretical minimum was the distribution associated with lowest overall risk. For some exposures (such as alcohol) there may be subgroups (by region, age or sex) for which zero exposure may not always be associated with the lowest risk. To maximize comparability, however, the theoretical minimum counterfactual was taken to be the same across population groups. This aided overall interpretation of the results, avoiding "shifting goal posts", yet still allowed for estimation of when minimum risks occurred at non-zero levels. Since policy-relevant reductions are likely to vary by, for example, age, sex or region, a range of estimates was made for counterfactual distributions at set intervals between the current situation and the theoretical minimum.
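
The idea of counterfactual distributions at set intervals can be sketched as follows. The sketch assumes the general form PIF = (sum(P_i RR_i) - sum(P'_i RR_i)) / sum(P_i RR_i), where P' is the counterfactual distribution. It also assumes, purely for illustration, that each counterfactual is a linear blend of the current distribution and the theoretical minimum. The function names, the three exposure categories, the relative risks and the 25% steps are all hypothetical.

```python
def shifted_distribution(current, minimum, fraction):
    """Move each category share `fraction` of the way from current to minimum."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return [c + fraction * (m - c) for c, m in zip(current, minimum)]


def potential_impact_fraction(current, counterfactual, relative_risks):
    """Proportional change in burden when exposure moves to the counterfactual."""
    baseline = sum(p * rr for p, rr in zip(current, relative_risks))
    alternative = sum(p * rr for p, rr in zip(counterfactual, relative_risks))
    return (baseline - alternative) / baseline


# Hypothetical three-level exposure: none, moderate, high.
relative_risks = [1.0, 1.5, 3.0]
current = [0.5, 0.3, 0.2]
theoretical_minimum = [1.0, 0.0, 0.0]   # everyone at the lowest-risk level

results = {}
for fraction in (0.25, 0.50, 0.75, 1.00):
    counterfactual = shifted_distribution(current, theoretical_minimum, fraction)
    results[fraction] = potential_impact_fraction(
        current, counterfactual, relative_risks
    )
    print(f"{fraction:.0%} of the way to the minimum: PIF = {results[fraction]:.3f}")
```

Because the blend is linear, the fraction avoided scales in proportion to the size of the shift. Other ways of constructing intermediate distributions would not necessarily have this property.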

For the purposes of this report, risk factors were defined in light of data availability, the requirement for consistency, and a preference to assess multiple levels of exposure and hence the likely impact of shifting the risk factor distribution in the population.

Estimating current and future disease and injury burden

The second data input into potential impact fractions is information on amounts of burden of disease and injury in the population, by age, sex and region. Current and future disease and injury burden was estimated as part of the ongoing global burden of disease project (30).

Estimating risk factor--burden relationships

The third data input into potential impact fractions comprised estimates of risk factor--burden relationships by age, sex and subregion. For most risks, direct information on such relationships came only from developed countries. This highlights the importance of assessing generalizability of data, in view of the need to extrapolate results to age, sex and region groups for which direct evidence is not available. For risk factor levels, there is often no particular reason to expect levels to be consistent between regions. Risk factor--disease relationships will, however, often be more generalizable, since they may, at least in part, be intrinsic biological relationships. Consistency between the results of reliable studies conducted in different settings is an indicator of causality and generalizability. While the representativeness of a study population is an essential component of extrapolating results for risk factor levels, study reliability and comparability will often be more important in assessing risk factor--disease relationships. Since relative risks tend to be the most generalizable entity, these were typically reported. When relative risk per unit exposure varied between populations, this was incorporated wherever possible. For example, the relative risk for current tobacco smoking and heart disease appears to be less in the People's Republic of China than in North America and Europe, principally because of a shorter history of smoking among the Chinese.

Estimates of avoidable burden

Current action to target risks to health can change the future but cannot alter the past. Future disease burden can be avoided, but nothing can be done about attributable burden, which results from past exposure. For this analysis, avoidable burden was defined as the fraction of disease burden in a particular year that would be avoided with a specified alternative current and future exposure. Estimates of avoidable burden are particularly challenging, given that they involve all the uncertainty in the estimates of attributable burden plus that in a number of extra data inputs, described below.

  • Projected global burden of disease.
  • Risk factor levels under a "business as usual" scenario. Some projections were based on observed trends over the past few decades (for example, childhood malnutrition) and others based on models using exposure determinants and their expected trends (for example, physical inactivity, indoor smoke from solid fuels).
  • Projected risk factor levels under a counterfactual scenario -- for example, a 25% transition towards the theoretical minimum, starting from 2000 and remaining at 25% of the distance between business-as-usual and theoretical minimum exposure.
  • Estimates of risk "reversibility". These may occur to different extents and over different time frames for various risk factor--burden relationships. After some time, the excess risk of a "previously exposed" group may reach that of the "never exposed" group, or may only be partially reversed. For all acute or almost-acute hazards, including injuries and childhood mortality risk factors, immediate reversibility was assumed. The impact of cessation of the use of alcohol and illicit drugs on neuropsychological diseases, while known to be delayed, was assumed to be fully reversed by 2010, the earliest reporting year. Thus ex-exposed in 2010 were assumed to have the same risk as never-exposed. For blood pressure (31,32) and cholesterol (33), most or all of the risks were assumed to be reversed within five years and all within 10 years. Since more distal risk factors such as obesity and physical inactivity operate in large part through these exposures, these data forme

Estimating the joint effects of multiple risks

The main estimates presented in this report are for burden resulting from single risk factors, with the assumption that all others are held constant. Such estimates are valuable for comparative assessments, but there is also a need for estimates of the net effects of clusters of risk factors. When two risks affect different diseases, then clearly their net effects are simply the sum of their separate effects. However, when they affect the same disease or injury outcomes, then the net effects may be less or more than the sum of their separate effects. The size of these joint effects depends principally on the amount of prevalence overlap (for example, how much more likely people who smoke are to drink alcohol) and the biological effects of joint exposures (for example, whether the risks of alcohol are greater among those who smoke) (27). However, these have very little influence on net effects when the population attributable fractions are high for individual risk factors, as was often the case in these analyses -- for example, more than 80% of diarrhoeal disease was attributed to unsafe water, sanitation and hygiene. The data requirements for ideal assessment of joint effects are substantial and assumptions were made of multiplicatively independent relative risks, except for empirical assessments of joint effects for two main clusters -- risk factors that are major causes of cardiovascular disease and those that are major causes of childhood mortality. An alternative approach is outlined in Box 2.6. This simulation method based on individual participant data from a single cohort is compatible with the joint effects estimated from aggregate data as described above.
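
Under the assumption of multiplicatively independent relative risks, one commonly used way to combine attributable fractions for risks acting on the same outcome is 1 minus the product of (1 - PAF_i). The sketch below applies that formula. The function name and input values are hypothetical, and the formula is offered as an illustration of the independence assumption rather than as the exact procedure used for this report.

```python
from math import prod


def joint_attributable_fraction(fractions):
    """Combine attributable fractions, assuming independent multiplicative risks."""
    for f in fractions:
        if not 0.0 <= f <= 1.0:
            raise ValueError("each attributable fraction must lie between 0 and 1")
    # The share of burden left after removing every risk is the product of
    # the shares left after removing each risk separately.
    return 1.0 - prod(1.0 - f for f in fractions)


# Hypothetical fractions for two risks affecting the same disease.
separate = [0.80, 0.30]
joint = joint_attributable_fraction(separate)
print(f"crude sum of separate fractions: {sum(separate):.2f}")
print(f"joint fraction:                  {joint:.2f}")
```

In this illustration the crude sum exceeds 100%, which is impossible for a single outcome, whereas the joint fraction stays below 1 and is dominated by the larger of the two individual fractions.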

Box 2.6 Estimating the combined effects of cardiovascular disease risk factors

There are several major risk factors for cardiovascular disease, and the actions of some are mediated through others. For example, overweight and obesity increase the risk of coronary disease in part through adverse effects on blood pressure, lipid profile and insulin sensitivity. The causal web model of disease causation reflects the fact that risk factors often increase not only the risk of disease, but also levels of other risk factors.

Separate estimation of the effects of individual risk factors does not typically take into account the effect of changes on the levels of other risk factors. One way of achieving this is to use measured relationships between the levels of the different risk factors to simulate what would happen in a "counterfactual cohort", if levels of one or more risk factors were altered. The relationship between levels of risk factors and disease can then be used to determine the rate of disease in the simulated cohort. The proportion of people in the population that would develop coronary heart disease (CHD) under each intervention is a counterfactual (unobserved) quantity. The g-formula (Robins, 1986) is a general nonparametric method that allows estimation of the counterfactual proportions under the assumption of no unmeasured confounders. This approach was taken using data from the Framingham Offspring Study on the risk factors body mass index, smoking, alcohol consumption, diabetes, cholesterol and systolic blood pressure.

A formula for predicting risk of CHD, given risk factor history, was estimated; the history of the other risk factors was also used to predict future values of each risk factor following changes in some of them. A simulated cohort was generated from the study by sampling with replacement and various scenarios were applied to the cohort to assess the impact on 12-year CHD risk, taking into account the joint effects of all the risk factors. A combination of complete cessation of smoking, setting all individuals' body mass index to no more than 22, and a simulated mean cholesterol level of 2.3 mmol/l and corresponding variance was estimated to halve the 12-year risk of CHD in both women and men. The estimated effect of all three interventions -- a 50% relative risk reduction in coronary disease -- was less than a crude sum of the separate effects (19%, 9% and 31%, respectively). This is because some people suffered CHD resulting from the joint actions of two or more of the risk factors, and this model estimates the size of these joint effects.

Sources: (35,36).
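
As a rough check on the figures quoted in Box 2.6, the separate reductions can be compared under two simple combination rules. The multiplicative rule shown here is an illustrative assumption and is not the simulation method used in the box.

```latex
\begin{align*}
\text{Crude sum:} \quad & 0.19 + 0.09 + 0.31 = 0.59 \\
\text{Multiplicative combination:} \quad
  & 1 - (1 - 0.19)(1 - 0.09)(1 - 0.31) \\
  & = 1 - 0.81 \times 0.91 \times 0.69 \\
  & \approx 1 - 0.509 = 0.491
\end{align*}
```

The multiplicative figure of about 49% is close to the 50% reduction estimated for the combined intervention, whereas the crude sum of 59% overstates it.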

Estimates of uncertainty

Confidence intervals for the attributable burden were estimated by a simulation procedure (37) incorporating sources of uncertainty from domains of the exposure distribution and the exposure--response relationships. Briefly, the method involved simultaneously varying all input parameters within their respective distributions and reiterating the calculation of the population attributable fraction. An uncertainty distribution around each estimate of population attributable fraction was obtained after 500 iterations of the simulation and, from this, 95% confidence intervals were derived. Each risk factor group provided data characterizing the uncertainty in the estimates of exposure distribution and exposure--response relationships. To the extent possible, the uncertainty estimates accounted for statistical uncertainty in available data as well as uncertainty in the methods used to extrapolate parameters across regions or countries.
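
The simulation procedure can be sketched for a single two-level risk factor as follows. The sketch assumes the prevalence is drawn from a normal distribution clipped to the range 0 to 1 and the relative risk from a log-normal distribution. These distributional choices, the function names and the input values are hypothetical. Only the 500 iterations and the 95% interval come from the text.

```python
import math
import random
import statistics


def attributable_fraction(prevalence, relative_risk):
    """Attributable fraction for one exposed group against an unexposed reference."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (excess + 1.0)


def simulate_interval(prev_mean, prev_sd, log_rr_mean, log_rr_sd,
                      iterations=500, seed=0):
    """Vary both inputs at once and return the 2.5th and 97.5th percentiles."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    draws = []
    for _ in range(iterations):
        # Assumed input distributions (see lead-in).
        prevalence = min(1.0, max(0.0, rng.gauss(prev_mean, prev_sd)))
        relative_risk = math.exp(rng.gauss(log_rr_mean, log_rr_sd))
        draws.append(attributable_fraction(prevalence, relative_risk))
    # n=40 yields cut points at every 2.5%, so the first and last cut points
    # bound the central 95% of the simulated values.
    cuts = statistics.quantiles(draws, n=40)
    return cuts[0], cuts[-1]


# Hypothetical inputs: 30% exposed, relative risk of 2.
point = attributable_fraction(0.30, 2.0)
lower, upper = simulate_interval(0.30, 0.03, math.log(2.0), 0.10)
print(f"point estimate: {point:.3f}")
print(f"95% interval:   {lower:.3f} to {upper:.3f}")
```

Extending the sketch to several exposure levels or to extrapolation uncertainty would mean drawing additional parameters inside the same loop, so that all inputs vary simultaneously in each iteration.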

Still further refinements would improve the current estimates and are not reflected in the reported uncertainty indicators. These include uncertainty in the burden of disease estimates; lack of data on prevalence among those with disease, such data ideally being required in population attributable fraction estimates that incorporate adjusted relative risks (38); and the likelihood that reduction of exposure to risks such as unsafe medical injections in 2000 would lead to less infection in subsequent years and also a smaller pool of infected people from whom transmission could be propagated. Finally, competing risks -- for example, someone saved from a stroke in 2001 is then "available" to die from other diseases in ensuing years -- have not been estimated, which is likely to lead to an overestimate of the absolute amount of attributable and avoidable disease burden, although it may not substantially affect the ranking of risk factors. However, competing risks are accounted for in the dynamic models that assessed the joint effects of risks on healthy life expectancy. This topic, along with appropriate discount rates, is considered in Chapter 5.