Bulletin of the World Health Organization

Methodological trends in studies based on verbal autopsies before and after published guidelines

Rohina Joshi a, Andre Pascal Kengne a & Bruce Neal a

a. The George Institute for International Health, University of Sydney, PO Box M201, Missenden Road, Sydney NSW 2050, Australia.

Correspondence to Rohina Joshi (e-mail: rjoshi@thegeorgeinstitute.org).

(Submitted: 14 November 2007 – Revised version received: 18 September 2008 – Accepted: 30 November 2008 – Published online: 30 June 2009.)

Bulletin of the World Health Organization 2009;87:678-682. doi: 10.2471/BLT.07.049288


Information about the causes of death in a population is essential for monitoring health and planning appropriate health services.1 While most higher-income countries have in place robust processes for documenting deaths and their causes, many developing countries lack such systems due to organizational and economic constraints and other factors.2

Verbal autopsy-based cause of death recording systems are widely used in countries where vital registration and death certification systems are weak and most people die at home without medical certification of the cause of death.3 A verbal autopsy is a method used to determine the cause of death from data collected about the symptoms and signs of illness and the events preceding death.4,5 This method is based on the hypothesis that the symptoms and signs surrounding most causes of death can be recognized, recollected and reported by a person present during the period prior to death.6,7 For the information provided by verbal autopsies to be maximally reliable, certain design characteristics need to be incorporated into studies based on verbal autopsy methods. To ensure that the data collected are of high quality, the data collection tool should include both structured and unstructured questions, interviewers should be specially trained, interviewees should have remained close to the deceased during illness, and only a short time should have elapsed between death and data collection. Similarly, for cause of death to be accurately assigned, algorithms for translating the data into causes of death must be clearly defined and it must be possible to assign multiple causes of death (i.e. immediate, underlying and contributory). Finally, subsequent validation studies should be carried out.

To optimize the use of these methods, in the early 1990s WHO convened a series of expert meetings that led to the publication of several reports4,5,7,8 in which key design features for studies based on verbal autopsy methods were recommended. Three of the four reports – one of them the outcome of a workshop sponsored jointly by WHO and the United Nations Children’s Fund – were published in international journals, and the fourth was published by WHO. However, the extent to which the guidelines were systematically disseminated to relevant parties and their effect on study design are unclear. Thus, the objective of this study was to determine whether the expert recommendations published from 1992 to 1994 influenced the way researchers conduct verbal autopsy studies.


Search strategy

From June 2005 to May 2006, we conducted computerized searches of PubMed and WHO9 databases using as key phrases “verbal autopsy”, “mortality surveillance”, “mortality statistics”, “post mortem interview” and “cause of death”. References quoted in original publications and in the websites of the International Network of Field Sites with Continuous Demographic Evaluation of Populations and their Health (INDEPTH Network)10 and the Adult Morbidity and Mortality Project11 were manually searched for additional information.

Study inclusion and exclusion criteria

To be included in this review, studies had to fulfil three criteria. They had to: (i) have been published before 2006 in a peer-reviewed journal or as a report accessible through a web search; (ii) describe the verbal autopsy method used; and (iii) be written in English or French. A study reporting data in more than one paper was included only once, and data from the paper which described the methods in the greatest detail were used.

Data extraction

Standard information was extracted from all eligible studies by one reviewer for English-language studies (RJ) and another for French-language studies (A-PK). The information included the following: country in which the data were collected; age group studied; number of deaths observed; approach used to classify deaths; questionnaire format; interviewers’ education and gender; type of respondent from whom information was obtained; recall period; granting of consent; type and nature of adjudication process; number of causes of death assignable per case; and presence or absence of a validation study. Unless otherwise indicated, all studies were assumed to have been designed three years prior to publication.
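The design-date rule at the end of this paragraph can be expressed as a small sketch. The record layout, the field names and the example values below are hypothetical and serve only as illustration; they are not the authors' extraction form or data. Only the three-year lag comes from the text.

```python
from dataclasses import dataclass
from typing import Optional

# From the text: a study is assumed to have been designed three years
# before publication unless the paper indicates otherwise.
ASSUMED_DESIGN_LAG_YEARS = 3


@dataclass
class StudyRecord:
    """One extracted study. Field names are illustrative assumptions."""
    country: str
    publication_year: int
    deaths_observed: Optional[int] = None     # None when not reported
    stated_design_year: Optional[int] = None  # None unless the paper states it

    def design_year(self) -> int:
        """Return the stated design year, else publication year minus the lag."""
        if self.stated_design_year is not None:
            return self.stated_design_year
        return self.publication_year - ASSUMED_DESIGN_LAG_YEARS


# Hypothetical records, not taken from the review's data set.
assumed = StudyRecord(country="Exampleland", publication_year=1996)
stated = StudyRecord(country="Exampleland", publication_year=1996,
                     stated_design_year=1995)

print(assumed.design_year())  # 1993
print(stated.design_year())   # 1995
```

Under this rule, the first hypothetical study would be treated as designed in 1993 although it was published in 1996, which determines the comparison group it falls into.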


For this study, outcomes were based on seven key recommendations extracted from the published guidelines (Table 1), as follows: (i) a combined questionnaire including both a series of structured questions and an open narrative section, (ii) a trained interviewer educated in verbal autopsy methods and interviewing skills, (iii) a suitable respondent who tended to the deceased during illness (usually a family member or caretaker), (iv) a recall period of less than 5 years between the dates of death and data collection, (v) predefined criteria or algorithms to guide the assignment of causes of death, (vi) ability to assign multiple causes of death, such as immediate, underlying, and contributory, and (vii) the performance of a validation study to ascertain the reliability of the causes of death assigned. Immediate, underlying and contributory causes are illustrated by the example of a death from secondary pneumonia (immediate cause) resulting from a stroke (underlying cause) in a patient with a history of hypertension (contributory cause).
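As a rough illustration of how the seven indicators might be tabulated, the sketch below stores each indicator as present, absent or not reported, and summarizes how often it was reported and how often it was used among the studies reporting it. The short indicator keys, the function name and the example records are assumptions made for illustration, not the authors' coding scheme or data.

```python
from typing import Dict, List, Optional

# The seven recommendations listed above; the short keys are illustrative labels.
INDICATORS = [
    "combined_questionnaire",
    "trained_interviewer",
    "suitable_respondent",
    "short_recall_period",
    "predefined_algorithm",
    "multiple_causes",
    "validation_study",
]

# True = feature present, False = absent, None or missing = not reported.
Study = Dict[str, Optional[bool]]


def summarize(studies: List[Study], indicator: str) -> Dict[str, float]:
    """Fraction of studies reporting the indicator, and fraction using it
    among those that reported it."""
    reported = [s[indicator] for s in studies if s.get(indicator) is not None]
    n_reported = len(reported)
    n_used = sum(1 for value in reported if value)
    return {
        "reported_fraction": n_reported / len(studies) if studies else 0.0,
        "used_fraction": n_used / n_reported if n_reported else 0.0,
    }


# Hypothetical studies, for illustration only.
studies: List[Study] = [
    {"validation_study": True},
    {"validation_study": None},
    {"validation_study": False},
    {},  # indicator not mentioned at all
]
result = summarize(studies, "validation_study")
print(result)
```

Keeping "not reported" distinct from "absent" matters here, because a study that is silent on an indicator cannot be counted as either following or ignoring the recommendation.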


For the primary analyses, studies were divided into those designed before and after 31 December 1994. The proportion of studies in each of these two groups that contained each recommended methodological feature was compared. Sensitivity analyses using alternative cut-off dates of 31 December 1992 and 31 December 1996 had no effect on the study conclusions for any measure. Studies for which the particular indicator measured was not reported were excluded from the relevant analysis. Proportions were compared with Pearson’s χ² test, or with Fisher’s exact test when expected counts were less than 5. All analyses were performed with SPSS version 12.0 (SPSS Inc., Chicago, IL, United States of America).
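The comparison described above can be sketched for a single indicator as a 2 × 2 table (designed before or after the cut-off, feature present or absent), after studies that did not report the indicator have been dropped. This is a minimal standard-library sketch and not the SPSS procedure the authors ran. The function names are hypothetical, the χ² statistic is computed without continuity correction, and the two-sided Fisher p-value uses the common convention of summing all tables no more probable than the observed one; both choices are assumptions.

```python
from math import comb, erfc, sqrt
from typing import List, Tuple


def expected_counts(a: int, b: int, c: int, d: int) -> List[float]:
    """Expected cell counts under independence for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return [r * k / n for r in rows for k in cols]


def pearson_chi2_p(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square p-value (1 df, no continuity correction).
    Assumes every expected count is positive."""
    observed = [a, b, c, d]
    expected = expected_counts(a, b, c, d)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df


def fisher_exact_p(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value from the hypergeometric distribution."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x: int) -> float:
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lowest, highest = max(0, row1 + col1 - n), min(row1, col1)
    p_observed = prob(a)
    total = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, total)


def compare_proportions(a: int, b: int, c: int, d: int) -> Tuple[str, float]:
    """Use Fisher's exact test when any expected count is below 5,
    otherwise Pearson's chi-square test."""
    if min(expected_counts(a, b, c, d)) < 5:
        return "fisher", fisher_exact_p(a, b, c, d)
    return "chi2", pearson_chi2_p(a, b, c, d)


# Hypothetical counts: rows = before/after cut-off, columns = present/absent.
print(compare_proportions(10, 10, 10, 10))
print(compare_proportions(3, 1, 1, 3))
```

The first hypothetical table has identical proportions in both groups and is routed to the χ² test; the second has expected counts of 2 in every cell and is routed to Fisher's exact test.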


The search identified 102 studies conducted in 39 developing countries (Appendix A, available at: http://www.thegeorgeinstitute.org/iih/index.cfm?7E6B9894-AED4-1907-C50B-BBFFB82559F8#methodological-trends-in-studies-using-verbal-autopsies). The first study12 was conducted in Bangladesh in 1968 and the last13 in the Democratic Republic of the Congo in 2003–2004. The 99 studies that reported the number of deaths comprised a total of 139 258 deaths. The number of deaths in each study ranged from 15 to 80 000, with a mean of 1407 and a median of 271. Sixty studies were designed before and 42 studies after the end of 1994. Half of all the studies were of deaths in infants and children, about one-fourth were of maternal deaths and about one-fourth were of deaths in adulthood or at any age.

Methodology indices

The methodology indices used were reported in 80% of the studies or more, except in the case of the recall period and the performance of a validation study, which were reported in only 59% and 27% of the studies, respectively (Table 2). Of the studies that provided methodology data, two-thirds or more followed each of the four main guideline recommendations regarding data collection. Almost all studies used a suitable respondent and had a recall period of less than 5 years. There was, however, no strong evidence that these or other recommended data collection indices were used more often after 1994 (P > 0.22).

Recommended methods for assigning the cause of death were applied much less frequently than recommended data collection methods. Predefined algorithms for assigning the cause of death or the ability to assign multiple causes of death were found in only about one-third of the verbal autopsy studies. Once again, there was no evidence of a post-1994 increase in the proportion of studies in which the recommended ways of assigning the cause of death had been used (P > 0.33). Validation studies had been conducted in about three-fourths of the studies that reported their presence or absence, but only about one-fourth of all studies provided data about this aspect of study design. No statistically significant difference was found between the proportions of studies designed before and after the end of 1994 for which a validation study was conducted (P = 0.69).


This review provides a comprehensive update on the verbal autopsy studies performed in the world up to the time of this review and of the methods applied in them. It highlights the variability of the methods used and the negligible influence exerted by a series of methodological guidelines and reports developed in the early 1990s.4,5,7

The efforts undertaken in the early 1990s to standardize the design and conduct of verbal autopsy studies lead one to assume that suboptimal methods were felt to have important consequences. Similarly, the content of the recommendations issued at the time suggests that certain methodological features were felt to ensure a yield of maximally reproducible, valid and comparable information from subsequent verbal autopsy studies. Unfortunately, these features, which were identified and specified more than a decade ago, have scarcely been implemented, perhaps because researchers have not known about them, have lacked access to the journals that published them or have been unable to afford to apply them in full. It is also possible that the absence of evidence in support of many of the recommendations lessened their value in the eyes of researchers designing new studies. The explanation is beyond the scope of this study but would be a good topic for future research, since some of the possibilities listed above can be addressed quite easily.

It is difficult to know for certain what effect, if any, not following the guidelines had on study findings. Nearly all guideline recommendations appear to have been based on consensus expert opinion rather than on scientific evidence, and it is impossible to know if they do in fact represent best practice. However, despite the lack of evidence, the rationale behind most of the recommendations is plain. For example, the ability to assign multiple causes of death rather than a single cause leads to better opportunities to explore existing problems, such as national biases in cause of death assignment, and allows a more informed evaluation of study results. Likewise, a data collection strategy that combines structured and open data collection techniques is likely to lead to better data collection than one restricted to either technique alone.

In the present study, the evaluation of verbal autopsy study techniques was restricted to a few key design features that were identified as important in most of the recommendations and that could be easily and objectively extracted from most of the study reports identified. Aspects of study design besides those reported here may have been of equal or greater importance to study integrity. Of the methodological indices studied, several showed positive trends. However, there is no evidence to indicate that such trends are not the result of chance. While the analyses were limited in their ability to detect progress towards the incorporation of some recommended design features that were already being used fairly widely (mainly data collection indices), the study was well placed to detect the potential effects of the guidelines on less widely used features pertaining to cause of death assignment and validation. In particular, little movement was noted towards the use of pre-specified algorithms to aid the allocation of cause of death or the assignment of multiple causes of death, two methodological features that would probably greatly enhance the quality of information derived from verbal autopsy projects. Furthermore, despite specific recommendations in the guidelines, very few studies have attempted to validate their findings, and those that have reported doing so have used very different methods. Validation studies that compare the causes of death derived from the verbal autopsy against gold standard reference diagnoses are particularly necessary.

This review was limited by an absence of data from several studies, which reduced statistical power and introduced the risk of bias. Given the long time elapsed since many of the reports were published, contacting their authors directly was considered impossible and further bias could have been introduced by the likelihood that it was easier to contact the authors of more recent reports. Although efforts were made to identify all verbal autopsy studies conducted, some may have been missed due to publication in a language other than English or French or in technical reports not widely available. However, it is unlikely that the missing data would have led to a different conclusion regarding the uneven and rather poor quality of many verbal autopsy studies.

In the context of changing patterns of death in many lower-income countries that rely on verbal autopsy systems for mortality surveillance,14,15 high quality, standardized techniques are becoming increasingly important.16 Better uptake of key design recommendations would undoubtedly increase the reproducibility of verbal autopsy studies and the validity and comparability of their results, as well as enhance decision-making processes surrounding health care in lower-income settings. However, how uptake might be further enhanced is not immediately clear, since the recommendations already appear in leading medical journals and/or as downloadable technical reports.17,18 The establishment of an international network of researchers involved in conducting verbal autopsy projects may facilitate the design of future studies, although it would clearly be impossible to ensure that all such studies were performed within the framework of such a group. More systematic interaction between international collaborations, such as the INDEPTH Network,10 the Adult Morbidity and Mortality Project,11 India’s Sample Registration System’s prospective study19 and the Sample Vital Registration with Verbal Autopsy14 initiative, might be one practical way of improving how verbal autopsy studies are conducted. A recent publication by WHO, representing one more effort to set international standards for verbal autopsy methods, is a positive step forward but unlikely to suffice in and of itself.20

Competing interests: None declared.