[Multiple imputation of missing at random data: General points and presentation of a Monte-Carlo method].
Rev Epidemiol Sante Publique. 2009 Oct; 57(5):361-72.

Abstract

BACKGROUND

Missing data are a frequent problem in the statistical analysis of epidemiological data sets. Methods are available for handling incomplete observations that avoid biased estimates and improve precision compared with more traditional approaches, such as analysing only the sub-sample of complete observations.

METHODS

One such approach is multiple imputation, which consists of successively imputing several values for each missing data item. This generates several completed data sets with the same distributional characteristics (variability and correlations) as the observed data. Standard analyses are run separately on each completed data set, and the results are then combined into a single overall result. In this paper, we discuss the various assumptions made about the origin of missing data (at random or not) and present the multiple imputation process in a pragmatic way. We describe a recent method, Multiple Imputation by Chained Equations (MICE), which is based on a Markov chain Monte Carlo algorithm and assumes that the data are missing at random (MAR). An illustrative example of the MICE method is detailed: the analysis, through multivariate logistic regression, of the relation between a dichotomous variable and two covariates presenting MAR data with no particular structure.
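The combination step mentioned above is commonly carried out with Rubin's rules. The abstract does not say which pooling formula the authors used, so the following is only a minimal sketch under that assumption. The function name `pool_rubin` and the numeric inputs are hypothetical and are not taken from the paper.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-dataset estimates and their squared standard errors."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates, one variance each")
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b      # total variance of the pooled estimate
    return q_bar, t


# Hypothetical regression coefficients from m = 5 completed data sets,
# each with its squared standard error (illustrative numbers only).
estimates = [0.50, 0.60, 0.40, 0.55, 0.45]
variances = [0.04, 0.05, 0.04, 0.05, 0.04]
pooled, total_var = pool_rubin(estimates, variances)
print(pooled, total_var ** 0.5)  # pooled coefficient and its standard error
```

The between-imputation term is what distinguishes this from simply averaging: it carries the uncertainty due to the missing values into the final standard error.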

RESULTS

Taking the original dataset without missing data as the reference, the regression coefficient estimates obtained with the MICE method are substantially better than those obtained from the analysis restricted to complete observations.

CONCLUSION

This method does not require any direct assumption about the joint distribution of the variables, and it is currently implemented in standard statistical software (Splus, Stata). It can be used for multiple imputation of missing data in several variables with no particular structure.
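The idea of working with one conditional model per variable, rather than a joint distribution, can be sketched for two continuous variables as follows. This is a simplified illustration and not the authors' implementation: it uses plain linear regressions, adds normal noise to each prediction, and does not draw the regression parameters from their posterior as a full MICE procedure would. The names `fit_line` and `chained_imputation` and the data are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares intercept, slope and residual SD for y ~ x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, (rss / max(n - 2, 1)) ** 0.5


def chained_imputation(a, b, n_iter=10, seed=0):
    """Return completed copies of a and b; None marks a missing value."""
    rng = random.Random(seed)
    cols = [list(a), list(b)]
    miss = [[v is None for v in col] for col in cols]
    # Start by filling each gap with the mean of the observed values.
    for col, flags in zip(cols, miss):
        observed = [v for v, gone in zip(col, flags) if not gone]
        fill = sum(observed) / len(observed)
        for i, gone in enumerate(flags):
            if gone:
                col[i] = fill
    # Cycle through the variables, re-imputing each from the other.
    for _ in range(n_iter):
        for target in (0, 1):
            y, x = cols[target], cols[1 - target]
            rows = [i for i in range(len(y)) if not miss[target][i]]
            b0, b1, sd = fit_line([x[i] for i in rows], [y[i] for i in rows])
            for i in range(len(y)):
                if miss[target][i]:
                    y[i] = b0 + b1 * x[i] + rng.gauss(0, sd)
    return cols[0], cols[1]


# Illustrative data only; None marks a missing value.
a = [1.0, 2.0, None, 4.0, 5.0, 6.0]
b = [2.1, None, 6.2, 7.9, 10.1, None]
a_done, b_done = chained_imputation(a, b)
print(a_done)
print(b_done)
```

Calling the function several times with different `seed` values would give the several completed data sets that multiple imputation requires. A dichotomous variable would need a logistic rather than a linear conditional model.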

Authors

UR010, santé de la mère et de l'enfant en milieu tropical, institut de recherche pour le développement, 08 BP 841, Cotonou, Bénin. Gilles.Cottrell@ird.fr

Pub Type(s)

English Abstract
Journal Article

Language

fre

PubMed ID

19674855

Citation

Cottrell, G, et al. "[Multiple Imputation of Missing at Random Data: General Points and Presentation of a Monte-Carlo Method]." Revue D'epidemiologie Et De Sante Publique, vol. 57, no. 5, 2009, pp. 361-72.
Cottrell G, Cot M, Mary JY. [Multiple imputation of missing at random data: General points and presentation of a Monte-Carlo method]. Rev Epidemiol Sante Publique. 2009;57(5):361-72.
Cottrell, G., Cot, M., & Mary, J. Y. (2009). [Multiple imputation of missing at random data: General points and presentation of a Monte-Carlo method]. Revue D'epidemiologie Et De Sante Publique, 57(5), 361-72. https://doi.org/10.1016/j.respe.2009.04.011
Cottrell G, Cot M, Mary JY. [Multiple Imputation of Missing at Random Data: General Points and Presentation of a Monte-Carlo Method]. Rev Epidemiol Sante Publique. 2009;57(5):361-72. PubMed PMID: 19674855.
TY - JOUR
T1 - [Multiple imputation of missing at random data: General points and presentation of a Monte-Carlo method].
AU - Cottrell,G
AU - Cot,M
AU - Mary,J-Y
Y1 - 2009/08/11/
PY - 2007/08/06/received
PY - 2009/03/06/revised
PY - 2009/04/15/accepted
PY - 2009/8/14/entrez
PY - 2009/8/14/pubmed
PY - 2009/12/24/medline
SP - 361
EP - 72
JF - Revue d'epidemiologie et de sante publique
JO - Rev Epidemiol Sante Publique
VL - 57
IS - 5
SN - 0398-7620
UR - https://www.unboundmedicine.com/medline/citation/19674855/[Multiple_imputation_of_missing_at_random_data:_General_points_and_presentation_of_a_Monte_Carlo_method]_
L2 - https://linkinghub.elsevier.com/retrieve/pii/S0398-7620(09)00345-9
DB - PRIME
DP - Unbound Medicine
ER -