Missing Value Strategies


The best solution is to prevent missing values in the first place: build capture mechanisms into the system that encourage, coax, or incentivize the application's end users to provide the data.
One slide or table in your study should explicitly report the percentage of missing values for every variable used in the study.
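As a minimal sketch of such a tabulation, the Python snippet below counts the share of missing entries per variable. The records, the variable names (age, income, score), and the convention of marking a missing value with None are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def percent_missing(rows: List[Dict[str, Optional[float]]]) -> Dict[str, float]:
    """Return the percentage of missing (None) values for each variable."""
    if not rows:
        return {}
    # Collect every variable name that appears in any record.
    variables = sorted({name for row in rows for name in row})
    n = len(rows)
    result = {}
    for name in variables:
        # A variable absent from a record counts as missing too.
        missing = sum(1 for row in rows if row.get(name) is None)
        result[name] = 100.0 * missing / n
    return result


# Hypothetical study data, invented for this example.
records = [
    {"age": 34, "income": 52000, "score": 7.1},
    {"age": None, "income": 48000, "score": 6.4},
    {"age": 51, "income": None, "score": None},
    {"age": 29, "income": None, "score": 8.0},
]

for name, pct in percent_missing(records).items():
    print(f"{name:<8}{pct:6.1f}% missing")
```

The same table can be produced in any statistical package; the point is only that every study variable gets its own row.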
Explain why missing values for each of the study variables occur.
Missing values are sometimes confused with values that naturally do not exist because of how a variable is constructed or defined. The explanation of the percentage missing in the study variables should distinguish the two cases. For example, in surveys that use hierarchical, rule-based questions, some answers will be absent simply because of the structure of the questions.
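One way to keep the two cases apart is to check the gating question before counting a blank answer as missing. The sketch below assumes a hypothetical survey in which car_age is asked only when owns_car is "yes"; the question names, the data, and the helper classify_missing are illustrative and not taken from any particular survey tool.

```python
from typing import Dict, List, Optional


def classify_missing(
    rows: List[Dict[str, Optional[object]]],
    question: str,
    gate: str,
    gate_value: object,
) -> Dict[str, int]:
    """Split answers to `question` into observed, structural, and missing.

    structural: the gate question was answered with something other than
                `gate_value`, so `question` was never asked.
    missing:    the respondent was eligible (or eligibility is unknown)
                and no answer was recorded.
    """
    counts = {"observed": 0, "structural": 0, "missing": 0}
    for row in rows:
        gate_answer = row.get(gate)
        if row.get(question) is not None:
            counts["observed"] += 1
        elif gate_answer is not None and gate_answer != gate_value:
            counts["structural"] += 1
        else:
            counts["missing"] += 1
    return counts


# Hypothetical responses, invented for this example.
responses = [
    {"owns_car": "yes", "car_age": 4},
    {"owns_car": "no", "car_age": None},   # skipped by design
    {"owns_car": "yes", "car_age": None},  # eligible, but no answer
    {"owns_car": "no", "car_age": None},   # skipped by design
    {"owns_car": None, "car_age": None},   # eligibility unknown
]

counts = classify_missing(responses, "car_age", "owns_car", "yes")
eligible = counts["observed"] + counts["missing"]
print(counts)
print(f"missing among eligible: {100.0 * counts['missing'] / eligible:.1f}%")
```

Reporting the percentage missing among eligible respondents, rather than among all respondents, keeps the skip pattern from inflating the figure.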
For best results in applied settings, use multiple imputation methods.
Both the “Delete Strategy” and the “Mean Strategy” can be heavily biased, the former more so than the latter, depending on how much data is missing.
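A small worked example can make the two kinds of distortion concrete. The sketch below uses invented numbers in which y is missing for the largest values of x, and it takes the delete strategy to mean listwise deletion, which is an assumption about the term. Deleting incomplete rows shifts the mean of the fully observed x; filling in the mean of y keeps every row but shrinks the variance of y.

```python
from statistics import mean, pvariance

# Hypothetical data: x is fully observed, y is missing for the largest x values.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1, None, None, None, None]

# Delete strategy (listwise deletion): drop every row with a missing value.
complete = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
x_deleted = [xi for xi, _ in complete]
y_deleted = [yi for _, yi in complete]

# Mean strategy: replace each missing y with the mean of the observed y values.
y_mean = mean(y_deleted)
y_imputed = [yi if yi is not None else y_mean for yi in y]

print("mean of x, all rows:         ", mean(x))
print("mean of x, after deletion:   ", mean(x_deleted))
print("variance of y, observed only:", round(pvariance(y_deleted), 2))
print("variance of y, mean-imputed: ", round(pvariance(y_imputed), 2))
```

How large either distortion is depends on how much data is missing and on why it is missing, so the numbers here illustrate the mechanism rather than a general ranking.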
References:
http://www.ke.tu-darmstadt.de/publications/reports/tud-ke-2009-03.pdf This report leans toward a machine-learning perspective. In the authors’ experiments with different strategies, a large number of methods are about equally effective when the incidence of missing values is small; when much of the data is missing, their effectiveness diverges. Not surprisingly, the “delete strategy” is the worst of all. Combining multiple strategies is also likely to yield better results.
http://people.oregonstate.edu/~acock/growth-curves/working%20with%20missing%20values.pdf This document provides an excellent summary of statistical software approaches. The authors list the software packages and provide example code showing how to use some of them. The following is taken from the authors’ publication.
http://maartenbuis.nl/presentations/missing_cifor.pdf is an interesting presentation that systematically walks through the problem and a desired approach to solving missing values.
http://www.ats.ucla.edu/stat/sas/library/multipleimputation.pdf provides notes on how to use PROC MI in SAS for missing value imputation.
From Data Monster & Insight Monster

Also, see the following video for recent advances in missing value imputations.

http://www.youtube.com/watch?v=xnQ17bbSeEk
