Consider the problem of characterizing and identifying people, based on 20 attributes, as “someone who exhibits the characteristics of being recession-affected” vs. “someone who is not recession-affected”.
The data come from market research on, say, 5000 people who answered, among other questions, questions like
– what is your view on the economy, will it recover before some time point, will it recover as much as before year XXXX,…
– are you planning to pay off your debt
– are you planning to postpone major purchases
– are you planning to reduce vacations.
So if we want to characterize people based on the number of times they say “YES” vs. “NO”, or some such binary classification, then our dependent variable is a count variable. Typically there are four different ways to model such a variable:
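– Use Poisson regression (it assumes the mean equals the variance)
– Use negative binomial regression (it relaxes that assumption and is good for handling overdispersion, i.e. variance greater than the mean)
– Use a hurdle model (one part models zero vs. non-zero, another part models the positive counts)
– Use zero-inflated regression (good for handling excessive occurrence of zero values of the random variable)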
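Before picking one of these four, it can help to look at the counts themselves. The following is only a rough sketch in Python, not the method from the paper cited below: the `responses` data and the helper names `yes_counts` and `count_diagnostics` are made up for illustration. It builds the “number of YES answers” per respondent, then compares the variance to the mean (a ratio well above 1 hints at overdispersion) and the observed share of zeros to the share a Poisson with the same mean would predict, exp(-mean).

```python
import math
from statistics import mean, variance

# Hypothetical survey answers: one row per respondent,
# one True (YES) / False (NO) per question. Not real data.
responses = [
    [False, False, False, False],
    [True, False, False, False],
    [False, False, False, False],
    [True, True, True, True],
    [False, False, False, False],
    [True, True, False, False],
    [False, False, False, False],
    [True, True, True, False],
]


def yes_counts(rows):
    """Count the YES answers for each respondent (the dependent variable)."""
    return [sum(row) for row in rows]


def count_diagnostics(counts):
    """Summaries that hint at which count model may be worth trying."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 in the denominator)
    return {
        "mean": m,
        "variance": v,
        # Poisson assumes this ratio is about 1.
        "dispersion_ratio": v / m,
        # Share of respondents with zero YES answers.
        "observed_zero_share": counts.count(0) / len(counts),
        # Share of zeros a Poisson with the same mean would predict.
        "poisson_zero_share": math.exp(-m),
    }


counts = yes_counts(responses)
diagnostics = count_diagnostics(counts)
for name, value in diagnostics.items():
    print(f"{name}: {value:.3f}")
```

With this toy data the variance is twice the mean and half of the respondents have a zero count, against roughly 29% expected under Poisson, so a plain Poisson fit would be a doubtful choice. On real data these summaries are only a first screen; the choice among the four models would still need to be checked by fitting and comparing them.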
See “Count Data Models in SAS”, by WenSui Liu and Jimmy Cela, SAS SUGI2008.
From Data Monster & Insight Monster