Free ZIP Code Level Census Data – Also, Reasons Why Your ZIP Code Level Census Data Are Incorrectly Sold, Interpreted or Used

Join my site using the top right button “JOIN MY SITE” to receive a login and password for downloading the ZIP-level census data for free.

In its recent release of 5-year American Community Survey (ACS) data, the US government published nearly 125 variables for each ZIP code, with four types of element for each variable.

There are 32,989 ZIP codes in the data. This is the number of ZIP codes for which www.Census.gov publishes data. The reasons why different sources state different numbers of ZIP codes are given at http://www.zip-codes.com/zip-code-statistics.asp.

Both the metadata and the actual data are available.

If you believe the margins of error do not matter, you can simply use the specific column of data you are interested in. Otherwise, you need to decide how to adjust for the sampling variation in the data.

As you can see, the ZIP data has four elements for each variable.

– Estimate (an estimate because it is based on a sample survey, no doubt the largest well-designed survey ever done, with the possible exception of India and China)

– Margin of error in the estimate (tells you how much uncertainty is built in)

– Percent of the estimate

– Percent margin of error
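
As a rough illustration of how the estimate and its margin of error fit together, here is a minimal Python sketch. The class name `AcsCell`, its field names, and the sample numbers are hypothetical and not taken from the Census files. The 1.645 divisor assumes the published margin of error is the 90 percent one, which is what the Census Bureau notes quoted in this post say.

```python
from dataclasses import dataclass

# z-value for a 90 percent confidence level (assumes the published
# margin of error is the 90 percent one).
Z_90 = 1.645


@dataclass
class AcsCell:
    """One estimate/margin-of-error pair for one variable in one ZIP code.

    Hypothetical container for illustration only.
    """
    estimate: float
    moe: float

    def bounds(self):
        # Lower and upper confidence bounds: estimate minus/plus the margin of error.
        return (self.estimate - self.moe, self.estimate + self.moe)

    def standard_error(self):
        # Convert the 90 percent margin of error back to a standard error.
        return self.moe / Z_90

    def cv_percent(self):
        # Coefficient of variation: standard error as a percent of the estimate.
        # Undefined when the estimate is zero, so return None in that case.
        if self.estimate == 0:
            return None
        return 100.0 * self.standard_error() / self.estimate


# Made-up example: an estimate of 1,200 with a margin of error of 300.
cell = AcsCell(estimate=1200.0, moe=300.0)
print(cell.bounds())                # (900.0, 1500.0)
print(round(cell.cv_percent(), 1))  # 15.2
```

A large coefficient of variation is a hint that the estimate for that ZIP code is too noisy to use on its own; where to draw that line is a judgment call and is not something the data itself decides.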

Here are the notes from the Census Bureau, verbatim.

“Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau’s Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.

Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.

Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.

Source:  U.S. Census Bureau, 2007-2011 American Community Survey

Explanation of Symbols: An ‘**’ entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
A ‘-’ entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
A ‘-’ following a median estimate means the median falls in the lowest interval of an open-ended distribution.
A ‘+’ following a median estimate means the median falls in the upper interval of an open-ended distribution.
A ‘***’ entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
A ‘*****’ entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
A ‘N’ entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
A ‘(X)’ means that the estimate is not applicable or not available.

Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data).  The effect of nonsampling error is not represented in these tables.

While the 2007-2011 American Community Survey (ACS) data generally reflect the December 2009 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas; in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.

Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2000 data. Boundaries for urban areas have not been updated since Census 2000. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.”
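
The symbols in the notes above mean that the estimate and margin of error columns are not always plain numbers, so they need care when loaded. Here is a minimal Python sketch of one way to handle them. The function names and flag labels are my own, and it assumes the cells arrive as text exactly as the notes describe them (for example `2,500-` or `**`); the actual download may format them differently.

```python
# Symbols that replace a value entirely, per the Census Bureau notes.
ESTIMATE_SYMBOLS = {"-", "N", "(X)"}
MOE_SYMBOLS = {"**", "***", "*****", "N", "(X)"}


def parse_estimate(raw):
    """Return (value, flag) for one estimate cell.

    value is a float or None; flag is None for an ordinary number.
    """
    text = raw.strip().replace(",", "")
    if not text:
        return None, "missing"
    if text in ESTIMATE_SYMBOLS:
        return None, text
    if text.endswith("+"):
        # Median falls in the upper interval of an open-ended distribution.
        return float(text[:-1]), "median in upper interval"
    if text.endswith("-"):
        # Median falls in the lowest interval of an open-ended distribution.
        return float(text[:-1]), "median in lowest interval"
    return float(text), None


def parse_moe(raw):
    """Return (value, flag) for one margin-of-error cell."""
    text = raw.strip().replace(",", "")
    if not text:
        return None, "missing"
    if text in MOE_SYMBOLS:
        # No statistical test is appropriate for these entries.
        return None, text
    # Some tables print the margin as "+/-123"; strip that prefix if present.
    return float(text.lstrip("+/-±")), None


print(parse_estimate("250,000+"))  # (250000.0, 'median in upper interval')
print(parse_moe("**"))             # (None, '**')
```

Keeping the flag next to the value, instead of silently turning every symbol into a zero or a blank, is what lets you later exclude the cells for which a statistical test is not appropriate.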

To address the sampling variability, the Census Bureau seriously and sincerely provides these margins of error, and the level of detail it provides to make sure people interpret these numbers correctly is fascinating.
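
One practical use of the margins of error is checking whether two ZIP codes really differ on a variable or whether the gap is just sampling noise. The following Python sketch is a hedged illustration. It assumes both margins are 90 percent margins of error and that the two estimates are independent; the function names and the income figures are made up.

```python
import math

# z-value for a 90 percent confidence level.
Z_90 = 1.645


def standard_error(moe):
    # Convert a 90 percent margin of error to a standard error.
    return moe / Z_90


def compare_estimates(est_a, moe_a, est_b, moe_b):
    """Return (is_significant, z) for the difference between two estimates.

    Treats the two estimates as independent, so the standard error of the
    difference is the square root of the sum of the squared standard errors.
    """
    se_diff = math.sqrt(standard_error(moe_a) ** 2 + standard_error(moe_b) ** 2)
    if se_diff == 0:
        raise ValueError("both margins of error are zero; no test is possible")
    z = (est_a - est_b) / se_diff
    return abs(z) > Z_90, z


# Hypothetical median incomes for two ZIP codes.
significant, z = compare_estimates(52000, 6000, 58000, 7000)
print(significant)  # False: a 6,000 gap is within the sampling noise here

significant, z = compare_estimates(52000, 3000, 60000, 3000)
print(significant)  # True
```

In the first example the two ZIP codes look 6,000 apart, yet the data cannot tell them apart at the 90 percent level. That is exactly the kind of conclusion that gets lost when only the estimate column is used.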

However, vendors neither provide this data nor help interpret the data correctly.

Which is better: to give you data that is not correct but lets you do a quick and dirty analysis (very important for keeping pace), or to give you the right data and take the time to do the job right?

From Data Monster & Insight Monster
