American Sociological Association

Estimating Income Statistics from Grouped Data: Mean-constrained Integration over Brackets

Researchers studying income inequality, economic segregation, and other subjects must often rely on grouped data, that is, data in which thousands or millions of observations have been reduced to counts of units by specified income brackets. The distribution of households within the brackets is unknown, and the highest incomes are often included in an open-ended top bracket, such as “$200,000 and above.” Common approaches to this estimation problem include calculating midpoint estimators with an assumed Pareto distribution in the top bracket and fitting a flexible multiple-parameter distribution to the data. The authors describe a new method, mean-constrained integration over brackets (MCIB), that uses only the bracket counts and the overall mean of the data yet is far more accurate than those methods. On the basis of an analysis of 297 metropolitan areas, MCIB produces estimates of the standard deviation, Gini coefficient, and Theil index that are correlated at 0.997, 0.998, and 0.991, respectively, with the parameters calculated from the underlying individual record data. Similar levels of accuracy are obtained for percentiles of the distribution and for the shares of income by quintiles of the distribution. The technique can easily be extended to other distributional parameters and inequality statistics.
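For orientation, the conventional midpoint approach named in the abstract can be sketched as follows. This is not MCIB, whose details are not given here; it is a minimal illustration of the baseline, assuming the usual two-bracket estimate of the Pareto shape parameter for the open-ended top bracket. The function names `pareto_midpoint_values` and `grouped_gini` and the example bracket counts are hypothetical.

```python
import math


def pareto_midpoint_values(lower_bounds, counts):
    """Return one representative income per bracket (hypothetical helper).

    Closed brackets get their midpoint. The open-ended top bracket gets the
    mean of a Pareto distribution whose shape parameter is estimated from the
    counts in the top two brackets (an assumed, commonly used estimator).
    """
    if len(lower_bounds) != len(counts) or len(counts) < 2:
        raise ValueError("need at least two brackets with matching counts")
    top_lo, prev_lo = lower_bounds[-1], lower_bounds[-2]
    n_top, n_prev = counts[-1], counts[-2]
    if prev_lo <= 0 or n_top <= 0:
        raise ValueError("top two brackets need positive bounds and a nonempty top")

    # Midpoints of the closed brackets [lo, hi).
    values = [(lo + hi) / 2 for lo, hi in zip(lower_bounds[:-1], lower_bounds[1:])]

    # Pareto shape from the share of units at or above each of the top two bounds.
    alpha = (math.log(n_top + n_prev) - math.log(n_top)) / (
        math.log(top_lo) - math.log(prev_lo)
    )
    if alpha <= 1:
        raise ValueError("Pareto mean is undefined for alpha <= 1")
    values.append(top_lo * alpha / (alpha - 1))  # mean of the Pareto tail
    return values


def grouped_gini(values, counts):
    """Gini coefficient treating every unit in a bracket as having the same income."""
    n = sum(counts)
    mean = sum(v * c for v, c in zip(values, counts)) / n
    # Mean absolute difference over all ordered pairs of units.
    mad = sum(
        ci * cj * abs(vi - vj)
        for vi, ci in zip(values, counts)
        for vj, cj in zip(values, counts)
    ) / n**2
    return mad / (2 * mean)


# Illustrative (made-up) brackets: 0-50k, 50k-100k, 100k-200k, 200k and above.
bounds = [0, 50_000, 100_000, 200_000]
units = [400, 300, 200, 100]
bracket_values = pareto_midpoint_values(bounds, units)
gini_estimate = grouped_gini(bracket_values, units)
```

Because every unit in a bracket is assigned the same income, this sketch ignores dispersion within brackets, which is one reason such baselines can miss the statistics computed from individual records.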


Paul A. Jargowsky and Christopher A. Wheeler




