Mooney, S. J., Bader, M. D. M., Lovasi, G. S., Neckerman, K. M., Rundle, A. G., & Teitler, J. O. (2018). Using universal kriging to improve neighborhood physical disorder measurement. Sociological Methods & Research, 1–23. Advance online publication.
Ordinary kriging, a spatial interpolation technique, is commonly used in social sciences to estimate neighborhood attributes such as physical disorder. Universal kriging, developed and used in physical sciences, extends ordinary kriging by supplementing the spatial model with additional covariates. We measured physical disorder on 1,826 sampled block faces across four U.S. cities (New York, Philadelphia, Detroit, and San Jose) using Google Street View imagery. We then compared leave-one-out cross-validation accuracy between universal and ordinary kriging and used random subsamples of our observed data to explore whether universal kriging could provide equal measurement accuracy with less spatially dense samples. Universal kriging did not always improve accuracy. However, a measure of housing vacancy did improve estimation accuracy in Philadelphia and Detroit (7.9 percent and 6.8 percent lower root mean square error, respectively) and allowed for equivalent estimation accuracy with half the sampled points in Philadelphia. Universal kriging may improve neighborhood measurement.
Universal kriging that incorporated relevant covariates such as population density and housing vacancy modestly reduced estimation error relative to ordinary kriging when predicting neighborhood disorder in most, but not all, instances. Kriging is a geostatistical method for estimating values between sampled locations; because the universal kriging models drew on preexisting census data, they required no additional time spent auditing in the field. The improvements in estimation accuracy were modest and were not uniform across cities or census measures. For example, housing vacancy was the covariate that improved estimation accuracy the most, by 6.8% and 7.9% in Detroit and Philadelphia, respectively. In contrast, adding population density to the models produced the least improvement, no more than 1.4% in any city. The universal kriging model incorporating housing vacancy in Philadelphia also substantially reduced the sampling density needed to reach a given level of accuracy, while the model incorporating population density did not. In some cases, ordinary kriging models performed better than universal kriging models. The mixed results across cities and measures could be related to how theoretically relevant each covariate is in each of the four U.S. cities studied.
Description of method used in the article
The authors compared the accuracy of ordinary kriging and universal kriging in estimating values of a physical disorder measure. Universal kriging is a geostatistical method of estimating values between sampled locations that incorporates both spatial correlations between observations and measured covariates in the same prediction model. In contrast, other common methodologies incorporate only one or the other. The authors' universal kriging prediction models incorporated 2010 U.S. Census measures of population density, housing vacancy, and the proportion of owner-occupied housing units per census block group, with the aim of interpolating a measure of neighborhood disorder more accurately than ordinary kriging models. First, the models were used to predict data from a prior study that audited measures of physical disorder in New York City, Philadelphia, Detroit, and San Jose. For each observation from the prior study, the researchers used all other observations to predict its value (leave-one-out cross-validation) and then computed the root mean squared error across observations. This was done for ordinary kriging and for universal kriging models that included or excluded different census covariates. Second, 800 universal kriging models were fit to progressively smaller subsamples of the Philadelphia data to test how the accuracy of models using population density and housing vacancy changed at different sampling densities. All analyses used R for Windows.
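The leave-one-out comparison described above can be illustrated with a minimal sketch. This is not the authors' code (they worked in R); it is a small NumPy illustration that assumes an exponential covariance function with fixed, hand-picked parameters rather than a fitted variogram. The function names (`exp_cov`, `krige_one`, `loo_rmse`) and the parameters `sill`, `range_`, and `nugget` are hypothetical names introduced here for illustration.

```python
import numpy as np

def exp_cov(d, sill=1.0, range_=1.0):
    # Exponential covariance as a function of distance (assumed form).
    return sill * np.exp(-d / range_)

def krige_one(xy, z, F, xy0, f0, sill=1.0, range_=1.0, nugget=0.1):
    # Predict the value at location xy0 from observations (xy, z).
    # F is the n x p trend design matrix: a column of ones alone gives
    # ordinary kriging; ones plus covariates gives universal kriging.
    n, p = F.shape
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    C = exp_cov(d, sill, range_) + nugget * np.eye(n)
    c0 = exp_cov(np.linalg.norm(xy - xy0, axis=1), sill, range_)
    # Kriging system with unbiasedness constraints F^T w = f0.
    A = np.zeros((n + p, n + p))
    A[:n, :n] = C
    A[:n, n:] = F
    A[n:, :n] = F.T
    b = np.concatenate([c0, f0])
    w = np.linalg.solve(A, b)[:n]  # kriging weights
    return float(w @ z)

def loo_rmse(xy, z, covariates=None, **cov_params):
    # Leave-one-out cross-validation: predict each observation from all
    # the others, then return the root mean squared error.
    n = len(z)
    if covariates is None:
        F = np.ones((n, 1))                          # ordinary kriging
    else:
        F = np.column_stack([np.ones(n), covariates])  # universal kriging
    errors = []
    for i in range(n):
        keep = np.arange(n) != i
        pred = krige_one(xy[keep], z[keep], F[keep], xy[i], F[i], **cov_params)
        errors.append(pred - z[i])
    return float(np.sqrt(np.mean(np.square(errors))))
```

Calling `loo_rmse(xy, z)` and `loo_rmse(xy, z, covariates=vacancy)` on the same coordinates and disorder scores would give the two error figures to compare, where `vacancy` stands for a covariate such as housing vacancy at each sampled location. In practice the covariance parameters would be estimated from the data rather than fixed as they are here.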
Of some practical use if combined with other research