Statistically invalid results may be returned when enriching small sites and/or polygons that are comparatively smaller than a block group polygon
In ArcGIS Business Analyst Desktop, statistically invalid results may be returned when enriching small sites and/or polygons that are comparatively smaller than a block group polygon.
Before discussing how Append/Spatial Overlay works in Business Analyst Desktop, it is important to understand some of the terminology used to define census geographies in the United States.
United States census geography
The smallest unit of census geography in the United States is the census block. Each census block is contained in a block group, which in turn lies within a census tract. Census tracts are contained within counties, and counties are contained within states.
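The nesting described above is also reflected in the standard 15-digit census block identifier (GEOID), where each parent geography is a prefix of the block's code. The following is a minimal sketch of that relationship; the function name and the sample identifier are illustrative only and are not part of Business Analyst.

```python
def split_block_geoid(geoid: str) -> dict:
    """Split a 15-digit census block GEOID into its parent geography codes.

    Layout: state (2) + county (3) + tract (6) + block (4).
    The block group is identified by the first digit of the block code.
    """
    if len(geoid) != 15 or not geoid.isdigit():
        raise ValueError("expected a 15-digit census block GEOID")
    return {
        "state": geoid[:2],
        "county": geoid[:5],
        "tract": geoid[:11],
        "block_group": geoid[:12],
        "block": geoid,
    }


# Illustrative identifier only, used to show the prefix structure.
parts = split_block_geoid("010010201001000")
print(parts["block_group"])
```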
According to the Census Bureau website, census tracts can contain 1,200 to 8,000 people, with an optimum size of 4,000 people. Tract boundaries are delineated with the goal that they be maintained over a long time, which allows statistical comparison from census to census.
According to the Census Bureau website, census block groups contain 600 to 3,000 people. Each census tract contains at least one block group, and the block groups are uniquely numbered.
A census block is an area bounded by visible features, such as streets, roads, streams, and railroad tracks. This means that within a city, a block can be very small, such as an area bounded by surrounding streets, but in suburban and rural regions, a block can be very large.
Esri Business Analyst Data
Esri provides data at the block group level because block group geographies capture demographics efficiently and accurately. Data is not available at the block level; what does exist is usually modeled and may be less accurate than data at the block group level.
Append/Spatial Overlay analysis or Enrich a layer:
Block apportionment is the most accurate apportionment method, especially when working with smaller areas. The other apportionment methods, such as the cascading or hybrid methods, are less accurate but faster.
This article focuses only on the block apportionment method and why it may sometimes return unexpected results. The block apportionment method uses a weighted centroid geographic method to aggregate data for the study area being enriched. It uses block point data to apportion sites that may not fully envelop a block group.
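The idea behind weighted centroid apportionment can be sketched as follows. This is not Esri's implementation: the rectangular site, the coordinates, the `BlockPoint` class, and all weights except 0.03144 are hypothetical and serve only to show how block point weights inside a site are summed and applied to a block group value.

```python
from dataclasses import dataclass


@dataclass
class BlockPoint:
    x: float
    y: float
    pop_weight: float  # share of the parent block group's population


def site_weight(points, xmin, ymin, xmax, ymax):
    """Sum the weights of the block points that fall inside a rectangular site."""
    return sum(
        p.pop_weight
        for p in points
        if xmin <= p.x <= xmax and ymin <= p.y <= ymax
    )


# Hypothetical block points of one block group; their weights sum to one.
points = [
    BlockPoint(1.0, 1.0, 0.03144),
    BlockPoint(2.0, 1.0, 0.25),
    BlockPoint(1.0, 2.0, 0.40),
    BlockPoint(2.0, 2.0, 0.31856),
]

bg_value = 14  # a block group level variable, for example a population count

small_site = site_weight(points, 0.5, 0.5, 1.5, 1.5)   # envelops one block point
whole_group = site_weight(points, 0.0, 0.0, 3.0, 3.0)  # envelops all block points

print(bg_value * small_site)   # small fraction of the block group value
print(bg_value * whole_group)  # approximately the full block group value
```

A site that envelops only one block point receives only that point's share of the block group value, which is why very small sites can return values well below one.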
For more information about how the weighting works, see the ArcGIS REST API document: Data apportionment.
Business Analyst data comes with a feature class called USA_ESRI_20XX_blocks that resides in the block_data geodatabase. This feature class is used in the apportioning of the data. It contains demographic attributes that are used for weighting all variables, including population, households, housing units, and business variables.
The following simple example illustrates why one may see unexpected results after running the Spatial overlay/Append tool.
The following image shows a block group (BG), a polygon feature class that shows the census blocks within this block group, and the census block points (centroids of census blocks).
In this example, we are enriching the census blocks that reside in the block group polygon. As mentioned previously, the census block points from Esri have weights for population, households, businesses, and other attributes. When Append/Spatial Overlay or Enrich is carried out, the individual weights for the respective blocks are multiplied by the corresponding variables from the Esri block group layer.
For example, if one is enriching incremental population variables such as 2019 Population Age <1 + 2019 Population Age 1 + ……+ 2019 Population Age 84 + 2019 Population Age 85+, then the population weight from the block point layer is multiplied by the respective population variable from the Esri block group layer.
|Variable||Esri block group population (A)||Block population weight (B)||Total (A x B)|
|2019 Population Age < 1|| || || |
|2019 Population Age 1|| || || |
|2019 Population Age 2||14||0.03144||0.4402|
A calculated value is statistically valid only if it rounds to at least one. From the above table, only the 2019 Population Age 1 variable rounds to one; the others round to zero. Similarly, if all the incremental variables are summed and compared to the total population for the block group, the sum can be more or less than the total population.
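The rounding effect can be reproduced with a short calculation. In the sketch below, only the Age 2 count (14) and the block weight (0.03144) come from the table; the other counts and the band names are hypothetical and are not the values missing from the table.

```python
BLOCK_WEIGHT = 0.03144  # population weight of the single enclosed block point

# Block group counts per age band. Only "Age 2" is taken from the table;
# bands A to D are hypothetical values chosen for illustration.
bg_counts = {"Age 2": 14, "band A": 40, "band B": 14, "band C": 20, "band D": 10}

# Apportion each variable, then round each result independently.
apportioned = {name: count * BLOCK_WEIGHT for name, count in bg_counts.items()}
rounded = {name: round(value) for name, value in apportioned.items()}

# Compare the sum of the rounded parts with the rounded apportioned total.
sum_of_rounded = sum(rounded.values())
rounded_total = round(sum(bg_counts.values()) * BLOCK_WEIGHT)

print(rounded)
print(sum_of_rounded, rounded_total)
```

With these inputs, three of the five variables round to zero, and the sum of the rounded parts differs from the rounded total, which is the kind of mismatch described above.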
Note: Depending on the density of the population in a particular census block, the weight can be small or large. The sum of all census block point weights in a block group always equals one.
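One way to picture such weights is as each block's share of the block group population. The sketch below assumes that interpretation, which this article does not state explicitly, and uses hypothetical block populations.

```python
# Hypothetical block populations within a single block group.
block_pops = {"block 1": 12, "block 2": 95, "block 3": 153, "block 4": 122}

bg_total = sum(block_pops.values())

# Assumed definition: weight = block population / block group population.
weights = {block: pop / bg_total for block, pop in block_pops.items()}

for block, weight in weights.items():
    print(block, round(weight, 5))

print(sum(weights.values()))  # approximately 1.0
```

A sparsely populated block receives a small weight and a densely populated block a large one, while the weights across the block group still add up to one.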
Solution or Workaround
This discrepancy is expected. When the site or polygon being enriched is small and envelops just one census block point, the apportioned values are small fractions of the block group values, and the results returned may be statistically invalid. If the same analysis is carried out on polygons that overlap more census block points and block group boundaries, the results will be more accurate.
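The effect of site size can be illustrated by counting how many apportioned variables round to zero for a small combined weight versus a larger one. The counts and the larger weight (0.46) in this sketch are hypothetical; only 0.03144 is taken from the example above.

```python
def count_zeroed(bg_counts, site_weight):
    """Count the variables whose apportioned value rounds to zero."""
    return sum(1 for count in bg_counts if round(count * site_weight) == 0)


# Hypothetical block group counts for five incremental variables.
bg_counts = [14, 40, 14, 20, 10]

one_point_site = count_zeroed(bg_counts, 0.03144)  # site envelops one block point
larger_site = count_zeroed(bg_counts, 0.46)        # hypothetical sum of several weights

print(one_point_site, larger_site)
```

As the site envelops more block points, the combined weight grows and fewer variables collapse to zero after rounding.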
Last Published: 10/22/2020
Article ID: 000023704
Software: ArcGIS Business Analyst Desktop 10.8.1, 10.8, 10.7.1, 10.7, 10.6.1, 10.6, 10.5.1, 10.5, 10.4.1, 10.4, 10.3.1, 10.3, 10.2.2, 10.2.1, 10.2, 10.1, 10.0