Census geography: Bridging data from prior years to the 2010 tract boundaries

The continual change in geography between successive censuses is a major barrier to longitudinal analysis. Census tracts are the fundamental enumeration units of the U.S. decennial census, and their boundaries often change from one census to the next. In every new census many tracts are split, consolidated, or otherwise redrawn from the previous boundaries to reflect population growth or decline.

US2010 presents here a Longitudinal Tract Data Base (LTDB), which provides public-use tools to create estimates within 2010 tract boundaries for any tract-level data (from the census or other sources) that are available for prior years as early as 1970. We also provide a Backwards LTDB in which data provided in 2010 tract boundaries can be estimated within 2000 boundaries.

The LTDB was developed by a team including John Logan (Brown University), Zengwang Xu (University of Wisconsin, Milwaukee) and Brian Stults (Florida State University). The tract data for 1970-2010 that are included in the LTDB standard data set were prepared by Miao Chunyu (Brown University) from files downloaded from the NHGIS.

Key points:

  • The combination of population and area weighting that we employ between 2000 and 2010 has a high degree of accuracy. This is in many respects the same approach taken by the commercially available Neighborhood Change Data Base or NCDB (Tatian 2003) for 1990-2000. There are two sources of error in these estimates.
    • One problem is that blocks in 2000 are sometimes split into different portions that are assigned to different 2010 tracts. We allocate the block population to 2010 tracts in proportion to the area of these block fragments. Since blocks are usually small and have few residents, this aspect of the estimation is unlikely to cause much error.
    • A second source of error is probably more important, and it is found in both the LTDB and NCDB. When part of a tract in 2000 is reallocated into a new tract in 2010, we assume that the kinds of people living in that fragment are the same as in the fragment that is reallocated into another new tract. Census tracts are somewhat homogeneous, and on average the people living anywhere in a tract tend to be more like one another than like people in other tracts. But there remains the possibility that in some cases the composition of different tract fragments is quite dissimilar. A reliable check on this problem would require access to the original census data.
  • For researchers wishing to harmonize data for pre-2000 census tracts, one option is to acquire the NCDB files for 1970-1990 adjusted to 2000 boundaries and then apply the LTDB to bridge these data to 2010. The LTDB also provides a method for bridging data from 1970-1990 tract boundaries to 2010. However, the LTDB relies solely on interpolation based on land area for 1990, while NCDB interpolates based on block populations and (using the street grid as a proxy for population density) on the distribution of population within blocks. Therefore NCDB estimates for 1990 should be more reliable. (Both sources use area interpolation for 1970 and 1980.)
  • There are situations in which using the NCDB in this fashion is a less satisfactory solution, two of which deserve emphasis here. Most important, NCDB does not provide linked files for all census variables, but only for a selection of variables from the sample count files. In addition, researchers are increasingly working with information aggregated to the tract level from non-census sources, such as criminal justice, public health, and voting records. Meeting their needs requires a tool to convert such data to the 2010 boundaries. The LTDB offers an open-source crosswalk to link data from 1970-2000 to 2010, and it also provides user-friendly programming code to bridge data across years.
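
The area-weighted allocation of split blocks and the crosswalk step described in the key points can be sketched roughly as follows. This is a minimal Python illustration, not the LTDB's own code: the function names, the tuple layouts for block fragments and crosswalk rows, and the sample numbers are all hypothetical, and the actual LTDB crosswalk files may be organized differently.

```python
from collections import defaultdict


def allocate_blocks(block_pop, fragments):
    """Split each block's population among target tracts by fragment area.

    block_pop: {block_id: population} (hypothetical layout)
    fragments: list of (block_id, target_tract_id, fragment_area)
    Returns {target_tract_id: estimated population}.
    """
    # Total area of each block, summed over all of its fragments.
    block_area = defaultdict(float)
    for block, _tract, area in fragments:
        block_area[block] += area

    # Each fragment receives the block population in proportion to its area.
    tract_pop = defaultdict(float)
    for block, tract, area in fragments:
        if block_area[block] > 0:
            share = area / block_area[block]
            tract_pop[tract] += block_pop.get(block, 0.0) * share
    return dict(tract_pop)


def apply_crosswalk(source_counts, crosswalk):
    """Bridge tract-level counts to target tracts using crosswalk weights.

    source_counts: {source_tract_id: count}
    crosswalk: list of (source_tract_id, target_tract_id, weight), where
               weight is the assumed share of the source tract that falls
               in the target tract.
    """
    target = defaultdict(float)
    for src, dst, weight in crosswalk:
        if src in source_counts:
            target[dst] += source_counts[src] * weight
    return dict(target)


# Made-up example: block b1 is split 3:1 by area between tracts T1 and T2.
block_pop = {"b1": 100, "b2": 40}
fragments = [("b1", "T1", 3.0), ("b1", "T2", 1.0), ("b2", "T2", 2.0)]
tract_estimates = allocate_blocks(block_pop, fragments)

# Made-up example: source tract A is split 75/25 between target tracts.
crosswalk = [("A", "T1", 0.75), ("A", "T2", 0.25), ("B", "T2", 1.0)]
bridged = apply_crosswalk({"A": 200, "B": 50}, crosswalk)
```

In this sketch the weights for each source unit sum to 1, so totals are preserved across the bridge. A weighted sum of this kind is suited to counts; a rate or percentage would typically be recomputed from the bridged numerator and denominator.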

Note that much of the U.S. was not divided into census tracts in 1970, and even in 1980 many less populated areas were not tracted. Consequently many 2010 tracts will have missing values for 1970 and 1980.

Figure 1. Evolution of tracted areas in the U.S., 1970-1990. Blue areas were tracted in 1970; areas in red were added in 1980. In 1990 the entire nation was divided into census tracts.

References:

Logan, John R., Zengwang Xu, and Brian Stults. 2012. "Interpolating US Decennial Census Tract Data from as Early as 1970 to 2010: A Longitudinal Tract Database." Professional Geographer, forthcoming.

Tatian, P. A. 2003. Neighborhood Change Database (NCDB) 1970-2000 Tract Data: Data Users Guide. Washington, DC: Urban Institute.

 

© Spatial Structures in the Social Sciences, Brown University