Census geography: Bridging data for census tracts across time

The Longitudinal Tract Data Base (LTDB) provides public-use tools to create estimates within 2010 tract boundaries for any tract-level data (from the census or other sources) that are available for prior years as early as 1970, and also for 2015-2019 and 2020. We also provide a Backwards LTDB, in which data reported in 2010 tract boundaries can be estimated within 2000 boundaries. Researchers increasingly work with information aggregated to the tract level from non-census sources, such as criminal justice, public health, and voting records. Meeting their needs requires a tool that converts such data to the 2010 boundaries. The LTDB offers an open-source crosswalk to link data from 1970-2000 to 2010, and it also provides user-friendly programming code to bridge data across years.
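As a rough illustration of what bridging with a crosswalk involves, the sketch below reallocates tract-level counts from an earlier year to 2010 tracts using allocation weights. It is not the LTDB's own code: the function name, the (source tract, target tract, weight) row layout, and the tract IDs and counts are all hypothetical, and the actual crosswalk files may be organized differently.

```python
from collections import defaultdict


def apply_crosswalk(source_counts, crosswalk):
    """Reallocate counts from source-year tracts to 2010 tracts.

    source_counts: dict mapping a source tract ID to a count.
    crosswalk: iterable of (source_tract, target_tract, weight) rows,
        where weight is the share of the source tract assigned to the
        target tract (hypothetical layout, assumed for this sketch).
    """
    estimates = defaultdict(float)
    for source_tract, target_tract, weight in crosswalk:
        # A source tract with no data contributes nothing.
        estimates[target_tract] += weight * source_counts.get(source_tract, 0.0)
    return dict(estimates)


# Hypothetical example: tract A is split 60/40 between 2010 tracts X and Y,
# and tract B falls entirely inside Y.
crosswalk = [("A", "X", 0.6), ("A", "Y", 0.4), ("B", "Y", 1.0)]
incidents_2000 = {"A": 50, "B": 20}

estimates_2010 = apply_crosswalk(incidents_2000, crosswalk)
```

Because the weights for each source tract sum to 1 in this example, the estimates in 2010 boundaries add up to the same total as the original counts.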

Note that much of the U.S. was not divided into census tracts in 1970, and some areas did not have tracts even in 1980. Some tracted areas had no reported population. The LTDB files do not include these tracts because estimates in 2010 boundaries cannot be made.

References:
Logan, John R., Zengwang Xu, and Brian J. Stults. 2014. "Interpolating US Decennial Census Tract Data from as Early as 1970 to 2010: A Longitudinal Tract Database" The Professional Geographer 66(3): 412–420.
Logan, John R., Brian J. Stults, and Zengwang Xu. 2016. "Validating Population Estimates for Harmonized Census Tract Data, 2000–2010" Annals of the American Association of Geographers. http://www.tandfonline.com/doi/full/10.1080/24694452.2016.1187060.
Logan, John R., Wenquan Zhang, Brian J. Stults, and Todd Gardner. 2021. "Improving Estimates of Neighborhood Change with Constant Tract Boundaries" Applied Geography 132: 1–11. doi.org/10.1016/j.apgeog.2021.102476.
Logan, John R., Wenquan Zhang, and Zengwang Xu. 2024. "Using Public Data to Improve Population Estimates Within Consistent Boundaries" The Professional Geographer, forthcoming.

To access the LTDB, select one of the following links:

    • Use the LTDB: information about components and revisions of the LTDB, and file downloads
    • Map the LTDB data using the web-based map system for 1970-2010 only

Methods

The methodology of the original LTDB is described in detail in Logan et al 2014 (see above). A subsequent article (Logan et al 2016) compares the LTDB estimates with those provided by Geolytics (formerly the NCDB) and another source provided by NHGIS. Based on our analysis we do not recommend the Geolytics data set. The NHGIS and LTDB estimates are very similar. Users should read the Annals article to make choices about what data to use and what precautions are needed when using any estimates.

Improving the LTDB estimates

Logan et al (2021) uses confidential microdata from the 2000 Census to test the accuracy of LTDB estimates for a selected set of full-count and sample-count data. It finds that the errors in estimates for several variables are greater than had been documented for population counts (Logan et al 2016). The authors tested an alternative approach in which estimates are based on the original data, to which random noise was added to create "differential privacy" (DP) estimates that can be disclosed. The DP estimates are far better than the interpolated LTDB estimates. The Census Bureau approved disclosure of the DP estimates for the few variables that were included in this study. We provide the disclosed estimates for these variables in 2000 in a separate download that we call the LTDB-DP.

The Census Bureau has not approved disclosure of DP estimates for the many other variables in the LTDB. We continue to experiment with alternatives to improve these estimates. In our most recent work (Logan et al 2024, see above) we show that much estimation error results from an assumption that we refer to as “spatial stationarity.” In the case of tracts with complex boundary changes, the LTDB uses information on block-level populations to estimate what proportion of a tract’s population at a given time should be allocated to a given 2010 tract. This is population interpolation. Then persons in all population categories are allocated in the same proportion. We have now tested an alternative in which small-area data for each specific category of persons are used for this interpolation (we refer to this as a “trait-based” or TB approach). We find substantial improvement for full-count variables in 2000, but not for the sample (“long form”) variables. The current version of the LTDB substitutes the TB estimates for the short-form variables including race/ethnicity, national-origin categories of Hispanics and Asians, age by race, and full-count housing variables for 2000. It also applies the TB approach to the race counts from Census 2020 that were reported as the PL94 data.
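The difference between population interpolation and the trait-based approach can be illustrated with a small sketch. This is not the LTDB's implementation. It assumes, for illustration only, that block-level counts are available both for total population and for one category (here Hispanic persons), and all field names and numbers are hypothetical.

```python
from collections import defaultdict


def allocation_weights(blocks, field):
    """Share of a source tract's `field` count located in each 2010 tract."""
    totals = defaultdict(float)
    for block in blocks:
        totals[block["tract2010"]] += block[field]
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No one in this category: nothing to allocate.
        return {tract: 0.0 for tract in totals}
    return {tract: count / grand_total for tract, count in totals.items()}


# Hypothetical blocks of one 2000 tract that was split between
# 2010 tracts X and Y.
blocks = [
    {"tract2010": "X", "total": 600, "hispanic": 30},
    {"tract2010": "Y", "total": 400, "hispanic": 170},
]
tract_hispanic = 200  # Hispanic count reported for the whole 2000 tract

# Population interpolation: every category uses total-population shares.
pop_weights = allocation_weights(blocks, "total")
# Trait-based (TB): the category is allocated using its own block counts.
tb_weights = allocation_weights(blocks, "hispanic")

pop_based = {t: w * tract_hispanic for t, w in pop_weights.items()}
trait_based = {t: w * tract_hispanic for t, w in tb_weights.items()}
```

In this made-up case the group is concentrated in the part of the tract that became Y, so the population-based weights place far more of the group in X than the block counts support. This is the kind of error that the spatial stationarity assumption can introduce when a category is unevenly distributed within a tract.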