How to Analyze Transit Equity Using GPS Mobility Data: A Step-by-Step Guide

Introduction

Recent research using 66 million GPS mobility records revealed a stark inequity in New York City's public transit system: white neighborhoods enjoy far better access to jobs, banks, healthcare, parks, and schools within a one-hour commute than Black and Hispanic communities do. This guide walks you through the methodology used in that “PNAS Nexus” study, enabling you to replicate the analysis or apply it to your own city. By following these steps, you’ll learn how to leverage geospatial big data to uncover systemic biases in transit infrastructure.

Source: phys.org

What You Need

Step-by-Step Instructions

  1. Step 1: Collect and Preprocess GPS Mobility Data

    Obtain a representative sample of GPS pings from anonymous users. Clean the data by removing outliers (e.g., pings that imply unrealistic speeds or report poor location accuracy), keeping only trips that occurred on weekdays during typical commute hours (7–9 AM and 4–7 PM), and mapping each ping to its nearest census tract. Aggregate the data into an origin-destination matrix: for each pair of census tracts, count the trips that start in the first tract and end in the second.
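The cleaning and aggregation in this step can be sketched in plain Python. This is a minimal illustration, not the study's pipeline: the function names, the 150 km/h speed cap, and the tuple layouts are all assumptions chosen for the example.

```python
import math
from collections import Counter
from datetime import datetime

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def is_commute_time(ts):
    """True for weekdays between 7-9 AM or 4-7 PM (the windows named in the text)."""
    if ts.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return False
    return 7 <= ts.hour < 9 or 16 <= ts.hour < 19

def drop_speed_outliers(pings, max_kmh=150.0):
    """Keep pings whose implied speed from the last kept ping is plausible.

    pings: list of (datetime, lat, lon) for one user, sorted by time.
    max_kmh is an assumed threshold, not a value from the study.
    """
    kept = []
    for ping in pings:
        if kept:
            t0, lat0, lon0 = kept[-1]
            hours = (ping[0] - t0).total_seconds() / 3600
            if hours <= 0:
                continue  # duplicate or out-of-order timestamp
            if haversine_km(lat0, lon0, ping[1], ping[2]) / hours > max_kmh:
                continue  # implausible jump
        kept.append(ping)
    return kept

def od_matrix(trips):
    """Count commute-hour trips per (origin_tract, destination_tract) pair.

    trips: iterable of (origin_tract, destination_tract, start_datetime).
    """
    return Counter((o, d) for o, d, ts in trips if is_commute_time(ts))
```

A `Counter` keyed by tract pairs is a sparse representation of the origin-destination matrix, which tends to suit city-scale data where most tract pairs have no observed trips.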

  2. Step 2: Build a Time-Distance Transit Network

    Import GTFS files into your GIS or Python environment. Use a routing engine (e.g., OpenTripPlanner or GraphHopper) to model travel times between all pairs of census tracts via public transit. A street graph built with “osmnx” and “networkx” can supply the walking legs, but OSMnx does not read GTFS schedules itself, so the transit links have to be added separately. Include walking time to/from stops, waiting time, and in-vehicle travel. Set a maximum one-hour threshold, as in the original study.
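To make the idea of a time-weighted network concrete, here is a simplified stand-in for a routing engine: Dijkstra's algorithm over a toy graph whose edge weights are minutes. It assumes fixed walking, waiting, and in-vehicle times and ignores timetables entirely, which a real GTFS-based router would not. The node names and the `travel_times` function are hypothetical.

```python
import heapq

def travel_times(graph, origin, cutoff_min=60.0):
    """Shortest travel time in minutes from origin to every node within the cutoff.

    graph: dict mapping node -> list of (neighbor, minutes) edges.
    """
    best = {origin: 0.0}
    heap = [(0.0, origin)]
    while heap:
        t, node = heapq.heappop(heap)
        if t > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, minutes in graph.get(node, []):
            nt = t + minutes
            # Only keep destinations reachable within the one-hour threshold.
            if nt <= cutoff_min and nt < best.get(nxt, float("inf")):
                best[nxt] = nt
                heapq.heappush(heap, (nt, nxt))
    return best

# Toy network (all values are made up): walk to a stop, wait, ride, walk again.
graph = {
    "tract_A": [("stop_1", 5.0)],                          # walk to stop
    "stop_1": [("stop_2", 6.0 + 20.0)],                    # wait + in-vehicle
    "stop_2": [("tract_B", 4.0), ("stop_3", 5.0 + 30.0)],  # walk, or transfer
    "stop_3": [("tract_C", 3.0)],
}
times = travel_times(graph, "tract_A")
```

In this toy example `tract_B` is 35 minutes away, while `tract_C` would take 69 minutes and is therefore left out of the result.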

  3. Step 3: Geocode Destination Amenities

    Compile geolocated lists of job sites (counts of employment per location), banks, healthcare facilities (hospitals, clinics), parks (public green spaces), and schools (K–12, universities). Standardize addresses using a geocoding service (e.g., Google Maps Geocoding API, Census Geocoder) and assign each amenity to its census tract.
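Once amenities have coordinates, assigning them to tracts is a point-in-polygon problem. The sketch below uses a standard ray-casting test in plain Python so that it runs without GIS libraries; in practice a spatial join in a GIS package would likely replace it. The tract polygons, amenity tuples, and function names are illustrative assumptions.

```python
from collections import Counter

def point_in_polygon(x, y, polygon):
    """Ray-casting test. polygon is a list of (x, y) vertices in order."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def assign_tract(x, y, tracts):
    """Return the ID of the first tract containing the point, or None."""
    for tract_id, polygon in tracts.items():
        if point_in_polygon(x, y, polygon):
            return tract_id
    return None

def count_amenities(amenities, tracts):
    """Count amenities per (tract, kind). amenities: list of (kind, x, y)."""
    counts = Counter()
    for kind, x, y in amenities:
        tract_id = assign_tract(x, y, tracts)
        if tract_id is not None:  # points outside every tract are skipped
            counts[(tract_id, kind)] += 1
    return counts

# Hypothetical unit-square tracts and amenity points (x = longitude, y = latitude).
tracts = {
    "T1": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "T2": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
amenities = [("bank", 0.5, 0.5), ("school", 1.5, 0.5), ("bank", 1.2, 0.8), ("park", 5, 5)]
counts = count_amenities(amenities, tracts)
```

Points that fall outside every tract boundary (the park in this example) are dropped rather than snapped to the nearest tract; whether that is appropriate depends on how clean the geocoded addresses are.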

  4. Step 4: Calculate Accessible Destinations from Each Tract

    For each census tract, query the transit network model to find all destinations reachable within 60 minutes. For each type of amenity, sum the total number of opportunities (e.g., total jobs, number of banks) within that one-hour travel time window. Store these accessibility scores per tract.
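The score described here is often called a cumulative-opportunities measure: for each origin, add up every opportunity located in a tract reachable within the cutoff. A minimal sketch follows, assuming travel times are already available as a nested dictionary; the data layout and the `accessibility` function name are assumptions for illustration.

```python
def accessibility(travel_min, opportunities, cutoff=60.0):
    """Sum opportunities reachable within `cutoff` minutes from each origin tract.

    travel_min: {origin: {destination: minutes}}
    opportunities: {tract: {amenity_kind: count}}
    Returns {origin: {amenity_kind: total_reachable}}.
    """
    scores = {}
    for origin, row in travel_min.items():
        totals = {}
        for dest, minutes in row.items():
            if minutes <= cutoff:
                for kind, n in opportunities.get(dest, {}).items():
                    totals[kind] = totals.get(kind, 0) + n
        scores[origin] = totals
    return scores

# Made-up travel times (minutes) and opportunity counts for three tracts.
travel_min = {
    "A": {"A": 0, "B": 35, "C": 75},
    "B": {"A": 35, "B": 0, "C": 40},
}
opportunities = {
    "A": {"jobs": 100, "banks": 1},
    "B": {"jobs": 500, "banks": 3},
    "C": {"jobs": 2000, "banks": 5},
}
scores = accessibility(travel_min, opportunities)
```

Here tract A reaches 600 jobs and 4 banks because tract C lies beyond 60 minutes, while tract B reaches all 2,600 jobs and 9 banks.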

  5. Step 5: Segment Tracts by Dominant Race/Ethnicity

    Overlay the accessibility results with demographic data from the census. Classify each tract as “Predominantly White” (>50% non-Hispanic white), “Predominantly Black” (>50% Black), or “Predominantly Hispanic” (>50% Hispanic). Tracts in which no group exceeds 50% fall outside these three classes and need to be either excluded or reported separately. Alternatively, use continuous measures of racial composition to perform regression analysis.
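The majority-threshold rule can be written as a small function. This sketch assumes each tract's census record is a dictionary of population counts; the key names and the "No majority" and "Unpopulated" labels are hypothetical additions for tracts the three classes do not cover.

```python
# (label, population key) pairs; the key names are assumed for this example.
GROUPS = [
    ("Predominantly White", "white_non_hispanic"),
    ("Predominantly Black", "black"),
    ("Predominantly Hispanic", "hispanic"),
]

def classify_tract(pop, threshold=0.5):
    """Label a tract by the group whose share of total population exceeds the threshold."""
    total = pop["total"]
    if total == 0:
        return "Unpopulated"  # e.g. parks, airports, industrial tracts
    for label, key in GROUPS:
        if pop[key] / total > threshold:
            return label
    return "No majority"

# Illustrative counts, not real census data.
example = {"total": 4000, "white_non_hispanic": 900, "black": 2400, "hispanic": 500}
label = classify_tract(example)
```

With a threshold of 0.5, at most one group can qualify, so the order of `GROUPS` does not affect the result.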

  6. Step 6: Compare and Visualize Disparities

    Calculate the average accessibility score (jobs, banks, etc.) for each demographic group. Use bar charts or choropleth maps to display the differences. As the original study found, white-majority tracts consistently have higher counts of reachable job sites, banks, healthcare, parks, and schools than Black- or Hispanic-majority tracts. Perform statistical tests (e.g., t-test or ANOVA) to check whether the differences between groups are statistically significant.
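The comparison can be sketched with the standard library alone: group means, plus Welch's t statistic, which is a t-test variant that does not assume equal variances. Choosing Welch's version is an assumption of this example, as are the function names and the job counts, which are invented to show the mechanics and are not results from the study.

```python
import math
from statistics import mean, variance

def group_means(scores_by_group):
    """Average accessibility score per demographic group."""
    return {group: mean(values) for group, values in scores_by_group.items()}

def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

# Hypothetical reachable-job counts per tract, grouped by tract classification.
jobs = {
    "Predominantly White": [600, 700, 650],
    "Predominantly Black": [300, 350, 400],
}
means = group_means(jobs)
t_stat = welch_t(jobs["Predominantly White"], jobs["Predominantly Black"])
```

Turning the statistic into a p-value requires the t distribution, which the standard library does not provide; `scipy.stats.ttest_ind` with `equal_var=False` computes both in one call.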

Tips for Accurate Analysis
