Building Road Networks with NetworkX and OSMnx

Constructing computable road networks from raw geographic data is a foundational task in urban analytics, transportation planning, and logistics optimization. By combining OpenStreetMap (OSM) data extraction with graph-theoretic algorithms, practitioners can transform street geometries into directed or undirected graphs ready for routing, centrality measurement, and spatial modeling. This workflow sits at the core of Spatial Data Processing & Analysis, where raw vector features are systematically cleaned, projected, and structured into network representations.

The following guide provides a complete, step-by-step implementation using osmnx and networkx. It covers data acquisition, topology enforcement, spatial indexing, attribute integration, and routing, alongside explicit debugging procedures for common failure modes.

flowchart LR
    A["graph_from_place<br/>(OSM via Overpass)"] --> B["project_graph<br/>& remove isolates"]
    B --> C["sjoin_nearest<br/>attach attributes"]
    C --> D["edge length attribute<br/>weight edges"]
    D --> E["shortest_path<br/>route"]

Environment Setup

Before executing the workflow, ensure the required packages are installed. osmnx handles data retrieval, graph construction, and coordinate reference system (CRS) projection, while networkx provides the underlying graph algorithms. geopandas is used for spatial operations and attribute merging. scikit-learn is an optional OSMnx dependency that the nearest-node search in Step 4 needs, because that search runs on the unprojected graph.

pip install osmnx networkx geopandas matplotlib shapely scikit-learn

import osmnx as ox
import networkx as nx
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import Point

Step 1: Data Acquisition and Geocoding

OSMnx simplifies the retrieval of street networks by querying the Overpass API, making it an efficient bridge between human-readable locations and machine-readable graphs. When you pass a place name, OSMnx geocodes it through Nominatim to obtain the boundary polygon of that place, then downloads the streets inside the polygon. When you pass coordinates, no geocoding takes place: OSMnx builds a search area around the point and queries it directly.

# Define a target location
place_name = "Piedmont, California, USA"

# Retrieve the drivable street network
G = ox.graph_from_place(place_name, network_type="drive")

print(f"Graph nodes: {G.number_of_nodes()}, edges: {G.number_of_edges()}")

If your workflow starts from specific coordinate pairs (e.g., GPS logs or IoT sensor locations), use graph_from_point with a search radius:

lat, lon = 37.8215, -122.2315
G_point = ox.graph_from_point((lat, lon), dist=1000, network_type="drive")
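
For offline experiments, or for tests that should not depend on the Overpass API, a small hand-built graph can stand in for a downloaded one. The sketch below assumes the attribute layout OSMnx uses (node x/y, edge length in meters, a graph-level crs). The function name, node IDs, coordinates, and lengths are made up for illustration and are not real OSM data.

```python
import networkx as nx

def build_toy_street_graph():
    """Return a tiny MultiDiGraph laid out like an OSMnx street network.

    All IDs, coordinates, and lengths are illustrative, not real OSM data.
    """
    G = nx.MultiDiGraph(crs="EPSG:4326")
    # Nodes carry x (longitude) and y (latitude), as OSMnx nodes do.
    G.add_node(1, x=-122.2320, y=37.8210)
    G.add_node(2, x=-122.2300, y=37.8210)
    G.add_node(3, x=-122.2300, y=37.8225)
    # A two-way street is stored as one directed edge per direction.
    for u, v, length in [(1, 2, 176.0), (2, 3, 167.0)]:
        G.add_edge(u, v, length=length, oneway=False)
        G.add_edge(v, u, length=length, oneway=False)
    # A one-way street has a single directed edge.
    G.add_edge(3, 1, length=243.0, oneway=True)
    return G

toy = build_toy_street_graph()
print(f"Graph nodes: {toy.number_of_nodes()}, edges: {toy.number_of_edges()}")
```

Because the toy graph is an ordinary NetworkX MultiDiGraph, the NetworkX calls used in the later steps work on it unchanged.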

Step 2: Topology Validation and Projection

Raw OSM data frequently contains topological inconsistencies: self-loops, duplicate edges, disconnected components, and non-planar intersections. With its default settings, OSMnx already simplifies the graph during creation (merging nodes that are neither intersections nor dead ends) and keeps only the largest weakly connected component, but explicit post-processing ensures analytical reliability when those defaults are changed or the graph is edited later. Projecting to a planar coordinate system is critical because geographic coordinates (lat/lon) are angular units, so Euclidean distances and angles computed from them are distorted.

# Project to UTM zone 10N (EPSG:32610), which covers Piedmont, California.
# Omitting to_crs lets OSMnx pick a suitable UTM zone automatically.
G_proj = ox.projection.project_graph(G, to_crs="EPSG:32610")

# Remove isolated (degree-zero) nodes to clean up topology
G_clean = G_proj.copy()
G_clean.remove_nodes_from(list(nx.isolates(G_clean)))
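
Removing isolates only drops nodes that have no edges at all. A small cluster of streets that is cut off from the rest of the network survives this step and can later make routing fail with a "no path" error, for example when the graph was downloaded with retain_all=True or edges were removed by hand. A minimal sketch of keeping only the largest weakly connected component, using plain NetworkX on a toy graph (the helper name and node IDs are illustrative):

```python
import networkx as nx

def keep_largest_component(G):
    """Return a copy of G restricted to its largest weakly connected component."""
    largest = max(nx.weakly_connected_components(G), key=len)
    return G.subgraph(largest).copy()

G_demo = nx.MultiDiGraph()
G_demo.add_edges_from([(1, 2), (2, 3), (3, 1)])  # main network
G_demo.add_edges_from([(10, 11)])                # cut-off fragment
G_demo.add_node(99)                              # isolated node

G_main = keep_largest_component(G_demo)
print(sorted(G_main.nodes))
```

Recent OSMnx releases also appear to ship a helper for this (ox.truncate.largest_component); check the version you have installed before relying on it.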

Step 3: Spatial Indexing and Attribute Integration

Integrating external datasets (e.g., traffic counts, zoning boundaries, or demographic data) requires spatial joins. Because road networks can contain tens of thousands of edges, leveraging spatial indexing dramatically improves performance. GeoPandas automatically constructs R-tree indexes when executing spatial operations, enabling fast nearest-neighbor matching and polygon overlays.

# Example: Create a GeoDataFrame of traffic sensors
sensors = gpd.GeoDataFrame(
    {"id": [1, 2], "volume": [150, 320]},
    geometry=[Point(-122.230, 37.822), Point(-122.235, 37.820)],
    crs="EPSG:4326"
)

# Project sensors to match the graph's CRS
sensors_proj = sensors.to_crs(G_clean.graph["crs"])

# Spatial join: every node receives the attributes of its nearest sensor.
# Pass max_distance (in CRS units, here meters) to leave distant nodes unmatched.
nodes_gdf = ox.convert.graph_to_gdfs(G_clean, edges=False)
joined = gpd.sjoin_nearest(nodes_gdf, sensors_proj, how="left")

# Map attributes back to the NetworkX graph
for node, row in joined.iterrows():
    if node in G_clean.nodes:
        G_clean.nodes[node]["traffic_volume"] = row["volume"]
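
The join above assigns every node the volume of whichever sensor is closest to it. If only the node nearest each sensor should carry a value, the direction can be reversed: snap each sensor to one node and write the attribute there. The sketch below illustrates the idea with a brute-force search over projected coordinates on a toy graph. The coordinates, the sensor readings, and the helper name nearest_node are made up, and on a real network an indexed search would replace the linear scan.

```python
import math
import networkx as nx

def nearest_node(G, x, y):
    """Brute-force nearest node by Euclidean distance (projected coordinates)."""
    return min(
        G.nodes,
        key=lambda n: math.hypot(G.nodes[n]["x"] - x, G.nodes[n]["y"] - y),
    )

# Toy projected graph: coordinates are in meters and purely illustrative.
G_demo = nx.MultiDiGraph(crs="EPSG:32610")
G_demo.add_node("a", x=0.0, y=0.0)
G_demo.add_node("b", x=100.0, y=0.0)
G_demo.add_node("c", x=100.0, y=80.0)

sensor_readings = [
    {"id": 1, "x": 5.0, "y": -3.0, "volume": 150},
    {"id": 2, "x": 96.0, "y": 70.0, "volume": 320},
]

# Map each sensor to its nearest node, then write the values in one call.
volume_by_node = {
    nearest_node(G_demo, s["x"], s["y"]): s["volume"] for s in sensor_readings
}
nx.set_node_attributes(G_demo, volume_by_node, name="traffic_volume")
print(dict(G_demo.nodes(data="traffic_volume")))
```

Nodes without a nearby sensor simply have no traffic_volume attribute, which keeps measured and unmeasured nodes distinguishable in later analysis.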

Step 4: Routing and Algorithmic Execution

With a cleaned, attributed graph, you can execute graph algorithms for routing and network analysis. NetworkX provides optimized implementations for shortest-path calculations, centrality metrics, and connected component analysis. For a deeper exploration of algorithmic patterns and performance tuning, consult the official NetworkX Documentation.

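# graph_from_place has already stored each edge's length in meters in the
# "length" attribute. Do not recompute it with ox.distance.add_edge_lengths
# on G_clean: that function expects unprojected lat/lon coordinates.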

# Find the nearest graph nodes to origin/destination coordinates.
# nearest_nodes expects coordinates in the graph's CRS, so query the
# unprojected graph G (lat/lon); projection preserves node IDs.
origin = ox.distance.nearest_nodes(G, X=-122.2315, Y=37.8215)
destination = ox.distance.nearest_nodes(G, X=-122.2280, Y=37.8200)

# Compute shortest path by distance
route = nx.shortest_path(G_clean, origin, destination, weight="length")
route_length = nx.shortest_path_length(G_clean, origin, destination, weight="length")
print(f"Route length: {route_length:.2f} meters")

# Visualize the network and route
fig, ax = ox.plot_graph_route(G_clean, route, route_linewidth=4, node_size=0)
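
The weight argument decides what "shortest" means. A small sketch on a toy graph, assuming made-up edge lengths and speeds, shows how the same origin and destination can yield different routes when weighting by distance versus travel time:

```python
import networkx as nx

G_demo = nx.MultiDiGraph()
# (u, v, length in meters, assumed speed in km/h); all values are illustrative
edges = [
    ("A", "B", 400.0, 30.0),  # short residential street
    ("B", "D", 400.0, 30.0),
    ("A", "C", 600.0, 80.0),  # longer arterial road
    ("C", "D", 600.0, 80.0),
]
for u, v, length, speed_kph in edges:
    travel_time = length / (speed_kph / 3.6)  # seconds
    G_demo.add_edge(u, v, length=length, travel_time=travel_time)

shortest = nx.shortest_path(G_demo, "A", "D", weight="length")
fastest = nx.shortest_path(G_demo, "A", "D", weight="travel_time")
fastest_seconds = nx.shortest_path_length(G_demo, "A", "D", weight="travel_time")
print(f"shortest: {shortest}, fastest: {fastest} ({fastest_seconds:.0f} s)")
```

OSMnx provides helpers that impute edge speeds and travel times on real networks (add_edge_speeds and add_edge_travel_times); the module they live in has changed between releases, so check the documentation for your installed version.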

Debugging and Performance Optimization

Common failure modes include API rate limits, memory exhaustion on metropolitan-scale datasets, and silent CRS mismatches during spatial joins. Always enable caching to avoid redundant API calls:

ox.settings.use_cache = True
ox.settings.cache_folder = "./osmnx_cache"
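
Caching does not help with the first request to a busy server. One common way to cope with transient rate-limit errors is to retry with exponentially growing pauses. The helper below is a hypothetical sketch, not part of OSMnx: fetch_with_backoff, its parameters, and the broad exception handling are assumptions chosen for brevity.

```python
import time

def fetch_with_backoff(fetch, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fetch() and retry with exponentially growing pauses on failure.

    fetch is any zero-argument callable, for example:
    lambda: ox.graph_from_place("Piedmont, California, USA", network_type="drive")
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception:
            # Broad catch for brevity; narrow it to the network errors you see.
            if attempt == max_attempts:
                raise
            # Pause 2 s, 4 s, 8 s, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

The sleep parameter exists so that the pause can be replaced in tests; in normal use the default time.sleep applies.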

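OSMnx simplifies graph topology by default when it builds a graph, so calling ox.simplification.simplify_graph() is only needed if the network was downloaded with simplify=False. To reduce a large network further before running memory-intensive algorithms, merge the clusters of nodes that represent a single complex intersection with ox.simplification.consolidate_intersections() on the projected graph. Before any spatial operation, verify that all inputs share the same CRS to prevent silent distance calculation errors. Comprehensive configuration options are detailed in the OSMnx Documentation.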
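
To see how large a silent CRS error can be, the sketch below compares a naive Euclidean "distance" computed directly on lat/lon degrees with the great-circle distance in meters, using the two sensor locations from Step 3. The haversine helper and the mean Earth radius are assumptions of this sketch, not OSMnx functions.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_009):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * radius_m * math.asin(math.sqrt(a))

# The two sensor locations from Step 3, as (lat, lon).
p1 = (37.822, -122.230)
p2 = (37.820, -122.235)

naive = math.hypot(p2[0] - p1[0], p2[1] - p1[1])  # "distance" in degrees
meters = haversine_m(*p1, *p2)
print(f"naive: {naive:.5f} degrees, great-circle: {meters:.0f} m")
```

The naive value is a few thousandths of a degree while the real separation is several hundred meters, so a threshold meant in meters applied to unprojected coordinates would match almost everything or almost nothing.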

Conclusion

Building road networks with NetworkX and OSMnx transforms unstructured geographic data into a rigorous analytical framework. By systematically acquiring data, enforcing topological consistency, and leveraging spatial indexing, practitioners can scale from neighborhood routing to metropolitan logistics modeling. The combination of open-source geospatial tools and graph theory provides a reproducible, extensible pipeline for modern spatial computing.