diff --git a/README.md b/README.md
index d130dc0..a2386f7 100644
--- a/README.md
+++ b/README.md
@@ -6,23 +6,32 @@ A web application for analyzing urban accessibility through the lens of the 15-m
 
 ```
 Browser (Next.js / React)
-  ├── MapLibre GL JS (map + canvas heatmap)
+  ├── MapLibre GL JS (map + canvas heatmap / isochrone overlay)
   └── API calls → Next.js API routes
 
 Next.js App Server
-  ├── Public API: /api/cities /api/pois /api/grid /api/stats /api/isochrones
+  ├── Public API: /api/cities /api/tiles /api/stats /api/location-score /api/isochrones
   ├── Admin API: /api/admin/** (auth-protected)
-  ├── PostgreSQL + PostGIS (POI data, grid scores, isochrone cache)
-  └── Valkey (API response cache, sessions, BullMQ queue)
+  ├── PostgreSQL + PostGIS (POIs, grid points, precomputed scores)
+  └── Valkey (API response cache, BullMQ queues)
 
-BullMQ Worker (separate process)
-  ├── download-pbf → streams OSM data from Geofabrik
-  ├── extract-pois → osmium-tool + osm2pgsql → PostGIS
-  ├── generate-grid → PostGIS SQL (200m grid)
-  ├── compute-scores → KNN lateral join + sigmoid scoring
-  └── build-valhalla → Valhalla routing tile build
+BullMQ Worker (pipeline queue, concurrency 8)
+  ├── refresh-city → orchestrates full ingest via FlowProducer
+  ├── download-pbf → streams OSM PBF from Geofabrik
+  ├── extract-pois → osmium filter + osm2pgsql flex → raw_pois
+  ├── build-valhalla → clips PBF, builds Valhalla routing tiles
+  ├── generate-grid → PostGIS 200 m hex grid → grid_points
+  ├── compute-scores → two-phase orchestrator (see Scoring below)
+  └── compute-routing → Valhalla matrix → grid_poi_details
+        (15 parallel jobs: 3 modes × 5 categories)
+
+BullMQ Worker (valhalla queue, concurrency 1)
+  └── build-valhalla → runs valhalla_build_tiles, manages valhalla_service
+
+Valhalla (child process of valhalla worker)
+  ├── sources_to_targets matrix → compute-routing jobs
+  └── isochrones endpoint → user click → /api/isochrones
 
-Valhalla → local routing (isochrones)
 Protomaps → self-hosted map tiles (PMTiles)
 ```
 
@@ -84,22 +93,64 @@ docker compose up postgres valkey -d
 
 ## Category Definitions
 
-| Category | OSM Sources | Default Threshold |
-|----------|-------------|-------------------|
-| Service & Trade | shops, restaurants, pharmacies, banks | 10 min |
-| Transport | bus stops, metro, train, bike share | 8 min |
-| Work & School | offices, schools, universities | 20 min |
-| Culture & Community | libraries, hospitals, museums, community centers | 15 min |
-| Recreation | parks, sports, gyms, green spaces | 10 min |
+| Category | OSM Sources |
+|----------|-------------|
+| Service & Trade | supermarkets, shops, restaurants, pharmacies, banks, cafés |
+| Transport | bus stops, metro, tram, train stations, bike share, car share |
+| Work & School | offices, coworking, schools, kindergartens, universities |
+| Culture & Community | libraries, hospitals, clinics, museums, theatres, community centres |
+| Recreation | parks, playgrounds, sports centres, gyms, green spaces |
 
 ## Scoring
 
-For each grid point (200m spacing), the nearest POI in each category is found using a PostGIS KNN lateral join. The Euclidean distance is converted to travel time using mode speed assumptions (walking 5 km/h, cycling 15 km/h, driving 40 km/h). A sigmoid function converts travel time to a score in [0,1]:
+### Data pipeline
+
+For each grid point (200 m hexagonal spacing) the pipeline runs in two phases:
+
+**Phase 1: Routing** (15 parallel jobs: 3 modes × 5 categories)
+
+A PostGIS KNN lateral join (`<->` operator) finds the 6 nearest POIs in the category for each grid point. Those POI coordinates are passed to Valhalla's `sources_to_targets` matrix API to obtain real network travel times for the requested travel mode (walking, cycling, driving). The nearest POI per subcategory is persisted to `grid_poi_details`.
+
+**Phase 2: Score aggregation**
+
+Scores are precomputed for every combination of:
+- 5 thresholds: 5, 10, 15, 20, 30 minutes
+- 3 travel modes: walking, cycling, driving
+- 5 profiles: Universal, Young Family, Senior, Young Professional, Student
+
+### Scoring formula
+
+Each subcategory *i* within a category contributes a sigmoid score:
 
 ```
-score = 1 / (1 + exp(k * (travel_time - threshold)))
+sigmoid(t, T) = 1 / (1 + exp(4 × (t − T) / T))
 ```
 
-Where `k = 4/threshold`, giving score=0.5 exactly at the threshold.
+Where `t` is the Valhalla travel time in seconds and `T` is the threshold in seconds. The sigmoid equals 0.5 exactly at the threshold and rises to about 0.98 as the travel time approaches zero.
 
-The composite score is a weighted average of all 5 category scores, with user-adjustable weights.
+The category score combines subcategories via a complement-product, weighted by per-profile subcategory importance weights `w_i ∈ [0, 1]`:
+
+```
+category_score = 1 − ∏ (1 − w_i × sigmoid(t_i, T))
+```
+
+This captures diversity of coverage: reaching one supermarket near you already yields a high score, but having a pharmacy, bakery, and bank nearby as well pushes the score higher.
+
+### Profiles
+
+Each profile carries two sets of weights:
+
+- **Category weights** (used as slider presets in the UI, range 0–2): how much relative importance each of the 5 categories gets in the composite score.
+- **Subcategory weights** (used during score computation, range 0–1): how much a specific subcategory contributes to its category score.
+
+| Profile | Focus |
+|---------|-------|
+| Universal | Balanced across all resident types |
+| Young Family | Schools, playgrounds, healthcare, daily shopping |
+| Senior | Healthcare, local services, accessible green space, transit |
+| Young Professional | Rapid transit, fitness, dining, coworking |
+| Student | University, library, cafés, transit, budget services |
+
+### Composite score
+
+The composite shown on the heatmap is a weighted average of the 5 category scores. Category weights come from the selected profile but can be adjusted freely in the UI. All scores are precomputed: changing the profile or weights only triggers a client-side re-blend with no server round-trip.
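
The scoring formulas added in this diff can be checked numerically. The following is a minimal Python sketch of the sigmoid, the complement-product category score, and the weighted-average composite as the README text describes them. It is not taken from the project's code: the function names are hypothetical, the exponent clamp is an added safeguard against floating-point overflow that the README does not mention, and returning 0.0 when all category weights are zero is an assumption.

```python
import math


def sigmoid(t: float, T: float) -> float:
    """Subcategory score: 0.5 at t == T, higher for shorter travel times.

    t is the travel time and T the threshold, both in seconds.
    """
    # Clamp the exponent so very long travel times cannot overflow math.exp
    # (assumption: this safeguard is not described in the README).
    x = min(4.0 * (t - T) / T, 700.0)
    return 1.0 / (1.0 + math.exp(x))


def category_score(times, weights, T: float) -> float:
    """Complement-product: 1 - prod(1 - w_i * sigmoid(t_i, T))."""
    miss = 1.0
    for t, w in zip(times, weights):
        miss *= 1.0 - w * sigmoid(t, T)
    return 1.0 - miss


def composite_score(category_scores, category_weights) -> float:
    """Weighted average of category scores.

    Returns 0.0 when every weight is zero (an assumption, to avoid
    dividing by zero when all sliders are at their minimum).
    """
    total = sum(category_weights)
    if total == 0:
        return 0.0
    blended = sum(s * w for s, w in zip(category_scores, category_weights))
    return blended / total


if __name__ == "__main__":
    T = 10 * 60  # 10-minute threshold in seconds
    # One subcategory exactly at the threshold, full weight.
    print(category_score([600], [1.0], T))
    # A second subcategory at the threshold raises the category score.
    print(category_score([600, 600], [1.0, 1.0], T))
    # Two categories blended with equal weights.
    print(composite_score([0.5, 1.0], [1.0, 1.0]))
```

Because `composite_score` needs only the per-category scores and the current weights, it is the kind of step that can run in the browser when a slider moves, which matches the client-side re-blend described in the Composite score section.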