doc: update readme

2026-03-06 13:50:11 +01:00 · 2026-03-06 13:50:11 +01:00 · 85ad80e76b
commit 85ad80e76b
parent f067472fd5
1 changed files with 73 additions and 59 deletions
--- a/README.md
+++ b/README.md
@@ -109,9 +109,53 @@ Required local services: PostgreSQL+PostGIS, Valkey. Easiest via:
 docker compose up postgres valkey -d
 ```

+## Methodology
+
+The 15-minute city concept holds that a liveable urban neighbourhood is one where daily necessities are reachable within 15 minutes on foot or by bike. Transportationer operationalises this as a continuous, queryable accessibility score for every location in a city.
+
+### What is measured
+
+For each location (grid point) the tool asks: *how quickly can you reach the nearest representative destination in each category, by each travel mode?* The answer is a travel time in seconds, obtained from real routing data rather than straight-line distance.
+
+Destinations are sourced from OpenStreetMap and grouped into five categories — **Service & Trade**, **Transport**, **Work & School**, **Culture & Community**, and **Recreation** — each split into subcategories (e.g. `supermarket`, `pharmacy`, `cafe`). Multiple OSM tags may map to the same subcategory (e.g. `shop=bakery` and `amenity=cafe` both map to `cafe`).
+
+### Grid
+
+Each city is covered by a hexagonal grid at 200 m spacing. Hexagons tessellate uniformly and avoid the directional bias of square grids. One score is stored per grid point per (category × travel mode × threshold × profile) combination.
+
+### Routing
+
+Travel times are obtained from [Valhalla](https://github.com/valhalla/valhalla), a real-network routing engine built on OSM data:
+
+- **Walking, cycling, driving** — Valhalla's `sources_to_targets` matrix endpoint. For each grid point the 6 spatially nearest POIs in the category are sent as targets; the resulting travel-time matrix gives the exact routed time to each. The nearest POI *per subcategory* is retained.
+- **Transit** — Valhalla's matrix endpoint does not support transit. Instead, a multimodal isochrone is computed per grid point at contour intervals of 5, 10, 15, 20, and 30 minutes (fixed departure: next Tuesday 08:00 for reproducible GTFS results). PostGIS `ST_Within` then classifies every POI in the city into the smallest contour it falls within, giving estimated times of 300 / 600 / 900 / 1200 / 1800 seconds. Grid points outside the transit network are silently skipped — they receive no transit score.
+- **Best mode (`fifteen`)** — a synthetic mode computed during score aggregation: for each (grid point, subcategory) the minimum travel time across walking, cycling, and transit is used. Driving is excluded intentionally. No extra routing calls are needed.
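The transit banding and the best-mode rule described in the list above can be sketched in a few lines of Python. This is an illustration only, not the project's code: the function names, the dictionary layout, and the use of `None` for "no result" are assumptions made for the example.

```python
from typing import Iterable, Optional

# Modes that feed the synthetic "fifteen" mode; driving is excluded, as stated above.
ACTIVE_MODES = ("walking", "cycling", "transit")


def transit_band_seconds(containing_contours_min: Iterable[int]) -> Optional[int]:
    """Estimate a transit time from the isochrone contours that contain a POI.

    The POI is assigned to the smallest contour it falls within, and that
    contour's interval (in minutes) is converted to seconds. Returns None
    when the POI lies outside every contour.
    """
    contours = list(containing_contours_min)
    return min(contours) * 60 if contours else None


def best_mode_time(times_s: dict[str, Optional[float]]) -> Optional[float]:
    """Minimum travel time in seconds across walking, cycling, and transit.

    Modes that are absent or None (for example a grid point outside the
    transit network) are skipped rather than treated as infinitely slow.
    """
    candidates = [times_s[m] for m in ACTIVE_MODES if times_s.get(m) is not None]
    return min(candidates) if candidates else None
```

For example, a POI inside the 15, 20, and 30 minute contours is banded at 900 seconds, and a grid point with only a walking result keeps that walking time as its best-mode time.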
+
+### Scoring formula
+
+All scores are precomputed at ingest time for every combination of threshold (5 / 10 / 15 / 20 / 30 min), travel mode, and profile, so interactive queries hit only the database.
+
+Each subcategory *i* contributes a sigmoid score based on travel time `t` and threshold `T` (both in seconds):
+
+```
+sigmoid(t, T) = 1 / (1 + exp(4 × (t − T) / T))
+```
+
+The sigmoid equals 0.5 exactly at the threshold, approaches 1 for very short times, and approaches 0 for very long times. It is continuous, so there is no hard cutoff: under a 15-minute threshold a 14-minute trip scores about 0.57 and a 10-minute trip about 0.79.
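A minimal Python sketch of the formula above, useful for checking individual values by hand. The function name `sigmoid_score` is chosen for this example and is not taken from the codebase.

```python
import math


def sigmoid_score(t: float, T: float) -> float:
    """Score for travel time t against threshold T, both in seconds."""
    return 1.0 / (1.0 + math.exp(4.0 * (t - T) / T))


if __name__ == "__main__":
    T = 15 * 60  # 15-minute threshold in seconds
    for minutes in (0, 10, 14, 15, 30):
        print(f"{minutes:>2} min -> {sigmoid_score(minutes * 60, T):.3f}")
```

Note that at `t = 0` the sketch returns about 0.98 rather than 1, because `exp(-4)` is small but not zero; the score only approaches 1 asymptotically.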
+
+The category score aggregates all subcategories via a **complement product** weighted by profile-specific importance weights `w_i ∈ [0, 1]`:
+
+```
+category_score = 1 − ∏ (1 − w_i × sigmoid(t_i, T))
+```
+
+This captures coverage diversity: one nearby supermarket already yields a high score, but also having a pharmacy and a bakery pushes it higher. Subcategories with no POI found are omitted from the product and do not penalise the score.
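The complement product can be sketched as follows. This is an illustrative Python version under stated assumptions: the `(weight, travel_time)` tuple layout and the use of `None` for a subcategory with no POI are choices made for the example, not the project's data model.

```python
import math
from typing import Iterable, Optional, Tuple


def sigmoid_score(t: float, T: float) -> float:
    """Score for travel time t against threshold T, both in seconds."""
    return 1.0 / (1.0 + math.exp(4.0 * (t - T) / T))


def category_score(entries: Iterable[Tuple[float, Optional[float]]], T: float) -> float:
    """Complement product over subcategories.

    Each entry is (w_i, t_i): the importance weight in [0, 1] and the travel
    time in seconds, or None when no POI was found for that subcategory.
    """
    remainder = 1.0
    for weight, travel_time in entries:
        if travel_time is None:
            continue  # omitted from the product, so it does not penalise the score
        remainder *= 1.0 - weight * sigmoid_score(travel_time, T)
    return 1.0 - remainder
```

With a single subcategory of weight 1.0 exactly at the threshold the result is 0.5; adding a second subcategory of weight 0.5 at the same travel time raises it to 0.625. In this sketch a category with no POIs at all scores 0, which follows from the empty product being 1.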
+
+The **composite score** shown on the heatmap is a weighted average of all five category scores. Category weights come from the selected profile but can be adjusted freely with the UI sliders. Changing the profile, threshold, or travel mode re-queries the database; adjusting the sliders re-blends client-side with no server round-trip.
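The client-side re-blend amounts to a weighted average, sketched here in Python for consistency with the other examples. The function name, the dictionary keys, and the handling of an all-zero weight set are assumptions for illustration; the source does not say how that edge case is treated.

```python
def composite_score(category_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of category scores using slider weights (range 0 to 2)."""
    total_weight = sum(weights.get(name, 0.0) for name in category_scores)
    if total_weight == 0:
        return 0.0  # assumption: every slider at zero yields a composite of 0
    weighted_sum = sum(
        score * weights.get(name, 0.0) for name, score in category_scores.items()
    )
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Hypothetical category scores for one grid point.
    scores = {
        "Service & Trade": 0.9,
        "Transport": 0.7,
        "Work & School": 0.4,
        "Culture & Community": 0.6,
        "Recreation": 0.8,
    }
    universal = {name: 1.0 for name in scores}  # all sliders at 1.0
    print(round(composite_score(scores, universal), 3))
```

Because only the weights change when a slider moves, the five precomputed category scores can be re-blended without another query.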
+
 ## Category Definitions

-Five categories cover everyday destinations. All categories are scored at the same threshold (5, 10, 15, 20, or 30 minutes — user-selectable). The **universal weight** column shows how strongly a subcategory contributes to its parent category score in the Universal profile (range 0–1); other profiles override specific values — see the Profiles table below.
+The five categories and their subcategories are defined below. All categories are scored at the same user-selected threshold. The **universal weight** column shows the subcategory importance weight for the Universal profile (range 0–1); other profiles override specific values, as listed in the Profiles section below.

 ### Service & Trade

@@ -179,75 +223,45 @@ Five categories cover everyday destinations. All categories are scored at the sa
 | `leisure=nature_reserve`, `leisure=golf_course`, `landuse=recreation_ground`, `landuse=grass`, `landuse=meadow`, `landuse=forest` | `green_space` | 0.6 |
 | `leisure=swimming_pool`, `amenity=swimming_pool` | `swimming_pool` | 0.55 |

-## Scoring
-
-### Data pipeline
-
-For each city the pipeline runs in two phases:
-
-**Phase 1 — Routing** (parallel child jobs)
-
-*Walking, cycling, driving — 15 jobs (3 modes × 5 categories):*
-A PostGIS KNN lateral join (`<->` operator) finds the 6 nearest POIs in the category for each grid point (200 m hexagonal spacing). Those POI coordinates are sent in batches of 20 to Valhalla's `sources_to_targets` matrix API to obtain exact real-network travel times. The nearest POI per subcategory is persisted to `grid_poi_details`.
-
-*Transit — 1 job per city (`compute-transit`):*
-Valhalla's matrix API does not support transit. Instead, for each grid point a multimodal isochrone is fetched from Valhalla at contour intervals of 5, 10, 15, 20, and 30 minutes (fixed departure: Tuesday 08:00 to ensure reproducible GTFS results). PostGIS `ST_Within` then classifies all POIs in the city into the smallest contour they fall within, giving estimated travel times of 300 s / 600 s / 900 s / 1 200 s / 1 800 s respectively. Grid points outside the transit network are silently skipped — transit contributes nothing to their score and the other modes compensate.
-
-**Phase 2 — Score aggregation**
-
-Scores are precomputed for every combination of:
- 5 thresholds: 5, 10, 15, 20, 30 minutes
- 5 travel modes (see below)
- 5 profiles: Universal, Young Family, Senior, Young Professional, Student
-
-### Travel modes
-
-| Mode | Internal key | How travel time is obtained |
-|------|--------------|-----------------------------|
-| Best mode | `fifteen` | Synthetic — minimum travel time across walking, cycling, and transit per subcategory. A destination reachable by any of these modes within the threshold counts as accessible. Driving excluded intentionally. |
-| Walking | `walking` | Valhalla pedestrian matrix, exact seconds |
-| Cycling | `cycling` | Valhalla bicycle matrix, exact seconds |
-| Transit | `transit` | Valhalla multimodal isochrone, quantised to 5-min bands (requires GTFS feed) |
-| Driving | `driving` | Valhalla auto matrix, exact seconds |
-
-The `fifteen` mode is computed entirely in memory during Phase 2: for each (grid point, category, subcategory) the minimum travel time across the three active modes is used, then scored normally. No extra routing jobs are needed.
-
-### Scoring formula
-
-Each subcategory *i* within a category contributes a sigmoid score based on the real travel time `t` and the selected threshold `T` (both in seconds):
-
-```
-sigmoid(t, T) = 1 / (1 + exp(4 × (t − T) / T))
-```
-
-The sigmoid equals 0.5 exactly at the threshold and approaches 1 for very short times. It is continuous, so a 14-minute trip to a park still contributes nearly as much as a 10-minute trip under a 15-minute threshold.
-
-The category score combines all subcategories via a **complement product**, weighted by per-profile subcategory importance weights `w_i ∈ [0, 1]`:
-
-```
-category_score = 1 − ∏ (1 − w_i × sigmoid(t_i, T))
-```
-
-This captures diversity of coverage: one nearby supermarket already yields a high score, but also having a pharmacy and a bakery pushes it higher. Missing subcategories (no POI found) are simply omitted from the product and do not penalise the score.
-
-### Profiles
+## Profiles

 Each profile carries two sets of weights:

- **Category weights** (used as slider presets in the UI, range 0–2): how much relative importance each of the 5 categories receives in the composite score.
- **Subcategory weights** (baked into precomputed scores, range 0–1): how strongly a specific subcategory contributes to its parent category score.
+- **Category weights** (slider presets in the UI, range 0–2): relative importance of each category in the composite score.
+- **Subcategory weights** (baked into precomputed scores, range 0–1): how strongly a specific subcategory contributes to its parent category score. Any subcategory not listed in a profile falls back to 0.5.

 | Profile | Emoji | Category weights | Notable subcategory overrides (vs. Universal) |
 |---------|-------|------------------|------------------------------------------------|
-| Universal | ⚖️ | All 1.0 | Baseline — see tables above for all weights |
-| Young Family | 👨‍👩‍👧 | Work & School 1.5, Recreation 1.4, Service 1.2, Culture 0.9, Transport 1.0 | school → 1.0, kindergarten → 1.0, playground → 1.0, clinic → 1.0, park → 1.0; gym → 0.5, university → 0.2, driving\_school → (inherits 0.2) |
+| Universal | ⚖️ | All 1.0 | Baseline — see category tables above for all weights |
+| Young Family | 👨‍👩‍👧 | Work & School 1.5, Recreation 1.4, Service 1.2, Culture 0.9, Transport 1.0 | school → 1.0, kindergarten → 1.0, playground → 1.0, clinic → 1.0, park → 1.0; gym → 0.5, university → 0.2 |
 | Senior | 🧓 | Culture & Community 1.5, Service 1.4, Transport 1.1, Recreation 1.0, Work & School 0.3 | hospital → 1.0, clinic → 1.0, pharmacy → 1.0, social\_services → 0.9, bus\_stop → 0.75, tram\_stop → 0.8, metro → 0.8; school → 0.05, kindergarten → 0.05, university → 0.15 |
 | Young Professional | 💼 | Transport 1.5, Recreation 1.1, Service 1.0, Culture 0.9, Work & School 0.7 | metro → 1.0, train\_station → 1.0, tram\_stop → 0.85, bike\_share → 0.7; gym → 0.9, restaurant → 0.75, coworking → 0.85; school → 0.1, kindergarten → 0.05 |
 | Student | 🎓 | Work & School 1.5, Transport 1.4, Culture & Community 1.2, Service 0.9, Recreation 0.8 | university → 1.0, library → 1.0, coworking → 0.9, bike\_share → 0.85, cafe → 0.9, metro → 1.0, train\_station → 0.9; school → 0.05, kindergarten → 0.05 |

-### Composite score
+## Implementation Details

-The composite shown on the heatmap is a weighted average of the 5 category scores. Category weights come from the selected profile but can be adjusted freely in the UI. **All scores are precomputed** — changing the profile, threshold, or travel mode only queries the database; adjusting the category weight sliders re-blends entirely client-side with no round-trip.
+### Data pipeline
+
+For each city the worker pipeline runs in two phases:
+
+**Phase 1 — Routing** (parallel child jobs, dispatched by `compute-scores`)
+
+- *Walking, cycling, driving* — 15 parallel jobs (3 modes × 5 categories). A PostGIS KNN lateral join finds the 6 spatially nearest POIs per grid point; those coordinates are sent to Valhalla's `sources_to_targets` matrix API in batches. The nearest POI per subcategory is persisted to `grid_poi_details`.
+- *Transit* — 1 job per city (`compute-transit`). Concurrent isochrone calls (8 at a time) to the dedicated transit Valhalla instance; PostGIS `ST_Within` classifies POIs into contour bands. Runs first so it overlaps with the routing jobs.
+
+**Phase 2 — Score aggregation**
+
+A single SQL CTE chain inside PostgreSQL computes all scores without streaming data through Node.js. Scores are precomputed for every combination of 5 thresholds × 5 travel modes × 5 profiles, then bulk-upserted into `grid_scores` with `INSERT ... ON CONFLICT DO UPDATE`.
+
+### Travel modes
+
+| Mode | Key | Source |
+|------|-----|--------|
+| Best mode | `fifteen` | Synthetic — `MIN(travel_time_s)` across walking, cycling, transit per subcategory during Phase 2. No extra routing calls. |
+| Walking | `walking` | Valhalla pedestrian matrix, exact seconds |
+| Cycling | `cycling` | Valhalla bicycle matrix, exact seconds |
+| Transit | `transit` | Valhalla multimodal isochrone, quantised to 5-min bands (requires GTFS feed) |
+| Driving | `driving` | Valhalla auto matrix, exact seconds |

 ### Per-location score (pin)