# Transportationer — 15-Minute City Analyzer
A web application for analyzing urban accessibility through the lens of the 15-minute city concept. It shows a heatmap of how quickly locations of interest can be reached across 5 categories: **Service & Trade**, **Transport**, **Work & School**, **Culture & Community**, and **Recreation**.
## Architecture
```
Browser (Next.js / React)
  ├── MapLibre GL JS (map + canvas heatmap / isochrone overlay)
  └── API calls → Next.js API routes

Next.js App Server
  ├── Public API: /api/cities /api/tiles /api/stats /api/location-score /api/isochrones
  ├── Admin API: /api/admin/** (auth-protected)
  ├── PostgreSQL + PostGIS (POIs, grid points, precomputed scores)
  └── Valkey (API response cache, BullMQ queues)

BullMQ Worker (download queue, concurrency 1)
  └── download-pbf → streams OSM PBF from Geofabrik (serialised to avoid
                     redundant parallel downloads; idempotent if file exists)

BullMQ Worker (pipeline queue, concurrency 8)
  ├── refresh-city    → orchestrates full ingest via FlowProducer
  ├── extract-pois    → osmium filter + osm2pgsql flex → raw_pois
  ├── generate-grid   → PostGIS 200 m rectangular grid → grid_points
  ├── compute-scores  → two-phase orchestrator (see Scoring below)
  ├── compute-routing → Valhalla matrix → grid_poi_details
  │                     (15 parallel jobs: 3 modes × 5 categories)
  └── compute-transit → Valhalla isochrones → grid_poi_details (travel_mode='transit')
                        (1 job per city, covers all categories via PostGIS spatial join)

BullMQ Worker (valhalla queue, concurrency 1): road-only instance
  └── build-valhalla → osmium clip + valhalla_build_tiles (road graph only, no transit
                       connections) → manages valhalla_service on :8002
                       Clean tiles ensure cycling/walking/driving routing is never
                       affected by ghost edges from failed transit connections.

BullMQ Worker (valhalla-transit queue, concurrency 1): transit instance
  ├── download-gtfs-de → downloads & filters GTFS feed for German ÖPNV (bbox-clipped to
  │                      known cities, single-stop trips removed)
  └── build-valhalla   → osmium clip + valhalla_ingest_transit + valhalla_convert_transit
                         + valhalla_build_tiles (road graph with transit connections)
                         → manages valhalla_service on :8002 (separate container/port)

Valhalla road instance (child process of valhalla worker, port 8002)
  ├── sources_to_targets matrix → compute-routing jobs (walking/cycling/driving)
  └── isochrone endpoint        → user click → /api/isochrones (non-transit modes)

Valhalla transit instance (child process of valhalla-transit worker, port 8002)
  ├── isochrone (multimodal) → compute-transit jobs
  └── isochrone endpoint     → user click → /api/isochrones (transit mode)

Protomaps → self-hosted map tiles (PMTiles)
```
## Quick Start
### 1. Configure environment
```bash
cp .env.example .env
# Edit .env with strong passwords
# Generate admin password hash
node -e "require('bcryptjs').hash('yourpassword', 12).then(console.log)"
# Paste result as ADMIN_PASSWORD_HASH in .env
```
### 2. Start services
```bash
docker compose up -d
```
### 3. Add a city
Open [http://localhost:3000/admin](http://localhost:3000/admin), log in, click **Add City**, browse Geofabrik regions (e.g. `europe/germany/berlin`), and start ingestion. Progress is shown live.
Processing time:
- Small city (< 100k pop): ~5–15 minutes
- Large city (1M+ pop): ~30–90 minutes
### 4. Explore
Open [http://localhost:3000](http://localhost:3000) and select your city.
## Map Tiles
By default the app uses CartoDB Positron (CDN). For fully offline operation, download a PMTiles file for your region:
```bash
# Example: download Berlin region tiles
wget https://maps.protomaps.com/builds/berlin.pmtiles -O apps/web/public/tiles/region.pmtiles
# Then switch to the PMTiles style:
cp apps/web/public/tiles/style.pmtiles.json apps/web/public/tiles/style.json
```
## Development
```bash
npm install
npm run dev # Next.js dev server on :3000
npm run worker:dev # BullMQ worker with hot reload
```
Required local services: PostgreSQL + PostGIS and Valkey. The easiest way to start them is:
```bash
docker compose up postgres valkey -d
```
## Methodology
The 15-minute city concept holds that a liveable urban neighbourhood is one where daily necessities are reachable within 15 minutes on foot or by bike. Transportationer operationalises this as a continuous, queryable accessibility score for every location in a city.
### What is measured
For each location (grid point) the tool asks: *how quickly can you reach the nearest representative destination in each category, by each travel mode?* The answer is a travel time in seconds, obtained from real routing data rather than straight-line distance.
Destinations are sourced from OpenStreetMap and grouped into five categories (**Service & Trade**, **Transport**, **Work & School**, **Culture & Community**, and **Recreation**), each split into subcategories (e.g. `supermarket`, `pharmacy`, `cafe`). Multiple OSM tags may map to the same subcategory (e.g. `shop=bakery` and `amenity=cafe` both map to `cafe`).
### Grid
Each city is covered by a regular rectangular grid at 200 m spacing, generated in Web Mercator (EPSG:3857) and projected back to WGS84. One score is stored per grid point per (category × travel mode × threshold × profile) combination. On the map each grid point is rendered as a circle.
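
As a rough illustration of the grid step, the sketch below lays out points at a fixed spacing in Web Mercator and converts each one back to WGS84 using the standard spherical Mercator formulas. The pipeline itself builds the grid in PostGIS; the function names and the bounding-box input here are hypothetical, and the spacing is measured in projected metres.

```typescript
// Hypothetical helper names; the real pipeline generates the grid in PostGIS.
const EARTH_RADIUS_M = 6378137; // sphere radius used by EPSG:3857

interface LonLat { lon: number; lat: number }
interface BBox { west: number; south: number; east: number; north: number }

// WGS84 degrees -> Web Mercator metres
function toMercator(p: LonLat): { x: number; y: number } {
  const x = (EARTH_RADIUS_M * p.lon * Math.PI) / 180;
  const y = EARTH_RADIUS_M * Math.log(Math.tan(Math.PI / 4 + (p.lat * Math.PI) / 360));
  return { x, y };
}

// Web Mercator metres -> WGS84 degrees
function toWgs84(x: number, y: number): LonLat {
  const lon = (x / EARTH_RADIUS_M) * (180 / Math.PI);
  const lat = (2 * Math.atan(Math.exp(y / EARTH_RADIUS_M)) - Math.PI / 2) * (180 / Math.PI);
  return { lon, lat };
}

// Regular rectangular grid anchored at the south-west corner of the bbox.
function generateGrid(bbox: BBox, spacingM = 200): LonLat[] {
  const min = toMercator({ lon: bbox.west, lat: bbox.south });
  const max = toMercator({ lon: bbox.east, lat: bbox.north });
  const cols = Math.floor((max.x - min.x) / spacingM) + 1;
  const rows = Math.floor((max.y - min.y) / spacingM) + 1;
  const points: LonLat[] = [];
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      points.push(toWgs84(min.x + c * spacingM, min.y + r * spacingM));
    }
  }
  return points;
}
```
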
### Routing
Travel times are obtained from [Valhalla](https://github.com/valhalla/valhalla), a real-network routing engine built on OSM data:
- **Walking, cycling, driving:** Valhalla's `sources_to_targets` matrix endpoint. For each grid point the 6 spatially nearest POIs in the category are sent as targets; the resulting travel-time matrix gives the exact routed time to each. The nearest POI *per subcategory* is retained.
- **Transit:** Valhalla's matrix endpoint does not support transit. Instead, a multimodal isochrone is computed per grid point at contour intervals of 5, 10, 15, 20, and 30 minutes (fixed departure: next Tuesday 08:00 for reproducible GTFS results). PostGIS `ST_Within` then classifies every POI in the city into the smallest contour it falls within, giving estimated times of 300 / 600 / 900 / 1200 / 1800 seconds. Grid points outside the transit network are silently skipped and receive no transit score.
- **Best mode (`fifteen`):** a synthetic mode computed during score aggregation: for each (grid point, subcategory) the minimum travel time across walking, cycling, and transit is used. Driving is excluded intentionally. No extra routing calls are needed.
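
The transit quantisation and the best-mode rule can be sketched in a few lines of TypeScript. This is only an illustration: in the pipeline the contour classification is done by PostGIS `ST_Within` and the minimum is taken in SQL, and the function names and input shapes below are hypothetical.

```typescript
type Mode = "walking" | "cycling" | "transit" | "driving";

// Contour intervals from the text, in minutes.
const CONTOUR_MINUTES = [5, 10, 15, 20, 30] as const;

// Estimated transit time in seconds for one POI, given the contour sizes
// (in minutes) whose polygons contain it. Returns null if no contour does.
function transitBandSeconds(containingContours: number[]): number | null {
  const hits = CONTOUR_MINUTES.filter((m) => containingContours.includes(m));
  return hits.length > 0 ? Math.min(...hits) * 60 : null;
}

// Best mode: minimum over walking, cycling, and transit. Driving is ignored.
function bestModeSeconds(times: Partial<Record<Mode, number>>): number | null {
  const candidates = [times.walking, times.cycling, times.transit].filter(
    (t): t is number => t !== undefined,
  );
  return candidates.length > 0 ? Math.min(...candidates) : null;
}
```

A POI inside the 15-, 20-, and 30-minute contours is assigned 900 seconds, and a faster driving time never lowers the best-mode value.
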
### Scoring formula
All scores are precomputed at ingest time for every combination of threshold (5 / 10 / 15 / 20 / 30 min), travel mode, and profile, so interactive queries hit only the database.
Each subcategory *i* contributes a proximity score based on travel time `t` and threshold `T` (both in seconds) using exponential decay:
```
score(t, T) = exp(-3 × t / T)
```
At t = 0 the score is 1.0. At the threshold it is exp(-3) ≈ 0.05, so a POI reachable in exactly the threshold time barely contributes. Close proximity dominates: a third of the threshold away scores ~0.37, halfway scores ~0.22. This ensures that genuinely nearby POIs are rated much more highly than merely reachable ones.
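
A minimal TypeScript sketch of the decay function, handy for checking the figures quoted above. The function name is illustrative, and the negative exponent is inferred from the stated values (1.0 at t = 0, about 0.05 at the threshold).

```typescript
// Exponential-decay proximity score; the name is illustrative, not from the codebase.
function proximityScore(travelTimeS: number, thresholdS: number): number {
  return Math.exp((-3 * travelTimeS) / thresholdS);
}

const thresholdS = 15 * 60; // 15-minute threshold in seconds

const atOrigin = proximityScore(0, thresholdS);                // exactly 1
const atThird = proximityScore(thresholdS / 3, thresholdS);    // exp(-1), about 0.37
const atHalf = proximityScore(thresholdS / 2, thresholdS);     // exp(-1.5), about 0.22
const atThreshold = proximityScore(thresholdS, thresholdS);    // exp(-3), about 0.05
```
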
The category score aggregates across subcategories **and** across multiple nearby POIs of the same subcategory via a **complement product** weighted by profile-specific importance weights `w_i ∈ [0, 1]`:
```
category_score = 1 - ∏ (1 - w_i × score(t_i, T))
```
This captures both subcategory coverage (a pharmacy and a supermarket together score higher than either alone) and within-subcategory diversity (a second nearby park still improves the score, with strongly diminishing returns). Subcategories with no POI found contribute nothing and do not penalise the score.
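
The complement product can be sketched as follows, reading the formula as one minus the product of the `(1 - w_i × score)` terms. The type and function names are hypothetical; the project computes this in SQL.

```typescript
interface PoiContribution {
  weight: number; // profile weight w_i in [0, 1]
  score: number;  // proximity score(t_i, T) in [0, 1]
}

// Complement product: each POI reduces the remaining "unserved" share.
function categoryScore(contributions: PoiContribution[]): number {
  const unserved = contributions.reduce(
    (product, c) => product * (1 - c.weight * c.score),
    1,
  );
  return 1 - unserved;
}
```

With a pharmacy and a supermarket that each have weight 1.0 and proximity score 0.5, either alone yields 0.5 and both together yield 0.75. An empty list yields 0, which matches the statement that missing subcategories contribute nothing.
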
The **composite score** shown on the heatmap is a weighted average of all five category scores. Category weights come from the selected profile but can be adjusted freely with the UI sliders. Changing the profile, threshold, or travel mode re-queries the database; adjusting the sliders re-blends client-side with no server round-trip.
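
A sketch of the client-side re-blend, assuming a plain weighted average normalised by the sum of the weights. The category keys and the function name are hypothetical.

```typescript
// Category keys are assumed for illustration.
type Category =
  | "service_trade"
  | "transport"
  | "work_school"
  | "culture_community"
  | "recreation";

// Weighted average of category scores; returns 0 when all weights are 0.
function compositeScore(
  scores: Record<Category, number>,
  weights: Record<Category, number>,
): number {
  let weightedSum = 0;
  let weightTotal = 0;
  for (const key of Object.keys(scores) as Category[]) {
    weightedSum += weights[key] * scores[key];
    weightTotal += weights[key];
  }
  return weightTotal > 0 ? weightedSum / weightTotal : 0;
}
```

Because only the weights change when a slider moves, the category scores already held by the client can be re-blended without another request.
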
## Category Definitions
The five categories and their subcategories are defined below. All categories are scored at the same user-selected threshold. The **universal weight** column shows the subcategory importance weight for the Universal profile (range 0–1); other profiles override specific values, as listed in the table in the Profiles section.
### Service & Trade
| OSM tag(s) | Subcategory | Universal weight |
|---|---|:---:|
| `shop=supermarket`, `shop=wholesale` | `supermarket` | 1.0 |
| `shop=convenience` | `convenience` | 0.65 |
| `amenity=pharmacy`, `shop=pharmacy` | `pharmacy` | 1.0 |
| `amenity=restaurant`, `amenity=fast_food` | `restaurant` | 0.55 |
| `amenity=cafe`, `shop=bakery` | `cafe` | 0.4 |
| `amenity=bank` | `bank` | 0.35 |
| `amenity=post_office` | `post_office` | 0.4 |
| `shop=greengrocer`, `shop=butcher`, `amenity=marketplace`, `shop=department_store`, `shop=mall` | `market` | 0.4 |
| `shop=laundry`, `shop=dry_cleaning` | `laundry` | 0.3 |
| `amenity=atm` | `atm` | 0.2 |
### Transport
| OSM tag(s) | Subcategory | Universal weight |
|---|---|:---:|
| `railway=station`, `railway=halt` | `train_station` | 1.0 |
| `railway=subway_entrance`, `railway=subway_station` | `metro` | 1.0 |
| `railway=tram_stop` | `tram_stop` | 0.75 |
| `highway=bus_stop`, `amenity=bus_station` | `bus_stop` | 0.55 |
| `amenity=ferry_terminal` | `ferry` | 0.5 |
| `amenity=bicycle_rental` | `bike_share` | 0.5 |
| `amenity=car_sharing` | `car_share` | 0.5 |
**Excluded:** `public_transport=stop_position` and `public_transport=platform`. These duplicate `highway=bus_stop` / `railway=tram_stop` nodes at the exact same location and would double-count stops in scoring.
### Work & School
| OSM tag(s) | Subcategory | Universal weight |
|---|---|:---:|
| `amenity=kindergarten`, `amenity=childcare` | `kindergarten` | 0.75 |
| `amenity=school` | `school` | 0.7 |
| `office=coworking` | `coworking` | 0.55 |
| `amenity=university`, `amenity=college` | `university` | 0.55 |
| `amenity=driving_school` | `driving_school` | 0.2 |
**Excluded:** `office=company`, `office=government`, `landuse=commercial`, `landuse=office`. These are ubiquitous in every urban block; they always win the "nearest POI" race in the detail view, masking meaningful destinations. Government buildings are captured via `amenity=townhall` / `amenity=police` in Culture & Community instead.
### Culture & Community
| OSM tag(s) | Subcategory | Universal weight |
|---|---|:---:|
| `amenity=hospital` | `hospital` | 1.0 |
| `amenity=clinic`, `amenity=doctors` | `clinic` | 0.8 |
| `amenity=library` | `library` | 0.7 |
| `amenity=community_centre`, `leisure=arts_centre` | `community_center` | 0.6 |
| `amenity=social_facility` | `social_services` | 0.6 |
| `amenity=theatre`, `amenity=cinema` | `theatre` | 0.5 |
| `tourism=museum` | `museum` | 0.4 |
| `amenity=townhall`, `amenity=police` | `government` | 0.4 |
| `amenity=place_of_worship` | `place_of_worship` | 0.25 |
### Recreation
| OSM tag(s) | Subcategory | Universal weight |
|---|---|:---:|
| `leisure=park`, `leisure=garden` | `park` | 1.0 |
| `leisure=playground` | `playground` | 0.85 |
| `leisure=sports_centre`, `leisure=pitch` | `sports_facility` | 0.65 |
| `leisure=fitness_centre` | `gym` | 0.65 |
| `leisure=nature_reserve`, `leisure=golf_course`, `landuse=recreation_ground`, `landuse=grass`, `landuse=meadow`, `landuse=forest` | `green_space` | 0.6 |
| `leisure=swimming_pool`, `amenity=swimming_pool` | `swimming_pool` | 0.55 |
## Profiles
Each profile carries two sets of weights:
- **Category weights** (slider presets in the UI, range 0–2): relative importance of each category in the composite score.
- **Subcategory weights** (baked into precomputed scores, range 0–1): how strongly a specific subcategory contributes to its parent category score. Any subcategory not listed in a profile falls back to 0.5.
| Profile | Emoji | Category weights | Notable subcategory overrides (vs. Universal) |
|---------|-------|------------------|------------------------------------------------|
| Universal | | All 1.0 | Baseline; see category tables above for all weights |
| Young Family | 👨👩👧 | Work & School 1.5, Recreation 1.4, Service 1.2, Culture 0.9, Transport 1.0 | school 1.0, kindergarten 1.0, playground 1.0, clinic 1.0, park 1.0; gym 0.5, university 0.2 |
| Senior | 🧓 | Culture & Community 1.5, Service 1.4, Transport 1.1, Recreation 1.0, Work & School 0.3 | hospital 1.0, clinic 1.0, pharmacy 1.0, social\_services 0.9, bus\_stop 0.75, tram\_stop 0.8, metro 0.8; school 0.05, kindergarten 0.05, university 0.15 |
| Young Professional | 💼 | Transport 1.5, Recreation 1.1, Service 1.0, Culture 0.9, Work & School 0.7 | metro 1.0, train\_station 1.0, tram\_stop 0.85, bike\_share 0.7; gym 0.9, restaurant 0.75, coworking 0.85; school 0.1, kindergarten 0.05 |
| Student | 🎓 | Work & School 1.5, Transport 1.4, Culture & Community 1.2, Service 0.9, Recreation 0.8 | university 1.0, library 1.0, coworking 0.9, bike\_share 0.85, cafe 0.9, metro 1.0, train\_station 0.9; school 0.05, kindergarten 0.05 |
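
The 0.5 fallback for unlisted subcategories could look like the sketch below. The map holds only the overrides shown in the Senior row above, so it is not the full profile, and all names are hypothetical.

```typescript
const DEFAULT_SUBCATEGORY_WEIGHT = 0.5;

type SubcategoryWeights = Partial<Record<string, number>>;

// Only the values listed in the Senior row of the table; not the complete profile.
const seniorWeights: SubcategoryWeights = {
  hospital: 1.0,
  clinic: 1.0,
  pharmacy: 1.0,
  social_services: 0.9,
  bus_stop: 0.75,
  tram_stop: 0.8,
  metro: 0.8,
  school: 0.05,
  kindergarten: 0.05,
  university: 0.15,
};

// Look up a subcategory weight, falling back to 0.5 when the profile omits it.
function subcategoryWeight(profile: SubcategoryWeights, subcategory: string): number {
  return profile[subcategory] ?? DEFAULT_SUBCATEGORY_WEIGHT;
}
```
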
## Implementation Details
### Data pipeline
For each city the worker pipeline runs in two phases:
**Phase 1 — Routing** (parallel child jobs, dispatched by `compute-scores`)
- *Walking, cycling, driving:* 15 parallel jobs (3 modes × 5 categories). A PostGIS KNN lateral join finds the 6 spatially nearest POIs per grid point in the category; those coordinates are sent to Valhalla's `sources_to_targets` matrix API in batches. The nearest POI per subcategory is persisted to `grid_poi_details`.
- *Transit:* 1 job per city (`compute-transit`). Concurrent isochrone calls (8 at a time) to the dedicated transit Valhalla instance; PostGIS `ST_Within` classifies POIs into contour bands. Runs first so it overlaps with the routing jobs.
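
The transit isochrones use a fixed departure of next Tuesday 08:00 (see Routing). One way to derive that timestamp is sketched below. It assumes "next Tuesday" means the first Tuesday strictly after the reference date, and it formats the result as `YYYY-MM-DDTHH:MM`, the local-time form Valhalla accepts in `date_time.value`. The function name is hypothetical.

```typescript
const TUESDAY = 2; // Date#getUTCDay(): 0 = Sunday ... 6 = Saturday

// Returns the first Tuesday strictly after `from`, at 08:00, as "YYYY-MM-DDTHH:MM".
// Calendar arithmetic is done on the UTC date parts of `from`.
function nextTuesdayDeparture(from: Date): string {
  const day = new Date(
    Date.UTC(from.getUTCFullYear(), from.getUTCMonth(), from.getUTCDate()),
  );
  // 1..7 days ahead; a Tuesday maps to the following week's Tuesday.
  const daysAhead = (TUESDAY - day.getUTCDay() + 7) % 7 || 7;
  day.setUTCDate(day.getUTCDate() + daysAhead);
  return `${day.toISOString().slice(0, 10)}T08:00`;
}
```
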
**Phase 2 — Score aggregation**
A single SQL CTE chain inside PostgreSQL computes all scores without streaming data through Node.js. Precomputed for every combination of 5 thresholds × 5 travel modes × 5 profiles, then bulk-inserted into `grid_scores` via `ON CONFLICT DO UPDATE`.
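
To make the size of the precomputation concrete, the sketch below enumerates the threshold, mode, and profile combinations. The mode keys come from the Travel modes table; the profile keys are assumed spellings of the profile names.

```typescript
const THRESHOLDS_MIN = [5, 10, 15, 20, 30];
const TRAVEL_MODES = ["fifteen", "walking", "cycling", "transit", "driving"];
// Profile keys are assumptions based on the profile names in this README.
const PROFILES = ["universal", "young_family", "senior", "young_professional", "student"];

interface ScoreCombination {
  thresholdMin: number;
  mode: string;
  profile: string;
}

// Every (threshold, mode, profile) combination that is precomputed.
const combinations: ScoreCombination[] = [];
for (const thresholdMin of THRESHOLDS_MIN) {
  for (const mode of TRAVEL_MODES) {
    for (const profile of PROFILES) {
      combinations.push({ thresholdMin, mode, profile });
    }
  }
}
```

That gives 125 combinations; with one row per category, each grid point would carry 625 rows under the storage layout described in the Grid section.
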
### Travel modes
| Mode | Key | Source |
|------|-----|--------|
| Best mode | `fifteen` | Synthetic: `MIN(travel_time_s)` across walking, cycling, transit per subcategory during Phase 2. No extra routing calls. |
| Walking | `walking` | Valhalla pedestrian matrix, exact seconds |
| Cycling | `cycling` | Valhalla bicycle matrix, exact seconds |
| Transit | `transit` | Valhalla multimodal isochrone, quantised to 5-min bands (requires GTFS feed) |
| Driving | `driving` | Valhalla auto matrix, exact seconds |
### Per-location score (pin)
When a user places a pin on the map:
1. The nearest grid point is found via a PostGIS `<->` KNN query.
2. Precomputed `grid_scores` rows for that grid point, travel mode, threshold, and profile are returned, one row per category.
3. Per-subcategory detail rows from `grid_poi_details` are also fetched, showing the name, straight-line distance, and travel time to the nearest POI in each subcategory for the requested mode.
4. An isochrone overlay is fetched live from Valhalla and shown on the map. For `transit` mode the multimodal isochrone comes from the dedicated transit Valhalla instance. For `fifteen` (Best mode), cycling is used as the representative display isochrone since Valhalla's interactive isochrone only supports single-mode costing.
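
Step 4 can be sketched as a mapping from travel mode to a Valhalla costing model plus a request body. The `locations`, `costing`, `contours`, and `polygons` fields are part of Valhalla's isochrone request format; the function names are hypothetical, and the real route may send additional options (a multimodal request typically also carries a `date_time`).

```typescript
type TravelMode = "fifteen" | "walking" | "cycling" | "transit" | "driving";

// Costing model per mode. `fifteen` uses cycling as the display isochrone.
const COSTING: Record<TravelMode, string> = {
  fifteen: "bicycle",
  walking: "pedestrian",
  cycling: "bicycle",
  transit: "multimodal",
  driving: "auto",
};

// Which Valhalla instance should serve the request.
function valhallaInstance(mode: TravelMode): "road" | "transit" {
  return mode === "transit" ? "transit" : "road";
}

// Minimal isochrone request body for a single pin and a single contour.
function isochroneRequest(mode: TravelMode, lat: number, lon: number, minutes: number) {
  return {
    locations: [{ lat, lon }],
    costing: COSTING[mode],
    contours: [{ time: minutes }], // contour size in minutes
    polygons: true,                // ask for polygons rather than linestrings
  };
}
```
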
The pin panel also shows estate value data (land price in €/m² from the BORIS NI cadastre) for cities in Lower Saxony, including a percentile rank among all zones in the city and a "peer percentile" rank among zones with similar accessibility scores.
### Hidden gem score
For cities with BORIS NI estate value data, a **hidden gem score** is precomputed per grid point at the end of Phase 2:
```
hidden_gem_score = composite_accessibility × (1 - price_rank_within_decile)
```
- `composite_accessibility`: average of all category scores for that grid point (walking / 15 min / universal profile)
- `price_rank_within_decile`: `PERCENT_RANK()` of the nearest zone's land price among all zones in the same accessibility decile (0 = cheapest, 1 = most expensive relative to equally accessible peers)
The result is in [0, 1]: high only when a location is both accessible *and* priced below its peers. Stored in `grid_points.hidden_gem_score` and served as a separate MVT overlay at `/api/tiles/hidden-gems/`.
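
An in-memory sketch of the same calculation, reading the formula as accessibility multiplied by one minus the price rank. It simplifies in two labelled ways: deciles are taken as fixed-width bins of the accessibility score, and prices are ranked among grid points rather than zones. The rank follows SQL `PERCENT_RANK()` semantics, `(rank - 1) / (n - 1)`. All names are hypothetical.

```typescript
interface GridPoint {
  id: string;
  accessibility: number; // composite accessibility in [0, 1]
  landPrice: number;     // land price of the nearest zone, in €/m²
}

// Assumption: deciles are fixed-width bins (0.0-0.1 -> 0, ..., 0.9-1.0 -> 9).
function decileOf(accessibility: number): number {
  return Math.min(9, Math.floor(accessibility * 10));
}

// SQL-style PERCENT_RANK(): share of peers strictly cheaper; 0 for a lone row.
function percentRank(value: number, peers: number[]): number {
  if (peers.length <= 1) return 0;
  const cheaper = peers.filter((p) => p < value).length;
  return cheaper / (peers.length - 1);
}

function hiddenGemScores(points: GridPoint[]): Map<string, number> {
  const pricesByDecile = new Map<number, number[]>();
  for (const p of points) {
    const d = decileOf(p.accessibility);
    pricesByDecile.set(d, [...(pricesByDecile.get(d) ?? []), p.landPrice]);
  }
  const result = new Map<string, number>();
  for (const p of points) {
    const peers = pricesByDecile.get(decileOf(p.accessibility)) ?? [];
    result.set(p.id, p.accessibility * (1 - percentRank(p.landPrice, peers)));
  }
  return result;
}
```
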
The map offers three mutually exclusive base overlays (switchable in the control panel):
- **Accessibility:** default grid heatmap coloured by composite score
- **Land value:** BORIS NI zones coloured by €/m² (Lower Saxony cities only)
- **Hidden gems:** grid points coloured by hidden gem score (Lower Saxony cities only)