This is the reference companion to Measuring New York. Each chapter ships with a tight methodology section pointing back here for the full version. Algorithm details, scoring frameworks, vintage decisions, and the long-form caveats that would drown each chapter's prose all live in this document.
It is a living reference — it grows as new chapters ship. Sections marked Forthcoming are placeholders for chapters that haven't been written yet. The series-wide conventions in §1 hold across every chapter unless a chapter explicitly notes an exception.
1. Series-wide conventions
These hold across every chapter unless the chapter's own methodology section notes an exception.
Default geographic unit
Community Districts (59). The 59 residential CDs are the default unit of comparison. CDs are big enough that statistical rates are stable, small enough to correspond to actual lived neighborhoods, and they're the unit the city itself uses for quality-of-life review. Where a chapter needs a finer grain (heat vulnerability, asthma, the intra-CD variance work in Ch. 1), it drops to 2020 Census tracts (~2,303 in NYC after the five-borough filter) and says so explicitly. The DCP Community Districts file (dataset 5crt-au7u, version 26a) also contains 12 Joint Interest Areas (parks and airports such as JFK); we exclude JIAs from analysis because they have essentially no residential population.
Projection
- Spatial math: EPSG:2263 (NY Long Island State Plane, feet). Distance, area, and buffer operations happen in this projection.
- Output / display: EPSG:4326 (WGS84 lat/lon). All published GeoJSONs are in 4326.
Vintage freeze
All chapters use the same pinned dataset vintages, declared in measuring-new-york/MANIFEST.json in the analysis repo. Pin date: 2026-06-01. The four primary vintages:
- ACS 5-year 2019–2023 release (Census Bureau)
- NYC Open Data snapshot 2026-06-01 (Socrata; all SoQL queries filter <= 2026-06-01T00:00:00.000)
- MTA GTFS 2026-06-01 weekly bundle
- OpenStreetMap Overpass 2026-06-01 snapshot
A chapter that needs a different vintage notes the exception inline. Chapter 0 used a 2026-05-01 NYC Open Data snapshot because the pinned 2026-06-01 was in the future at publication time; it will be re-run against the locked vintage once that date passes.
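The vintage filter on NYC Open Data can be sketched as a SoQL query URL. This is a minimal illustration, not the repo's actual query code: the helper name `pinned_query_url` is hypothetical, and the date column (`created_date`) is an assumption that has to be checked per dataset.

```python
from urllib.parse import urlencode

# Series pin date, written the way the SoQL filter above expresses it.
PIN = "2026-06-01T00:00:00.000"

def pinned_query_url(dataset_id, date_column, pin=PIN, limit=1000):
    """Build a Socrata URL that keeps only rows dated on or before the pin.

    dataset_id  : Socrata four-by-four id, e.g. "erm2-nwe9"
    date_column : name of the dataset's timestamp column (assumed, per dataset)
    """
    params = {
        "$where": f"{date_column} <= '{pin}'",
        "$limit": limit,
    }
    base = f"https://data.cityofnewyork.us/resource/{dataset_id}.json"
    return base + "?" + urlencode(params)

url = pinned_query_url("erm2-nwe9", "created_date")
print(url)
```

Keeping the pin in one constant means a chapter with a vintage exception only has to pass a different `pin` value, which keeps the exception visible at the call site.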
Scoring rule
Cross-chapter composites are reserved for Chapter 9. The synthesis chapter at the end of the series will ship three composite indices built around three different personas — explicitly to refuse the single-ranking trap. No earlier chapter ships a number that mixes mobility with housing with environment with safety.
Within-chapter composites are permitted, provided (a) the components are named, (b) the weights are stated, (c) the rationale for the weights is grounded in cited research rather than analyst preference. Examples currently in the series: Ch. 1's Mobility Access Score (a single-dimension reach metric), Ch. 2's Housing-stress score (which combines that chapter's two genuinely-independent dimensions — affordability + structural distress — with HUD-grounded 70/30 weights), and Ch. 4's daily-needs score (the geometric mean of a district's percentile rank on proximity and on sufficiency, equal-weighted and combined multiplicatively — so strength on one half can't compensate for weakness on the other, the logic being that daily-needs access is gated by whichever is scarcer).
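To make the multiplicative logic of the daily-needs score concrete, here is a small sketch that compares a geometric mean with an arithmetic mean of two percentile ranks. The function names and the tie-handling rule for percentile ranks are assumptions for illustration; the chapter's own METHODOLOGY.md is the reference for the real computation.

```python
from math import sqrt

def percentile_ranks(values):
    """Percentile rank (0-100] for each value; tied values share their mean rank.

    This ranking convention is an assumption, not taken from the series.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = 100.0 * mean_rank / n
        i = j + 1
    return ranks

def daily_needs_score(proximity_pct, sufficiency_pct):
    """Equal-weighted geometric mean of two percentile ranks."""
    return sqrt(proximity_pct * sufficiency_pct)

def arithmetic_score(proximity_pct, sufficiency_pct):
    """Plain average, shown only for contrast."""
    return (proximity_pct + sufficiency_pct) / 2

# Two made-up districts with the same arithmetic mean.
lopsided = (90.0, 10.0)   # very walkable, badly under-supplied
balanced = (50.0, 50.0)
print(arithmetic_score(*lopsided), arithmetic_score(*balanced))
print(daily_needs_score(*lopsided), daily_needs_score(*balanced))
```

Both districts average 50 arithmetically, but the geometric mean puts the lopsided district at 30 and the balanced one at 50, which is the "gated by whichever is scarcer" behavior described above.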
Reproducibility
Every chart in the series ships with a comment in its source MDX pointing back to the analysis-repo notebook cell that produced it. The exact numbers in any chapter can be reproduced with make chapter-N against the pinned MANIFEST.json. The analysis repo is at github.com/shanvann/measuring-new-york.
2. How this differs from popular livability indices
A short tour of how livability is typically measured, and where this series sits in the design space.
Major published indices
- EIU Global Liveability Index (The Economist Intelligence Unit) — 173 cities; 30 factors across 5 categories (stability, healthcare, culture & environment, education, infrastructure) collapsed to a single 0–100 score. Vienna usually wins. Cross-city ranking; single composite.
- Mercer Quality of Living — 39 factors / 10 categories, primarily designed for corporate expat-compensation decisions. Heavy knowledge-worker bias built into the weighting.
- Monocle Quality of Life Survey — editorial; 25 cities; design-led criteria (transit, urban design, retail, climate). Useful but methodologically opaque.
- Numbeo Quality of Life Index — crowd-sourced (purchasing power, safety, healthcare, cost of living, commute, pollution). Open data; uneven sample sizes per city.
- AARP Livability Index (US) — the closest published analogue to what Measuring New York does. Census-tract level for the entire US; 50 indicators across 7 categories (housing, neighborhood, transportation, environment, health, engagement, opportunity). Sub-city granular, persona-aware.
- OECD Better Life Index — 11 topics, user-weighted (the reader slides the importance of each dimension). The user-weighting twist is one of the cleanest responses to the "no single weighting fits everyone" problem Chapter 9 is built around.
- Sustainable Cities Index (Arcadis) and Social Progress Index — three-pillar frameworks (people/planet/profit and basic-needs/wellbeing/opportunity respectively); used more in policy reports than consumer rankings.
Methodological frames worth knowing
- Capability Approach (Sen, Nussbaum) — livability as the freedom to do or be what one values, not as a checklist of amenities. Less an index than a critique of the index format.
- Healthy Streets (Saunders, Transport for London) — 10 indicators for human-centered street design at sub-neighborhood granularity. Operational, not comparative.
- Walk Score / Bike Score — single-dimension access scores; commercial; one slice of the mobility dimension Chapter 1 covers more broadly.
- Knight Soul of the Community (Gallup) — surveyed what emotionally attaches residents to places. Found aesthetics, social offerings, and openness mattered more for attachment than services. A reminder that what's measurable ≠ what matters.
NYC-specific references
- NYU Furman Center's State of New York City's Housing & Neighborhoods — annual; closest existing CD-level NYC reference; Chapter 2 leans on this implicitly.
- NYC Mayor's Management Report — agency-by-agency service quality (response times, complaint resolution). Raw input for an index, not an index itself.
- Citizens Budget Commission State of the City — fiscal + service-delivery reports.
- OneNYC sustainability plan KPIs — the city's own self-measurement.
- Eviction Lab (Princeton) — already in the series via OCA filings (Chapter 2).
How Measuring New York differs
Where on each axis this series sits is what distinguishes it from the published indices above:
| Axis | Most published indices | Measuring New York |
|---|---|---|
| Geography | City-level cross-comparison | Within-city, CD-level |
| Objective vs subjective | Both, blended | Objective only |
| Composite vs disaggregated | One composite ranking | Disaggregated until Ch. 9 |
| Universal vs persona | Universal | Persona-based (3 composites in Ch. 9) |
| Aggregation | City average | Neighborhood-by-neighborhood |
| Reproducibility | Closed methods | Fully reproducible, public datasets |
The closest design-space neighbor is AARP's Livability Index, which is also sub-city, indicator-based, and US-only. The biggest differentiator is the refusal to publish one universal ranking: Chapter 9 plans to ship three composites built around three different personas (e.g., a young renter, a family with school-age children, a retiree) rather than one "livability of NYC" score. That choice aligns more with the OECD's user-weighting philosophy than with the EIU/Mercer single-ranking tradition, and it is the explicit rejection of the single-number trap baked into the series from Chapter 0.
3. Per-chapter methodology
This section used to enumerate every chapter's methodology inline.
It outgrew that role somewhere around Chapter 2. Each chapter's full
methodology now lives next to the code that produces it — a
METHODOLOGY.md file in the analyses/chapter-NN/ directory of the
data repository. The summary table below points at each chapter's
canonical methodology document; the per-chapter <MethodologyFooter>
at the end of each chapter post also links to its own doc directly.
| Chapter | Topic | Full methodology |
|---|---|---|
| 0 | Pilot · What does "livable" even mean? | analyses/chapter-00/ (pilot — no METHODOLOGY.md, see notebook) |
| 1 | Mobility & Access | analyses/chapter-01/METHODOLOGY.md |
| 2 | Housing Stability & Affordability | analyses/chapter-02/METHODOLOGY.md |
| 3 | Environmental Quality · Breathing Room | analyses/chapter-03/METHODOLOGY.md |
| 4 | Access to Daily Needs · Close Enough? | analyses/chapter-04/METHODOLOGY.md |
| 5–9 | Forthcoming | — |
Each per-chapter document covers: data sources (with exact filters and snapshot dates), geographic aggregation, per-metric formulas, an enumerated caveats list (typically 6–10 items), and a methodology changelog of how the chapter's measurements evolved. The advantage of keeping these next to the code is locality — when a metric changes, the methodology revision lives in the same commit as the notebook revision, so the two can't drift.
Pilot (Chapter 0) note
Chapter 0 was a teaser visual (311 service requests per square mile
by community district, over the 30 days ending on the snapshot date)
and doesn't have a full METHODOLOGY.md. Source: NYC Open Data erm2-nwe9.
Vintage exception: used a 2026-05-01 snapshot instead of the
series-pinned 2026-06-01 because publication preceded the pinned date.
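The pilot metric reduces to a trailing date window and a unit conversion. The sketch below assumes district areas are measured in square feet (the unit of the EPSG:2263 projection from §1); the function names and the example numbers are made up, and whether the window includes both end dates is not specified here.

```python
from datetime import date, timedelta

SQFT_PER_SQMI = 5280 ** 2  # 27,878,400 square feet in one square mile

def window_start(snapshot, days=30):
    """First day of the trailing window that ends on the snapshot date."""
    return snapshot - timedelta(days=days)

def requests_per_sq_mile(request_count, area_sqft):
    """Convert a raw request count and an area in square feet to a density."""
    return request_count / (area_sqft / SQFT_PER_SQMI)

start = window_start(date(2026, 5, 1))
# Hypothetical district: 1,200 requests over exactly 3 square miles.
density = requests_per_sq_mile(1200, 3 * SQFT_PER_SQMI)
print(start, density)
```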
4. Visualization conventions
- Default visual: static. PNG choropleths produced by matplotlib in the analysis repo, with the series palette applied via shared.palette.for_matplotlib().
- Interactive maps: Chapters 1 (mobility isochrones), 3 and 4 (per-CD choropleths with an axis-toggle tab strip), and 9 (the synthesis friction map). Implemented via MapLibre + deck.gl, dynamically imported so pages without a map don't pay the bundle cost. Chapter 4 added a categorical treatment (the walkable-vs-served quadrant map, discrete colors plus a legend) and a green-blue-grey ramp for axes where high values read as positive (more access = better) rather than worse.
- Color ramp: the series sequential ramp (light tan → deep red, 8 stops) is defined in measuring-new-york/shared/palette.py and mirrored in src/components/nyc-map-impl.tsx. Percentile-clipped at the 5th / 95th percentile so a few outliers don't flatten the gradient.
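The effect of percentile clipping on an 8-stop ramp can be sketched in a few lines. This is an illustration under assumptions, not the code in shared/palette.py: it uses linear-interpolated percentiles and equal-width bins between the clip bounds, and the function names are hypothetical.

```python
def percentile(sorted_vals, q):
    """Linear-interpolated percentile (q in 0..100) of an already sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def ramp_stops(values, n_stops=8, lo_q=5, hi_q=95):
    """Assign each value a ramp stop index in 0..n_stops-1 after clipping."""
    s = sorted(values)
    lo, hi = percentile(s, lo_q), percentile(s, hi_q)
    span = hi - lo
    stops = []
    for v in values:
        clipped = min(max(v, lo), hi)               # pull outliers to the bounds
        t = (clipped - lo) / span if span else 0.0  # position in 0..1
        stops.append(min(int(t * n_stops), n_stops - 1))
    return stops

# 100 ordinary values plus one extreme outlier.
values = list(range(100)) + [10_000]
stops = ramp_stops(values)
print(sorted(set(stops)))
```

Without clipping, the single outlier would stretch the scale so far that the other 100 values collapse into the lightest stop; with clipping they still spread across all eight.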
5. Glossary
- ACS — American Community Survey, the U.S. Census Bureau's annual rolling sample. The 5-year release (2019–2023) is more stable than the 1-year for small geographies like census tracts.
- Bodega / corner store — A small neighborhood food store. In Chapter 4, a licensed food store under 5,000 sq ft — distinguished from a full-service grocery, which is large enough to carry a real fresh-produce section. About 84% of NYC's licensed food stores fall under the cutoff.
- CD — Community District. NYC has 59 residential CDs + 12 Joint Interest Areas; the series uses the 59 residential ones.
- Daily-needs score — Chapter 4's within-chapter composite: the geometric mean of a district's percentile rank on proximity and on sufficiency (childcare seats per child blended with grocery share). Multiplicative, so a low mark on either half pulls the score down. See §1 Scoring rule.
- FacDB — NYC Facilities Database (ji82-xba5), the Department of City Planning's point inventory of public/institutional facilities: libraries, clinics, childcare, schools. Chapter 4 uses it for the care and civic baskets.
- GTFS — General Transit Feed Specification. The MTA publishes its full subway and bus schedules as GTFS, refreshed weekly.
- Isochrone — A polygon enclosing the area reachable from an origin within a fixed time budget by a specified mode of travel.
- JIA — Joint Interest Area. NYC's term for non-residential geographic units the city governs without a residential community board (parks, JFK airport, etc.). Excluded from this series' analysis.
- LODES — LEHD Origin-Destination Employment Statistics, where LEHD is the Census Bureau's Longitudinal Employer-Household Dynamics program. Census-published counts of jobs at the workplace-block level for every U.S. state.
- Mobility Access Score — A per-CD single-dimension measure of transit job access. See §3 Chapter 1.
- MODZCTA — Modified ZIP Code Tabulation Area. NYC DOHMH's adjusted ZCTA boundaries used for health geography.
- NTA — Neighborhood Tabulation Area. A geography defined by NYC's Department of City Planning by aggregating census tracts, sized between tract and CD. Used by some chapters for finer-grained drilldowns.
- OCFS — NYS Office of Children and Family Services. Regulates home-based family day care statewide — but, notably, no day-care centers inside the five boroughs (those are NYC DOHMH's). A NYC childcare count needs both sources. Chapter 4.
- Proximity — In Chapter 4, whether a daily need is near — within a 10-minute, half-mile walk. The "easy," mappable half of access. Contrast with sufficiency.
- Sufficiency — In Chapter 4, whether there is enough of a nearby need to go around, e.g. childcare seats per child, or full groceries vs bodegas. Distinct from proximity; across NYC the two correlate only weakly (ρ ≈ +0.37 for childcare).
- TCRP — Transit Cooperative Research Program. Source of the 1.0-mile subway-stop catchment used in the Ch. 1 isochrone model.
- Tract — Census tract. About 2,303 in NYC after the five-borough filter, each designed for roughly equal population (~4,000 residents).
- Vintage — The dataset snapshot date. Frozen at 2026-06-01 across the series unless a chapter explicitly notes an exception.
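The Proximity entry above pairs a 10-minute budget with a half-mile distance, which implies a walking speed of 3 mph. A minimal sketch of such a test, assuming coordinates in EPSG:2263 feet and the straight-line × 1.4 detour factor mentioned in §6; whether Chapter 4 computes proximity exactly this way is not stated here, and the names below are hypothetical.

```python
from math import hypot

DETOUR_FACTOR = 1.4    # straight-line-to-walking multiplier (assumed to apply here)
HALF_MILE_FT = 2640.0  # half a mile in feet, the unit of EPSG:2263

def walk_distance_ft(origin, dest, detour=DETOUR_FACTOR):
    """Estimated walking distance between two (x, y) points given in feet."""
    dx, dy = dest[0] - origin[0], dest[1] - origin[1]
    return hypot(dx, dy) * detour

def is_proximate(origin, dest, budget_ft=HALF_MILE_FT):
    """True when the estimated walk fits inside the distance budget."""
    return walk_distance_ft(origin, dest) <= budget_ft

home = (0.0, 0.0)
near_store = (1500.0, 0.0)  # 1,500 ft away in a straight line
far_store = (2000.0, 0.0)   # 2,000 ft away in a straight line
print(is_proximate(home, near_store), is_proximate(home, far_store))
```

Note that the detour factor shrinks the effective straight-line radius to about 1,886 ft (2,640 / 1.4), so a facility 2,000 ft away as the crow flies already falls outside the budget.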
6. Open methodological questions
Listed here so a future chapter or follow-up post can pick them up.
- Walking network. Swap straight-line × 1.4 for an osmnx-derived NYC walk graph. Expected impact ≤10% in most neighborhoods; larger near bridges, parks without through-paths, and dead-end blocks.
- Bus integration. Add the MTA bus GTFS bundles (gtfs_m.zip, gtfs_bx.zip, gtfs_b.zip, gtfs_q.zip, gtfs_si.zip) to the routing graph. Likely material in transit-poor pockets of Queens and southeast Brooklyn.
- Realtime reliability. Pull GTFS-Realtime trip updates; compute observed vs scheduled travel-time distributions per line and route. The basis for Chapter 8's "Time & Stress" analysis.
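- Population weighting. The analysis currently uses unweighted tract centroids, relying on tracts being roughly equipopulated by design. LEHD LODES RAC (Residence Area Characteristics, the residence-side file) counts workers by where they live, so it would give explicit resident-worker weights.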
- Off-peak isochrones. Re-run for off-peak (11 AM weekday), evening (7 PM), and weekend (Saturday noon) departure windows. The Ch. 1 hypothesis about reliability is testable with this.
- Job quality. LODES is "all payroll jobs." Splitting by earnings tier (CE01/CE02/CE03) or sector (CNS01–CNS20) would let us compute "jobs in your earnings tier reachable" — a more honest opportunity measure for low-income vs high-income residents.
- Sufficiency beyond childcare (Chapter 4). The proximity-vs-sufficiency test only had a true capacity measure for childcare (plus a fresh-food proxy via grocery share). Clinic appointment availability, school seats, and library capacity would let the food/care/civic baskets carry a sufficiency dimension too, instead of leaning on childcare.
- A real gentrification index (Chapter 4). We tested whether neighborhood growth explains low childcare sufficiency using raw 2010→2020 population and under-5 change; it doesn't (Spearman ρ ≈ −0.1 to −0.2, and the starved districts are mostly losing young children). A proper gentrification typology — NYU Furman Center's, or change in college-educated share / median rent — might be sharper than raw head-count growth.
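The Spearman ρ values quoted in the list above are rank correlations: each variable is converted to ranks, and a Pearson correlation is computed on those ranks. A self-contained sketch with toy data follows; the function names are illustrative and the numbers are not taken from the series.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: five districts, growth vs. a sufficiency measure.
growth = [1, 2, 3, 4, 5]
sufficiency = [1, 3, 2, 5, 4]
rho = spearman(growth, sufficiency)
print(round(rho, 3))
```

Because only ranks enter the calculation, a single district with extreme growth cannot dominate the result, which suits skewed district-level data.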
This page updates whenever a new chapter ships. Last updated: 2026-06-07 (Chapter 4 — Access to Daily Needs).
Hero photo: Florian Wehde via Unsplash (downtown Manhattan aerial).
