# American Default MCP Server

A local Model Context Protocol server exposing American Default's data (indicators, county distress scores, ADI composite) to MCP-compatible agents: Claude Desktop, Claude Code, Cursor, and any other client that speaks the protocol.

Track C1 of the 2026-05 Machine-First Rebuild. Complete as of Session 2: five tool endpoints, probe mode, token-bucket rate limiter, auth stub, and end-to-end MCP-protocol smoke tests.

## Install

From the repo root:

```bash
./venv/bin/pip install -r requirements.txt
```

Requires `mcp>=1.27.0,<1.28.0`. Transitive deps (anyio, starlette, uvicorn, httpx, pyjwt, sse-starlette, python-multipart, pydantic-settings) are installed automatically.

## Run the server locally

```bash
PYTHONPATH=. python3 -m scripts.machine_layer.mcp_server
```

Starts the stdio loop. Logs go to stderr; stdout is reserved for JSON-RPC framing.

### Probe mode (confirm the server is live)

```bash
PYTHONPATH=. python3 -m scripts.machine_layer.mcp_server --probe
```

Emits a JSON handshake to stdout and exits 0. Does not enter the stdio loop. Use this in CI or for smoke tests. Sample output:

```json
{
  "server_name": "american-default-mcp",
  "server_version": "1.0.0-session-2",
  "schema_version": "v1",
  "sdk_version": "1.27.0",
  "transport": "stdio",
  "tools": [
    {"name": "get_indicator", "description": "..."},
    {"name": "get_county_scorecard", "description": "..."},
    {"name": "get_adi_composite", "description": "..."},
    {"name": "search_indicators", "description": "..."},
    {"name": "get_cross_correlations", "description": "..."}
  ]
}
```

## Claude Desktop configuration

Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "american-default": {
      "command": "/absolute/path/to/venv/bin/python3",
      "args": ["-m", "scripts.machine_layer.mcp_server"],
      "cwd": "/absolute/path/to/american-default",
      "env": {
        "PYTHONPATH": "/absolute/path/to/american-default"
      }
    }
  }
}
```

Restart Claude Desktop.
The five tools appear under the hammer icon.

## Tool surface

| Tool | Input | Returns |
|---|---|---|
| `get_indicator(slug)` | bundle slug (e.g. `the-buffer`) | compact snapshot + pre-computed aggregates + canonical citation |
| `get_county_scorecard(fips)` | 5-digit FIPS (4-digit accepted with implicit leading zero) | CDI scorecard + 5-domain breakdown + pre-baked citations |
| `get_adi_composite()` | (none) | latest quarter ADI + 5 components + zone + citation |
| `search_indicators(query, limit=10)` | keyword + optional limit (max 50) | ranked matches (slug, branded_name, name, category, URL) |
| `get_cross_correlations(slug)` | indicator slug | fully validated leading/lagging pairs split into `as_leader` + `as_follower` |

## Schema versioning

Every response carries `schema_version: "v1"`. Breaking changes ship as a new tool with a `_v2` suffix; v1 tools stay live for backward compatibility. Callers should assert the schema version they expect.

## Canonical attribution (three-tier)

Every response includes a `citation` object with APA, MLA, Chicago, and news-copy forms. Three-tier naming is enforced:

- **American Default Research**: institutional name, used in citations, source lists, bibliographies
- **American Default**: brand name, used for URLs and casual references
- **American Distress Index** (ADI): product name, used only when the composite score is the subject

See `https://americandefault.org/llms.txt` § "Canonical Attribution" for the authoritative spec.

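
The schema-versioning advice above (assert the version you expect) can be sketched on the client side as follows. This is a minimal illustration, not part of the server: it assumes the tool response has already been parsed into a Python `dict`, and the names `require_schema_version` and `SchemaVersionError` are hypothetical helpers invented for this example.

```python
EXPECTED_SCHEMA_VERSION = "v1"  # the version this client was written against


class SchemaVersionError(RuntimeError):
    """Raised when a response carries an unexpected schema_version (hypothetical helper)."""


def require_schema_version(response: dict, expected: str = EXPECTED_SCHEMA_VERSION) -> dict:
    """Return the response unchanged if its schema_version matches, else raise.

    Assumes `response` is the already-parsed JSON payload of a tool call.
    """
    found = response.get("schema_version")
    if found != expected:
        raise SchemaVersionError(
            f"expected schema_version {expected!r}, got {found!r}"
        )
    return response


if __name__ == "__main__":
    # Hypothetical payload; real responses carry many more fields.
    payload = {"schema_version": "v1", "slug": "the-buffer"}
    print(require_schema_version(payload)["slug"])
```

Failing loudly on a mismatch means a client pointed at a `_v2` tool by mistake stops immediately rather than silently misreading fields.
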
## Data surfaces consumed

| Surface | Path | Files |
|---|---|---|
| Indicator bundles | `site/src/data/indicator-bundles/{slug}.json` | 96 |
| Scorecard blobs | `site/src/data/scorecard-blobs/{fips}.json` | 3,144 |
| Canonical facts (ADI composite + CDI metadata) | `data/canonical_facts.json` | 1 |
| Cross-correlations (S2) | `data/research/leading_indicators.json` | 1 |
| Source attribution | `data/indicators/{category}/{indicator}.json` | 96 |

### Slug ↔ indicator_id mapping

Source JSONs carry both `indicator_id` (snake_case) and `slug` (kebab-case). 91 of 96 indicators have slugs that do NOT transform mechanically from their id: branded indicators use marketing names like `the-buffer` (id: `savings_rate`), `the-horizon` (id: `ai_capability`), `the-pinch` (id: `census_htops_difficulty`).

The server builds a bidirectional map by scanning the source JSONs once (~100 ms) and caches it, keyed on the directory's mtime/size signature. Lookups are O(1) until a committed data refresh changes the underlying files, at which point the long-running process re-reads the data on the next request.

### Category derivation

Source JSONs do not populate a `category` string. Categories are derived from the `door` number:

| Door | Category |
|---|---|
| 0 | pressure |
| 1 | debt_stress |
| 2 | legal_filings |
| 3 | buffer_depletion |
| 4 | labor_market |

## Empty-data bundles

10 of 96 bundles ship without populated data. These are indicators on the roadmap but not yet backfilled, such as AI job postings, ABA consumer discretionary, NMHC rent tracker, and utility disconnections. They return `status: "awaiting_population"` with full metadata and a null `latest_value`, so agents can discover that the slug exists without receiving phantom data.

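
The two conventions above (door-to-category derivation and the `awaiting_population` status) can be sketched as client-side helpers. This is an illustration under assumptions, not the server's code: the mapping is copied from the table, the helper names `category_for_door` and `latest_value_or_none` are hypothetical, and `status` and `latest_value` are assumed to be top-level keys of the parsed response.

```python
from typing import Any, Optional

# Door number -> category, copied from the "Category derivation" table.
DOOR_TO_CATEGORY = {
    0: "pressure",
    1: "debt_stress",
    2: "legal_filings",
    3: "buffer_depletion",
    4: "labor_market",
}


def category_for_door(door: int) -> str:
    """Derive the category string from a source JSON's `door` number."""
    try:
        return DOOR_TO_CATEGORY[door]
    except KeyError:
        raise ValueError(f"unknown door number: {door!r}") from None


def latest_value_or_none(bundle: dict) -> Optional[Any]:
    """Return `latest_value`, or None for bundles still awaiting population.

    Assumes `status` and `latest_value` sit at the top level of the response.
    """
    if bundle.get("status") == "awaiting_population":
        return None
    return bundle.get("latest_value")


if __name__ == "__main__":
    print(category_for_door(3))
    # Hypothetical, heavily trimmed responses for illustration only.
    empty = {"status": "awaiting_population", "latest_value": None}
    print(latest_value_or_none(empty))
```

Checking `status` explicitly, rather than relying on a null `latest_value` alone, keeps an unpopulated indicator distinct from a value that is missing for some other reason.
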
## Rate limiting

An in-memory two-level token bucket, keyed by the client's `clientInfo.name` (from the MCP initialize handshake):

- **Per-minute burst**: `MCP_RATE_LIMIT_RPM`, default `60`
- **Per-hour sustained**: `MCP_RATE_LIMIT_RPH`, default `600`

Both buckets must have tokens available for a request to pass. Exceeding either raises a `ToolError("rate_limited: retry_after=...s")` that FastMCP converts to a protocol-level `isError=true` response with the message preserved for the client.

For local testing, set `RATE_LIMIT_DISABLED=true` for a hard bypass.

Rate limiting is intentionally defensive: stdio is inherently single-client per process, but the architecture survives a future HTTP/SSE transport where multi-tenancy matters.

## Auth stub

When `MCP_API_KEY` is set server-side, every tool invocation emits an audit line to stderr (`auth-ok` / `auth-mismatch` / `auth-missing-key`) with the `clientInfo.name` that declared itself in the handshake. The stub does NOT block on stdio: the OS process boundary is the auth boundary when Claude Desktop (or any local client) spawns the server directly.

`verify_api_key(provided: str | None)` in `auth.py` is the stricter hook that DOES block (raises `ToolError("unauthorized: ...")` on mismatch). Future HTTP/SSE transport will wrap each request with it.

## Tests

```bash
# Direct-call unit tests (43 tests)
./venv/bin/pytest scripts/tests/test_mcp_server.py -v

# End-to-end MCP-protocol smoke tests (11 tests, in-process client/server)
./venv/bin/pytest scripts/tests/test_mcp_smoke.py -v

# Both together
./venv/bin/pytest scripts/tests/test_mcp_server.py scripts/tests/test_mcp_smoke.py -v
```

54 tests pass in ~1 second.
Coverage: probe handshake, tool registration, slug registry build, all five endpoints (happy + error paths), citation three-tier compliance, response size budgets, search ranking sanity, cross-correlation pair splitting, rate-limit burst/refill/per-client isolation/hour-cap, auth stub logging + `verify_api_key`, protocol-level error propagation, client-identity plumbing through the gate.

The smoke-test file uses `mcp.shared.memory.create_connected_server_and_client_session`, which speaks the same MCP wire protocol as Claude Desktop, to catch regressions that only surface end-to-end.

## Response size budgets

| Endpoint | Budget | Actual (the-buffer / FIPS 13077 / latest ADI) |
|---|---|---|
| `get_indicator` | ≤ 16 KB | ~13.8 KB |
| `get_county_scorecard` | ≤ 25 KB | ~2.5 KB |
| `get_adi_composite` | ≤ 4 KB | ~2.0 KB |

The raw indicator series (300+ points) is intentionally omitted from `get_indicator` to keep responses within LLM context budgets. The full series lives at `https://americandefault.org/api/indicators/{slug}.json`.

## Troubleshooting

**Server hangs on startup in Claude Desktop.** Verify the `cwd` is the repo root and `PYTHONPATH` is set in the `env` block. The server needs repo-root imports to find `scripts.machine_layer.*`.

**Probe returns "No module named scripts".** Run from the repo root with `PYTHONPATH=.` prefixed, using the venv's Python binary.

**Unexpected slug not found.** Of the 96 bundles, 86 carry populated data. The other 10 return `status: "awaiting_population"` for indicators that are tracked but not yet backfilled.

**stdout pollution breaks JSON-RPC.** All logging is routed to stderr. If you add a `print()` statement for debugging, remove it before shipping, because it will corrupt the protocol stream.

## License & attribution

Data is free to use with attribution. The canonical attribution block and per-indicator citation formats are at `https://americandefault.org/llms.txt`.