# American Default MCP Server

A local Model Context Protocol server exposing American Default's data
(indicators, county distress scores, ADI composite) to MCP-compatible
agents — Claude Desktop, Claude Code, Cursor, and any other client that
speaks the protocol.

Track C1 of the 2026-05 Machine-First Rebuild. Complete as of Session 2
— five tool endpoints, probe mode, token-bucket rate limiter, auth stub,
and end-to-end MCP-protocol smoke tests.

## Install

From the repo root:

```bash
./venv/bin/pip install -r requirements.txt
```

Requires `mcp>=1.27.0,<1.28.0`. Transitive deps (anyio, starlette,
uvicorn, httpx, pyjwt, sse-starlette, python-multipart,
pydantic-settings) are installed automatically.

## Run the server locally

```bash
PYTHONPATH=. ./venv/bin/python3 -m scripts.machine_layer.mcp_server
```

Starts the stdio loop. Logs go to stderr — stdout is reserved for
JSON-RPC framing.

### Probe mode (confirm the server is live)

```bash
PYTHONPATH=. ./venv/bin/python3 -m scripts.machine_layer.mcp_server --probe
```

Emits a JSON handshake to stdout and exits 0. Does not enter the stdio
loop. Use this in CI or for smoke tests.

Sample output:

```json
{
  "server_name": "american-default-mcp",
  "server_version": "1.0.0-session-2",
  "schema_version": "v1",
  "sdk_version": "1.27.0",
  "transport": "stdio",
  "tools": [
    {"name": "get_indicator", "description": "..."},
    {"name": "get_county_scorecard", "description": "..."},
    {"name": "get_adi_composite", "description": "..."},
    {"name": "search_indicators", "description": "..."},
    {"name": "get_cross_correlations", "description": "..."}
  ]
}
```
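
For CI, the probe payload can be checked with a few lines of Python. This is a minimal sketch that assumes the payload shape shown in the sample above; `missing_tools` and `EXPECTED_TOOLS` are hypothetical names, not part of the server.

```python
import json

# Tool names taken from the sample probe output above.
EXPECTED_TOOLS = {
    "get_indicator",
    "get_county_scorecard",
    "get_adi_composite",
    "search_indicators",
    "get_cross_correlations",
}


def missing_tools(probe_stdout: str) -> set[str]:
    """Return the expected tool names that a --probe payload does not advertise."""
    payload = json.loads(probe_stdout)
    advertised = {tool["name"] for tool in payload.get("tools", [])}
    return EXPECTED_TOOLS - advertised
```

Feed it the stdout captured from the probe command (for example via `subprocess.run(..., capture_output=True, text=True)`) and fail the build if the returned set is non-empty.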

## Claude Desktop configuration

On macOS, add to `~/Library/Application Support/Claude/claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "american-default": {
      "command": "/absolute/path/to/venv/bin/python3",
      "args": ["-m", "scripts.machine_layer.mcp_server"],
      "cwd": "/absolute/path/to/american-default",
      "env": {
        "PYTHONPATH": "/absolute/path/to/american-default"
      }
    }
  }
}
```

Restart Claude Desktop. The five tools appear under the hammer icon.

## Tool surface

| Tool | Input | Returns |
|---|---|---|
| `get_indicator(slug)` | bundle slug (e.g. `the-buffer`) | compact snapshot + pre-computed aggregates + canonical citation |
| `get_county_scorecard(fips)` | 5-digit FIPS (4-digit accepted with implicit leading zero) | CDI scorecard + 5-domain breakdown + pre-baked citations |
| `get_adi_composite()` | (none) | latest quarter ADI + 5 components + zone + citation |
| `search_indicators(query, limit=10)` | keyword + optional limit (max 50) | ranked matches (slug, branded_name, name, category, URL) |
| `get_cross_correlations(slug)` | indicator slug | fully-validated leading/lagging pairs split into `as_leader` + `as_follower` |
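
The FIPS rule in the table (4-digit input gets an implicit leading zero) could be implemented roughly as follows. This is an illustrative sketch; `normalize_fips` is a hypothetical helper name, and the server's actual validation may differ.

```python
def normalize_fips(fips: str) -> str:
    """Return a 5-digit county FIPS code, left-padding 4-digit input with a zero."""
    code = fips.strip()
    # Accept only ASCII digits, and only the two lengths the tool documents.
    if not (code.isascii() and code.isdigit()) or len(code) not in (4, 5):
        raise ValueError(f"expected a 4- or 5-digit FIPS code, got {fips!r}")
    return code.zfill(5)
```

Normalizing on the client side as well keeps cache keys and log lines consistent regardless of how the code was typed.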

## Schema versioning

Every response carries `schema_version: "v1"`. Breaking changes ship as
a new tool with a `_v2` suffix — v1 tools stay live for backward
compatibility. Callers should assert the schema version they expect.
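
A caller-side version check might look like the sketch below. It assumes `schema_version` is a top-level key of the decoded response; `require_schema_version` is a hypothetical helper name.

```python
def require_schema_version(response: dict, expected: str = "v1") -> dict:
    """Return the response unchanged, or raise if its schema version is unexpected."""
    actual = response.get("schema_version")
    if actual != expected:
        raise ValueError(
            f"unsupported schema_version {actual!r}; expected {expected!r}"
        )
    return response
```

Failing loudly here is preferable to silently parsing a response whose fields may have moved.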

## Canonical attribution (three-tier)

Every response includes a `citation` object with APA, MLA, Chicago, and
news-copy forms. Three-tier naming is enforced:

- **American Default Research** — institutional name, used in citations,
  source lists, bibliographies
- **American Default** — brand name, used for URLs and casual references
- **American Distress Index** (ADI) — product name, used only when the
  composite score is the subject

See `https://americandefault.org/llms.txt` § "Canonical Attribution"
for the authoritative spec.

## Data surfaces consumed

| Surface | Path | Files |
|---|---|---|
| Indicator bundles | `site/src/data/indicator-bundles/{slug}.json` | 96 |
| Scorecard blobs | `site/src/data/scorecard-blobs/{fips}.json` | 3,144 |
| Canonical facts (ADI composite + CDI metadata) | `data/canonical_facts.json` | 1 |
| Cross-correlations (S2) | `data/research/leading_indicators.json` | 1 |
| Source attribution | `data/indicators/{category}/{indicator}.json` | 96 |

### Slug ↔ indicator_id mapping

Source JSONs carry both `indicator_id` (snake_case) and `slug`
(kebab-case). 91 of 96 indicators have slugs that DO NOT mechanically
transform from their id — branded indicators use marketing names like
`the-buffer` (id: `savings_rate`), `the-horizon` (id: `ai_capability`),
`the-pinch` (id: `census_htops_difficulty`).

The server builds a bidirectional map by scanning the source JSONs once
(~100ms) and caches it by the directory mtime/size signature. Lookups are
O(1) until a committed data refresh changes the underlying files, at which
point the long-running process re-reads the data on the next request.
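
One way to build such a map is sketched below. It is not the server's code: the function names, the module-level cache, and the exact form of the signature (path, mtime, and size of every JSON file) are assumptions made for illustration.

```python
import json
from pathlib import Path

# Hypothetical cache: root path -> (signature, slug_to_id, id_to_slug).
_cache: dict[str, tuple[tuple, dict[str, str], dict[str, str]]] = {}


def _signature(root: Path) -> tuple:
    """Fingerprint every JSON file under root by path, mtime, and size."""
    return tuple(
        (str(p), p.stat().st_mtime_ns, p.stat().st_size)
        for p in sorted(root.rglob("*.json"))
    )


def slug_maps(root: Path) -> tuple[dict[str, str], dict[str, str]]:
    """Return (slug -> indicator_id, indicator_id -> slug), rebuilding on change."""
    sig = _signature(root)
    cached = _cache.get(str(root))
    if cached is not None and cached[0] == sig:
        return cached[1], cached[2]
    slug_to_id: dict[str, str] = {}
    for path in sorted(root.rglob("*.json")):
        doc = json.loads(path.read_text(encoding="utf-8"))
        slug_to_id[doc["slug"]] = doc["indicator_id"]
    id_to_slug = {ident: slug for slug, ident in slug_to_id.items()}
    _cache[str(root)] = (sig, slug_to_id, id_to_slug)
    return slug_to_id, id_to_slug
```

In this sketch the signature check stats every file on each call, so only the dictionary lookups themselves are O(1); a long-running server could check the signature less often if that cost matters.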

### Category derivation

Source JSONs do not populate a `category` string. Categories are derived
from the `door` number:

| Door | Category |
|---|---|
| 0 | pressure |
| 1 | debt_stress |
| 2 | legal_filings |
| 3 | buffer_depletion |
| 4 | labor_market |
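
The table above translates directly into a lookup. The sketch below assumes `door` is stored as an integer at the top level of each source JSON; `category_for` is a hypothetical helper name.

```python
# Door number -> category, copied from the table above.
DOOR_CATEGORIES = {
    0: "pressure",
    1: "debt_stress",
    2: "legal_filings",
    3: "buffer_depletion",
    4: "labor_market",
}


def category_for(source: dict) -> str:
    """Derive the category string from a source JSON's door number."""
    door = source.get("door")
    # Reject bools explicitly: True == 1 would otherwise map to debt_stress.
    if isinstance(door, bool) or not isinstance(door, int) or door not in DOOR_CATEGORIES:
        raise ValueError(f"unknown door value: {door!r}")
    return DOOR_CATEGORIES[door]
```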

## Empty-data bundles

10 of 96 bundles ship without populated data (indicators on the roadmap
but not yet backfilled — AI job postings, ABA consumer discretionary,
NMHC rent tracker, utility disconnections, etc.). These return
`status: "awaiting_population"` with full metadata and a null
`latest_value`. Agents can discover the slug exists without receiving
phantom data.

## Rate limiting

An in-memory two-level token bucket keyed by the client's
`clientInfo.name` (from the MCP initialize handshake):

- **Per-minute burst** — `MCP_RATE_LIMIT_RPM`, default `60`
- **Per-hour sustained** — `MCP_RATE_LIMIT_RPH`, default `600`

Both buckets must have tokens available for a request to pass. Exceeding
either raises a `ToolError("rate_limited: retry_after=...s")` that FastMCP
converts to a protocol-level `isError=true` response with the message
preserved for the client.
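
The two-level check can be sketched as below. This illustrates the technique rather than the server's implementation: the class names, the continuous refill policy, and returning `False` instead of raising `ToolError` are all assumptions.

```python
import time


class TokenBucket:
    """A bucket that refills continuously up to a fixed capacity."""

    def __init__(self, capacity: int, period_s: float, now: float):
        self.capacity = capacity
        self.rate = capacity / period_s  # tokens added per second
        self.tokens = float(capacity)
        self.updated = now

    def refill(self, now: float) -> None:
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now


class TwoLevelLimiter:
    """Per-client minute and hour buckets; a request needs a token from both."""

    def __init__(self, rpm: int = 60, rph: int = 600, clock=time.monotonic):
        self.rpm, self.rph, self.clock = rpm, rph, clock
        self.buckets: dict[str, tuple[TokenBucket, TokenBucket]] = {}

    def allow(self, client_name: str) -> bool:
        now = self.clock()
        if client_name not in self.buckets:
            self.buckets[client_name] = (
                TokenBucket(self.rpm, 60.0, now),
                TokenBucket(self.rph, 3600.0, now),
            )
        minute, hour = self.buckets[client_name]
        minute.refill(now)
        hour.refill(now)
        if minute.tokens < 1 or hour.tokens < 1:
            return False  # the server described above raises ToolError here
        minute.tokens -= 1
        hour.tokens -= 1
        return True
```

Injecting the clock makes burst, refill, and hour-cap behavior testable without sleeping.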

For local testing, set `RATE_LIMIT_DISABLED=true` for a hard bypass.

Rate limiting is intentionally defensive: stdio is inherently
single-client per process, but the architecture survives a future
HTTP/SSE transport where multi-tenancy matters.

## Auth stub

When `MCP_API_KEY` is set server-side, every tool invocation emits an
audit line to stderr (`auth-ok` / `auth-mismatch` / `auth-missing-key`)
with the `clientInfo.name` that declared itself in the handshake. The
stub does NOT block on stdio — the OS process boundary is the auth
boundary when Claude Desktop (or any local client) spawns the server
directly.

`verify_api_key(provided: str | None)` in `auth.py` is the stricter
hook that DOES block (raises `ToolError("unauthorized: ...")` on
mismatch). Future HTTP/SSE transport will wrap each request with it.

## Tests

```bash
# Direct-call unit tests (43 tests)
./venv/bin/pytest scripts/tests/test_mcp_server.py -v

# End-to-end MCP-protocol smoke tests (11 tests, in-process client/server)
./venv/bin/pytest scripts/tests/test_mcp_smoke.py -v

# Both together
./venv/bin/pytest scripts/tests/test_mcp_server.py scripts/tests/test_mcp_smoke.py -v
```

54 tests pass in ~1 second. Coverage: probe handshake, tool registration,
slug registry build, all five endpoints (happy + error paths), citation
three-tier compliance, response size budgets, search ranking sanity,
cross-correlation pair splitting, rate-limit burst/refill/per-client
isolation/hour-cap, auth stub logging + `verify_api_key`, protocol-level
error propagation, client-identity plumbing through the gate.

The smoke-test file uses `mcp.shared.memory.create_connected_server_and_client_session`
— the same MCP protocol wire Claude Desktop speaks — to catch regressions
that only surface end-to-end.

## Response size budgets

| Endpoint | Budget | Actual (the-buffer / FIPS 13077 / latest ADI) |
|---|---|---|
| `get_indicator` | ≤ 16 KB | ~13.8 KB |
| `get_county_scorecard` | ≤ 25 KB | ~2.5 KB |
| `get_adi_composite` | ≤ 4 KB | ~2.0 KB |

The raw indicator series (300+ points) is intentionally omitted from
`get_indicator` to keep LLM context budgets manageable. The full series
lives at `https://americandefault.org/api/indicators/{slug}.json`.
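
A budget check along these lines can guard against regressions. The sketch assumes 1 KB means 1024 bytes and that size is measured on the compact UTF-8 JSON encoding; the test suite may measure differently, and `within_budget` is a hypothetical helper name.

```python
import json

# Budgets from the table above, in bytes (assuming 1 KB = 1024 bytes).
BUDGET_BYTES = {
    "get_indicator": 16 * 1024,
    "get_county_scorecard": 25 * 1024,
    "get_adi_composite": 4 * 1024,
}


def within_budget(tool: str, response: dict) -> bool:
    """Return True if the serialized response fits the tool's size budget."""
    encoded = json.dumps(response, separators=(",", ":"), ensure_ascii=False)
    return len(encoded.encode("utf-8")) <= BUDGET_BYTES[tool]
```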

## Troubleshooting

**Server hangs on startup in Claude Desktop.** Verify the `cwd` is the
repo root and `PYTHONPATH` is set in the `env` block. The server needs
repo-root imports to find `scripts.machine_layer.*`.

**Probe returns "No module named scripts".** Run from the repo root with
`PYTHONPATH=.` prefixed, using the venv's Python binary.

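**Unexpected slug not found.** All 96 bundles resolve by slug, including
the 10 that are tracked but not yet backfilled; those return
`status: "awaiting_population"` rather than an error. A not-found result
most likely means the slug is misspelled. `search_indicators` returns the
canonical slug for a keyword.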

**stdout pollution breaks JSON-RPC.** All logging is routed to stderr.
If you add a `print()` statement for debugging, remove it before
shipping — it will corrupt the protocol stream.

## License & attribution

Data is free to use with attribution. The canonical attribution block
and per-indicator citation formats are at
`https://americandefault.org/llms.txt`.