Request / Response Reference
Full body contract for POST /v1/TENANT/recommendation.
Pure-metrics-only contract
We accept only raw scientific measurements. Proprietary aggregate scores — score, hrv_balance, activity_balance, sleep_balance, or any nested contributors.*_balance — are accepted at the wire (so SDK drift doesn't break requests) and silently discarded. They are not persisted, not read by the recommendation engine, and not returned in the response.
Why: peer-reviewed clinical literature reports its findings in raw scientific units (HRV rMSSD in ms, heart rate in bpm, sleep-stage durations in seconds, raw temperatures, raw SpO₂), so recommendations grounded in that literature need inputs in the same units. Vendor 0–100 buckets are opaque proprietary transforms that differ across vendors and firmware versions. Building against raw metrics keeps the contract portable: the same body shape will apply when Whoop, Apple Health, or Garmin onboard.
Keep sending the raw fields. If your SDK surfaces aggregate scores in the same payload, leaving them in is fine — they land in a permissive-extras bucket and are dropped before storage.
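The discard rule above can be illustrated with a small client-side sketch. This is not the server's implementation, only a reading of the rule as stated in this section; the function name and the average_hrv field in the usage note are hypothetical.

```python
from typing import Any

# Top-level aggregate keys this section names as discarded.
AGGREGATE_KEYS = {"score", "hrv_balance", "activity_balance", "sleep_balance"}


def drop_aggregate_scores(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `record` without vendor aggregate scores.

    Illustrative only: mirrors the rule described in this reference,
    not the actual server-side code.
    """
    cleaned: dict[str, Any] = {}
    for key, value in record.items():
        if key in AGGREGATE_KEYS:
            continue
        if key == "contributors" and isinstance(value, dict):
            # Drop nested contributors.*_balance entries.
            value = {k: v for k, v in value.items() if not k.endswith("_balance")}
            if not value:
                continue  # nothing left under contributors
        cleaned[key] = value
    return cleaned
```

For example, `drop_aggregate_scores({"average_hrv": 42, "score": 81, "contributors": {"hrv_balance": 70}})` keeps only the raw (hypothetical) `average_hrv` field. You do not need to run this before sending; it only shows what survives.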
Request body
Top-level fields
| Field | Type | Required | Notes |
|---|---|---|---|
| request_id | UUID v4/v7 string | ✅ | Idempotency key. See idempotency.md. Recommended: UUID v7. |
| query | string | ❌ | Free-text question. Omit for implicit-trigger (CxH auto-generates a relevant prompt from signals). |
| client_trace_id | string | ❌ | Your own trace correlation ID. Echoed back in telemetry, not in the response body. |
| timezone | IANA string | ✅ | E.g. "Europe/London", "America/Los_Angeles", "UTC". Abbreviations (EST, PST, CET) are rejected; see "Enum and range tables" below. |
| safety_mode | enum | ✅ | "strict" or "permissive". See table below. |
| minimum_confidence | float | ✅ | 0.0–1.0. Your per-request floor: responses below this confidence are suppressed with suppression_reason: "below_client_threshold". |
| user_context | object | ✅ | See "user_context" below. |
| TENANT | object | ✅ | See "TENANT payload" below. |
| personal_info | object | ❌ | Optional partner-provided user metadata. See "personal_info" below. |
user_context
| Field | Type | Required | Notes |
|---|---|---|---|
| cycle_phase | enum | ✅ | "menstrual" / "follicular" / "ovulatory" / "luteal". 4 values only. |
| cycle_day | int | ✅ | 1–40. Days since last period start. |
| life_stage | enum | ✅ | "reproductive" / "perimenopause" / "postmenopause". 3 values only. |
TENANT payload
At least one signal-bearing key must be populated. Signal-bearing keys:
- sleep: array of SleepRecord
- heartrate: array of HeartRateBlock
- vo2_max: array of Vo2MaxRecord
- daily_readiness: array of DailyReadinessRecord
- daily_sleep: array of DailySleepRecord
- workout: array of WorkoutRecord
- session: array of SessionRecord
The fields above describe the alpha Oura integration's body schema. Each partner has a dedicated route at /v1/<their-slug>/recommendation with its own body shape; biomarker fields will differ across partners. The auto-generated /api reference is the wire-format ground truth for each live partner deployment. The surrounding contract patterns (auth, idempotency, errors, suppression, citation semantics) are partner-agnostic and apply uniformly.
Secondary (optional) keys: tag, enhanced_tag, rest_mode_period, ring_configuration, daily_cardiovascular_age, daily_stress, daily_resilience, daily_spo2.
The schema allows extra fields (model_config = {"extra": "allow"}). Any field not listed above, including aggregate score and contributors, is silently dropped.
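A pre-flight check for the "at least one signal-bearing key" rule might look like the sketch below. It assumes "populated" means a non-empty array, which this reference does not spell out; the helper name is hypothetical.

```python
from typing import Any

# Signal-bearing keys as listed in this section (alpha Oura integration).
SIGNAL_KEYS = (
    "sleep", "heartrate", "vo2_max", "daily_readiness",
    "daily_sleep", "workout", "session",
)


def populated_signal_keys(payload: dict[str, Any]) -> list[str]:
    """Return the signal-bearing keys that hold a non-empty list.

    Assumption: "populated" means a non-empty array. Secondary keys
    (tag, daily_spo2, ...) never count toward the requirement.
    """
    return [
        key for key in SIGNAL_KEYS
        if isinstance(payload.get(key), list) and payload[key]
    ]
```

An empty result means the payload carries only secondary or empty keys and is worth fixing before sending.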
Example TENANT.sleep[0] entry (pure metrics)
Code
Example TENANT.daily_readiness[0] entry (post-ZUP-12 drop)
Code
Per ZUP-12 (schema change, shipped 2026-04-24), daily_readiness no longer models score or contributors fields. If your SDK includes them, you may send them — they'll be dropped silently. The recommendation engine grounds on the raw metrics in sleep, heartrate, vo2_max, etc. instead.
personal_info (optional)
| Field | Type | Persisted? | Notes |
|---|---|---|---|
| age_bucket | string | ✅ | E.g. "25-34". Bucketed age; raw DOB/age is not accepted. |
| email | string | ❌ NOT persisted | Accepted at the wire, scrubbed before the audit-trail write. Send it if your SDK forces you to; don't rely on it round-tripping. Do not use it as a lookup key. |
| Other partner-defined fields | any | mostly ✅ | PHI minimization drops the email field only; everything else currently persists. Do not send raw DOB, SSN, or similar identifiers: we treat the partner as responsible for not sending PII we didn't request. |
Enum and range tables
| Field | Valid values |
|---|---|
| user_context.cycle_phase | menstrual, follicular, ovulatory, luteal |
| user_context.life_stage | reproductive, perimenopause, postmenopause |
| user_context.cycle_day | integer in [1, 40] inclusive |
| safety_mode | strict, permissive |
| minimum_confidence | float in [0.0, 1.0] inclusive |
| timezone | IANA region/city format (Europe/London, Australia/Sydney, America/Los_Angeles) OR the bare strings UTC / GMT. Abbreviations like EST, PST, CET, JST are rejected with 422 invalid_request because they're ambiguous across DST boundaries. |
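The table above can be mirrored in a client-side pre-flight check so obvious 422s are caught before the request leaves your service. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the timezone test is only a shape heuristic (Region/City, or bare UTC/GMT), not a lookup against the IANA database.

```python
import re
from typing import Any

CYCLE_PHASES = {"menstrual", "follicular", "ovulatory", "luteal"}
LIFE_STAGES = {"reproductive", "perimenopause", "postmenopause"}
SAFETY_MODES = {"strict", "permissive"}

# Shape heuristic only (assumption): Region/City style names.
_TZ_SHAPE = re.compile(r"^[A-Za-z_]+(?:/[A-Za-z0-9_+-]+)+$")


def _is_one_of(value: Any, allowed: set[str]) -> bool:
    return isinstance(value, str) and value in allowed


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def invalid_fields(body: dict[str, Any]) -> list[str]:
    """Return names of fields outside the documented enums and ranges."""
    bad: list[str] = []
    ctx = body.get("user_context")
    if not isinstance(ctx, dict):
        ctx = {}

    if not _is_one_of(ctx.get("cycle_phase"), CYCLE_PHASES):
        bad.append("user_context.cycle_phase")
    if not _is_one_of(ctx.get("life_stage"), LIFE_STAGES):
        bad.append("user_context.life_stage")
    day = ctx.get("cycle_day")
    if not (isinstance(day, int) and not isinstance(day, bool) and 1 <= day <= 40):
        bad.append("user_context.cycle_day")
    if not _is_one_of(body.get("safety_mode"), SAFETY_MODES):
        bad.append("safety_mode")
    conf = body.get("minimum_confidence")
    if not (_is_number(conf) and 0.0 <= conf <= 1.0):
        bad.append("minimum_confidence")
    tz = body.get("timezone")
    if not (isinstance(tz, str) and (tz in {"UTC", "GMT"} or _TZ_SHAPE.match(tz))):
        bad.append("timezone")
    return bad
```

The server remains the authority; a body that passes this check can still be rejected, for example for a well-shaped but nonexistent timezone name.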
safety_mode semantics
| Mode | Behavior |
|---|---|
| strict | Clinical-style queries (containing keywords matching internal diagnostic/prescriptive patterns) are rejected with 422 out_of_scope. Use when the integration is surfaced to consumers without a wellness disclaimer. |
| permissive | All wellness queries accepted. Clinical framing in the response is still suppressed by the recommendation engine, but the request is processed. Use for partner-internal research, product testing, or consumer surfaces that include the CxH wellness disclaimer. |
Response body
Success (200)
Code
| Field | Type | Notes |
|---|---|---|
| request_id | string | Echoed from the request. |
| recommendation | string \| null | 400–1500 chars when present. null when suppressed (see below). |
| citations | array | Up to 7 Citation objects, ordered by relevance descending. Empty when suppressed. See Citation object below. |
| recommendation_confidence | float | 0.0–1.0. The engine's self-assessment of retrieved-evidence quality. Always returned, even when suppressed. See Recommendation confidence below. |
| suggested_questions | array | 0–5 follow-up prompts. Primary payload in Shape B (implicit-trigger) responses. |
| trace_id | string | Opaque CxH telemetry correlation ID. Include in support tickets. |
| served_at | ISO 8601 UTC | Response-emit timestamp. |
| model_version | string | Opaque version string. Changes on model/prompt updates; don't parse. |
| suppression_reason | string \| null | See table below. null on successful recommendation. |
| warnings | array | Soft-default advisory signals. V1 code: unmapped_persona (persona fell outside the known table and was routed to a safe default). Non-breaking additive field; expect new codes over time. |
Citation object
Each entry in the citations array carries the following fields:
| Field | Type | Notes |
|---|---|---|
| citation_id | string | Stable identifier for the source document. ULID-style; safe to use as a deduplication key across responses (the same paper cited in two responses gets the same id). |
| source_title | string | Article title from the source. |
| source_url | string | DOI URL when available, otherwise the canonical source URL. Always a fully qualified https:// URL. |
| relevance | float | 0.0–1.0. Query-document match score from a biomedical cross-encoder reranker. See "Relevance score" below. |
Relevance score
relevance is the per-citation match score from a domain-specialized biomedical cross-encoder (MedCPT — trained on PubMed query-article pairs) applied to your query (or, for Shape B, the implicit prompt CxH generates) and the candidate source paragraph.
What it measures. Query-document semantic relevance — how well the retrieved passage answers the query. Not cohort match — i.e. it does not encode how closely the study population in the source matches the user's demographic profile. The score is a raw cross-encoder output clamped to [0.0, 1.0].
How to interpret. A relative ranking signal within a single response — not a calibrated probability. 0.91 does not mean "91% confidence"; it means the reranker placed this passage strongly above its alternatives for this query. Citations in a single response are returned sorted by relevance descending — the first entry is the strongest match.
How to use it.
- Display alongside the citation as a "match strength" indicator if your UI surfaces evidence to the user.
- Threshold-filter client-side if you want to show only the top few; we return up to 7 regardless of value.
- Don't compare values across different model_version strings: the reranker can change without a wire-format break, and absolute numbers may shift.
Caveat. Calibration of these scores is under active review. Treat values as relative ordering within a response, not as cross-response confidence intervals. If you need a single confidence number for the whole answer, see top-level recommendation_confidence below.
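Client-side threshold filtering, as suggested above, could be sketched as follows. The function name, the default cutoff, and the display limit are hypothetical choices for illustration; only the citation_id and relevance field names come from this reference.

```python
from typing import Any


def top_citations(
    citations: list[dict[str, Any]],
    min_relevance: float = 0.0,
    limit: int = 3,
) -> list[dict[str, Any]]:
    """Keep the strongest citations from a single response.

    Re-sorts defensively by relevance (the API already returns them in
    descending order), applies the cutoff, then truncates to `limit`.
    Only meaningful within one response: values are a relative ranking,
    not a calibrated probability.
    """
    ranked = sorted(citations, key=lambda c: c["relevance"], reverse=True)
    return [c for c in ranked if c["relevance"] >= min_relevance][:limit]
```

Because the scores are relative, a fixed cutoff such as 0.5 is a UI choice rather than a statistical threshold, and it may need revisiting when model_version changes.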
Recommendation confidence
recommendation_confidence is the engine's self-assessment of whether the retrieved evidence is strong enough to support an answer. Range [0.0, 1.0]. Returned on every response (served or suppressed) — always parse it.
What it measures. A weighted LLM-graded score across three criteria of the retrieved evidence:
- Coverage (50%) — does the retrieved evidence actually address the user's question?
- Specificity (30%) — is it relevant to this user's context (cycle phase, life stage, biomarkers), or generic?
- Consistency (20%) — do the retrieved sources agree, or contradict?
For multi-step queries, per-sub-query scores are aggregated via min() — the weakest sub-query dominates. By design: a high overall score requires every sub-query to clear the bar.
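The arithmetic can be made concrete with a short sketch. It assumes the three criteria are combined as a plain weighted sum, which the weights above suggest but this reference does not state outright; the function names are hypothetical, and the real grading is done by the engine, not by client code.

```python
# Criterion weights as documented above.
WEIGHTS = {"coverage": 0.5, "specificity": 0.3, "consistency": 0.2}


def sub_query_score(coverage: float, specificity: float, consistency: float) -> float:
    """Weighted sum of the three criteria (assumed combination rule)."""
    return (
        WEIGHTS["coverage"] * coverage
        + WEIGHTS["specificity"] * specificity
        + WEIGHTS["consistency"] * consistency
    )


def aggregate_confidence(sub_query_scores: list[float]) -> float:
    """min() aggregation: the weakest sub-query sets the overall score."""
    return min(sub_query_scores)
```

With coverage 0.8, specificity 0.6, and consistency 0.9, one sub-query scores 0.5·0.8 + 0.3·0.6 + 0.2·0.9 = 0.76. If a second sub-query scores 0.41, the aggregate is 0.41, however strong the first one was.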
How to interpret. A heuristic, not a calibrated probability. 0.78 does not mean "78% chance the recommendation is correct"; it means the engine judged the supporting evidence strong but not exceptional. The CxH platform floor is currently 0.5 — anything strictly below that is suppressed with suppression_reason: "below_platform_threshold".
How to use it.
- Set minimum_confidence on your request to enforce a per-request floor; 0.5 is a reasonable starting point and matches the platform floor.
- If your UI surfaces a "confidence" indicator, display the value but don't render it as a percentage, since that miscommunicates certainty to end users.
- Always parse it. A suppressed response still carries the score (e.g. 0.42 with suppression_reason: "below_platform_threshold"), which is useful for telemetry on which queries are being filtered.
Caveats.
- Calibration is heuristic, not validated against partner-user outcomes. Treat values as relative ordering; absolute meanings may shift across model_version updates.
- Exact 0.0 with recommendation: null and no suppression_reason is an internal failure path, not a low-evidence path. Surface as an error condition rather than "low confidence" in your UI.
- The min() aggregation means one weak sub-query in a multi-step query can tank an otherwise-strong response. High relevance on returned citations + low recommendation_confidence is the signature of this case, not a contradiction.
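One way to keep these cases apart in client code is a small classifier over the parsed response body. This is a minimal sketch: the function name and the returned labels are hypothetical, and only the field names come from this reference.

```python
from typing import Any


def classify_response(resp: dict[str, Any]) -> str:
    """Map a 200 response body to a coarse outcome label."""
    recommendation = resp.get("recommendation")
    reason = resp.get("suppression_reason")
    confidence = resp.get("recommendation_confidence")

    if recommendation is not None:
        return "served"
    if reason is not None:
        return f"suppressed:{reason}"
    if confidence == 0.0:
        # Exact 0.0, null recommendation, no suppression_reason:
        # documented as an internal failure path, not low evidence.
        return "internal_failure"
    # Null recommendation without a reason, e.g. an implicit-trigger
    # response where suggested_questions is the primary payload.
    return "no_recommendation"
```

Route "internal_failure" to your error handling and include trace_id in the support ticket; the other labels are normal outcomes.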
Suppression
The recommendation engine has two confidence floors:
- Platform floor (CxH-side, non-configurable) — minimum confidence required for CxH to assert a wellness recommendation at all.
- Client floor (minimum_confidence on the request): your per-request floor.
If recommendation_confidence is below either floor, the response is 200 OK with recommendation: null and one of:
| suppression_reason | When |
|---|---|
| below_platform_threshold | Below CxH-side platform floor. CxH decided we can't responsibly answer. |
| below_client_threshold | Above platform floor but below your request's minimum_confidence. Your filter, not ours. |
| insufficient_signal | Not enough partner data (e.g. only one day of sleep, no HRV). Re-send with more TENANT.* history if available. |
Suppressed responses still return recommendation_confidence, suggested_questions, trace_id, served_at, model_version, and warnings. recommendation is null and citations is an empty array.
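The interaction of the two floors can be sketched as a pure function, useful for predicting in tests which reason a given score would produce. It assumes the platform floor of 0.5 quoted in "Recommendation confidence" (described there as current, so it may change) and a strict "below" comparison for both floors. insufficient_signal depends on payload contents and is not modeled here.

```python
from typing import Optional

# Assumption: the platform floor is "currently 0.5" per this reference.
PLATFORM_FLOOR = 0.5


def expected_suppression(confidence: float, minimum_confidence: float) -> Optional[str]:
    """Return the suppression_reason the two floors imply, or None."""
    if confidence < PLATFORM_FLOOR:
        # The platform floor is checked first: CxH-side, non-configurable.
        return "below_platform_threshold"
    if confidence < minimum_confidence:
        return "below_client_threshold"
    return None
```

Setting minimum_confidence below 0.5 therefore has no effect under these assumptions: the platform floor already suppresses everything under it.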
Shape A vs Shape B
The response shape is the same in both cases — what changes is which fields are primary for partner display:
- Shape A (explicit query, query field present): recommendation is the primary payload. suggested_questions are 0–5 optional follow-ups.
- Shape B (implicit trigger, query omitted): suggested_questions is the primary payload; these are CxH-generated prompts the user can tap. recommendation may be null even when confidence is high (we don't proactively answer a question the user didn't ask).
Your integration decides which shape to use based on the UI surface.
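In display code, the choice can hinge on whether the request carried a query. The helper below is a hypothetical sketch of that branching, not part of any CxH SDK.

```python
from typing import Any


def primary_payload(request_body: dict[str, Any], response: dict[str, Any]) -> tuple[str, Any]:
    """Pick the field a UI should lead with, based on the request shape."""
    if request_body.get("query") is not None:
        # Shape A: explicit query, the recommendation leads.
        return "recommendation", response.get("recommendation")
    # Shape B: implicit trigger, the tappable prompts lead.
    return "suggested_questions", response.get("suggested_questions", [])
```

In Shape A the returned recommendation can still be None when the response was suppressed, so the caller should handle that case before rendering.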
Sample full request (Shape B / implicit trigger)
Code
Note: no query field. The response will have suggested_questions populated and recommendation may be null — that's expected Shape B behavior.
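A Shape B body can be assembled along the lines below. This is a sketch under stated assumptions, not the canonical sample: the builder name is hypothetical, the partner payload key is passed in because this reference writes it as the placeholder TENANT, and the record contents are left to the caller since per-record fields are defined in the /api reference. uuid4 is used because it is in the Python standard library; UUID v7 is the recommended choice where your runtime supports it.

```python
import json
import uuid
from typing import Any


def build_implicit_request(
    tenant_key: str,
    tenant_payload: dict[str, Any],
    *,
    timezone: str,
    cycle_phase: str,
    cycle_day: int,
    life_stage: str,
    safety_mode: str = "strict",
    minimum_confidence: float = 0.5,
) -> dict[str, Any]:
    """Build a Shape B (implicit-trigger) body: no `query` field."""
    return {
        "request_id": str(uuid.uuid4()),  # idempotency key
        "timezone": timezone,
        "safety_mode": safety_mode,
        "minimum_confidence": minimum_confidence,
        "user_context": {
            "cycle_phase": cycle_phase,
            "cycle_day": cycle_day,
            "life_stage": life_stage,
        },
        tenant_key: tenant_payload,
    }


body = build_implicit_request(
    "TENANT",  # placeholder key as written in this reference
    {"sleep": [{"average_hrv": 42}]},  # hypothetical record field
    timezone="Europe/London",
    cycle_phase="luteal",
    cycle_day=21,
    life_stage="reproductive",
)
payload = json.dumps(body)  # ready to send as the POST body
```

Reuse the same request_id when retrying the same logical request, since it is the idempotency key; see idempotency.md for the retry rules.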
What's not in this reference
- The formal JSON Schema is at docs.collectivex.health/api (auto-generated from the OpenAPI spec, kept in sync on every change). Use it for code generation.
- Error response shapes — see errors.md.
- Retry semantics — see idempotency.md.