CollectiveX Health Partner API

Idempotency

Every POST /v1/TENANT/recommendation is idempotent on the (client_id, request_id) pair for 5 minutes.

TL;DR

  • Generate a fresh UUID (v7 preferred, v4 acceptable) for request_id per logical request.
  • If your request times out or you hit a retryable error (429 / 5xx), retry with the same request_id.
  • Within 5 minutes, the same request_id returns the cached response — no re-processing, no double-charge, no duplicate audit-trail row.

How the cache key is computed

Code
cache_key = hash(client_id + ":" + request_id)
TTL       = 300 seconds
storage   = Redis (origin-side, EU region)
  • client_id is the partner tenant identifier (cxh-sandbox-TENANT or cxh-prod-TENANT). The gateway derives it from your API key via the injected X-Zuplo-Partner-Id header — you don't send it explicitly.
  • request_id is the UUID string you send.
  • Scope: per-partner. Your request_id = "abc" and another partner's request_id = "abc" are different cache keys — no cross-partner collision.
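The per-partner scoping above can be sketched as follows. This is illustrative only: the gateway's actual hash function and key layout are not documented, so SHA-256 and the example tenant identifiers are assumptions.

```python
import hashlib

def cache_key(client_id: str, request_id: str) -> str:
    # Illustrative only: the gateway's real hash is internal.
    return hashlib.sha256(f"{client_id}:{request_id}".encode()).hexdigest()

# The same request_id under different partners yields different keys,
# so there is no cross-partner collision.
assert cache_key("cxh-prod-a", "abc") != cache_key("cxh-prod-b", "abc")
```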

When the cache serves a replay

A repeat request hits the cache iff all three hold:

  1. Same client_id (automatic — derived from your API key).
  2. Same request_id in the body.
  3. Less than 5 minutes since the original request was served.

Cache hit: response carries X-CxH-Cache: hit and served_at reflects the original processing time, not the replay time.

Cache miss: response carries X-CxH-Cache: miss and the request is processed from scratch.

You can therefore drive cache-aware logic off the header rather than parsing served_at deltas:

Code
r = httpx.post(...)
if r.headers.get("X-CxH-Cache") == "hit":
    # idempotent replay — same recommendation as before
    ...

When the cache does NOT serve a replay

  • request_id is different — fresh request.
  • More than 5 minutes have passed — cache entry expired; fresh request.
  • The original request returned 422 invalid_request or 422 out_of_scope — validation failures are not cached. Retry with the same request_id re-runs validation.
  • The original request returned 401 / 403 (gateway or consent) — gateway rejections are not cached. Fix the auth issue first.

UUID version recommendations

  • UUID v7 (preferred): time-sortable. Easier to correlate with your own logs if you're debugging a 48-hour window of requests — sorting by request_id sorts by time.
  • UUID v4: random. Works fine. Harder to correlate visually.
  • Anything else (v1, custom schemes, incrementing integers): technically valid as long as it's a string, but fights the spirit of idempotency. Use a real UUID.
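A minimal helper for generating IDs per the recommendation above, assuming Python: prefer uuid.uuid7 where the standard library provides it (recent interpreter versions) and fall back to v4 otherwise.

```python
import uuid

def new_request_id() -> str:
    # Prefer time-sortable UUIDv7 when the stdlib provides uuid.uuid7;
    # otherwise fall back to random UUIDv4, which works fine too.
    gen = getattr(uuid, "uuid7", uuid.uuid4)
    return str(gen())
```

Generate one ID per logical request, then reuse it across every retry of that request.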

Choosing your retry strategy

Scenario A: client-side timeout (no response from us)

Your client sent the request but timed out before our response reached you. Unclear whether we processed it.

  • Retry with the same request_id.
  • If we had already processed it: you get the cached response (200 OK with the original citations).
  • If we hadn't: we process fresh.
  • Either way, no duplicate audit-trail row, no duplicate recommendation.

Scenario B: 429 rate-limited

You hit 30 req/min. Zuplo rejected before the origin saw the request.

  • Wait for Retry-After seconds.
  • Retry with the same request_id.
  • The origin never saw the earlier attempt, so the cache has no entry. First real processing happens on retry.

Scenario C: 502 persistence_failed

We generated a recommendation but couldn't persist it to the audit-trail. We fail-closed, so we didn't serve the response.

  • Retry after 1s with the same request_id.
  • The cache has no entry (we failed before caching).
  • On retry, we regenerate. In practice the result is deterministic given the same inputs, so you'll get functionally the same recommendation.

Scenario D: 500 internal_error

Unexpected failure somewhere in the pipeline.

  • Retry after 1–2s with the same request_id.
  • Exponential backoff if it persists.
  • Include the trace_id in any support ticket.

Anti-patterns

  • Do not generate a new request_id for retries. That bypasses idempotency and risks a duplicate recommendation + duplicate audit-trail row.
  • Do not reuse a request_id across logically distinct requests. If the user asks a different question, that's a different request_id. Reusing an ID for different content will serve the cached (old) response and silently ignore your new content.
  • Do not parse served_at as the response generation time on a replay. It reflects the original processing time. If the difference matters for your UI, track your local send time instead.
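If your UI needs the actual round-trip latency, one approach (a sketch, not part of the API) is to time the call locally rather than trusting served_at:

```python
import time

def timed_call(fn, *args, **kwargs):
    # On a cache-hit replay, served_at reflects the ORIGINAL processing
    # time, so measure latency with a local monotonic clock instead.
    start = time.monotonic()
    result = fn(*args, **kwargs)
    return result, time.monotonic() - start

# e.g.: r, elapsed = timed_call(httpx.post, url, json=body)
```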

Audit-trail implication

Every served response (cache hit or miss) corresponds to exactly one row in our partner audit-trail collection. The row is written on the cache miss that produced the original response. Cache hits do not write new rows — they serve what's already stored.

This matters for:

  • Billing: one row = one billable unit. Retries within the 5-minute window are free.
  • Audit: a repeat request_id that served from cache will not show up as a new audit-trail event.
  • Rate limit: cache hits still count against your 30 req/min. The rate limit is enforced at the Zuplo edge before the cache lookup. If you're hitting rate limits on replays, back off per Retry-After.

Example: safe retry loop

Code
import time
import uuid

import httpx

def request_recommendation(body: dict, *, max_retries: int = 3) -> dict:
    request_id = str(uuid.uuid4())  # or uuid7 if you have it
    body = {**body, "request_id": request_id}

    for attempt in range(max_retries + 1):
        try:
            r = httpx.post(
                f"{CXH_BASE_URL}/v1/{TENANT}/recommendation",
                headers={"Authorization": f"ApiKey {CXH_API_KEY}"},
                json=body,
                timeout=30.0,
            )
        except httpx.TimeoutException:
            if attempt < max_retries:
                time.sleep(2 ** attempt)  # 1s, 2s, 4s
                continue
            raise

        if r.status_code == 429:
            retry_after = int(r.headers.get("Retry-After", 1))
            time.sleep(retry_after)
            continue

        if r.status_code in (500, 502, 503) and attempt < max_retries:
            time.sleep(2 ** attempt)
            continue

        r.raise_for_status()
        return r.json()

    raise RuntimeError(f"Exceeded {max_retries} retries for request_id={request_id}")

Note the request_id is generated once, outside the retry loop. Every retry reuses it. This is the correct pattern.

Last modified on April 29, 2026