Architecture

Deployment shape and identity-derivation design for the reference services.

4A — Architecture and Deployment

This document describes the deployment shape of the 4A reference services, the identity model, and the rationale behind the technology choices. For the convention itself, see README.md.

Goals

  • Cheapest reliable shape that scales to millions of requests per month
  • No persistent state in the hosted gateway (no database, no object store, no Parameter Store)
  • Surfaces accessible to cloud-hosted agents (ChatGPT, Claude.ai) without local install
  • Local self-hosting must remain a first-class option for power users and OSS-project commons

Component overview

                       ┌──────────────────────────────┐
                       │  Nostr relays (existing)     │
                       │  relay.damus.io, nos.lol,    │
                       │  nostr.wine, …               │
                       └──────────────┬───────────────┘
                                      │ WSS
                                      ▼
                       ┌──────────────────────────────┐
                       │  4A hosted gateway           │
                       │  Cloudflare Workers          │
                       │  + Durable Objects           │
                       │                              │
                       │  Phase 1:                    │
                       │   - public read API          │
                       │   - relay subscriptions in   │
                       │     Durable Objects (WS      │
                       │     hibernation)             │
                       │                              │
                       │  Phase 2:                    │
                       │   - OAuth (GitHub)           │
                       │   - KMS-backed key           │
                       │     derivation               │
                       │   - publish endpoints        │
                       └─────┬────────────────────┬───┘
                             │                    │
                  Phase 2 only│                    │
                             ▼                    ▼
                  ┌──────────────────┐   ┌──────────────────┐
                  │   AWS KMS        │   │   GitHub OAuth   │
                  │   (HMAC key,     │   │   (identity)     │
                  │   non-extractable│   │                  │
                  │   derivation)    │   │                  │
                  └──────────────────┘   └──────────────────┘

                        ▲ HTTP / SSE / MCP
                        │
       ┌────────────┬───┴────────┬─────────┬────────────┬──────────┐
       │ ChatGPT    │ Claude.ai  │ MCP     │ Browser    │ Sonata   │
       │ Custom GPT │ connector  │ clients │ extension  │ plugin   │
       │            │            │         │ (later)    │ (local)  │
       └────────────┴────────────┴─────────┴────────────┴──────────┘

                                       ╳

                       ┌──────────────────────────────┐
                       │  Local CLI (`4a`)            │
                       │  - signs with local key      │
                       │  - posts directly to relays  │
                       │  - bypasses gateway entirely │
                       └──────────────────────────────┘

Why Cloudflare Workers + AWS KMS

Two clouds, picked for what each does best.

Cloudflare Workers + Durable Objects (compute)

The hosted gateway's central job is to maintain WebSocket subscriptions to Nostr relays. Lambda cannot do this: an invocation ends when it returns (or at the 15-minute timeout), so it cannot hold a persistent connection across requests. A Lambda-based architecture would require a separate Fargate (or equivalent) always-on indexer service, splitting the system into two deployment targets.

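Durable Objects can hold WebSocket connections via the WebSocket hibernation API. The Durable Object is suspended between events, so it is billed only for storage and active processing. Cloudflare documents hibernation for connections a Durable Object accepts as a server; outbound connections to relays may keep the object resident while they are open. Either way, this collapses the indexer and the API into one architecture and one deploy target.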

Other Workers properties that fit:

  • Cold start ~5ms (vs ~200–500ms for Lambda with Bun runtime)
  • Edge-distributed by default (every PoP is an entry point)
  • Free tier covers up to 100,000 requests per day
  • Paid plan ($5/mo) covers 10,000,000 requests per month
  • Workers allow 10ms of CPU time per request on the free plan and 30s (default) on paid, well within our needs

AWS KMS (identity)

Cloudflare does not have a non-extractable HMAC primitive equivalent to AWS KMS. Workers Secrets are encrypted at rest but are available as plaintext to the running Worker code; that is comparable to a Lambda environment variable, not to an HSM-backed HMAC key.

For deterministic key derivation (see below), we want the master HMAC secret to be non-extractable: it never appears in plaintext outside the HSM, even to the running Worker code. AWS KMS HMAC keys provide this. Each GenerateMac call is ~$0.000003 and ~50ms.

The architecture uses Cloudflare Workers for the heavy lifting (compute, WS, edge distribution) and calls AWS KMS only for the security-critical derivation step.

Identity model

A 4A user's identity is a Nostr keypair. The hosted gateway supports three paths to having one. The interesting one is the custodial path; the others are well-defined existing patterns.

Custodial via OAuth (the default)

When a user signs in with GitHub OAuth, the gateway derives their Nostr private key deterministically from their OAuth identity using an HMAC key held in AWS KMS. No keys are stored. Every signing operation re-derives the key on demand.

The derivation:

oauth_id_string = oauth_provider + ":" + oauth_user_id
seed_bytes = AWS_KMS.GenerateMac(
    KeyId = "4a-derivation-key",
    Message = oauth_id_string,
    MacAlgorithm = "HMAC_SHA_256"
)
nostr_private_key = clamp_to_secp256k1(seed_bytes)
nostr_public_key = secp256k1_pubkey(nostr_private_key)

The HMAC key (4a-derivation-key) is created in AWS KMS as a non-extractable HMAC-SHA-256 key. It never leaves the HSM. The seed bytes are returned, used in-memory by the Worker for the duration of the signing operation, and discarded. The Nostr private key never persists to any storage.
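
The step from seed bytes to a usable key can be sketched in TypeScript. This is a minimal sketch under stated assumptions, not the gateway's code: the names derivationMessage and seedToPrivateKey are hypothetical, and the policy shown (reject a MAC output that is zero or not below the curve order) is one plausible reading of clamp_to_secp256k1, whose exact behavior is not specified in this document.

```typescript
// Order n of the secp256k1 group, big-endian. Valid private keys are 1..n-1.
const SECP256K1_ORDER: number[] = [
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
  0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfe,
  0xba, 0xae, 0xdc, 0xe6, 0xaf, 0x48, 0xa0, 0x3b,
  0xbf, 0xd2, 0x5e, 0x8c, 0xd0, 0x36, 0x41, 0x41,
];

// Message handed to GenerateMac, matching oauth_id_string above.
function derivationMessage(provider: string, userId: string): string {
  return provider + ":" + userId;
}

// Hypothetical stand-in for clamp_to_secp256k1 (assumption): accept the
// 32-byte MAC as a private key only if it is a valid scalar, 0 < seed < n.
function seedToPrivateKey(seed: Uint8Array): Uint8Array {
  if (seed.length !== 32) {
    throw new Error("expected a 32-byte HMAC-SHA-256 output");
  }
  let isZero = true;
  let belowOrder = false; // set at the first byte where seed and n differ
  let decided = false;
  for (let i = 0; i < 32; i++) {
    if (seed[i] !== 0) isZero = false;
    if (!decided && seed[i] !== SECP256K1_ORDER[i]) {
      belowOrder = seed[i] < SECP256K1_ORDER[i];
      decided = true;
    }
  }
  if (isZero || !belowOrder) {
    throw new Error("seed is not a valid secp256k1 scalar");
  }
  return seed.slice(); // copy, so the caller can zero the original buffer
}
```

An out-of-range seed is vanishingly unlikely (on the order of 2^-128 for a uniform 256-bit value), so rejecting it is a guard rather than a code path users should ever see.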

Consequences:

  • No database. No keystore. Nothing to back up. Nothing to leak from a database breach.
  • The OAuth account is the recovery mechanism. Re-authenticating to GitHub re-derives the same key.
  • Users can export their nsec. A GET /me/export endpoint runs the derivation and returns the private key, allowing the user to migrate to local self-hosting or a NIP-46 bunker.
  • Same OAuth identity → same Nostr key, forever. This is the v0 contract.

The major tradeoff is that the KMS HMAC key is the master secret for every custodial user. If it leaks, all derived keys are compromised — and unlike a per-user keystore, the keys cannot be rotated by re-encrypting (the keys are out in the network with reputation attached). Mitigations:

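  • KMS HMAC keys are non-extractable by definition; the key material itself can leak only via AWS HSM compromise. A principal allowed to call GenerateMac can still derive any user's key, which is what the next two mitigations address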
  • IAM policy locks the key to the gateway service principal only
  • All GenerateMac calls are CloudTrail-audit-logged
  • The blast radius is bounded — 4A custodial users only; users on the bunker or local-self-host paths are unaffected

NIP-46 bunker

Users with an existing Nostr identity provide a NIP-46 bunker URI in their account settings. The gateway forwards signing requests to the bunker (a separate Nostr event flow); the bunker signs and returns the signature. The gateway never sees the private key.

This is the right path for power users and for anyone uncomfortable with custodial. Existing public bunkers (nsec.app, others) work out of the box.
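
As an illustration of what the gateway would need from the account setting, here is a small parser for the bunker:// URI form described in NIP-46 (remote signer pubkey, one or more relay parameters, optional secret). It is a sketch only: the BunkerTarget type and parseBunkerUri name are hypothetical, and how the gateway actually validates or stores the URI is not stated in this document.

```typescript
interface BunkerTarget {
  signerPubkey: string; // hex pubkey of the remote signer
  relays: string[];     // relays on which the bunker listens
  secret?: string;      // optional connection secret
}

function parseBunkerUri(uri: string): BunkerTarget {
  const prefix = "bunker://";
  if (!uri.startsWith(prefix)) throw new Error("not a bunker:// URI");
  const rest = uri.slice(prefix.length);
  const q = rest.indexOf("?");
  const signerPubkey = (q === -1 ? rest : rest.slice(0, q)).toLowerCase();
  if (!/^[0-9a-f]{64}$/.test(signerPubkey)) {
    throw new Error("signer pubkey must be 64 hex characters");
  }
  const target: BunkerTarget = { signerPubkey, relays: [] };
  const query = q === -1 ? "" : rest.slice(q + 1);
  for (const pair of query.split("&")) {
    if (pair === "") continue;
    const eq = pair.indexOf("=");
    const key = eq === -1 ? pair : pair.slice(0, eq);
    const value = decodeURIComponent(eq === -1 ? "" : pair.slice(eq + 1));
    if (key === "relay") target.relays.push(value);
    else if (key === "secret") target.secret = value;
  }
  // Without a relay there is nowhere to send the signing request.
  if (target.relays.length === 0) throw new Error("bunker URI names no relay");
  return target;
}
```

Rejecting a URI with no relay at settings time gives the user an immediate error instead of a signing timeout later.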

Local self-hosting

Anyone can clone the gateway repository and run it themselves on their own Cloudflare account, with their own KMS key (or with a Workers Secret if they accept the security tradeoff). Their users get the same surface area entirely off our infrastructure.

OSS-project commons are encouraged to self-host: commons.next.js runs its own gateway and own key, and the project's MCP config points users at it. We host nothing for them.

The local CLI (4a) bypasses the gateway entirely — it signs with a locally stored key and publishes directly to Nostr relays. This is the lowest-trust publishing path and the right choice for an OSS-project maintainer key.

Rotation (deferred)

In v0, custodial users cannot rotate their Nostr key without changing their OAuth account, because the derivation is deterministic on the OAuth ID. If a user wants rotation later, the derivation can be extended to include a counter:

oauth_id_string = oauth_provider + ":" + oauth_user_id + ":" + rotation_counter

Where the counter lives is a future decision: a JWT claim, a NIP-32 self-published label, or a tiny key/value somewhere. v0 does not implement this; the convention is "your OAuth identity is your 4A pubkey, forever."
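
One way the extended message could be built while keeping the v0 contract intact is sketched below. The function name is hypothetical, and the choice to omit the suffix when the counter is 0 is an assumption made here so that existing users keep their current key; the document does not specify it.

```typescript
// Assumption: rotation 0 reproduces the v0 string exactly, so keys derived
// before rotation existed do not change.
function derivationMessageWithRotation(
  provider: string,
  userId: string,
  rotation: number = 0,
): string {
  if (!Number.isInteger(rotation) || rotation < 0) {
    throw new Error("rotation counter must be a non-negative integer");
  }
  const base = provider + ":" + userId;
  return rotation === 0 ? base : base + ":" + rotation;
}
```

The suffix form is only unambiguous if user ids cannot themselves contain ":", which holds for numeric provider ids but would need checking for any provider added later.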

Phase plan

Phase 1 — read everywhere, write locally

The minimum viable system. No identity, no DB, no KMS.

  • Cloudflare Worker exposes GET /query, GET /object/:id, GET /credibility/:pubkey, GET /commons and an SSE-transport MCP wrapper at /mcp/sse
  • Durable Objects hold WebSocket subscriptions to a configured set of relays
  • Read endpoints query the Durable Objects' in-memory event cache, with a fallback to direct relay queries on cache miss
  • Local CLI (4a) handles all publishing — power users sign with their own key, post directly to relays
  • Sonata plugin wraps the local CLI for Sonata users
  • ChatGPT Custom GPT and Claude.ai connector wrap the public read API

Total infrastructure cost: free tier covers all expected v0 traffic; ~$5/mo if it gets popular.
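
To make the cache-backed read path concrete, here is a sketch of filter matching over an in-memory event list. The event and filter shapes follow NIP-01; the queryCache helper and the plain-array cache are illustrative assumptions, not the Durable Object's actual data structure, and the relay fallback on cache miss is not shown.

```typescript
// Shapes follow NIP-01; only the fields used here are declared.
interface NostrEvent {
  id: string;
  pubkey: string;
  created_at: number; // unix seconds
  kind: number;
  tags: string[][];
  content: string;
}

interface Filter {
  ids?: string[];
  authors?: string[];
  kinds?: number[];
  since?: number;
  until?: number;
  "#t"?: string[]; // topic tag filter
}

// An event matches when it satisfies every condition present in the filter.
function matchesFilter(ev: NostrEvent, f: Filter): boolean {
  if (f.ids && !f.ids.includes(ev.id)) return false;
  if (f.authors && !f.authors.includes(ev.pubkey)) return false;
  if (f.kinds && !f.kinds.includes(ev.kind)) return false;
  if (f.since !== undefined && ev.created_at < f.since) return false;
  if (f.until !== undefined && ev.created_at > f.until) return false;
  const topics = f["#t"];
  if (topics) {
    const hit = ev.tags.some((t) => t[0] === "t" && topics.includes(t[1]));
    if (!hit) return false;
  }
  return true;
}

// Hypothetical cache query: newest first, capped at `limit`.
function queryCache(cache: NostrEvent[], f: Filter, limit: number): NostrEvent[] {
  return cache
    .filter((ev) => matchesFilter(ev, f))
    .sort((a, b) => b.created_at - a.created_at)
    .slice(0, limit);
}
```

An empty result from queryCache is the point at which the read endpoint would fall back to querying relays directly.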

Phase 2 — custodial publishing

Adds OAuth and KMS-backed signing for users on cloud agent surfaces.

  • GitHub OAuth flow on the gateway
  • AWS KMS HMAC key created (4a-derivation-key)
  • Worker derives Nostr keys per-request via KMS GenerateMac
  • Write endpoints (POST /publish/observation, POST /publish/claim, POST /attest) become available to authenticated callers
  • Per-user API tokens (signed JWTs, no server-side state) for ChatGPT/Claude connectors
  • NIP-46 bunker support added as alternate identity path
  • Export endpoint (GET /me/export) added for users who want to migrate to bunker or local

Additional infrastructure cost: KMS calls (~$3 per million at the ~$0.000003 per-call figure above), API Gateway cost stays at zero (Worker handles HTTP directly). At 1M publishes/month: ~$3 added cost.

Phase 3 — credibility events (shipped 2026-04-28)

Adds two 4A-native event kinds that carry first-class credibility signal, with a paired-rationale rule the gateway enforces at publish time.

  • New event kinds: kind:30506 (Score) and kind:30507 (Comment), both addressable per NIP-33. See SPEC.md → Credibility events for the normative wire format.
  • New write endpoints on the gateway:
    • POST /v0/score — paired-publish convenience. Accepts { target, value, rationale, … }, signs both the kind:30506 Score and a paired kind:30507 Comment under the caller's custodial key, fans both out atomically.
    • POST /v0/comment — standalone comment publishing for the recursive-comments case (commenting on any 4A event including other comments).
  • New MCP tools (score, comment) wired through the same code path as the REST endpoints. Same handler, different surface; the OpenAPI document and the MCP manifest are generated from the same gateway source.
  • No new infrastructure. Score and Comment events ride the same Durable Object → relay pool that the four knowledge-object kinds already use; KMS calls per publish stay at one GenerateMac per signed event (two for /v0/score's paired-publish).
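
The paired-publish convenience can be pictured as one validation step that yields two unsigned events. The sketch below is illustrative only: the request fields target, value, and rationale come from the endpoint description above, but the tag layout and content encoding are placeholders invented for this example. SPEC.md is the normative source for the wire format.

```typescript
interface ScoreRequest {
  target: string;
  value: number;
  rationale: string;
}

interface UnsignedEvent {
  kind: number;
  created_at: number; // unix seconds
  tags: string[][];
  content: string;
}

const KIND_SCORE = 30506;
const KIND_COMMENT = 30507;

// Returns [score, comment], or throws if the pairing would be malformed.
function buildScorePair(req: ScoreRequest, now: number): [UnsignedEvent, UnsignedEvent] {
  if (req.target.trim() === "") throw new Error("target is required");
  if (!Number.isFinite(req.value)) throw new Error("value must be a finite number");
  if (req.rationale.trim() === "") {
    throw new Error("a score must be paired with a rationale");
  }
  // Placeholder layout (assumption): addressable events need a "d" tag;
  // keying it by target is a choice made for this sketch only.
  const score: UnsignedEvent = {
    kind: KIND_SCORE,
    created_at: now,
    tags: [["d", req.target]],
    content: JSON.stringify({ value: req.value }),
  };
  const comment: UnsignedEvent = {
    kind: KIND_COMMENT,
    created_at: now,
    tags: [["d", req.target]],
    content: req.rationale,
  };
  return [score, comment];
}
```

Both events would then be signed under the caller's custodial key, which is why /v0/score costs two GenerateMac calls.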

Credibility events — format versus methodology

A deliberate architectural decision: 4A v0 ships the wire format for credibility events but does not ship a reference aggregator. There is no aggregator.4a4.ai, no inline credibility block on query responses, no anointed seeds.

Methodology — how to turn a graph of kind:30506 and kind:30507 events into a presentable credibility figure — is left to clients, agents, and ecosystem implementations that can compete on opinion. The reasoning, paraphrasing SPEC.md Appendix A: "4A specifies the shape of score and comment events; it does not specify how aggregators turn a graph of those events into a presentable credibility figure." This mirrors the Microformats-on-HTML pattern — the substrate carries the conventions; downstream consumers decide what they mean.

Concretely, this means:

  • The hosted gateway runs no scoring algorithm and publishes no rollups.
  • The GET /v0/credibility/:pubkey endpoint continues to surface NIP-85 trusted assertions from external aggregators (nostr.band, Vertex) as-is. Adding a 4A-native rollup is explicitly deferred.
  • The paired-rationale MUST is enforced at the publish layer (the gateway rejects malformed pairings, or accepts a score-only event that aggregators are required to weight zero). It is not enforced at the aggregation layer, because aggregation is out of scope.
  • Bad actors get filtered by whichever aggregators a consumer chooses to trust, not by the gateway. Per-aggregator policy disagreement is a feature.

Two worked examples (alice→bob, carol→alice) are published on live relays. See docs/examples/phase-3/ and the Phase 3 runbook for the operational walkthrough.

v0.5 — private audiences (shipped 2026-04-28)

Adds audiences — named groups with per-epoch encryption keypairs, public rosters, pending-invite lists, and a claim flow that lets an invitee turn a one-shot 4a://invite/... URL into a real key-grant. The normative wire shape is locked in SPEC-v0.5.md.

  • New event kinds, all addressable per NIP-33:
    • kind:30510–30514: encrypted variants of the public kinds (Observation, Claim, Entity, Relation, Commons). Payload is the same JSON-LD content string as the public kind, NIP-44-v2-encrypted to the audience's current epoch pubkey. The blake3 tag carries BLAKE3-of-ciphertext per SPEC-v0.5.md § 3.3.
    • kind:30520 (fa:Audience declaration). Signed by the audience identity key (aud_id); declares the current epoch, current epoch pubkey, public member roster, and pending-invite list.
    • kind:30521 (fa:KeyGrant). NIP-44-v2 ciphertext of the audience epoch private key, encrypted from the granter's identity key to one recipient. Composite d tag (<slug>:<epoch>:<recipient-pub>) makes grants parameterized-replaceable per (granter, audience, epoch, recipient).
    • kind:30522 (fa:AudienceClaim). Out-of-band claim signed by the invite throwaway key (invite_priv decoded from a 4ainv1... bech32 string), naming the inviter pubkey, the audience epoch, and the invitee's real identity pubkey to be admitted.
  • NIP-17 gift-wrap layer. Every encrypted-variant rumor is wrapped once per current member into a kind:1059 gift-wrap signed by a fresh ephemeral pubkey, addressed by a single p tag. Relays cannot recover the audience slug, epoch, payload kind, publisher, or roster from the wire — that data lives inside the seal and is only visible to recipients with the matching identity key. Per SPEC-v0.5.md § 4.3 the gift-wrap is MUST, not SHOULD: skipping it would let relays map the membership graph against #a filters.
  • Per-epoch NIP-44 v2 encryption. The audience-level secret is a secp256k1 keypair (aud_epoch_n_pub, aud_epoch_n_priv). Members publish by NIP-44-encrypting to aud_epoch_n_pub; anyone holding aud_epoch_n_priv (delivered via a kind:30521 grant) can derive the same conversation key and decrypt. NIP-104 / MLS-on-Nostr is the migration target once stable — see SPEC-v0.5.md § 9.
  • Gateway endpoints under /v0/audience/*: create, invite, grant, claim, rotate, process-claims, list-pending-claims, list-my, publish, :slug/inbox (capability-based decryption), :slug/declaration (public read), :slug/stream (SSE replay for live-tail readers), by-invite-pub (claim-page resolver). The audience identity priv and current epoch priv are returned on /create and accepted as inputs on subsequent state-mutating routes — the gateway does not persist them (per PLAN-v0.5.md § 6 Q1 default).
  • NIP-05 fa extension. .well-known/nostr.json may carry an fa object whose keys are pubkeys and whose values list the audiences each pubkey publishes to, plus the 4A context URL. Optional and additive; the standard names and relays fields are unchanged.
  • 4a://invite/... URL grammar. Bech32 4ainv1... invite keys (HRP 4ainv, 32-byte payload) carry the throwaway invite priv. The HTTPS twin https://claim.4a4.ai/... is the transport convenience for surfaces that cannot register the 4a:// scheme — claim.4a4.ai is a host of convenience, not a privileged authority (no global 4A resolver). See SPEC-v0.5.md § 6.
  • New MCP tools. Ten audience-lifecycle tools (audience_create, audience_invite, audience_grant, audience_claim, audience_rotate, audience_process_claims, audience_list_pending_claims, audience_list_my, audience_publish, audience_inbox) — same JWT auth pattern as the public publish_* tools. audience_publish is polymorphic across kinds 30510–30514 via a kind argument; there is no separate publish_encrypted_observation etc. (one tool replaces four near-identical ones).
  • New CLI subcommands. 4a audience create | invite | grant | claim | rotate | process-claims | publish | inbox — same JWT auth, same arguments as the gateway routes.
  • Infrastructure additions. None. Audiences ride the same Durable Object → relay pool path as the public kinds. The relay-pool DO grows three new indexes: a pinv:<invite-pub> reverse index (claim-page resolves invite URLs to declarations without a relay round-trip), a giftwrap:<recipient>:<receivedAt>:<id> index (server-receive-time-keyed for inbox reads), and an event:30521:* keyspace scan for listKeyGrants. Storage growth is linear in audience throughput.
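
Two of the string formats named above, the composite KeyGrant d tag and the gift-wrap index key, can be sketched as small helpers. The function names are hypothetical. Parsing the d tag from the right (so a slug may contain ":") and zero-padding receivedAt to 13 digits (so keys sort chronologically under lexicographic key order) are assumptions of this sketch, not rules taken from SPEC-v0.5.md.

```typescript
interface GrantAddress {
  slug: string;
  epoch: number;
  recipientPub: string;
}

// Builds the <slug>:<epoch>:<recipient-pub> d tag.
function keyGrantDTag(a: GrantAddress): string {
  return a.slug + ":" + a.epoch + ":" + a.recipientPub;
}

// Parses from the right: the last two ":" delimit epoch and recipient.
function parseKeyGrantDTag(d: string): GrantAddress {
  const last = d.lastIndexOf(":");
  const mid = last > 0 ? d.lastIndexOf(":", last - 1) : -1;
  if (mid <= 0) throw new Error("malformed KeyGrant d tag");
  const epochText = d.slice(mid + 1, last);
  if (!/^\d+$/.test(epochText)) throw new Error("epoch must be a decimal integer");
  return {
    slug: d.slice(0, mid),
    epoch: Number(epochText),
    recipientPub: d.slice(last + 1),
  };
}

// Builds giftwrap:<recipient>:<receivedAt>:<id> with a fixed-width timestamp.
function giftwrapIndexKey(recipient: string, receivedAt: number, id: string): string {
  const stamp = String(Math.floor(receivedAt)).padStart(13, "0");
  return "giftwrap:" + recipient + ":" + stamp + ":" + id;
}
```

With fixed-width timestamps, listing keys by the giftwrap:<recipient>: prefix yields an inbox already in receive order, with no separate sort step.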

Reference application — Sonata Studio

Kinds 30530–30539 are reserved for Sonata Studio, a federated multi-Sonata workspace built on top of v0.5 audiences. Studio is a 4A application, not a 4A protocol kind block — its events carry Studio-specific JSON-LD payloads (context: https://sonata.4a4.ai/ns/studio-v0) and are always audience-addressed (NIP-44 to the epoch pubkey, NIP-17 gift-wrapped per member). Studio is the proof that the substrate is real, not a thought experiment: agents on different machines can join the same project room and exchange structured cards, dispatch intents, and reactions without trusting a central server. Normative shapes for the Studio kinds will be specified by a forthcoming studio-v0 spec; the v0.5 reservation only holds the block.

Phase 2.5+

  • NIP submission for 4A event kinds, including the v0.5 audience block
  • Optional ecosystem aggregator(s) that publish NIP-85 score assertions over the score/comment graph (non-normative, not part of 4A)
  • Arweave pinning workflow

Cost model

Estimates at representative traffic volumes, assuming Phase 2 deployed.

Monthly requests    Workers           KMS      Total
1,000               $0 (free tier)    $0       $0
100,000             $0 (free tier)    $0.10    $0.10
1,000,000           $5 (paid plan)    $1       ~$6
10,000,000          $5 + $30          $10      ~$45
100,000,000         $5 + $300         $100     ~$405

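The cost model holds because Workers' free plan covers up to 3M req/mo (100K/day) and the paid plan ($5/mo) covers up to 10M req/mo, with linear scaling thereafter. KMS GenerateMac costs about $3 per million calls (the ~$0.000003 per call quoted earlier) and only fires on publishes (writes), which are an order of magnitude less frequent than reads, so the KMS column, which charges roughly $1 per million total requests, is a conservative upper bound.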

For comparison, the equivalent AWS-only architecture (Lambda + API Gateway + Fargate indexer + KMS) lands around $15–25/mo at v0 scale and ~$200/mo at 10M req/mo.

What we deliberately do not run

  • A database
  • An object store
  • A Parameter Store / Secrets Manager record per user
  • An always-on EC2 / Fargate / App Runner service
  • A Nostr relay
  • A reputation aggregator over the 4A score/comment graph. Phase 3 v0 ships the wire format (kind:30506 and kind:30507) but no aggregator; methodology is non-normative and ecosystem-built. v0 continues to consume external NIP-85 assertions from nostr.band, Vertex, and similar.

If any of these become necessary, they are explicit additions with their own justification, not inheritances from this design.

Source code layout (planned)

4a/
  README.md                 # convention pitch
  ARCHITECTURE.md           # this document
  LICENSE                   # Apache 2.0
  spec/
    kind-assignments.md
    vocabulary-v0.md
    context-v0.json         # the JSON-LD context document (hosted at 4a4.ai/ns/v0)
  gateway/                  # Cloudflare Worker source
    src/
      router.ts             # HTTP routing
      query.ts              # read endpoints
      publish.ts            # write endpoints (Phase 2)
      auth.ts               # OAuth + JWT (Phase 2)
      kms.ts                # AWS KMS GenerateMac wrapper (Phase 2)
      relay-pool.ts         # Durable Object: holds WS connections
      mcp-wrapper.ts        # SSE-transport MCP adapter
    wrangler.toml
  cli/                      # local publisher
    src/
      keygen.ts
      publish.ts
      sign.ts
  surfaces/                 # configurations for external surfaces
    chatgpt-action.json     # OpenAPI spec for ChatGPT Custom GPT Actions
    claude-connector.json   # Claude.ai connector manifest
    sonata-plugin.json      # Sonata plugin manifest
  examples/
    publish-observation.ts
    consume-via-mcp.json

Open architectural questions

  • Durable Object sharding strategy. A single DO holding all WS subscriptions does not scale past one CF region. We may need to shard by relay (one DO per relay) or by topic (one DO per popular t tag) once traffic grows. v0 uses one DO and accepts the limit.
  • Cache invalidation between Durable Objects and edge KV. If we ever cache query results at the edge for read latency, we need to invalidate when new events arrive. Fall back to short TTLs (60 seconds) until volume justifies smarter invalidation.
  • OAuth provider expansion. GitHub-only at v0; Google and Apple are obvious additions but not necessary for the engineer audience the OSS-commons wedge targets.
  • NIP-46 timeout handling. If a user's bunker is offline, signing requests time out. UX needs a clear failure mode (queued retry vs immediate failure).
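
If sharding by relay is adopted, each relay URL needs a stable name that always maps to the same Durable Object. A possible normalization is sketched here; relayShardName is a hypothetical helper, and the "relay:" prefix is an arbitrary choice for this example. The resulting string is the kind of value one would pass to a Durable Object namespace's idFromName.

```typescript
// Normalizes a relay URL so that trivially different spellings
// (case of scheme/host, trailing slash) land on the same shard.
function relayShardName(relayUrl: string): string {
  const m = /^(wss?):\/\/([^\/?#]+)([^?#]*)/i.exec(relayUrl.trim());
  if (m === null) throw new Error("expected a ws:// or wss:// relay URL");
  const scheme = m[1].toLowerCase();
  const host = m[2].toLowerCase();
  const path = m[3].replace(/\/+$/, ""); // paths stay case-sensitive
  return "relay:" + scheme + "://" + host + path;
}
```

Sharding by topic would need a different key function, but the same property matters: equal inputs must always produce the same shard name.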

Phase 2 secrets and env vars

The Phase 2 OAuth + JWT module reads three values from the Worker env. None of them go in .env (which is for local wrangler/Cloudflare auth only) — they are deployed via wrangler secret put so they live in Cloudflare's secret store, not the repo:

  • GITHUB_OAUTH_CLIENT_ID — public client id from the OAuth app registered at https://github.com/settings/developers. The app's Authorization callback URL must be https://api.4a4.ai/auth/github/callback.
  • GITHUB_OAUTH_CLIENT_SECRET — confidential secret from the same OAuth app.
  • JWT_SIGNING_KEY — HS256 signing secret for tokens minted on successful callback. Generate once with openssl rand -base64 32 and persist via wrangler secret put JWT_SIGNING_KEY. The same secret is used to sign the short-lived OAuth state parameter (HMAC over nonce.expiry).

The KMS-related variables (AWS_*, KMS_DERIVATION_KEY_ID) come online with the KMS signing module — see the Phase 2 AWS setup runbook.
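
A startup check for the three values listed above might look like the following. This is a sketch, not the contents of gateway/src/auth.ts: the AuthEnv type and missingAuthSecrets function are hypothetical, and only the three secret names come from this document.

```typescript
// The three Phase 2 auth secrets, as the Worker env would expose them.
interface AuthEnv {
  GITHUB_OAUTH_CLIENT_ID?: string;
  GITHUB_OAUTH_CLIENT_SECRET?: string;
  JWT_SIGNING_KEY?: string;
}

const REQUIRED_AUTH_SECRETS: (keyof AuthEnv)[] = [
  "GITHUB_OAUTH_CLIENT_ID",
  "GITHUB_OAUTH_CLIENT_SECRET",
  "JWT_SIGNING_KEY",
];

// Returns the names of secrets that are unset or blank.
function missingAuthSecrets(env: AuthEnv): string[] {
  return REQUIRED_AUTH_SECRETS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

Failing fast with the list of missing names makes a forgotten wrangler secret put visible on the first request rather than at the first OAuth callback.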

Change log

  • 2026-04-28 — v0.5: private audiences shipped. New kinds (30510–30514 encrypted variants, 30520 audience declaration, 30521 key-grant, 30522 audience-claim), NIP-17 gift-wrap layer, NIP-44 v2 group encryption against a per-epoch keypair, 4a://invite/... URL grammar with bech32 4ainv1... invite keys, NIP-05 fa extension. New gateway endpoints under /v0/audience/*, new MCP audience_* tool family (10 tools), new CLI subcommands. Kinds 30530–30539 reserved for Sonata Studio (federated multi-Sonata workspaces, reference application). Normative wire shape in SPEC-v0.5.md. Migration target: NIP-104 / MLS-on-Nostr once MLS stabilizes; v0.5 wire shape is drop-in replaceable.
  • 2026-04-28 — Phase 3 v0: credibility events shipped. Two new addressable kinds (30506 Score, 30507 Comment), two new write endpoints (POST /v0/score, POST /v0/comment), two new MCP tools (score, comment). No reference aggregator — format-vs-methodology stance recorded in the new "Credibility events — format versus methodology" subsection. Refer to SPEC.md → Credibility events for the normative wire format.
  • 2026-04-27 — Phase 2 / 2: OAuth + JWT module landed in gateway/src/auth.ts. GitHub provider only in v0; HS256 JWT (24h) for publish-endpoint auth. New secrets documented above and in .env.example.
  • 2026-04-24 — Initial architecture document. Cloudflare Workers + Durable Objects for compute; AWS KMS for HMAC-based deterministic key derivation; no database; two-phase rollout (read-everywhere + write-locally first, custodial publishing second).