Glyphoxa Web Management Service: Architecture Plan

1. Executive Summary

This document defines the architecture for the Glyphoxa Web Management Service, a separate, independently deployable service that provides self-service management for Dungeon Masters, tenant administration, NPC configuration, billing, and observability.

Key constraints:

  • Separate service (NOT embedded in the gateway); this is a firm requirement
  • Must scale to >1,000 concurrent users
  • Self-service SaaS model with tiered pricing
  • DMs manage their own campaigns, NPCs, and sessions without operator intervention

This supersedes the “Option A” (embedded in gateway) approach from the 2026-03-23 admin UI plan. The gateway remains a lean voice-pipeline orchestrator; user-facing management moves to its own service.


2. Service Topology

                                    ┌─────────────────────────┐
                                    │     CDN / Edge Cache    │
                                    │  (static SPA assets,    │
                                    │   voice sample cache)   │
                                    └────────────┬────────────┘
                                                 │
                                    ┌────────────▼────────────┐
                               ┌───►│   Reverse Proxy (NPM /  │◄────┐
                               │    │   Traefik / Caddy)      │     │
                               │    └──┬──────────┬─────────┬─┘     │
                               │       │          │         │       │
                    Browser    │ /app/*│   /api/* │   /gw/* │       │
                    (SPA)──────┘       │          │         │       │
                                       ▼          ▼         ▼       │
                            ┌──────────────┐ ┌──────────┐ ┌───────┐ │
                            │  SPA Static  │ │   Web    │ │Gateway│ │
                            │  File Server │ │ Mgmt API │ │ Admin │ │
                            │  (or CDN)    │ │   (Go)   │ │  API  │ │
                            └──────────────┘ └────┬─────┘ └───┬───┘ │
                                                  │           │     │
                        ┌─────────────────────────┼───────────┘     │
                        │                         │                 │
              ┌─────────▼──────────┐   ┌──────────▼───────────┐     │
              │   PostgreSQL       │   │   Gateway (gRPC)     │     │
              │  (shared DB,       │   │  - Session control   │     │
              │   tenant schemas)  │   │  - NPC control       │     │
              └────────┬───────────┘   │  - Audio bridge      │     │
                       │               └──────────────────────┘     │
              ┌────────▼───────────┐                                │
              │   Vault (Transit)  │   External Services:           │
              │  - API key encrypt │   - Stripe (billing)           │
              │  - Bot token store │   - Discord OAuth2             │
              │  - Secret mgmt     │   - Google OAuth2              │
              └────────────────────┘   - ElevenLabs (voice samples) │
                                       - S3/MinIO (file storage)    │
                                       - OTel Collector ────────────┘

Communication paths

| From         | To                | Protocol                 | Purpose                                    |
|--------------|-------------------|--------------------------|--------------------------------------------|
| Browser      | Web Mgmt API      | HTTPS (REST + WebSocket) | All user interactions                      |
| Web Mgmt API | PostgreSQL        | TCP (pgx pool)           | Tenant, user, campaign, NPC, billing data  |
| Web Mgmt API | Gateway Admin API | HTTP (internal)          | Session control, bot management (proxy)    |
| Web Mgmt API | Gateway gRPC      | gRPC (internal)          | Live session status, NPC mute/unmute/speak |
| Web Mgmt API | Vault             | HTTP                     | Encrypt/decrypt API keys, bot tokens       |
| Web Mgmt API | Stripe            | HTTPS                    | Subscription lifecycle, webhooks           |
| Web Mgmt API | Discord/Google    | HTTPS                    | OAuth2 flows                               |
| Web Mgmt API | S3/MinIO          | HTTPS                    | Voice sample upload/download               |
| Web Mgmt API | OTel Collector    | gRPC (OTLP)              | Traces, metrics, logs                      |

Why separate from the gateway?

| Concern           | Embedded (Option A)                                          | Separate (Option B, chosen)                    |
|-------------------|--------------------------------------------------------------|------------------------------------------------|
| Scaling           | Scales with gateway (voice pipeline)                         | Scales independently based on web traffic      |
| Release cycle     | Frontend changes require gateway redeploy (voice disruption) | Deploy frontend/backend independently          |
| Security surface  | Auth, OAuth, Stripe in the voice-critical path               | Isolated; gateway stays lean and locked down   |
| Failure isolation | UI bug or spike can impact voice sessions                    | Web service crash doesn’t affect live sessions |
| Multi-gateway     | One UI per gateway instance                                  | Single management plane for N gateways         |
| Complexity        | Simpler for single-tenant                                    | Required for multi-tenant SaaS at >1,000 users |

3. Tech Stack

3.1 Backend: Go

| Choice                           | Rationale |
|----------------------------------|-----------|
| Go 1.26+                         | Same language as gateway: shared domain types, DB migration patterns, Vault client code. One team, one language, one toolchain. |
| net/http (stdlib)                | Standard library router (http.ServeMux with method patterns, Go 1.22+), same as gateway. No framework dependency. |
| pgx v5                           | Same PostgreSQL driver as gateway. Connection pooling via pgxpool. |
| google.golang.org/grpc           | For calling gateway’s gRPC services (session status, NPC control). |
| golang-jwt/jwt/v5                | JWT issuance and validation (access + refresh tokens). |
| markbates/goth or coreos/go-oidc | OAuth2 provider abstraction (Discord, Google, GitHub). |
| stripe/stripe-go                 | Stripe subscription management + webhook verification. |

Why Go over Node/Python?

  • The entire team (Luk) writes Go. Shared types with gateway (tenant model, NPC definition, config structs) can live in an importable pkg/ package, so no cross-language serialization is needed.
  • Go’s concurrency model handles WebSocket fan-out and long-polling efficiently.
  • Single static binary, the same deployment model as the gateway.
  • If we ever need to merge web management back into the gateway (unlikely), the code is directly compatible.

Why not a Go framework (Gin, Echo, Fiber)?

  • stdlib net/http + http.ServeMux is sufficient for REST APIs (Go 1.22+ has method routing).
  • The gateway already uses this pattern, and consistency matters more than framework features.
  • Middleware chains are trivial with func(http.Handler) http.Handler.
  • No dependency churn from framework major versions.

3.2 Frontend: React + Vite + Tailwind

| Choice                | Rationale |
|-----------------------|-----------|
| React 19              | Largest ecosystem, easiest to find contributors, Luk can find help. |
| Vite 6                | Fast HMR, modern ESM bundling, minimal config. |
| TypeScript 5.x        | Type safety for API contracts. Non-negotiable for >30 components. |
| Tailwind CSS 4        | Utility-first, no custom design system needed. Mobile-first responsive by default. |
| shadcn/ui             | Copy-paste component library (Radix primitives). Accessible, customizable, no npm lock-in. |
| TanStack Query v5     | Server-state management with caching, optimistic updates, background refetch. |
| TanStack Router       | Type-safe routing with search params. Better than React Router for data-heavy apps. |
| Recharts              | Charts for usage/billing dashboards. Lightweight, built for React. |
| react-hook-form + zod | Form validation, paired with zod schemas generated from the OpenAPI spec. |

Why SPA over SSR (Next.js, Remix)?

  • Management dashboards are inherently interactive and have no SEO requirement.
  • Voice preview requires Web Audio API (client-only).
  • WebSocket session monitoring is client-driven.
  • SPA deploys as static files to CDN, so there are zero Node.js servers in production.
  • Simpler deployment: static files + Go API binary.

Why not HTMX?

  • Same reasoning as the original plan: voice preview, drag-and-drop NPC ordering, real-time session monitoring, and rich form editors all require significant client-side JS. HTMX would need so many hx-ext scripts that it becomes React-with-extra-steps.

3.3 API Contract: OpenAPI 3.1

  • Go backend generates OpenAPI spec from struct tags + route definitions (via swaggo/swag or oapi-codegen annotations).
  • TypeScript client auto-generated from spec (openapi-typescript-codegen or @hey-api/openapi-ts).
  • Zod validation schemas generated from spec for form validation.
  • Single source of truth: backend structs drive everything.

4. Multi-Tenancy Model

Decision: Shared database, tenant_id column isolation

┌───────────────────────────────────────────────────────┐
│                   PostgreSQL                          │
│                                                       │
│  public schema (shared tables):                       │
│  ┌──────────┐ ┌──────────┐ ┌────────────────┐         │
│  │  users   │ │ tenants  │ │ subscriptions  │         │
│  └──────────┘ └──────────┘ └────────────────┘         │
│  ┌──────────┐ ┌──────────┐ ┌────────────────┐         │
│  │campaigns │ │ sessions │ │ usage_records  │         │
│  └──────────┘ └──────────┘ └────────────────┘         │
│  ┌──────────┐ ┌──────────┐                            │
│  │  npcs    │ │ invoices │                            │
│  └──────────┘ └──────────┘                            │
│                                                       │
│  Per-tenant schemas (existing, used by workers):      │
│  ┌─────────────────┐  ┌─────────────────┐             │
│  │ tenant_luk.*    │  │ tenant_demo.*   │             │
│  │ session_entries │  │ session_entries │             │
│  │ chunks          │  │ chunks          │             │
│  │ entities        │  │ entities        │             │
│  │ relationships   │  │ relationships   │             │
│  │ recaps          │  │ recaps          │             │
│  └─────────────────┘  └─────────────────┘             │
└───────────────────────────────────────────────────────┘

Why shared DB + tenant_id columns (not separate DBs per tenant)?

| Factor               | Shared DB                                   | Separate DBs                                   |
|----------------------|---------------------------------------------|------------------------------------------------|
| Operational cost     | 1 DB to manage, backup, monitor             | N DBs, linear ops growth                       |
| Cross-tenant queries | Simple JOINs (admin dashboards, billing)    | Requires federation or app-level aggregation   |
| Connection count     | 1 pool shared across tenants                | N pools, connection explosion at >100 tenants  |
| Schema migrations    | Run once                                    | Run N times (migration orchestrator needed)    |
| Tenant isolation     | Row-level (RLS or app-enforced WHERE)       | Full schema isolation                          |
| Scale limit          | ~10,000 tenants before RLS overhead matters | Unlimited (each DB independent)                |

Isolation mechanism:

  • All queries include WHERE tenant_id = $1, enforced at the repository layer.
  • PostgreSQL Row-Level Security (RLS) as defense-in-depth (Phase 2).
  • The existing per-tenant schemas (tenant_<id>.*) for session entries, memory chunks, and knowledge graph data remain unchanged; workers already use these.
  • The web management service reads per-tenant schemas for transcript viewing and knowledge graph browsing.

Compatibility with existing gateway DB:

The web management service shares the same PostgreSQL instance. Tables owned by the gateway (tenants, sessions, usage_records) are accessed read-only by the web service for display. Write operations on these entities go through the gateway’s Admin API (or a new internal API) to maintain the gateway as the source of truth for session-critical state.

New tables (users, campaigns, subscriptions, invoices, voice_samples, audit_log) are owned by the web management service.


5. Service Boundaries

What lives where

| Concern                 | Web Management Service                                | Gateway                            | Shared (DB)                          |
|-------------------------|-------------------------------------------------------|------------------------------------|--------------------------------------|
| User auth (OAuth2, JWT) | ✓ owns                                                | -                                  | users table                          |
| Tenant CRUD             | ✓ owns (replaces gateway admin API for external use)  | Internal API only                  | tenants table                        |
| Campaign CRUD           | ✓ owns                                                | Reads campaign context             | campaigns table                      |
| NPC CRUD                | ✓ owns (HTTP)                                         | Reads NPC defs at session start    | npc_definitions table                |
| Session start/stop      | Proxies to gateway                                    | ✓ owns (orchestrator + dispatcher) | sessions table                       |
| Session monitoring      | Reads DB + gateway gRPC                               | ✓ owns (live state)                | sessions table                       |
| Usage tracking          | Reads + displays                                      | ✓ owns (writes during sessions)    | usage_records table                  |
| Billing/subscriptions   | ✓ owns                                                | Checks quota via usage store       | subscriptions table                  |
| Voice sample upload     | ✓ owns                                                | -                                  | S3/MinIO                             |
| Transcript viewing      | ✓ reads                                               | Worker writes                      | tenant_<id>.session_entries          |
| Knowledge graph browse  | ✓ reads                                               | Worker writes                      | tenant_<id>.entities/relationships   |
| Provider config         | ✓ manages overrides                                   | Reads at session start             | provider_configs table               |
| Bot token management    | ✓ manages (via Vault)                                 | Uses at runtime                    | Vault Transit                        |
| Observability dashboard | ✓ owns (queries Grafana/OTel)                         | Emits telemetry                    | OTel Collector                       |
| Support tickets         | ✓ owns (integrates third-party)                       | -                                  | External system                      |

Internal communication contract

The web management service calls the gateway for operations the gateway must own (voice session lifecycle):

// Web service → Gateway (HTTP, internal network only)
POST   /internal/v1/sessions/{tenant_id}/start   // Start voice session
POST   /internal/v1/sessions/{session_id}/stop    // Stop voice session
GET    /internal/v1/sessions/active               // List active sessions

// Web service → Gateway (gRPC, internal network only)
rpc GetStatus(GetStatusRequest) returns (GetStatusResponse)
rpc ListNPCs(ListNPCsRequest) returns (ListNPCsResponse)
rpc MuteNPC / UnmuteNPC / SpeakNPC               // NPC control during session

The gateway’s existing external Admin API (/api/v1/tenants) can be deprecated or restricted to internal-only once the web management service takes over tenant CRUD. During migration, both coexist.


6. Authentication & Authorization

Auth architecture

┌──────────┐   OAuth2    ┌──────────────┐    JWT     ┌──────────────┐
│ Discord  │◄───────────►│              │───────────►│              │
│ Google   │ code grant  │  Web Mgmt    │  access +  │   Browser    │
│ GitHub   │             │  Service     │  refresh   │   (SPA)      │
└──────────┘             │              │            └──────┬───────┘
                         │  /auth/*     │                   │
                         └──────┬───────┘                   │
                                │                           │
                         ┌──────▼───────┐            ┌──────▼───────┐
                         │  users table │            │ All /api/*   │
                         │  + sessions  │            │ requests     │
                         └──────────────┘            │ carry JWT    │
                                                     └──────────────┘

Token strategy

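| Token              | Storage                                  | Lifetime    | Purpose                             |
|--------------------|------------------------------------------|-------------|-------------------------------------|
| Access token (JWT) | In-memory (SPA state)                    | 15 minutes  | API authorization                   |
| Refresh token      | HttpOnly, Secure, SameSite=Strict cookie | 7 days      | Silent token refresh                |
| CSRF token         | Custom header (X-CSRF-Token)             | Per-session | Prevent CSRF on cookie-based refresh |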

Why short-lived access tokens + refresh cookie?

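  • Access token is held in memory (not localStorage), so injected scripts cannot read it from persistent browser storage. This narrows, but does not eliminate, the XSS token-theft risk.
  • Refresh token lives in an HttpOnly cookie, so it is inaccessible to JavaScript.
  • The 15-minute access token lifetime limits the damage window if a token leaks.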
  • Refresh endpoint rotates the refresh token (rotation + reuse detection).

User model

CREATE TABLE users (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id       UUID REFERENCES tenants(id) ON DELETE CASCADE,
    email           TEXT UNIQUE,
    name            TEXT NOT NULL,
    avatar_url      TEXT,
    role            TEXT NOT NULL DEFAULT 'dm',
    -- OAuth provider links
    discord_id      TEXT UNIQUE,
    google_id       TEXT UNIQUE,
    github_id       TEXT UNIQUE,
    -- Billing
    stripe_customer_id TEXT UNIQUE,
    -- Lifecycle
    email_verified  BOOLEAN NOT NULL DEFAULT false,
    last_login_at   TIMESTAMPTZ,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_users_tenant ON users(tenant_id);

Roles

| Role         | Description                     | Scope                                    |
|--------------|---------------------------------|------------------------------------------|
| super_admin  | Platform operator (Luk)         | Global: all tenants                      |
| tenant_owner | Tenant creator / billing contact | Own tenant: full control                 |
| dm           | Dungeon Master                  | Own tenant: campaigns, NPCs, sessions    |
| viewer       | Read-only (invited players)     | Own tenant: transcripts, session history |

Permission matrix:

| Action                      | super_admin | tenant_owner | dm | viewer |
|-----------------------------|-------------|--------------|----|--------|
| Manage all tenants          | ✓           |              |    |        |
| Platform observability      | ✓           |              |    |        |
| Manage own tenant settings  | ✓           | ✓            |    |        |
| Manage billing/subscription | ✓           | ✓            |    |        |
| Invite/remove users         | ✓           | ✓            |    |        |
| Manage campaigns            | ✓           | ✓            | ✓  |        |
| Manage NPCs                 | ✓           | ✓            | ✓  |        |
| Start/stop sessions         | ✓           | ✓            | ✓  |        |
| Upload voice samples        | ✓           | ✓            | ✓  |        |
| View transcripts            | ✓           | ✓            | ✓  | ✓      |
| View usage                  | ✓           | ✓            | ✓  | ✓      |
| View session history        | ✓           | ✓            | ✓  | ✓      |
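Because each role in the matrix above has every permission of the role to its right, the check can be reduced to a minimum-role lookup. The sketch below assumes that hierarchy holds going forward; the action identifiers (for example "npcs.manage") are hypothetical names, not an existing API. Tenant scoping ("own tenant" versus global) is a separate check that this example does not cover.

```go
package main

import "fmt"

type Role string

const (
	Viewer      Role = "viewer"
	DM          Role = "dm"
	TenantOwner Role = "tenant_owner"
	SuperAdmin  Role = "super_admin"
)

// rank orders roles from least to most privileged.
var rank = map[Role]int{Viewer: 1, DM: 2, TenantOwner: 3, SuperAdmin: 4}

// minRole maps each action (hypothetical identifiers) to the lowest role
// allowed to perform it, mirroring the permission matrix.
var minRole = map[string]Role{
	"tenants.manage_all":     SuperAdmin,
	"platform.observability": SuperAdmin,
	"tenant.settings":        TenantOwner,
	"billing.manage":         TenantOwner,
	"users.invite_remove":    TenantOwner,
	"campaigns.manage":       DM,
	"npcs.manage":            DM,
	"sessions.start_stop":    DM,
	"voice_samples.upload":   DM,
	"transcripts.view":       Viewer,
	"usage.view":             Viewer,
	"sessions.view_history":  Viewer,
}

// Can reports whether role may perform action. Unknown roles and unknown
// actions are denied by default.
func Can(role Role, action string) bool {
	need, knownAction := minRole[action]
	have, knownRole := rank[role]
	if !knownAction || !knownRole {
		return false
	}
	return have >= rank[need]
}

func main() {
	fmt.Println(Can(DM, "npcs.manage"))
	fmt.Println(Can(Viewer, "npcs.manage"))
	fmt.Println(Can(TenantOwner, "tenants.manage_all"))
}
```

If a future role breaks the strict hierarchy, the lookup would need to become an explicit role-to-action set instead.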

Self-service onboarding flow

1. DM visits glyphoxa.app → "Sign up with Discord"
2. Discord OAuth2 → identify + guilds scopes
3. Web service:
   a. Creates user record (role: tenant_owner)
   b. Creates tenant record (license_tier: shared, empty config)
   c. Redirects to onboarding wizard
4. Onboarding wizard:
   a. "Name your first campaign" → creates campaign
   b. "Add your Discord bot token" → encrypted via Vault
   c. "Select your guild" → guild picker from Discord API
   d. "Choose a plan" → Stripe checkout
   e. "Create your first NPC" → NPC editor
5. DM is live and can start sessions from Discord

7. API Design

Route structure

/auth/discord              GET    Initiate Discord OAuth2
/auth/discord/callback     GET    Discord OAuth2 callback
/auth/google               GET    Initiate Google OAuth2
/auth/google/callback      GET    Google OAuth2 callback
/auth/refresh              POST   Refresh access token
/auth/logout               POST   Revoke refresh token

/api/v1/me                 GET    Current user profile
/api/v1/me                 PUT    Update profile

/api/v1/tenants            POST   Create tenant (self-service)
/api/v1/tenants/{id}       GET    Get tenant
/api/v1/tenants/{id}       PUT    Update tenant settings
/api/v1/tenants/{id}       DELETE Delete tenant

/api/v1/tenants/{id}/users         GET    List users in tenant
/api/v1/tenants/{id}/users         POST   Invite user
/api/v1/tenants/{id}/users/{uid}   PUT    Update user role
/api/v1/tenants/{id}/users/{uid}   DELETE Remove user

/api/v1/campaigns                  POST   Create campaign
/api/v1/campaigns                  GET    List campaigns (tenant-scoped)
/api/v1/campaigns/{id}             GET    Get campaign
/api/v1/campaigns/{id}             PUT    Update campaign
/api/v1/campaigns/{id}             DELETE Delete campaign

/api/v1/campaigns/{id}/npcs        POST   Create NPC
/api/v1/campaigns/{id}/npcs        GET    List NPCs for campaign
/api/v1/npcs/{id}                  GET    Get NPC
/api/v1/npcs/{id}                  PUT    Update NPC
/api/v1/npcs/{id}                  DELETE Delete NPC
/api/v1/npcs/{id}/voice-preview    POST   Generate TTS preview audio

/api/v1/sessions                   GET    List sessions (filterable)
/api/v1/sessions/active            GET    Active sessions
/api/v1/sessions/{id}              GET    Session details
/api/v1/sessions/{id}/transcript   GET    Session transcript
/api/v1/sessions/{id}/stop         POST   Force-stop session
/api/v1/sessions/{id}/live         WS     Live transcript stream

/api/v1/voice-samples              POST   Upload voice sample
/api/v1/voice-samples              GET    List voice samples
/api/v1/voice-samples/{id}         GET    Get voice sample
/api/v1/voice-samples/{id}         DELETE Delete voice sample

/api/v1/usage                      GET    Usage summary (tenant-scoped)
/api/v1/usage/export               GET    Export usage as CSV

/api/v1/billing/subscription       GET    Current subscription
/api/v1/billing/subscription       POST   Create/change subscription
/api/v1/billing/portal             POST   Create Stripe billing portal session
/api/v1/billing/webhook            POST   Stripe webhook receiver

/api/v1/providers                  GET    List provider configs (redacted keys)
/api/v1/providers/{slot}           PUT    Update provider config
/api/v1/providers/{slot}/test      POST   Test provider connectivity

/api/v1/support/tickets            POST   Create support ticket
/api/v1/support/tickets            GET    List tickets
/api/v1/support/tickets/{id}       GET    Get ticket

# Super admin only
/api/v1/admin/tenants              GET    List all tenants
/api/v1/admin/observability        GET    System health + metrics
/api/v1/admin/users                GET    List all users

API gateway / reverse proxy

┌────────────────────────────────────────────────────┐
│          Reverse Proxy (Traefik / Caddy)           │
│                                                    │
│  app.glyphoxa.app/*          → SPA static files    │
│  app.glyphoxa.app/api/*      → Web Mgmt Service    │
│  app.glyphoxa.app/auth/*     → Web Mgmt Service    │
│                                                    │
│  gw.glyphoxa.app/internal/*  → Gateway Admin API   │
│  (internal network only, not internet-facing)      │
│                                                    │
│  TLS termination, rate limiting, request logging   │
└────────────────────────────────────────────────────┘

For local/K3s deployment, Nginx Proxy Manager (already in use) handles this. For production SaaS, Traefik (K8s-native) or Caddy (auto-TLS) are preferred.

Rate limiting:

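| Endpoint group               | Limit   | Window                   |
|------------------------------|---------|--------------------------|
| /auth/*                      | 10 req  | 1 min (per IP)           |
| /api/v1/npcs/*/voice-preview | 5 req   | 1 min (per user)         |
| /api/v1/billing/webhook      | 100 req | 1 min (Stripe IPs only)  |
| All other /api/*             | 60 req  | 1 min (per user)         |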

8. Scaling Strategy

Horizontal scaling

                  ┌────────────────────────┐
                  │     Load Balancer      │
                  └──┬────────┬────────┬───┘
                     │        │        │
                  ┌──▼───┐ ┌──▼───┐ ┌──▼───┐
                  │Web #1│ │Web #2│ │Web #3│   Stateless Go instances
                  └──┬───┘ └──┬───┘ └──┬───┘
                     │        │        │
                  ┌──▼────────▼────────▼───┐
                  │ PostgreSQL (pgx pool)  │   Connection pooling
                  │ + PgBouncer (optional) │
                  └────────────────────────┘

Why this works:

  • Web management service is stateless: all state lives in PostgreSQL + Vault + S3.
  • JWT validation is local (no session store needed).
  • WebSocket connections are per-instance. No cross-instance fan-out is needed, because each browser connects to one instance and sessions are scoped.
  • Go’s goroutine model handles thousands of concurrent connections per instance.

Scaling targets

| Component                    | 1-100 users        | 100-1,000 users          | 1,000+ users            |
|------------------------------|--------------------|--------------------------|-------------------------|
| Web service                  | 1 replica          | 2-3 replicas             | HPA (CPU-based)         |
| PostgreSQL                   | Single instance    | Single + read replica    | Primary + read replicas |
| Static assets                | Same origin        | CDN (Cloudflare/BunnyCDN) | CDN                     |
| File storage (voice samples) | Local disk / MinIO | MinIO                    | S3 / R2                 |
| Redis (optional)             | Not needed         | Rate limiting + sessions | Rate limiting + caching |

Database connection pooling

  • pgxpool in Go: per-instance pool (default: 10 idle, 25 max per instance).
  • At >500 users: add PgBouncer in transaction mode between Go instances and PostgreSQL to multiplex connections.
  • Read-heavy queries (transcript viewing, usage dashboards) can target a read replica (configurable DSN).

CDN for static assets

The SPA build output (dist/) is deployed to a CDN or object storage with aggressive caching:

  • index.html: Cache-Control: no-cache (always fresh, checks ETag)
  • assets/*.js / assets/*.css: Cache-Control: public, max-age=31536000, immutable (content-hashed filenames)
  • Voice sample playback URLs β€” signed, time-limited S3 presigned URLs
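For the same-origin stage, where the Go binary serves the SPA itself, the caching policy above could be applied with a small wrapper. This is a sketch under assumptions: the /assets/ prefix matches the build layout mentioned above, the dist directory name comes from the text, and cacheControlFor and withCacheHeaders are hypothetical names. On a CDN the equivalent rules would be set in the CDN or object-storage configuration instead.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// cacheControlFor picks a Cache-Control value from the request path:
// content-hashed assets are immutable, everything else must revalidate.
func cacheControlFor(urlPath string) string {
	if strings.HasPrefix(urlPath, "/assets/") {
		return "public, max-age=31536000, immutable"
	}
	return "no-cache"
}

// withCacheHeaders sets the header before delegating to the file server.
func withCacheHeaders(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Cache-Control", cacheControlFor(r.URL.Path))
		next.ServeHTTP(w, r)
	})
}

func main() {
	for _, p := range []string{"/index.html", "/assets/app.3f9c2a.js", "/campaigns/7"} {
		fmt.Printf("%-24s %s\n", p, cacheControlFor(p))
	}
	// In a server this handler would be mounted on the mux, for example:
	//   mux.Handle("/", withCacheHeaders(http.FileServer(http.Dir("dist"))))
	_ = withCacheHeaders(http.FileServer(http.Dir("dist")))
}
```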

9. Secret Management

Vault integration

┌──────────────────────────────────────────────────────┐
│                        Vault                         │
│                                                      │
│  Transit engine (glyphoxa-bot-tokens):               │
│  ├── Bot tokens (existing, shared with gateway)      │
│  └── Tenant API keys (BYO provider keys)             │
│                                                      │
│  KV v2 engine (glyphoxa-web/):                       │
│  ├── stripe-secret-key                               │
│  ├── discord-oauth-client-secret                     │
│  ├── google-oauth-client-secret                      │
│  ├── jwt-signing-key                                 │
│  └── s3-access-credentials                           │
│                                                      │
│  PKI engine (existing):                              │
│  └── mTLS certs for web-service ↔ gateway            │
└──────────────────────────────────────────────────────┘

Secret flow for “bring your own API keys”

1. DM enters ElevenLabs API key in the web UI
2. Web service calls Vault Transit: encrypt(plaintext=key, key=glyphoxa-bot-tokens)
3. Encrypted ciphertext stored in `provider_configs` table
4. At session start: gateway reads provider_configs, calls Vault Transit: decrypt()
5. Decrypted key passed to worker via gRPC StartSessionRequest (TLS-encrypted in transit)
6. Worker uses key for TTS calls, never persists it

Key principles:

  • Plaintext secrets never touch the database.
  • The web service can encrypt but only the gateway needs to decrypt (separation of concerns is possible via Vault policies, but the key is shared for simplicity now).
  • Vault Transit key rotation is transparent: old ciphertexts remain decryptable.
  • If Vault is unreachable, the web service rejects secret-write operations. There is no graceful degradation for writes; this is intentional for security.

10. Billing & Pricing Integration

Pricing tiers (from pricing assessment)

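| Tier              | Price  | Sessions/mo | NPCs      | Voices                    | Model tier   | Target           |
|-------------------|--------|-------------|-----------|---------------------------|--------------|------------------|
| Apprentice (Free) | $0     | 2           | 2         | Basic (gTTS)              | Gemini Flash | Trial            |
| Adventurer        | $9/mo  | 8           | 10        | Standard (ElevenLabs)     | GPT-4o-mini  | Casual DMs       |
| Dungeon Master    | $19/mo | Unlimited   | Unlimited | Premium voices            | GPT-4o       | Serious DMs      |
| Guild             | $29/mo | Unlimited   | Unlimited | Premium + custom training | GPT-4o       | Groups (5 seats) |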

Annual discount: 2 months free ($90/yr, $190/yr, $290/yr).

Stripe integration

┌─────────┐  checkout   ┌──────────┐  webhook    ┌──────────────┐
│ Browser │────────────►│  Stripe  │────────────►│ Web Service  │
│         │◄────────────│ Checkout │             │ /billing/    │
│         │  redirect   │          │             │  webhook     │
└─────────┘             └──────────┘             └──────┬───────┘
                                                        │
                                                 ┌──────▼───────┐
                                                 │ subscriptions│
                                                 │    table     │
                                                 └──────────────┘

Subscription data model:

CREATE TABLE subscriptions (
    id                  UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id           UUID NOT NULL UNIQUE REFERENCES tenants(id),
    stripe_subscription_id TEXT UNIQUE,
    stripe_customer_id  TEXT NOT NULL,
    tier                TEXT NOT NULL DEFAULT 'apprentice',
    status              TEXT NOT NULL DEFAULT 'active',  -- active, past_due, canceled, trialing
    current_period_start TIMESTAMPTZ,
    current_period_end  TIMESTAMPTZ,
    cancel_at           TIMESTAMPTZ,
    created_at          TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at          TIMESTAMPTZ NOT NULL DEFAULT now()
);

Webhook events handled:

| Event                         | Action                                            |
|-------------------------------|---------------------------------------------------|
| checkout.session.completed    | Create subscription record, upgrade tenant tier   |
| invoice.paid                  | Extend period, clear past_due status              |
| invoice.payment_failed        | Mark past_due, send email, grace period (7 days)  |
| customer.subscription.updated | Sync tier changes (up/downgrade)                  |
| customer.subscription.deleted | Downgrade to Apprentice (free) tier               |

Quota enforcement:

The gateway’s existing usage.Store.CheckQuota() mechanism is reused. The web management service writes monthly_session_hours to the tenant record based on the subscription tier. The gateway checks this at session start via ValidateAndCreate().

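| Tier           | monthly_session_hours | Max concurrent sessions |
|----------------|-----------------------|-------------------------|
| Apprentice     | 8 (≈2 sessions × 4h)  | 1                       |
| Adventurer     | 32 (≈8 sessions × 4h) | 1                       |
| Dungeon Master | 0 (unlimited)         | 3                       |
| Guild          | 0 (unlimited)         | 5                       |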

Self-hosted / BYO-keys mode

For users who self-host Glyphoxa (open-core model), the billing system is optional. Config flag --billing=disabled skips Stripe integration and sets all tenants to unlimited. The self-hosted user provides their own LLM/TTS/STT API keys.


11. Voice Sample Upload

Flow

1. DM uploads .wav/.mp3 in the NPC editor (max 10MB, 10 seconds)
2. Web service validates format (ffprobe), rejects invalid files
3. File stored in S3/MinIO: voice-samples/{tenant_id}/{sample_id}.wav
4. Metadata stored in DB: voice_samples table
5. For ElevenLabs custom voice: web service calls ElevenLabs Voice Clone API
6. Returns voice_id for use in NPC config
7. Preview endpoint returns presigned S3 URL (1-hour TTL)

Storage:

CREATE TABLE voice_samples (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id   UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
    name        TEXT NOT NULL,
    file_key    TEXT NOT NULL,          -- S3 object key
    file_size   BIGINT NOT NULL,
    duration_ms INT NOT NULL,
    format      TEXT NOT NULL,          -- wav, mp3
    provider_voice_id TEXT,             -- ElevenLabs voice ID after cloning
    status      TEXT NOT NULL DEFAULT 'uploaded',  -- uploaded, processing, ready, failed
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

12. Deployment Architecture

K3s deployment (current infrastructure)

┌──────────────────────────────────────────────────────────────────┐
│                         K3s Cluster                              │
│                                                                  │
│  ┌───────────────────┐  ┌───────────────────┐  ┌──────────────┐  │
│  │ web-mgmt (Deploy) │  │ gateway (Deploy)  │  │ postgres     │  │
│  │ replicas: 1-3     │  │ replicas: 1       │  │ (StatefulSet)│  │
│  │ port: 8080        │  │ ports: 8080,50051 │  │ port: 5432   │  │
│  └────────┬──────────┘  └────────┬──────────┘  └──────┬───────┘  │
│           │                      │                    │          │
│           └──────────────────────┼────────────────────┘          │
│                                  │                               │
│  ┌───────────────────┐  ┌────────▼──────────┐  ┌──────────────┐  │
│  │ vault             │  │ worker (Job, N)   │  │ minio        │  │
│  │ (StatefulSet)     │  │ ephemeral pods    │  │ (StatefulSet)│  │
│  │ port: 8200        │  └───────────────────┘  │ port: 9000   │  │
│  └───────────────────┘                         └──────────────┘  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │ Nginx Proxy Manager / Traefik Ingress                      │  │
│  │ app.glyphoxa.lan → web-mgmt:8080                           │  │
│  │ gw.glyphoxa.lan  → gateway:8080 (internal only)            │  │
│  └────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘

Helm chart additions

New chart or subchart: deploy/helm/glyphoxa-web/

# values.yaml (web management service)
replicaCount: 1

image:
  repository: ghcr.io/mrwong99/glyphoxa-web
  tag: latest

env:
  DATABASE_DSN: ""
  VAULT_ADDR: ""
  GATEWAY_INTERNAL_URL: "http://glyphoxa-gateway:8080"
  GATEWAY_GRPC_ADDR: "glyphoxa-gateway:50051"
  STRIPE_WEBHOOK_SECRET: ""
  S3_ENDPOINT: "http://minio:9000"
  OTEL_EXPORTER_OTLP_ENDPOINT: "http://otel-collector:4317"

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi

CI/CD pipeline

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│  Push to │───►│  CI      │───►│  Build   │───►│  Deploy  │
│  main    │    │  Checks  │    │  Images  │    │  K3s     │
└──────────┘    │          │    │          │    │          │
                │ lint     │    │ frontend │    │ helm     │
                │ test     │    │ backend  │    │ upgrade  │
                │ vet      │    │ multi-   │    │          │
                │ typecheck│    │ stage    │    │          │
                └──────────┘    └──────────┘    └──────────┘

Build pipeline:

  1. Frontend: npm ci && npm run build → dist/ folder
  2. Backend: Multi-stage Dockerfile:
    • Stage 1 (Node): Build SPA → dist/
    • Stage 2 (Go): COPY dist/ → embed → go build
    • Stage 3 (Distroless): Copy binary only
  3. Push: ghcr.io/mrwong99/glyphoxa-web:${SHA}
  4. Deploy: helm upgrade glyphoxa-web ./deploy/helm/glyphoxa-web

The SPA is embedded in the Go binary via //go:embed, so the web management service is a single binary that serves both the API and the static frontend. No separate static file server is needed.


13. Monitoring & Observability

Instrumentation

┌─────────────────────────────────────────────────────────────┐
│                   Web Management Service                    │
│                                                             │
│  ┌───────────┐  ┌───────────┐  ┌────────────────────────┐   │
│  │  Traces   │  │  Metrics  │  │  Structured Logs       │   │
│  │  (OTel)   │  │  (Prom)   │  │  (slog → JSON)         │   │
│  └─────┬─────┘  └─────┬─────┘  └───────────┬────────────┘   │
│        │              │                    │                │
│        └──────────────┼────────────────────┘                │
│                       │                                     │
│                ┌──────▼──────┐                              │
│                │ OTel SDK    │                              │
│                └──────┬──────┘                              │
└───────────────────────┼─────────────────────────────────────┘
                        │ OTLP/gRPC
                ┌───────▼────────┐
                │ OTel Collector │
                └───┬─────┬──────┘
                    │     │
             ┌──────▼┐  ┌─▼─────────┐
             │ Loki  │  │Prometheus │
             │(logs) │  │(metrics)  │
             └───┬───┘  └────┬──────┘
                 │           │
             ┌───▼───────────▼───┐
             │      Grafana      │
             │   (dashboards)    │
             └───────────────────┘

Key metrics (Prometheus)

# HTTP request metrics (auto-instrumented via OTel middleware)
http_server_duration_seconds{method, route, status_code}
http_server_active_requests{method, route}

# Business metrics
glyphoxa_web_active_users_total{tenant_id}
glyphoxa_web_signups_total{tier}
glyphoxa_web_subscription_changes_total{from_tier, to_tier}
glyphoxa_web_voice_previews_total{tenant_id}
glyphoxa_web_voice_uploads_total{tenant_id, status}

# Session proxy metrics
glyphoxa_web_session_starts_total{tenant_id, result}
glyphoxa_web_session_stops_total{tenant_id, reason}

Super admin observability dashboard

The super admin dashboard aggregates:

  1. System health: Gateway status, worker pod count, DB connection pool stats
  2. Business metrics: Signups, active subscriptions by tier, MRR, churn
  3. Usage: Total session hours, LLM tokens, STT seconds, TTS chars (all from usage_records)
  4. Per-tenant drill-down: Usage vs quota, session history, error rates
  5. Provider health: Latency P50/P99, error rates (from gateway's Prometheus metrics)

Implementation: Embed Grafana dashboards via iframe (Grafana supports anonymous/embedded mode), or build custom charts in React using the same Prometheus query API.

Health probes

GET /healthz         → 200 OK (liveness: process is running)
GET /readyz          → 200 OK / 503 (readiness: DB connected, Vault reachable)
GET /metrics         → Prometheus exposition format

14. Database Schema Overview

New tables (owned by web management service)

-- Users (see section 6)
-- Subscriptions (see section 10)
-- Voice samples (see section 11)

CREATE TABLE campaigns (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id   UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
    name        TEXT NOT NULL,
    game_system TEXT NOT NULL DEFAULT '',
    description TEXT NOT NULL DEFAULT '',
    settings    JSONB NOT NULL DEFAULT '{}',
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_campaigns_tenant ON campaigns(tenant_id);

CREATE TABLE provider_configs (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id   UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
    slot        TEXT NOT NULL,               -- llm, stt, tts, s2s, vad, embeddings
    provider    TEXT NOT NULL,               -- openai, elevenlabs, deepgram, etc.
    model       TEXT NOT NULL DEFAULT '',
    api_key_enc TEXT NOT NULL DEFAULT '',    -- Vault Transit encrypted
    base_url    TEXT NOT NULL DEFAULT '',
    options     JSONB NOT NULL DEFAULT '{}',
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE(tenant_id, slot)
);

CREATE TABLE audit_log (
    id          BIGSERIAL PRIMARY KEY,
    tenant_id   UUID REFERENCES tenants(id),
    user_id     UUID REFERENCES users(id),
    action      TEXT NOT NULL,               -- tenant.create, npc.update, session.stop, etc.
    resource_type TEXT NOT NULL,
    resource_id TEXT NOT NULL,
    details     JSONB,                       -- before/after diff
    ip_address  INET,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_audit_tenant_time ON audit_log(tenant_id, created_at DESC);

CREATE TABLE support_tickets (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id   UUID NOT NULL REFERENCES tenants(id),
    user_id     UUID NOT NULL REFERENCES users(id),
    external_id TEXT,                        -- ID in third-party system (Freshdesk, etc.)
    subject     TEXT NOT NULL,
    status      TEXT NOT NULL DEFAULT 'open',
    priority    TEXT NOT NULL DEFAULT 'normal',
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

Migration strategy

  • Use golang-migrate/migrate (same as gateway).
  • Migration files in internal/webmgmt/migrations/ (or similar).
  • Migrations run on service startup (same pattern as gateway).
  • Shared tables (tenants, sessions, usage_records) are NOT migrated by the web service; the gateway owns those schemas.

15. Project Structure

glyphoxa-web/                      # Could be a separate repo or a directory in the monorepo
├── cmd/
│   └── glyphoxa-web/
│       └── main.go                # Entry point, config, DI wiring, graceful shutdown
├── internal/
│   ├── auth/                      # OAuth2 providers, JWT, middleware
│   │   ├── discord.go
│   │   ├── google.go
│   │   ├── jwt.go
│   │   └── middleware.go
│   ├── api/                       # HTTP handlers
│   │   ├── campaigns.go
│   │   ├── npcs.go
│   │   ├── sessions.go
│   │   ├── users.go
│   │   ├── billing.go
│   │   ├── providers.go
│   │   ├── voice_samples.go
│   │   ├── admin.go               # Super admin endpoints
│   │   └── router.go              # Route registration + middleware chains
│   ├── store/                     # Database repositories
│   │   ├── users.go
│   │   ├── campaigns.go
│   │   ├── subscriptions.go
│   │   ├── voice_samples.go
│   │   ├── audit.go
│   │   └── providers.go
│   ├── gateway/                   # Gateway client (HTTP + gRPC)
│   │   ├── client.go
│   │   └── session_proxy.go
│   ├── billing/                   # Stripe integration
│   │   ├── stripe.go
│   │   └── webhook.go
│   ├── storage/                   # S3/MinIO file storage
│   │   └── s3.go
│   ├── vault/                     # Vault Transit client (reuse from gateway pkg/)
│   │   └── transit.go
│   ├── observe/                   # OTel setup
│   │   └── otel.go
│   └── migrations/                # golang-migrate SQL files
│       ├── 000001_users.up.sql
│       ├── 000001_users.down.sql
│       ├── 000002_campaigns.up.sql
│       └── ...
├── web/                           # SPA frontend source
│   ├── package.json
│   ├── vite.config.ts
│   ├── tsconfig.json
│   ├── src/
│   │   ├── main.tsx
│   │   ├── api/                   # Generated TypeScript client
│   │   ├── components/            # shadcn/ui + custom components
│   │   ├── pages/                 # Route-level components
│   │   ├── hooks/                 # TanStack Query hooks
│   │   └── lib/                   # Utils, auth context, theme
│   └── dist/                      # Build output (embedded into Go binary)
├── Dockerfile                     # Multi-stage: Node → Go → Distroless
├── Makefile
└── go.mod

Monorepo vs separate repo

Recommendation: Start in the monorepo (Glyphoxa/), extract later if needed.

  • Shared Go types (pkg/: tenant, NPC definition, config) are importable directly.
  • Single CI pipeline, single version, single go.mod.
  • When the web service stabilizes and the team grows, extract to a separate repo with a shared pkg/ module.

If monorepo, the web service lives at cmd/glyphoxa-web/ with its packages under internal/webmgmt/ to avoid polluting the gateway's internal/gateway/ namespace.


16. Decision Log

| # | Decision | Chosen | Alternatives considered | Rationale |
|---|---|---|---|---|
| D1 | Service topology | Separate service | Embedded in gateway (Option A) | Independent scaling, failure isolation, separate release cycle. Gateway stays lean for voice-critical path. Required for multi-tenant SaaS at >1000 users. |
| D2 | Backend language | Go | Node.js (Express/Fastify), Rust (Axum) | Same language as gateway: shared types, shared Vault/DB patterns, single toolchain. Luk writes Go. No cross-language serialization overhead. |
| D3 | Backend framework | stdlib net/http | Gin, Echo, Fiber, chi | Consistency with gateway. Go 1.22+ http.ServeMux has method routing. No framework churn. Middleware chains are trivial. |
| D4 | Frontend framework | React 19 + Vite | Svelte, Vue, HTMX, Go templates | Largest ecosystem, easiest hiring, Luk can find help. Voice preview + WebSocket monitoring + rich NPC editor require significant client-side JS, which rules out HTMX. |
| D5 | Component library | shadcn/ui (Radix) | MUI, Ant Design, Chakra | Copy-paste ownership (no npm lock-in), accessible (Radix primitives), Tailwind-native. |
| D6 | Multi-tenancy | Shared DB, tenant_id columns | Separate DB per tenant, schema-per-tenant | Simpler ops (1 DB), cross-tenant queries for admin, connection pool efficiency. RLS for defense-in-depth. Scale limit ~10k tenants is well beyond target. |
| D7 | Auth strategy | OAuth2 (Discord/Google) + JWT | API key only, session cookies, Clerk/Auth0 | Self-service requires real user identity. JWT is stateless (scales horizontally). Discord OAuth is natural for TTRPG audience. Third-party auth (Clerk) adds cost + vendor lock-in. |
| D8 | Token storage | Access: memory / Refresh: HttpOnly cookie | localStorage, sessionStorage | An in-memory access token is never written to Web Storage, which limits what an XSS payload can steal. An HttpOnly cookie cannot be read by JavaScript. Best security posture without a token store. |
| D9 | Billing provider | Stripe | Paddle, LemonSqueezy, custom | Industry standard, excellent webhook reliability, Stripe Billing handles subscription lifecycle. Tax compliance via Stripe Tax. |
| D10 | Secret storage | Vault Transit (encrypt at rest) | AWS KMS, DB-level encryption, env vars | Already deployed, gateway uses it for bot tokens. Consistent encryption for BYO API keys. Key rotation is transparent. |
| D11 | File storage | MinIO (S3-compatible) | Local disk, Cloudflare R2 | Self-hosted (K3s), S3 API compatible, easy migration to cloud S3 later. |
| D12 | Deployment | Single Go binary (SPA embedded) on K3s | Separate frontend deploy (Vercel/Netlify) + API | Simpler ops (1 artifact), no CORS, consistent versioning. CDN layer can sit in front. |
| D13 | API contract | OpenAPI 3.1 spec → generated TS client | GraphQL, tRPC, manual client | REST is sufficient for CRUD-heavy management UI. OpenAPI gives typed client generation, Swagger docs, and validation schemas. |
| D14 | Pricing model | Tiered subscription (Apprentice→Guild) | Usage-based, session packs, flat rate | TTRPG community expects predictable costs. Session-based caps align with how DMs think. Free tier essential for adoption. See pricing assessment. |
| D15 | Support system | Third-party integration (Freshdesk/Zendesk) | Custom built, email only | Building a ticket system is not core value. Integrate via API: display in-app, manage externally. |
| D16 | Project location | Monorepo (start), extract later | Separate repo from day 1 | Shared Go types, single CI, simpler DX. Extract when team grows or release cycles diverge. |

17. Phase Breakdown

Phase 1: Foundation (MVP)

Goal: DMs can sign up, create a campaign, configure NPCs, and see their session history.

  • OAuth2 login (Discord)
  • Tenant + campaign + NPC CRUD
  • Session list + transcript viewer (read-only from gateway DB)
  • Basic usage display
  • API key management (BYO keys stored via Vault)
  • SPA: dashboard, campaign editor, NPC editor with voice preview
  • Deploy on K3s alongside gateway

Auth: Discord OAuth2 + JWT. Single role: tenant_owner (all DMs are owners of their tenant).

Phase 2: Billing + Multi-user

Goal: Stripe subscriptions enforced, multiple users per tenant.

  • Stripe integration (checkout, portal, webhooks)
  • Tier-based quotas enforced
  • User invite flow (Discord ID → assign role)
  • Role-based access control
  • Voice sample upload (S3/MinIO)
  • Live session monitoring (WebSocket transcript stream)
  • Onboarding wizard for new DMs

Phase 3: Scale + Polish

Goal: Production-ready SaaS for >1000 users.

  • Google OAuth2 + GitHub OAuth2
  • Provider config UI (with test buttons)
  • Knowledge graph browser (D3/react-force-graph)
  • Super admin observability dashboard (Grafana embed or custom)
  • Audit log
  • Support ticket integration
  • CDN for static assets
  • PgBouncer for connection pooling
  • Horizontal autoscaling (HPA)
  • Rate limiting (Redis-backed)

18. Open Questions

  1. Monorepo vs multi-repo? This plan assumes monorepo to start. If the web service diverges significantly in release cadence, extract to its own repo with a shared pkg/ Go module.

  2. Gateway internal API authentication? The web service needs to call the gateway for session control. Options: shared secret (simple), mTLS (Vault PKI is already available), or K8s NetworkPolicy (restrict access by namespace). Recommend: mTLS for production, shared secret for dev.

  3. Session start from web UI? Currently sessions start via Discord slash commands. Should the web UI also be able to start sessions (selecting guild + channel)? This requires the gateway to expose a start-session-by-API endpoint. Recommend: yes, Phase 2.

  4. Multi-gateway support? If Glyphoxa scales to multiple gateway instances (e.g., regional), the web management service needs a gateway registry. Defer until needed; a single gateway is sufficient for >1000 users.

  5. Email notifications? For billing events (payment failed, subscription expiring), session alerts, and support ticket updates. Recommend: Resend or SES, Phase 2.

  6. NPC avatar/image upload? Adds visual identity in the UI. Can share the same S3/MinIO infrastructure as voice samples. Recommend: Phase 2 (nice-to-have).

  7. Mobile app? The responsive SPA should work well on mobile browsers. A native app is not warranted until user demand is demonstrated. The SPA can be wrapped as a PWA for app-like experience.

