Developer setup guide for building, running, and contributing to Glyphoxa.


📋 Prerequisites

Go

Glyphoxa requires Go 1.26+ with CGo enabled (CGO_ENABLED=1).

Install from go.dev/dl or via your system package manager.
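Both requirements can be checked before building. The following is a minimal sketch, not part of the Glyphoxa Makefile: the helper name go_version_ok is hypothetical, and it relies only on go env GOVERSION (available since Go 1.16) and go env CGO_ENABLED.

```shell
# go_version_ok VERSION succeeds when VERSION (e.g. "go1.26.0") is Go 1.26 or newer.
# The function name is illustrative; it is not provided by Glyphoxa.
go_version_ok() {
  v=${1#go}                 # strip the leading "go"
  major=${v%%.*}            # text before the first dot
  rest=${v#*.}              # text after the first dot
  minor=${rest%%[!0-9]*}    # leading digits only ("26.1" and "26rc1" both give "26")
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "${minor:-0}" -ge 26 ]; }
}

# Usage, once Go is installed:
#   go_version_ok "$(go env GOVERSION)" || echo "Go 1.26+ is required"
#   [ "$(go env CGO_ENABLED)" = 1 ]     || echo "cgo is disabled; export CGO_ENABLED=1"
```

Note that cgo also needs a working C compiler on the PATH, which the system packages in the next section provide.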

System Libraries

Debian / Ubuntu

sudo apt update
sudo apt install -y build-essential cmake git \
  libopus-dev pkg-config

Arch Linux

sudo pacman -S base-devel cmake git opus

macOS (Homebrew)

brew install cmake opus pkg-config

ONNX Runtime (Silero VAD)

The built-in Silero Voice Activity Detection provider requires the ONNX Runtime shared library.

  1. Download the latest release for your platform from the ONNX Runtime releases page on GitHub.
  2. Extract and place the shared library where your linker can find it (e.g. /usr/local/lib).
  3. Ensure the headers are accessible (e.g. /usr/local/include/onnxruntime).
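To confirm that step 2 worked, it can help to look for the library file directly. This is a small sketch under stated assumptions: the helper find_lib is hypothetical, the directories passed to it are examples, and it only checks that a file exists, not that the dynamic linker is configured to search that directory.

```shell
# find_lib NAME DIR... prints the first shared library for NAME found in the
# given directories (libNAME.so* on Linux, libNAME*.dylib on macOS).
# Returns non-zero when nothing matches. Illustrative helper, not part of Glyphoxa.
find_lib() {
  name=$1
  shift
  for dir in "$@"; do
    for f in "$dir"/lib"$name".so* "$dir"/lib"$name"*.dylib; do
      # An unmatched glob stays literal, so test that the path really exists.
      if [ -e "$f" ]; then
        printf '%s\n' "$f"
        return 0
      fi
    done
  done
  return 1
}

# Usage:
#   find_lib onnxruntime /usr/local/lib /usr/lib || echo "ONNX Runtime not found"
```

On Linux, running sudo ldconfig after copying the library into /usr/local/lib refreshes the linker cache.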

libdave (Discord DAVE E2EE)

The Discord audio transport requires the libdave shared library for Discord’s Audio/Video End-to-End Encryption (DAVE) protocol.

make dave-libs

This downloads (or builds from source) the libdave shared library and installs it to ~/.local/lib. After the build completes, set the environment variables:

export PKG_CONFIG_PATH="$HOME/.local/lib/pkgconfig:$PKG_CONFIG_PATH"
export LD_LIBRARY_PATH="$HOME/.local/lib:$LD_LIBRARY_PATH"
export CGO_ENABLED=1

PostgreSQL with pgvector

The memory subsystem requires PostgreSQL with the pgvector extension.

# Debian/Ubuntu
sudo apt install -y postgresql postgresql-server-dev-all
# Then install pgvector from source; see https://github.com/pgvector/pgvector#installation

# Arch
sudo pacman -S postgresql
yay -S pgvector  # or build from source

# macOS
brew install postgresql@17 pgvector

Alternatively, use the Docker Compose setup (see below) which includes a pre-configured pgvector/pgvector:pg17 image.
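Installing the pgvector package only makes the extension available; it still has to be created once in each database that uses it. The sketch below assumes a database named glyphoxa (a hypothetical name; use whatever your config points at) and a hypothetical wrapper function enable_pgvector. Whether Glyphoxa creates the extension itself on startup is not stated here, so IF NOT EXISTS keeps the command harmless either way.

```shell
# enable_pgvector DBNAME creates the pgvector extension ("vector") in DBNAME.
# Set PSQL=echo to preview the arguments instead of running psql.
# The function and the PSQL override are illustrative, not part of Glyphoxa.
enable_pgvector() {
  "${PSQL:-psql}" -d "$1" -c 'CREATE EXTENSION IF NOT EXISTS vector;'
}

# Usage (the role running psql needs permission to create extensions):
#   enable_pgvector glyphoxa
```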


📥 Clone and Build

git clone https://github.com/MrWong99/glyphoxa.git
cd glyphoxa

Build the binary:

make build

This compiles the server to ./bin/glyphoxa. Verify it built successfully:

./bin/glyphoxa --help

🔧 whisper.cpp Native Build

If you want to use the whisper-native STT provider (local speech-to-text via CGo instead of an HTTP server), you need to build the whisper.cpp static library first.

make whisper-libs

This clones whisper.cpp into /tmp/whisper-src, builds it, and installs headers and static libraries to /tmp/whisper-install.

After the build completes, set the environment variables before running other Make targets:

export C_INCLUDE_PATH=/tmp/whisper-install/include
export LIBRARY_PATH=/tmp/whisper-install/lib
export CGO_ENABLED=1

Then rebuild Glyphoxa so the whisper-native provider is linked:

make build

You will also need a GGML model file. Download one from the whisper.cpp model repository on Hugging Face:

# Example: download the base English model
curl -L -o ggml-base.en.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin

Then reference the model path in your config under providers.stt:

stt:
  name: whisper-native
  model: /path/to/ggml-base.en.bin
  options:
    language: en

βš™οΈ Minimal Configuration

Copy the example config and edit it:

cp configs/example.yaml config.yaml

For a first run, you need at minimum:

  1. A Discord bot token (from discord.com/developers/applications)
  2. At least one voice engine path configured: either the cascaded pipeline (STT + LLM + TTS) or a speech-to-speech provider

Here is a minimal config.yaml for the cascaded pipeline, using Deepgram for STT, OpenAI for the LLM, and ElevenLabs for TTS:

server:
  listen_addr: ":8080"
  log_level: info

providers:
  audio:
    name: discord
    api_key: "Bot YOUR_BOT_TOKEN_HERE"
    options:
      guild_id: "YOUR_GUILD_ID"

  llm:
    name: openai
    api_key: sk-...
    model: gpt-4o

  stt:
    name: deepgram
    api_key: dg-...
    model: nova-2
    options:
      language: en-US

  tts:
    name: elevenlabs
    api_key: el-...
    model: eleven_multilingual_v2

  vad:
    name: silero

npcs:
  - name: Greymantle the Sage
    personality: |
      You are Greymantle, an ancient wizard. You speak in measured,
      slightly archaic sentences and are helpful but mysterious.
    engine: cascaded

For a fully local setup (no API keys), use the Docker Compose local profile instead – see Running with Docker Compose.

See configs/example.yaml for the complete configuration reference including memory, embeddings, MCP tool servers, and multi-NPC setups.


▢️ Running Glyphoxa

Start the server:

./bin/glyphoxa -config config.yaml

On successful startup you will see the startup summary followed by a ready message:

╔═══════════════════════════════════════╗
║         Glyphoxa — startup summary    ║
╠═══════════════════════════════════════╣
║  LLM              : openai / gpt-4o   ║
║  STT              : deepgram / nova-2  ║
║  TTS              : elevenlabs / el…   ║
║  S2S              : (not configured)   ║
║  Embeddings       : (not configured)   ║
║  VAD              : silero             ║
║  Audio            : (not configured)   ║
║  Discord          : connected          ║
║  NPCs configured  : 1                  ║
║  MCP servers      : 0                  ║
║  Listen addr      : :8080              ║
╚═══════════════════════════════════════╝
time=... level=INFO msg="server ready — press Ctrl+C to shut down"

Press Ctrl+C to initiate graceful shutdown (15-second timeout).

If the config file is not found, Glyphoxa exits with:

glyphoxa: config file "config.yaml" not found — copy configs/example.yaml to get started

🐳 Running with Docker Compose

The deployments/compose/ directory contains a full Docker Compose setup with two modes:

Cloud API providers (you supply API keys):

cd deployments/compose
cp config.yaml.example config.yaml
# Edit config.yaml with your API keys
docker compose up -d

Fully local stack (no API keys needed; uses Ollama, whisper.cpp, and Coqui TTS):

cd deployments/compose
cp config.local.yaml config.yaml
docker compose --profile local up -d

The local profile automatically starts PostgreSQL with pgvector, Ollama (llama3.2 + nomic-embed-text), whisper.cpp, and Coqui TTS.

For GPU acceleration, service configuration, model selection, and troubleshooting, see the full guide at deployments/compose/README.md.


πŸ› οΈ Development Workflow

Tests

Run the full test suite with the race detector:

make test

Run tests with verbose output:

make test-v

Generate a coverage report:

make test-cover

Linting

Requires golangci-lint:

make lint

Pre-commit Check

Run formatting, vetting, and tests in one command:

make check

This runs make fmt, make vet, and make test sequentially. Run this before pushing.

Branch Naming

Follow the project conventions:

  • feat/short-description – new features
  • fix/short-description – bug fixes
  • docs/short-description – documentation only
  • refactor/short-description – code cleanup
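
The prefixes above can be checked mechanically, for example from a local Git hook. This is a minimal sketch: the function name valid_branch is hypothetical, and the project may enforce naming differently or not at all.

```shell
# valid_branch NAME succeeds when NAME starts with one of the documented
# prefixes and has a non-empty description after the slash.
# Illustrative helper, not part of the Glyphoxa repository.
valid_branch() {
  case $1 in
    feat/?*|fix/?*|docs/?*|refactor/?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage, from inside the repository:
#   valid_branch "$(git rev-parse --abbrev-ref HEAD)" \
#     || echo "branch name does not follow the project conventions"
```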

✅ Verifying the Setup

Health Endpoints

Once Glyphoxa is running, check the health endpoints:

# Liveness probe: always returns 200 if the process is running
curl http://localhost:8080/healthz
# {"status":"ok"}

# Readiness probe: returns 200 only when all dependencies are healthy
curl http://localhost:8080/readyz
# {"status":"ok","checks":{"database":"ok","providers":"ok"}}

If any check fails, /readyz returns HTTP 503 with the failing check details:

{"status":"fail","checks":{"database":"fail: connection refused","providers":"ok"}}
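
Because /readyz answers 503 until every dependency is healthy, scripts that start Glyphoxa and then talk to it usually need to wait for a 200 first. The following is a minimal sketch; the function name wait_ready and the default 30-second limit are assumptions, and the timeout counts one-second polling intervals rather than exact wall-clock time.

```shell
# wait_ready URL [TIMEOUT_SECONDS] polls URL about once per second until it
# returns HTTP 200, or gives up after roughly TIMEOUT_SECONDS (default 30).
# Illustrative helper, not part of Glyphoxa.
wait_ready() {
  url=$1
  timeout=${2:-30}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # -s hides progress, -o discards the body, -w prints only the status code.
    # A refused connection yields a non-200 code and is treated as "not ready".
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 2 "$url" 2>/dev/null) || code=000
    if [ "$code" = "200" ]; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Usage:
#   wait_ready http://localhost:8080/readyz 60 || echo "Glyphoxa did not become ready"
```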

First NPC Interaction

  1. Invite your Discord bot to a server with the guild ID from your config.
  2. Join a voice channel in that server.
  3. Use the bot’s slash commands to start a session and summon an NPC into the voice channel.
  4. Speak to the NPC – you should hear a voiced response within ~2 seconds.

If you configured a dm_role_id, ensure your Discord user has that role to access DM commands (/session, /npc, /entity, /campaign). Leave dm_role_id empty during development to allow all users.


🔜 Next Steps

The default --mode=full is the right choice for self-hosted, single-tenant deployments. Multi-tenant SaaS deployments on Kubernetes are covered separately.
