Developer setup guide for building, running, and contributing to Glyphoxa.
📋 Prerequisites
Go
Glyphoxa requires Go 1.26+ with CGo enabled (CGO_ENABLED=1).
Install from go.dev/dl or via your system package manager.
System Libraries
Debian / Ubuntu
sudo apt update
sudo apt install -y build-essential cmake git \
libopus-dev pkg-config
Arch Linux
sudo pacman -S base-devel cmake git opus
macOS (Homebrew)
brew install cmake opus pkg-config
ONNX Runtime (Silero VAD)
The built-in Silero Voice Activity Detection provider requires the ONNX Runtime shared library.
- Download the latest release for your platform from the onnxruntime releases page.
- Extract and place the shared library where your linker can find it (e.g. /usr/local/lib).
- Ensure the headers are accessible (e.g. /usr/local/include/onnxruntime).
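After installing, you can sanity-check that the library and headers landed where the linker expects them. A minimal sketch; the paths assume the conventional /usr/local prefix used above and the Linux .so name (on macOS the library is a .dylib):

```shell
# Check the conventional install locations for the ONNX Runtime
# library and headers (adjust the paths/extension for your platform).
for f in /usr/local/lib/libonnxruntime.so /usr/local/include/onnxruntime; do
  if [ -e "$f" ]; then
    echo "found:   $f"
  else
    echo "missing: $f"
  fi
done
```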
libdave (Discord DAVE E2EE)
The Discord audio transport requires the libdave shared library for Discord's Audio/Video End-to-End Encryption (DAVE) protocol.
make dave-libs
This downloads (or builds from source) the libdave shared library and installs it to ~/.local/lib. After the build completes, set the environment variables:
export PKG_CONFIG_PATH="$HOME/.local/lib/pkgconfig:$PKG_CONFIG_PATH"
export LD_LIBRARY_PATH="$HOME/.local/lib:$LD_LIBRARY_PATH"
export CGO_ENABLED=1
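With the exports in place, you can confirm the toolchain now resolves the library. A quick check; the pkg-config package name "dave" is an assumption here, so list the .pc files under ~/.local/lib/pkgconfig for the actual name:

```shell
# Report whether pkg-config can resolve libdave after the exports above.
# Package name "dave" is assumed; check ~/.local/lib/pkgconfig to confirm.
if pkg-config --exists dave 2>/dev/null; then
  echo "libdave: found ($(pkg-config --modversion dave))"
else
  echo "libdave: not found (check PKG_CONFIG_PATH)"
fi
```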
PostgreSQL with pgvector
The memory subsystem requires PostgreSQL with the pgvector extension.
# Debian/Ubuntu
sudo apt install -y postgresql postgresql-server-dev-all
# Then install pgvector from source – see https://github.com/pgvector/pgvector#installation
# Arch
sudo pacman -S postgresql
yay -S pgvector # or build from source
# macOS
brew install postgresql@17 pgvector
Alternatively, use the Docker Compose setup (see below) which includes a pre-configured pgvector/pgvector:pg17 image.
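Whichever install route you choose, the extension must also be enabled inside the database Glyphoxa connects to. A sketch of the one-time setup; the database name is whatever you configured, and this runs as a superuser or the database owner:

```sql
-- Enable pgvector in the target database (one-time, per database).
CREATE EXTENSION IF NOT EXISTS vector;

-- Confirm it is active:
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';
```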
📥 Clone and Build
git clone https://github.com/MrWong99/glyphoxa.git
cd glyphoxa
Build the binary:
make build
This compiles the server to ./bin/glyphoxa. Verify it built successfully:
./bin/glyphoxa --help
🔧 whisper.cpp Native Build
If you want to use the whisper-native STT provider (local speech-to-text via CGo instead of an HTTP server), you need to build the whisper.cpp static library first.
make whisper-libs
This clones whisper.cpp into /tmp/whisper-src, builds it, and installs headers and static libraries to /tmp/whisper-install.
After the build completes, set the environment variables before running other Make targets:
export C_INCLUDE_PATH=/tmp/whisper-install/include
export LIBRARY_PATH=/tmp/whisper-install/lib
export CGO_ENABLED=1
Then rebuild Glyphoxa so the whisper-native provider is linked:
make build
You will also need a GGML model file. Download one from the Hugging Face whisper.cpp models:
# Example: download the base English model
curl -L -o ggml-base.en.bin \
https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin
Then reference the model path in your config under providers.stt:
stt:
name: whisper-native
model: /path/to/ggml-base.en.bin
options:
language: en
⚙️ Minimal Configuration
Copy the example config and edit it:
cp configs/example.yaml config.yaml
For a first run, you need at minimum:
- A Discord bot token (from discord.com/developers/applications)
- At least one voice engine path configured – either the cascaded pipeline (STT + LLM + TTS) or a speech-to-speech provider
Here is a minimal config.yaml using OpenAI for the cascaded pipeline and ElevenLabs for TTS:
server:
listen_addr: ":8080"
log_level: info
providers:
audio:
name: discord
api_key: "Bot YOUR_BOT_TOKEN_HERE"
options:
guild_id: "YOUR_GUILD_ID"
llm:
name: openai
api_key: sk-...
model: gpt-4o
stt:
name: deepgram
api_key: dg-...
model: nova-2
options:
language: en-US
tts:
name: elevenlabs
api_key: el-...
model: eleven_multilingual_v2
vad:
name: silero
npcs:
- name: Greymantle the Sage
personality: |
You are Greymantle, an ancient wizard. You speak in measured,
slightly archaic sentences and are helpful but mysterious.
engine: cascaded
For a fully local setup (no API keys), use the Docker Compose local profile instead – see Running with Docker Compose.
See configs/example.yaml for the complete configuration reference including memory, embeddings, MCP tool servers, and multi-NPC setups.
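For instance, if you later enable the memory subsystem, the embeddings provider plugs into the same providers block as the others. The keys below are illustrative only, extrapolated from the provider blocks above – treat configs/example.yaml as authoritative:

```yaml
providers:
  # Illustrative sketch: key names follow the pattern of the other
  # provider blocks; verify against configs/example.yaml.
  embeddings:
    name: openai
    api_key: sk-...
    model: text-embedding-3-small
```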
▶️ Running Glyphoxa
Start the server:
./bin/glyphoxa -config config.yaml
On successful startup you will see the startup summary followed by a ready message:
╔══════════════════════════════════════╗
║ Glyphoxa – startup summary           ║
╠══════════════════════════════════════╣
║ LLM             : openai / gpt-4o    ║
║ STT             : deepgram / nova-2  ║
║ TTS             : elevenlabs / el…   ║
║ S2S             : (not configured)   ║
║ Embeddings      : (not configured)   ║
║ VAD             : silero             ║
║ Audio           : (not configured)   ║
║ Discord         : connected          ║
║ NPCs configured : 1                  ║
║ MCP servers     : 0                  ║
║ Listen addr     : :8080              ║
╚══════════════════════════════════════╝
time=... level=INFO msg="server ready – press Ctrl+C to shut down"
Press Ctrl+C to initiate graceful shutdown (15-second timeout).
If the config file is not found, Glyphoxa exits with:
glyphoxa: config file "config.yaml" not found – copy configs/example.yaml to get started
🐳 Running with Docker Compose
The deployments/compose/ directory contains a full Docker Compose setup with two modes:
Cloud API providers (you supply API keys):
cd deployments/compose
cp config.yaml.example config.yaml
# Edit config.yaml with your API keys
docker compose up -d
Fully local stack (no API keys needed – uses Ollama, whisper.cpp, Coqui TTS):
cd deployments/compose
cp config.local.yaml config.yaml
docker compose --profile local up -d
The local profile starts PostgreSQL with pgvector, Ollama (llama3.2 + nomic-embed-text), whisper.cpp, and Coqui TTS automatically.
For GPU acceleration, service configuration, model selection, and troubleshooting, see the full guide at deployments/compose/README.md.
🛠️ Development Workflow
Tests
Run the full test suite with the race detector:
make test
Run tests with verbose output:
make test-v
Generate a coverage report:
make test-cover
Linting
Requires golangci-lint:
make lint
Pre-commit Check
Run formatting, vetting, and tests in one command:
make check
This runs make fmt, make vet, and make test sequentially. Run this before pushing.
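If you want make check to run on every commit automatically, you can wire it into a local git hook. This is a personal convenience, not a project requirement; a sketch:

```shell
# Install a local pre-commit hook that runs `make check`.
# .git/hooks is not version-controlled, so this only affects your clone.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
exec make check
EOF
chmod +x .git/hooks/pre-commit
echo "pre-commit hook installed"
```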
Branch Naming
Follow the project conventions:
- feat/short-description – new features
- fix/short-description – bug fixes
- docs/short-description – documentation only
- refactor/short-description – code cleanup
✅ Verifying the Setup
Health Endpoints
Once Glyphoxa is running, check the health endpoints:
# Liveness probe – always returns 200 if the process is running
curl http://localhost:8080/healthz
{"status":"ok"}
# Readiness probe – returns 200 only when all dependencies are healthy
curl http://localhost:8080/readyz
{"status":"ok","checks":{"database":"ok","providers":"ok"}}
If any check fails, /readyz returns HTTP 503 with the failing check details:
{"status":"fail","checks":{"database":"fail: connection refused","providers":"ok"}}
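In scripts (CI jobs, container healthchecks, demo launchers) it is handy to block until /readyz goes green rather than watch the log. A small polling sketch; the URL, retry count, and interval are illustrative:

```shell
# Poll the readiness endpoint until it returns HTTP 200 or the retry
# budget is exhausted; prints ready=yes or ready=no.
ready=no
for _ in $(seq 1 10); do
  if curl -fsS http://localhost:8080/readyz >/dev/null 2>&1; then
    ready=yes
    break
  fi
  sleep 1
done
echo "ready=$ready"
```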
First NPC Interaction
- Invite your Discord bot to a server with the guild ID from your config.
- Join a voice channel in that server.
- Use the botβs slash commands to start a session and summon an NPC into the voice channel.
- Speak to the NPC – you should hear a voiced response within ~2 seconds.
If you configured a dm_role_id, ensure your Discord user has that role to access DM commands (/session, /npc, /entity, /campaign). Leave dm_role_id empty during development to allow all users.
🚀 Next Steps
The default --mode=full is the right choice for self-hosted, single-tenant deployments. For multi-tenant SaaS deployments on Kubernetes, see:
- Deployment: Kubernetes / Helm – gateway + worker setup
- Multi-Tenant Architecture – admin API, tenant model, usage tracking
🔗 See Also
- Architecture – system layers and data flow
- Configuration – full configuration reference
- Deployment – production deployment guide
- Testing – testing strategy and conventions
- Contributing – code style, workflow, and PR guidelines