Common issues and their fixes for the Glyphoxa voice AI framework.
🛠️ Build Issues
libopus not found
Symptom
# layeh.com/gopus
cgo: C compiler cannot find -lopus
Cause – The gopus Opus bindings (used for Discord voice encoding/decoding) require the libopus C library and its development headers. They are not bundled with the Go module.
Fix
| Platform | Command |
|---|---|
| Debian / Ubuntu | sudo apt-get install libopus-dev |
| Fedora / RHEL | sudo dnf install opus-devel |
| Arch Linux | sudo pacman -S opus |
| macOS (Homebrew) | brew install opus |
| Alpine (Docker) | apk add opus-dev |
After installing, re-run:
make build
ONNX Runtime not found (Silero VAD)
Symptom
error while loading shared libraries: libonnxruntime.so: cannot open shared object file
or at build time:
ld: library not found for -lonnxruntime
Cause – The Silero VAD engine depends on ONNX Runtime for model inference. The dynamic linker cannot find the shared library.
Fix
- Download ONNX Runtime from the official releases for your platform.
- Extract it to a known path (e.g., /opt/onnxruntime/).
- Set the library path environment variable:
# Linux
export LD_LIBRARY_PATH=/opt/onnxruntime/lib:$LD_LIBRARY_PATH
# macOS
export DYLD_LIBRARY_PATH=/opt/onnxruntime/lib:$DYLD_LIBRARY_PATH
- Optionally, make it persistent by adding the export to your shell profile (~/.bashrc, ~/.zshrc).
whisper.cpp build failures
Symptom
whisper: load model "...": ...
or during build:
ld: library not found for -lwhisper
fatal error: whisper.h: No such file or directory
Cause – The native whisper.cpp STT provider (whisper-native) uses CGO bindings that require the libwhisper.a static library and whisper.h header at link time. These must be built from source.
Fix
Use the provided Makefile target:
make whisper-libs
This clones whisper.cpp, builds the static library, and installs headers + .a files to /tmp/whisper-install/. Then export the required paths:
export C_INCLUDE_PATH=/tmp/whisper-install/include
export LIBRARY_PATH=/tmp/whisper-install/lib
export CGO_ENABLED=1
Re-run the build:
make build
Tip: If you only use the HTTP-based whisper provider (whisper instead of whisper-native), you do not need the CGO bindings. You can run a standalone whisper-server process and point the provider at it via base_url.
CGo compilation errors
Symptom
cgo: C compiler "gcc" not found: exec: "gcc": executable file not found in $PATH
or
CGO_ENABLED=0 but package requires CGO
Cause – Glyphoxa requires CGO_ENABLED=1 because the Opus codec, Silero VAD, and whisper-native providers all use C bindings. A C compiler (gcc or clang) must be available.
Fix
- Install a C toolchain:
| Platform | Command |
|---|---|
| Debian / Ubuntu | sudo apt-get install build-essential |
| Fedora / RHEL | sudo dnf install gcc |
| Arch Linux | sudo pacman -S base-devel |
| macOS | xcode-select --install |
| Alpine (Docker) | apk add gcc musl-dev |
- Ensure CGo is enabled:
export CGO_ENABLED=1
- Re-run make build.
🔌 Provider Issues
Deepgram: WebSocket disconnects
Symptom – STT transcription stops mid-session. Logs show:
deepgram: dial: ...connection refused
or the read loop exits silently and the Finals() / Partials() channels close unexpectedly.
Cause – The Deepgram WebSocket connection at wss://api.deepgram.com/v1/listen was dropped due to a network interruption, idle timeout, or Deepgram-side rate limit.
Fix
- Check your network connectivity to api.deepgram.com.
- Ensure your Deepgram API key is valid and has sufficient quota:
providers:
  stt:
    name: deepgram
    api_key: "YOUR_DEEPGRAM_API_KEY"
- If using the STT fallback group, the circuit breaker will automatically route to the next healthy provider after 5 consecutive failures (default MaxFailures). Check logs for:
circuit breaker opened name=deepgram
provider failed, trying next provider=deepgram
- If disconnects are frequent, consider adding a local whisper fallback:
providers:
  stt:
    name: deepgram
    api_key: "..."
    options:
      fallback: whisper
Deepgram: authentication failures
Symptom
deepgram: dial: websocket: bad handshake (HTTP 401)
Cause – The API key sent in the Authorization: Token ... header is invalid, expired, or missing.
Fix
- Verify the key in your config file under providers.stt.api_key.
- Confirm the key is active in the Deepgram console.
- Ensure the key has no trailing whitespace or newline characters.
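The whitespace check in the last step can be sketched in Go. This is a minimal illustration, not Glyphoxa code: the function name cleanAPIKey and the sample key are hypothetical, and it assumes the key arrives as a plain string (for example, read from a file or environment variable).

```go
package main

import (
	"fmt"
	"strings"
)

// cleanAPIKey (hypothetical helper) trims surrounding whitespace, including
// tabs and newlines, and reports whether anything was removed.
func cleanAPIKey(raw string) (key string, changed bool) {
	key = strings.TrimSpace(raw)
	return key, key != raw
}

func main() {
	// Sample value with a trailing newline, as often left by `echo key > file`.
	key, changed := cleanAPIKey("abc123\n")
	fmt.Printf("key=%q trimmed=%v\n", key, changed)
}
```

If the helper reports that the key changed, the stored value contained stray whitespace and is worth correcting at the source.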
ElevenLabs: rate limiting
Symptom – TTS synthesis fails intermittently. The ListVoices call returns:
elevenlabs: list voices: unexpected status 429
or the WebSocket dial fails with a 429-like rejection.
Cause – Your ElevenLabs plan’s concurrent request or character limit has been exceeded.
Fix
- Check your ElevenLabs plan quotas at elevenlabs.io/app/subscription.
- Reduce the number of NPCs speaking simultaneously.
- Consider using a local TTS fallback (e.g., Coqui TTS) for lower-priority NPCs.
- The TTS circuit breaker will automatically trip after 5 consecutive failures and retry after 30 seconds. Monitor with:
circuit breaker opened name=elevenlabs
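To make the breaker behavior concrete, here is a minimal sketch of a consecutive-failure circuit breaker in Go. It is not Glyphoxa's implementation: the type, field, and function names are hypothetical, and only the two numbers (5 failures, 30-second retry) come from the text above.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// breaker is an illustrative consecutive-failure circuit breaker.
// All names here are hypothetical, not Glyphoxa's API.
type breaker struct {
	maxFailures  int           // failures in a row before opening
	resetTimeout time.Duration // wait before a probe call is allowed
	failures     int
	open         bool
	openedAt     time.Time
}

// allow reports whether a call may proceed at time now.
func (b *breaker) allow(now time.Time) bool {
	if !b.open {
		return true
	}
	// Once the timeout has elapsed, let a probe through (half-open).
	return now.Sub(b.openedAt) >= b.resetTimeout
}

// record feeds the outcome of a call made at time now into the breaker.
func (b *breaker) record(now time.Time, err error) {
	if err == nil {
		b.failures, b.open = 0, false // success closes the breaker
		return
	}
	b.failures++
	if b.failures >= b.maxFailures {
		b.open, b.openedAt = true, now // a failed probe re-opens it
	}
}

// simulate feeds n consecutive failures into a fresh breaker and reports
// whether a call would be allowed waitSeconds later.
func simulate(n, waitSeconds int) bool {
	b := &breaker{maxFailures: 5, resetTimeout: 30 * time.Second}
	start := time.Unix(0, 0)
	for i := 0; i < n; i++ {
		b.record(start, errors.New("unexpected status 429"))
	}
	return b.allow(start.Add(time.Duration(waitSeconds) * time.Second))
}

func main() {
	fmt.Println(simulate(4, 0))  // true: still closed
	fmt.Println(simulate(5, 0))  // false: open, calls are rejected
	fmt.Println(simulate(5, 30)) // true: timeout elapsed, a probe is allowed
}
```

While the breaker is open, sustained 429 responses do not reach the provider at all, so a burst of rate-limit errors followed by about 30 seconds of silence in the logs is consistent with this pattern.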
ElevenLabs: voice ID not found
Symptom
elevenlabs: voice.ID must not be empty
or the WebSocket connection opens but returns no audio (the voice ID path segment in the URL is invalid).
Cause – The voice_id in your NPC’s voice config is empty, incorrect, or refers to a voice not available on your ElevenLabs account.
Fix
- List available voices by calling the API or via the ElevenLabs dashboard:
curl -H "xi-api-key: YOUR_KEY" https://api.elevenlabs.io/v1/voices | jq '.voices[].voice_id'
- Update your config:
```yaml
npcs:
  - name: "Greymantle the Sage"
    voice:
      provider: elevenlabs
      voice_id: "pNInz6obpgDQGcFmaJgB" # a valid voice ID
```
Ollama: model not loaded / connection refused
Symptom
anyllm: completion: Post "http://localhost:11434/...": dial tcp 127.0.0.1:11434: connect: connection refused
or
anyllm: completion: model "llama3" not found
Cause – The Ollama daemon is not running, or the requested model has not been pulled.
Fix
- Start the Ollama daemon:
ollama serve
- Pull the model:
ollama pull llama3
- If Ollama runs on a non-default address, set base_url in your config:
providers:
  llm:
    name: ollama
    model: "llama3"
    base_url: "http://my-ollama-host:11434"
whisper.cpp: library not found at runtime
Symptom
whisper: load model "/path/to/ggml-base.bin": dlopen: libwhisper.so: cannot open shared object file
Cause – The whisper shared library is not on the dynamic linker’s search path at runtime. This affects the whisper-native provider only.
Fix
Set the library path before starting Glyphoxa:
export LD_LIBRARY_PATH=/tmp/whisper-install/lib:$LD_LIBRARY_PATH # Linux
export DYLD_LIBRARY_PATH=/tmp/whisper-install/lib:$DYLD_LIBRARY_PATH # macOS
If using the HTTP-based whisper provider instead, ensure your whisper-server process is running:
# Start whisper-server separately
./whisper-server --model /path/to/ggml-base.bin --port 8080
Then configure the provider:
providers:
  stt:
    name: whisper
    base_url: "http://localhost:8080"
OpenAI / Anthropic: API key issues
Symptom
anyllm: create "openai" backend: ...
anyllm: completion: 401 Unauthorized
Cause – The API key is missing, invalid, or not provided via the config file or environment variable.
Fix
Provide the key in config:
providers:
  llm:
    name: openai
    api_key: "sk-..."
    model: "gpt-4o"
Or via the corresponding environment variable:
| Provider | Environment Variable |
|---|---|
| OpenAI | OPENAI_API_KEY |
| Anthropic | ANTHROPIC_API_KEY |
| Gemini | GEMINI_API_KEY or GOOGLE_API_KEY |
| DeepSeek | DEEPSEEK_API_KEY |
| Mistral | MISTRAL_API_KEY |
| Groq | GROQ_API_KEY |
OpenAI / Anthropic: model not available
Symptom
anyllm: completion: 404 model "gpt-5-turbo" not found
Cause – The model name in your config does not match an available model on the provider’s API.
Fix
Double-check the model name against the provider’s documentation. Common valid values:
| Provider | Example models |
|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo, o1, o3-mini |
| Anthropic | claude-3-5-sonnet-latest, claude-3-5-haiku-latest, claude-3-opus-20240229 |
| Gemini | gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash |
| Ollama | Any locally pulled model (run ollama list) |
⚙️ Runtime Issues
NPC not responding to speech
Symptom – You speak in the Discord voice channel but the NPC never responds.
Cause – This can stem from several pipeline stages. Work through them in order:
| Stage | Diagnostic | Likely cause |
|---|---|---|
| Audio input | Check that the bot is receiving Opus packets (log discord: opus decode error would indicate packets are arriving) | Bot is deafened, or not in the voice channel |
| VAD | Check for VADSpeechStart / VADSpeechEnd events in debug logs | VAD thresholds too high, wrong sample rate, or ONNX Runtime missing |
| STT | Check for deepgram: dial: or whisper: http request: errors | STT provider misconfigured or unreachable |
| Address detection | Enable debug logging; look for NPC name matching | Player did not address the NPC by name or the NPC name is not in the STT keyword list |
| LLM | Check for anyllm: completion: errors | LLM provider unreachable, API key invalid, or model not available |
| TTS | Check for elevenlabs: dial: errors | TTS provider unreachable or voice ID invalid |
Fix
- Set log_level: debug in your config to see the full pipeline trace.
- Check the /readyz endpoint to verify all dependencies are healthy.
- Verify the NPC is not muted (check the session dashboard or use the /npc slash command).
High latency
Symptom – The NPC responds but with a noticeable delay (> 2 seconds).
Cause – One or more pipeline stages are slow. The cascaded pipeline runs STT -> LLM -> TTS sequentially, so latencies compound.
Fix
- Check the session dashboard embed in Discord – it shows per-stage p50/p95 latencies:
STT: p50=120.0ms p95=340.0ms
LLM: p50=800.0ms p95=1500.0ms
TTS: p50=200.0ms p95=450.0ms
- Identify the bottleneck stage and address it:
| Stage | Fix |
|---|---|
| STT | Use Deepgram (streaming, low latency) over whisper.cpp (batch). Ensure the model is set to nova-3 for fastest results. |
| LLM | Use a faster model (gpt-4o-mini, claude-3-5-haiku-latest, or a local Ollama model). Consider the sentence_cascade engine for perceived latency reduction. |
| TTS | Use eleven_flash_v2_5 (the default ElevenLabs model). Reduce speed_factor if set above 1.0. |
- Check Prometheus metrics at /metrics for glyphoxa_pipeline_* histograms if enabled.
- For the sentence_cascade engine, ensure cascade_mode is set to auto or always:
```yaml
npcs:
  - name: "Greymantle"
    engine: sentence_cascade
    cascade_mode: auto
    cascade:
      fast_model: "gpt-4o-mini"
      strong_model: "gpt-4o"
```
Memory not populating
Symptom – The session dashboard shows Memory Entries: 0 even after extended conversation. NPC responses lack context from previous sessions.
Cause – The PostgreSQL memory store is not connected, the pgvector extension is not installed, or embedding_dimensions is misconfigured.
Fix
- Verify memory.postgres_dsn is set in your config:
memory:
  postgres_dsn: "postgres://user:pass@localhost:5432/glyphoxa?sslmode=disable"
  embedding_dimensions: 1536
- Check that PostgreSQL is reachable:
psql "postgres://user:pass@localhost:5432/glyphoxa" -c "SELECT 1;"
- Check that the pgvector extension is available:
psql "postgres://user:pass@localhost:5432/glyphoxa" -c "CREATE EXTENSION IF NOT EXISTS vector;"
If this fails, install pgvector for your PostgreSQL version. See the pgvector installation guide.
- Ensure embedding_dimensions matches your embedding model:
| Embedding provider/model | Dimensions |
|---|---|
| OpenAI text-embedding-3-small | 1536 |
| OpenAI text-embedding-3-large | 3072 |
| Ollama nomic-embed-text | 768 |
- Check for migration errors in the startup logs:
postgres store: migrate: ...
postgres store: ping: ...
Transcript correction not working
Symptom – NPC and place names are consistently misspelled in transcripts (e.g., “Eldrinax” becomes “Elder Next”).
Cause – STT keyword boosting is not configured, or the provider does not support mid-stream keyword updates.
Fix
- For Deepgram, keywords are passed as URL query parameters at stream start. Ensure your NPC and entity names are included in the STT StreamConfig.Keywords via the orchestrator.
- Neither Deepgram nor whisper.cpp supports mid-session keyword updates. Keywords set after stream start will return:
deepgram: mid-session keyword updates are not supported
whisper: keyword boosting is not supported by whisper.cpp
Keywords must be provided before starting the STT stream.
- Consider adding campaign entity names to the keyword list via the /entity slash command so they are available at session start.
💬 Discord Issues
Bot not connecting
Symptom
discord: create session: ...
discord: open session: ...
Cause – The Discord bot token is invalid, or the bot lacks required gateway intents.
Fix
- Verify providers.audio.api_key in your config file. The token should not include the Bot prefix (Glyphoxa adds it automatically).
providers:
  audio:
    name: discord
    api_key: "MTIz..." # just the token, no "Bot " prefix
    options:
      guild_id: "123456789"
- In the Discord Developer Portal, ensure these Privileged Gateway Intents are enabled for your bot:
  - Server Members Intent (for role-based DM permission checks)
  - Message Content Intent (if using text-based commands)
- Verify the required intents match the code:
IntentsGuildMessages | IntentsGuildVoiceStates | IntentsGuilds
Slash commands not appearing
Symptom – After starting the bot, the /session, /npc, /entity, and other commands do not appear in Discord.
Cause – Guild-scoped command registration normally completes within seconds. If commands never appear, the guild_id may be wrong or the bot lacks the applications.commands scope.
Fix
- Double-check providers.audio.options.guild_id matches your target server. You can copy the guild ID by right-clicking the server name in Discord (Developer Mode must be enabled).
- Verify the bot was invited with the applications.commands OAuth2 scope. The invite URL should include:
&scope=bot+applications.commands
- Check the logs for registration errors:
discord: register commands: ...
A successful registration logs:
discord commands registered count=N
- Guild-scoped commands should appear within a few seconds; global commands can take up to an hour. Glyphoxa uses guild-scoped registration by default.
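As an illustration of the scope requirement, the following Go sketch assembles an invite URL with both scopes. It assumes Discord's standard OAuth2 authorize endpoint; the function name inviteURL and the client ID are placeholders, and a real invite usually also carries a permissions parameter, omitted here.

```go
package main

import (
	"fmt"
	"net/url"
)

// inviteURL (hypothetical helper) builds a bot invite link that requests
// both the bot and applications.commands scopes.
func inviteURL(clientID string) string {
	q := url.Values{}
	q.Set("client_id", clientID)
	// Scopes are space-separated; Encode renders the space as "+".
	q.Set("scope", "bot applications.commands")
	return "https://discord.com/oauth2/authorize?" + q.Encode()
}

func main() {
	// Placeholder application (client) ID.
	fmt.Println(inviteURL("123456789012345678"))
}
```

If the link used to invite the bot lacks applications.commands, re-inviting it with a URL of this shape is usually enough; the bot does not need to be removed from the server first.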
Voice channel issues
Symptom – The bot joins the voice channel but no audio is received or sent. Players hear silence.
Cause – The bot may lack voice permissions, or the Opus codec layer is failing.
Fix
- Ensure the bot has these permissions in the target voice channel:
  - Connect
  - Speak
  - Use Voice Activity (not Push-to-Talk only)
- Check for Opus codec errors in logs:
discord: failed to create opus decoder ssrc=...
discord: opus decode error ssrc=...
discord: failed to create opus encoder
These indicate a libopus installation issue. See the libopus not found section above.
- Verify the bot is not self-deafened. The platform connects with mute=false, deaf=false.
- If audio frames are being dropped, this happens silently: when a channel is full, the frame is dropped rather than blocking (Channel full -- drop frame rather than block). Increase system resources if under heavy load.
DM permissions not working
Symptom – All users can execute privileged commands, or no users can (even the DM).
Cause – The dm_role_id is misconfigured.
Fix
- If dm_role_id is empty, all users are treated as DMs (development mode). Set it to restrict access:
providers:
  audio:
    name: discord
    api_key: "MTIz..." # just the token, no "Bot " prefix
    options:
      guild_id: "123456789012345678"
      dm_role_id: "987654321012345678"
- The role ID must be a Discord role in the target guild. Copy it from Server Settings > Roles (right-click the role with Developer Mode enabled).
- Interactions from direct-message channels (outside guilds) always return IsDM = false because there is no Member object. Privileged commands must be run inside the guild.
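The three rules above can be read as a small decision function. The Go sketch below is an interpretation of that description, not Glyphoxa's code: the function name isDM, its parameters, and the order of the checks are assumptions.

```go
package main

import "fmt"

// isDM (hypothetical) mirrors the rules described above:
//   - no guild member (direct-message interaction): never a DM
//   - empty dmRoleID: everyone in the guild is a DM (development mode)
//   - otherwise: the member must hold the configured role
func isDM(dmRoleID string, inGuild bool, memberRoles []string) bool {
	if !inGuild {
		return false
	}
	if dmRoleID == "" {
		return true
	}
	for _, role := range memberRoles {
		if role == dmRoleID {
			return true
		}
	}
	return false
}

func main() {
	roles := []string{"111", "987654321012345678"}
	fmt.Println(isDM("987654321012345678", true, roles))  // true: role held
	fmt.Println(isDM("987654321012345678", true, nil))    // false: role missing
	fmt.Println(isDM("", true, nil))                      // true: development mode
	fmt.Println(isDM("987654321012345678", false, roles)) // false: not in a guild
}
```

Seen this way, "all users can run privileged commands" points at an empty dm_role_id, while "nobody can" points at a role ID that no member holds or at commands issued outside the guild.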
📄 Configuration Issues
Hot-reload not picking up changes
Symptom – You edit glyphoxa.yaml but the running instance does not reflect the changes.
Cause – The config watcher uses polling (not filesystem events). It checks the file’s modification time every 5 seconds by default. If the mtime has not changed (e.g., some editors write atomically to a temp file then rename), the watcher may not detect the change.
Fix
- Check the logs for watcher activity:
config watcher: configuration reloaded path=glyphoxa.yaml
If you see config watcher: failed to load config, the new file is invalid (for example, it has a YAML syntax error). Fix the error and the watcher will pick up the next valid version.
- The watcher computes a SHA-256 hash of the file contents. If the file was touched but the content is identical, no reload occurs. This is by design.
- Only certain fields support hot-reload without restart:
- NPC personality, voice, and budget tier changes
- Log level changes
- Changes to providers, discord, memory, or server settings require a full restart.
- If your editor writes to a temp file then renames, ensure the final file path matches the one Glyphoxa is watching.
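The hash comparison described above can be sketched as follows. This is an illustrative Go fragment, not the actual watcher: the changeDetector type is hypothetical, and a real watcher would call it from a polling loop after reading the file.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// changeDetector (hypothetical) remembers the SHA-256 of the last content seen.
type changeDetector struct {
	last [sha256.Size]byte
	seen bool
}

// changed reports whether content differs from the previous call.
// The first call always reports true, since nothing has been loaded yet.
func (d *changeDetector) changed(content []byte) bool {
	sum := sha256.Sum256(content)
	if d.seen && sum == d.last {
		return false // same bytes: touched but not modified
	}
	d.last, d.seen = sum, true
	return true
}

func main() {
	d := &changeDetector{}
	fmt.Println(d.changed([]byte("log_level: info")))  // true: first load
	fmt.Println(d.changed([]byte("log_level: info")))  // false: identical content
	fmt.Println(d.changed([]byte("log_level: debug"))) // true: content changed
}
```

Because the comparison is on content, saving the file without edits (or running touch on it) produces no reload log line, which is expected.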
Invalid provider options
Symptom
config: provider not registered: llm/"openaii"
or a startup warning:
unknown provider name -- may be a typo or third-party provider kind=llm name=openaii
Cause – The name field in a provider entry does not match any registered provider.
Fix
Use one of the valid provider names:
| Kind | Valid names |
|---|---|
| llm | openai, anthropic, ollama, gemini, deepseek, mistral, groq, llamacpp, llamafile |
| stt | deepgram, whisper, whisper-native |
| tts | elevenlabs, coqui |
| s2s | openai-realtime, gemini-live |
| embeddings | openai, ollama |
| vad | silero |
| audio | discord |
Missing required fields
Symptom – Startup fails with one or more validation errors:
npcs[0].name is required
npcs[0]: engine "cascaded" requires an LLM provider but providers.llm is not configured
npcs[0]: engine "cascaded" requires a TTS provider but providers.tts is not configured
npcs[0]: engine "s2s" requires an S2S provider but providers.s2s is not configured
mcp.servers[0].name is required
mcp.servers[0].command is required when transport is stdio
Cause – The config validator checks for cross-field consistency. Engine mode dictates which providers are required.
Fix
| Engine | Required providers |
|---|---|
| cascaded | llm + tts (and stt for speech input) |
| sentence_cascade | llm + tts |
| s2s | s2s |
Additional rules:
- Every NPC must have a non-empty name.
- NPC names must be unique (duplicates are rejected).
- voice.speed_factor must be in [0.5, 2.0].
- voice.pitch_shift must be in [-10, 10].
- server.log_level must be one of debug, info, warn, error (or empty for default).
- MCP servers with transport stdio require a command; streamable-http requires a url.
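A few of these rules can be expressed as a small validation pass. The Go sketch below is only an approximation of such a validator: the npc struct, the validate function, and all error strings except "npcs[N].name is required" are hypothetical, and it assumes defaults have already been applied to speed_factor and pitch_shift.

```go
package main

import "fmt"

// npc is a hypothetical, trimmed-down view of an NPC config entry.
type npc struct {
	Name        string
	SpeedFactor float64 // assumed already defaulted (e.g. 1.0)
	PitchShift  float64
}

// validate collects every rule violation instead of stopping at the first.
func validate(npcs []npc, logLevel string) []string {
	var errs []string
	seen := map[string]bool{}
	for i, n := range npcs {
		switch {
		case n.Name == "":
			errs = append(errs, fmt.Sprintf("npcs[%d].name is required", i))
		case seen[n.Name]:
			errs = append(errs, fmt.Sprintf("npcs[%d].name %q is duplicated", i, n.Name))
		}
		seen[n.Name] = true
		if n.SpeedFactor < 0.5 || n.SpeedFactor > 2.0 {
			errs = append(errs, fmt.Sprintf("npcs[%d].voice.speed_factor must be in [0.5, 2.0]", i))
		}
		if n.PitchShift < -10 || n.PitchShift > 10 {
			errs = append(errs, fmt.Sprintf("npcs[%d].voice.pitch_shift must be in [-10, 10]", i))
		}
	}
	switch logLevel {
	case "", "debug", "info", "warn", "error":
		// valid; the empty string selects the default level
	default:
		errs = append(errs, fmt.Sprintf("server.log_level %q is not valid", logLevel))
	}
	return errs
}

func main() {
	bad := []npc{{Name: "", SpeedFactor: 3.0}}
	for _, e := range validate(bad, "verbose") {
		fmt.Println(e)
	}
}
```

Collecting all violations in one pass matches the symptom described above, where startup can fail with several validation errors at once.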
Unknown fields in config
Symptom
config: decode yaml: line 12: field foo_bar not found in type config.Config
Cause – The YAML decoder is configured with KnownFields(true), which rejects unrecognised keys. This catches typos early.
Fix
Remove or correct the unrecognised field. Check the field name against the struct tags in internal/config/config.go. Common mistakes:
| Wrong | Correct |
|---|---|
| api-key | api_key |
| guildId | guild_id |
| listenAddr | listen_addr |
| postgresdsn | postgres_dsn |
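Strict decoding of this kind is easy to try in isolation. The sketch below deliberately swaps in the standard library's encoding/json with DisallowUnknownFields, because it needs no third-party module; it is an analogue of the KnownFields(true) behavior described above, not the YAML decoder Glyphoxa uses. The serverConfig struct is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// serverConfig is a hypothetical two-field config used only for this demo.
type serverConfig struct {
	ListenAddr string `json:"listen_addr"`
	LogLevel   string `json:"log_level"`
}

// decodeStrict rejects any key that does not map to a struct field.
func decodeStrict(doc string) (serverConfig, error) {
	var cfg serverConfig
	dec := json.NewDecoder(strings.NewReader(doc))
	dec.DisallowUnknownFields()
	err := dec.Decode(&cfg)
	return cfg, err
}

func main() {
	// camelCase typo: rejected with an unknown-field error.
	_, err := decodeStrict(`{"listenAddr": ":8080"}`)
	fmt.Println("typo:", err)

	// Correct snake_case key: accepted.
	cfg, err := decodeStrict(`{"listen_addr": ":8080"}`)
	fmt.Println("ok:", cfg.ListenAddr, err)
}
```

The design trade-off is the same in both decoders: a misspelled key fails loudly at startup instead of being silently ignored and leaving a setting at its default.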
🔍 Diagnostic Steps
Health endpoint check
Glyphoxa exposes two health endpoints:
| Endpoint | Purpose |
|---|---|
| GET /healthz | Liveness probe. Always returns 200 OK if the process is running. |
| GET /readyz | Readiness probe. Returns 200 only when all registered health checks pass. |
# Liveness
curl -s http://localhost:8080/healthz | jq .
# {"status":"ok"}
# Readiness (includes per-check results)
curl -s http://localhost:8080/readyz | jq .
# {
# "status": "ok",
# "checks": {
# "database": "ok",
# "providers": "ok"
# }
# }
If /readyz returns 503 Service Unavailable, inspect the checks map for the failing component:
{
"status": "fail",
"checks": {
"database": "fail: postgres store: ping: connection refused",
"providers": "ok"
}
}
Each individual check has a 5-second timeout. If a check consistently times out, the underlying dependency is likely unreachable.
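For scripted monitoring, the readiness payload can be parsed to list only the failing checks. The Go sketch below assumes the JSON shape shown in the examples above (a status string plus a checks map of strings); the readiness type and failingChecks function are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// readiness models the /readyz body as shown in the examples above.
type readiness struct {
	Status string            `json:"status"`
	Checks map[string]string `json:"checks"`
}

// failingChecks returns "name: result" for every check whose result is not
// "ok", sorted by name for stable output.
func failingChecks(body []byte) ([]string, error) {
	var r readiness
	if err := json.Unmarshal(body, &r); err != nil {
		return nil, err
	}
	var failed []string
	for name, result := range r.Checks {
		if result != "ok" {
			failed = append(failed, name+": "+result)
		}
	}
	sort.Strings(failed)
	return failed, nil
}

func main() {
	body := []byte(`{"status":"fail","checks":{"database":"fail: postgres store: ping: connection refused","providers":"ok"}}`)
	failed, err := failingChecks(body)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, f := range failed {
		fmt.Println(f)
	}
}
```

In practice the body would come from an HTTP GET against /readyz; the parsing step is kept separate here so it can be exercised without a running server.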
Log analysis
Set the log level to debug for maximum visibility:
server:
  log_level: debug
Key log patterns to look for:
| Pattern | Meaning |
|---|---|
| circuit breaker opened name=... | A provider has failed 5+ times and is being bypassed |
| circuit breaker transitioning to half-open | The breaker is testing whether the provider has recovered |
| circuit breaker closed after successful probes | The provider has recovered |
| provider failed, trying next provider=... error=... | A fallback group is trying the next healthy provider |
| all providers failed | Every provider in a fallback group has failed or has an open circuit |
| config watcher: configuration reloaded | Hot-reload detected and applied a config change |
| config watcher: failed to load config | Hot-reload detected a change but the new config is invalid |
| discord: register commands | Slash command registration outcome |
| whisper native inference failed | whisper.cpp CGO inference error |
| voicecmd: command executed | A DM voice command was recognised and executed |
Prometheus metrics inspection
If you have Prometheus scraping enabled, key metrics to inspect:
# Pipeline latency by stage (p95 over 5 minutes)
histogram_quantile(0.95, rate(glyphoxa_pipeline_duration_seconds_bucket[5m]))
# Error rate
rate(glyphoxa_pipeline_errors_total[5m])
# Circuit breaker state (1 = open, 0 = closed)
glyphoxa_circuit_breaker_state{name="deepgram"}
# STT / LLM / TTS individual stage latencies
histogram_quantile(0.95, rate(glyphoxa_stt_duration_seconds_bucket[5m]))
histogram_quantile(0.95, rate(glyphoxa_llm_duration_seconds_bucket[5m]))
histogram_quantile(0.95, rate(glyphoxa_tts_duration_seconds_bucket[5m]))
Common SQL queries for memory debugging
Connect to your PostgreSQL database and run these queries to diagnose memory issues:
-- Check if tables were created
SELECT tablename FROM pg_tables WHERE schemaname = 'public';
-- Verify pgvector extension
SELECT * FROM pg_extension WHERE extname = 'vector';
-- Count session log entries
SELECT session_id, COUNT(*) AS entries
FROM session_entries
GROUP BY session_id
ORDER BY entries DESC;
-- Recent session entries (last 20)
SELECT id, session_id, speaker_name, LEFT(text, 80) AS text_preview, timestamp
FROM session_entries
ORDER BY timestamp DESC
LIMIT 20;
-- Count semantic chunks
SELECT COUNT(*) FROM chunks;
-- Check if embeddings are populated (NULL = missing embeddings)
SELECT id, session_id, LEFT(content, 60) AS content_preview,
(embedding IS NOT NULL) AS has_embedding
FROM chunks
ORDER BY timestamp DESC
LIMIT 20;
-- Count entities in the knowledge graph
SELECT type, COUNT(*) FROM entities GROUP BY type;
-- List all relationships
SELECT source_id, rel_type, target_id
FROM relationships
ORDER BY created_at DESC
LIMIT 20;
-- Search session log (full-text)
SELECT id, speaker_name, text, timestamp
FROM session_entries
WHERE to_tsvector('english', text) @@ plainto_tsquery('english', 'blacksmith sword')
ORDER BY timestamp DESC
LIMIT 10;
-- Find nearest semantic chunks (requires a pre-computed embedding vector)
-- Replace the vector literal with an actual embedding:
-- SELECT id, content, embedding <=> '[0.1, 0.2, ...]'::vector AS distance
-- FROM chunks
-- ORDER BY distance
-- LIMIT 5;
See also
- getting-started.md – Installation, first run, and basic configuration
- configuration.md – Full configuration reference
- observability.md – Prometheus metrics, logging, and dashboard details
- deployment.md – Docker, systemd, and production deployment guides