A skill that gives AI coding agents accurate, up-to-date knowledge of AssemblyAI's speech-to-text APIs, SDKs, and voice agent integrations. Works with Claude Code, Codex, Cursor, and other coding agents that support skill/rules files.
LLM training data contains outdated AssemblyAI patterns — deprecated LeMUR API calls, discontinued SDK usage, wrong auth headers, and no knowledge of newer features like the LLM Gateway, streaming v3, or voice agent framework integrations. This skill corrects those mistakes and adds coverage for the full current API surface.
Without the skill, coding agents will:
- Use the deprecated LeMUR API instead of the LLM Gateway
- Use
Authorization: Bearer KEYinstead ofAuthorization: KEY - Use
word_boostinstead ofkeyterms_prompt - Use discontinued Java/Go/C# SDKs
- Miss all LiveKit/Pipecat-specific gotchas for voice agents
- Use wrong model ID formats (
anthropic/claude-...instead ofclaude-...)
| Area | Details |
|---|---|
| Pre-recorded transcription | Universal-3-Pro, Universal-2, prompting, speech_models fallback |
| Streaming STT | v3 protocol, v2 legacy, Whisper Streaming, temp tokens, error codes |
| Voice agents | LiveKit and Pipecat integrations, u3-rt-pro, turn detection, silence tuning, latency optimization |
| LLM Gateway | Chat completions, tool calling, agentic workflows, structured output caveats, full model list |
| Audio intelligence | PII redaction, diarization, summarization, sentiment, entity detection, content safety, chapters |
| Speech understanding | Translation, speaker identification, custom formatting |
| SDKs | Python and JS/TS patterns, Ruby status, discontinued SDK warnings |
| API reference | Full parameter list, export endpoints, webhooks, custom spelling, multichannel, code switching |
claude skill add --from ./assemblyaiOr add it as a plugin skill from this repo.
Copy the assemblyai/ directory into your project, then reference it in your AGENTS.md:
When working with AssemblyAI, read and follow the instructions in assemblyai/SKILL.mdCopy the assemblyai/ directory into your project and add a rule or instruction pointing to assemblyai/SKILL.md. Most agents that support custom rules or docs can ingest the skill content directly. For example, in Cursor you can add the assemblyai/ folder as project-level documentation.
The skill uses progressive disclosure to keep context usage efficient. The core SKILL.md (122 lines) is always loaded and contains auth patterns, model overview, common mistakes, and gotchas. Detailed reference files are only loaded when relevant:
assemblyai/
├── SKILL.md # Core skill (always loaded)
└── references/
├── python-sdk.md # Python SDK patterns
├── js-sdk.md # JS/TS SDK patterns
├── streaming.md # Streaming STT protocol details
├── voice-agents.md # LiveKit, Pipecat integrations
├── llm-gateway.md # LLM Gateway models and usage
├── speech-understanding.md # Translation, speaker ID, formatting
├── audio-intelligence.md # PII, diarization, summarization, etc.
└── api-reference.md # Full API parameters, endpoints, webhooks
The assemblyai-workspace/ directory contains test results comparing skill vs. no-skill outputs across three scenarios:
| Test Case | With Skill | Without Skill |
|---|---|---|
| Basic transcription + summary (Python) | 4/4 | 4/4 |
| Voice agent with LiveKit | 7/7 | 0/7 |
| LLM Gateway + PII redaction (TypeScript) | 6/6 | 3/6 |
| Overall | 17/17 (100%) | 7/17 (41%) |
The skill provides the most value for voice agent integrations (where LLMs have no training data for framework-specific pitfalls) and LLM Gateway usage (where LLMs default to the deprecated LeMUR API). Evals were run with Claude Code but results should generalize to other agents.