Agent Configuration

An Agent defines how your voice AI behaves: what it says, how it sounds, what tools it can use, and what guardrails it follows.

Creating an Agent

Use the phone.agent() factory method. The simplest form reads credentials from environment variables and uses the default engine (OpenAIRealtime):
from getpatter import Patter, Twilio

phone = Patter(carrier=Twilio(), phone_number="+15550001234")   # TWILIO_* from env

agent = phone.agent(
    system_prompt="You are a customer support agent for Acme Corp.",
    first_message="Hello! How can I help you today?",
)   # defaults to engine=OpenAIRealtime(), reads OPENAI_API_KEY
To pick the engine explicitly (flat imports):
from getpatter import OpenAIRealtime

agent = phone.agent(
    engine=OpenAIRealtime(voice="nova"),
    system_prompt="You are a customer support agent for Acme Corp.",
)
To use pipeline mode (pick STT, LLM, TTS independently):
from getpatter import DeepgramSTT, AnthropicLLM, ElevenLabsTTS

agent = phone.agent(
    stt=DeepgramSTT(endpointing_ms=80),        # DEEPGRAM_API_KEY from env
    llm=AnthropicLLM(),                        # ANTHROPIC_API_KEY from env
    tts=ElevenLabsTTS(voice_id="rachel"),      # ELEVENLABS_API_KEY from env
    system_prompt="You are a helpful assistant.",
    first_message="Hi!",
)
Available LLM providers: OpenAILLM, AnthropicLLM, GroqLLM, CerebrasLLM, GoogleLLM. Tool calling works across all five. See LLM for the full reference.

For fully custom logic (multi-model routing, local models), drop llm= and pass an on_message callback to serve() instead; llm= and on_message are mutually exclusive.

The same pipeline using namespaced imports:
from getpatter.stt import deepgram
from getpatter.llm import anthropic
from getpatter.tts import elevenlabs

agent = phone.agent(
    stt=deepgram.STT(endpointing_ms=80),
    llm=anthropic.LLM(),
    tts=elevenlabs.TTS(voice_id="rachel"),
    system_prompt="You are a helpful assistant.",
)

Agent Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| system_prompt | str | required | Instructions that define the agent’s behavior. |
| engine | OpenAIRealtime \| ElevenLabsConvAI \| None | None → OpenAI Realtime | End-to-end engine. See Engines. Omit for pipeline mode. |
| stt | STTProvider \| None | None | STT instance for pipeline mode (DeepgramSTT(), CartesiaSTT(), …). See STT. |
| llm | LLMProvider \| None | None | LLM instance for pipeline mode (AnthropicLLM(), GroqLLM(), …). Mutually exclusive with on_message on serve(). Ignored when engine is set. See LLM. |
| tts | TTSProvider \| None | None | TTS instance for pipeline mode (ElevenLabsTTS(), RimeTTS(), …). See TTS. |
| voice | str | "alloy" | Voice name. Usually inferred from the engine or TTS instance. |
| model | str | "gpt-4o-mini-realtime-preview" | Model ID for OpenAI Realtime. Usually inferred from the engine. |
| language | str | "en" | BCP-47 language code. |
| first_message | str | "" | If set, the agent speaks this immediately when a call connects. |
| tools | list[Tool] \| None | None | Tool(...) instances for function calling. See Tools. |
| variables | dict \| None | None | Dynamic variable substitutions for {placeholder} patterns in the system prompt. Values limited to 500 chars. |
| guardrails | list[Guardrail] \| None | None | Guardrail(...) instances applied to LLM output. See Guardrails. |
| hooks | PipelineHooks \| None | None | Pipeline hooks for intercepting STT/TTS processing. Pipeline mode only. See Events. |
| text_transforms | list[Callable] \| None | None | Text transformation functions applied to LLM output before TTS. Pipeline mode only. |
| vad | VADProvider \| None | None | Voice activity detection provider (e.g. Silero). Pipeline mode only. |
| audio_filter | AudioFilter \| None | None | Pre-STT audio filter (e.g. Krisp noise suppression). Pipeline mode only. |
| background_audio | BackgroundAudioPlayer \| None | None | Hold music / ambient-cue mixer. Pipeline mode only. |
| barge_in_threshold_ms | int | 300 | Sustained-voice window (ms) before treating caller audio as barge-in. Set to 0 to disable. |
| aggressive_first_flush | bool | False | Opt-in low-latency mode: emits the first clause on a soft punctuation boundary (comma, em-dash, en-dash) once the buffer reaches ~40 chars. Saves 200–500 ms TTFA on the first sentence at the cost of slightly clipped prosody. Hard-disabled when language starts with "it" (Italian decimal commas would split mid-number). Pipeline mode only. |
| disable_phone_preamble | bool | False | When False (default), Patter prepends a phone-friendly preamble to system_prompt that instructs the LLM to avoid markdown, emojis, bullet lists, and code blocks; spell out numbers and dates; and keep replies short. Set to True to ship system_prompt verbatim. |
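The aggressive_first_flush rule can be illustrated with a small standalone sketch. This is not Patter's implementation. The helper name first_flush_boundary, the exact 40-character cutoff, and the choice to cut at the last soft boundary in the buffer are assumptions made for illustration.

```python
from typing import Optional

SOFT_BOUNDARIES = (",", "\u2014", "\u2013")  # comma, em-dash, en-dash
MIN_CHARS = 40  # assumed concrete value for the documented "~40 chars"


def first_flush_boundary(buffer: str, language: str = "en",
                         aggressive: bool = True) -> Optional[int]:
    """Return the index just past the last soft boundary, or None to keep buffering."""
    # Italian is excluded: a decimal comma would split a number mid-value.
    if not aggressive or language.lower().startswith("it"):
        return None
    # Too little text buffered to be worth flushing yet.
    if len(buffer) < MIN_CHARS:
        return None
    # rfind returns -1 when a character is absent, so max() is -1 if none match.
    cut = max(buffer.rfind(ch) for ch in SOFT_BOUNDARIES)
    return cut + 1 if cut >= 0 else None


text = "Thanks for calling, I can certainly help you with that today"
first_clause = text[:first_flush_boundary(text)]
```

Here first_clause holds the text up to and including the comma, which could be sent to TTS while the rest of the sentence is still streaming in.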

Agent Dataclass

Agent is a frozen (immutable) dataclass. You can construct it directly when you need an instance without going through phone.agent():
from getpatter import Agent

agent = Agent(
    system_prompt="You are a helpful assistant.",
    voice="echo",
    language="es",
)
Prefer phone.agent() over constructing Agent directly — the factory method validates credentials, unpacks the engine/STT/TTS instances, and surfaces clear errors up front.
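Because the dataclass is frozen, changing a field means building a new instance. The sketch below uses a hypothetical stand-in class, AgentSketch, instead of getpatter's Agent so that it runs without the SDK. It assumes Agent behaves like any standard frozen dataclass.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class AgentSketch:
    """Hypothetical stand-in carrying three of the documented fields."""
    system_prompt: str
    voice: str = "alloy"
    language: str = "en"


base = AgentSketch(system_prompt="You are a helpful assistant.", voice="echo")

try:
    base.language = "es"  # assignment on a frozen dataclass raises
except FrozenInstanceError:
    mutated = False
else:
    mutated = True

# dataclasses.replace() copies the instance with the selected fields changed.
spanish = replace(base, language="es")
```

The original instance is left untouched, so one base configuration can be shared and varied per call.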

System Prompt

The system_prompt defines the agent’s personality, instructions, and constraints:
agent = phone.agent(
    system_prompt="""You are a scheduling assistant for Dr. Smith's dental office.

Rules:
- Only book appointments Monday through Friday, 9am to 5pm.
- Each appointment is 30 minutes.
- Always confirm the patient's name and phone number.
- If the patient has an emergency, transfer them to the front desk.
""",
)

Dynamic Variables

Use {placeholder} syntax in the system prompt to inject dynamic values at call start. Values are limited to 500 characters each.
agent = phone.agent(
    system_prompt="""You are a support agent for {company_name}.
The customer's name is {customer_name} and their account ID is {account_id}.
Greet them by name and help resolve their issue.""",
    variables={
        "company_name": "Acme Corp",
        "customer_name": "Jane Doe",
        "account_id": "ACC-12345",
    },
)
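The substitution step can be pictured with a minimal sketch. The helper render_prompt is hypothetical and not part of getpatter. Rejecting oversize values (rather than truncating them) and leaving unknown placeholders untouched are both assumptions.

```python
import re

MAX_VALUE_CHARS = 500  # documented per-value limit
_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def render_prompt(template: str, variables: dict) -> str:
    """Replace {name} placeholders in template with values from variables."""
    for key, value in variables.items():
        if len(str(value)) > MAX_VALUE_CHARS:
            # Assumption: oversize values raise instead of being truncated.
            raise ValueError(f"variable {key!r} exceeds {MAX_VALUE_CHARS} characters")

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Assumption: placeholders without a matching variable stay as written.
        return str(variables[name]) if name in variables else match.group(0)

    return _PLACEHOLDER.sub(substitute, template)


rendered = render_prompt(
    "You are a support agent for {company_name}. The customer is {customer_name}.",
    {"company_name": "Acme Corp", "customer_name": "Jane Doe"},
)
```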

First Message

When first_message is set, the agent speaks it immediately when a call connects:
agent = phone.agent(
    system_prompt="You are a restaurant reservation assistant.",
    first_message="Good evening! Thank you for calling Luigi's. Would you like to make a reservation?",
)

Voice Selection

Voice is usually inferred from the engine or TTS instance, e.g. OpenAIRealtime(voice="nova") or ElevenLabsTTS(voice_id="rachel"). Available voices depend on the provider. OpenAI voice names, for example:
"alloy", "echo", "fable", "onyx", "nova", "shimmer"

Voice Activity Detection (VAD)

Pipeline-mode agents can plug a VAD provider into the vad= parameter to gate STT around real speech and drive barge-in detection. The SDK ships Silero VAD (an ONNX model, ~1 MB) with a telephony-tuned factory:
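import asyncio
from getpatter import SileroVAD, DeepgramSTT, AnthropicLLM, ElevenLabsTTS

async def build_agent():
    # Recommended for any phone-call deployment. The loader is synchronous,
    # so run it in a worker thread to keep the event loop responsive.
    vad = await asyncio.to_thread(SileroVAD.for_phone_call)

    # `phone` is the Patter instance created earlier.
    return phone.agent(
        stt=DeepgramSTT(),
        llm=AnthropicLLM(),
        tts=ElevenLabsTTS(voice_id="rachel"),
        system_prompt="You are a helpful assistant.",
        vad=vad,
    )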
SileroVAD.for_phone_call(**overrides) is identical to SileroVAD.load(...) but pins sample_rate to 16 000 Hz — the only sample rate Patter’s pipeline-mode audio bus uses (8 kHz mulaw from Twilio is upsampled to 16 kHz PCM before reaching the VAD). All other parameters use the upstream snakers4/silero-vad defaults:
| Field | Default | Upstream equivalent |
| --- | --- | --- |
| activation_threshold | 0.5 | threshold |
| deactivation_threshold | 0.35 | neg_threshold = threshold − 0.15 |
| min_speech_duration | 0.25 s | min_speech_duration_ms = 250 |
| min_silence_duration | 0.1 s | min_silence_duration_ms = 100 |
| prefix_padding_duration | 0.03 s | speech_pad_ms = 30 |
Override per call site rather than as a global default. A common tweak: deployments that experience truncation on natural pauses raise min_silence_duration to 0.5–1.0 s:
# Inside an async function:
vad = await asyncio.to_thread(
    SileroVAD.for_phone_call, min_silence_duration=0.5
)
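To see why a longer min_silence_duration avoids truncation on natural pauses, here is a simplified sketch of threshold-based segmentation over per-frame speech probabilities. It is not Silero's or Patter's algorithm: the function speech_segments, the 50 ms frame size, and the synthetic probabilities are assumptions, and prefix padding is ignored.

```python
def speech_segments(probs, frame_s=0.05, activation_threshold=0.5,
                    deactivation_threshold=0.35, min_speech_duration=0.25,
                    min_silence_duration=0.1):
    """Return (start_frame, end_frame) pairs, end exclusive."""
    # Work in whole frames to avoid accumulating floating-point error.
    min_speech = round(min_speech_duration / frame_s)
    min_silence = max(1, round(min_silence_duration / frame_s))
    segments, start, quiet = [], None, 0
    for i, p in enumerate(probs):
        if start is None:
            if p >= activation_threshold:      # speech begins
                start, quiet = i, 0
        elif p < deactivation_threshold:       # candidate silence
            quiet += 1
            if quiet >= min_silence:           # silence long enough: close segment
                end = i + 1 - quiet
                if end - start >= min_speech:  # drop blips that are too short
                    segments.append((start, end))
                start = None
        else:
            quiet = 0                          # still speech; reset silence run
    if start is not None:                      # audio ended mid-segment
        end = len(probs) - quiet
        if end - start >= min_speech:
            segments.append((start, end))
    return segments


# 0.5 s speech, 0.3 s pause, 0.5 s speech, 0.6 s trailing silence
probs = [0.9] * 10 + [0.1] * 6 + [0.9] * 10 + [0.1] * 12
split = speech_segments(probs)                             # default 0.1 s
merged = speech_segments(probs, min_silence_duration=0.5)
```

With the default, the 0.3 s pause ends the first segment, so the utterance is split in two. With 0.5 s the pause is bridged and the whole utterance is treated as one turn.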
SileroVAD.load(...) and SileroVAD.for_phone_call(...) are synchronous (they load the ONNX model). Wrap them in asyncio.to_thread(...) so the event loop stays responsive during process startup.
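The effect of asyncio.to_thread can be shown without the SDK. In this sketch a hypothetical load_model_blocking function stands in for the synchronous model load (it only sleeps), and a heartbeat task records whether the event loop keeps running while the load is in progress.

```python
import asyncio
import time


def load_model_blocking() -> str:
    """Hypothetical stand-in for a synchronous model load."""
    time.sleep(0.3)
    return "model"


async def heartbeat(ticks: list) -> None:
    # Runs on the event loop; it only advances while the loop is free.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.05)


async def startup() -> tuple:
    ticks: list = []
    beat = asyncio.create_task(heartbeat(ticks))
    # The blocking call runs in a worker thread, so heartbeat keeps ticking.
    model = await asyncio.to_thread(load_model_blocking)
    beat.cancel()
    return model, len(ticks)


model, tick_count = asyncio.run(startup())
```

Calling load_model_blocking() directly inside startup() would instead stall the loop for the full load, and the heartbeat would record nothing during it.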

Engine vs Pipeline Mode

from getpatter import OpenAIRealtime, ElevenLabsConvAI, DeepgramSTT, AnthropicLLM, ElevenLabsTTS

# OpenAI Realtime (default engine) — end-to-end
agent = phone.agent(
    engine=OpenAIRealtime(),
    system_prompt="...",
)

# ElevenLabs Conversational AI — natural voices
agent = phone.agent(
    engine=ElevenLabsConvAI(agent_id="agent_abc123"),
    system_prompt="...",
)

# Pipeline — pick STT, LLM, TTS independently
agent = phone.agent(
    stt=DeepgramSTT(),
    llm=AnthropicLLM(),
    tts=ElevenLabsTTS(voice_id="rachel"),
    system_prompt="...",
)
See LLM for a deeper comparison.
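The selection rules described on this page can be summarised in a short sketch. resolve_mode is a hypothetical helper, not a getpatter API. The precedence shown (engine wins over pipeline parts, and any one of stt/llm/tts selects pipeline mode) is an assumption based on the parameter descriptions.

```python
from typing import Any, Callable, Optional


def resolve_mode(engine: Any = None, stt: Any = None, llm: Any = None,
                 tts: Any = None,
                 on_message: Optional[Callable] = None) -> str:
    """Return "engine" or "pipeline" for a given combination of arguments."""
    if llm is not None and on_message is not None:
        raise ValueError("llm= and on_message are mutually exclusive")
    if engine is not None:
        return "engine"    # llm is ignored when an engine is set
    if any(part is not None for part in (stt, llm, tts)):
        return "pipeline"  # assumption: any pipeline part selects pipeline mode
    return "engine"        # nothing specified: default engine (OpenAI Realtime)


default_mode = resolve_mode()
pipeline_mode = resolve_mode(stt=object(), llm=object(), tts=object())
```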

Complete Example

import os
import asyncio
from dotenv import load_dotenv
from getpatter import Patter, Twilio, OpenAIRealtime, Tool, Guardrail

load_dotenv()

phone = Patter(
    carrier=Twilio(),                               # TWILIO_* from env
    phone_number=os.environ["PHONE_NUMBER"],
    webhook_url=os.environ["WEBHOOK_URL"],
)

async def check_availability(args: dict, ctx: dict) -> dict:
    # Hit your reservation system here.
    return {"available": True, "rooms": 3}

agent = phone.agent(
    engine=OpenAIRealtime(voice="nova"),            # OPENAI_API_KEY from env
    system_prompt="""You are a booking assistant for {hotel_name}.
Help guests check availability and make reservations.
Be warm, professional, and concise.""",
    language="en",
    first_message="Welcome to {hotel_name}! How can I assist you with your stay?",
    variables={"hotel_name": "The Grand Hotel"},
    tools=[
        Tool(
            name="check_availability",
            description="Check room availability for given dates",
            parameters={
                "type": "object",
                "properties": {
                    "check_in": {"type": "string", "description": "Check-in date (YYYY-MM-DD)"},
                    "check_out": {"type": "string", "description": "Check-out date (YYYY-MM-DD)"},
                    "guests": {"type": "integer", "description": "Number of guests"},
                },
                "required": ["check_in", "check_out"],
            },
            handler=check_availability,
        ),
    ],
    guardrails=[
        Guardrail(
            name="No pricing promises",
            blocked_terms=["discount", "free upgrade", "complimentary"],
            replacement="I'd be happy to check our current rates for you.",
        ),
    ],
)

async def main():
    await phone.serve(agent, port=8000)

asyncio.run(main())
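Because the tool handler is an ordinary coroutine taking (args, ctx), it can be exercised on its own before wiring it into a call. The argument values and the empty ctx below are illustrative; what Patter actually passes in ctx is not specified on this page.

```python
import asyncio


async def check_availability(args: dict, ctx: dict) -> dict:
    # Same stub as in the example above; a real handler would query a backend.
    return {"available": True, "rooms": 3}


result = asyncio.run(
    check_availability(
        {"check_in": "2030-01-10", "check_out": "2030-01-12", "guests": 2},
        {},  # assumption: an empty dict is enough context for this stub
    )
)
```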