CloudTalk VoiceAgents allow you to create AI-powered virtual agents that can make outbound calls, guide conversations, and extract insights - all within your existing CloudTalk account.
This article explains how to enable, set up, and launch your first VoiceAgent directly from the CloudTalk Dashboard.
Enabling the VoiceAgent Feature
VoiceAgents are available on request. If you don’t see the feature in your Dashboard, please contact CloudTalk Support to enable it.
Once activated:
- A VoiceAgents section will appear in the main Dashboard menu.
- Calls made by VoiceAgents are charged per minute and deducted from your account balance or billed accordingly.
Creating & Configuring a VoiceAgent
Step 1: Go to Dashboard > VoiceAgents > Agents
From the main Dashboard menu, open the VoiceAgents tab and select Agents.
Here, you can view existing agents or create a new one.
Click the ➕ icon to start creating a new agent.
Step 2: Set General Info
After selecting or creating a new agent, fill in the basics:
| Setting | Description |
| --- | --- |
| Agent Name | The internal name of the VoiceAgent. This name will appear in Call Logs, Analytics, and any Integrations (e.g., in HubSpot as the Agent Name). Choose a name that clearly identifies the agent’s purpose or target audience (e.g., "Demo Booker", "Onboarding Agent"). |
| Language | Currently, only English is supported. |
| Call Direction | Currently, Outbound only. |
| Outbound Number | Choose a specific number or set to Automatic. If set to Automatic, the system will use a number that exactly matches the destination country. For example, if calling a German number and a German number exists in your account, it will be used. However, the system does not pick the "closest" alternative - only exact country matches are supported. If no match is found, the Failover Number will be used. |
| Failover Number | Backup number used if no local number is available. |
| Maximum Call Duration | A hard limit (in minutes) to control costs. When the limit is reached, the call is immediately terminated - even mid-sentence. The VoiceAgent will not attempt to wrap up or finish its task. We recommend setting a buffer above the expected conversation length to avoid abrupt call endings. |
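As a mental model, the Automatic outbound-number selection described above behaves roughly like the sketch below (Python; the function and the country-code map are hypothetical illustrations, not part of any CloudTalk API):

```python
def pick_outbound_number(destination_country: str,
                         account_numbers: dict[str, str],
                         failover_number: str) -> str:
    """Illustrates Automatic selection: exact country match or failover.

    account_numbers maps country codes to numbers you own,
    e.g. {"DE": "+49301234567", "US": "+12025550123"}.
    There is no "closest match" fallback - anything other than an
    exact match of the destination country uses the failover number.
    """
    return account_numbers.get(destination_country, failover_number)
```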
Step 3: Define Conversation Behavior
Set how your agent greets, responds, and handles the call. You can define tone, script, and dynamic responses using AI prompts.
Want help writing strong prompts? See our Prompt Writing Guide.
Call Analysis Prompt
Call Analysis Prompts are executed after the call ends, using the full transcript of the conversation. We use OpenAI (ChatGPT) to analyze the transcript and return structured data in a predefined JSON format.
What It’s For
Use this prompt to extract key business insights from the call - such as customer interest, challenges, CRM tools, go-live timeline, and demo readiness. These outputs are commonly used to populate CRM records, trigger follow-up actions, or build dashboards.
How It Works
When the call ends, the transcript is sent to ChatGPT along with your Call Analysis Prompt.
The AI is expected to return a JSON object containing exactly the fields you define.
The returned JSON will be delivered to your Call Results Endpoint (configured in the VoiceAgent settings).
Defining What to Extract
You must clearly define:
- Which fields you want in the output (e.g. `crm`, `wants_demo`, `preferred_date`)
- Field types:
  - `boolean`: true or false (e.g., `wants_demo: true`)
  - `string`: text (e.g., `crm: "HubSpot"`)
  - `number`: numerical values (e.g., `total_team_size: 5`)
  - `array`: a list (e.g., `team_sizes: [2, 3, 4]`)
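Put together, a returned object for the fields above might look like this (values are illustrative):

```json
{
  "crm": "HubSpot",
  "wants_demo": true,
  "preferred_date": "2024-06-12",
  "total_team_size": 5,
  "team_sizes": [2, 3]
}
```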
Extraction Logic Tips
- For simple fields like `crm`, the AI can usually guess correctly.
- For more complex ones, add logic to your prompt so the AI knows what to do. Example logic: "Total Team Size" is the sum of all team sizes mentioned in the call.
- Only information mentioned during the call will be extracted. Add clear logic in your prompt if needed.
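In the prompt itself, that logic can be spelled out next to the field definition, for example (illustrative wording only; any phrasing the model can follow works):

```
"total_team_size": number. Calculate it as the sum of all team sizes
mentioned in the call. If no team sizes are mentioned, return 0.
```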
Example Call Analysis Prompt
This is a basic example used to extract a single field (`name`) from the call transcript. It follows all required formatting and response rules for use with CloudTalk's Call Analysis system.
```
You are a call-analyzing assistant that analyzes sales call transcripts. Your task is to extract key information from the conversation and return it in a specific JSON format.
Only include information that was explicitly mentioned in the conversation.
You must respond with a valid JSON object containing these fields: { "name": string }
Use empty strings for unknown string fields, 0 for unknown numbers, and false for unknown booleans.
Your response must be only the JSON object, without any additional text, markdown formatting, or code blocks.
```
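With this prompt, the analysis of a call where the caller introduces themselves would come back as a single bare JSON object, for example (the value is illustrative):

```json
{ "name": "John Smith" }
```

If no name was mentioned during the call, the expected response is `{ "name": "" }`.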
Call Results Endpoint
This is where JSON results from the Call Analysis Prompt will be sent.
Recommended tools:
- Webhook.site (for testing)
- Zapier / Make / Tray.io
- Your internal systems
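If you route results to your own service, a minimal receiver can be as simple as the sketch below (Python with Flask; the `/call-results` path and port are placeholders, not CloudTalk requirements):

```python
from flask import Flask, request

app = Flask(__name__)

# Receives the JSON object produced by your Call Analysis Prompt,
# plus any Output Variables attached when the call was triggered.
@app.route("/call-results", methods=["POST"])
def call_results():
    payload = request.get_json(force=True)
    print("Call analysis result:", payload)
    return "", 204

if __name__ == "__main__":
    app.run(port=8080)
```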
Step 4: Configure Voice & Model Settings
| Setting | Description |
| --- | --- |
| Provider | Deepgram or ElevenLabs. |
| Language Model | OpenAI or Anthropic. Choose based on use case: fast/simple replies → GPT-4o Mini; deep, conversational → GPT-4o / Claude Sonnet; structured and concise → Claude Haiku. |
| Voice | Pick your VoiceAgent’s voice. You can preview options directly in the dashboard. Some voices include gender, accent, and tone labels (e.g., Jessica - young, American, female). |
| Temperature (ElevenLabs only) | Controls response style. Low (0-0.3) = more factual and robotic; high (0.6-1) = more expressive and varied. Useful for adjusting the "personality" of the voice. |
| Streaming Latency (ElevenLabs only) | Controls response speed. Lower latency = faster response time, but may slightly reduce naturalness; higher latency = smoother output, but with more delay. |
| Stability (ElevenLabs only) | Higher = more consistent tone; good for professional agents. Lower = more dynamic/emotional; great for sales or casual calls. |
| Similarity (ElevenLabs only) | Controls how closely the voice mimics its reference. Higher = clearer, more exact match, but risks sounding synthetic. Lower = warmer and more fluid, but slightly less consistent. |
Triggering Outbound VoiceAgent Calls via API
Rather than starting calls manually, outbound VoiceAgent calls are initiated using the CloudTalk API - often as part of a workflow automation using:
- HubSpot Workflows
- Zapier / Make
- Your internal systems
- Webhook automations
To trigger a call, use the following API endpoint:
POST https://api.cloudtalk.io/v1/voice-agent/calls
Required Fields:
- VoiceAgent ID - from the agent’s detail page in Dashboard
- Phone number - in E.164 format (e.g., +421903123123)
- Prompt Variables - passed to the VoiceAgent’s system prompt and visible to the AI during the call. Use them to personalize the call (e.g., `{{name}}`, `{{order_id}}`). Only required if you use variables in the VoiceAgent prompt.
Output Variables
- Output Variables are not visible to the AI during the call.
- Instead, they’re included in the Call Analysis Webhook payload.
- Use them to attach context (e.g., `customer_id`, `internal_note`) that can be used for post-call automations or reporting.
Sample API Request:
```json
{
  "call_number": "+421903123123",
  "voice_agent_id": "67eaa636148d700a8dmckaf1",
  "call_properties": {
    "system_prompt": {
      "variables": {
        "name": "John"
      }
    },
    "output": {
      "variables": {
        "customer_id": 12345,
        "internal_note": "lorem"
      }
    }
  }
}
```
Authentication
To trigger a VoiceAgent call, you must authenticate using Basic Auth in the request headers.
Authorization: Basic BASE64(api_key_id:api_key)
Where:
- `api_key_id` is your API Key ID, from CloudTalk > Settings > API Keys > API Key Detail > ID
- `api_key` is your raw API key, from CloudTalk > Settings > API Keys > API Key Detail > Key
Some HTTP clients (like Postman or most SDKs) can handle this encoding for you automatically when using Basic Auth.
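Putting the pieces together, a minimal trigger script might look like the sketch below (Python with the requests library; the credential values are placeholders, and the payload mirrors the sample request above):

```python
import base64

import requests

# Placeholder credentials - copy the real values from
# CloudTalk > Settings > API Keys > API Key Detail.
API_KEY_ID = "your_api_key_id"
API_KEY = "your_api_key"

# Build the Basic Auth header described above: Basic BASE64(api_key_id:api_key)
token = base64.b64encode(f"{API_KEY_ID}:{API_KEY}".encode()).decode()

payload = {
    "call_number": "+421903123123",
    "voice_agent_id": "67eaa636148d700a8dmckaf1",
    "call_properties": {
        "system_prompt": {"variables": {"name": "John"}},
        "output": {"variables": {"customer_id": 12345, "internal_note": "lorem"}},
    },
}

response = requests.post(
    "https://api.cloudtalk.io/v1/voice-agent/calls",
    json=payload,
    headers={"Authorization": f"Basic {token}"},
    timeout=30,
)
response.raise_for_status()
```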