CloudTalk VoiceAgents allow you to create AI-powered virtual agents that can make outbound calls, guide conversations, and extract insights - all within your existing CloudTalk account.
This article explains how to enable, set up, and launch your first VoiceAgent directly from the CloudTalk Dashboard.
Enabling the VoiceAgent Feature
Prerequisites
VoiceAgents are not enabled by default.
If the VoiceAgents section is not visible in your CloudTalk Dashboard, contact CloudTalk Support to activate the feature for your account.
Once activated, the VoiceAgents section will appear in the main menu of your Dashboard.
Important: VoiceAgent calls are charged per minute.
Charges are automatically deducted from your account balance or applied to your monthly invoice, depending on your billing setup.
To manage usage, a maximum call duration can be configured for each agent.
Creating & Configuring a VoiceAgent
Step 1: Go to Dashboard > VoiceAgents > Agents
From the main Dashboard menu, open the VoiceAgents tab and select Agents.
Here, you can view existing agents or create a new one.
Click the ➕ icon to start creating a new agent.
Step 2: Set General Info
After selecting or creating a new agent, fill in the basics:
| Setting | Description |
| --- | --- |
| Agent Name | The internal name of the VoiceAgent. This name will appear in Call Logs, Analytics, and any Integrations (e.g., in HubSpot as the Agent Name). Choose a name that clearly identifies the agent's purpose or target audience (e.g., "Demo Booker", "Onboarding Agent"). |
| Language | Currently, only English is supported. |
| Call Direction | You can configure your VoiceAgent to handle either inbound or outbound calls, depending on your use case. Inbound VoiceAgents are ideal for answering incoming calls, providing information, or qualifying callers before transferring them to a human agent. Outbound VoiceAgents are great for reaching out to leads, following up on tasks, or booking meetings. When setting up an inbound VoiceAgent, you can also define a maximum call duration to control how long the AI stays on the call. If the limit is reached, the call ends automatically, even if the conversation is still ongoing. |
| Outbound Number | Choose a specific number or set it to Automatic. If set to Automatic, the system will use a number that exactly matches the destination country. For example, if calling a German number and a German number exists in your account, it will be used. The system does not pick the "closest" alternative - only exact country matches are supported. If no match is found, the Failover Number is used. |
| Failover Number | Backup number used when no number matching the destination country is available. |
| Maximum Call Duration | A hard limit (in minutes) to control costs. When the limit is reached, the call is immediately terminated - even mid-sentence. The VoiceAgent will not attempt to wrap up or finish its task. We recommend setting a buffer above the expected conversation length to avoid abrupt call endings. |
Step 3: Define Conversation Behavior
Set how your agent greets, responds, and handles the call. You can define tone, script, and dynamic responses using AI prompts.
Want help writing strong prompts? See our Prompt Writing Guide.
Call Analysis Prompt
Call Analysis Prompts are executed after the call ends, using the full transcript of the conversation. We use OpenAI (ChatGPT) to analyze the transcript and return structured data in a predefined JSON format.
What It’s For
Use this prompt to extract key business insights from the call - such as customer interest, challenges, CRM tools, go-live timeline, and demo readiness. These outputs are commonly used to populate CRM records, trigger follow-up actions, or build dashboards.
How It Works
When the call ends, the transcript is sent to ChatGPT along with your Call Analysis Prompt.
The AI is expected to return a JSON object containing exactly the fields you define.
The returned JSON will be delivered to your Call Results Endpoint (configured in the VoiceAgent settings).
Defining What to Extract
You must clearly define:
Which fields you want in the output (e.g. `crm`, `wants_demo`, `preferred_date`)
Field types:
`boolean`: true or false (e.g., wants_demo: true)
`string`: text (e.g., crm: "HubSpot")
`number`: numerical values (e.g., total_team_size: 5)
`array`: a list (e.g., team_sizes: [2, 3, 4])
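Putting the field types above together, a well-formed analysis result could look like the sketch below. The field names are just the examples from this section, not required names:

```python
import json

# Hypothetical analysis output combining the example fields above
raw = '''{
  "crm": "HubSpot",
  "wants_demo": true,
  "total_team_size": 9,
  "team_sizes": [2, 3, 4]
}'''

result = json.loads(raw)
assert isinstance(result["crm"], str)              # string
assert isinstance(result["wants_demo"], bool)      # boolean
assert isinstance(result["total_team_size"], int)  # number
assert isinstance(result["team_sizes"], list)      # array
```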
Extraction Logic Tips
For simple fields like `crm`, the AI can usually guess correctly.
For more complex ones, add logic to your prompt so the AI knows what to do.
Example logic: "Total Team Size" is a sum of all team sizes mentioned in the call.
Only information mentioned during the call will be extracted. Add clear logic in your prompt if needed.
Example Call Analysis Prompt
This is a basic example used to extract a single field (`name`) from the call transcript. It follows all required formatting and response rules for use with CloudTalk's Call Analysis system.
You are a call-analyzing assistant that analyzes sales call transcripts. Your task is to extract key information from the conversation and return it in a specific JSON format.
Only include information that was explicitly mentioned in the conversation.
You must respond with a valid JSON object containing these fields: { "name": string }
Use empty strings for unknown string fields, 0 for unknown numbers, and false for unknown booleans.
Your response must be only the JSON object, without any additional text, markdown formatting, or code blocks.
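Before acting on a model reply, it can be worth verifying that it actually obeys these rules (bare JSON, exactly the defined fields, correct types). A minimal validator sketch, using the `name` field from the example prompt above:

```python
import json

EXPECTED_FIELDS = {"name": str}  # field list from the example prompt above

def validate_reply(raw: str) -> dict:
    """Raise if the reply is not a bare JSON object with exactly the expected fields."""
    data = json.loads(raw)  # fails on markdown fences or surrounding text
    if set(data) != set(EXPECTED_FIELDS):
        raise ValueError(f"unexpected fields: {sorted(data)}")
    for field, typ in EXPECTED_FIELDS.items():
        if not isinstance(data[field], typ):
            raise TypeError(f"{field} should be {typ.__name__}")
    return data
```

For example, `validate_reply('{"name": "John"}')` passes, while a reply wrapped in a code block or containing extra fields raises an error.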
Call Results Endpoint
This is where JSON results from the Call Analysis Prompt will be sent.
Recommended tools:
Webhook.site (for testing)
Zapier / Make / Tray.io
Your internal systems
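If you point the endpoint at your own system, a minimal receiver only needs to accept a JSON POST and read out the fields you defined. A sketch using Python's standard library - note that the exact payload shape (whether your fields arrive at the top level or wrapped) is an assumption here; inspect a real delivery with Webhook.site first:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def extract_fields(raw: bytes) -> dict:
    """Pull the fields defined in the Call Analysis Prompt out of the webhook body."""
    data = json.loads(raw)
    return {"name": data.get("name", "")}  # "name" is the example field from above

class CallResultsHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        fields = extract_fields(self.rfile.read(length))
        print("Call analysis received:", fields)
        self.send_response(204)  # acknowledge receipt, no response body
        self.end_headers()

# To try it locally:
# HTTPServer(("", 8080), CallResultsHandler).serve_forever()
```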
Step 4: Configure Voice & Model Settings
| Setting | Description |
| --- | --- |
| Provider | Deepgram or ElevenLabs |
| Language Model | OpenAI or Anthropic. Choose based on use case: fast/simple replies → GPT-4o Mini; deep, conversational → GPT-4o / Claude Sonnet; structured & concise → Claude Haiku. |
| Voice | Pick your VoiceAgent's voice. You can preview options directly in the dashboard. Some voices include gender, accent, and tone labels (e.g., Jessica - young, American, female). |
| Temperature (ElevenLabs only) | Controls response style. Low (0-0.3) = more factual and robotic; high (0.6-1) = more expressive and varied. Useful for adjusting the "personality" of the voice. |
| Streaming Latency (ElevenLabs only) | Controls response speed. Lower latency = faster response time, but may slightly reduce naturalness; higher latency = smoother output, but with more delay. |
| Stability (ElevenLabs only) | Higher = more consistent tone; good for professional agents. Lower = more dynamic/emotional; great for sales or casual calls. |
| Similarity (ElevenLabs only) | Controls how closely the voice mimics its reference. Higher = clearer, more exact match, but risks sounding synthetic. Lower = warmer and more fluid, but slightly less consistent. |
Step 5: Set Up Call Transfers (Optional)
If your VoiceAgent might need to hand the conversation over to a real person, you can enable Call Transfers.
Once turned on, you’ll be able to decide who should take over the call, either a specific agent or a group, and what kind of context they receive beforehand. This is helpful when the VoiceAgent reaches the end of its purpose or encounters a scenario that requires human support.
Choosing the Transfer Target
You can choose between two transfer types:
Agent – Pick an individual agent from your CloudTalk team.
Group – Route calls to a ring group (like Support, Sales, or Returns) for broader handling.
Once you select the destination, you’ll see a field labeled Transfer Prompt. This is not something the VoiceAgent says out loud. It is an instruction that tells the AI when to transfer the call based on what the caller says.
Example for General Transfer Instructions
## Transfer Instructions
When a user expresses ANY desire to:
- Speak with a human
- Talk to someone else
- Be transferred
- Connect with a real person
- Any similar request
You can also tailor these prompts to fit your business workflows:
## Transfer Instructions
Only transfer the call when the customer mentions they are actively looking for a new solution and have at least 10 employees. Otherwise, direct them to the trial form on our website - acme.com/trial.
Triggering Outbound VoiceAgent Calls via API
Rather than starting calls manually, outbound VoiceAgent calls are initiated using the CloudTalk API - often as part of a workflow automation using:
HubSpot Workflows
Zapier / Make
Your internal systems
Webhook automations
To trigger a call, use the following API endpoint:
POST https://api.cloudtalk.io/v1/voice-agent/calls
Required Fields:
VoiceAgent ID - from the agent’s detail page in Dashboard
Phone number in E.164 format
Prompt Variables - These are passed to the VoiceAgent's system prompt and are visible to the AI during the call. Use them to personalize the call (e.g., `{{name}}`, `{{order_id}}`). Only required if you use variables in the VoiceAgent prompt.
Output Variables
Output Variables are not visible to the AI during the call.
Instead, they’re included in the Call Analysis Webhook payload.
Use them to attach context (e.g., `customer_id`, `internal_note`) that can be used for post-call automations or reporting.
Sample API Request:
{
"call_number": "+421903123123",
"voice_agent_id": "67eaa636148d700a8dmckaf1",
"call_properties": {
"system_prompt": {
"variables": {
"name": "John"
}
},
"output": {
"variables": {
"customer_id": 12345,
"internal_note": "lorem"
}
}
}
}
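The request above can be sent with a few lines of Python's standard library. This is a minimal sketch using the endpoint and sample payload from this article; the Authorization header is covered in the Authentication section, so the actual send is left commented out:

```python
import json
import urllib.request

API_URL = "https://api.cloudtalk.io/v1/voice-agent/calls"

# Sample payload from the article (values are placeholders)
payload = {
    "call_number": "+421903123123",
    "voice_agent_id": "67eaa636148d700a8dmckaf1",
    "call_properties": {
        "system_prompt": {"variables": {"name": "John"}},
        "output": {"variables": {"customer_id": 12345, "internal_note": "lorem"}},
    },
}

req = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},  # add the Authorization header (see Authentication)
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment once the Authorization header is set
```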
Authentication
To trigger a VoiceAgent call, you must authenticate using Basic Auth in the request headers.
Authorization: Basic BASE64(api_key_id:api_key)
Where:
`api_key_id` is your API Key ID from CloudTalk > Settings > API Keys > API Key Detail > ID
`api_key` is your raw API key from CloudTalk > Settings > API Keys > API Key Detail > Key
Some HTTP clients (like Postman or most SDKs) can handle this encoding for you automatically when using Basic Auth.
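If your client does not build the header for you, the encoding is straightforward. A sketch (the key values shown are placeholders, not real credentials):

```python
import base64

def basic_auth_header(api_key_id: str, api_key: str) -> str:
    # Basic Auth: base64-encode "api_key_id:api_key" and prefix with "Basic "
    token = base64.b64encode(f"{api_key_id}:{api_key}".encode()).decode()
    return f"Basic {token}"

# e.g. basic_auth_header("my_id", "my_key") -> "Basic bXlfaWQ6bXlfa2V5"
```

Set the result as the `Authorization` header on every request to the VoiceAgent calls endpoint.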