Welcome to this comprehensive guide on creating your first AI phone agent in Kalimna. This tutorial will walk you through every step of the process, from choosing your call direction to testing your fully configured agent.
AI phone agents are transforming how businesses in Qatar, UAE, Saudi Arabia, Kuwait, Bahrain, and Oman handle customer communications. With Kalimna, you can create intelligent agents that handle both Arabic and English conversations naturally, operate 24/7, and integrate seamlessly with your existing systems.
By the end of this guide, you'll have created a fully configured AI phone agent ready to handle inbound or outbound calls for your business. Whether you're automating customer support, appointment booking, lead qualification, or any other phone-based workflow, this guide covers all the essential settings you need to know.
What You'll Learn:
- How to configure inbound vs outbound call directions
- Setting up phone numbers and naming conventions
- Choosing the right engine and language settings for Arabic/English
- Selecting and customizing AI voices for natural conversations
- Configuring advanced LLM settings for optimal performance
- Setting up transcriber, synthesizer, and voicemail detection
- Creating post-call actions and webhooks for data extraction
- Testing and refining your AI agent
Watch the Full Step-by-Step Video Tutorial
A video walkthrough accompanies this guide and covers all the steps below.
💡 Tip: You can watch the video while following the written steps below for the best learning experience.
Step 1 – Choose Inbound or Outbound Calls
The first decision you'll make when creating your AI phone agent is choosing the call direction. This fundamental setting determines how your agent interacts with contacts.
Inbound Calls
Choose Inbound when you want customers to call into your AI agent. This is the most common setup for:
- Customer Support: Answering common questions, troubleshooting issues, providing information
- Appointment Booking: Allowing customers to schedule, reschedule, or cancel appointments
- Information Hotlines: Providing business hours, location details, service information
- Order Status: Letting customers check on their orders or deliveries
- General Inquiries: Handling questions about products, services, or policies
Outbound Calls
Choose Outbound when you want your AI agent to make calls to customers. This is ideal for:
- Lead Qualification: Following up with potential customers, qualifying leads, scheduling demos
- Appointment Reminders: Calling customers to remind them of upcoming appointments
- Follow-up Calls: Checking in after a purchase, service, or support interaction
- Survey Calls: Collecting feedback or conducting customer satisfaction surveys
- Payment Reminders: Notifying customers about upcoming or overdue payments
🌍 Gulf Region Example:
A real estate agency in Dubai might use an inbound agent to answer property inquiries and schedule viewings in both Arabic and English, while also setting up an outbound agent to follow up with leads who filled out forms on their website.
Step 2 – Name Your Assistant and Assign a Phone Number
Choosing a Name
Give your AI assistant a clear, descriptive name that helps you identify its purpose, especially if you plan to create multiple agents for different functions.
Good naming examples:
- "Customer Support - Arabic" or "دعم العملاء"
- "Appointment Booking - Clinic"
- "Lead Qualifier - Real Estate"
- "Order Status - E-commerce"
- "Reception - Dubai Office"
Avoid generic names like "Agent 1" or "Test" if you're creating production agents. Use names that clearly indicate the agent's role, language, or department.
Assigning a Phone Number
Next, you'll assign a phone number to your AI agent. You have two options:
- Purchase a number through Kalimna: The platform allows you to buy phone numbers directly, which simplifies setup and ensures compatibility.
- Bring your own number: If you already have a business number you want to use, you can integrate it with Kalimna.
For Gulf businesses, you can obtain local numbers in Qatar (+974), UAE (+971), Saudi Arabia (+966), Kuwait (+965), Bahrain (+973), and Oman (+968), which helps build trust with local customers.
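As a rough illustration, the dialing codes above can be kept in a small lookup table, for example to label which market a caller ID belongs to in your own reporting. The helper name `country_for_number` is hypothetical and not part of Kalimna; the sketch only assumes numbers arrive in international format with a leading `+`.

```python
from typing import Optional

# Gulf dialing codes as listed in this guide.
GULF_DIALING_CODES = {
    "+974": "Qatar",
    "+971": "UAE",
    "+966": "Saudi Arabia",
    "+965": "Kuwait",
    "+973": "Bahrain",
    "+968": "Oman",
}


def country_for_number(number: str) -> Optional[str]:
    """Return the Gulf country for an international-format number, or None."""
    cleaned = number.replace(" ", "").replace("-", "")
    for code, country in GULF_DIALING_CODES.items():
        if cleaned.startswith(code):
            return country
    return None


print(country_for_number("+974 4000 0000"))
print(country_for_number("+1 555 0100"))
```

Numbers outside the six listed codes return `None`, so they can be routed or labeled separately.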
ℹ️ Note:
Detailed phone number setup and management will be covered in a separate dedicated guide. For now, focus on understanding that each agent needs to be linked to a phone number to function.
Step 3 – Choose the Engine Type
The engine is the core technology that powers your AI agent's conversation capabilities. Kalimna offers multiple engine options, each optimized for different use cases.
Recommended: Pipeline Engine
For most users, especially those just starting out, we recommend the Pipeline Engine. This is Kalimna's most advanced and reliable engine, offering:
- Optimized latency: Fast response times for natural conversations
- Better accuracy: Improved understanding of both Arabic and English
- Stable performance: Consistent results across different call scenarios
- Continuous improvements: Regular updates and enhancements
Unless you have specific technical requirements or have been advised otherwise, start with the Pipeline Engine. You can always test other engines later once you're familiar with the platform.
💡 Pro Tip:
Don't overthink the engine choice at this stage. The Pipeline Engine works excellently for the vast majority of use cases, from simple information hotlines to complex multi-step booking systems.
Step 4 – Set Language and Multilingual Options
One of Kalimna's most powerful features for Gulf businesses is native support for Arabic and English conversations, including the ability to handle both languages in a single call.
Choosing Your Primary Language
Select the main language your AI agent will use when greeting callers and conducting conversations:
- Arabic: Choose this if most of your customers speak Arabic
- English: Choose this if most of your customers speak English
Adding a Secondary Language
For businesses serving diverse customers in the Gulf region, you can enable bilingual support by adding a secondary language. This is particularly useful when:
- Your business serves both Arabic-speaking and English-speaking customers
- You operate in multilingual markets like Dubai, Doha, or Riyadh
- Customers might switch between languages mid-conversation (code-switching)
- You want to accommodate expatriates and local customers equally
To add a secondary language, simply click "Add Secondary Language" and select your preferred option. Your AI agent will then be able to understand and respond in both languages naturally.
🌍 Gulf Region Example:
A healthcare clinic in Qatar might set Arabic as the primary language but add English as a secondary language, ensuring that both Qatari nationals and international residents can communicate comfortably with the AI receptionist. The agent automatically detects which language the caller uses and responds accordingly.
✨ Best Practice:
For maximum flexibility in the Gulf region, we recommend setting up bilingual agents (Arabic + English) for customer-facing functions like support, booking, and inquiries. This ensures no caller is excluded based on language preference.
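How Kalimna detects the caller's language is not documented in this guide. As a toy illustration of the general idea only, the sketch below classifies a piece of transcribed text by counting Arabic-script versus Latin-script characters. The function name `dominant_script` is an assumption made for this example, and a production system would likely rely on the speech recognizer rather than a character count.

```python
def dominant_script(text: str) -> str:
    """Classify text as 'arabic', 'english', or 'unknown' by counting characters."""
    # Characters in the main Arabic Unicode block (U+0600 to U+06FF).
    arabic = sum(1 for ch in text if "\u0600" <= ch <= "\u06FF")
    # ASCII letters stand in for English here.
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    if arabic == 0 and latin == 0:
        return "unknown"
    return "arabic" if arabic >= latin else "english"


print(dominant_script("مرحبا"))
print(dominant_script("Hello, I need an appointment"))
```

A heuristic like this can be handy when reviewing transcripts from bilingual test calls to confirm that replies stayed in the caller's language.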
Step 5 – Choose the Voice
The voice you select for your AI agent significantly impacts how customers perceive your business. A natural, professional voice builds trust and enhances the overall customer experience.
Built-in Voice Library
Kalimna provides a curated selection of high-quality AI voices in multiple languages and styles:
- Arabic voices: Native-sounding voices trained on Gulf dialects
- English voices: Clear, professional voices with various accents
- Male and female options: Choose based on your brand and use case
- Different tones: Professional, friendly, formal, or conversational
Browse through the available voices and test each one to find the best fit for your business personality and target audience.
Custom Voice Cloning
For businesses that want a truly unique identity, Kalimna offers voice cloning. This advanced feature allows you to:
- Clone the voice of a specific person (with their permission)
- Maintain brand consistency across all customer touchpoints
- Create a distinctive, recognizable voice for your business
- Ensure your AI agent sounds exactly the way you want
Voice cloning requires providing audio samples and typically takes a few days to process, but the result is a completely custom voice that represents your brand perfectly.
💡 Voice Selection Tips:
- Match your industry: Professional services might prefer formal tones, while hospitality might choose warmer, friendlier voices
- Consider your audience: A voice that resonates with young tech users might differ from one targeting corporate executives
- Test with real users: Ask colleagues or trusted customers for feedback before finalizing your choice
- Think long-term: Choose a voice you'll be comfortable with for months or years, as changing it later can affect brand recognition
Step 6 – Time Zone, Ambience Sound & Filler Audios
Time Zone Configuration
Set the time zone your AI assistant operates in. This is crucial for:
- Accurate appointment booking and scheduling
- Correct time-based greetings ("Good morning," "Good evening")
- Proper logging and analytics timestamps
- Business hours enforcement if configured
Gulf region time zones:
- UAE, Oman: Gulf Standard Time (GST, UTC+4)
- Qatar, Saudi Arabia, Kuwait, Bahrain: Arabia Standard Time (AST, UTC+3)
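To see why the time zone matters for time-based greetings, here is a minimal sketch that picks a greeting from the local hour. The function name `greeting_for` and the hour boundaries are assumptions chosen for illustration, not Kalimna behavior. Fixed UTC offsets are used because neither zone observes daylight saving time.

```python
from datetime import datetime, timedelta, timezone

GST = timezone(timedelta(hours=4), "GST")  # Gulf Standard Time, e.g. Dubai
AST = timezone(timedelta(hours=3), "AST")  # Arabia Standard Time, e.g. Riyadh


def greeting_for(moment_utc: datetime, tz: timezone) -> str:
    """Pick a greeting from the local hour in the agent's time zone."""
    local_hour = moment_utc.astimezone(tz).hour
    if 5 <= local_hour < 12:
        return "Good morning"
    if 12 <= local_hour < 18:
        return "Good afternoon"
    return "Good evening"


call_time = datetime(2024, 1, 15, 8, 30, tzinfo=timezone.utc)
print(greeting_for(call_time, GST))  # local time is 12:30
print(greeting_for(call_time, AST))  # local time is 11:30
```

The same call lands on either side of noon depending on the configured zone, which is exactly the kind of mismatch a wrong setting produces.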
Ambience Sound
Ambience sounds are subtle background noises that make your AI agent sound more natural and less robotic. Options include:
- Office ambience: Light background office sounds
- None: Clean, silent background (most common choice)
- Custom options: Depending on your use case
Most businesses choose no ambience for clarity, but you can experiment to see what works best for your brand.
Filler Audios
Filler audios are short, pre-recorded responses that the AI can play while processing information or retrieving data. These help create a more natural conversation flow and reduce awkward silences.
Examples of filler phrases:
- "Let me check that for you..."
- "One moment please..."
- "I'm looking that up now..."
- Arabic equivalents: "لحظة من فضلك..." or "دعني أتحقق من ذلك..."
You can upload custom filler audios or use Kalimna's built-in options. These are optional but can significantly improve the perceived responsiveness of your AI agent.
ℹ️ Recommendation:
Start with the default settings for these options. Once your agent is running and you've observed real conversations, you can fine-tune ambience and filler audios based on actual usage patterns and customer feedback.
Step 7 – Advanced Settings: LLM Model & Temperature
Now we're getting into the more technical settings that control how your AI agent thinks and responds. Don't worry—we'll explain everything in plain language.
LLM Model Selection
LLM stands for "Large Language Model"—the AI brain that powers your agent's understanding and responses. Kalimna offers multiple LLM options from different providers (OpenAI, Anthropic, Google, etc.), each with different strengths.
Why the LLM matters:
- Understanding quality: Better models understand complex questions more accurately
- Response quality: More advanced models generate more natural, contextually appropriate responses
- Language support: Some models excel at specific languages (important for Arabic)
- Cost: More powerful models typically cost more per call
For Arabic and English conversations, we recommend starting with the platform's default recommended model, which is already optimized for Gulf region use cases. You can experiment with different models later if needed.
Temperature Setting
Temperature controls how creative or predictable your AI agent is:
- Lower temperature (e.g., 0.2-0.4): More focused, consistent, and predictable responses. The AI will stick closely to facts and provide similar answers for similar questions.
- Higher temperature (e.g., 0.7-1.0): More creative and varied responses. The AI will be more conversational but might occasionally deviate from expected patterns.
Recommended temperature settings by use case:
- 0.2-0.4: Appointment booking, data collection, function calls, factual information
- 0.5-0.7: General customer support, FAQs, balanced conversations
- 0.7-1.0: Sales calls, friendly conversations, empathetic support
✨ Best Practice for Function Calls:
If your AI agent will be making function calls (like booking appointments in a calendar, updating CRM records, or checking inventory), use a lower temperature (0.2-0.4). This ensures consistent, reliable behavior and reduces the chance of unexpected responses that might interfere with automated actions.
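The recommendations above can be summarized as a small lookup. This is only a sketch: the use-case keys and the function `suggested_temperature` are hypothetical names, and taking the midpoint of each range is an arbitrary starting point rather than a Kalimna rule.

```python
# Ranges taken from the recommendations in this guide; the keys are illustrative.
TEMPERATURE_RANGES = {
    "appointment_booking": (0.2, 0.4),
    "customer_support": (0.5, 0.7),
    "sales": (0.7, 1.0),
}


def suggested_temperature(use_case: str, uses_function_calls: bool = False) -> float:
    """Return the midpoint of the recommended range, capped at 0.4 for function calls."""
    low, high = TEMPERATURE_RANGES[use_case]
    midpoint = round((low + high) / 2, 2)
    return min(midpoint, 0.4) if uses_function_calls else midpoint


print(suggested_temperature("customer_support"))
print(suggested_temperature("sales", uses_function_calls=True))
```

The cap reflects the best practice above: an agent that makes function calls stays at or below 0.4 even when its use case would otherwise suggest a higher value.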
Step 8 – Interruption, Duration, Noise Cancelling, Voicemail & Recording
This section covers important call handling settings that affect how your AI agent manages conversations.
User Interruption
This setting controls whether callers can interrupt the AI while it's speaking. You have two options:
- Allow interruptions (Recommended): Callers can jump in and speak while the AI is talking, creating a more natural conversation. The AI will stop and listen when interrupted.
- Disable interruptions: The AI will finish speaking before listening to the caller. This can be useful for delivering important information that shouldn't be cut off.
For most use cases, especially customer support and bookings, allowing interruptions creates a better, more natural experience.
Duration Settings
These settings control call timing:
- Maximum call duration: Longest a call can last before automatically ending (e.g., 30 minutes)
- Idle timeout: How long to wait for user response before considering them unresponsive
- Initial silence timeout: How long to wait for the caller to speak after the agent's greeting
The default values work well for most scenarios, but you can adjust based on your specific needs. For example, technical support calls might need longer maximum durations than quick information lookups.
Noise Cancelling
Recommendation: Keep this OFF. The transcriber (which we'll configure in Step 9) already handles background noise filtering effectively. Running both noise cancellation systems can sometimes cause issues or unnecessary processing. Trust the transcriber to handle noise management.
Voicemail Detection
For outbound calls, this setting detects when a call reaches voicemail instead of a real person. You can configure the agent to:
- Hang up immediately: End the call as soon as voicemail is detected (saves costs)
- Leave a message: Continue speaking and leave a pre-configured voicemail message
For inbound calls, voicemail detection is typically not relevant and should be disabled.
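The voicemail options reduce to a small decision rule, sketched below. The function name and the returned labels are made up for this example and do not correspond to Kalimna setting names.

```python
def voicemail_action(direction: str, voicemail_detected: bool, leave_message: bool) -> str:
    """Decide what the agent does next, following the options described above."""
    # Detection only matters for outbound calls that actually reach voicemail.
    if direction != "outbound" or not voicemail_detected:
        return "continue_call"
    return "leave_message" if leave_message else "hang_up"


print(voicemail_action("outbound", voicemail_detected=True, leave_message=False))
print(voicemail_action("inbound", voicemail_detected=False, leave_message=False))
```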
Call Recording
This option allows you to record all calls for quality assurance, training, or compliance purposes.
⚠️ Important Legal Notice:
Only enable call recording if it complies with local regulations in your region. Many countries and states require consent from all parties before recording calls. In the Gulf region:
- Check your local telecommunications regulations
- Include a recording disclosure in your agent's greeting if required
- Consult with legal counsel to ensure compliance
- Implement proper data protection and storage policies
Step 9 – Synthesizer and Transcriber Settings
These settings control how your AI agent speaks (synthesizer) and listens (transcriber). Getting these right is crucial for natural, responsive conversations.
Synthesizer Settings
The synthesizer converts your AI's text responses into spoken words. You can fine-tune:
Speech Speed
Controls how fast the AI speaks. The value typically ranges from 0.75 (slower) to 1.5 (faster), where 1.0 is normal speed.
Recommended: 1.10 (slightly faster than normal)
- 1.0-1.1: Natural, easy to understand—good for most cases
- 0.8-0.9: Slower, deliberate—good for delivering important information or speaking to elderly customers
- 1.2-1.3: Faster, efficient—good for quick information delivery or tech-savvy audiences
Stability
Controls how consistent the voice sounds. Higher stability means more predictable, uniform speech. Lower stability allows more natural variation but can be less consistent.
Recommended: 0.70
This value provides a good balance between natural-sounding variation and consistent quality. Most users find 0.60-0.80 works well without needing adjustment.
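If you keep agent settings in your own notes or scripts, a simple check against the ranges mentioned above can catch accidental typos. The class `SynthesizerSettings` below is a hypothetical container for this example, not a Kalimna object; the defaults are the values recommended in this guide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SynthesizerSettings:
    speed: float = 1.10      # recommended in this guide
    stability: float = 0.70  # recommended in this guide

    def warnings(self) -> List[str]:
        """Return a note for each value outside the ranges this guide mentions."""
        notes = []
        if not 0.75 <= self.speed <= 1.5:
            notes.append("speed is outside the typical 0.75-1.5 range")
        if not 0.60 <= self.stability <= 0.80:
            notes.append("stability is outside the 0.60-0.80 range most users prefer")
        return notes


print(SynthesizerSettings().warnings())
print(SynthesizerSettings(speed=2.0).warnings())
```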
Transcriber Settings
The transcriber converts the caller's spoken words into text that the AI can understand. This is critical for accurate conversations.
Provider
Choose the transcription service provider. Different providers have different strengths with various languages and accents. The platform default is optimized for Arabic and English, so start there unless you have specific needs.
Endpoint Type
This setting determines how the transcriber knows when the caller has finished speaking. Options include:
- Voice Activity Detection (VAD) - Recommended: Automatically detects when the caller has stopped speaking by analyzing audio patterns. This is the most natural option for most conversations.
- Other options: Alternative detection methods that may be useful for specific edge cases, but VAD works best for general use.
Endpoint Sensitivity
This critical setting controls how long the AI waits after the user stops speaking before assuming they're done and responding. The value typically ranges from 0.0 (respond immediately) to 2.0+ (wait longer).
Recommended: 0.70
Finding the right balance:
- Too low (e.g., 0.3): The AI responds very quickly, which feels snappy but might cut off callers who pause briefly between sentences. Can feel interruptive.
- Too high (e.g., 1.5): The AI waits a long time, ensuring callers finish completely, but creates awkward long silences that make the conversation feel sluggish and unnatural.
- Just right (0.7): Provides enough time for natural pauses without creating uncomfortable silences. The caller feels heard, and the AI responds promptly.
💡 Pro Tip: Test and Adjust
The "perfect" endpoint sensitivity can vary based on your use case and caller demographics:
- Quick transactions: Try 0.5-0.6 for faster-paced conversations
- Complex discussions: Try 0.8-1.0 when callers need more time to think
- Elderly/deliberate speakers: Try 1.0-1.2 to accommodate slower speech patterns
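The trade-off can be made concrete with a small simulation. This sketch assumes the sensitivity value behaves like a silence threshold measured in seconds, which this guide does not confirm, and the function `agent_turns` is a hypothetical name. It reports which caller pauses would be long enough for the agent to start responding.

```python
from typing import List


def agent_turns(pauses: List[float], sensitivity: float) -> List[int]:
    """Return indices of caller pauses long enough for the agent to respond.

    `pauses` holds the silence (assumed to be in seconds) after each caller phrase.
    """
    return [i for i, pause in enumerate(pauses) if pause >= sensitivity]


# A caller who pauses briefly twice mid-sentence, then stops talking.
pauses = [0.4, 0.6, 1.5]
for value in (0.3, 0.7, 1.5):
    print(value, agent_turns(pauses, value))
```

At 0.3 the agent would cut in after the first brief pause. At 0.7 it responds only once the caller has finished. At 1.5 it also waits for the end, but under this assumption the caller sits through a full 1.5 seconds of silence first.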
Step 10 – Post-Call Actions, Variables & Webhooks
One of Kalimna's most powerful features is the ability to extract data from conversations and send it to your other systems. This is what transforms your AI agent from just a voice interface into a fully integrated business tool.
Understanding Variables
Variables are pieces of information your AI agent extracts from conversations and stores for later use. Think of them as data fields that get filled in during the call.
Default variables: Every agent has two built-in variables:
- Call Summary: Automatic summary of what happened in the call
- Call Transcript: Full text transcript of the conversation
Creating Custom Variables
You can create your own variables to capture specific information relevant to your business. To add a variable:
- Click "Add Variable"
- Name your variable (e.g., "customer_email", "appointment_date", "lead_score")
- Choose the type:
  - String: Text data (names, emails, addresses, comments)
  - Number: Numeric data (quantities, ages, IDs, scores)
  - Boolean: True/false values (interested?, existing_customer?, needs_callback?)
- Describe the variable: Write a clear description so the AI knows what to extract. For example: "Extract the customer's email address if they provide one during the conversation"
Example Variables for Common Use Cases:
Appointment Booking:
- customer_name (String): "The caller's full name"
- phone_number (String): "The caller's phone number for confirmation"
- appointment_date (String): "The preferred appointment date"
- appointment_time (String): "The preferred appointment time"
- service_type (String): "The type of service requested"
Lead Qualification:
- lead_email (String): "The prospect's email address"
- company_name (String): "The company the lead represents"
- budget_range (String): "The prospect's budget range"
- qualified (Boolean): "Whether the lead meets qualification criteria"
- next_action (String): "What follow-up action is needed"
Customer Support:
- issue_category (String): "The type of issue (billing, technical, etc.)"
- ticket_number (String): "Existing support ticket number if mentioned"
- resolved (Boolean): "Whether the issue was resolved"
- urgency_level (String): "How urgent the issue is (low, medium, high)"
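Before entering variables in the dashboard, it can help to draft them as data so names, types, and descriptions are reviewed together. The sketch below is one way to do that; the `Variable` class is a hypothetical planning aid, not a Kalimna API, and it only enforces the three types named in this guide.

```python
from dataclasses import asdict, dataclass
from typing import List

ALLOWED_TYPES = {"String", "Number", "Boolean"}


@dataclass
class Variable:
    name: str
    type: str
    description: str

    def __post_init__(self) -> None:
        # Reject anything other than the three types described above.
        if self.type not in ALLOWED_TYPES:
            raise ValueError(f"Unsupported type: {self.type}")


lead_variables: List[Variable] = [
    Variable("lead_email", "String", "The prospect's email address"),
    Variable("qualified", "Boolean", "Whether the lead meets qualification criteria"),
]
print([asdict(v) for v in lead_variables])
```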
Setting Up Webhooks
Once you've defined variables, you need somewhere to send that data. That's where webhooks come in.
A webhook is simply a URL endpoint where Kalimna sends the extracted data after each call ends. Your other systems (CRM, database, scheduling software, etc.) receive this data and can act on it automatically.
Common webhook destinations:
- CRM systems: Salesforce, HubSpot, Zoho, Microsoft Dynamics
- Scheduling tools: Calendly, Acuity, custom booking systems
- Databases: Your own database or data warehouse
- Automation platforms: Zapier, Make (formerly Integromat), n8n
- Communication tools: Slack, email, SMS notification systems
To create a webhook, simply provide the URL endpoint and Kalimna will POST the extracted variables in JSON format after each call.
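For readers who want to see what the receiving side might look like, here is a minimal sketch of a webhook endpoint using only the Python standard library. The exact payload Kalimna sends is not specified in this guide, so the code assumes nothing beyond a JSON object in the POST body; the names `parse_call_payload`, `CallWebhookHandler`, and `serve` are all chosen for this example. It omits authentication and HTTPS, which a real deployment would need.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_call_payload(body: bytes) -> dict:
    """Decode the JSON body and return it as a dict; raise ValueError otherwise."""
    data = json.loads(body.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    return data


class CallWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = parse_call_payload(self.rfile.read(length))
        except ValueError:
            # Malformed JSON or a non-object body gets a 400 response.
            self.send_response(400)
            self.end_headers()
            return
        print("Received variables:", payload)  # replace with a CRM or database write
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted (call this from your own entry point)."""
    HTTPServer(("", port), CallWebhookHandler).serve_forever()
```

Keeping the parsing in its own function makes it easy to test with sample payloads before pointing a live agent at the endpoint.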
ℹ️ Note:
Post-call actions, variables, and webhooks are advanced features that significantly extend Kalimna's capabilities. A separate, detailed guide dedicated to this topic will be available soon, covering webhook security, error handling, data transformation, and integration examples with popular platforms.
Step 11 – Testing Your AI Phone Agent
Once you've configured all the settings and clicked "Create Assistant" in the top right, your AI agent is ready—but your work isn't done yet. Testing is crucial to ensure everything works as expected before going live with real customers.
Method 1: Speak to Assistant (Recommended)
The "Speak to Assistant" feature lets you hold real voice conversations with your AI agent. You have three testing options:
1. Have It Call You
Enter your phone number and the agent will call you. This is the most realistic test since it simulates the actual phone experience, including call quality, latency, and audio clarity.
2. Talk Via Web Call
Test directly from your browser using your computer's microphone and speakers. Quick and convenient for rapid iteration, but audio quality might differ from actual phone calls.
3. Send Public Demo Link
Generate a shareable link that anyone can use to test your agent. Perfect for getting feedback from colleagues, team members, or stakeholders without giving them platform access.
Method 2: Test Agent (Chat-Based)
If you can't make voice calls at the moment, use the "Test Agent" feature. This lets you chat with the agent via text to see how it responds to different questions and scenarios.
While text testing is useful for quick checks, keep in mind that voice and text interactions can differ. Always do final testing with actual voice calls before going live.
What to Test
Don't just test once and call it done. Run through multiple scenarios to ensure your agent handles different situations:
✓ Test Happy Path Scenarios
Test the ideal, straightforward conversations where everything goes right:
- Customer clearly states their request
- All required information is provided smoothly
- Transaction or booking completes successfully
✓ Test Edge Cases & Problems
Try scenarios where things don't go perfectly:
- Unclear or rambling customer requests
- Customers who interrupt frequently
- Questions outside the agent's knowledge
- Requests to speak with a human
- Background noise or poor call quality
✓ Test Both Languages (If Bilingual)
If you enabled Arabic and English:
- Test complete conversations in Arabic
- Test complete conversations in English
- Test code-switching (starting in one language, switching to another mid-call)
- Check that responses maintain the correct language
✓ Test Different Accents & Dialects
If possible, have people with different accents test:
- Different Gulf Arabic dialects (Qatari, Emirati, Saudi, Kuwaiti, etc.)
- Non-native English speakers
- Different age groups (younger vs. older speakers)
✓ Test Data Extraction
If you configured variables and webhooks:
- Verify that all variables are captured correctly
- Check that data types are correct (strings, numbers, booleans)
- Confirm webhooks fire successfully
- Validate that data arrives in your destination system properly formatted
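The type check in this list can be automated against a captured payload. The following is a sketch under the assumption that extracted variables arrive as a flat JSON object keyed by variable name, which this guide does not confirm; `matches_type` and `payload_problems` are hypothetical helper names.

```python
from typing import Any, Dict, List


def matches_type(value: Any, kind: str) -> bool:
    """Check a JSON-decoded value against a variable type name from this guide."""
    if kind == "Boolean":
        return isinstance(value, bool)
    if kind == "Number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if kind == "String":
        return isinstance(value, str)
    raise ValueError(f"Unknown variable type: {kind}")


def payload_problems(payload: Dict[str, Any], expected: Dict[str, str]) -> List[str]:
    """Return a message for each variable that is missing or has the wrong type."""
    problems = []
    for name, kind in expected.items():
        if name not in payload:
            problems.append(f"{name}: missing")
        elif not matches_type(payload[name], kind):
            problems.append(f"{name}: expected {kind}")
    return problems


expected = {"customer_name": "String", "lead_score": "Number", "qualified": "Boolean"}
sample = {"customer_name": "Test Caller", "lead_score": "8", "qualified": True}
print(payload_problems(sample, expected))
```

In the sample, `lead_score` arrives as the text `"8"` rather than a number, which is the kind of mismatch worth catching before the data reaches a CRM.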
✨ Best Practice: Iterative Testing
Creating a perfect AI agent is an iterative process. After initial testing:
1. Test and identify issues (response quality, timing, misunderstandings)
2. Adjust settings (prompts, temperature, endpoint sensitivity, etc.)
3. Test again to see if improvements worked
4. Repeat until you're satisfied with performance
5. Monitor live calls and continue refining based on real usage
What's Next?
Congratulations! You've now created and configured your first AI phone agent in Kalimna. But this is just the beginning. Your agent can become much more powerful as you explore advanced features and optimizations.
Recommended Next Steps
📚 Knowledge Base Setup
Upload documents, FAQs, and company information to make your agent more knowledgeable about your specific business, products, and services.
🔧 Mid-Call Tools
Enable your agent to perform actions during calls by connecting to external APIs, CRMs, calendars, and your own systems.
📞 Use Your Own Number
Connect your existing business phone number to Kalimna through outbound verification or SIP trunk integration.
📊 Call History & Analytics
Monitor all calls, review recordings and transcripts, track performance metrics, and continuously improve your AI agents.
🚫 Blacklist & Spam Protection
Block spam callers, telemarketing bots, and abusive numbers to keep your AI call center focused on real customers.
🎙️ Voice Cloning
Create a custom, unique voice for your AI agent that represents your brand perfectly and stands out from generic AI voices.
Guide coming soon
Ready to Go Live?
Whether you're in Doha, Dubai, Riyadh, Kuwait City, Manama, or Muscat, you now have the foundation for your AI call center in Arabic and English. Start small, test thoroughly, and scale as you gain confidence.