How to convert Text to Speech
using C# and Google

Download: google-tts.zip

This example demonstrates how to implement text-to-speech in C#, converting text messages to human speech. The conversion is based on the Google text-to-speech engine. The generated audio stream can, for example, be played into a SIP call or stored as an MP3 file; how you use it is up to you.

To fully understand this example, you may want to study the "How to register to a SIP PBX" and "How to ring a SIP extension" chapters first. To try this example, you need Ozeki VoIP SIP SDK installed and a reference to OzekiSDK.dll added to your Visual Studio project.

Figure 1 - Text to Speech conversion

What is Text-to-Speech used for?

A text-to-speech (TTS) system generates human-sounding speech from text written in any of the languages supported by Google. Ozeki VoIP SIP SDK can read such text messages aloud automatically.

Text-to-speech refers to the ability to read a written text message aloud and forward the resulting audio stream to a connection or save it to a file. This TTS engine supports different languages and specialized vocabularies. The GoogleTTS class in Ozeki VoIP SIP SDK manages the text-to-speech function.

You can listen to the computer-generated speech on your headset or speaker. The speech is represented digitally using pulse-code modulation (PCM), an uncompressed audio format.
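To make "uncompressed" concrete, here is a minimal sketch, independent of the Ozeki SDK, that builds PCM samples by hand and wraps them in a WAV header. The names PcmSketch, GenerateSine and ToWav are made up for this illustration, and the format used (16-bit, mono) is an assumption; it is not necessarily the format GoogleTTS produces.

```csharp
using System;
using System.IO;
using System.Text;

static class PcmSketch
{
    // Generates a sine tone as 16-bit signed mono PCM samples
    // at half of the maximum amplitude.
    public static short[] GenerateSine(double frequencyHz, int sampleRate, double seconds)
    {
        int count = (int)(sampleRate * seconds);
        var samples = new short[count];
        for (int i = 0; i < count; i++)
        {
            double t = (double)i / sampleRate;
            samples[i] = (short)(Math.Sin(2 * Math.PI * frequencyHz * t) * short.MaxValue * 0.5);
        }
        return samples;
    }

    // Wraps the raw samples in a minimal 44-byte WAV (RIFF) header.
    // BinaryWriter always writes little-endian, which is what WAV expects.
    public static byte[] ToWav(short[] samples, int sampleRate)
    {
        const short channels = 1;
        const short bitsPerSample = 16;
        int dataSize = samples.Length * 2;

        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream, Encoding.ASCII))
        {
            writer.Write(Encoding.ASCII.GetBytes("RIFF"));
            writer.Write(36 + dataSize);                              // remaining file size
            writer.Write(Encoding.ASCII.GetBytes("WAVE"));
            writer.Write(Encoding.ASCII.GetBytes("fmt "));
            writer.Write(16);                                         // fmt chunk size
            writer.Write((short)1);                                   // format 1 = PCM
            writer.Write(channels);
            writer.Write(sampleRate);
            writer.Write(sampleRate * channels * bitsPerSample / 8);  // byte rate
            writer.Write((short)(channels * bitsPerSample / 8));      // block align
            writer.Write(bitsPerSample);
            writer.Write(Encoding.ASCII.GetBytes("data"));
            writer.Write(dataSize);
            foreach (short s in samples)
                writer.Write(s);                                      // samples stored as-is
            writer.Flush();
            return stream.ToArray();
        }
    }
}
```

Calling PcmSketch.ToWav(PcmSketch.GenerateSine(440, 8000, 1.0), 8000) yields one second of audio. Every sample occupies exactly two bytes in the output, so the file size grows linearly with the duration; that is what "uncompressed" means in practice.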

How to implement Google text-to-speech in C#?

Everything you need is included in the Ozeki VoIP SIP SDK.

Ozeki VoIP SIP SDK contains the C# GoogleTTS class, whose instances convert text in a wide variety of languages to speech. The language is passed as a parameter when the instance is created (line 78 in the example below). The instance is then connected to the call's media sender through the MediaConnector (line 81), so that the speech generated by the GoogleTTS class is sent into the call. The example below converts a United Kingdom English text message into speech by reading a simple string passed as a parameter (lines 83 to 85).

Rings a VoIP phone and, once the call is answered,
reads a text-to-speech message (C# example)

using Ozeki.Media;
using Ozeki.VoIP;
using System;

namespace Google_Text_To_Speech
{
    class Program
    {
        static ISoftPhone _softphone;   // softphone object
        static IPhoneLine _phoneLine;   // phoneline object
        static IPhoneCall _call;
        static MediaConnector _connector;

        static GoogleTTS googleAPI;
        static PhoneCallAudioSender _mediaSender;

        static void Main(string[] args)
        {
            _softphone = SoftPhoneFactory.CreateSoftPhone(5000, 10000);

            // SIP account registration data
            // supplied by your VoIP service provider
            var registrationRequired = true;
            var userName = "112";
            var displayName = "112";
            var authenticationId = "112";
            var registerPassword = "112";
            var domainHost = "192.168.115.60";
            var domainPort = 5060;

            var account = new SIPAccount(registrationRequired, displayName,
                userName, authenticationId, registerPassword, domainHost, domainPort);
            
            _mediaSender = new PhoneCallAudioSender();
            _connector = new MediaConnector();
        
            RegisterAccount(account);

            Console.ReadLine();
        }

        static void RegisterAccount(SIPAccount account)
        {
            try
            {
                _phoneLine = _softphone.CreatePhoneLine(account);
                _phoneLine.RegistrationStateChanged += line_RegStateChanged;
                _softphone.RegisterPhoneLine(_phoneLine);
            }
            catch (Exception ex)
            {
                Console.WriteLine("Error during SIP registration: " + ex);
            }
        }

        static void line_RegStateChanged(object sender, RegistrationStateChangedArgs e)
        {
            if (e.State == RegState.NotRegistered || e.State == RegState.Error)
                Console.WriteLine("Registration failed!");

            if (e.State == RegState.RegistrationSucceeded)
            {
                Console.WriteLine("Registration succeeded - Online!");
                CreateCall();
            }
        }

        private static void CreateCall()
        {
            var numberToDial = "100";
            _call = _softphone.CreateCallObject(_phoneLine, numberToDial);
            _call.CallStateChanged += call_CallStateChanged;
            _call.Start();
        }
        
        static void SetupTextToSpeech()
        {
            googleAPI = new GoogleTTS(GoogleLanguage.English_United_Kingdom);
            
            _mediaSender.AttachToCall(_call);
            _connector.Connect(googleAPI, _mediaSender);

            googleAPI.AddAndStartText("Test message " +
                                      "from Google text " +
                                      "to speech API through VoIP call.");
        }

        static void call_CallStateChanged(object sender, CallStateChangedArgs e)
        {
            Console.WriteLine("Call state: {0}.", e.State);

            if (e.State == CallState.Answered)
                SetupTextToSpeech();
        }
    }
}
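
Text that comes from a file or from user input often carries line breaks, tabs and repeated spaces. How the speech engine treats such characters is not covered here, so one cautious option is to clean the text before handing it over. The helper below is a hypothetical sketch, not part of the Ozeki SDK; the names SpeechText and Normalize are made up for this illustration.

```csharp
using System.Text.RegularExpressions;

static class SpeechText
{
    // Collapses every run of whitespace (spaces, tabs, line breaks)
    // into a single space and trims both ends. A null input yields "".
    public static string Normalize(string text)
    {
        return Regex.Replace(text ?? string.Empty, @"\s+", " ").Trim();
    }
}
```

With this helper in the project, the call in SetupTextToSpeech could be written as googleAPI.AddAndStartText(SpeechText.Normalize(message)), where message is whatever string your application wants to have read.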
