This article is a detailed guide to text to speech technology
in connection with Ozeki VoIP SIP SDK. After reading this page you
will be familiar with the essential terms concerning text to speech solutions
and with what you need to create your own solution using Ozeki VoIP SIP SDK.
The essential part of VoIP, as of any other communication, is voice, but there are cases when
voice communication is not possible or is limited. In these cases a useful technology is available:
text to speech conversion.
Figure 1 - Text to speech conversion
Text to speech conversion means that a program reads out the text you have typed in.
This can be useful when, for example, a person who cannot speak wants to take part in voice calls.
Text to speech conversion can also be used in interactive voice response (IVR) systems
if you want the computer to read out the IVR tree navigation information.
The code for this guide relies on Ozeki VoIP SIP SDK, so
you need to download and install the SDK on your computer before you start using
the code. You will also need Visual Studio 2010 or a compatible IDE and the .NET Framework,
as the code is written in C#.
How to make your program read out text
You can easily write the code for supporting text to speech conversion by using Ozeki VoIP SIP SDK.
The following example will show you how easy it is.
Figure 2 shows the GUI of the sample program, extended with a new group box
for the text to speech functionality. Playing the speech on the speaker is the easiest way
to make your program read out a text. You can, of course, play the speech into a call or
record it into a .wav audio file, but in this case the program simply plays it on the speaker.
Figure 2 - A softphone GUI with text to speech support
The TextToSpeech object is basically a MediaHandler that uses the Microsoft Speech engine
to perform the text to speech conversion. First you need to declare this
object, and then you can treat it like any other media handler.
You need to create a new instance of the TextToSpeech object and subscribe to its Stopped event.
Then you need to start the speaker and connect the TextToSpeech object to it.
The TextToSpeech object works as follows: you add one or more texts to it
and start the streaming (Code 1). It then reads out the texts in the order you added
them.
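The steps described above can be sketched as a small console program. This is a minimal sketch under stated assumptions, not the original Code 1 listing: the namespace `Ozeki.Media.MediaHandlers`, the calls `Speaker.GetDefaultDevice()`, `MediaConnector.Connect`, `AddText` and `StartStreaming`, and the two-parameter shape of the `Stopped` event are assumptions that follow the description in this article. They may differ between SDK versions, so check them against the class reference of the version you installed.

```csharp
using System;
using Ozeki.Media.MediaHandlers; // assumed namespace; it may differ in other SDK versions

class TextToSpeechDemo
{
    static void Main()
    {
        // Assumed factory method that returns the default playback device.
        Speaker speaker = Speaker.GetDefaultDevice();
        MediaConnector connector = new MediaConnector();

        // Create the reader and get notified when it stops.
        // The lambda assumes a (sender, args) style event.
        TextToSpeech textToSpeech = new TextToSpeech();
        textToSpeech.Stopped += (sender, e) => Console.WriteLine("Reading stopped.");

        // Start the speaker first, then route the speech to it.
        speaker.Start();
        connector.Connect(textToSpeech, speaker);

        // Texts are queued, then read out in the order they were added.
        textToSpeech.AddText("This sentence is read first.");
        textToSpeech.AddText("This sentence is read second.");
        textToSpeech.StartStreaming();

        Console.WriteLine("Press Enter to exit.");
        Console.ReadLine();
    }
}
```

In the GUI sample the same calls would sit in the Start button's Click handler, with the text taken from the text box instead of the two fixed strings.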
This simple example reads out the text you type into the text box on the GUI.
When you stop playing, the TextToSpeech object is disposed, and when
you press Start again the reader starts the text to speech conversion from the beginning.
Code 2 shows the event handler for the Stop button's Click event. You need to stop the speaker
and the streaming, disconnect the handlers, then dispose of the TextToSpeech object and release it.
Code 2 - The event handler for stopping the stream
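As an illustration of the teardown order described above, a Stop handler could look roughly like the following. This is a hedged sketch, not the original listing: the class name `SpeechPlayer` and the handler name `StopButton_Click` are hypothetical, and `Speaker.Stop`, `StopStreaming`, `MediaConnector.Disconnect` and `Dispose` are assumed to exist with these signatures in your SDK version.

```csharp
using System;
using Ozeki.Media.MediaHandlers; // assumed namespace; it may differ in other SDK versions

// Hypothetical holder class for the media handlers used by the sample GUI.
class SpeechPlayer
{
    private readonly Speaker speaker;
    private readonly MediaConnector connector;
    private TextToSpeech textToSpeech; // null when nothing is being read

    public SpeechPlayer(Speaker speaker, MediaConnector connector, TextToSpeech textToSpeech)
    {
        this.speaker = speaker;
        this.connector = connector;
        this.textToSpeech = textToSpeech;
    }

    // Hypothetical handler; attach it to the Stop button's Click event.
    public void StopButton_Click(object sender, EventArgs e)
    {
        if (textToSpeech == null)
            return; // already stopped, nothing to clean up

        // Stop the output device and the speech stream.
        speaker.Stop();
        textToSpeech.StopStreaming();

        // Break the connection between the reader and the speaker.
        connector.Disconnect(textToSpeech, speaker);

        // Release the reader so the next Start creates a fresh one.
        textToSpeech.Dispose();
        textToSpeech = null;
    }
}
```

Setting the field to null after disposing is what lets the next Start begin the conversion from the beginning with a new TextToSpeech instance.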
If you want to use this functionality to play machine speech
into a call, you need to connect the TextToSpeech object to the AudioSender that
is attached to the call. In this case the text you have typed in is read
out directly into the call.
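Routing the speech into a call could be sketched as follows. The article only names an "AudioSender", so the class name `PhoneCallAudioSender`, its `AttachToCall` method, the `IPhoneCall` type and the `Ozeki.VoIP` namespace are all assumptions here, as are `AddText` and `StartStreaming`. The helper `CallSpeech.ReadIntoCall` is a hypothetical name used only for illustration.

```csharp
using Ozeki.Media.MediaHandlers; // assumed namespace of the media handlers
using Ozeki.VoIP;                // assumed namespace of IPhoneCall

// Hypothetical helper, not part of the SDK.
static class CallSpeech
{
    // Reads the given text into an already established call.
    public static TextToSpeech ReadIntoCall(IPhoneCall call, MediaConnector connector, string text)
    {
        // The audio sender forwards whatever it receives to the remote party.
        PhoneCallAudioSender audioSender = new PhoneCallAudioSender();
        audioSender.AttachToCall(call);

        // Connect the reader to the sender instead of the speaker.
        TextToSpeech textToSpeech = new TextToSpeech();
        connector.Connect(textToSpeech, audioSender);

        textToSpeech.AddText(text);
        textToSpeech.StartStreaming();

        // The caller keeps the reference so it can stop and dispose the reader later.
        return textToSpeech;
    }
}
```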
You can also record the machine speech into a .wav audio file. In this case you need a
WaveStreamRecorder object, and you need to connect the TextToSpeech object to the recorder
before starting the streaming.
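A recording setup along these lines might look like the sketch below. It rests on assumptions that should be checked against the SDK reference: that `WaveStreamRecorder` has a constructor taking the target file path, that the recorder is started with `StartStreaming`, and that `AddText` and `StartStreaming` exist on `TextToSpeech`. The helper `SpeechRecording.ReadIntoFile` is a hypothetical name.

```csharp
using Ozeki.Media.MediaHandlers; // assumed namespace; it may differ in other SDK versions

// Hypothetical helper, not part of the SDK.
static class SpeechRecording
{
    // Writes the spoken version of the text into a .wav file.
    public static WaveStreamRecorder ReadIntoFile(
        MediaConnector connector, TextToSpeech textToSpeech, string wavPath, string text)
    {
        // Assumed constructor: the recorder writes to the given file path.
        WaveStreamRecorder recorder = new WaveStreamRecorder(wavPath);

        // Connect the reader to the recorder before any audio is produced.
        connector.Connect(textToSpeech, recorder);
        recorder.StartStreaming(); // assumed method name for starting the recording

        textToSpeech.AddText(text);
        textToSpeech.StartStreaming();

        // The caller keeps the recorder so it can stop it once the reading has finished.
        return recorder;
    }
}
```

Connecting before starting the stream matters because audio produced before the connection exists would not reach the file.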
This article introduced the basics of text to speech solutions and
showed how Ozeki VoIP SIP SDK can help you build one.
If you have read through this page carefully, you already have the knowledge
you need to start on your own solution.
Now that you are familiar with the terms concerning this topic, it is time
to take a step further and explore what other solutions Ozeki VoIP
SIP SDK can provide.