Microsoft speech technology SAPI 5.3 TTS (text-to-speech) application example
See also: Text To Speech (TTS) Engine
This article is a reprint of the original text of Microsoft's official TTSApp (SAPI 5.3) sample.
TTSApp is an example of a text-to-speech (TTS) enabled application. This sample application is intended to demonstrate many of the features of SAPI 5 in a single coherent application. It is not a full-featured TTS-enabled application, although the foundations of many of the options are present.
TTSApp allows you to hear the resulting audio output from the TTS process for text entered in the main window. Alternatively, you can open a file and TTSApp will speak the contents of that file.
Each word is highlighted in the text window to indicate the current TTS processing position. Features include:
Element | Description |
---|---|
SAPI5 TTSApp | The main display window of the TTSApp sample application. |
Text window | TTSApp speaks the text contained in this window using TTS. |
Speak | Initiates the TTS process. |
Voices | Selects the voice for the audio output. |
Rate | Selects the rate of speech. |
Volume | Selects the volume level of the audio output stream. |
Open File | Enables TTSApp to open and speak the contents of a stored text file. |
Pause | Pauses the TTSApp text phrase speaking process. |
Resume | Resumes the TTSApp text phrase speaking process. |
Stop | Stops the TTSApp text phrase speaking process. |
About | Displays the About TTSApp information dialog box. |
Format | Selects the audio format. |
Skip | Specifies the number of sentences to skip in the phrase speaking process. |
Speak wav | Speaks the contents of a stored wav file. |
Reset | Resets TTSApp to its original configuration setting. |
Save to wav | Saves the contents of the TTSApp audio output stream to a wav file. |
Show all events | Displays all TTSApp SAPI events. |
Process XML | Specifies whether the TTS voice interprets XML tags in the text as markup or speaks them literally. |
Mouth Position | Displays mouth shapes for phrase elements as they are spoken. |
SAPI5 TTSApp
Use the main TTSApp window to select the configuration settings that affect the TTS process. The elements of TTSApp are listed above.
Text window
All text entered in this window is processed and spoken by the selected TTSApp voice. By default, the window contains the text "Enter the text you wish spoken here."
Speak
Click Speak to initiate the text-to-speech process.
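Through SAPI's automation interface, a request like this is usually made with the SpVoice object's Speak method and a set of flags. The following is a minimal sketch of what a Speak button handler might look like, not TTSApp's actual code. The name `on_speak_clicked` is hypothetical, the `voice` argument is assumed to behave like a SAPI.SpVoice automation object, and the choice to purge queued speech before speaking is an assumption. The flag values mirror the SpeechVoiceSpeakFlags enumeration.

```python
# Flag values from SAPI's SpeechVoiceSpeakFlags enumeration.
SVSF_ASYNC = 1               # return immediately and speak in the background
SVSF_PURGE_BEFORE_SPEAK = 2  # discard anything still queued on the voice


def on_speak_clicked(voice, text):
    """Hypothetical Speak-button handler.

    `voice` is assumed to expose Speak(text, flags) like SAPI.SpVoice.
    Speaking asynchronously leaves the UI free for Pause, Resume, and Stop.
    """
    if not text.strip():
        return None  # nothing to say, so do not call the voice at all
    return voice.Speak(text, SVSF_ASYNC | SVSF_PURGE_BEFORE_SPEAK)
```

On Windows with pywin32 installed, the voice object would typically come from `win32com.client.Dispatch("SAPI.SpVoice")`.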
Voices
Select a voice using the drop-down list. TTSApp uses the selected voice when speaking a wav file or the contents of the text window.
Rate
Move the slide control to the right to increase the speech rate, and to the left to decrease the speech rate. The Rate level determines the number of text units spoken per minute.
Volume
Move the slide control to the right to increase the volume level, and to the left to decrease the volume level.
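SAPI voices accept a rate adjustment from -10 to 10 and a volume from 0 to 100. The sketch below shows one way the Rate and Volume slider positions could be clamped and copied onto a voice. It assumes the sliders already report values in SAPI's units and that `voice` exposes Rate and Volume attributes like a SAPI.SpVoice automation object; `apply_sliders` and `clamp` are hypothetical helper names.

```python
RATE_MIN, RATE_MAX = -10, 10     # rate range accepted by SAPI voices
VOLUME_MIN, VOLUME_MAX = 0, 100  # volume as a percentage of full volume


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def apply_sliders(voice, rate_pos, volume_pos):
    """Copy slider positions onto a voice, clamped to SAPI's ranges.

    `voice` is assumed to expose Rate and Volume like SAPI.SpVoice.
    Returns the (rate, volume) pair that was actually applied.
    """
    voice.Rate = clamp(int(rate_pos), RATE_MIN, RATE_MAX)
    voice.Volume = clamp(int(volume_pos), VOLUME_MIN, VOLUME_MAX)
    return voice.Rate, voice.Volume
```

Clamping before assignment keeps an out-of-range slider value from reaching the voice.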
Open File
Click Open File to access the Windows Open dialog box. Select the file, and then click Open.
Pause
Click Pause to interrupt the TTS process.
Resume
Click Resume to continue the TTS process.
Stop
Click Stop to stop the TTS process.
About
The About window displays information related to TTSApp. Click OK to close the About window.
Format
Use the Format drop-down list to select the audio format of the output stream.
Skip
Use the spin box to select the number of sentences to skip. Skip functions only while text is being spoken.
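In the SAPI automation interface, skipping is exposed as a Skip method on the voice that takes an item type and a count, with "Sentence" as the supported item type. As a rough sketch, a handler for the Skip control might forward the spin box value as shown below. The function name `skip_sentences` is hypothetical, and `voice` is assumed to behave like a SAPI.SpVoice automation object.

```python
def skip_sentences(voice, count):
    """Ask the voice to skip `count` sentences in the current speak request.

    `voice` is assumed to expose Skip(item_type, num_items) like SAPI.SpVoice
    and to return the number of items it actually skipped. Negative counts
    skip backward. The call only has an effect while text is being spoken.
    """
    return voice.Skip("Sentence", int(count))
```

Converting the count with `int` lets the handler accept the spin box value whether the UI toolkit reports it as a number or as text.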
Speak wav
Speak wav enables TTSApp to speak the contents of a wav file. Click Speak wav to access the Windows Open dialog box. Select a wav file from the dialog box, and then click Open.
Reset
Click Reset to reset TTSApp to its original configuration state.
Save to wav
Click Save to wav to save the TTSApp audio output stream to a wav file.
Show all events
Select Show all events to display SAPI-related events in the event display window as the input text is processed by TTSApp.
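One of the events SAPI raises during speech is the word boundary event, which reports the character position and length of the word being spoken. An application can use these two values to highlight the current word in its text window. The sketch below illustrates the idea on a plain string, using brackets in place of a real text selection. `highlight_word` is a hypothetical helper, not part of SAPI or TTSApp.

```python
def highlight_word(text, char_pos, length):
    """Return `text` with the span [char_pos, char_pos + length) bracketed.

    char_pos and length are assumed to come from a word boundary event.
    Out-of-range values are clipped to the bounds of the text.
    """
    start = max(0, min(char_pos, len(text)))
    end = max(start, min(char_pos + length, len(text)))
    return text[:start] + "[" + text[start:end] + "]" + text[end:]


# Example: the second word of the default text starts at offset 6.
print(highlight_word("Enter the text you wish spoken here.", 6, 3))
# prints: Enter [the] text you wish spoken here.
```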
Process XML
Select Process XML to have TTSApp interpret XML tags in the input text as markup instead of speaking them. When this option is selected, the application parses the tags and applies them to the audio output stream. When it is not selected, the tags are read aloud as literal text.
For example, if the Process XML option is selected, the application pauses for the number of milliseconds specified in a SILENCE tag.
Process XML | XML tag | Result |
---|---|---|
Selected | <SILENCE MSEC = "3000"/> | The application produces 3000 milliseconds of silence. |
Not selected | <SILENCE MSEC = "3000"/> | The application speaks the phrase, "less than silence msec equals quote three thousand quote slash greater than." |
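In SAPI, this choice corresponds to two speak flags: one that tells the engine to treat the input as XML and one that tells it not to. The sketch below shows how a checkbox state could be translated into the matching flag, together with a small helper that builds the SILENCE tag from the table above. The function names are hypothetical. The flag values mirror the SpeechVoiceSpeakFlags enumeration, and the tag casing follows the example in this document.

```python
SVSF_IS_XML = 8       # parse the input as SAPI XML markup
SVSF_IS_NOT_XML = 16  # speak the input as plain text, tags included


def xml_flag(process_xml):
    """Map the Process XML checkbox state to a speak flag."""
    return SVSF_IS_XML if process_xml else SVSF_IS_NOT_XML


def silence_tag(msec):
    """Build a SILENCE tag that requests `msec` milliseconds of silence."""
    if msec < 0:
        raise ValueError("msec must be non-negative")
    return '<SILENCE MSEC="%d"/>' % msec


# Text that pauses for three seconds when spoken with SVSF_IS_XML.
text = "Before the pause. " + silence_tag(3000) + " After the pause."
```

The resulting flag would be combined with any other speak flags, for example with a bitwise OR, before the text is passed to the voice.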
Mouth Position
The mouth position displays the various mouth shapes and positions as TTSApp processes the input text stream.
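SAPI reports mouth shapes as viseme events, each carrying a viseme ID from 0 through 21, where 0 represents silence. A display like this one could select a picture for each ID as the events arrive. The sketch below assumes one bitmap per viseme. The file naming scheme and the helper name `mouth_image_for` are hypothetical and are not taken from TTSApp.

```python
VISEME_COUNT = 22  # SAPI defines viseme IDs 0 through 21; 0 is silence


def mouth_image_for(viseme_id):
    """Pick a (hypothetical) mouth bitmap file name for a viseme ID.

    Unknown IDs fall back to viseme 0, the silent, closed mouth.
    """
    if not 0 <= viseme_id < VISEME_COUNT:
        viseme_id = 0
    return "mouth_%02d.bmp" % viseme_id
```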