Several text-to-speech (TTS) services available today can produce natural-sounding speech audio from written text. Three of the best-known are Amazon Polly, Google Cloud Text-to-Speech, and Microsoft Azure Speech Service.
A particularly useful feature of these services is support for SSML (Speech Synthesis Markup Language), an XML-based language that lets you control aspects of speech such as volume, pitch, rate, and, most interestingly, pronunciation.
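As a rough illustration of the volume, pitch, and rate controls, here is a minimal Python sketch that wraps text in the standard SSML <prosody> element. The function name prosody_ssml is hypothetical, and which attribute values a given TTS service or voice actually honors is an assumption you would need to check against that service's documentation.

```python
import xml.etree.ElementTree as ET


def prosody_ssml(text, rate="medium", pitch="medium", volume="medium"):
    """Return an SSML string that wraps `text` in a <prosody> element.

    The defaults are the neutral keyword values defined by the SSML
    specification; individual services may support only a subset.
    """
    speak = ET.Element("speak")
    prosody = ET.SubElement(speak, "prosody", rate=rate, pitch=pitch, volume=volume)
    prosody.text = text  # ElementTree escapes &, <, > when serializing
    return ET.tostring(speak, encoding="unicode")


ssml = prosody_ssml("Fish & chips, please.", rate="slow", volume="soft")
print(ssml)
```

Building the markup with an XML library rather than string concatenation keeps characters such as "&" in the input from producing malformed SSML.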
In SSML, you can customize the pronunciation of a word in the audio response using the <phoneme> tag. Here's an example of SSML that overrides the pronunciation of the word "potatoes" by explicitly providing a phonetic transcription:
<?xml version="1.0"?>
<speak xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-GB">
  She's growing carrots and <phoneme alphabet="ipa" ph="ˈblɑː">potatoes</phoneme> in her garden this year.
</speak>
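If you generate such markup programmatically, a small helper can take care of escaping. The following Python sketch reproduces the SSML above; the function name phoneme_ssml and its parameters are hypothetical and only for illustration.

```python
from xml.sax.saxutils import escape, quoteattr

# Namespace used by the <speak> element in the example above.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def phoneme_ssml(before, word, ph, after, alphabet="ipa", lang="en-GB"):
    """Build a <speak> document with one word overridden by a <phoneme> tag.

    `ph` is the phonetic transcription; `alphabet` names the notation it
    uses. All text and attribute values are XML-escaped.
    """
    phoneme = (
        f"<phoneme alphabet={quoteattr(alphabet)} ph={quoteattr(ph)}>"
        f"{escape(word)}</phoneme>"
    )
    return (
        f'<speak xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"{escape(before)}{phoneme}{escape(after)}"
        "</speak>"
    )


ssml = phoneme_ssml(
    "She's growing carrots and ",
    "potatoes",
    "ˈblɑː",  # transcription taken from the example above
    " in her garden this year.",
)
print(ssml)
```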
Depending on the TTS implementation, the pronunciation strings in SSML can use different phonetic alphabets, some of which are language-specific. The universal standard for phonetic transcription is the International Phonetic Alphabet (IPA), the standardized system of phonetic notation created by the International Phonetic Association and supported by many TTS services. The SSML example above provides its custom pronunciation in IPA.
As a written representation of speech sounds, IPA is a very powerful and complex tool used extensively in (computational) linguistics. You can also find IPA transcriptions in the pronunciation section of many dictionaries and linguistics textbooks.
When the pronunciation is provided explicitly, as in the SSML above, it doesn't actually matter what text is inside the <phoneme> tag. This means that we're essentially dealing with "IPA-to-speech" rather than "text-to-speech".
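The "IPA-to-speech" idea can be sketched in Python as follows. This is a minimal sketch rather than a definitive implementation: the helper names ipa_to_ssml and ipa_to_speech are hypothetical, the placeholder text "x" is an arbitrary choice, and the client is assumed to expose a synthesize_speech method shaped like the one on boto3's Amazon Polly client. The voice "Amy" is likewise just an assumed default.

```python
from xml.sax.saxutils import escape, quoteattr


def ipa_to_ssml(ipa, placeholder="x"):
    """Build SSML whose spoken output is driven by the IPA string.

    The placeholder text inside <phoneme> is arbitrary, since the
    explicit transcription overrides it.
    """
    return (
        f'<speak><phoneme alphabet="ipa" ph={quoteattr(ipa)}>'
        f"{escape(placeholder)}</phoneme></speak>"
    )


def ipa_to_speech(client, ipa, voice_id="Amy"):
    """Send the SSML to a Polly-style client and return the audio bytes."""
    response = client.synthesize_speech(
        Text=ipa_to_ssml(ipa),
        TextType="ssml",  # tell the service the text is SSML, not plain text
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    return response["AudioStream"].read()


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   audio = ipa_to_speech(boto3.client("polly"), "ˈblɑː")
#   with open("out.mp3", "wb") as f:
#       f.write(audio)
```

Passing the client in as an argument keeps the helper independent of any one SDK, so the same function can be exercised with a stub object in tests.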
By leveraging IPA, a universal phonetic notation, you can achieve a high level of control over SSML's audio output and create tailored audio experiences for a variety of applications.