Text to speech model in Python

In this tutorial, we will learn how to create a text-to-speech model in Python.
Submitted by Abhinav Gangrade, on June 04, 2020

A text-to-speech model is a small application or bot that converts the given text into speech.

The module that we use for text-to-speech conversion: pyttsx3

pyttsx3 is a text-to-speech conversion library for Python. It is an easy-to-use tool that converts the entered text into speech.

Download the module:

pip install pyttsx3 (In your terminal or command prompt)

If you are a PyCharm user: go to the project interpreter settings and install the module from there.

The various functions that we will use in this model are:

engine=pyttsx3.init(): The init function is the entry point, and we have to call it every time. It initializes the connection to the speech driver and creates an engine. Everything else is done on the engine object returned by init().
engine.say(text): This function queues the text to be spoken (text is the input from the user).
engine.runAndWait(): This function processes the queued commands and makes the speech audible. If you don't call it, the speech will not be audible.
engine.setProperty(name, value): This method sets a property of the engine, such as the voice or the rate.
engine.getProperty(name): This method returns the current value of a property.
voices: If we want the voices and accents available to the model, we can get them through the "voices" property.

Syntax:

voice=engine.getProperty("voices")
print(voice)

Output:

[<pyttsx3.voice.Voice object at 0x7f1656128b70>, 
<pyttsx3.voice.Voice object at 0x7f165609c978>, 
<pyttsx3.voice.Voice object at 0x7f165609ca20>, 
<pyttsx3.voice.Voice object at 0x7f1655bf9f60>, 
<pyttsx3.voice.Voice object at 0x7f16559c6ac8>, 
<pyttsx3.voice.Voice object at 0x7f16559d7b70>, 
<pyttsx3.voice.Voice object at 0x7f16559d7c18>, 
<pyttsx3.voice.Voice object at 0x7f16559d7cc0>, 
<pyttsx3.voice.Voice object at 0x7f16559d7d68>, 
<pyttsx3.voice.Voice object at 0x7f16559d7e10>, 
<pyttsx3.voice.Voice object at 0x7f16559d7eb8>, 
<pyttsx3.voice.Voice object at 0x7f16559d7f60>, 
<pyttsx3.voice.Voice object at 0x7f16559d7f98>, 
<pyttsx3.voice.Voice object at 0x7f16559d7fd0>, 
<pyttsx3.voice.Voice object at 0x7f1650761048>, 
<pyttsx3.voice.Voice object at 0x7f1650761080>, 
<pyttsx3.voice.Voice object at 0x7f16507610b8>, 
<pyttsx3.voice.Voice object at 0x7f16507610f0>, 
<pyttsx3.voice.Voice object at 0x7f1650761128>, 
<pyttsx3.voice.Voice object at 0x7f16507611d0>, 
<pyttsx3.voice.Voice object at 0x7f1650761208>, 
<pyttsx3.voice.Voice object at 0x7f1650761240>, 
<pyttsx3.voice.Voice object at 0x7f16507612e8>, 
<pyttsx3.voice.Voice object at 0x7f1650761320>, 
<pyttsx3.voice.Voice object at 0x7f16507613c8>, 
<pyttsx3.voice.Voice object at 0x7f1650761400>, 
<pyttsx3.voice.Voice object at 0x7f16507614a8>, 
<pyttsx3.voice.Voice object at 0x7f16507614e0>, 
<pyttsx3.voice.Voice object at 0x7f1650761518>, 
<pyttsx3.voice.Voice object at 0x7f16507615c0>, 
<pyttsx3.voice.Voice object at 0x7f16507615f8>, 
<pyttsx3.voice.Voice object at 0x7f1650761630>, 
<pyttsx3.voice.Voice object at 0x7f1650761668>, 
<pyttsx3.voice.Voice object at 0x7f16507616a0>, 
<pyttsx3.voice.Voice object at 0x7f16507616d8>, 
<pyttsx3.voice.Voice object at 0x7f1650761710>, 
<pyttsx3.voice.Voice object at 0x7f16507617b8>, 
<pyttsx3.voice.Voice object at 0x7f1650761860>, 
<pyttsx3.voice.Voice object at 0x7f1650761898>, 
<pyttsx3.voice.Voice object at 0x7f1650761940>, 
<pyttsx3.voice.Voice object at 0x7f16507619e8>, 
<pyttsx3.voice.Voice object at 0x7f1650761a90>, 
<pyttsx3.voice.Voice object at 0x7f1650761ac8>, 
<pyttsx3.voice.Voice object at 0x7f1650761b00>, 
<pyttsx3.voice.Voice object at 0x7f1650761ba8>, 
<pyttsx3.voice.Voice object at 0x7f1650761be0>, 
<pyttsx3.voice.Voice object at 0x7f1650761c18>, 
<pyttsx3.voice.Voice object at 0x7f1650761cc0>, 
<pyttsx3.voice.Voice object at 0x7f1650761d68>, 
<pyttsx3.voice.Voice object at 0x7f1650761e10>, 
<pyttsx3.voice.Voice object at 0x7f1650761e48>, 
<pyttsx3.voice.Voice object at 0x7f1650761ef0>, 
<pyttsx3.voice.Voice object at 0x7f1650761f98>, 
<pyttsx3.voice.Voice object at 0x7f1650765080>, 
<pyttsx3.voice.Voice object at 0x7f16507650b8>, 
<pyttsx3.voice.Voice object at 0x7f16507650f0>, 
<pyttsx3.voice.Voice object at 0x7f1650765198>, 
<pyttsx3.voice.Voice object at 0x7f1650765240>, 
<pyttsx3.voice.Voice object at 0x7f1650765278>, 
<pyttsx3.voice.Voice object at 0x7f1650765320>, 
<pyttsx3.voice.Voice object at 0x7f16507653c8>, 
<pyttsx3.voice.Voice object at 0x7f1650765400>, 
<pyttsx3.voice.Voice object at 0x7f16507654a8>, 
<pyttsx3.voice.Voice object at 0x7f1650765550>, 
<pyttsx3.voice.Voice object at 0x7f16507655f8>, 
<pyttsx3.voice.Voice object at 0x7f1650765630>, 
<pyttsx3.voice.Voice object at 0x7f1650765668>, 
<pyttsx3.voice.Voice object at 0x7f16507656a0>, 
<pyttsx3.voice.Voice object at 0x7f16507656d8>]

This is the list of available voices. To find out which accents are in this list, we can print the id of each voice with the following code:

for i in voice:
  print(i.id)

So the various accents are:

['afrikaans', 'aragonese', 'bulgarian', 'bosnian', 'catalan', 
'czech', 'welsh', 'danish', 'german', 'greek', 'default', 
'english', 'en-scottish', 'english-north', 'english_rp', 
'english_wmids', 'english-us', 'en-westindies', 'esperanto', 
'spanish', 'spanish-latin-am', 'estonian', 'persian', 
'persian-pinglish', 'finnish', 'french-Belgium', 'french', 
'irish-gaeilge', 'greek-ancient', 'hindi', 'croatian', 
'hungarian', 'armenian', 'armenian-west', 'indonesian', 
'icelandic', 'italian', 'lojban', 'georgian', 'kannada', 
'kurdish', 'latin', 'lingua_franca_nova', 'lithuanian', 
'latvian', 'macedonian', 'malayalam', 'malay', 'nepali', 
'dutch', 'norwegian', 'punjabi', 'polish', 'brazil', 'portugal', 
'romanian', 'russian', 'slovak', 'albanian', 'serbian', 'swedish', 
'swahili-test', 'tamil', 'turkish', 'vietnam', 'vietnam_hue', 
'vietnam_sgn', 'Mandarin', 'cantonese']
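
The ids can also be collected into a single list, as displayed above, instead of being printed one per line. The following is a minimal sketch; list_voice_ids and main are hypothetical helper names used for illustration and are not part of pyttsx3.

```python
def list_voice_ids(engine):
    """Return the id of every voice the engine reports, as one list."""
    voices = engine.getProperty("voices")
    return [v.id for v in voices]


def main():
    # Imported here so that list_voice_ids can be reused or tested
    # on a machine where no speech driver is installed.
    import pyttsx3

    engine = pyttsx3.init()
    print(list_voice_ids(engine))
```

Calling main() prints the ids available on your system. The names you see depend on the installed speech driver, so they may differ from the list above.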

Now, to set the accent, we can write:

engine.setProperty("voice",voice[15].id)

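This selects the voice at index 15, which is 'english_wmids' in the list above (the 16th entry, because indexing starts at 0).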
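
The order of the voices list can differ from one system to another, so a fixed position such as 15 may not select the same accent everywhere. One possible alternative is to look the voice up by its id. The sketch below assumes that the id contains a recognizable keyword (as it does in the list above); pick_voice_id and main are hypothetical names, not pyttsx3 functions.

```python
def pick_voice_id(voices, keyword):
    """Return the id of the first voice whose id contains keyword, or None."""
    keyword = keyword.lower()
    for v in voices:
        if keyword in v.id.lower():
            return v.id
    return None


def main():
    # Imported lazily so pick_voice_id can be used without a speech driver.
    import pyttsx3

    engine = pyttsx3.init()
    voice_id = pick_voice_id(engine.getProperty("voices"), "english_wmids")
    if voice_id is not None:
        engine.setProperty("voice", voice_id)
    engine.say("Hello")
    engine.runAndWait()
```

If no voice matches, the engine keeps its default voice instead of raising an IndexError.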

Now, to set the speed of the speech (the rate, in words per minute):

engine.setProperty("rate",<The speed you want>)
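
Instead of guessing an absolute number, the rate can be set relative to the engine's current value, which getProperty("rate") returns. This is only a sketch; scale_rate is a hypothetical helper name, and the factor 0.75 is an arbitrary example.

```python
def scale_rate(engine, factor):
    """Multiply the engine's current speaking rate by factor and apply it."""
    current = engine.getProperty("rate")
    new_rate = int(current * factor)
    engine.setProperty("rate", new_rate)
    return new_rate


def main():
    # Imported lazily so scale_rate can be used without a speech driver.
    import pyttsx3

    engine = pyttsx3.init()
    scale_rate(engine, 0.75)  # speak at three quarters of the current speed
    engine.say("This sentence is spoken more slowly.")
    engine.runAndWait()
```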

Program:

# importing the package
import pyttsx3

# create the engine
engine=pyttsx3.init()

text=input("Enter the text that you want to convert into speech: ")

# choose the first available voice and set the speaking rate
s=engine.getProperty("voices")
engine.setProperty("voice",s[0].id)
engine.setProperty("rate",150)

# queue the text and speak it
engine.say(text)
engine.runAndWait()

Note: Make all the setProperty() and getProperty() calls before .say() and .runAndWait(); otherwise the settings will not be applied to the speech as you expect.
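
One way to keep this order is to wrap the steps in a single function that configures the engine first and speaks last. The following is a sketch under that assumption; speak and main are hypothetical names, and the defaults mirror the values used in the program above.

```python
def speak(engine, text, voice_index=0, rate=150):
    """Configure the engine first, then queue the text and speak it."""
    # 1. Properties are set before anything is queued.
    voices = engine.getProperty("voices")
    engine.setProperty("voice", voices[voice_index].id)
    engine.setProperty("rate", rate)
    # 2. Only then is the text queued and spoken.
    engine.say(text)
    engine.runAndWait()


def main():
    # Imported lazily so speak can be used without a speech driver.
    import pyttsx3

    engine = pyttsx3.init()
    speak(engine, input("Enter the text that you want to convert into speech: "))
```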

Errors:

Some of you may get an error, or sometimes the speech will not be audible. The following threads may help:

For Linux users: https://github.com/nateshmbhat/pyttsx3/issues/41, https://stackoverflow.com/questions/54243316/how-to-fix-pyttsx3-when-it-isnt-working
For Windows users: https://github.com/nateshmbhat/pyttsx3/issues/29
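
If the engine cannot start at all, the failure can be caught so the program prints a readable message instead of a traceback. This is only a sketch: try_init is a hypothetical helper, and it catches the general Exception class because the exact exception type depends on the platform and the speech driver.

```python
def try_init(init_func):
    """Call init_func and return the engine, or None if it fails."""
    try:
        return init_func()
    except Exception as exc:  # the exact type varies by platform and driver
        print(f"Could not start the speech engine: {exc}")
        return None


def main():
    # Imported lazily so try_init can be used without a speech driver.
    import pyttsx3

    engine = try_init(pyttsx3.init)
    if engine is not None:
        engine.say("The engine started correctly.")
        engine.runAndWait()
```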
