Speech Synthesis Options for Robotics

Language can become a barrier for humans and robots when trying to communicate. The robot will have the ability to produce sounds in order to communicate with people, animals and other robots.

Options for speech

When I first started exploring ways to add speech to my robotics projects I was using a small FM transmitter to send audio to the robot. However, it can probably be agreed that just sticking a radio in the robot isn’t that cool. Instead we are going to look at a few of the different software and hardware required make this conversion.

Operational requirements

Can’t slow down other processes to generate the audio.
Ability to stop speaking process on demand.
Ability to queue tasks for processing.
Ability to modify pronunciations as needed.

Design goal

We are going to set up a dedicated speech processor that will receive text data from the robot’s main computer which it will then process into audio signals that can be understood as words. The following options list various pros and cons for available solutions that can handle speech synthesis on either the hardware or software level.

Arduino for speech synthesis

Pros

Arduino board is low cost.
Many programs for speech synthisis on an Arduino have already been created so very little needs to be created from scratch.

Cons

Lowest quality audio output of all options

Raspberry Pi for speech synthesis

Pros

Install commonly used (free) espeak package.

Cons

The Raspberry Pi might be overkill for just speech synthesis processing tasks alone.

eSpeak text to speech module

Pros

Takes the load of speech synthesis off of the robot’s main processor.

Cons

More expensive ($50 to $70)

An amplifier will be required in either case to amplify the audio you generate for output through speakers.