About Text To Speech

Here I’m going to talk about Text-to-Speech (TTS) software, both free and commercial. This article leans toward the Japanese language, as I’m going to share it with my Japanese class.

If not declared otherwise, this work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

What is it?

A text-to-speech (TTS) system converts normal language text into speech.
by Wikipedia

In layman’s terms, it lets your computer read aloud whatever text you give it.

What tools are available out there?

Web

Of course, the web is the most easily available platform; you can use it almost everywhere, even on your phone. So, let’s talk about it first.

Yeah, there are plenty of them. Just google “web text to speech” and you’ll find lots of choices waiting for you. Some of them are indeed high quality, and some even let you download the generated voice file.

Here are some websites whose sound quality I think is pretty good.

But there’s one thing: almost all of them come with a limit. That’s why we can’t be satisfied with just online services. We need some local software that can do the job for us as well.

Software

System built-ins

Actually, many OSes come with their own speech synthesis engines, at least Windows and Mac do. (Sorry, Linux users: the non-proprietary engines don’t sound high-quality to me.)

Windows

On Windows, there are some engines that just work, usually in the language that the system uses. But if you want to use a language that isn’t currently available to you, sadly, you need to install the entire language pack, on a supported edition (usually Pro or Ultimate). (Alternatively, if you are tech-savvy, you may want to try this, but only do it if you know what you are doing, and do it at your own risk.)

Once you have your voice set up, get a program that can save the speech for you, like Balabolka.

Mac OS X (a.k.a. macOS)

Things get a lot easier on OS X. (Yeah, I prefer the old name, because I’m still not on Sierra XD.) All you need to do is head to System Preferences -> Dictation & Speech -> Text to Speech and choose your favorite language and voice. You can download more from the Customize dialog.

Notice: Voice downloads can use a lot of data, so download them when you are on a suitable connection.

Getting your speech saved is pretty simple as well. Select a chunk of text, right-click it, then choose Services -> Add to iTunes as a Spoken Track. Then choose a voice, a file name and a folder. That’s all. Easy peasy, right? If you want to hear it before saving, right-click and go with Speech -> Start Speaking.
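If you’d rather do this from a script, macOS also ships a command-line tool called say that can speak text aloud or write it to an audio file. Here’s a minimal Python sketch that wraps it. The helper name build_say_command, the output file name, and the choice of the Kyoko voice (a Japanese voice that you may need to download first) are my own assumptions for illustration.

```python
import subprocess
import sys


def build_say_command(text, voice=None, output_path=None):
    """Build the argument list for the macOS `say` command.

    voice       -- name of an installed voice (passed to -v), or None for the default
    output_path -- audio file to write (passed to -o), or None to speak aloud
    """
    cmd = ["say"]
    if voice:
        cmd += ["-v", voice]
    if output_path:
        cmd += ["-o", output_path]
    cmd.append(text)
    return cmd


if __name__ == "__main__" and sys.platform == "darwin":
    # Assumption: the "Kyoko" voice is installed on this Mac.
    # Speak first to preview, then save the same text to an AIFF file.
    subprocess.run(build_say_command("こんにちは", voice="Kyoko"), check=True)
    subprocess.run(
        build_say_command("こんにちは", voice="Kyoko", output_path="hello.aiff"),
        check=True,
    )
```

Running say -v "?" in Terminal should list the voices installed on your machine, which helps when picking a value for voice.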

If you want to try something more…

Windows SAPI

There are many free and paid SAPI engines on the internet. Google “SAPI voices {language}” (remember to replace {language} with the language name) and look for one you like. All of them should be compatible with any SAPI tool, like Balabolka.
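If you’re curious what a SAPI tool does under the hood, here is a rough Python sketch that picks an installed SAPI voice and saves the speech to a WAV file. It assumes Windows with the third-party pywin32 package installed, and I haven’t tested it against every voice. The function names, the "Japanese" keyword and the file name hello.wav are made up for the example.

```python
import sys

# SAPI SpeechStreamFileMode value meaning "create the file for writing".
SSFM_CREATE_FOR_WRITE = 3


def find_voice_index(descriptions, keyword):
    """Return the index of the first description containing keyword, else None."""
    keyword = keyword.lower()
    for i, description in enumerate(descriptions):
        if keyword in description.lower():
            return i
    return None


def speak_to_wav(text, wav_path, keyword=None):
    """Render text to a WAV file through SAPI (Windows only, needs pywin32)."""
    import win32com.client  # imported lazily so this file still loads elsewhere

    voice = win32com.client.Dispatch("SAPI.SpVoice")
    tokens = voice.GetVoices()
    descriptions = [tokens.Item(i).GetDescription() for i in range(tokens.Count)]

    if keyword is not None:
        index = find_voice_index(descriptions, keyword)
        if index is None:
            raise ValueError(f"no installed voice matches {keyword!r}: {descriptions}")
        voice.Voice = tokens.Item(index)

    # Redirect the audio output from the speakers to a file stream.
    stream = win32com.client.Dispatch("SAPI.SpFileStream")
    stream.Open(wav_path, SSFM_CREATE_FOR_WRITE)
    voice.AudioOutputStream = stream
    try:
        voice.Speak(text)
    finally:
        stream.Close()


if __name__ == "__main__" and sys.platform == "win32":
    # Assumption: a voice with "Japanese" in its description is installed.
    speak_to_wav("こんにちは", "hello.wav", keyword="Japanese")
```

The voice is matched by its description text, which on my understanding usually includes the language name, so a keyword like "Japanese" is a reasonable first guess. If nothing matches, the error message prints every description it found.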

Independent Voice Engines

Many independent voice engines come with a higher level of “customizability”, meaning you can even change the tone and mood of the speech. But they are usually paid software.

Some that I would strongly recommend for Japanese users are:

Yukkuri (AquesTalk)

This is a phenomenal TTS voice, widely known for its characteristic sound among Japanese Nico-chu (users of Niconico Douga). More importantly, it’s free and cross-platform.

It works on Windows (1, 2), OS X, and even Android (1, 2) and iOS. Try it out and you’ll know how interesting it is.

Notice: AquesTalk supports Japanese only.

But I want to tweak it more

Then you may want to let the computer “sing” out your words. Singing synthesizers:

Apps (Android)

Android works much like Windows: grab a voice and a reading app from the Play Store, and use them straight away. Note that some voices may require payment. Currently I’m using:

with