Pronouncing 3 digit and four digit numbers

Forum for TextAloud version 3

Moderator: Jim Bretti

OLINEA
Posts: 5
Joined: Tue Jan 11, 2011 11:08 pm

Pronouncing 3 digit and four digit numbers

Post by OLINEA »

How can you use the Pronunciation Dictionary to pronounce 3 and 4 digit numbers in a particular way? For example, the voice comes across the number "121". Instead of changing the pronunciation of every three or four digit number you come across, can you "program" the voices to read 121 as "one twenty one" instead of "one hundred twenty one"? Or to read "3312" as "thirty three twelve" instead of "three three one two" or "three thousand three hundred twelve"? I understand how to change the way years like 1912 are read: because they are so common, a simple match on four digit numbers that begin with 19 is all that is needed. What I don't understand is how to pronounce every four digit number as two two-digit pairs without having to match every conceivable number combination. It seems like there should be a way to do this, but I don't understand Matching and Regular Expressions well enough to "program" it this way. Thanks for any help - T.J.
Jim Bretti
Posts: 1558
Joined: Wed Oct 29, 2003 11:07 am

Re: Pronouncing 3 digit and four digit numbers

Post by Jim Bretti »

Here's something to try. In the TextAloud pronunciation editor, create a pronunciation entry like this:

Set the Text Matching dropdown to "Regular Expression", and use this as the expression:

(?<=\b)(\d{1,2})(\d\d)(?=\b)

Set the Pronounce Using dropdown to "Respell", and use this:

$1 $2

Here is an explanation. At the beginning of the expression, (?<=\b) looks 'behind' for a word boundary, and at the end of the expression, (?=\b) looks 'ahead' for a word boundary. A word boundary is a position rather than a character: the point where a letter or digit meets a space, punctuation mark, or the start or end of the text. So the 'look behind' and 'look ahead' force the pattern we're searching for to begin and end at a word boundary, which keeps it from matching digits in the middle of a longer number. (\d{1,2}) is used in the expression to match either 1 or 2 digits, and (\d\d) matches exactly two digits. $1 and $2 in the respell field correspond to the two sets of digits in parentheses. The result is that whenever we see a number like 123 or 2345, we force a space before the last two digits, giving "1 23" and "23 45". Note that $1 and $2 are numbered by the capturing sets of parentheses only; look behinds and look aheads don't count.
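To see what the entry does outside TextAloud, here is a minimal sketch in Python. It is only an approximation of what the pronunciation editor does: the function name is made up for the illustration, Python's re module is not necessarily the same regex engine TextAloud uses, and Python writes the replacement as \1 \2 where the respell field uses $1 $2.

```python
import re

# The expression from the post, unchanged.
PATTERN = re.compile(r"(?<=\b)(\d{1,2})(\d\d)(?=\b)")


def split_last_two_digits(text: str) -> str:
    """Put a space before the last two digits of standalone 3 and 4 digit numbers.

    Hypothetical helper for illustration; r"\1 \2" plays the role of "$1 $2".
    """
    return PATTERN.sub(r"\1 \2", text)


if __name__ == "__main__":
    for sample in ["121", "3312", "1912", "42", "12345"]:
        print(sample, "->", split_last_two_digits(sample))
```

Two digit numbers are left alone because the expression needs at least three digits, and five digit numbers are left alone because no split of the digits both starts and ends on a word boundary.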

Let me know if that doesn't work.
Jim Bretti
NextUp.com
enginist
Posts: 2
Joined: Fri Jul 01, 2016 1:36 am

Re: Pronouncing 3 digit and four digit numbers

Post by enginist »

I love the NextUp program and get a lot of use from it. But in the four months that I've had it, I haven't been able to get all the numbers--dates, years, currency--pronounced correctly, or at least acceptably. There is a lot of good information scattered all over the forum, and I have taken advantage of what I could find.

I have about 10 dictionary entries for AT&T's Mike, but some numbers continue to frustrate me. As much as I admire the knowledge of Jim Bretti, P. Henry, SFCurly, and others, I don't want to become fluent in regular expressions, which seem harder to learn than calculus. And I like calculus.

Anyway, I tried the entry in the above post for three- and four-digit numbers, and it works, more or less, in the test area. By "more or less," I mean it will pronounce 74,500 as seventy four five zero zero, which isn't perfect, but I can live with it.
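The "five zero zero" reading follows from the expression itself: a comma counts as a word boundary, so the "500" in 74,500 looks like a standalone three digit number and becomes "5 00". The Python sketch below reproduces that and tries one possible stricter variant that skips digit groups attached to a comma or decimal point. The variant and the names are assumptions for illustration only, it has not been tested in TextAloud, and whether the pronunciation editor accepts these negative look-arounds is also an assumption.

```python
import re

# The expression from the earlier post.
ORIGINAL = re.compile(r"(?<=\b)(\d{1,2})(\d\d)(?=\b)")

# Hypothetical stricter variant (not from the thread): the match must not be
# preceded by a digit or by "digit + separator", and must not be followed by
# a digit or by "separator + digit".
STRICT = re.compile(r"(?<!\d)(?<!\d[,.])(\d{1,2})(\d\d)(?![,.]?\d)")


def respell(pattern: re.Pattern, text: str) -> str:
    """Apply the 'space before the last two digits' replacement."""
    return pattern.sub(r"\1 \2", text)


if __name__ == "__main__":
    for sample in ["74,500", "12,500", "2000", "1971"]:
        print(sample, "| original:", respell(ORIGINAL, sample),
              "| strict:", respell(STRICT, sample))
```

With the stricter variant, comma-grouped numbers pass through untouched and are left to the voice's own number handling, while plain numbers like 1971 are still split.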

However, when reading the main text, it seems to have a problem with zeroes. It pronounces 1971 correctly, but pronounces 2000 as twenty. It also fails with five-digit numbers, pronouncing 12,500 as twelve five and 74,000 as seventy four oh.

Thank you.

Enginist.

P.S. Are some voices more adept at pronunciation than others? Do some need little or no dictionary maintenance? If so, which are they?
Jim Bretti
Posts: 1558
Joined: Wed Oct 29, 2003 11:07 am

Re: Pronouncing 3 digit and four digit numbers

Post by Jim Bretti »

On the problem you're having with AT&T voices, it might be worth disabling all your dictionary entries and retesting with the numbers that are giving you trouble (2000, 12,500 and 74,000). On my system, AT&T Mike seems to handle those numbers OK with no dictionary entries: the three numbers are pronounced as two thousand, twelve thousand five hundred, and seventy four thousand. Retesting with dictionary entries disabled should tell us exactly what correction is needed, and whether one of your other entries is interfering.

It is possible that other voices handle these numbers better. The best thing to do would be to try the interactive demos for Acapela and Ivona, where you can enter your own text and hear how the voices handle your numbers. There is an interactive demo for Acapela voices at http://www.acapela-group.com/text-to-sp ... -demo.html, and for Ivona at http://nextup.com/ivona/index.html. If you're interested in Nuance voices, you'll need to mail a text sample to me (jim@nextup.com) and let me know which voice(s) you're interested in. I'll send you mp3 files created from the text you send.
Jim Bretti
NextUp.com
enginist
Posts: 2
Joined: Fri Jul 01, 2016 1:36 am

Re: Pronouncing 3 digit and four digit numbers

Post by enginist »

It's fixed now, thanks. As you suggested, I unchecked all the dictionary entries, and that provided a partial solution. But the main culprit, which I discovered accidentally, was in the Character Speaking and Filtering section, where I had (0-9) filtered out. I don't remember inflicting that stupidity on myself, but I'm sure glad I found it.
Jim Bretti
Posts: 1558
Joined: Wed Oct 29, 2003 11:07 am

Re: Pronouncing 3 digit and four digit numbers

Post by Jim Bretti »

Character filtering really should at least display a warning when you have alphanumerics in the filter. I'll try to get that in the next update.
Jim Bretti
NextUp.com