how to skip numbers?

Forum for TextAloud version 3

Moderator: Jim Bretti

bgswanson


Post by bgswanson »

I have uploaded several chapters from the Bible that I want read to me. After uploading, I found there was no space between the verse number and the first letter of each verse. Is there a way to use the pronunciation dictionary to skip the numbers, so I can avoid going through every verse and putting a space between the number and the first letter?

Question 2:

This seems to be a common issue with some of the other material I download. About every third or fourth paragraph that I copy has all the words stuck together. Is there a quick fix for this, or am I stuck putting spaces between all the words?
Jim Bretti
Posts: 1558
Joined: Wed Oct 29, 2003 11:07 am

Post by Jim Bretti »

We can probably use the TextAloud pronunciation editor to help with the first issue, and possibly the second issue as well. In the first case, the idea is to use the pattern matching functionality in the pronunciation editor to look for verse numbers immediately followed by text, and skip the verse numbers. Handling the second case will depend on whether there is any kind of pattern we can key on.

It will help if you post a few samples of text (cases 1 and 2) so we can look at what the pronunciation dictionary entries need to look like.
Jim Bretti
NextUp.com
bgswanson
Posts: 4
Joined: Mon Apr 04, 2011 3:50 pm

Post by bgswanson »

Here is what is going on with case 1: Matthew 8
1When Jesus came down from the hill, many people followed him.

2A man who had leprosy [a bad skin disease] came and kneeled in front of Jesus and worshipped him. He said to Jesus, `Sir, I know you can heal me if you want to.'

3Jesus put out his hand and touched him. He said, `I want to. Be healed.' Right away his leprosy was healed.

4Jesus said to him, `Do not tell anyone about this. But go and let the priest look at you. Moses gave a law about the sacrifice you must give when you are healed. Give it, to prove to people you are healed.'

5When Jesus came to the town of Capernaum, a Roman army officer came to ask him for help.

6He said, `Sir, one of my servants is in bed at my house. He cannot move and has much pain.'

7Jesus said, `I will come and heal him.'

8The officer said, `Sir, I am not good enough to have you come into my house. Just say the word and my servant will be healed.

9I myself am a man who takes orders, and I have soldiers who take orders from me. I say to one, "Go," and he goes. I say to another one, "Come," and he comes. I say to my servant, "Do this," and he does it.'

10Jesus was surprised when he heard this. He said to the people who followed him, `I tell you the truth. I have not found any Jew who believes as this man does.

Case 2 is too random. It picks any paragraph and simply fails to put spaces between the words. I'll post a sample tomorrow.
Jim Bretti
Posts: 1558
Joined: Wed Oct 29, 2003 11:07 am

Post by Jim Bretti »

Here is one way to try handling the verse numbers. It looks like we can count on verse numbers appearing at the beginning of a line and being immediately followed by an upper case character.

From the Tools menu, display Pronunciation Dictionary Maintenance. Create a new dictionary entry, setting the Text Matching dropdown to "Regular Expression", and the Pronounce Using dropdown to "Skip Text". For the regular expression, use this:

((?<=^)|(?<=\n))\d+(?=[A-Z])

Also, enable the Case Sensitive checkbox.

The ^ symbol matches the beginning of the text, and \n matches a newline character. \d+ matches one or more decimal digits, and [A-Z] matches an uppercase letter (this is why the Case Sensitive checkbox needs to be enabled). The expression uses 'lookbehinds' and a 'lookahead' to check for the start of the text or a preceding newline, and for the following upper case character, without including them in the match, so the only text returned on a match is the verse number. Using Skip Text as the Pronounce Using option should then skip the verse numbers.
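For anyone who wants to check the pattern outside TextAloud, here is a minimal sketch in Python. It is only an approximation: it assumes Python's re module behaves like the TextAloud engine for these constructs, it rewrites (?<=^) as a plain ^, and the names VERSE_NUMBER, sample and cleaned are made up for the illustration. Removing the matches stands in for what Skip Text does to the spoken text.

```python
import re

# Python rendering of the pattern above: digits at the very start of the
# text, or right after a newline, that are followed by an uppercase letter.
# Python's matching is case sensitive by default, like the checkbox setting.
VERSE_NUMBER = re.compile(r"(?:^|(?<=\n))\d+(?=[A-Z])")

sample = (
    "1When Jesus came down from the hill, many people followed him.\n"
    "\n"
    "2A man who had leprosy came and kneeled in front of Jesus.\n"
    "\n"
    "10Jesus was surprised when he heard this."
)

# Only the digits are matched; the lookahead letter stays in the text.
print(VERSE_NUMBER.findall(sample))  # ['1', '2', '10']

# Deleting the matches approximates "Skip Text".
cleaned = VERSE_NUMBER.sub("", sample)
print(cleaned)
```

Numbers in the middle of a line are left alone, because they are neither at the start of the text nor directly after a newline.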

Let me know if you have any trouble getting that to work.
Jim Bretti
NextUp.com
bgswanson
Posts: 4
Joined: Mon Apr 04, 2011 3:50 pm

Post by bgswanson »

It did not work.
bgswanson
Posts: 4
Joined: Mon Apr 04, 2011 3:50 pm

Post by bgswanson »

Additionally, it is speaking "back quote"
Bluemoon
Posts: 1
Joined: Sat Aug 21, 2010 11:49 am

Post by Bluemoon »

It works on my TA 3.0.17, but not on MS Word 2007.

Regards,
PHenry1026
Posts: 231
Joined: Thu Jan 11, 2007 12:10 pm

Post by PHenry1026 »

Greetings,

I tested your sample text and the following regex works on it:

(?#BibleSkip)(?m)(?<=^)\d+(?=(?-i)[A-Z](?i))

Follow the instructions given by Jim (above) to enter this regex, but do not enable case sensitivity.

If you have spaces before the numbers this alternative should work for all situations:

(?#BibleSkip)(?m)^\s*?\d+(?=(?-i)[A-Z](?i))

The first regex is the cleaner of the two; use the second only if you encounter spaces before the numbers.
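A rough Python sketch of the same idea, for experimenting outside TextAloud. Several things here are assumptions rather than the thread's exact patterns: Python cannot switch case sensitivity off and on mid-pattern, so the scoped group (?-i:[A-Z]) stands in for (?-i)[A-Z](?i); the case-insensitive dictionary entry is modelled with re.IGNORECASE; and the second pattern uses [ \t]* instead of \s*? so that a match cannot reach back across a blank line. The names STRICT, LENIENT and NO_GUARD are illustrative only.

```python
import re

# Multiline so ^ matches at every line start; IGNORECASE models an entry
# that does not have case sensitivity enabled.
FLAGS = re.MULTILINE | re.IGNORECASE

# Digits at line start, followed by a letter that must really be uppercase.
STRICT = re.compile(r"^\d+(?=(?-i:[A-Z]))", FLAGS)
# Same, but tolerating spaces or tabs before the digits.
LENIENT = re.compile(r"^[ \t]*\d+(?=(?-i:[A-Z]))", FLAGS)
# Without the case guard, IGNORECASE lets [A-Z] match lowercase letters too.
NO_GUARD = re.compile(r"^\d+(?=[A-Z])", FLAGS)

text = "1When Jesus came down\n  2A man came\n3rd day of the week"

strict_hits = STRICT.findall(text)
lenient_hits = LENIENT.findall(text)
no_guard_hits = NO_GUARD.findall(text)

print(strict_hits)    # ['1']
print(lenient_hits)   # ['1', '  2']
print(no_guard_hits)  # ['1', '3']
```

The last result shows why the case guard matters: without it, the 3 in "3rd" would be skipped as if it were a verse number.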

Percy Henry
bgswanson
Posts: 4
Joined: Mon Apr 04, 2011 3:50 pm

Post by bgswanson »

It ended up working. Thanks a bunch.
dinoe
Posts: 12
Joined: Sun Jan 29, 2006 4:02 am

Post by dinoe »

With TA 3.0.23 I have a similar problem to solve: numbers at the beginning of a line should be skipped. Every line is numbered at the beginning, followed by a space and then upper or lower case words. Only the numbers at the beginning of the line should be skipped. I tried the regular expressions from Jim Bretti and PHenry1026, but they did not work for me...
Jim Bretti
Posts: 1558
Joined: Wed Oct 29, 2003 11:07 am

Post by Jim Bretti »

One thing you can try is to use a modifier on the regular expression so that it works line by line. The modifier is (?m), and it lets the ^ symbol in the expression match at the start of every line rather than only at the start of the text.

So your regular expression could look like this:

(?m)^(\d+\.\s)

and set Pronounce using to "Skip Text".

The expression here uses the multiline option (?m), and looks for one or more digits at the beginning of a line, followed by a period and a space.

Does that work?
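To see what the (?m) modifier changes, here is a small Python sketch. It assumes Python's re module treats (?m) and ^ the same way TextAloud's engine does, and the sample text and variable names are invented for the example.

```python
import re

text = "1. First item\n2. Second item\n3. Third item"

# Without (?m), ^ only matches at the very start of the text.
without_m = re.findall(r"^(\d+\.\s)", text)

# With (?m), ^ matches at the start of every line.
with_m = re.findall(r"(?m)^(\d+\.\s)", text)

print(without_m)  # ['1. ']
print(with_m)     # ['1. ', '2. ', '3. ']
```

Note that the pattern requires a period after the digits, so lines numbered as "1 First item" would not match it at all.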
Jim Bretti
NextUp.com
dinoe
Posts: 12
Joined: Sun Jan 29, 2006 4:02 am

Post by dinoe »

No, unfortunately not (all digits are still pronounced with "(?m)^(\d+\.\s)"). I should add that the digits are at the beginning of each line in the ASCII file; since TA wraps the text, a number does not appear on every displayed line (depending on the length of the line). Thanks a lot for looking into it!
Jim Bretti
Posts: 1558
Joined: Wed Oct 29, 2003 11:07 am

Post by Jim Bretti »

It might help if you can send me a sample article to look at. From the TextAloud main menu, click File -> Save Article as Text File, and mail the text file to me at jim@nextup.com
Jim Bretti
NextUp.com
dinoe
Posts: 12
Joined: Sun Jan 29, 2006 4:02 am

Post by dinoe »

I just picked something at random and sent it to you (the attachment is an ASCII file with digits at the beginning of each line).
Jim Bretti
Posts: 1558
Joined: Wed Oct 29, 2003 11:07 am

Post by Jim Bretti »

I think I see what the problem is. The expression I gave you above is looking for a period following the number; that's my mistake.

Try using this regular expression instead:

(?m)^(\d+\s)

and set Pronounce Using to "Skip Text".

So now, we're looking for one or more digits at the start of a line, followed by space.
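The difference between the two expressions can be checked with a short Python sketch. This assumes Python's re module matches these patterns the same way TextAloud does; the sample article and the names WITH_PERIOD and SPACE_ONLY are hypothetical, and deleting the matches stands in for Skip Text.

```python
import re

# The earlier expression (expects "12. ") and the corrected one (expects "12 ").
WITH_PERIOD = re.compile(r"(?m)^(\d+\.\s)")
SPACE_ONLY = re.compile(r"(?m)^(\d+\s)")

article = "12 the quick brown fox\n13 Jumps over 2 lazy dogs\nno number here"

# No line has a period after its number, so the earlier pattern finds nothing.
print(WITH_PERIOD.findall(article))  # []

# The corrected pattern matches the leading number plus the space after it.
print(SPACE_ONLY.findall(article))   # ['12 ', '13 ']

# Deleting the matches approximates "Skip Text"; the 2 in mid-line is kept.
stripped = SPACE_ONLY.sub("", article)
print(stripped)
```

Upper and lower case words after the number are both handled, since the pattern only looks at the digits and the space that follows them.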
Jim Bretti
NextUp.com
dinoe
Posts: 12
Joined: Sun Jan 29, 2006 4:02 am

Post by dinoe »

Hi Jim, thank you, it's now working very well :-)