Removing/muting in-text references from tech document

Forum for TextAloud version 4

Moderator: Jim Bretti

gtaus
Posts: 31
Joined: Fri Sep 19, 2008 12:23 pm

Removing/muting in-text references from tech document

Post by gtaus »

I have converted a .pdf medical tech document into a .txt document so I can use TTS to make an audio file. The problem is that there are so many in-text references that the audio file does not sound natural. For example: "any text here... (Katz, 2013)". For the audio file, I really don't want to hear all those references. Almost every third sentence has an in-text reference.

Is there a way to search/replace all these in-text references and either delete them or mute them in the TTS audio file?

All references start with "("
All references contain a year, such as "2013" or "2019"
All references end with ")"

Thanks for any suggestions.
Jim Bretti
Posts: 1558
Joined: Wed Oct 29, 2003 11:07 am

Re: Removing/muting in-text references from tech document

Post by Jim Bretti »

One way to handle this would be to create a TextAloud pronunciation dictionary entry to filter the references. If all the references have the same format, you can use a regular expression to match these patterns and exclude them.

Here is how you could do it. From the TextAloud menu click Control Center -> Pronunciation Dictionary Maintenance. Create a new dictionary entry, and configure like this:

Set the Text Matching dropdown to Regular Expression. For the expression use:
\(.+?\s\d{4}.*?\)

Then set the Pronounce Using dropdown to Skip Text.

In the expression, the backslash characters are required to escape the opening and closing parenthesis characters. The pattern matches an opening parenthesis followed by 1 or more characters, a whitespace character, a 4-digit number, 0 or more characters, and a closing parenthesis.
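
If you want to check what the expression will match before adding it to the dictionary, a minimal sketch in Python can help. This runs outside TextAloud and assumes Python's regex syntax behaves the same way for this pattern. The function name strip_references and the sample sentence are made up for illustration.

```python
import re

# Same expression as the dictionary entry: "(", some text, whitespace,
# a 4-digit number, optional trailing text, ")".
REFERENCE = re.compile(r"\(.+?\s\d{4}.*?\)")


def strip_references(text: str) -> str:
    """Remove parenthetical references such as "(Katz, 2013)" from text."""
    without_refs = REFERENCE.sub("", text)
    # Removing a reference leaves a stray space before the punctuation
    # that followed it, so close that gap.
    return re.sub(r"\s+([.,;:])", r"\1", without_refs)


# Hypothetical sample text, not taken from the original document.
sample = (
    "Early treatment improves outcomes (Katz, 2013). "
    "Later studies agree (Smith & Lee, 2019, p. 4)."
)
print(strip_references(sample))
```

One thing to watch for: because ".+?" can also match a closing parenthesis, a line like "(see note) more text (Katz, 2013)" is matched from the first "(" onward, so the text in between would be skipped too. If your document has parentheses that are not references, a stricter form such as \([^()]+?\s\d{4}[^()]*?\) may be safer. I have only reasoned through that variant in Python, not tested it in TextAloud.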

Let me know if you need help. Also, there is a good reference on using regular expressions at https://www.regular-expressions.info/
Jim Bretti
NextUp.com
gtaus
Posts: 31
Joined: Fri Sep 19, 2008 12:23 pm

Re: Removing/muting in-text references from tech document

Post by gtaus »

Thank you for the regex, which works great. And thank you even more for the link to the regular expressions website with tutorials. I was not aware of the power of regex, but now I can see many possibilities for use in my TTS documents. Muting all those references makes the audio text so much easier to listen to and makes everything more enjoyable. Again, thank you.