Regular Expression to match the beginning of a line
Posted: Mon Mar 03, 2014 1:14 pm
by TurnieGC
Right now I'm trying to use the pronunciation dictionary to skip various characters used for
markdown formatting.
To give you an easy example, the following line marks a chapter
Now I tried to skip the '#' by using e.g. the following regular expression
Most of my regular expressions work fine so far. But they fail once I try to match text at the beginning of a line, i.e. when the regexp starts with a '^' sign.
Is there some way to get this to work? Or is this feature not supported by TextAloud?
Best regards,
Michael
Re: Regular Expression to match the beginning of a line
Posted: Tue Mar 04, 2014 1:10 am
by Jim Bretti
Hi Michael
By default, regular expressions in TextAloud process the text as one big string, so the symbol ^ matches only at the very beginning of the text.
You can force an expression to process text line by line using the modifier (?m). So the expression (?m)^# should match a # symbol at the beginning of a line. With the (?m) modifier, ^ matches the beginning of a line and $ matches the end of a line.
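To see the effect of (?m) outside TextAloud, here is a minimal sketch using Python's re module. Python is only a stand-in here and is not the engine TextAloud uses; the sample text and the variable names are made up for illustration.

```python
import re

# Hypothetical sample text: two heading lines plus a '#' in the middle of a line.
text = "# Chapter One\nSome text with a # inside.\n# Chapter Two\n"

# Without (?m), ^ matches only at the very start of the whole text,
# so only the first heading marker is removed.
whole_text = re.sub(r"^# ", "", text)

# With (?m), ^ matches at the start of every line,
# so both heading markers are removed. The '#' in mid-line is untouched.
per_line = re.sub(r"(?m)^# ", "", text)

print(whole_text)
print(per_line)
```

In this sketch the replacement is an empty string, which roughly corresponds to telling the pronunciation dictionary to say nothing for the match.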
Re: Regular Expression to match the beginning of a line
Posted: Wed Mar 05, 2014 2:56 am
by TurnieGC
Thanks a lot for your help.
It's working well, although I encountered another behaviour I didn't expect.
I want to suppress a couple of lines that start with a certain qualifier, skipping the rest of the line after the qualifier as well.
So I tried the following regular expression
It looks like ".*" isn't recognized. But that's no big deal; in the end the following solution did the same thing
Best regards,
Michael
Re: Regular Expression to match the beginning of a line
Posted: Wed Mar 05, 2014 10:40 am
by Jim Bretti
A possible problem with the expression (?m)^qualifier.*$ could have to do with greedy vs. ungreedy (or lazy) pattern matching. By default, pattern matches are greedy, meaning they match as much as possible. So the expression (?m)^qualifier.*$ begins matching at the first qualifier at the beginning of a line. Since the expression is greedy and there is nothing to stop the . wildcard, it matches all the way to the very end of the text. So we're matching up to the last place in the text where $ matches, rather than the first, which is what you want.
To make the matching ungreedy (or lazy), use a question mark (?) after the asterisk. So the expression to do what you want looks like this:
(?m)^qualifier.*?$
Greedy pattern matching is the default for the regular expression engine we use in TextAloud.
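The greedy vs. lazy difference described above can be reproduced with a small Python sketch. Two assumptions are worth stating loudly: Python's re module is only a stand-in for TextAloud's engine, and the (?s) flag is added so that . also matches line breaks, which is the behaviour the explanation above implies. In Python itself, . stops at a line break by default, so the greedy pattern would not run past the end of the line there. The sample text is invented.

```python
import re

# Hypothetical sample text: two lines to suppress, two lines to keep.
text = (
    "qualifier skip me\n"
    "keep this line\n"
    "qualifier skip me too\n"
    "keep this as well\n"
)

# Greedy: with . allowed to match line breaks (assumed via (?s)),
# .* runs to the last position where $ matches, i.e. the end of the text.
greedy = re.sub(r"(?ms)^qualifier.*$", "", text)

# Lazy: .*? stops at the first position where $ matches, i.e. the end of the line.
lazy = re.sub(r"(?ms)^qualifier.*?$", "", text)

# Python's own default (no (?s)): . does not cross line breaks,
# so even the greedy form stays within a single line.
python_default = re.sub(r"(?m)^qualifier.*$", "", text)

print(repr(greedy))
print(repr(lazy))
print(repr(python_default))
```

Under these assumptions, the greedy form swallows everything from the first qualifier onward, while the lazy form removes only the two qualifier lines and leaves the others in place.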