regex multi/single line mode

optoron
Newbie
Posts: 11
Joined: Sun Jan 31, 2010 2:08 pm

Post by optoron » Sun Jan 31, 2010 3:13 pm

Does regex in Macro Scheduler (MS) support multi-line/single-line mode?

Bob Hansen
Automation Wizard
Posts: 2475
Joined: Tue Sep 24, 2002 3:47 am
Location: Salem, New Hampshire, US

Post by Bob Hansen » Sun Jan 31, 2010 8:03 pm

Yes, both are supported.

If you have a string consisting of multiple lines, like first line\nsecond line (where \n indicates a line break), it is often desirable to work with lines, rather than the entire string. Therefore, you have the option to expand the meaning of both anchors. ^ can then match at the start of the string (before the f in the above string), as well as after each line break (between \n and s). Likewise, $ will still match at the end of the string (after the last e), and also before every line break (between e and \n).

You have to explicitly activate this extended functionality. It is traditionally called "multi-line mode". Modifiers are used to do that. Here are the four modes:

/i makes the regex match case insensitive.
/s enables "single-line mode". In this mode, the dot matches newlines.
/m enables "multi-line mode". In this mode, the caret matches after each newline and the dollar matches before each newline in the subject string, in addition to the start and end of the string.
/x enables "free-spacing mode". In this mode, whitespace between regex tokens is ignored, and an unescaped # starts a comment.

Examples:
(?m) turns on multi-line mode, (?-m) turns it off.
(?i-sm) turns on case insensitivity, and turns off both single-line mode and multi-line mode.
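
To make the difference between the two modes concrete, here is a small sketch. It uses Python's `re` module as a stand-in, not Macro Scheduler's RegEx> command, so it only illustrates what (?m) and (?s) mean in general; the variable names are made up for the illustration, and the defaults in Macro Scheduler may differ from Python's.

```python
import re

text = "first line\nsecond line"

# Without multi-line mode, ^ only matches at the very start of the string.
starts_default = re.findall(r"^\w+", text)

# With (?m), ^ also matches right after each line break.
starts_multiline = re.findall(r"(?m)^\w+", text)

# Without single-line mode, the dot stops at a newline.
dot_default = re.search(r"first.*", text).group()

# With (?s), the dot also matches the newline, so .* runs to the end of the string.
dot_single_line = re.search(r"(?s)first.*", text).group()

print(starts_default, starts_multiline)
print(repr(dot_default), repr(dot_single_line))
```

The two modes are independent of each other: (?m) only changes the anchors, and (?s) only changes the dot.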

-------------------
These comments above are summaries of info available at: http://www.regular-expressions.info
Hope this was helpful..................good luck,
Bob
A humble man and PROUD of it!

jpuziano
Automation Wizard
Posts: 1085
Joined: Sat Oct 30, 2004 12:00 am

Post by jpuziano » Sun Jan 31, 2010 10:56 pm

Hi optoron and Bob,

Specifically, info on those RegEx Modifiers can be found here: http://www.regular-expressions.info/refadv.html
example out-take from info at above link wrote:
(?i) Turn on case insensitivity for the remainder of the regular expression. (Older regex flavors may turn it on for the entire regex.)

Note: Most of the time, you'll probably want to place your Modifiers at the start of your RegEx expression so they all apply immediately... but I am thinking that with the modern PCRE RegEx flavor used in Macro Scheduler, we can probably flip these modifiers both on and off multiple times in different locations within a single RegEx expression if desired.

I have not tested this, but if anyone knows and can confirm or deny, please do.

Take care
jpuziano

Note: If anyone else on the planet would find the following useful...
[Open] PlayWav command that plays from embedded script data
...then please add your thoughts/support at the above post - :-)

Post by optoron » Thu Feb 04, 2010 2:57 am

Sorry for the late reply, and cheers for the info.

Is this the correct regex for taking out a single line that contains (pos) from text loaded with ReadFile?

Code: Select all

regex>(?m)^.+(pos).+$,txt,m_a,no_m,0


I keep getting the whole of the file instead! I tried it out in EditPad and that seemed to work, but not in MS.
It's not that important, as I can use pos> to test individual lines, but I was just starting with regex as it seems to be a lot quicker, especially with a 170,00 lines file. Hence this question.

jpuziano wrote:
Note: Most of the time, you'll probably want to place your Modifiers at the start of your RegEx expression so they all apply immediately... but I am thinking that with the modern PCRE RegEx flavor used in Macro Scheduler, we can probably flip these modifiers both on and off multiple times in different locations within a single RegEx expression if desired.

lol.. that's way beyond my league.. I can't even imagine how to write a regex for that!

Post by Bob Hansen » Thu Feb 04, 2010 5:19 am

Try this:

Code: Select all

regex>^.*\(pos\).*\n,%txt%,0,vMatchSet,vMatchNumber,1,,vNewTxt
Syntax: RegEx>pattern,text,easypatterns,matches_array,num_matches,replace_flag[,replace_string,replace_result]

Breakdown of syntax with sample above
RegEx>
pattern = ^.*\(pos\).*\n
text = %txt%
easypatterns = 0
matches_array = vMatchSet
num_matches = vMatchNumber
replace_flag = 1
replace_string =
replace_result = vNewTxt

This looks for ^.*\(pos\).*\n in the variable %txt%, and replaces it with nothing. The modifier is not needed.

This should remove all lines that have "(pos)" anywhere on the line. The new file will be in %vNewTxt%

--------------
NOTE: This is untested; I do not have access to Macro Scheduler right now.
===========================
optoron wrote:
that's way beyond my league..lol.

He is only bringing up the use of something like (?i) and (?-i) in multiple places in the search, toggling case sensitivity, like this:
RegEx>^(?-i)Fredxyz(?i)[F-M](?-i)RsT,text,easypatterns,matches_array,num_matches,replace_flag[,replace_string,replace_result]
I have not yet tried this in Macro Scheduler but have used it in other programs with RegEx tools. I suspect that it works fine here also, but would like to have it confirmed if someone has done this.
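
For comparison, here is how the same idea can be sketched in Python. This is only an analogy and not a test of Macro Scheduler: Python's `re` module does not accept (?i) or (?-i) toggles in the middle of a pattern, so the sketch uses a scoped group, (?i:...), to make just the [F-M] part case insensitive. The sample strings and variable names are invented for the illustration.

```python
import re

# "Fredxyz" and "RsT" stay case sensitive; only the [F-M] letter ignores case.
pattern = re.compile(r"^Fredxyz(?i:[F-M])RsT")

matches_lower_middle = pattern.search("FredxyzgRsT") is not None
rejects_lower_prefix = pattern.search("fredxyzGRsT") is None
rejects_lower_suffix = pattern.search("FredxyzGrst") is None

print(matches_lower_middle, rejects_lower_prefix, rejects_lower_suffix)
```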

Post by jpuziano » Thu Feb 04, 2010 9:47 am

Hi Bob,

I tried out your suggestion and it does not seem to work. I tried turning on multi line mode but that did not help. Here is a simple test of your regex:

Code: Select all

Let>lines_from_file=1st line%CRLF%2nd line contains (pos)%CRLF%3rd line

MDL>lines_from_file

regex>^.*\(pos\).*\n,%lines_from_file%,0,vMatchSet,vMatchNumber,1,,vNewTxt

MDL>vNewTxt
The two MDL commands display the text before and after. It should be removing only the 2nd line but it removes more.
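
One possible explanation for "removes more" is that the dot matches newlines by default in Macro Scheduler. That is an assumption, not something confirmed in this thread. The sketch below uses Python's `re` module as a stand-in and switches that behavior on with (?s) to show what the pattern would do under that assumption, compared with multi-line mode and a dot that stops at the newline.

```python
import re

text = "1st line\r\n2nd line contains (pos)\r\n3rd line"

# Assumed default: dot matches newlines and ^ only matches at the start of the string.
# The first .* can then run across line breaks, so the 1st line is removed as well.
dot_matches_newline = re.sub(r"(?s)^.*\(pos\).*\n", "", text)

# Multi-line mode with a dot that stops at \n: only the 2nd line is removed.
line_by_line = re.sub(r"(?m)^.*\(pos\).*\n", "", text)

print(repr(dot_matches_newline))
print(repr(line_by_line))
```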

Here is a different regex which seems to work...

Code: Select all

Let>lines_from_file=1st line%CRLF%2nd line contains (pos)%CRLF%3rd line

MDL>lines_from_file

regex>(?=(\r\n)).*?\(pos\).*?(?=($|\r\n)),%lines_from_file%,0,match_array,number_of_matches,1,,vNewTxt

MDL>vNewTxt
...though it does have a weakness... if (pos) is in the very first line, it will not remove it.

Open challenge - can anyone tweak the above regex so that it takes care of that as well?

Assumptions:
- Windows/DOS format i.e. every line ends with CRLF
- except perhaps the very last line, that may or may not end in CRLF but even so, if that line contains (pos), the regex should remove that line
jpuziano


Post by jpuziano » Fri Feb 05, 2010 4:05 am

Hi Bob and optoron,

Ahh... now I see... here is something that works in all cases.

This will remove any line from a multi-line text blob that contains (pos)

Code: Select all

Let>lines_from_file=1st (pos) line%CRLF%2nd line%CRLF%3rd (pos) line

MDL>lines_from_file

regex>(?m-s)^.*\(pos\).*(\r\n|$),%lines_from_file%,0,vMatchSet,vMatchNumber,1,,vNewTxt

MDL>vNewTxt
I used two modifiers:

(?m-s)
the good folks at http://www.regular-expressions.info/refadv.html mostly wrote:
(?m) Caret and dollar match after and before newlines for the remainder of the regular expression. (Older regex flavors may apply this to the entire regex.)

(?-s) Turn off "dot matches newline" for the remainder of the regular expression.

(?m-s) This combines both of the above... though they could also be used separately as (?m)(?-s)
The regex above is written assuming Windows/DOS line terminators i.e. CRLF.

To make it work with UNIX files that just use LF for a line terminator, replace \r\n with just \n in the regex expression.
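
If the same text may contain either kind of line terminator, one option is to make the \r optional. The sketch below shows that variant in Python's `re` module, as a stand-in rather than a tested Macro Scheduler pattern. The function and variable names are hypothetical, and the character class [^\r\n] is used in place of the dot because Python's dot also matches \r.

```python
import re

# (?m) lets ^ and $ work per line; \r?\n accepts both CRLF and LF terminators.
LINE_WITH_POS = re.compile(r"(?m)^[^\r\n]*\(pos\)[^\r\n]*(?:\r?\n|$)")


def remove_pos_lines(text: str) -> str:
    """Delete every line containing the literal text (pos), terminator included."""
    return LINE_WITH_POS.sub("", text)


crlf_result = remove_pos_lines("1st (pos) line\r\n2nd line\r\n3rd (pos) line")
lf_result = remove_pos_lines("1st line\n2nd (pos) line\n3rd line")

print(repr(crlf_result))
print(repr(lf_result))
```

The `|$` alternative covers a last line that contains (pos) but has no terminator.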

Let us know if this works for you optoron... and thanks for posting the question.

Take care
jpuziano


Post by optoron » Fri Feb 05, 2010 10:27 am

jpuziano wrote:Hi Bob and optoron,

Code: Select all

Let>lines_from_file=1st (pos) line%CRLF%2nd line%CRLF%3rd (pos) line

MDL>lines_from_file

regex>(?m-s)^.*\(pos\).*(\r\n|$),%lines_from_file%,0,vMatchSet,vMatchNumber,1,,vNewTxt

MDL>vNewTxt
I used two modifiers:

(?m-s)
It works!! At first I wondered whether the [dot] was causing the trouble (dot matches newline), but I couldn't find anything about it for the regex, and in the end my own attempts still did not work. I was about to give up, but thank you all for helping.

cheers jpuziano for the solution!!
Bob Hansen wrote:
RegEx>^(?-i)Fredxyz(?i)[F-M](?-i)RsT,text,easypatterns,matches_array,num_matches,replace_flag[,replace_string,replace_result]
I have not yet tried this in Macro Scheduler but have used it in other programs with RegEx tools. I suspect that it works fine here also, but would like to have it confirmed if someone has done this.
that works too.. hmm, i might be able to use that later on... nice find!!
