regex multi/single line mode
Moderators: Dorian (MJT support), JRL
does regex in ms support multi/single line mode.. ??
- Bob Hansen
- Automation Wizard
- Posts: 2475
- Joined: Tue Sep 24, 2002 3:47 am
- Location: Salem, New Hampshire, US
Yes, both are supported.
If you have a string consisting of multiple lines, like first line\nsecond line (where \n indicates a line break), it is often desirable to work with lines, rather than the entire string. Therefore, you have the option to expand the meaning of both anchors. ^ can then match at the start of the string (before the f in the above string), as well as after each line break (between \n and s). Likewise, $ will still match at the end of the string (after the last e), and also before every line break (between e and \n).
You have to explicitly activate this extended functionality. It is traditionally called "multi-line mode". Modifiers are used to do that; here are the four modes:
/i makes the regex match case insensitive.
/s enables "single-line mode". In this mode, the dot matches newlines.
/m enables "multi-line mode". In this mode, the caret and dollar match before and after newlines in the subject string.
/x enables "free-spacing mode". In this mode, whitespace between regex tokens is ignored, and an unescaped # starts a comment.
Examples:
(?m) turns on multi-line mode, (?-m) turns it off.
(?i-sm) turns on case insensitivity, and turns off both single-line mode and multi-line mode.
-------------------
These comments above are summaries of info available at: http://www.regular-expressions.info
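For anyone who wants to see the two modes side by side, here is a minimal sketch. It uses Python's re module rather than Macro Scheduler (an assumption made only so the snippet can be run anywhere); the (?m) and (?s) modifiers are spelled the same way there, but Macro Scheduler's PCRE defaults may differ.

```python
import re

text = "first line\nsecond line"

# Without (?m), ^ anchors only at the very start of the string.
default_starts = re.findall(r"^\w+", text)

# With (?m), ^ also matches right after each line break.
multiline_starts = re.findall(r"(?m)^\w+", text)

# With (?m), $ also matches right before each line break.
multiline_ends = re.findall(r"(?m)\w+$", text)

# Without (?s), the dot stops at the line break.
dot_default = re.match(r".*", text).group()

# With (?s), the dot matches the line break as well.
dot_single_line = re.match(r"(?s).*", text).group()

print(default_starts)         # ['first']
print(multiline_starts)       # ['first', 'second']
print(multiline_ends)         # ['line', 'line']
print(repr(dot_default))      # 'first line'
print(repr(dot_single_line))  # 'first line\nsecond line'
```

Note that "single-line mode" only changes the dot, and "multi-line mode" only changes ^ and $, so the two can be combined freely.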
Hope this was helpful..................good luck,
Bob
A humble man and PROUD of it!
Hi optoron and Bob,
Specifically, info on those RegEx Modifiers can be found here: http://www.regular-expressions.info/refadv.html
example out-take from info at above link wrote:
(?i) Turn on case insensitivity for the remainder of the regular expression. (Older regex flavors may turn it on for the entire regex.)
Note: Most of the time, you'll probably want to place your Modifiers at the start of your RegEx expression so they all apply immediately... but I am thinking that with the modern PCRE RegEx flavor used in Macro Scheduler, we can probably flip these modifiers both on and off multiple times in different locations within a single RegEx expression if desired.
I have not tested this, but if anyone knows and can confirm or deny, please do.
Take care
jpuziano
Note: If anyone else on the planet would find the following useful...
[Open] PlayWav command that plays from embedded script data
...then please add your thoughts/support at the above post -
sorry for the late reply.
and cheers for the info..
is this the correct regex for taking out a single line that contains (pos)
within a readfile??
Code:
regex>(?m)^.+(pos).+$,txt,m_a,no_m,0
i keep getting the whole of the file instead!!?? i tried it out in editpad and that seemed to work.. but not in ms??
it's not that important as i can use pos> to test individual lines.. but i was just starting in regex as it seems to be a lot quicker.. especially with a 170,00 lines file... hence this question..
jpuziano wrote:
Note: Most of the time, you'll probably want to place your Modifiers at the start of your RegEx expression so they all apply immediately... but I am thinking that with the modern PCRE RegEx flavor used in Macro Scheduler, we can probably flip these modifiers both on and off multiple times in different locations within a single RegEx expression if desired.
lol.. that's way beyond my league.. lol.. can't even imagine how to write a regex for that..!!
- Bob Hansen
Try this:
Syntax: RegEx>pattern,text,easypatterns,matches_array,num_matches,replace_flag[,replace_string,replace_result]
Code:
regex>^.*\(pos\).*\n,%txt%,0,vMatchSet,vMatchNumber,1,,vNewTxt
Breakdown of syntax with sample above
RegEx>
pattern = ^.*\(pos\).*\n
text = %txt%
easypatterns = 0
matches_array = vMatchSet
num_matches = vMatchNumber
replace_flag = 1
replace_string =
replace_result = vNewTxt
This looks for ^.*\(pos\).*\n in the variable %txt%, and replaces it with nothing. The modifier is not needed.
This should remove all lines that have "(pos)" anywhere on the line. The new text will be in %vNewTxt%.
--------------
NOTE: This is untested; I do not have access to Macro Scheduler right now.
===========================
"that's way beyond my league..lol."
He is only bringing up the use of (?i) and (?-i) in multiple places in the search, toggling case sensitivity, like this:
RegEx>^(?-i)Fredxyz(?i)[F-M](?-i)RsT,text,easypatterns,matches_array,num_matches,replace_flag[,replace_string,replace_result]
I have not yet tried this in Macro Scheduler but have used it in other programs with RegEx tools. I suspect that it works fine here also, but would like to have it confirmed if someone has done this.
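For comparison, here is a rough sketch of the same idea in Python's re module. This is not Macro Scheduler code and confirms nothing about its PCRE build. As far as I know, Python does not accept a bare (?-i) in the middle of a pattern, so the sketch swaps in the scoped form (?i:...), which limits case insensitivity to one group. The sample strings are made up for illustration.

```python
import re

# "Fredxyz" and "RsT" are case sensitive; only the single letter
# in the range F to M is matched without regard to case.
pattern = re.compile(r"^Fredxyz(?i:[F-M])RsT")

candidates = ["FredxyzGRsT", "FredxyzgRsT", "fredxyzGRsT", "FredxyzGrst"]

# Map each candidate to whether the pattern matches it.
results = {s: pattern.search(s) is not None for s in candidates}

for candidate, matched in results.items():
    print(candidate, matched)
# FredxyzGRsT True
# FredxyzgRsT True
# fredxyzGRsT False
# FredxyzGrst False
```

The scoped form has the side benefit that the modifier cannot leak into the rest of the pattern by accident.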
Hope this was helpful..................good luck,
Bob
A humble man and PROUD of it!
Hi Bob,
I tried out your suggestion and it does not seem to work. I tried turning on multi line mode but that did not help. Here is a simple test of your regex:
Code:
Let>lines_from_file=1st line%CRLF%2nd line contains (pos)%CRLF%3rd line
MDL>lines_from_file
regex>^.*\(pos\).*\n,%lines_from_file%,0,vMatchSet,vMatchNumber,1,,vNewTxt
MDL>vNewTxt
The two MDL commands display the text before and after. It should be removing only the 2nd line but it removes more.
Here is a different regex which seems to work...
Code:
Let>lines_from_file=1st line%CRLF%2nd line contains (pos)%CRLF%3rd line
MDL>lines_from_file
regex>(?=(\r\n)).*?\(pos\).*?(?=($|\r\n)),%lines_from_file%,0,match_array,number_of_matches,1,,vNewTxt
MDL>vNewTxt
...though it does have a weakness... if (pos) is in the very first line, it will not remove it.
Open challenge - can anyone tweak the above regex so that it takes care of that as well?
Assumptions:
- Windows/DOS format i.e. every line ends with CRLF
- except perhaps the very last line, that may or may not end in CRLF but even so, if that line contains (pos), the regex should remove that line
jpuziano
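One possible explanation for "it removes more", sketched in Python's re module rather than Macro Scheduler: if the engine's default lets the dot match line breaks and anchors ^ only at the start of the text (an assumption, not something confirmed for Macro Scheduler), the greedy .* runs across lines. The (?s) and (?m) prefixes below are added only to simulate the two behaviours.

```python
import re

text = "1st line\r\n2nd line contains (pos)\r\n3rd line"
pattern_body = r"^.*\(pos\).*\n"

# Assumed default: dot matches line breaks, ^ anchors only at the text start.
# The first .* swallows the 1st line too, so two lines disappear.
removed_too_much = re.sub("(?s)" + pattern_body, "", text)

# Line-oriented variant: ^ matches at every line start, dot stops at \n.
removed_one_line = re.sub("(?m)" + pattern_body, "", text)

print(repr(removed_too_much))  # '3rd line'
print(repr(removed_one_line))  # '1st line\r\n3rd line'
```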
Hi Bob and optoron,
Ahh... now I see... here is something that works in all cases.
This will remove any line from a multi-line text blob that contains (pos)
Code:
Let>lines_from_file=1st (pos) line%CRLF%2nd line%CRLF%3rd (pos) line
MDL>lines_from_file
regex>(?m-s)^.*\(pos\).*(\r\n|$),%lines_from_file%,0,vMatchSet,vMatchNumber,1,,vNewTxt
MDL>vNewTxt
I used two modifiers:
(?m-s)
the good folks at http://www.regular-expressions.info/refadv.html mostly wrote:
(?m) Caret and dollar match after and before newlines for the remainder of the regular expression. (Older regex flavors may apply this to the entire regex.)
(?-s) Turn off "dot matches newline" for the remainder of the regular expression.
(?m-s) This combines both of the above... though they could also be used separately as (?m)(?-s)
The regex above is written assuming Windows/DOS line terminators i.e. CRLF.
To make it work with UNIX files that just use LF for a line terminator, replace \r\n with just \n in the regex expression.
Let us know if this works for you optoron... and thanks for posting the question.
Take care
jpuziano
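Here is a hedged Python sketch of the same line-removal idea, for anyone who wants to try it outside Macro Scheduler. It is not a translation of the (?m-s) pattern above: Python's dot already stops at \n by default but still matches \r, so the sketch uses [^\r\n]* in place of .* and \r?\n so that both CRLF and LF files are handled. The function name remove_lines_containing is made up for this example.

```python
import re

def remove_lines_containing(text: str, needle: str) -> str:
    """Remove every line of text that contains the literal needle."""
    pattern = (
        r"(?m)^[^\r\n]*"      # start of a line, then anything on that line
        + re.escape(needle)   # the literal text to look for, e.g. "(pos)"
        + r"[^\r\n]*"         # the rest of the same line
        + r"(?:\r?\n|$)"      # the line terminator, or the end of the text
    )
    return re.sub(pattern, "", text)

crlf_sample = "1st (pos) line\r\n2nd line\r\n3rd (pos) line"
lf_sample = "a (pos)\nb\n(pos) c"

cleaned_crlf = remove_lines_containing(crlf_sample, "(pos)")
cleaned_lf = remove_lines_containing(lf_sample, "(pos)")

print(repr(cleaned_crlf))  # '2nd line\r\n'
print(repr(cleaned_lf))    # 'b\n'
```

Both samples put (pos) on the first line and on an unterminated last line, which are the two edge cases from the open challenge.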
jpuziano wrote:
Hi Bob and optoron,
Code:
Let>lines_from_file=1st (pos) line%CRLF%2nd line%CRLF%3rd (pos) line
MDL>lines_from_file
regex>(?m-s)^.*\(pos\).*(\r\n|$),%lines_from_file%,0,vMatchSet,vMatchNumber,1,,vNewTxt
MDL>vNewTxt
I used two modifiers:
(?m-s)
it works!! at 1st i thought it could be the [dot] that's causing the trouble (dot matches new line), but i couldn't seem to find anything on it for the regex.. in the end, it still did not work. i was about to give up.. but thank you all for helping..
cheers jpuziano for the solution!!
Bob Hansen wrote:
RegEx>^(?-i)Fredxyz(?i)[F-M](?-i)RsT,text,easypatterns,matches_array,num_matches,replace_flag[,replace_string,replace_result]
I have not yet tried this in Macro Scheduler but have used it in other programs with RegEx tools. I suspect that it works fine here also, but would like to have it confirmed if someone has done this.
that works too.. hmm, i might be able to use that later on... nice find!!