Capture string between two given strings

Niroj@Work
Pro Scripter
Posts: 63
Joined: Thu Dec 10, 2009 8:13 am

Post by Niroj@Work » Thu Apr 01, 2010 5:56 am

Hi,

I need to capture a string between two given strings using the RegEx command.

Let>str1=ABC
Let>str2=ZZZ
str2 may vary: it could be "|" or "ZZ" or anything else the user wants.

Now,

Let>str=sshshcg ABC hghdg hgs 123 ZZZ hgshd gghgh hg ZZZ gchh

In the above string str, "ABC" is unique but "ZZZ" can occur more than once. I want to extract the data between "ABC" and the first occurrence of "ZZZ", i.e. "hghdg hgs 123" here.

Any ideas?

Post by Niroj@Work » Thu Apr 01, 2010 6:14 am

RegEx>%str1%(.+?)%str2%,%str%,0,,,1,$1,Val

Mdl>Val

Why is this not working?

jpuziano
Automation Wizard
Posts: 1085
Joined: Sat Oct 30, 2004 12:00 am

Post by jpuziano » Thu Apr 01, 2010 4:57 pm

Hi Niroj@Work,

If you want a purely regex solution, this can be done with lookahead and lookbehind... there's some info in this post:

RegEx> example using Lookahead and Lookbehind

Try this:

Code:

Let>str1=ABC
Let>str2=ZZZ
Let>str=sshshcg ABC hghdg hgs 123 ZZZ hgshd gghgh hg ZZZ gchh
Let>pattern=(?<=%str1% ).+?(?= %str2%)
RegEx>pattern,%str%,0,match_array,num,0
MDL>match_array_1

The two spaces in the pattern are there so that the final result in match_array_1 does not have leading and trailing spaces.

If you want those leading and trailing spaces in the result string, just remove those two spaces from the pattern.
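
For anyone comparing against another regex engine, here is a minimal sketch of the same lookbehind/lookahead idea in Python rather than Macro Scheduler. The helper name `between` is hypothetical. It escapes both delimiters so that a delimiter such as "|" is matched literally, which is an assumption about what the original poster needs when str2 can be any text.

```python
import re

def between(text, start, end):
    """Return the text between `start` and the first following `end`, or None."""
    # re.escape makes delimiters such as "|" match literally instead of as regex syntax.
    pattern = "(?<=" + re.escape(start) + ")(.+?)(?=" + re.escape(end) + ")"
    match = re.search(pattern, text, flags=re.DOTALL)
    # strip() drops the leading and trailing spaces around the captured text.
    return match.group(1).strip() if match else None

s = "sshshcg ABC hghdg hgs 123 ZZZ hgshd gghgh hg ZZZ gchh"
print(between(s, "ABC", "ZZZ"))  # hghdg hgs 123
```

The lazy quantifier `.+?` is what stops the match at the first occurrence of the end delimiter rather than the last.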

Does this work for you?

Take care
jpuziano

Note: If anyone else on the planet would find the following useful...
[Open] PlayWav command that plays from embedded script data
...then please add your thoughts/support at the above post - :-)

Post by Niroj@Work » Tue Apr 06, 2010 8:21 am

Thanks buddy! This is quite useful

Post by Niroj@Work » Tue Apr 06, 2010 9:11 am

I have a different scenario where I need to replace the newline characters that fall inside specific groups.

For Example:

Let>str=abcd %CRLF% hhjh jhj {select hgh %CRLF% hh=!3j h %CRLF% jh ? }jdjk jdkj jkj %CRLF% {nb bb hjdh %CRLF%%CRLF% nb jkj}%CRLF%

I need to replace all the newline characters between the curly braces {...} with null.

So required is:
Let>str=abcd %CRLF% hhjh jhj {select hgh hh=!3j h jh ? }jdjk jdkj jkj %CRLF% {nb bb hjdh nb jkj}%CRLF%

Could you please help here?

Bad coding

Post by Niroj@Work » Tue Apr 06, 2010 10:18 am

I tried it the way shown below, which re-adds the cleaned sections at the end of the string, but I am not able to put them back at their original positions. Whenever I replace {...} with _REPLACE_ME_ and later, in the loop, try to replace the first occurrence, I get an absurd error.
======================
Program to re-add the cleaned sections at the end:
================================
Let>str=abcd %CRLF% hhjh jhj {select hgh %CRLF% hh=!3j h %CRLF% jh ? }jdjk jdkj jkj %CRLF% {nb bb hjdh %CRLF%%CRLF% nb jkj}%CRLF%


Let>pattern=\{.+?\}
RegEx>pattern,%str%,0,match_array,num,1,,str

Let>i=1

Repeat>i
RegEx>%CRLF%,match_array_%i%,0,,,1,,q
Let>str=%str% %q%
Add>i,1
Until>i>num

mdl>str


======================
Program to replace the {..%CRLF%...} at the same position (this one gives the error):
================================
Let>str=abcd %CRLF% hhjh jhj {select hgh %CRLF% hh=!3j h %CRLF% jh ? }jdjk jdkj jkj %CRLF% {nb bb hjdh %CRLF%%CRLF% nb jkj}%CRLF%


Let>pattern=\{.+?\}
RegEx>pattern,%str%,0,match_array,num,1,_REPLACE_ME_,str

Let>i=1

Repeat>i
RegEx>%CRLF%,match_array_%i%,0,,,1,,q
RegEx>(_REPLACE_ME_).+?,str,0,,,1,%q%,str
Add>i,1
Until>i>num

mdl>str

Post by jpuziano » Tue Apr 06, 2010 10:49 pm

Interesting... I tried the following:

Code:

Let>str=abcd %CRLF% hhjh jhj {select hgh %CRLF% hh=!3j h %CRLF% jh ? }jdjk jdkj jkj %CRLF% {nb bb hjdh %CRLF%%CRLF% nb jkj}%CRLF%

Let>pattern=(?<={.+?)\r\n(?=.+?})

RegEx>pattern,str,0,match_array,num,1,~,new_string

MDL>new_string

However it fails with the error: Regular Expression pattern not compiled.

I'm not sure why though, maybe Marcus can shed some light.

Even if the above error could be eliminated though, I am not sure if this would catch and change multiple occurrences of %CRLF% within { and } chars... it might only catch the first or last occurrence within any { and } pair.

You can do this brute force with procedural code of course... but I am thinking you're looking for a regex solution here.

Marcus or anyone else, please jump in.
jpuziano


Bob Hansen
Automation Wizard
Posts: 2475
Joined: Tue Sep 24, 2002 3:47 am
Location: Salem, New Hampshire, US

Post by Bob Hansen » Wed Apr 07, 2010 12:17 am

No access to Macro Scheduler right now, but does this work?

Let>pattern={(.*)%CRLF%(.*)%CRLF%(.*[^}])}

Replace with: $1$2$3
Hope this was helpful..................good luck,
Bob
A humble man and PROUD of it!

Niroj@Work
Pro Scripter
Posts: 63
Joined: Thu Dec 10, 2009 8:13 am

Post by Niroj@Work » Wed Apr 07, 2010 4:18 am

It will not work, as %CRLF% may occur any number of times.

Post by jpuziano » Wed Apr 07, 2010 4:47 am

Hi Bob,

Thanks for that, it got me on the right track.

Niroj@Work - try the following:

Code:

Let>str=abcd %CRLF% hhjh jhj {select hgh %CRLF% hh=!3j h %CRLF% jh ? }jdjk jdkj jkj %CRLF% {nb bb hjdh %CRLF%%CRLF% nb jkj}%CRLF%

Let>pattern=({.*?)\r\n(.*?)\r\n(.*?})

RegEx>pattern,str,0,match_array,num,1,$1~$2~$3,new_string

MDL>new_string

I chose ~ as my replacement char instead of null so that the MDL> line can show the result to prove that it is working.

To replace with a null char instead, change the RegEx line to this:

RegEx>pattern,str,0,match_array,num,1,$1%NULLCHAR%$2%NULLCHAR%$3,new_string

Note, though, that MDL will then not be able to show you the whole result string: it must interpret the first null char it hits as the end of the string and stop there, as with a null-terminated string.

Does this work for you?
jpuziano


Post by jpuziano » Wed Apr 07, 2010 4:51 am

Niroj@Work wrote: It will not work, as %CRLF% may occur any number of times.
I just saw this. If you're saying that %CRLF% may appear any number of times inside a set of braces { } then we'll need a different approach.

This one will be challenging to do with only RegEx... any other ideas anyone?
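
One way around the fixed number of capture groups, sketched here in Python rather than Macro Scheduler, is to match each whole {...} section and clean it inside a replacement callback. Nothing in this thread establishes that the RegEx> command offers an equivalent callback, so treat this only as an illustration of the idea. The name `strip_newlines_in_braces` is hypothetical, and the sketch assumes braces are never nested.

```python
import re

def strip_newlines_in_braces(text, replacement=""):
    """Replace every CRLF found inside a {...} section, leaving the rest untouched."""
    # [^{}]* keeps each match inside a single pair of braces (no nesting assumed).
    return re.sub(
        r"\{[^{}]*\}",
        lambda m: m.group(0).replace("\r\n", replacement),
        text,
    )

s = "abcd \r\n x {select \r\n a \r\n b ? }mid \r\n {nb \r\n\r\n jkj}\r\n"
print(repr(strip_newlines_in_braces(s, "~")))
```

Because the callback works on the whole matched section, it does not matter how many newlines a section contains.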
jpuziano


Post by Niroj@Work » Wed Apr 07, 2010 5:23 am

There is some problem with the RegEx command itself.

A few days back I raised an issue with RegEx when matching a string containing curly braces.
http://www.mjtnet.com/forum/viewtopic.p ... highlight=

As curly braces are used for complex expressions in Macro Scheduler, they are creating a problem.


Please run the below code:
=======================
Let>str=abcd %CRLF% hhjh jhj {select hgh %CRLF% hh=!3j h %CRLF% jh ? }jdjk jdkj jkj %CRLF% {nb bb hjdh %CRLF%%CRLF% nb jkj}%CRLF%


Let>pattern=\{.+?\}
RegEx>pattern,%str%,0,match_array,num,1,_REPLACE_ME_,str

Let>i=1

Repeat>i
RegEx>%CRLF%,match_array_%i%,0,,,1,,q
RegEx>(_REPLACE_ME_).+?,str,0,,,1,%q%,str
Add>i,1
Until>i>num

mdl>str
==============

The errors I am getting are "Unknown Identifier SELECT" and "Unknown Identifier NB". These silly errors occur at the "{" position.

Post by jpuziano » Wed Apr 07, 2010 5:51 am

Hi Niroj@Work,

There is something coming in Macro Scheduler version 12 that might address the problem that { and } chars may be causing due to their use in Complex Expressions. I'll leave Marcus to explain more about that...

In the meantime, I put together a non-RegEx solution that can handle any number of newlines within any { braces } section:

Code:

Let>str=abcd %CRLF% hhjh jhj {select hgh hh=!3j h %CRLF% jh ? }jdjk jdkj jkj %CRLF% {nb bb hjdh %CRLF%%CRLF% nb jk %CRLF% j}%CRLF%
Length>str,str_len
Let>processed_str=
Let>char_located_inside_braces=NO
Let>char_index=0

Repeat>char_index
  Let>char_index=char_index+1
  MidStr>str,char_index,1,char
  If>char_located_inside_braces=NO
    If>char<>{
      ConCat>processed_str,char
    Else
      Let>char_located_inside_braces=YES
      //initialize this within-braces sub string
      Let>braces_section=
      ConCat>braces_section,char
    EndIf
  Else
    //code for when char_located_inside_braces=YES
    ConCat>braces_section,char
    If>char=}
      Let>char_located_inside_braces=NO
      //clean this within-braces sub string
      StringReplace>braces_section,%CRLF%,~,braces_section
      ConCat>processed_str,braces_section
    EndIf
  EndIf
Until>char_index=str_len

MDL>processed_str

Again, I chose ~ as my replacement char instead of null so that the MDL> line can show the result to prove that it is working.

To replace with a null char instead, change the StringReplace line to this:

StringReplace>braces_section,%CRLF%,%NULLCHAR%,braces_section
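
For comparison, the same inside/outside-braces scan can be sketched in Python. This is not a translation of the script above: it replaces each CRLF as it is encountered instead of buffering the whole braces section, so an unclosed "{" is handled differently. The function name `clean_braces` is hypothetical.

```python
def clean_braces(text, replacement="~"):
    """Scan once, replacing CRLF only while inside a {...} section."""
    out = []
    inside = False
    i = 0
    while i < len(text):
        # A CRLF is two characters, so consume both when replacing it.
        if inside and text.startswith("\r\n", i):
            out.append(replacement)
            i += 2
            continue
        ch = text[i]
        if ch == "{":
            inside = True
        elif ch == "}":
            inside = False
        out.append(ch)
        i += 1
    return "".join(out)

print(repr(clean_braces("a\r\n{b\r\nc\r\n\r\nd}e\r\n")))
```

Passing an empty string as `replacement` removes the newlines instead of marking them.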

So... does this do the job for you?
jpuziano


Post by Niroj@Work » Wed Apr 07, 2010 6:06 am

jpuziano wrote:
Niroj@Work wrote: It will not work, as %CRLF% may occur any number of times.
I just saw this. If you're saying that %CRLF% may appear any number of times inside a set of braces { } then we'll need a different approach.

This one will be challenging to do with only RegEx... any other ideas anyone?
As I mentioned, I am trying it the following way:
=============================
1) First I replace each "{.....}" anywhere in the line with "_REPLACE_ME_", capturing the strings within "{...}" in match_array.

Let>pattern=\{.+?\}
RegEx>pattern,%str%,0,match_array,num,1,_REPLACE_ME_,str

2) Then I loop through match_array and replace the %CRLF% in each string with NULL.
RegEx>%CRLF%,match_array_%i%,0,,,1,,q

3) After that I try to replace the first occurrence of "_REPLACE_ME_" in %str% with %q% (but I don't know how to do that).
I tried this:
Let>i=1
Repeat>i
RegEx>%CRLF%,match_array_%i%,0,,,1,,q
//Please try to replace _REPLACE_ME_ 1st occurrence.
RegEx>(_REPLACE_ME_).+?,str,0,,,1,%q%,str
//
Add>i,1
Until>i>num


By doing so I get absurd error messages like "Unknown Identifier SELECT" and "Unknown Identifier NB", as I already discussed (the "{" issue).

Could you please try it my way once, where for each value of match_array I replace the first occurrence of "_REPLACE_ME_" in %str% with %q%?
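
The three steps described above can be sketched as follows in Python, purely to show the logic; it is not a fix for the Macro Scheduler "Unknown Identifier" error. The placeholder text comes from the post, while the function name and sample data are assumptions. The put-back replaces only one placeholder per iteration and inserts the cleaned text literally. By contrast, the pattern in the post, (_REPLACE_ME_).+?, appears to also consume at least one character after the placeholder.

```python
import re

PLACEHOLDER = "_REPLACE_ME_"

def clean_via_placeholder(text):
    # Step 1: collect every {...} section, then swap each one for a placeholder.
    sections = re.findall(r"\{.+?\}", text, flags=re.DOTALL)
    text = re.sub(r"\{.+?\}", PLACEHOLDER, text, flags=re.DOTALL)
    for section in sections:
        # Step 2: remove the CRLFs from this section.
        cleaned = section.replace("\r\n", "")
        # Step 3: put it back at the first remaining placeholder only (count=1).
        text = text.replace(PLACEHOLDER, cleaned, 1)
    return text

print(repr(clean_via_placeholder("a\r\n{b\r\nc}d\r\n{e\r\n\r\nf}\r\n")))
```

This assumes the input never contains the placeholder text itself; a real script would need a marker that cannot occur in the data.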

Marcus Tettmar
Site Admin
Posts: 7395
Joined: Thu Sep 19, 2002 3:00 pm
Location: Dorset, UK

Post by Marcus Tettmar » Wed Apr 07, 2010 8:05 am

jpuziano wrote:Interesting... I tried the following:

Code:

Let>str=abcd %CRLF% hhjh jhj {select hgh %CRLF% hh=!3j h %CRLF% jh ? }jdjk jdkj jkj %CRLF% {nb bb hjdh %CRLF%%CRLF% nb jkj}%CRLF%

Let>pattern=(?<={.+?)\r\n(?=.+?})

RegEx>pattern,str,0,match_array,num,1,~,new_string

MDL>new_string

However it fails with the error: Regular Expression pattern not compiled.

I'm not sure why though, maybe Marcus can shed some light.

Even if the above error could be eliminated though, I am not sure if this would catch and change multiple occurrences of %CRLF% within { and } chars... it might only catch the first or last occurrence within any { and } pair.

You can do this brute force with procedural code of course... but I am thinking you're looking for a regex solution here.

Marcus or anyone else, please jump in.

Seems to only happen if the replace flag is set. If not replacing there is no error. I'm not sure why. I have put it on the to do list and will investigate.
Marcus Tettmar
http://mjtnet.com/blog/ | http://twitter.com/marcustettmar
