Help with retrieving any email addresses from a text file

Hints, tips and tricks for newbies

Moderators: Dorian (MJT support), JRL

Rain
Automation Wizard
Posts: 550
Joined: Tue Aug 09, 2005 5:02 pm

Post by Rain » Sun Dec 09, 2007 5:40 pm

I'm sure this has been discussed in the past, but I just can't find any examples. :oops:

I'm working on a script that retrieves emails from my server, extracts the sender's email address, and replies using the extracted address. The rest of the script works perfectly; all I need now is to extract the sender's email address. I can read each line of the text file in Macro Scheduler, but I'm stumped when it comes to retrieving only the email address. Note: the email addresses are never the same.

This is the line I want to retrieve the email address from:
From: "senders_name@senders_server.com" <senders_name@senders_server.com>

I would appreciate it if someone could post an example or point me to a discussion that already covers this. I don't need help with retrieving or sending emails, only with the email extraction part.

Thank you in advance.

Bob Hansen
Automation Wizard
Posts: 2475
Joined: Tue Sep 24, 2002 3:47 am
Location: Salem, New Hampshire, US

Post by Bob Hansen » Sun Dec 09, 2007 7:44 pm

You need to use VBScript's Regular Expressions (RegEx) to extract the email address.

VBScript has RegEx tools. No time for a sample now, but search this forum for RegEx examples.

----------------------------
If not RegEx, you can also use Position and MidStr functions.
Hope this was helpful..................good luck,
Bob
A humble man and PROUD of it!

Me_again
Automation Wizard
Posts: 1101
Joined: Fri Jan 07, 2005 5:55 pm
Location: Somewhere else on the planet

Post by Me_again » Mon Dec 10, 2007 2:02 am

You always want the email address between the <>? Here's the simple way using Pos> and MidStr>:

Code:

Let>mystring=From: "senders_name@senders_server.com" <senders_name@senders_server.com>
Pos><,mystring,1,emstart
Pos>>,mystring,1,emstop
Let>emstart=emstart+1
Let>emlen=emstop-emstart
MidStr>mystring,emstart,emlen,emaddr
MDL>emaddr
Get the line with ReadLn> or whatever.

Rain
Automation Wizard
Posts: 550
Joined: Tue Aug 09, 2005 5:02 pm

Post by Rain » Mon Dec 10, 2007 1:16 pm

Bob Hansen wrote:You need to use VBScript's Regular Expressions (RegEx) to extract the email address.

VBScript has RegEx tools. No time for sample now, but search this forum for RegEx examples.

----------------------------
If not RegEx, you can also use Position and MidStr functions.
Thanks Bob

I found an example at http://www.mjtnet.com/forum/viewtopic.php?p=16467#16467

I keep getting this error:
"Microsoft VBScript runtime error :5

Invalid procedure call or argument: 'Mid'

line 12, Column 2 "

I followed all the instructions posted in that thread, but to no avail. There was never really a solution for the VBScript error.

Here is my code. I have http://www.google.com on line 5 of MSG1.txt to test the URL search script.

Code:

//A VBScript Function to search a string for a regex pattern
//returns a list of matches separated by semicolons
VBSTART
Function regExSearch(patrn,str)
  Set regEx = New RegExp ' Create regular expression.
  regEx.Pattern = patrn ' Set pattern.
  regEx.IgnoreCase = True ' Make case insensitive. Default=False
  Set matches = RegEx.Execute(str)
  List = ""
  For each match in matches
  	 List = List & match.value & ";"
  Next
  regExSearch = Mid(List,1,Len(List)-1)
End Function
VBEND

//Read the file contents into a variable
ReadFile>C:\RetrievedMail\MSG1.txt,FileData

//replace CRLF chars with VBScript equivalents
StringReplace>FileData,CR," & vbCR & ",FileData
StringReplace>FileData,LF," & vbLF & ",FileData
//Double quote any quotes for VBScript
StringReplace>FileData,","",FileData

//Perform the regex search
VBEval>regExSearch("REGEX_PATTERN","%FileData%"),URLList

//We now have a semicolon delimited list of URLs.  We could explode this into an array:
Separate>URLList,;,URLS
If>URLS_COUNT>0
  Let>k=1
  Repeat>k
    Let>ThisURL=URLS_%k%
    MessageModal>ThisURL
	//we could write it to a file:
	WriteLn>C:\RetrievedMail\result.txt,result,ThisURL
    Let>k=k+1
  Until>k,URLS_COUNT
Endif
I would also like to know how to use this pattern in place of REGEX_PATTERN in VBEval>regExSearch("REGEX_PATTERN","%FileData%"),URLList so that it searches the text file for email addresses instead of URLs: ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$

Thanks

Bob Hansen
Automation Wizard
Posts: 2475
Joined: Tue Sep 24, 2002 3:47 am
Location: Salem, New Hampshire, US

Post by Bob Hansen » Tue Dec 11, 2007 4:35 am

I had no time to look at the code you provided, but I took a quick look at your final question. This also assumes your RegEx is correct.

I think that these UNTESTED lines should work for you.

Let>Pattern=^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$
VBEval>regExSearch("%Pattern%","%FileData%"),URLList

Replace the URLList and FileData with the appropriate values.

Rain
Automation Wizard
Posts: 550
Joined: Tue Aug 09, 2005 5:02 pm

Post by Rain » Fri Dec 14, 2007 1:34 pm

Thanks for taking the time to reply, Bob.

I'll let you know what I come up with.

Marcus Tettmar
Site Admin
Posts: 7380
Joined: Thu Sep 19, 2002 3:00 pm
Location: Dorset, UK

Post by Marcus Tettmar » Fri Dec 14, 2007 1:53 pm

Here's a working example:

Code:

//A VBScript Function to search a string for a regex pattern
//returns a list of matches separated by semicolons
VBSTART
Function regExSearch(patrn,str)
  Set regEx = New RegExp ' Create regular expression.
  regEx.Pattern = patrn ' Set pattern.
  regEx.IgnoreCase = True ' Make case insensitive. Default=False
  Set matches = RegEx.Execute(str)
  List = ""
  For each match in matches
  	 List = List & match.value & ";"
  Next
  If List <> "" Then
    regExSearch = Mid(List,1,Len(List)-1)
  end if
End Function
VBEND

//test data:
Let>Line=From: "senders_name@senders_server.com" <senders_name@senders_server.com>

//replace CRLF chars with VBScript equivalents
StringReplace>Line,CR," & vbCR & ",Line
StringReplace>Line,LF," & vbLF & ",Line
//Double quote any quotes for VBScript
StringReplace>Line,","",Line

//Perform the regex search
Let>Pattern=[_a-zA-Z\d\-\.]+@[_a-zA-Z\d\-]+(\.[_a-zA-Z\d\-]+)+
VBEval>regExSearch("%Pattern%","%Line%"),Email
MessageModal>Email
The benefit of the regular expression is that it can be set up to cope with whatever the format is. However, if the format is always the same, you may as well use Me_again's simpler solution above.
Marcus Tettmar
http://mjtnet.com/blog/ | http://twitter.com/marcustettmar

Did you know we are now offering affordable monthly subscriptions for Macro Scheduler Standard?

Marcus Tettmar
Site Admin
Posts: 7380
Joined: Thu Sep 19, 2002 3:00 pm
Location: Dorset, UK

Post by Marcus Tettmar » Fri Dec 14, 2007 2:02 pm

And this one sets the Global property to true and will therefore return ALL emails in the string, and could therefore be used to return all email addresses found in your mail file:

Code:

//A VBScript Function to search a string for a regex pattern
//returns a list of matches separated by semicolons
VBSTART
Function regExSearch(patrn,str)
  Set regEx = New RegExp ' Create regular expression.
  regEx.Pattern = patrn ' Set pattern.
  regEx.Global = True
  regEx.IgnoreCase = True ' Make case insensitive. Default=False
  Set matches = RegEx.Execute(str)
  List = ""
  For each match in matches
  	 List = List & match.value & ";"
  Next
  If List <> "" Then
    regExSearch = Mid(List,1,Len(List)-1)
  end if
End Function
VBEND

//test data:
//Let>FileData=From: "senders_name@senders_server.com" <senders_name@senders_server.com>
ReadFile>C:\RetrievedMail\MSG1.txt,FileData

//replace CRLF chars with VBScript equivalents
StringReplace>FileData,CR," & vbCR & ",FileData
StringReplace>FileData,LF," & vbLF & ",FileData
//Double quote any quotes for VBScript
StringReplace>FileData,","",FileData

//Perform the regex search
Let>Pattern=[_a-zA-Z\d\-\.]+@[_a-zA-Z\d\-]+(\.[_a-zA-Z\d\-]+)+
VBEval>regExSearch("%Pattern%","%FileData%"),Email
MessageModal>Email
This returns a semi-colon delimited list. Use Separate to explode to an array.

I expect you actually just want to read the file line by line and ONLY return the email address for the line beginning "From:". Otherwise how do you know which email address you want?

Therefore I'd combine my first example with a ReadLn loop which checks to see if the first five chars of the line are "From:" and if so calls the regex function to find the email address.
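
A minimal sketch of that combination, reusing the regExSearch function from the first example and the MSG1.txt path used earlier in this thread. The label names (NextLine, Done) and the Prefix variable are made up for illustration, the ##EOF## and ##NOFILE## checks rely on the values ReadLn> returns at end of file and for a missing file, and the whole thing is untested against a real mail file:

```macroscheduler
//Same search function as in the first example (Global left off, so only the first match is returned)
VBSTART
Function regExSearch(patrn,str)
  Set regEx = New RegExp
  regEx.Pattern = patrn
  regEx.IgnoreCase = True
  Set matches = regEx.Execute(str)
  List = ""
  For each match in matches
    List = List & match.value & ";"
  Next
  If List <> "" Then
    regExSearch = Mid(List,1,Len(List)-1)
  End If
End Function
VBEND

Let>Pattern=[_a-zA-Z\d\-\.]+@[_a-zA-Z\d\-]+(\.[_a-zA-Z\d\-]+)+
Let>k=1
Label>NextLine
ReadLn>C:\RetrievedMail\MSG1.txt,k,Line
//Stop at end of file, or straight away if the file is missing
If>Line=##EOF##,Done
If>Line=##NOFILE##,Done
//Compare the first five characters of the line with "From:"
MidStr>Line,1,5,Prefix
If>Prefix=From:
  //A single line has no CR or LF to convert; only the quotes need doubling for VBScript
  StringReplace>Line,","",Line
  VBEval>regExSearch("%Pattern%","%Line%"),Email
  MessageModal>Email
  Goto>Done
Endif
Let>k=k+1
Goto>NextLine
Label>Done
```

The loop stops at the first line that starts with "From:", on the assumption that the header appears before any quoted text in the message body.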

Me_again
Automation Wizard
Posts: 1101
Joined: Fri Jan 07, 2005 5:55 pm
Location: Somewhere else on the planet

Post by Me_again » Sat Dec 15, 2007 7:24 pm

A couple of comments.

I tested Marcus' first example and it is capturing the info between the "" quotes, but in an email header that's usually the real name; the email address is between the angle brackets (<>).

If the purpose is to reply to the emails then it would be better to grab the Reply-to header line rather than the From line.
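
A rough sketch of that idea, using the Pos> and MidStr> approach from earlier in the thread rather than a regex. It assumes the header is spelled exactly "Reply-To:" at the start of a line and that the address sits between angle brackets. The variable and label names (Found, Target, Prefix5, Prefix9, NextLine, Extract) are invented for illustration, the path is the one used elsewhere in this thread, and the script is untested:

```macroscheduler
//Prefer a Reply-To: line; fall back to the first From: line if there is none
Let>Found=0
Let>k=1
Label>NextLine
ReadLn>C:\RetrievedMail\MSG1.txt,k,Line
If>Line=##EOF##,Extract
If>Line=##NOFILE##,Extract
MidStr>Line,1,9,Prefix9
If>Prefix9=Reply-To:
  //A Reply-To line wins outright, so stop reading
  Let>Target=Line
  Let>Found=1
  Goto>Extract
Endif
MidStr>Line,1,5,Prefix5
If>Prefix5=From:
  If>Found=0
    //Remember the first From line in case no Reply-To line turns up
    Let>Target=Line
    Let>Found=1
  Endif
Endif
Let>k=k+1
Goto>NextLine

Label>Extract
If>Found=1
  //Same extraction as the Pos>/MidStr> example above
  Pos><,Target,1,emstart
  Pos>>,Target,1,emstop
  If>emstart>0
    If>emstop>emstart
      Let>emstart=emstart+1
      Let>emlen=emstop-emstart
      MidStr>Target,emstart,emlen,emaddr
      MessageModal>emaddr
    Endif
  Endif
Endif
```

Nothing is shown when neither header is found or when the chosen line has no angle brackets, so a real script would want a fallback for bare addresses.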

Marcus Tettmar
Site Admin
Posts: 7380
Joined: Thu Sep 19, 2002 3:00 pm
Location: Dorset, UK

Post by Marcus Tettmar » Sat Dec 15, 2007 7:30 pm

The first example captures only the first matching email address, which in the case of the test data was between quotes. If you replace that with a name, it should find the email between the chevrons, which is then the only one in the line.

The second example returns all email addresses in the input data.

Feel free to tweak the regex pattern according to your needs - otherwise the code remains the same.
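
One possible tweak, as a sketch: wrap the address part of the pattern in a capture group between angle brackets and return the group instead of the whole match, so the test line yields the address in the chevrons even when a quoted address comes first. regExFirstGroup is a hypothetical helper name, not part of Macro Scheduler or the examples above; it relies on the SubMatches collection of the VBScript RegExp Match object. The simplified address pattern is also only an assumption for illustration:

```macroscheduler
VBSTART
'Returns the first capture group of the first match, or an empty string if nothing matches
Function regExFirstGroup(patrn,str)
  Dim regEx, matches
  Set regEx = New RegExp
  regEx.Pattern = patrn
  regEx.IgnoreCase = True
  Set matches = regEx.Execute(str)
  regExFirstGroup = ""
  If matches.Count > 0 Then
    regExFirstGroup = matches(0).SubMatches(0)
  End If
End Function
VBEND

//test data:
Let>Line=From: "senders_name@senders_server.com" <senders_name@senders_server.com>

//Double quote any quotes for VBScript
StringReplace>Line,","",Line

//Only text of the form <something@something> matches; the group excludes the brackets
Let>Pattern=<([^<>\s]+@[^<>\s]+)>
VBEval>regExFirstGroup("%Pattern%","%Line%"),Email
MessageModal>Email
```

Because the brackets are part of the pattern but outside the group, the quoted copy of the address is skipped and only the bracketed one is returned.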

Me_again
Automation Wizard
Posts: 1101
Joined: Fri Jan 07, 2005 5:55 pm
Location: Somewhere else on the planet

Post by Me_again » Sat Dec 15, 2007 7:33 pm

Yeah, I just thought of, and tested, the first match thing and was coming back to report that :oops:

Rain
Automation Wizard
Posts: 550
Joined: Tue Aug 09, 2005 5:02 pm

Post by Rain » Sun Dec 16, 2007 4:34 pm

Thank you!
The 1st example works perfectly. I replaced the quotes with angle brackets to grab the email address and not the name.

Again, thank you Marcus and Bob.
