Finding Text on HTML page - identifying X,Y coordinates

dbish
Junior Coder
Posts: 49
Joined: Wed Jan 08, 2003 8:38 am

Post by dbish » Wed Apr 06, 2005 2:23 pm

I have either a simple or a very hard question. I am trying to find some text on a web page and then click on the link at that location.

I bring up the webpage and Run the IE Find Text function to locate my text

ExecuteFile>http://www.mysimon.com/
Wait>10
SetFocus>Simon*
Press ALT
Send>E
Release ALT
Wait>.1
Send>F
Wait>.2
Send>Books
Wait>.2
Press Enter
Wait>.2
SetFocus>Find
Press ALT
Press F4
Release ALT
Wait>5
GetCursorPos>x,y
messageModal>x= %x% y= %y%

First, as an aside, I have found that in many apps CTRL F (or whatever the keyboard shortcut is) doesn't work, but going to the toolbar / Edit menu and then typing the keyboard shortcut does work.

Anyway, I bring up the FIND box, enter my text, find it on the web page which is then highlighted. I want to click on the highlighted piece. The code above comes up with some spurious location (perhaps where I last typed on any app?)

How do I capture the highlighted text location?

Thanks

dbish

support
Automation Wizard
Posts: 1450
Joined: Sat Oct 19, 2002 4:38 pm
Location: London

Post by support » Wed Apr 06, 2005 2:28 pm

If you use VBScript and automate the IE ActiveX interface you will get access to the document object model and can then get elements via their x,y position. You won't be able to do it the way you are at present.

Try CTRL-f instead of CTRL-F. As a rule of thumb I always send lowercase characters where I am pressing a key with ALT or CTRL to avoid it being interpreted as having shift pressed (F = SHIFT-f).
MJT Net Support

Post by dbish » Wed Apr 06, 2005 3:07 pm

Thanks for the fast reply. Going to the second part first, I tried using CTRL f and it worked great - I never asked this question and have used the workaround for many many years. Just goes to show I should ask!

On the main question - I have no idea what you are talking about. The web page is not mine so I don't think I can activate ActiveX stuff in the page. I used an example in the code I provided that is different from my actual page. The data I am trying to get lives behind a password-protected wall. The actual page is a simple page which just lists a text description and then a link for files to download. It looks like what IE displays when you use IE to access an FTP server - but this is not an FTP site - it is an HTML page.

Example:

File 1: HIGHLIGHTED LINK FOR FILE of 20050401
File 2: HIGHLIGHTED LINK FOR FILE of 20050402
File 3: HIGHLIGHTED LINK FOR FILE of 20050403
File 4: HIGHLIGHTED LINK FOR FILE of 20050404
etc.

So, considering that I don't control the web page, can I do what you had suggested? If so, I don't even know where to start.

Sorry to sound stupid . . . . but - well, there you have it (<g>)

dbish

Post by support » Wed Apr 06, 2005 3:32 pm

Internet Explorer has an ActiveX interface that lets you get right inside and access the document within it. Search this board for InternetExplorer.Application and you will see ways of creating an IE object and accessing its methods and properties. One of these properties is the document object itself, which exposes various other objects and methods for getting right at the tags and elements within the document. What you can do is extremely powerful and I'd recommend reading up about the Document Object Model on msdn.microsoft.com. For now here's a *very* simple example that extracts all the links in a page:

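VBSTART
Dim IE

' Open the page in IE and show the text and URL of every link on it.
Sub QuickExample
  Set IE = CreateObject("InternetExplorer.Application")
  IE.Visible = True
  IE.Navigate "http://www.mjtnet.com/"
  ' Wait until IE has finished loading (ReadyState 4 = complete)
  Do While IE.Busy Or IE.ReadyState <> 4
  Loop
  Dim i
  For i = 0 To IE.Document.Links.Length - 1
    MsgBox IE.Document.Links.Item(i).innerText & vbCrLf & IE.Document.Links.Item(i).href
  Next
End Sub

' Show the tag name and HTML of the element at client coordinates x,y
' (measured from the top left of the browser's viewport).
Sub ElementFromPoint(x,y)
  Dim e
  ' The arguments may arrive from VBRun as text, so convert them to numbers
  Set e = IE.Document.elementFromPoint(CLng(x), CLng(y))
  MsgBox e.tagName & ":" & e.outerHTML
End Sub

VBEND

VBRun>QuickExample
VBRun>ElementFromPoint,100,75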

This only scratches the surface of what is possible. You can access all the elements in the page. You will need to read the DOM documentation to find out the syntax for all the elements and to see all the other methods that are available.

Essentially instead of doing what you are doing now and trying to automate the GUI of IE and using the find function you will be able to create an IE object, navigate to the page in question and then parse the document to find the text you want and extract the information required using this approach.
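
As a rough sketch of that approach (not a tested solution for any particular site), the script below opens a page, looks through the Links collection for the first link whose visible text contains a search string, and clicks it. OpenPage and ClickLinkByText are hypothetical helper names made up for this illustration, and the "Download" search text and the clickedHref variable are just assumptions; substitute your own page and text.

```vbscript
VBSTART
Dim IE

' Hypothetical helper: open a page and wait until it has loaded.
Sub OpenPage(url)
  Set IE = CreateObject("InternetExplorer.Application")
  IE.Visible = True
  IE.Navigate url
  ' ReadyState 4 means the document is complete
  Do While IE.Busy Or IE.ReadyState <> 4
  Loop
End Sub

' Hypothetical helper: click the first link whose visible text
' contains searchText (case-insensitive). Returns the link's href,
' or an empty string if no link matched.
Function ClickLinkByText(searchText)
  Dim i, link
  ClickLinkByText = ""
  For i = 0 To IE.Document.Links.Length - 1
    Set link = IE.Document.Links.Item(i)
    If InStr(1, link.innerText, searchText, vbTextCompare) > 0 Then
      ClickLinkByText = link.href
      link.Click
      Exit Function
    End If
  Next
End Function
VBEND

VBRun>OpenPage,http://www.mjtnet.com/
rem> "Download" is only an assumed example of link text
VBEval>ClickLinkByText("Download"),clickedHref
MessageModal>Clicked: %clickedHref%
```

Because the link is found through the document itself, no screen coordinates or Find dialog are needed. If clickedHref comes back empty, nothing on the page matched the search text.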

Bob Hansen
Automation Wizard
Posts: 2475
Joined: Tue Sep 24, 2002 3:47 am
Location: Salem, New Hampshire, US

Post by Bob Hansen » Thu Apr 07, 2005 6:32 am

Here is a link to Document Object Model

And here is a link from within those pages, specifically for Finding Text in the Document
Hope this was helpful..................good luck,
Bob
A humble man and PROUD of it!

Post by dbish » Thu Apr 07, 2005 3:59 pm

I have read up on this technique and I have two observations - one - yes it is powerful; and two - it is complicated. Makes one long for the old days: Set Focus, click and be done!

I just finished re-teaching myself QBASIC so some of this makes sense and some of it is not clear. My immediate goal is just to solve a problem and not to "LEARN" the new language. I substituted my website in Marcus's example (from above) and it churned through the links as it should. However, when I create the IE object for the first time in an IE session I am prompted for the user ID and password. This prompt is not a webpage form; it looks like a network rights login box. In MSCHED I would normally say setfocus>Login, wait, and then put my userid/password in. Now, however, the VBScript module has control. How do I identify the object name of the login box, and how does one "set focus" and add the required info?

If this is too open-ended (and hard to answer) I understand.

I have about five major projects on my plate now and this is not priority one. If someone has thoughts I would love to hear them. I will continue to read up on DOM (including Bob's latest link) and may be able to grasp it yet.

Thanks for the feedback.

Dave

P.S. - Marcus, do you see MSCHED moving toward a VBScript type of tool or staying a "click here - do that" / keystroke emulator app?

pgriffin
Automation Wizard
Posts: 460
Joined: Wed Apr 06, 2005 5:56 pm
Location: US and Europe

VBScript with MacroScheduler

Post by pgriffin » Thu Apr 07, 2005 4:39 pm

dbish,

You are like me. I don't want to "learn" everything under the sun, but I love to work in MacroScheduler. My advice: do the MacroScheduler part yourself and send the VBScript bits to Rentacoder.com. Thousands of coders from who-knows-where in the world will gladly write some VB for peanuts.

SkunkWorks.

JRS
Pro Scripter
Posts: 71
Joined: Thu Nov 04, 2004 5:19 am

Post by JRS » Mon Aug 22, 2005 4:56 am

To Mjtnet Support,

I'm finding the link that has led me here to DOM (Document
Object Models) and IE Objects to be like a doorway to a whole new
and unfamiliar universe. I have to admit it's kind of slow going
for me. In your *very* simple example that extracts all the
links on a page ... I kind of see what's going on. I was just
wondering if it would be much more of an issue in your
example to also get/derive the x-y coordinates of
the links that are being extracted in the loop?

If this would be possible I think I could continue from there and
would be happy to share some solutions and code I develop on
this forum in this regard once I get the gist of this.

Thanks for your understanding.

Joel S.

Post by support » Mon Aug 22, 2005 7:45 am

To get the position of an object use the offsetLeft and offsetTop properties. These give you its position relative to its parent element (its offsetParent), not the page. So to get the actual position on the page relative to the body you may need to walk back up through the parents and add up their offsets.

If you want the offsetTop and offsetLeft of the object just do this:

linkLeft = IE.Document.Links.item(i).offsetLeft
linkTop = IE.Document.Links.item(i).offsetTop
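
As a minimal sketch of the "back up" step, assuming the usual DOM behaviour where offsetParent returns nothing once you reach the top of the chain, the offsets can be added up like this. OpenPage, PagePosition and ShowLinkPositions are hypothetical helper names made up for this illustration. The result is a position within the document, not a screen coordinate, and it takes no account of scrolling or borders.

```vbscript
VBSTART
Dim IE

' Hypothetical helper: open a page and wait until it has loaded.
Sub OpenPage(url)
  Set IE = CreateObject("InternetExplorer.Application")
  IE.Visible = True
  IE.Navigate url
  ' ReadyState 4 means the document is complete
  Do While IE.Busy Or IE.ReadyState <> 4
  Loop
End Sub

' Hypothetical helper: add up offsetLeft/offsetTop from the element
' through each offsetParent until there is no parent left.
' Returns the totals as the text "x,y".
Function PagePosition(el)
  Dim x, y, node
  x = 0
  y = 0
  Set node = el
  Do While Not node Is Nothing
    x = x + node.offsetLeft
    y = y + node.offsetTop
    Set node = node.offsetParent
  Loop
  PagePosition = x & "," & y
End Function

' Show each link's text together with its position in the document.
Sub ShowLinkPositions
  Dim i, link
  For i = 0 To IE.Document.Links.Length - 1
    Set link = IE.Document.Links.Item(i)
    MsgBox link.innerText & vbCrLf & "x,y = " & PagePosition(link)
  Next
End Sub
VBEND

VBRun>OpenPage,http://www.mjtnet.com/
VBRun>ShowLinkPositions
```

To turn these into mouse coordinates you would still need to allow for where the browser window sits on screen and how far the page is scrolled, which this sketch does not attempt.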

Don't be too overwhelmed by the document object model. It's just a case of knowing the names of the properties and tagging them on. There are also collections such as the Links collection this example uses. Collections are just easier ways of accessing certain element types.

If you follow the link Bob provided you will see an example on getting an element's position and dimensions:

http://msdn.microsoft.com/workshop/auth ... _Element_s

Click on "Getting an Element's Position and Dimensions"

The above URL explains collections, how to access them, how to access elements and how to access their properties.

Post by JRS » Mon Aug 22, 2005 10:31 am

Ahh Mjtnet Support ...

Thank you for your prompt and (as always) cordial and courteous reply.
Will investigate and study your reply/suggestion(s). I think
it will really help. Much Appreciated.

Joel S.

Post by JRS » Thu Aug 25, 2005 8:59 am

I have been earnestly taking some good advice and studying
DOM (Document Object Models) and IE Objects. (before I continue
please pardon any obscure/incorrect terminology I may use - you
can tell what kind of a newbie I am in this regard)

From what I understand becoming well versed in DOM and IE Objects
will allow much better (more accurate, faster etc.) control
of especially a web page than interacting with the GUI of the
browser itself. In other words instead of tabbing to a given field,
I'm directly accessing the object (i.e. the field) itself with immediate
control of the attributes and properties of the object.

It seems (uh boy I hope I explain myself OK here) from the
examples (from the Microsoft page) that what is being "keyed" on is
the HTML code of the given web page. In Internet Explorer there is
a "View" option and on selecting it in the dropdown menu there is
a "Source" option. On selecting "Source" I see all the HTML of
the given web page. On almost every web page I go to, if I
select a link to a different frame in the given web page,
selecting Source will usually show all of the text of the "new
frame". In other words here on Mjtnet.com it seems that at any
"place" I am on this web site, be it the main page, the forum
etc., if I select Source I will "see" an HTML representation of the
page.

However on http://www.betfair.com, which is critical for my intended
application/objective, whatever selection I make I always
"see" the same HTML code on "implementing a Source". Whatever
I do/wherever I go ... the HTML seems to remain the same and
whatever is going on with the webpage is being conducted within
a certain Javascript script (js).

My question is: in this context are the DOM and IE Objects still applicable?
The objects seem to be the scripts themselves (???). If this is the case
then would not the solution/workaround be interaction with the
browser GUI?

In case I don't get a handle on DOM and IE Objects (as my question
may indicate how far away I am) I am still programming from the
Internet browser "GUI respect" and while having success ... my goal
is to program most effectively/efficiently. But that said, are there
instances (perhaps this one?) where DOM and IE Objects aren't
applicable?

Putting it in simplest terms, from the Source I don't see a lot of
anything I can search and/or control from a DOM/IE Object
perspective.

Thanks,

Joel S.


==============================================

HTML from http://www.betfair.com

==============================================

Betfair - online sports betting exchange, bet on football & horse racing fixed odds

var bCanTrade = false;

Please click here if...

Post by pgriffin » Thu Aug 25, 2005 2:29 pm

I posted this script on another thread, but here it goes again. This will find the first string matching the text you enter into the input box, then activate that link. On a site like Betfair, it has its shortcomings, to say the least, but it might be a starting point and far simpler than DOM and ActiveX, etc.

Just open Betfair.com, paste this script into your MacroSched editor and run it.

label>Start
input>Link2Find,Enter the link you want to find
setfocus>Betfair*
wait>1
rem> keystrokes to open the Find dialog box
Press CTRL
Send>f
Release CTRL
wait>1
rem> Here you could send any text or variable.
Send>%Link2Find%
wait>.1
rem> Keystrokes to select the Find button
Press ALT
Send>f
Release ALT
wait>1
rem> Move to the Cancel button to close the Find dialog box
Press TAB
Press ENTER
wait>.5
rem> With the desired text selected by "Find", it only takes one TAB to move the cursor to that text.
Press TAB
rem> voila!, (unless it didn't work for you....) then replace the 'voila' with 'oops'...
Press ENTER


SkunkWorks

JRS
Pro Scripter
Posts: 71
Joined: Thu Nov 04, 2004 5:19 am

Post by JRS » Thu Aug 25, 2005 8:38 pm

Hello Skunkworks,

Thank you very much for posting (again) your alternative to
DOM and ActiveX etc. My solution is a variation on your theme
but definitely one based on your premise. You really helped
me a lot. My point in my post is that I (and I'm sure yourself as well)
want to interact with programs and web pages as effectively
and efficiently as possible with Macro Scheduler. It just seems
that of all the web pages I need to "tangle with", just my luck (no pun
intended), this Betfair.com is a "different animal" or has a "different
look" than most others. Thankfully, I have an alternative to something
that has the appearance of "you don't want to go there", in large
part thanks to your help.

Joel S.

carfan
Newbie
Posts: 9
Joined: Tue Jul 17, 2007 11:07 am

Using FireFox instead of IE

Post by carfan » Mon Aug 13, 2007 3:38 am

How can I get it to work using FireFox instead of IE?
Can I replace InternetExplorer.Application with "Firefox.Application"?
Where can I find a list of the "qualified" application names that can be created using CreateObject?
Set IE = CreateObject("InternetExplorer.Application")

Thanks.
