Difficulty capturing data from HTML elements

vinipinho14
Newbie
Posts: 4
Joined: Sat Jul 01, 2023 9:25 pm

Difficulty capturing data from HTML elements

Post by vinipinho14 » Sun Jul 02, 2023 2:11 pm

I tried to traverse a DIV element in an HTML file to capture data from the other DIVs contained within it, but so far I haven't been successful. I attempted to use EdgeFindElements, but it does not return the desired information. Could someone assist me with this?

Here is a simplified example of my HTML code:

Code: Select all

<!DOCTYPE html>
<html>
<head>
  <title>Example DIV</title>
</head>
<body>
<div id="q" class="y">
  <div id="7521" data-it="card">content1</div>
  <div id="8523" data-it="card">content2</div>
  <div id="6598" data-it="card">content3</div>
  <div id="2154" data-it="card">content4</div>
  <div id="5262" data-it="card">content</div>
</div>
<div id="e" class="t">
  <div id="ass" class="card">content</div>
</div>
</body>
</html>
Thank you in advance for any help you can provide!

Dorian (MJT support)
Automation Wizard
Posts: 1389
Joined: Sun Nov 03, 2002 3:19 am

Re: Difficulty capturing data from HTML elements

Post by Dorian (MJT support) » Sun Jul 02, 2023 2:53 pm

I have edited your HTML, as you were missing the = after data-it on some of the lines.

Here are a couple of ways you can do this:

Code: Select all

let>TheURL=D:/vinipinho14.htm
Let>EDGEDRIVER_EXE=d:\msedgedriver.exe
EdgeStart>session_id

EdgeNavigate>session_id,url,%TheURL%

//Extract each div[@data-it='card']
EdgeFindElements>session_id,xpath,//div[@data-it='card'],el

let>extractloop=0
repeat>extractloop
  let>extractloop=extractloop+1
  EdgeGetElementData>session_id,el_%extractloop%,text,TheText
  mdl>TheText
Until>extractloop,el_count

//Or get anything with id=q all at once.
EdgeFindElements>session_id,id,q,elements
EdgeGetElementData>session_id,elements_1,text,TheText2
MDL>TheText2
Yes, we have a Custom Scripting Service. Message me or go here
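
The difference between the two approaches above can be sketched outside the browser. This is only an illustration, assuming a trimmed, well-formed copy of the sample markup and using Python's standard-library ElementTree as a stand-in for the Edge driver: reading the text of the parent div gives the text of all its children together, while reading each child gives one value per div. The names SAMPLE, parent_text and child_texts are made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the parent div from the sample page (assumption).
SAMPLE = """
<div id="q" class="y">
  <div id="7521" data-it="card">content1</div>
  <div id="8523" data-it="card">content2</div>
</div>
"""

parent = ET.fromstring(SAMPLE)

# Text of the parent: the text of every descendant, joined into one string.
parent_text = " ".join(t.strip() for t in parent.itertext() if t.strip())

# Text of each direct child div, kept as separate values.
child_texts = [(child.text or "").strip() for child in parent.findall("div")]

print("Parent text:", parent_text)
print("Child texts:", child_texts)
```

A browser driver reports rendered text rather than raw text nodes, so the exact spacing and line breaks in the parent's text may differ from this sketch.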

vinipinho14
Newbie
Posts: 4
Joined: Sat Jul 01, 2023 9:25 pm

Re: Difficulty capturing data from HTML elements

Post by vinipinho14 » Sun Jul 02, 2023 10:08 pm

Dorian, thank you for the response and help. Please clarify one more point for me, as I still have doubts about how this tool works.

When I tried the code you provided, it returned all the texts from the lines in the same message box. The problem is that I would like to access each line (DIV) individually so that I can capture, for example, each of the IDs and handle each line separately.

Let's suppose I want to access and extract individual texts from the IDs. For example:
Line 1: ID="7521" and text: "content1"
Line 2: ID="8523" and text: "content2"
Line 3: ID="6598" and text: "content3"

And so on.

How can I do this, calling the parent DIV and having it provide me with its child DIVs?

Dorian (MJT support)
Automation Wizard
Posts: 1389
Joined: Sun Nov 03, 2002 3:19 am

Re: Difficulty capturing data from HTML elements

Post by Dorian (MJT support) » Sun Jul 02, 2023 10:32 pm

My first method got them individually.

vinipinho14
Newbie
Posts: 4
Joined: Sat Jul 01, 2023 9:25 pm

Re: Difficulty capturing data from HTML elements

Post by vinipinho14 » Sun Jul 02, 2023 11:08 pm

Thank you for your response, Dorian.

I didn't express myself clearly, so let me explain it better below.

In the example code I provided, the ID and attribute of the DIVs are explicitly specified. It's what you locate using EdgeFindElements.

I'm dealing with a situation where the IDs of the child DIVs are randomly generated, and they vary based on the system's order number. Therefore, I don't have that information beforehand. That's why I need to traverse through the parent DIV (which remains constant) and extract the randomly generated ID from each child DIV. Once I have the ID, I can extract other related information that will be generated.

Did I explain myself better?

I apologize, as I am a beginner and have already struggled with the code you provided without finding a solution. That's why I'm reaching out to you for help.

In Python it would look something like this:

Code: Select all

from selenium import webdriver
from selenium.webdriver.common.by import By

# Initialize the Selenium driver
driver = webdriver.Chrome()

# Access the HTML document
driver.get("url_of_the_html_document")

# Locate the parent div by its id (in the sample HTML the parent is id="q", class="y")
parent_div = driver.find_element(By.XPATH, '//*[@id="q"]')

# Locate all the child divs within the parent div
child_divs = parent_div.find_elements(By.TAG_NAME, 'div')

# Iterate over the child divs and retrieve the ID and class of each
for child_div in child_divs:
    div_id = child_div.get_attribute('id')
    div_class = child_div.get_attribute('class')
    print("ID:", div_id)
    print("Class:", div_class)

# Close the Selenium driver
driver.quit()

Dorian (MJT support)
Automation Wizard
Posts: 1389
Joined: Sun Nov 03, 2002 3:19 am

Re: Difficulty capturing data from HTML elements

Post by Dorian (MJT support) » Mon Jul 03, 2023 8:48 am

While my first example would have returned the 5 data-it="card" results, it was not specifically looking for them as children of id="q". This one does:

Code: Select all

EdgeFindElements>session_id,xpath,//div[@id='q']/div[@data-it='card'],el
So:

Code: Select all

let>TheURL=D:/vinipinho14.htm
Let>EDGEDRIVER_EXE=d:\msedgedriver.exe
EdgeStart>session_id

EdgeNavigate>session_id,url,%TheURL%

EdgeFindElements>session_id,xpath,//div[@id='q']/div[@data-it='card'],el
MDL>%el_count% found

let>extractloop=0
repeat>extractloop
  let>extractloop=extractloop+1
  EdgeGetElementData>session_id,el_%extractloop%,text,TheText
  mdl>TheText
Until>extractloop,el_count
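
For readers who want to check the parent-scoped XPath without a browser, here is a minimal sketch using Python's standard-library ElementTree. It assumes a trimmed, well-formed copy of the sample page; the helper name child_cards is hypothetical and is not part of Macro Scheduler or Selenium. It applies the same expression, //div[@id='q']/div[@data-it='card'], and reads both the id attribute and the text of each match, which covers the case where the child IDs are not known in advance.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the sample page from the first post (assumption).
SAMPLE = """
<html>
<body>
<div id="q" class="y">
  <div id="7521" data-it="card">content1</div>
  <div id="8523" data-it="card">content2</div>
  <div id="6598" data-it="card">content3</div>
</div>
<div id="e" class="t">
  <div id="ass" class="card">content</div>
</div>
</body>
</html>
"""


def child_cards(markup, parent_id):
    """Return (id, text) for each data-it='card' div directly under the parent div."""
    root = ET.fromstring(markup)
    # Same shape as the XPath in the script above, written relative to the root element.
    path = f".//div[@id='{parent_id}']/div[@data-it='card']"
    return [(div.get("id"), (div.text or "").strip()) for div in root.findall(path)]


cards = child_cards(SAMPLE, "q")
for div_id, text in cards:
    print(f"ID={div_id} text={text}")
```

Only the divs under id="q" are returned; the div under id="e" is skipped because it is neither a child of that parent nor marked data-it="card". In the Macro Scheduler script the matching step would be to ask EdgeGetElementData for the id attribute of each el_N instead of its text; check the command reference for the exact data-type syntax.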

vinipinho14
Newbie
Posts: 4
Joined: Sat Jul 01, 2023 9:25 pm

Re: Difficulty capturing data from HTML elements

Post by vinipinho14 » Mon Jul 03, 2023 1:25 pm

Dorian, thank you very much! You helped me a lot!

Dorian (MJT support)
Automation Wizard
Posts: 1389
Joined: Sun Nov 03, 2002 3:19 am

Re: Difficulty capturing data from HTML elements

Post by Dorian (MJT support) » Mon Jul 03, 2023 1:42 pm

My pleasure.
