Multiple Text Selected

zeeshans
Posts: 7
Joined: Sun Jun 05, 2011 9:03 am

Post by zeeshans » Sun Jun 05, 2011 9:06 am

Hi,

I am trying to create different kinds (Name, Address, Email, Website, etc.) for a web database. But in selection mode, when I try to select the text, all of the text is selected, whereas I need to select Name, Address, and the other fields separately so that I can create kinds. Can you help me select only the text I want and then make a kind?

Regards,

webmaster
Site Admin
Posts: 521
Joined: Mon Dec 06, 2010 8:39 am

Post by webmaster » Sun Jun 05, 2011 7:31 pm

Sounds to me like they are in an IFRAME. Could you send the URL of the page or, if that's not possible, the HTML code?
Juan Soldi
The Helium Scraper Team

Post by zeeshans » Mon Jun 06, 2011 5:33 am

Dear Juan,

Thanks for your reply. Please find the URL below:

http://sourcemiddleeast.com/subcategory ... bound.html

I need to produce a list of these companies with Name, Address, Phone, Email, etc. in separate columns.

Thanks for your help!

Regards,

Post by webmaster » Mon Jun 06, 2011 7:39 pm

Hi,

The attached project should get you started. It uses JavaScript gatherers to extract the text between two delimiter strings from the HTML code of the whole business section (by business section I mean what gets selected when you click, for instance, on an address or a phone). If you go to Project -> JavaScript Gatherers you will see three gatherers that extract phone, fax and address. You can use them as templates to extract other things. At the top of each gatherer's code you will see two lines that start with "var strLeft = " and "var strRight = ". These variables hold the left and right delimiters: the pieces of the business section's code that mark where the text you want begins and where it ends.

To create a JavaScript gatherer that selects another piece of text, just copy and paste one of the existing JavaScript gatherers, and replace the values of "var strLeft = " and "var strRight = " with the code that surrounds whatever piece of text you want to extract. To see the code, select any business section by activating selection mode and clicking on a phone or address, and then click on the "View code" button in the selection panel. This will let you see the code that surrounds any piece of text you would like to extract.
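
For anyone who wants to see the idea outside the project file, here is a rough sketch of the left/right delimiter technique in plain JavaScript. It is not the code from the attached gatherers: the function name extractBetween, the sample markup, and the delimiter values are all made up for illustration, and only strLeft and strRight are names taken from the project. How a gatherer receives the section's HTML and returns its result inside Helium Scraper is not shown here.

```javascript
// Hypothetical helper (not from the attached project): returns the text
// found between strLeft and strRight, or null if either one is missing.
function extractBetween(html, strLeft, strRight) {
  var start = html.indexOf(strLeft);
  if (start === -1) return null;          // left delimiter not found
  start += strLeft.length;                // skip past the left delimiter
  var end = html.indexOf(strRight, start);
  if (end === -1) return null;            // right delimiter not found
  return html.substring(start, end).trim();
}

// Invented markup for illustration only; the real page's code will differ.
// Use the "View code" button to see the actual code around each value.
var sampleHtml =
  '<div><b>Tel:</b> 04 1234567<br><b>Fax:</b> 04 7654321<br></div>';

// Assumed delimiter values for the invented markup above.
var strLeft = '<b>Tel:</b>';
var strRight = '<br>';

var phone = extractBetween(sampleHtml, strLeft, strRight);
console.log(phone);
```

To extract a different field with this sketch, only the two delimiter values change, which is the same copy-and-edit workflow described above.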

Note that to select the address, I'm using the text "P.O. Box " as the left-side delimiter. I'm assuming all addresses on this site start with this text.
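
Because the address delimiter is an assumption about the site, it may help to see what happens when a listing does not contain it. The sketch below is hypothetical: extractAddress, the "<br>" right delimiter, and both sample snippets are invented, and it chooses to keep "P.O. Box " in the result, which may or may not match what the attached gatherer does. It returns an empty string when the delimiter is missing, so those rows are easy to spot and handle separately.

```javascript
// Hypothetical sketch (not the attached gatherer): extract an address that
// is assumed to start with "P.O. Box " and to end at the next "<br>".
function extractAddress(html) {
  var strLeft = 'P.O. Box ';   // from the post: assumed start of every address
  var strRight = '<br>';       // assumption: the address ends at a line break
  var start = html.indexOf(strLeft);
  if (start === -1) return ''; // no "P.O. Box " in this section
  var end = html.indexOf(strRight, start + strLeft.length);
  if (end === -1) end = html.length; // no right delimiter: take the rest
  // Start at the delimiter itself so "P.O. Box " stays in the result.
  return html.substring(start, end).trim();
}

// Invented snippets for illustration only.
var withBox = '<div>Acme Trading<br>P.O. Box 12345, Dubai<br>Tel: 04 1234567</div>';
var withoutBox = '<div>Acme Trading<br>Sheikh Zayed Road, Dubai<br></div>';

console.log(extractAddress(withBox));
console.log(extractAddress(withoutBox) === '' ? '(no address found)' : 'found');
```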
Attachments
SourceMiddleEast.hsp
(395.74 KiB) Downloaded 604 times
Juan Soldi
The Helium Scraper Team
