Extract particular lines from a block of text

Post by webmaster » Tue Jun 28, 2011 4:03 am

This project contains a JavaScript gatherer that you can customize to extract one or more lines from a text block that spans several lines. On many pages, items such as addresses and phone numbers are displayed as a single element, which causes Helium Scraper to select the whole block instead of just the phone number or just the address.

The attached project solves this problem in cases where the required item is always on the same line (for instance, the phone number is always on the third line). To use it, open the JS_LinesBlock JavaScript gatherer from the Project -> JavaScript Gatherers menu, and set the first and last line you would like that gatherer to extract (there are further instructions in the gatherer's code). If you would like to extract some other lines as well, create a new gatherer with the New button, paste in the code from the JS_LinesBlock gatherer, and set a different first and last line.
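
For anyone curious about the general idea, here is a minimal sketch of line extraction in plain JavaScript. It is not the code shipped in JS_LinesBlock, and it does not use any Helium Scraper API. The function name extractLines, the 1-based line numbering, and the choice to skip blank lines are all assumptions made for illustration.

```javascript
// Hypothetical helper, NOT the actual JS_LinesBlock gatherer code.
// Returns lines firstLine..lastLine (assumed 1-based, inclusive) of a text block.
function extractLines(text, firstLine, lastLine) {
  const lines = text
    .split(/\r?\n/)                      // split on Unix or Windows line breaks
    .map((line) => line.trim())          // drop surrounding whitespace
    .filter((line) => line.length > 0);  // assumption: blank lines are not counted

  // slice() takes a 0-based start and an exclusive end,
  // so 1-based inclusive (first, last) maps to (first - 1, last).
  return lines.slice(firstLine - 1, lastLine).join("\n");
}

// Example block where the phone number is always on the third line.
const block = "Acme Corp\n12 Main Street\n(555) 010-2000";
const phone = extractLines(block, 3, 3);
const nameAndAddress = extractLines(block, 1, 2);
```

If the requested range falls outside the block, slice() simply returns fewer lines (or none), so the sketch yields an empty string rather than throwing an error.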

You can see what will be extracted by selecting the text block and looking at the selection panel below the browser.
Attachments
LinesBlock.hsp
(298.17 KiB) Downloaded 532 times
Juan Soldi
The Helium Scraper Team