JS gatherer with search in page

Questions and Answers about programming Helium Scraper.
Maxkuba
Posts: 5
Joined: Wed May 11, 2011 4:31 pm

JS gatherer with search in page

Post by Maxkuba » Wed May 18, 2011 5:45 pm

http://heliumscraper.com/wordpress/?p=115

Hi juan,
You posted on your blog that a lot is possible with JavaScript gatherers.
Can you help me with this?
I'm trying to search a page for the string "Kg" and then return the value to a variable that should be used to extract the text.
I cannot write JS.
I tried it like this:

Code:

var myRegExp1 = /kg|Kg/;
var result = search(myRegExp1);
return result;

Can you see what I'm doing wrong?
I tested it as you described, by selecting only "JS_Kg" in the active properties options.
Regards

webmaster
Site Admin
Posts: 521
Joined: Mon Dec 06, 2010 8:39 am

Re: JS gatherer with search in page

Post by webmaster » Wed May 18, 2011 7:13 pm

I'm not sure what you are trying to do. Are you trying to get the text before "kg" or are you trying to return true or false depending on whether the text contains the "kg" word or not?

If you are trying to get the text before "kg" perhaps this post will help. Note that the code posted there is case sensitive, which means you will either need to use "Kg" or "kg". To use both you can just change this line in the JavaScript gatherer:

Code:

text = element.innerText;

for this:

Code:

text = element.innerText.toLowerCase();

Then use just "kg", since any "Kg" will be converted to "kg" by the call to toLowerCase().
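
As a rough illustration of the idea above (the code from the linked post is not reproduced in this thread, so this is only a sketch under assumptions, not that code): the function below lowercases the element's text and returns whatever comes before the first "kg". The name `textBeforeKg` is hypothetical, and `element` is assumed to be any object that exposes `innerText`, as in the examples in this post. In an actual gatherer, only the body of the function would be needed, since `element` is passed in already.

```javascript
// Hypothetical helper for illustration; not taken from the linked post.
// `element` is assumed to expose an `innerText` string.
function textBeforeKg(element) {
  // Lowercase first so that "Kg", "KG" and "kg" are all matched by "kg".
  var text = element.innerText.toLowerCase();

  // Position of the first "kg", or -1 if it does not occur.
  var index = text.indexOf("kg");
  if (index === -1) {
    return "";
  }

  // Everything before "kg", with surrounding whitespace removed.
  return text.substring(0, index).replace(/^\s+|\s+$/g, "");
}
```

Note that because the text is lowercased before it is cut, the returned text is lower case as well, which usually does not matter when the part before "kg" is a number.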

Also, a note about your code: you are not using the element parameter in it. A JavaScript gatherer without it doesn't make sense, because the purpose of a JavaScript gatherer is to return some information about the element. The element is a parameter passed to the JavaScript gatherer that contains an HTML element. For instance, if you would like to write a gatherer that gets the text of the element, this would be the code:

Code:

return element.innerText;

This is not necessary, though, because there is already a built-in property gatherer called "InnerText" that does this. But if you would like, for instance, to get the text of the element in lower case, you could do this:

Code:

return element.innerText.toLowerCase();
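
For the other case mentioned earlier, returning true or false depending on whether the text contains "kg", a sketch along the same lines might look like the following. It also shows what was missing in the code from the first post: `search` is a method of strings, so it has to be called on some text (here the element's text), and it returns the position of the match, or -1 if there is none, rather than the matched text. The function names `containsKg` and `positionOfKg` are hypothetical, and `element` is again assumed to be any object with an `innerText` property.

```javascript
// Hypothetical helpers for illustration only.
// `element` is assumed to expose an `innerText` string.

// True if the element's text contains "kg" in any letter case.
function containsKg(element) {
  // The `i` flag makes the pattern case-insensitive,
  // so no call to toLowerCase() is needed here.
  return /kg/i.test(element.innerText);
}

// Position of the first "kg" in the element's text, or -1 if absent.
function positionOfKg(element) {
  // search() is called on the string, not on its own.
  return element.innerText.search(/kg/i);
}
```

Inside a gatherer, the first function would reduce to a single return statement that tests `element.innerText` against the pattern.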

I hope this helps clarify how JavaScript gatherers work.
Juan Soldi
The Helium Scraper Team

Maxkuba
Posts: 5
Joined: Wed May 11, 2011 4:31 pm

Re: JS gatherer with search in page

Post by Maxkuba » Wed May 18, 2011 7:32 pm

Well, your link helped me understand how it works.
Thank you very much.
I feel I have a lot to learn about nodes, the DOM and JavaScript...
Regards
