can i extract "meta keyword" or "meta description" ?

Questions and answers about anything related to Helium Scraper
boyberm
Posts: 4
Joined: Sun Apr 03, 2011 1:37 pm

can i extract "meta keyword" or "meta description" ?

Post by boyberm » Sun Apr 03, 2011 1:39 pm

The information I want to extract is in the <head> tag.

How can I extract the <meta name="keywords"> or <meta name="description"> of the website?


Best Regards

webmaster
Site Admin
Posts: 521
Joined: Mon Dec 06, 2010 8:39 am

Re: can i extract "meta keyword" or "meta description" ?

Post by webmaster » Sun Apr 03, 2011 10:37 pm

Hi,

The attached file is a sample project that extracts information from the HEAD. If you press play in the actions tree, it will extract the current URL (for reference) and the meta description, no matter which page you are on. In Project -> JavaScript Gatherers I wrote a gatherer that gets the meta description. To create another gatherer that, for instance, gets the keywords, you would just need to copy and paste that code into a new gatherer, and change this line

if(metas[n].getAttribute("name") == "description")
to this line

if(metas[n].getAttribute("name") == "keywords")
I also created a kind that selects the BODY element. This is because the "Extract" action needs a kind that selects some element in the body, or the body itself, to perform the extraction. In this case it is irrelevant which element is selected, so I chose the body because it is always present and there is always exactly one.

This could also have been done from an Execute JavaScript action without using a kind, but the extraction would then have needed to be done in code instead of with an Extract action.
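
For readers without the sample project at hand, here is a minimal sketch of the kind of logic such a gatherer might contain. It is not the code from the attached file: the function name getMetaContent and its doc parameter are hypothetical, and how Helium Scraper hands the page to a gatherer or expects the value back is not covered here. The sketch also matches the name case-insensitively, which is a small departure from the strict comparison quoted above.

```javascript
// Hypothetical helper (not taken from MetasExtraction.hsp).
// Returns the content attribute of the first <meta> element whose name
// attribute equals metaName, ignoring case, or "" if there is no match.
function getMetaContent(doc, metaName) {
  var metas = doc.getElementsByTagName("meta");
  var wanted = metaName.toLowerCase();
  for (var n = 0; n < metas.length; n++) {
    // Some <meta> tags (e.g. charset) have no name attribute at all.
    var name = metas[n].getAttribute("name");
    if (name && name.toLowerCase() === wanted) {
      return metas[n].getAttribute("content") || "";
    }
  }
  return "";
}
```

In a browser context this would be called as getMetaContent(document, "description") or getMetaContent(document, "keywords"), so a single function covers both gatherers instead of two copies that differ by one line.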
Attachments
MetasExtraction.hsp
(335.22 KiB)
Juan Soldi
The Helium Scraper Team
