The information I want to extract is in the <head> tag.
How can I extract the <meta name="keywords"> or <meta name="description"> of the website?
Best Regards
can i extract "meta keyword" or "meta description" ?
Re: can i extract "meta keyword" or "meta description" ?
Hi,
The attached file is a sample project that extracts information from the HEAD. If you press play in the actions tree, it will extract the current URL (for reference) and the meta description, no matter which page you are on. In Project -> JavaScript Gatherers I wrote a gatherer that gets the meta description. To create another gatherer that gets, for instance, the keywords, you just need to copy and paste that code into a new gatherer and change this line:
Code:
if(metas[n].getAttribute("name") == "description")
to this line:
Code:
if(metas[n].getAttribute("name") == "keywords")
I also created a kind that selects the BODY element. This is because the "Extract" action needs a kind that selects some element in the body, or the body itself, to perform the extraction. In this case it is irrelevant which element is selected, so I chose the body because it's always there and there is always exactly one.
This could also have been done from an Execute JavaScript action without using a kind, but the extraction would then have needed to be done with code instead of with an Extract action.
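For readers without the attachment, here is a minimal sketch of the kind of loop that the `metas[n].getAttribute("name")` comparison belongs to. It is plain DOM JavaScript, not the actual gatherer code from the sample project: the function name `getMetaContent` and its parameters are made up for illustration, and how the result is handed back to Helium Scraper is not shown because that part depends on the gatherer itself.

```javascript
// Hypothetical helper (not taken from MetasExtraction.hsp): returns the
// "content" of the first <meta> tag whose "name" attribute equals `name`,
// or an empty string when no such tag exists.
function getMetaContent(doc, name) {
  // Collect every <meta> element in the document.
  var metas = doc.getElementsByTagName("meta");
  for (var n = 0; n < metas.length; n++) {
    // Same comparison as in the gatherer, with the name passed in.
    if (metas[n].getAttribute("name") == name) {
      return metas[n].getAttribute("content") || "";
    }
  }
  return "";
}
```

In a browser context this would be called as `getMetaContent(document, "description")` or `getMetaContent(document, "keywords")`. The comparison is case-sensitive, so a page that writes `name="Keywords"` would need the names lowercased before comparing.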
Attachments
- MetasExtraction.hsp (335.22 KiB)
Juan Soldi
The Helium Scraper Team