Hi
I am an occasional user of Helium Scraper and I am still learning how to use it.
I know HS can do what I want, but I am still trying to understand the basics.
Please advise on the following:
1. If I want to scrape a site where the code / link is, for example, Wholesale Suppliers <a href="http://www.wholesalers4u.co.uk">Wholesale Suppliers</a>, I want to extract only the URL (www.wholesalers4u.co.uk). Sometimes the link sits in the middle of a paragraph of words that I do not want.
2. If the address is spread over a few lines (1 title, 2 street, 3 city, 4 state, 5 zip code), I want to extract each part separately. When I select it in HS, it only selects the whole section.
3. Many of the sites I want to scrape are structured as category => list => detail. How do I do this? I can get HS to go to the list, then the detail, and scrape, but not when there is another layer.
Any help would be greatly appreciated.
Frustrated User
Re: Frustrated User
Hi,
tipud wrote:
1, If i want to scrape a site, the code / link for example is Wholesale Suppliers <a href="http://www.wholesalers4u.co.uk">Wholesale Suppliers</a> I want to extract the url (http://www.wholesalers4u.co.uk) sometimes this could be in among a paragraph of words which i do not want.

You should still be able to create a kind that selects only the link and then extract its Link property, which contains the target URL.
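For anyone who wants to see the same idea outside Helium Scraper, here is a minimal Python sketch. It is not Helium Scraper code; it only illustrates pulling the link target out of an anchor while ignoring the words around it. The names LinkCollector and extract_links and the sample paragraph are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag and ignores all other text."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the link targets found in an HTML fragment, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical paragraph: the link is surrounded by words we do not want.
sample = (
    "Some words we do not want. Wholesale Suppliers "
    '<a href="http://www.wholesalers4u.co.uk">Wholesale Suppliers</a> '
    "and some more words."
)
links = extract_links(sample)
```

Only the href attribute is kept, so the surrounding paragraph text never reaches the result.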
tipud wrote:
2, If the address is in a few lines - 1 title, 2 street, 3 City, 4 state, 5 Zip code, i want to extract this seperatly, when i select in HS it selects all the section only.

This can be done with Text Gatherers at Project -> Text Gatherers. Here is some more info on how to use them.
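As a rough illustration of what splitting one selected block into separate fields means, here is a small Python sketch. It is not how Text Gatherers are configured in Helium Scraper. It assumes the selected section comes out as one field per line in the order given in the question, and the address itself is made up.

```python
# Field order taken from the question: title, street, city, state, zip code.
FIELDS = ["title", "street", "city", "state", "zip_code"]


def split_address(block):
    """Split a multi-line address block into named fields, one per line."""
    lines = [line.strip() for line in block.splitlines() if line.strip()]
    return dict(zip(FIELDS, lines))


# Hypothetical address block, as it might be selected in one piece.
sample_block = """Acme Wholesale
12 Example Street
Springfield
IL
62704"""

address = split_address(sample_block)
```

If a site leaves out a line or adds one, the positions shift, so a real project would need a pattern per field rather than a fixed line order.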
tipud wrote:
3, many of the sites i want to scrape have category => list => detail how do i do this ? i can get HS to go to list then detail and scrape but not when there is another layer.

You can go as many levels deep as you need by putting one Navigate Each action as a child of another. I'm not sure what you mean by having another layer. Perhaps a URL will help me figure it out.
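The nesting can be pictured as one loop inside another. The Python sketch below is only an analogy, not Helium Scraper code: each for loop plays the role of one Navigate Each action, and the site is a made-up in-memory dictionary (SITE, DETAILS and links_on are hypothetical names) so the example runs without a network connection.

```python
# Hypothetical site: each page maps to the links found on it.
SITE = {
    "/categories": ["/cat/a", "/cat/b"],
    "/cat/a": ["/item/1", "/item/2"],
    "/cat/b": ["/item/3"],
}

# Hypothetical detail pages and the text we want from each one.
DETAILS = {
    "/item/1": "Detail page 1",
    "/item/2": "Detail page 2",
    "/item/3": "Detail page 3",
}


def links_on(page):
    """Return the links found on a page (empty if the page has none)."""
    return SITE.get(page, [])


def scrape(start):
    """Walk category => list => detail and collect one row per detail page."""
    rows = []
    for category in links_on(start):      # outer loop: each category
        for item in links_on(category):   # inner loop: each entry in the list
            rows.append((category, item, DETAILS[item]))
    return rows


rows = scrape("/categories")
```

A further layer would simply be a third loop inside the second one, which is the same thing as adding another child Navigate Each.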
Juan Soldi
The Helium Scraper Team