
Easy deep Yellow Pages extraction

Posted: Fri Apr 08, 2011 9:17 pm
by webmaster
This project is a useful sample of Helium Scraper's new features. It takes a list of search terms and cities, runs each search, and follows every business link in the results so you can extract whatever you want from each business's own page.

All you have to do is fill in the "Input" table with the search terms and cities you want to use, then create an actions tree (there is a sample one already created, called "Sample Tree") and add an "Execute Actions Tree" action that executes the "Search in YP" tree. You will be asked to enter the number of pages to go through. Remember that Yellow Pages shows 30 results per page. Then add an "Extract" action as a child node of the "Execute Actions Tree" action, and extract all you want.
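As a rough guide for choosing the number of pages, the 30-results-per-page figure can be turned into quick arithmetic. The sketch below is plain Python and has nothing to do with Helium Scraper's own actions; the function name `pages_needed` is made up for illustration.

```python
import math

RESULTS_PER_PAGE = 30  # figure quoted in this post for Yellow Pages results


def pages_needed(wanted_results: int, per_page: int = RESULTS_PER_PAGE) -> int:
    """Return how many result pages are needed to cover `wanted_results` listings."""
    if wanted_results <= 0:
        return 0
    # Round up: a partly filled last page still has to be visited.
    return math.ceil(wanted_results / per_page)


if __name__ == "__main__":
    print(pages_needed(120))  # 120 listings fit exactly on 4 pages
    print(pages_needed(100))  # also 4 pages, the last one only partly used
```

So if you want roughly 100 listings for a search, entering 4 pages should be enough, provided the search actually returns that many results.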

Remember that the "Extract" action will be executed inside each of the business pages and not on the results pages. This is for you to have access to more detailed information about the business.

Give it a try and let me know how it goes.

Re: Easy deep Yellow Pages extraction

Posted: Wed Jun 01, 2011 12:11 am
by rmbraaten
Love this custom script. For some reason, though, it stops on the fourth page of listings, so I only get 120 listings (with their accompanying details). The repeat value on the "Search in YP" action is set to 2. I can set it to 3 or 4, but it doesn't make a difference, so I'm not sure how that number affects anything. I don't get an error after the 4th page is loaded and saved, just an "action completed" message (or something to that effect), when I'd actually like it to load a couple more pages. I can't figure out what's causing it to stop. Any suggestions?

Thanks!

Re: Easy deep Yellow Pages extraction

Posted: Wed Jun 01, 2011 2:43 am
by webmaster
Hi,

You should enter the number of pages in the "Execute Tree: Search in YP" action by double-clicking it. Do not change it in the "Repeat" action, otherwise you'll just get repeated results.

Re: Easy deep Yellow Pages extraction

Posted: Wed Jun 01, 2011 4:49 pm
by rmbraaten
Yes! That's it. Thanks for a quick response.

Re: Easy deep Yellow Pages extraction

Posted: Fri Jun 03, 2011 3:04 pm
by rmbraaten
Would it be possible to add the number of pages variable to the input table? For example, if my input table includes Schools and Restaurants, the schools may only require 1 page of search results while the restaurants may need 5 or 6. If I could define the pages per input record (i.e., "Input.Business", "Input.City", "Input.Pages") I could put all my yp search terms in the input table and let it cycle through all of them without returning the "Can't find NextButton" error. Unless there's another, better way to solve this. Thanks.

Re: Easy deep Yellow Pages extraction

Posted: Sat Jun 04, 2011 7:47 am
by webmaster
Hi,

I'm attaching a project that is pretty much the same as the original, but instead of entering a number of pages, you enter a maximum number of pages. This way, if fewer pages are available than the given maximum, you won't get any error message and Helium Scraper will simply extract as many results as are available.

I think this should solve your problem.
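For anyone curious about the idea behind the "maximum pages" approach, here is a minimal sketch in plain Python. It is not how the attached project is built (that is done with Helium Scraper actions); the function `collect_results` and the sample data are hypothetical and only simulate result pages as lists.

```python
from typing import List


def collect_results(pages: List[List[str]], max_pages: int) -> List[str]:
    """Gather results page by page, stopping at `max_pages` or when pages run out."""
    collected: List[str] = []
    visited = 0
    # Stop quietly when there is no further page, instead of raising an error
    # the way a missing "next" button would.
    while visited < max_pages and visited < len(pages):
        collected.extend(pages[visited])
        visited += 1
    return collected


if __name__ == "__main__":
    # Hypothetical searches: one with a single page, one with three pages.
    schools = [["School A", "School B"]]
    restaurants = [["R1", "R2"], ["R3", "R4"], ["R5"]]

    print(len(collect_results(schools, max_pages=5)))      # only 1 page exists
    print(len(collect_results(restaurants, max_pages=2)))  # capped at 2 pages
```

With a cap like this, the same setting can be used for every row of the input table: short searches end early and long ones stop at the maximum.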

Re: Easy deep Yellow Pages extraction

Posted: Tue Jun 07, 2011 4:50 pm
by rmbraaten
Excellent! Thank you. That's a more elegant solution.

Re: Easy deep Yellow Pages extraction

Posted: Fri Dec 23, 2011 11:29 pm
by liquidcherry
Hello Juan,

I played with HS a while ago and was able to extract the info I needed with your original YP file. I just installed HS again and tried your modified file, and it is not extracting the phone number and address (only the name). I have spent some time trying to find out why, but it exceeds my powers. (I also tried to adapt things so that I could scrape the vendor listings from getmarried.com, to no avail.) Do you have any idea what I am doing wrong, or is it because YP changed their site again since the last time I tried?

Any help would be highly appreciated

Frank

Happy Holidays!!

Re: Easy deep Yellow Pages extraction

Posted: Mon Dec 26, 2011 7:21 pm
by webmaster
Hi,

Here is an updated version. All I needed to do was select, on a sample page, the items that weren't being extracted and add them to their corresponding kinds (phone and address) with the "Add selection to this kind" button.

Re: Easy deep Yellow Pages extraction

Posted: Wed Mar 12, 2014 5:30 pm
by Guest
Hi Juan,
Thanks, will play with it... and sorry for the late response, I have been busy with some stuff :)

kudos to helium, it is an awesome product!

cheers

liquid