'go through all pages' resets

crookedleaf
Posts: 38
Joined: Tue Dec 11, 2012 6:44 pm

Post by crookedleaf » Mon Apr 15, 2013 5:56 pm

running into a problem with the 'go through all pages' premade as well as the action. i'll give you a little background on the scrape...

the site i scrape used to have a list of links, and i would 'navigate each' on them. today they changed the design of their website so that it only lists 10 links per page, and you have to click the "next" button or a page number to see more. so i thought, "no problem... i'll use 'go through all pages' to fix it." the problem is that any time it goes back to the list of links, it resets to page 1. this is not a problem with helium, as the same thing happens in a normal browser: if i press the back button, it resets to page 1. there is also a dropdown at the top of the page where you can change the number of links shown. so i tried using 'select option' to set a higher number, but as soon as it goes back to the list of links, it resets to the original number as well.

is there any way that you guys can think of to help get past this issue? :cry: in a pretty bad bind right now.

webmaster
Site Admin
Posts: 521
Joined: Mon Dec 06, 2010 8:39 am

Re: 'go through all pages' resets

Post by webmaster » Wed Apr 17, 2013 4:48 pm

Check whether these links are actual links or AJAX ones. To test this, right-click a link -> Copy shortcut, then paste it into the address bar and press Enter. If that takes you to the same page you'd reach by clicking the link, they are actual links. If so, don't use Navigate Each. Instead, extract the Link property of the links to a table inside a Go Through All Pages action, so that you only ever move forward and never go back to the links page. Then, in another actions tree, use a Navigate URLs action that uses this table.
Juan Soldi
The Helium Scraper Team
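
For readers who want to see the two-pass idea outside of Helium Scraper, here is a minimal Python sketch of it. It is not Helium Scraper's implementation. The in-memory FAKE_SITE, the "item" class, the rel="next" marker, and the function names collect_links and visit_all are all assumptions made up for illustration; a real site would need its own selectors and a real HTTP fetch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory site standing in for the real one (assumption:
# each listing page holds item links plus an optional "next" link).
FAKE_SITE = {
    "http://example.test/list?page=1": (
        '<a class="item" href="/item/1">A</a>'
        '<a class="item" href="/item/2">B</a>'
        '<a rel="next" href="/list?page=2">next</a>'
    ),
    "http://example.test/list?page=2": '<a class="item" href="/item/3">C</a>',
    "http://example.test/item/1": "detail 1",
    "http://example.test/item/2": "detail 2",
    "http://example.test/item/3": "detail 3",
}


def fetch(url):
    """Stand-in for an HTTP GET; returns the page body for a URL."""
    return FAKE_SITE[url]


class ListingParser(HTMLParser):
    """Collects item hrefs and the 'next' href from one listing page."""

    def __init__(self):
        super().__init__()
        self.item_links = []
        self.next_link = None

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href is None:
            return
        if attrs.get("rel") == "next":
            self.next_link = href
        elif attrs.get("class") == "item":
            self.item_links.append(href)


def collect_links(fetch, start_url):
    """Pass 1: move forward through the listing pages only, saving
    every item URL to a table. No detail page is opened here, so the
    listing never gets a chance to reset to page 1."""
    table = []
    seen = set()  # guards against a 'next' link that loops back
    url = start_url
    while url is not None and url not in seen:
        seen.add(url)
        parser = ListingParser()
        parser.feed(fetch(url))
        table.extend(urljoin(url, href) for href in parser.item_links)
        url = urljoin(url, parser.next_link) if parser.next_link else None
    return table


def visit_all(fetch, table):
    """Pass 2: open each saved URL directly, never via the listing."""
    return [fetch(url) for url in table]


links = collect_links(fetch, "http://example.test/list?page=1")
details = visit_all(fetch, links)
```

The point of splitting the work is that the second pass never touches the listing page, so its pagination state no longer matters.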

crookedleaf
Posts: 38
Joined: Tue Dec 11, 2012 6:44 pm

Re: 'go through all pages' resets

Post by crookedleaf » Wed Apr 17, 2013 6:22 pm

Yes, they are actual links and not AJAX. I am going to try this now. The link does contain a Session ID number, but this shouldn't be a problem. I will let you know if I have any problems. Thanks!
