Running Out of Resources (Auto scrolling to bottom of page)

hscraper
Posts: 5
Joined: Wed Jun 08, 2016 6:12 pm

Post by hscraper » Wed Jun 08, 2016 6:17 pm

I am trying to scrape a page that only loads more data as I scroll, so I have to scroll many times to reach the bottom before everything I want to scrape is loaded. I came up with a way to auto-scroll toward the bottom of the page by executing this JavaScript: var scroll = setInterval(function(){ window.scrollBy(0, 100); }, 2000);

This seems to work; however, I cannot complete the process because I keep getting an error saying that HS is running out of resources. HS then freezes and crashes. Is there a way around this?

Please note, I'm not a coder and am pretty new to this - speak to me in small words, any step-by-step help would be appreciated. ;)

Thanks

hscraper
Posts: 5
Joined: Wed Jun 08, 2016 6:12 pm

Re: Running Out of Resources (Auto scrolling to bottom of page)

Post by hscraper » Fri Jun 10, 2016 8:51 pm

Could you please create (or could someone share) a premade to load more data by scrolling down?

hscraper
Posts: 5
Joined: Wed Jun 08, 2016 6:12 pm

Re: Running Out of Resources (Auto scrolling to bottom of page)

Post by hscraper » Fri Jun 24, 2016 6:02 pm

I am having the exact same issue. Any guidance on a solution would be awesome!

webmaster
Site Admin
Posts: 494
Joined: Mon Dec 06, 2010 8:39 am

Re: Running Out of Resources (Auto scrolling to bottom of page)

Post by webmaster » Fri Jun 24, 2016 9:56 pm

Hi,

The page keeps all the HTML and images it loads in memory, and more get added every time you scroll to the bottom, so regardless of what you do, scrolling down will eventually crash the browser. Disabling images in IE might let you scroll a bit further.

I can think of a couple of safe workarounds but they'd all require JavaScript coding, such as clearing the HTML elements that have already been extracted from the page, or using the page's API to get this data.
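
As a rough illustration of the first workaround (clearing elements that have already been extracted), here is a minimal sketch. It is not a Helium Scraper feature: the function name pruneExtracted, the ".item" selector, and the keepLast count are all made up for this example, and you would need to replace ".item" with whatever matches the rows on the actual page. It assumes the older rows have already been extracted before it runs, since anything it removes is gone from the page.

```javascript
// Sketch only. pruneExtracted is a hypothetical helper, not part of Helium Scraper.
// It removes all but the newest `keepLast` elements matching `selector`,
// so the browser no longer has to hold the older ones in memory.
function pruneExtracted(doc, selector, keepLast) {
  // querySelectorAll returns a static list, so removing while looping is safe.
  var items = doc.querySelectorAll(selector);
  var removed = 0;
  for (var i = 0; i < items.length - keepLast; i++) {
    var el = items[i];
    if (el.parentNode) {
      el.parentNode.removeChild(el);
      removed++;
    }
  }
  return removed; // number of elements taken out of the page
}

// Possible use together with the scrolling snippet from the first post
// (".item" and 20 are assumptions; run only after the rows are extracted):
//
//   var scroll = setInterval(function () {
//     pruneExtracted(document, ".item", 20);
//     window.scrollBy(0, 100);
//   }, 2000);
//
// To stop it later: clearInterval(scroll);
```

Keeping the last few rows on the page is deliberate: many pages decide when to load more data based on what is near the bottom, so removing everything could stop new rows from loading. Whether this is enough depends on the page, because images or scripts may still hold on to memory.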
Juan Soldi
The Helium Scraper Team
