HS uses a lot of resources

caliman
Posts: 21
Joined: Tue May 31, 2011 5:12 pm

Post by caliman » Tue Nov 15, 2011 3:04 pm

Hi, in the last two versions of HS I have noticed that the software uses a large amount of my PC's memory, even before it starts scraping. I have already updated IE and changed some registry settings with regedit, based on one of the threads in this forum. Also, when I am scraping some websites, a window pops up saying that the software does not have enough resources and needs to restart. What can I do? Thanks.

webmaster
Site Admin
Posts: 501
Joined: Mon Dec 06, 2010 8:39 am

Re: HS uses a lot of resources

Post by webmaster » Tue Nov 15, 2011 6:26 pm

Hi,

Try checking your IE version from within Helium Scraper at this site. It should be IE 9. Other than that, the amount of resources used depends on the site being scraped, and sometimes you'll need to restart. If you have configured IE not to download pictures, undo that setting, because it actually increases the memory used. Also, consider using the mobile version of the site being scraped, since mobile versions typically use less scripting.

The amount of memory used has not increased in the latest versions. The only difference is that you'll now get this message, which is there to prevent Helium Scraper from crashing. Finally, if you are scraping from Google, do not use Google Instant, because it uses huge amounts of memory (even when you are browsing manually).

For this kind of situation it is a good idea to break your project apart into sections. You can also have more than one instance of Helium Scraper running at the same time, each executing one section. If you need help with breaking your project apart, let me know. There are several ways to accomplish this, and the best one will depend on the specifics of your project. A simple way to do it is by using a Navigate URLs action.
Juan Soldi
The Helium Scraper Team

caliman
Posts: 21
Joined: Tue May 31, 2011 5:12 pm

Re: HS uses a lot of resources

Post by caliman » Tue Nov 15, 2011 8:06 pm

Thanks Juan. Actually, I am already using a Navigate URLs action. I am scraping this site:

http://www.wordreference.com/sinonimos/consolidacion
And I have 4,000 URLs. How can I break up my project? Thanks a lot for your help and this awesome software.

webmaster
Site Admin
Posts: 501
Joined: Mon Dec 06, 2010 8:39 am

Re: HS uses a lot of resources

Post by webmaster » Tue Nov 15, 2011 8:40 pm

Hi,

What I usually do is create a copy of whatever table is holding my URLs, clear it, and then copy and paste, say, 500 URLs from the original table. Then set your Navigate URLs action to get its URLs from this table. Once it is done, save your project, restart Helium Scraper if it is using too many resources, and continue with the next chunk.

Also, you can save the project under different names, each copy having a different set of 500 URLs in the table used by your Navigate URLs action. This way you could run more than one instance at the same time, which could make the extraction a lot faster.
Juan Soldi
The Helium Scraper Team
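
For anyone who would rather prepare the chunks outside Helium Scraper, here is a minimal Python sketch of the splitting step described above. It is not part of Helium Scraper: the function names, the default chunk size of 500, and the `urls_part_` file prefix are all assumptions made for illustration. Each output file would still have to be pasted into the URL table by hand.

```python
from typing import Iterator, List


def chunk_urls(urls: List[str], size: int = 500) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` URLs, in the original order."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def write_chunks(urls: List[str], size: int = 500,
                 prefix: str = "urls_part_") -> List[str]:
    """Write each chunk to its own text file (one URL per line).

    The file naming scheme is hypothetical; returns the paths written.
    """
    paths = []
    for number, chunk in enumerate(chunk_urls(urls, size), start=1):
        path = f"{prefix}{number:02d}.txt"
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("\n".join(chunk) + "\n")
        paths.append(path)
    return paths
```

With 4,000 URLs and a chunk size of 500, `chunk_urls` produces 8 chunks, so the project would be run 8 times in sequence, or saved as 8 copies if several instances are run in parallel.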
