More than one site to same .csv?

Tommy
Posts: 15
Joined: Sat Mar 26, 2011 12:44 am

More than one site to same .csv?

Post by Tommy » Sat Jul 09, 2011 10:45 am

Can you scrape different sites pulling different content into the same .csv?

webmaster
Site Admin
Posts: 521
Joined: Mon Dec 06, 2010 8:39 am

Re: More than one site to same .csv?

Post by webmaster » Sat Jul 09, 2011 7:37 pm

Are you extracting all these sites from the same project? Also, do the result tables have the same structure?
Juan Soldi
The Helium Scraper Team

Tommy
Posts: 15
Joined: Sat Mar 26, 2011 12:44 am

Re: More than one site to same .csv?

Post by Tommy » Sat Jul 09, 2011 8:56 pm

webmaster wrote: Are you extracting all these sites from the same project? Also, do the result tables have the same structure?
I haven't worked out how to scrape multiple sites in the same project to be honest. If this can be done, then yes, it will be in the same project ;)

I'm guessing it might be easier to scrape each site individually and do some post-processing, but with the (hopefully) upcoming update that lets us create stand-alone scrapers, I'm wondering if this is something that could be built.

webmaster
Site Admin
Posts: 521
Joined: Mon Dec 06, 2010 8:39 am

Re: More than one site to same .csv?

Post by webmaster » Sat Jul 09, 2011 9:13 pm

Well, what I would do is open any spreadsheet program (there are free ones), paste in the data from one table, then paste the data from the next table below it, and so on.
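For anyone who would rather script that post-processing step than paste by hand, here is a minimal sketch in JavaScript. It is not part of Helium Scraper; `mergeCsvTables` is a hypothetical helper name, and it assumes every exported .csv starts with the same header row and that no field contains a line break inside quotes.

```javascript
// Hypothetical helper, not a Helium Scraper API: merge several CSV texts
// that share the same header row into one CSV text.
// Assumption: no quoted field contains an embedded newline.
function mergeCsvTables(csvTexts) {
  let header = null;
  const rows = [];
  for (const text of csvTexts) {
    // Split into lines and drop blank ones (e.g. from a trailing newline).
    const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
    if (lines.length === 0) continue;
    const [firstLine, ...dataLines] = lines;
    if (header === null) {
      header = firstLine; // the first table defines the header
    } else if (firstLine !== header) {
      // Tables with different structures cannot be stacked safely.
      throw new Error("Header mismatch: " + firstLine);
    }
    rows.push(...dataLines);
  }
  return header === null ? "" : [header, ...rows].join("\n") + "\n";
}
```

The header check mirrors the "same structure" requirement: if two exports have different columns, the function stops rather than silently producing misaligned rows.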

But you can also send the output of multiple "Extract" actions to a single table, as long as they have the same structure and are in the same project, by giving them the same table name (when you get the "Replace existing table?" prompt, just say yes). The easiest way to do this is to duplicate an existing "Extract" action by right-clicking it and selecting "Duplicate Node". You can then drag and drop it onto another actions tree and change the kinds it uses, which will most likely be necessary.

To extract from more than one site in one project, you can use one actions tree for each site. You could also add an "Execute JavaScript" action at the beginning of each tree with this code:

window.location.href = "http://www.somesite.com";
which will take you to "www.somesite.com". Furthermore, you can create another actions tree that executes all your other actions trees one by one by using the "Execute Actions Tree" action.
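A slightly more defensive variant of that navigation snippet might look like the sketch below. `goToSite` is a hypothetical name, not something Helium Scraper provides, and the sketch assumes the embedded browser runs a modern JavaScript engine with the `URL` constructor, which this thread does not confirm. It skips the navigation when the browser is already on the target host, so a tree that starts on the right site does not trigger an unnecessary reload.

```javascript
// Hypothetical helper: navigate to targetUrl unless already on that host.
// `loc` defaults to the page's window.location; it is a parameter only so
// the function can be exercised outside a browser.
function goToSite(targetUrl, loc) {
  loc = loc || window.location;
  const targetHost = new URL(targetUrl).hostname;
  if (loc.hostname === targetHost) {
    return false; // already on the site, nothing to do
  }
  loc.href = targetUrl; // same effect as window.location.href = "..."
  return true;
}

// Inside an "Execute JavaScript" action the call would be:
// goToSite("http://www.somesite.com");
```

If the engine turns out not to support `URL`, the original one-line assignment remains the simpler choice.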

But if all these sites happen to have exactly the same structure (such as when they use the same template) you could just use a single actions tree.

Hope this made sense.
Juan Soldi
The Helium Scraper Team

Tommy
Posts: 15
Joined: Sat Mar 26, 2011 12:44 am

Re: More than one site to same .csv?

Post by Tommy » Sun Jul 10, 2011 4:42 pm

Thanks for the detailed reply.

It's great that scraping multiple sites in a single project is possible. Roll on the stand-alone scraper feature ;)
