Iframe scrape possible?

Questions and answers about anything related to Helium Scraper
mrwhite
Posts: 6
Joined: Wed Feb 16, 2011 7:07 pm

Iframe scrape possible?

Post by mrwhite » Thu Mar 17, 2011 4:12 am

Hi,

I'm having trouble scraping a particular song lyrics site (metrolyrics.com). The lyrics appear to be inside an iframe, and I can't even drag my mouse to select them. The best I could do was click to highlight the lyrics box, but when I run the scrape, the lyrics don't get extracted. Any chance you can take a look?

Here's the link: http://www.metrolyrics.com/top100.html. I want to scrape the lyrics from each of the top 100 songs.

thx!

webmaster
Site Admin
Posts: 494
Joined: Mon Dec 06, 2010 8:39 am

Re: Iframe scrape possible?

Post by webmaster » Thu Mar 17, 2011 7:41 am

OK, I found a solution that involves a little JavaScript. Create a new JavaScript Gatherer called "IFrameContent" and paste the following code into it:

Code:

return element.contentWindow.document.body.innerText;

Now, in the selection panel, click "Choose visible properties" and, in the dialog, select only "JS_IFrameContent". Then go to the lyrics page and, with Selection Mode active, select the IFrame by clicking somewhere on the lyrics. The lyrics should show up at the bottom.

Remember that you also need to change the "InnerText" property (or whichever one you had set) to "JS_IFrameContent" in the Extract action. Also, since the IFrame seems to load its content dynamically, you might need to add a Wait action right before the Extract action.
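
Because the IFrame loads its content dynamically, the one-line gatherer can fail if it runs before the frame's body exists. Below is a more defensive sketch of the same idea. The helper name getIFrameText is hypothetical, and returning an empty string for an unreadable frame is an assumption made for illustration, not documented Helium Scraper behavior.

```javascript
// Hypothetical helper: returns the text of an iframe element, or an empty
// string when the frame cannot be read (not loaded yet, or access denied).
function getIFrameText(element) {
  try {
    // contentDocument is the standard DOM property; contentWindow.document
    // is the form used in the original one-liner.
    var doc = element.contentDocument ||
      (element.contentWindow && element.contentWindow.document);
    if (!doc || !doc.body) {
      return ""; // frame content is not available yet
    }
    // innerText matches the original snippet; textContent is a fallback.
    return doc.body.innerText || doc.body.textContent || "";
  } catch (e) {
    // Reading the document of a frame from another origin can throw.
    return "";
  }
}
```

Inside the gatherer, the last line would then be something like "return getIFrameText(element);", assuming the gatherer body accepts a function declaration. If it does not, the same checks can be written inline against element. An empty result is a hint that the Wait action needs to be longer.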
Juan Soldi
The Helium Scraper Team
