Fields not populating table...

vandigroup
Posts: 7
Joined: Sun Mar 25, 2012 5:20 am

Post by vandigroup » Sun Mar 25, 2012 5:31 am

Hello. I demoed Helium with great success and have been really impressed with it. Unfortunately, I purchased it last night and the first job I tried to run is not working. I have read all the forum posts and the help manual, and finally broke my project down to the very basics while following the video. When I run the action, it goes through each page correctly, but nothing gets populated in the db table. Can you please take a look at the file and tell me where I am going wrong? Your help would be most appreciated. Thanks!

My file:
gloves.rar
(32.6 KiB) Downloaded 659 times
The address I am starting from is: http://www.combatsports.com/csi/gloves.html

PS. I had to zip the file because the maximum size allowed is 512kb.

webmaster
Site Admin
Posts: 521
Joined: Mon Dec 06, 2010 8:39 am

Re: Fields not populating table...

Post by webmaster » Sun Mar 25, 2012 3:29 pm

Hi,

Here is where you need to use a Force Select action. You can add this action by clicking on the New Action button in any actions tree and then Execute Actions Tree -> More.... The problem is that this page seems to be loading some content with AJAX, so Helium Scraper cannot recognize the title or any other kind until a little while after the page has loaded. The Force Select action will wait until any chosen kind can be found in the page before continuing. I added this action to your project and it is working fine now. You can double-click the Force Select action to see how it is configured.
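For readers wondering what "wait until a kind can be found" amounts to, here is a rough sketch of the general polling idea in plain JavaScript. This is not Helium Scraper's implementation; the function name waitFor and its parameters (find, timeoutMs, intervalMs) are made up for illustration.

```javascript
// Poll until find() returns something, or give up after timeoutMs.
// All names here are hypothetical; this only illustrates the waiting idea.
function waitFor(find, timeoutMs = 10000, intervalMs = 250) {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + timeoutMs;
    (function poll() {
      const found = find();
      if (found !== null && found !== undefined) {
        resolve(found); // the content has finally shown up
      } else if (Date.now() >= deadline) {
        reject(new Error('Timed out waiting for element'));
      } else {
        setTimeout(poll, intervalMs); // not there yet, try again shortly
      }
    })();
  });
}
```

In a browser, find could be something like () => document.querySelector(someSelector), where someSelector matches whatever the AJAX call eventually inserts.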

Let me know if you have any other questions!
Attachments
gloves.hsp
(630.41 KiB) Downloaded 595 times
Juan Soldi
The Helium Scraper Team

Re: Fields not populating table...

Post by vandigroup » Sun Mar 25, 2012 4:53 pm

Thank you for the quick and accurate reply! I looked at it and it is now working perfectly. I do have a couple more questions. Please feel free to point me to a post if any have already been answered.

1. When scraping the price, some product pages have two prices: the original price and the "sale" price. After reviewing the code, I see that both prices are in a span with a class of "price". The difference is the IDs, BUT they are dynamically generated to include an internal system SKU (which I have no reference to). I know in jQuery I would use :last-child to grab the correct element. Can I use this here?
Example URL: http://www.combatsports.com/csi/combat- ... gel-9.html

2. I would also like to take things one step further and pull the colors/sizes, but what confused me was that some selects are loaded dynamically depending on the first selection. Ideally, I need to generate a different line item for every option.
Example URL: http://www.combatsports.com/csi/combat- ... gel-9.html

3. On each product page there are multiple images, but I am not sure what to do because each one needs to be loaded by clicking on a thumbnail. I simply want to be able to grab all available images.

Thank you again for your wonderful program. For years I have looked at others and they always seemed overly complicated and confusing. Your UI is intuitive and to the point... along with your support, wonderful job!

Joe

Re: Fields not populating table...

Post by webmaster » Tue Mar 27, 2012 1:50 am

1. Before anything else, try creating a kind that selects only the sale price and another kind that selects the price when there is only one price. Then you can use a set kind (with the Create Set Kind button on top) that is the union of these two kinds. If either of these kinds starts selecting more elements than you need, you can try going to Project -> Options -> Select Property Gatherers, selecting all gatherers under the Kind Defining tab, and then creating your kinds again. If this still doesn't work, you'll need some JavaScript. Perhaps you could try creating a kind that selects the whole price box (just click on any price and click on the Select Parent button until you see "<div class="price-box">"), and then adding a JavaScript gatherer at Project -> JavaScript Gatherers with something like this:

return element.children[1].innerText;
And then use your gatherer instead of innerText on your Extract action.
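A slightly more defensive variant is sketched below, in the spirit of the :last-child idea from the question. It assumes the sale price is always the last child of the price box, which has not been checked against every product page; the helper name lastChildText is made up for this example.

```javascript
// Return the text of the last child of the price box.
// Assumption (unverified): when two prices are shown, the sale price comes last.
// Falls back to the box's own text when it has no child elements.
function lastChildText(element) {
  const children = element.children;
  if (!children || children.length === 0) {
    return (element.innerText || '').trim();
  }
  const last = children[children.length - 1];
  return (last.innerText || '').trim();
}
```

Inside a gatherer, the body would presumably end with return lastChildText(element); so that pages with a single price and pages with two prices both return one value.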

2. You might want to take a look at the Select each item in a list premade project at File -> Online Premades. You'll need to import this action and then run it from an Execute Actions Tree action. The bad news is that you'd need to modify the code to make the second box update itself every time you change the item in the first box. I just ran a quick test, and it seems some extra events need to be fired (the current code attempts to fire the "change" event in a few different ways), but I still can't figure out which ones. You might want to take a look at the code in the Execute JS (Select Each) action in the premade.
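As a starting point for experimenting, the sketch below sets the selected option and fires a configurable list of events on the select box. It is not the premade's code and has not been tested against this page; the function name selectAndNotify and the default event list are guesses, and the page may well listen for something else.

```javascript
// Select an option by index, then fire each named event on the <select>.
// The default event names are a guess; pass a different list to experiment.
function selectAndNotify(select, index, eventNames = ['input', 'change']) {
  select.selectedIndex = index;
  for (const name of eventNames) {
    // bubbles: true lets handlers attached to ancestor elements see the event
    select.dispatchEvent(new Event(name, { bubbles: true }));
  }
}
```

Trying different lists, for example ['click', 'input', 'change', 'blur'], is one way to narrow down which event makes the second box refresh.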

3. In about 99% of the cases I've seen, the URL of the thumb image is a variation of the URL of the large picture. Here is a sample thumb URL:

http://lghttp.5679.nexcesscdn.net/8042C ... /tg1_1.jpg

And here is the larger version:

http://lghttp.5679.nexcesscdn.net/8042C ... /tg1_1.jpg

See the pattern? You'll basically need to use a javascript gatherer that takes the thumb and returns the src attribute with the "thumbnail/56x" replaced by "image/400x400".
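A minimal sketch of that replacement is below. The function name thumbToLargeUrl is made up, and the two path fragments are simply the ones quoted above; other sites will use different ones.

```javascript
// Turn a thumbnail URL into the matching large-image URL by swapping
// the size segment of the path. Returns the input unchanged if the
// thumbnail segment is not present.
function thumbToLargeUrl(src) {
  return src.replace('thumbnail/56x', 'image/400x400');
}
```

In a gatherer applied to the thumbnail elements, the body would then be something along the lines of return thumbToLargeUrl(element.src);.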

By the way, if you haven't read it yet, you might find this post useful.
Juan Soldi
The Helium Scraper Team

Re: Fields not populating table...

Post by vandigroup » Thu Mar 29, 2012 4:37 am

Thanks for your help. Everything is working great now! Truly an amazing product...
