HS skipping/duplicating data

Posted: Tue Sep 06, 2011 4:17 pm
by doug
My project scrapes 20 rows from each page in a sequence of pages. When I finish and use one of the kinds as a key in a database, I'm told there are duplicates, even though the data is guaranteed to be unique. Examining the data shows that some extracts were duplicated a page at a time, and also that a page of data is missing: HS extracted a page using the data from the previous load. I've tried force-select, but that doesn't help. (If the forced kind was already in the old page, it seems HS is satisfied?) I've also tried waiting up to 9 seconds, but that is tedious and still fails sometimes. What can be used as a trigger to signal that new data is ready to extract?

Re: HS skipping/duplicating data

Posted: Wed Sep 07, 2011 4:26 am
by webmaster
Hi,

I'd need to look at your project to see exactly what's causing duplicated content. But I've made a small variation to the Force Select premade that might help with your problem. This one will force-select-new-content. It simply stores the last selected content and keeps trying to select until different content is found or until the timeout is reached.

Note that it stores this data in the Global.UserData variable, so if you use this variable anywhere else, the code would need to be modified a little bit. Let me know if you need help with this. Also, you can only use the Force Select New Content actions tree at one point in your extraction tree. I think this should be good enough for your particular case.

You should place the Force Select New Content right before your Extract action.
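
For anyone who wants to see the idea outside of HS, here is a minimal Python sketch of the logic described above: remember the last selected content and keep re-selecting until it changes or a timeout runs out. This is not the premade's actual code. The function name wait_for_new_content, the select callback, and the timeout and poll_interval parameters are all hypothetical names made up for illustration.

```python
import time
from typing import Callable, Optional


def wait_for_new_content(
    select: Callable[[], Optional[str]],
    last_content: Optional[str],
    timeout: float = 10.0,
    poll_interval: float = 0.25,
) -> Optional[str]:
    """Poll `select` until it returns content different from `last_content`.

    `select` is a hypothetical callback that returns the currently selected
    content (for example, the text of the first row), or None if nothing
    could be selected yet. Returns the new content, or None on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        content = select()
        # New content is anything selectable that differs from the last page.
        if content is not None and content != last_content:
            return content
        # Give up once the timeout has passed; the caller decides what to do.
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)
```

The caller would keep the returned value and pass it back in as last_content on the next page, which is the role Global.UserData plays in the premade. A None result means the page never changed within the timeout, so it is probably safer to skip or retry that page than to extract the old rows again.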