Infinite Scroll
Posted: Fri Jan 04, 2019 7:56 am
This function handles pages with infinite scroll, even when the scrollable area is not the document itself but an element inside the page. For a quick start, check out this video tutorial on how to import and use this premade from within Helium Scraper.
The maximum number of elements to be extracted can be limited with the Sequence.Take function, as in the Sequence.Take example in this post. That example assumes the project already contains a selector called ListItem that selects the list items, and that the function code was pasted under a global called InfiniteScroll.
Note that this example extracts 500 list items, not 500 pages; a single page usually contains many list items.
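To illustrate why the limit counts items rather than pages, here is a minimal Python sketch, not Helium Scraper code. It assumes Sequence.Take behaves like a lazy "take the first n" over a stream of items; `scrolled_items`, `batch_size` and the `scrolls` counter are hypothetical names invented for this illustration.

```python
from itertools import islice
from typing import Iterator

scrolls = 0  # counts how many simulated scrolls were needed (illustration only)


def scrolled_items(batch_size: int) -> Iterator[str]:
    """Hypothetical endless feed: each scroll loads batch_size more items."""
    global scrolls
    index = 0
    while True:
        scrolls += 1  # one scroll brings in one batch of items
        for _ in range(batch_size):
            yield f"item-{index}"
            index += 1


# Take the first 500 items, however many scrolls that requires.
first_500 = list(islice(scrolled_items(batch_size=20), 500))
```

With 20 items per scroll in this simulation, the limit of 500 is reached after 25 scrolls, and no further scrolling is requested once the limit is hit.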
This function takes the following arguments:
- itemSelector: A selector that selects the items in the scrollable area, typically each of the result items. When removeOldElements is true, previously extracted elements are deleted after scrolling down to minimize memory consumption. These elements can usually be selected automatically using the Detect List button on the page.
- scrollDelay: The time, in milliseconds, to wait for new data to load after scrolling down.
- removeOldElements: Whether to remove old elements after they've been extracted. Should be true unless removing elements breaks the page's scrolling mechanism.
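As a rough illustration of how these three arguments interact, here is a Python simulation, not the premade's actual implementation. `FakePage` and `infinite_scroll` are hypothetical names, and the stopping rule (stop when a scroll brings no new items) is an assumption made for this sketch.

```python
import time
from typing import Iterator, List


class FakePage:
    """Hypothetical stand-in for a page whose list grows on scroll."""

    def __init__(self, total_items: int, batch_size: int) -> None:
        self.total_items = total_items
        self.batch_size = batch_size
        self.loaded = 0            # items the simulated server has sent so far
        self.dom: List[str] = []   # items currently present in the page
        self.scroll_to_bottom()    # the initial load brings the first batch

    def scroll_to_bottom(self) -> None:
        # Scrolling appends the next batch, if any items remain.
        end = min(self.loaded + self.batch_size, self.total_items)
        self.dom.extend(f"item-{i}" for i in range(self.loaded, end))
        self.loaded = end


def infinite_scroll(page: FakePage, scroll_delay_ms: int,
                    remove_old_elements: bool) -> Iterator[str]:
    seen = 0  # DOM entries already yielded (only advances when not removing)
    while True:
        fresh = page.dom[seen:]
        if not fresh:
            return  # assumed stop rule: the last scroll loaded nothing new
        yield from fresh
        if remove_old_elements:
            page.dom.clear()       # drop extracted elements to save memory
        else:
            seen = len(page.dom)   # keep elements, remember where we stopped
        page.scroll_to_bottom()
        time.sleep(scroll_delay_ms / 1000)  # give new data time to load


items = list(infinite_scroll(FakePage(25, 10), 0, True))
```

In this sketch both settings of `remove_old_elements` yield the same items; the difference is only how many elements stay in the simulated page while scrolling.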
To use this on your project, follow these instructions:
Code (function definition):
function (itemSelector scrollDelay removeOldElements)
Browser.ScrollLoop
· itemSelector
· Sequence.First
· itemSelector
Browser.ScrollToBottom
Browser.Wait
· scrollDelay
· removeOldElements
Code (Sequence.Take example):
Sequence.Take
· 500
· InfiniteScroll
· Select.ListItem
· 100
· true