
Saving full HTML of URLs

Posted: Sun May 03, 2020 11:51 pm
by sawal86
Hello!
I have a list of URLs, and I just want to save each page as an HTML file on my PC.

I want to use the value of a selector as the name of each HTML file, for example Select.CompanyName.

Is this possible in Helium3?

If so, could you help with the template, please?

Regards.
Aleks.

Re: Saving full HTML of URLs

Posted: Tue May 19, 2020 10:27 pm
by webmaster
You can use Gather.HTML to get the HTML of the current page (or of a particular element, when an element is selected). Since version 3.2.4.8, you can also use Sequence.WriteFile to write files with arbitrary text content. In your case, you could do something like this, assuming every page you visit has a title element matched by a selector called Title:

Code:

Query.URLs
as (url)
Browser.Load
   ·  url
extract
   html
      Gather.HTML
      as html
      Select.Title
      as title
      Sequence.WriteFile
         ·  html
         ·  +
               ·  title
               ·  ".html"
         ·  false
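
One thing to watch for when a page title becomes a file name: titles often contain characters such as : or / that Windows does not allow in file names. As a rough illustration only, here is the same idea (take the title, build a file name from it, write the HTML) sketched in plain Python rather than Helium Scraper. The names TitleParser, safe_filename and save_page are made up for this sketch, and the set of characters replaced is an assumption based on the usual Windows restrictions.

```python
import re
from html.parser import HTMLParser
from pathlib import Path


class TitleParser(HTMLParser):
    """Hypothetical helper: collects the text of the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def safe_filename(title, fallback="untitled"):
    # Replace characters that Windows rejects in file names (assumed set),
    # then trim surrounding spaces and dots.
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", title).strip(" .")
    return cleaned or fallback


def save_page(html, out_dir):
    # Name the file after the page title and write the full HTML into it.
    parser = TitleParser()
    parser.feed(html)
    title = "".join(parser.parts).strip()
    path = Path(out_dir) / (safe_filename(title) + ".html")
    path.write_text(html, encoding="utf-8")
    return path
```

If titles on your pages can repeat, later files would overwrite earlier ones in this sketch, so you may want to add something unique (such as part of the URL) to the name.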