extracted table with constant value

Jmarc
Posts: 4
Joined: Fri Aug 16, 2013 5:45 pm


Post by Jmarc » Fri Aug 16, 2013 5:56 pm

I have been testing Helium Scraper for a few days and have gotten better results than with all the other software I tested :-)

Now, there is something I have not managed to do, although it seems like it should be easy...
In the final extracted table built by Helium Scraper, I would like to have a column with a constant value that has no connection to the web page content.
For example, I want to add a column "MyColumn" that contains the constant text "This is my column" for every extracted row.

How can I do this?

Thanks for your help
Jean-marc (France)

crookedleaf
Posts: 38
Joined: Tue Dec 11, 2012 6:44 pm

Re: extracted table with constant value

Post by crookedleaf » Thu Aug 29, 2013 12:16 am

this may not be the best way to do it, but it's the quickest i could think of:

1. Go to the Project menu and click "Text Gatherers"
2. Click the green + button to create a new TG and call it "MyColumn" or whatever you want to call it
3. In the middle on the left, you will see "Source" with a drop down menu. Change the source to "URL" and click the green + sign to the right of that (not the same green + sign at the top that you clicked to create the TG) and click "Slice"
4. Click the "Fixed" tab at the top
5. Set the "Start position" to 4
6. Click the green + next to Source again, click "Replace", and then click "Blank"
7. Type "http" into "Replace"
8. Type whatever you want to be the fixed text into "With"
9. Click the purple floppy disk to save it
10. On your Extract action, put whatever you want the column name to be under "column name", set the kind name to "BODY", and then set the property to whatever you named the TG. If you named it "fixedtext", for example, the property's name will be "JS_fixedtext"

this will do exactly what you're looking for :)
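
In case it helps to see why the trick works, here is a rough Python sketch of the string logic behind the steps above. It is not Helium Scraper code: the function name `constant_from_url` and the example URLs are made up for illustration, and it assumes the Slice step leaves only the first 4 characters of the URL ("http"), which the Replace step then swaps for the fixed text.

```python
def constant_from_url(url: str, constant: str) -> str:
    """Mimic the Slice + Replace chain applied to a page URL (hypothetical helper)."""
    # Assumption: the Slice step keeps only the first 4 characters,
    # which is "http" for both http:// and https:// URLs.
    prefix = url[:4]
    # Replace step: swap that prefix for the fixed text.
    return prefix.replace("http", constant)


# Every page URL collapses to the same constant value.
rows = [
    constant_from_url(u, "This is my column")
    for u in ("http://example.com/page1", "https://example.com/page2")
]
print(rows)
```

Since every page URL starts with "http", the result is the same text on every row, no matter which page is being extracted.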
