I have been testing Helium Scraper for a few days and have managed to get better results than with all the other software I tested.
However, I have not managed to do something that seems like it should be easy.
In the final extracted table built by Helium Scraper, I am trying to have a column with a constant value that is unrelated to the web page content.
For example, I want to add a column "MyColumn" that contains the constant text value "This is my column" for every extracted row.
How can I do this?
Thanks for your help
Jean-marc (France)
extracted table with constant value
Re: extracted table with constant value
This may not be the best way to do it, but it's the quickest I could think of:
1. Go to the Project menu and click "Text Gatherers".
2. Click the green + button to create a new TG and call it "MyColumn" or whatever you want to call it.
3. In the middle on the left, you will see "Source" with a drop-down menu. Change the source to "URL", click the green + sign to the right of it (not the green + sign at the top that you clicked to create the TG), and click "Slice".
4. Click the "Fixed" tab at the top.
5. Set the "Start position" to 4.
6. Click the green + next to "Source" again, click "Replace", and then click "Blank".
7. Type "http" into "Replace".
8. Type whatever you want the fixed text to be into "With".
9. Click the purple floppy disk to save it.
10. On your Extract action, put whatever you want the column name to be under "column name", set the kind name to "BODY", and then set the property to whatever you named the TG. If you named it "fixedtext", for example, the property's name will be "JS_fixedtext".
This will do exactly what you're looking for.
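In case it helps to see why this trick gives the same value on every page, here is a rough sketch in plain Python. It is not Helium Scraper code and does not use any Helium Scraper API. The function name `constant_from_url` is made up for illustration, and it assumes that the Slice step leaves only the leading "http" of the page URL, which the Replace step then swaps for the fixed text.

```python
def constant_from_url(url: str, fixed_text: str) -> str:
    """Model of the Slice + Replace idea from the steps above (hypothetical helper)."""
    # Assumption: the Slice step keeps only the first four characters of the URL,
    # which is "http" for both http:// and https:// pages.
    prefix = url[:4]
    # Replace step: swap "http" for the constant text.
    return prefix.replace("http", fixed_text)


# Every page URL produces the same column value.
urls = ["http://example.com/page1", "https://example.com/page2"]
values = [constant_from_url(u, "This is my column") for u in urls]
```

Note that under this assumption a URL that does not start with "http" (a local file, for example) would keep its own first four characters instead of the fixed text.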