Looping through web tables
-
- Posts: 20
- Joined: Thu May 12, 2011 4:05 pm
Looping through web tables
I haven't installed Helium Scraper yet as I don't have the admin privileges to install the .NET framework. Can you confirm Helium can read data in from a file (csv or text), use that to conduct searches, then export the results back out?
I was using Automation Anywhere to scrape data from this site
http://www-947.ibm.com/support/entry/portal/parts
I had the Type (sample type is 2904) and Serial (sample serial is R83D12G) in my file. I would put them in, click submit, loop through the web table, write that data out to a csv and add the original serial to each row. I just need to know if this is possible so I can work with IT to get this installed.
thanks
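The workflow described above (read Type/Serial pairs from a file, scrape the parts table for each pair, and tag every scraped row with the serial that produced it) can be sketched in plain JavaScript. This only illustrates the data handling and is not Helium Scraper code: `parseSourceCsv`, `tagRows`, `csvField`, `toCsv` and the column layout are hypothetical, and the scraped rows are made-up placeholders.

```javascript
// Hypothetical helpers; names and column layout are assumptions for illustration.

// Parse a simple two-column CSV ("type,serial" per line, no quoted fields).
function parseSourceCsv(text) {
  return text
    .split(/\r?\n/)
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line.length > 0; })
    .map(function (line) {
      var parts = line.split(",");
      return { type: parts[0].trim(), serial: parts[1].trim() };
    });
}

// Append the serial that was searched for to every row scraped for it.
function tagRows(rows, serial) {
  return rows.map(function (row) { return row.concat([serial]); });
}

// Quote a field only when it contains a comma, a double quote, or a newline.
function csvField(value) {
  var s = String(value);
  return /[",\r\n]/.test(s) ? '"' + s.replace(/"/g, '""') + '"' : s;
}

// Join rows into CSV text, one row per line.
function toCsv(rows) {
  return rows
    .map(function (row) { return row.map(csvField).join(","); })
    .join("\n");
}

// Example with made-up scraped rows (real rows would come from the web table).
var source = parseSourceCsv("2904,R83D12G\n");
var scraped = [["PART-A", "Example part, large"], ["PART-B", "Another part"]];
var output = toCsv(tagRows(scraped, source[0].serial));
```

Keeping the serial as the last column means each exported row can be traced back to the search that produced it.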
Re: Looping through web tables
Yes, Helium Scraper can do that.
The way to "read" data from a csv or text file is by simply copying and pasting your data into Helium Scraper's data table editor and saving it. You can also import a whole MDB database.
If you need to automatically fill up the fields you will need a little javascript. I'll be glad to help you out with that if necessary.
Juan Soldi
The Helium Scraper Team
-
- Posts: 20
- Joined: Thu May 12, 2011 4:05 pm
Re: Looping through web tables
Excellent. I downloaded the software and identified the 3 Kinds I'm looking for. I've uploaded my test file. At this point it will only extract data when I fill in the Type and Serial manually. Can you help me with the javascript to use the data from the source table, grab the data, and append it to the partsexport table?
- Attachments
-
- LenovoScraper.zip
- (30.8 KiB) Downloaded 737 times
Last edited by massradius on Thu May 12, 2011 8:38 pm, edited 1 time in total.
-
- Posts: 20
- Joined: Thu May 12, 2011 4:05 pm
Re: Looping through web tables
webmaster wrote:
If you need to automatically fill up the fields you will need a little javascript. I'll be glad to help you out with that if necessary.
Can you recommend a good intro javascript book?
Re: Looping through web tables
Hi,
I'm attaching a project that should fit your needs. If you fill up the "SourceTbl" table and press play, it will extract your data to the "PartsExport" table.
I figure the easiest way to explain to you how the javascript code works in this project is by adding comments everywhere in the code (comments are the green lines of text that start with "//"). You can see all the javascript code by double clicking each "Execute JS" action. There is actually not much code, as you can probably tell.
There are plenty of javascript tutorials online. The only problem with them is that they are usually focused on javascript related to HTML, which we don't really need in Helium Scraper. Here is one I found that is very short and doesn't involve HTML other than in the very beginning. It should get you started. As for javascript applied to Helium Scraper, you can find everything you need to know in the documentation at Actions -> Actions List -> Execute JavaScript.
- Attachments
-
- LenovoScraper2.hsp
- (736.42 KiB) Downloaded 769 times
Juan Soldi
The Helium Scraper Team
-
- Posts: 20
- Joined: Thu May 12, 2011 4:05 pm
Re: Looping through web tables
Nearly perfect. Occasionally I put in a serial longer than 7 characters, for instance:
Type: 4051
Serial: ABU0196460
The page leaves the data from the last search up and shows "Serial must be 7 characters.". Would I just create this error as a kind? Or can I add this error handling as part of the javascript?
Also, if the serial is 7 characters but not in the dB, it returns "Error message: MachineTypeSerialNotFound No information was found matching the search criteria. Please try again." If you can help me with the first one I can probably add in this check. I'm thinking I can do something like compare the entered serial with the serial on the page (defined as a kind) and, if they don't match, skip the export.
Lastly, what will happen if there are no records for a part? The second table will have no data in it. Will this program just export nothing since there are no matching kinds? Sorry for all the questions. Attached is the dB with a few parts to illustrate the examples
Thanks again for the help (I found a trim function).
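The comparison idea above can be sketched as a small function. This is a minimal illustration under stated assumptions, not code from any attached project: `shouldExtract` is a hypothetical name, the 7-character rule comes from the site's error message quoted above, and the function assumes the serial shown on the page has already been read into a string (for example through a kind). It also assumes `String.prototype.trim` is available in the script engine.

```javascript
// Hypothetical helper: decide whether the page currently shows results
// for the serial that was just entered.
function shouldExtract(enteredSerial, pageSerial) {
  var entered = String(enteredSerial).trim();
  var shown = String(pageSerial).trim();

  // The site rejects serials that are not exactly 7 characters long.
  if (entered.length !== 7) return false;

  // If the page still shows a previous search, the serials will differ.
  return entered === shown;
}

// Examples using the serials mentioned in this thread.
var okCase = shouldExtract("R83D12G", " R83D12G ");
var tooLong = shouldExtract("ABU0196460", "R83D12G");
```

With this check, a stale results page left over from the previous search is skipped instead of being exported under the wrong serial.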
- Attachments
-
- LenovoScraper2.zip
- (47.77 KiB) Downloaded 763 times
Re: Looping through web tables
Hi,
The easiest thing to do there is to just let Helium Scraper extract duplicated data and then filter duplicates out with SQL. If your table is called, say, MyTable, this query would return unique results:
Code: Select all
SELECT DISTINCT * FROM [MyTable]
And this other query will create a table called NewTable containing only unique rows taken from the MyTable table:
Code: Select all
SELECT DISTINCT * INTO [NewTable] FROM [MyTable]
If you really don't want to extract duplicated results, you could create a kind that selects the error message, add a "Select Kind" action and underneath this action, add an "Execute JS" with this code:
Code: Select all
if(Node.Counter > 0) return false;
if(Global.Browser.Selection.Count > 0) return false;
else return true;
This will execute the child nodes of the "Execute JS" action if and only if no error is found. So you would place your "Extract" action inside this "Execute JS" action.
Let me know if you need any further help. BTW, I've posted a couple of tutorials to our blog that you might find useful.
Juan Soldi
The Helium Scraper Team
-
- Posts: 20
- Joined: Thu May 12, 2011 4:05 pm
Re: Looping through web tables
Thanks Juan. My big issue isn't the dupes. The problem is that I don't want to associate the wrong parts with the wrong serial (since it would pick up the parts from the previous serial, and I would then think those components belonged to the new one). I just need a way to search for error messages and then skip the extract step in those cases.
Re: Looping through web tables
Hi,
In that case, the second solution will do. I've attached a project that implements it. The "test" actions tree will execute its child nodes if no error is found. I've added 3 kinds that will try to be selected. If any of them is found, the child nodes won't be executed. Look at the code in the "Execute JS (Check For Errors)". If you would like to check for another error, just create a kind, say, called "SomeErrorKind" and add this code right before the "return true;" line at the bottom:
Code: Select all
Global.Browser.SelectKind("SomeErrorKind");
if(Global.Browser.Selection.Count > 0) return false;
Inside the "Execute JS (No error)" action, there is a line of code that if you uncomment (remove the "//" at the beginning), you will get a message box every time no error is found. You can use this for testing purposes.
Let me know how it goes.
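Since the check above repeats the same two lines for every error kind, it could also be written as a loop. The sketch below is an assumption about how one might structure it, not the code in the attached project. It uses only `SelectKind` and `Selection.Count` as they appear above, takes the browser object as a parameter so it can be tried outside Helium Scraper, and the names `noErrorFound`, `makeFakeBrowser` and the two kind names are hypothetical.

```javascript
// Returns true only when none of the listed error kinds selects anything.
// `browser` is expected to behave like Global.Browser in the snippet above.
function noErrorFound(browser, errorKinds) {
  for (var i = 0; i < errorKinds.length; i++) {
    browser.SelectKind(errorKinds[i]);
    if (browser.Selection.Count > 0) return false;
  }
  return true;
}

// Stand-in object for trying the function outside Helium Scraper.
// It pretends that only the kinds in `kindsOnPage` match something.
function makeFakeBrowser(kindsOnPage) {
  var fake = {
    Selection: { Count: 0 },
    SelectKind: function (name) {
      fake.Selection.Count = kindsOnPage.indexOf(name) >= 0 ? 1 : 0;
    }
  };
  return fake;
}

var kinds = ["SerialLengthError", "SerialNotFoundError"]; // hypothetical kind names
var cleanPage = noErrorFound(makeFakeBrowser([]), kinds);
var errorPage = noErrorFound(makeFakeBrowser(["SerialNotFoundError"]), kinds);
```

Inside an "Execute JS" action the call would presumably be made with `Global.Browser` as the first argument, with the result returned in place of the repeated checks; whether the action's script engine accepts helper functions written this way is untested here.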
- Attachments
-
- LenovoScraper3.hsp
- (1.18 MiB) Downloaded 807 times
Juan Soldi
The Helium Scraper Team
-
- Posts: 20
- Joined: Thu May 12, 2011 4:05 pm
Re: Looping through web tables
That's working great. When searching for errors, is there a way to check the HTML itself instead of creating a kind? I could look for "must be" rather than creating two different kinds. This should be my last question. Thanks a lot for the help.
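The text-matching idea could look roughly like the sketch below. It only shows the string check: how the page text or HTML is obtained inside Helium Scraper is not covered in this thread, so the function simply takes it as a string. `pageHasError` and `ERROR_PHRASES` are hypothetical names, and the phrases are taken from the two error messages quoted earlier.

```javascript
// Phrases taken from the two error messages quoted earlier in the thread.
var ERROR_PHRASES = ["must be", "No information was found"];

// Returns true when the page text contains any of the given phrases
// (case-insensitive substring match).
function pageHasError(pageText, phrases) {
  var text = String(pageText).toLowerCase();
  for (var i = 0; i < phrases.length; i++) {
    if (text.indexOf(phrases[i].toLowerCase()) !== -1) return true;
  }
  return false;
}

var lengthError = pageHasError("Serial must be 7 characters.", ERROR_PHRASES);
var normalPage = pageHasError("Parts list for machine type 2904", ERROR_PHRASES);
```

A short phrase such as "must be" may also occur in ordinary page text, so a longer phrase is less likely to give a false match.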