Extract Superscript And Subscript Text

Questions and Answers about programming Helium Scraper.

Post by edvukass » Mon May 20, 2013 12:45 pm

Hi, I'm trying to extract text that contains HTML superscript or subscript tags, such as "<sup>2</sup>".
Whenever the text contains a superscript ("<sup>2</sup>") or subscript ("<sub>3</sub>") tag, one of two things happens: either the tagged part is not selected at all and the line is split into two lines, or it is extracted as normal text rather than as a superscript or subscript index.
I have the project file, which is too big to upload here, but I can send it by email.
Is there a way to select superscripts and subscripts correctly using a Boolean text gatherer? Can anyone send me an example?

Sample:

I need the two lines below to be extracted into two separate rows in the database. Thank you.

<td>
15g (40cm<sup>2</sup>)=£7.52.<br>
50g (130cm<sup>2</sup>)=£24.89
</td>
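
To make the desired result concrete, here is a minimal sketch in plain Python of what I'm after. It does not use Helium Scraper's own gatherer API. It assumes the cell's raw HTML is already available as a string, and the function name `split_rows` and the digit-to-Unicode mapping are only illustrative. It splits the cell on `<br>` and turns the contents of `<sup>` and `<sub>` into Unicode superscript and subscript characters, so each line stays whole and the index survives in plain text.

```python
import re

# Hypothetical mapping: digits and +/- to Unicode superscript/subscript forms.
SUP = str.maketrans("0123456789+-", "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻")
SUB = str.maketrans("0123456789+-", "₀₁₂₃₄₅₆₇₈₉₊₋")


def split_rows(cell_html):
    """Return one plain-text string per <br>-separated line of a table cell."""
    # Drop the enclosing <td> tags, if present.
    inner = re.sub(r"</?td[^>]*>", "", cell_html, flags=re.I)
    rows = []
    for part in re.split(r"<br\s*/?>", inner, flags=re.I):
        # Replace <sup>…</sup> and <sub>…</sub> with Unicode equivalents.
        part = re.sub(r"<sup>(.*?)</sup>",
                      lambda m: m.group(1).translate(SUP),
                      part, flags=re.I | re.S)
        part = re.sub(r"<sub>(.*?)</sub>",
                      lambda m: m.group(1).translate(SUB),
                      part, flags=re.I | re.S)
        part = re.sub(r"<[^>]+>", "", part)  # strip any remaining tags
        part = " ".join(part.split())        # collapse whitespace and newlines
        if part:
            rows.append(part)
    return rows


sample = """<td>
15g (40cm<sup>2</sup>)=£7.52.<br>
50g (130cm<sup>2</sup>)=£24.89
</td>"""

rows = split_rows(sample)
for row in rows:
    print(row)
```

For the sample cell this gives two rows, "15g (40cm²)=£7.52." and "50g (130cm²)=£24.89". Regular expressions are only adequate here because the markup is this simple; nested or malformed HTML would need a real parser.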
