Provides data input and output functionality.
Creates functions that receive a URL and perform a web request. The response is then parsed as a JSON string and a structure having a particular JSON schema is returned by the function.
These functions are created with the 'Ajax.' prefix.
Creates functions that parse a JSON string and produce a structure matching the provided JSON schema.
These functions are created with the 'Parse.' prefix.
Helium Scraper reads JSON data in a non-strict way, for both Ajax and JSON Parsers. Here are a few points to take into consideration:
- If the JSON Schema represents a string, and the JSON is any valid JSON value other than a string, no error will be thrown and a string representation of the value will be returned.
- If the JSON Schema represents an object with a single property called 'value', and the JSON is an atomic value, this value will be used as the value of the 'value' property. This is useful when extracting arrays of atomic values directly into the database, since arrays of atomic values are not extracted into a separate table and only the first value would be extracted unless the Schema represents an array of objects.
- If the JSON Schema represents an array, and the JSON is a value of a non-array type, then a single-item list containing this value will be returned.
- If the JSON Schema represents an array, and the JSON represents null, an empty list will be returned.
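The four rules above can be illustrated with a short Python sketch. This is not Helium Scraper's implementation: the function name read_lenient, the simplified schema handling, and the use of json.dumps as the "string representation" are all assumptions made for illustration.

```python
import json

def read_lenient(value, schema):
    """Coerce an already-parsed JSON value to fit a simplified schema (hypothetical helper)."""
    kind = schema.get("type")
    if kind == "string":
        # A non-string value is returned as a string representation
        # (json.dumps is an assumption about what that representation looks like).
        return value if isinstance(value, str) else json.dumps(value)
    if kind == "array":
        if value is None:
            return []  # null becomes an empty list
        # A lone non-array value becomes a single-item list.
        items = value if isinstance(value, list) else [value]
        return [read_lenient(item, schema.get("items", {})) for item in items]
    if kind == "object":
        props = schema.get("properties", {})
        if list(props) == ["value"] and not isinstance(value, (dict, list)):
            # An atomic value fills the single 'value' property.
            return {"value": read_lenient(value, props["value"])}
        if isinstance(value, dict):
            return {k: read_lenient(value[k], s) for k, s in props.items() if k in value}
    return value

# An array-of-objects schema lets an array of atomic values become one row per item.
tags_schema = {
    "type": "array",
    "items": {"type": "object", "properties": {"value": {"type": "string"}}},
}
rows = read_lenient(json.loads('["red", "green", 3]'), tags_schema)
```

In this sketch, each atomic item ends up wrapped in an object with a 'value' property, which is the shape the second rule describes for extracting arrays of atomic values.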
Since there is no null value in Helium Scraper, nullable types can be represented as arrays containing either no items or a single item. To do this, add the "optionals": "list" property to the top level object of the JSON Schema. This property causes non-array object properties that are not marked as required to be converted to lists. This prevents errors in case their value is null or undefined.
A JSON Schema can be inferred from any sample JSON. To do this, click the JSON Schema Inference button on the Ajax or JSON parser editor to show the inference panel. Then, either paste the JSON into the JSON editor on the left, or enter a URL and press the Download JSON button to download the JSON and fill the JSON editor. Then press the Infer JSON Schema button and the schema editor will be filled with the inferred schema.
When the JSON Schema is inferred, the "optionals": "list" property is always included.
Creates values or functions that output the results of a given SQL query. If the query has no parameters, the result is a value. If it does, the result is a function that takes the selected parameters. To add parameters, click the Parameters button on the query editor and enter one or more parameters. To use them in the query, prefix their names with the '@' symbol.
Queries on table sets can be quickly created by right-clicking a table set and selecting Create Query.
These functions are created with the 'Query.' prefix.
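The value-versus-function distinction can be sketched in Python with the standard sqlite3 module. This is only an analogy, not how Helium Scraper runs queries: the make_query helper, the products table, and its columns are hypothetical. SQLite accepts '@name' parameters, and Python's sqlite3 binds them from a dictionary keyed by the name without the prefix.

```python
import sqlite3

def make_query(conn, sql, params=()):
    """Return the rows directly if the query has no parameters,
    otherwise return a function that takes them (hypothetical helper)."""
    def run(*args):
        bound = dict(zip(params, args))  # e.g. {"max_price": 5}
        return conn.execute(sql, bound).fetchall()
    return run() if not params else run

# Hypothetical sample data, standing in for an extracted table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("pen", 1.5), ("book", 12.0)],
)

# No parameters: the result is a value.
all_names = make_query(conn, "SELECT name FROM products ORDER BY name")

# One parameter, referenced with the '@' prefix: the result is a function.
cheaper_than = make_query(
    conn,
    "SELECT name FROM products WHERE price < @max_price",
    ["max_price"],
)
cheap = cheaper_than(5)
```

Here all_names holds the rows immediately, while cheaper_than must be called with a value for max_price before it returns anything.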
This is an easy way to quickly create a function that extracts and transforms text from HTML elements.
To get started, right-click the Text category and select Create. If any elements were selected in the main browser, their text will be shown in a table. Otherwise, a default text will be shown. New sample text can be added to this table by typing in the last row. Then, click the Add Step button and select one or more steps until the desired output is produced. Each of the following three kinds of steps transforms the input text in a different way:
- Slice: Outputs a section of the input text. The source text is split into sections by a given delimiter, and the section at the zero-based Slice Position is used as the output.
- Replace: Replaces every occurrence of a string with another string. If regular expressions are used, the replacement text can include the $N placeholder to output regular expression matches, where N is a zero-based index representing the index of the capturing group.
- Regular Expression: Runs a regular expression on the input text and outputs the match at Match Position.
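The three kinds of steps can be approximated in Python as follows. This is a rough sketch rather than Helium Scraper's own code: the function names and sample text are hypothetical, Match Position is assumed to be zero-based, and $0 is assumed to stand for the whole match, as in common regular expression replacement syntax.

```python
import re

def slice_step(text, delimiter, position):
    # Split by the delimiter and keep the section at the zero-based position.
    return text.split(delimiter)[position]

def replace_step(text, find, replacement, use_regex=False):
    if not use_regex:
        return text.replace(find, replacement)
    # Translate $N placeholders into Python's \g<N> group references.
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    return re.sub(find, py_replacement, text)

def regex_step(text, pattern, position):
    # Collect every match and keep the one at the (assumed zero-based) position.
    matches = [m.group(0) for m in re.finditer(pattern, text)]
    return matches[position]

# Chain the steps the way they would be stacked in the editor.
sample = "Price: $1,299.00 (USD)"
step1 = slice_step(sample, ": ", 1)      # "$1,299.00 (USD)"
step2 = regex_step(step1, r"[\d,.]+", 0) # "1,299.00"
step3 = replace_step(step2, ",", "")     # "1299.00"

# A regular expression replacement that reorders capturing groups.
swapped = replace_step("2024-05-17", r"(\d+)-(\d+)-(\d+)", "$3/$2/$1", use_regex=True)
```

Each step takes the previous step's output as its input, so the order of the steps determines the final result.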
After the desired output has been produced, press the Save button and the function will be accessible from any global using the 'Text.' prefix. If the function is added below a selector, just like with any gathering action, the function will be applied to the elements selected by the selector above. The following example uses a text transformation function called MyTextFunction to extract text from the elements selected by the