Provides data input and output functionality.
Creates functions that receive a URL, or a raw HTTP request without a body, and perform a web request. The response is then parsed as a JSON string and a structure having a particular JSON schema is returned by the function.
These functions are created with the 'Ajax.' prefix.
Creates functions that parse a JSON string and produce a structure matching the provided JSON schema.
These functions are created with the 'Parse.' prefix.
Helium Scraper reads JSON data in a non-strict way, for both Ajax and JSON Parsers. Here are a few points to take into consideration:
- If the JSON Schema represents a string, and the JSON has any other type, a string representation of the object is returned.
- If the JSON Schema represents a number or a boolean, and the JSON has any other type, 0 or
- If the JSON Schema represents an object with a single property called value, and the JSON has an atomic type, this value will be used as the value of the value property. This is useful when extracting arrays of atomic values directly into the database, since arrays of atomic values are not extracted into a separate table and only the first value would be extracted unless the Schema represents an array of objects.
- If the JSON Schema represents an array, and the JSON has a non-array type, then a single item list containing this value will be returned.
- If the JSON Schema represents an array, and the JSON represents null, an empty list will be returned.
Since there is no null value in Helium Scraper, nullable types can be represented as arrays containing either no items or a single item.
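The non-strict rules above can be sketched in Python as follows. This is only an illustration, not Helium Scraper's actual implementation: the function name coerce, the simplified schema dictionaries, and the use of json.dumps as the "string representation" are all assumptions. The fallback for numbers and booleans is not modelled.

```python
import json


def coerce(value, schema):
    """Coerce a decoded JSON value to a simplified schema (hypothetical format)."""
    kind = schema["type"]

    if kind == "array":
        # null becomes an empty list; a non-array value becomes a one-item list.
        if value is None:
            return []
        items = value if isinstance(value, list) else [value]
        return [coerce(item, schema["items"]) for item in items]

    if kind == "object":
        props = schema["properties"]
        is_atomic = not isinstance(value, (dict, list))
        # An atomic value is wrapped when the schema has a single "value" property.
        if is_atomic and list(props) == ["value"]:
            return {"value": coerce(value, props["value"])}
        if isinstance(value, dict):
            return {k: coerce(value[k], s) for k, s in props.items() if k in value}
        return value  # case not covered by the rules above; left unchanged here

    if kind == "string":
        # Assumption: json.dumps stands in for the "string representation".
        return value if isinstance(value, str) else json.dumps(value)

    return value  # number/boolean fallback is not modelled in this sketch


string_list = {"type": "array", "items": {"type": "string"}}
value_objects = {
    "type": "array",
    "items": {"type": "object", "properties": {"value": {"type": "number"}}},
}

print(coerce(42, {"type": "string"}))   # a number read as a string
print(coerce(None, string_list))        # null read as an array
print(coerce("a", string_list))         # a single value read as an array
print(coerce([1, 2], value_objects))    # atomic values wrapped in objects
```

The last call shows why the single value property is convenient: each atomic item becomes its own object, so an array of numbers can be treated as an array of objects.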
JSON Schema Inference
JSON Schema can be inferred given any sample JSON. To do this, click the JSON Schema Inference button in the Ajax or JSON Parser editor to show the inference panel. Then, either paste the JSON into the JSON editor on the left, or enter a URL or a raw HTTP request, without a body, into the URL text box, and then press the Download JSON button to download the JSON and populate the JSON editor. Finally, press the Infer JSON Schema button and the schema editor will be filled up with the inferred schema. The following Inference Settings are available:
- Wrap Objects: When true, the inferred JSON schema of every object will be an array of objects, rather than an object, when the object is not already a member of an array. This, in turn, causes all objects to be parsed as sequences of objects, which are more convenient to work with than objects.
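To make the effect of Wrap Objects concrete, here is a minimal Python sketch of schema inference. It is not the inference algorithm Helium Scraper uses: the function infer_schema, its parameters, and the simplification of inferring array items from the first element only are assumptions made for illustration.

```python
def infer_schema(value, wrap_objects=False, in_array=False):
    """Infer a simplified JSON schema from a decoded JSON sample (hypothetical)."""
    if isinstance(value, dict):
        schema = {
            "type": "object",
            "properties": {
                key: infer_schema(item, wrap_objects) for key, item in value.items()
            },
        }
        # Wrap Objects: an object that is not already an array member
        # is described as an array of objects instead.
        if wrap_objects and not in_array:
            return {"type": "array", "items": schema}
        return schema
    if isinstance(value, list):
        # Simplification: only the first item is inspected.
        items = infer_schema(value[0], wrap_objects, in_array=True) if value else {}
        return {"type": "array", "items": items}
    if isinstance(value, bool):  # checked before numbers, since bool is an int in Python
        return {"type": "boolean"}
    if isinstance(value, (int, float)):
        return {"type": "number"}
    if value is None:
        return {"type": "null"}
    return {"type": "string"}


sample = {"user": {"name": "Ann"}, "tags": [{"id": 1}]}
plain = infer_schema(sample)
wrapped = infer_schema(sample, wrap_objects=True)
print(plain["properties"]["user"]["type"])              # object
print(wrapped["type"])                                  # array
print(wrapped["items"]["properties"]["user"]["type"])   # array
```

With wrapping enabled, both the root object and the nested user object become arrays of objects, while the objects inside tags are left alone because they are already members of an array.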
Creates values or functions that output the results of a given SQL query. If the query has no parameters, the result is a value. If it does, the result is a function that takes the selected parameters. To add parameters, select the Parameters button on the query editor and enter one or more parameters. To add them to the query, prefix their names with the '@' symbol.
Queries on table sets can be quickly created by right clicking a table set and selecting Create Query.
These functions are created with the 'Query.' prefix.
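The value-versus-function behaviour can be illustrated outside Helium Scraper with a short Python sketch against an in-memory SQLite database. The helper make_query, the products table, and the max_price parameter are hypothetical. SQLite accepts '@'-prefixed named parameters, and the sketch relies on Python's sqlite3 module binding them from a dictionary keyed by the name without its prefix.

```python
import sqlite3


def make_query(conn, sql, parameters=()):
    """Return rows if the query has no parameters, otherwise a function of them."""

    def run(*args):
        if len(args) != len(parameters):
            raise TypeError(f"expected {len(parameters)} argument(s)")
        # Bind each '@name' placeholder from the matching positional argument.
        return conn.execute(sql, dict(zip(parameters, args))).fetchall()

    return run if parameters else run()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)", [("pen", 1.5), ("book", 12.0)]
)

# No parameters: the result is a value.
all_names = make_query(conn, "SELECT name FROM products ORDER BY name")

# One parameter: the result is a function taking that parameter.
cheaper_than = make_query(
    conn, "SELECT name FROM products WHERE price < @max_price", ["max_price"]
)

print(all_names)
print(cheaper_than(5))
```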
This is an easy way to quickly create a function that extracts and transforms text from HTML elements.
To get started, right click the Text category and select Create. If any elements were selected on the main browser, their text will be shown in a table. Otherwise a default text will be shown. New sample text can be added to this table by typing in the last row. Then, click the Add Step button and select one or more steps until the desired output is produced. Each of the following kinds of steps transforms the input text in a different way:
- Slice: Outputs a section of the input text. The source text is split into sections by a given delimiter, and the section at the zero-based Slice Position is used as the output.
- Replace: Replaces every occurrence of a string with another string. If regular expressions are used, the replacement text can include the $N placeholder to output regular expression matches, where N is the zero-based index of the capturing group.
- Regular Expression: Runs a regular expression on the input text and outputs the match at Match Position.
- Date Time Conversion: Transforms a date-time string using input and output .NET standard date-time formats. Optionally, input and output languages can be specified using a language code such as en-US.
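The steps listed above behave roughly like the following Python functions. This is a sketch for intuition only: the function names are made up, Match Position is assumed to be zero-based like Slice Position, and the date step uses Python's strptime format codes as a stand-in for .NET date-time formats (language codes are not handled).

```python
import re
from datetime import datetime


def slice_step(text, delimiter, position):
    """Split by the delimiter and keep the section at the zero-based position."""
    return text.split(delimiter)[position]


def replace_step(text, old, new, use_regex=False):
    """Replace every occurrence; with regex, $N in `new` expands to group N."""
    if not use_regex:
        return text.replace(old, new)

    def expand(match):
        # $0 is the whole match, $1 the first capturing group, and so on.
        return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), new)

    return re.sub(old, expand, text)


def regex_step(text, pattern, match_position=0):
    """Return the match found at the given position (assumed zero-based)."""
    return [m.group(0) for m in re.finditer(pattern, text)][match_position]


def date_step(text, input_format, output_format):
    """Re-format a date string (Python format codes, not .NET ones)."""
    return datetime.strptime(text, input_format).strftime(output_format)


raw = "Price: USD 1,299.00 (incl. tax)"
after_slice = slice_step(raw, ": ", 1)                # "USD 1,299.00 (incl. tax)"
after_regex = regex_step(after_slice, r"[\d,.]+", 0)  # "1,299.00"
price = replace_step(after_regex, ",", "")            # "1299.00"
swapped = replace_step("John Smith", r"(\w+) (\w+)", "$2, $1", use_regex=True)
date = date_step("2020-01-31", "%Y-%m-%d", "%d/%m/%Y")
print(price, swapped, date)
```

Chaining the steps mirrors how they are stacked in the editor: each step receives the output of the previous one.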
After the desired output has been produced, press the Save button and the function will be accessible from any global using the 'Text.' prefix. If the function is added below a selector, just like with any gathering action, the function will be applied to the elements selected by the selector above. The following example uses a text transformation function called MyTextFunction to extract text from the elements selected by the