This extractor loads a single CSV file from an HTTP/HTTPS URL and stores it in Storage.
Find the HTTP extractor in the list of extractors, create a new configuration, and give it a name.
A base URL is the prefix (for example, https://mydomain.com) for all CSV files downloaded from a given website.
The base URL can also contain a folder specification if the same folder is used for all files downloaded using this base URL.
To create a new table, click the New Table button and assign a name. The name is used to create the destination table name in Storage and can be modified.
The configuration can extract as many tables as you wish. The list is fully searchable, and you can delete or disable each table. In addition, you can explicitly run an extraction of only one table. The extraction order of the tables can be changed.
Each table has its own settings (path, load type, etc.), but all tables share the same base URL.
For each table you have to specify a path that leads to a single CSV file or to an archive (GZ and ZIP are supported), which will be imported into a single table in Storage.
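The relationship between the shared base URL and the per-table paths can be sketched as follows. This is only an illustration: the helper `build_file_url`, the table names, and the paths are hypothetical, and the sketch assumes the path is simply appended to the base URL with a single slash, which may differ from what the extractor does internally.

```python
def build_file_url(base_url: str, path: str) -> str:
    """Join a shared base URL and a per-table path with exactly one slash.

    Assumption: plain concatenation; the extractor's real joining rules
    are not described in this document.
    """
    return base_url.rstrip("/") + "/" + path.lstrip("/")


# Hypothetical configuration: one base URL, one path per table.
BASE_URL = "https://mydomain.com"
TABLE_PATHS = {
    "orders": "exports/orders.csv",
    "customers": "exports/customers.csv.gz",  # archives are also allowed
}

# Each table resolves to its own file under the same base URL.
urls = {name: build_file_url(BASE_URL, path) for name, path in TABLE_PATHS.items()}

for name, url in urls.items():
    print(f"{name}: {url}")
```

Keeping the folder in the base URL shortens every path, while keeping it in the path lets tables in the same configuration point at different folders.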
There are three options for determining column names:
col_2 and so on.
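The naming pattern mentioned above, where columns receive sequential names, can be illustrated with a short sketch. It assumes the names follow the form `col_1`, `col_2`, and so on, with numbering starting at 1; only `col_2` appears in the text, so the starting index is an assumption. The function name `generate_header` and the sample data are hypothetical.

```python
import csv
import io


def generate_header(column_count: int, prefix: str = "col_") -> list[str]:
    """Return sequential column names: col_1, col_2, ... (assumed to start at 1)."""
    return [f"{prefix}{i}" for i in range(1, column_count + 1)]


# Hypothetical CSV content without a header row.
sample = "1,Alice,Prague\n2,Bob,Brno\n"

rows = list(csv.reader(io.StringIO(sample)))
header = generate_header(len(rows[0]))

# Pair each generated name with the value in the first row.
first_record = dict(zip(header, rows[0]))
print(header)
print(first_record)
```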
Primary Key specifies the primary key of the table in Storage. Combined with Incremental Load and New Files Only, it lets you create a configuration that incrementally loads all new files into a table in Storage.
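Why a primary key matters for incremental loading can be shown with a small in-memory sketch. It assumes that an incremental load updates rows whose primary key already exists and appends the rest; this is an illustration of that idea, not the Storage implementation. The function `incremental_load` and the sample rows are hypothetical.

```python
def incremental_load(existing: list[dict], incoming: list[dict], primary_key: list[str]) -> list[dict]:
    """Merge incoming rows into existing rows, keyed by the primary key columns.

    Assumption: rows with a matching key are overwritten, new keys are appended.
    """
    merged = {tuple(row[col] for col in primary_key): row for row in existing}
    for row in incoming:
        merged[tuple(row[col] for col in primary_key)] = row
    return list(merged.values())


# Rows already in the destination table (hypothetical data).
table = [
    {"id": "1", "status": "new"},
    {"id": "2", "status": "new"},
]

# Rows from a newly downloaded file.
new_file = [
    {"id": "2", "status": "shipped"},  # same key: replaces the old row
    {"id": "3", "status": "new"},      # new key: appended
]

table = incremental_load(table, new_file, primary_key=["id"])
print(table)
```

Without a primary key there is nothing to match rows on, so in this sketch every incoming row would have to be treated as new.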
For more features, switch the configuration of each table to the Power User Mode by clicking the Open JSON editor link. By editing the full JSON configuration, you can set up the component (all options are described in the GitHub repository) as well as the processors (to learn more about processors, see the Developers Docs).
Changing the JSON configuration may render the visual form unable to represent the configuration, in which case switching back may be disabled. Reverting such changes will re-enable the visual form. Whenever possible, however, the JSON will translate back to the visual form and vice versa.