This extractor allows you to automatically retrieve email contents and/or attachments via the IMAP protocol using basic authentication. It supports incremental loads and IMAP queries to define specific criteria.
The extractor provides the following features:
Feature | Note |
---|---|
Generic UI form | Dynamic UI form adapting to various configurations. |
Row-based configuration | Execute each row in parallel. |
Incremental loading | Fetch new data in increments. |
IMAP query syntax | Filter emails using standardized IMAP query syntax. |
Download email contents | Download the full email body into the Storage column. |
Download email attachments | All attachments are downloaded by default into File Storage. |
Filter email attachments | Download only attachments that match a specified regular expression. |
Processors support | Use processors to modify outputs before saving to Storage; e.g., process attachments to be stored in Tabular Storage. |
Enable IMAP service on your email account. You will need your IMAP credentials (name, password), as well as the hostname and port of your IMAP server. Check with your email provider if you need more details.
Note: The app fetches emails from the root INBOX folder. If you use labels or filters in email providers (e.g., Gmail) that move messages to a different folder, set the imap_folder configuration parameter.
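Before configuring the extractor, it can help to confirm that the credentials, hostname, port, and folder work. The following is a minimal sketch using Python's standard `imaplib`, not the extractor's own code; the function name `count_messages` and the `client_factory` parameter are hypothetical and exist only to make the example testable without a live server.

```python
import imaplib


def count_messages(host, port, username, password,
                   folder="INBOX", client_factory=imaplib.IMAP4_SSL):
    """Log in over IMAP and return the number of messages in `folder`.

    `client_factory` is a hypothetical hook for this sketch; by default it
    opens a TLS connection with imaplib.IMAP4_SSL.
    """
    client = client_factory(host, port)
    try:
        client.login(username, password)
        # readonly=True avoids changing any flags while checking the folder.
        status, data = client.select(folder, readonly=True)
        if status != "OK":
            raise RuntimeError(f"Cannot open folder {folder!r}: {data!r}")
        # On success the server returns the message count as bytes.
        return int(data[0])
    finally:
        client.logout()
```

For Gmail this would be called with `imap.gmail.com` and port `993`. If the call fails on a folder other than INBOX, the folder name is the first thing to check.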
Keboola extractor.
Enter your IMAP credentials in the Username field and the Password field.
In the IMAP host field, enter the Gmail IMAP address: imap.gmail.com.
Set the port to 993.
Fill in the Username, Password, Hostname, and Port of your provider’s IMAP server. See the Gmail example for guidance.
Click Add Row and name the row accordingly.
Enter a Search query to filter the emails you want. By default, all emails are downloaded. A common use case is to filter by Subject and Sender, e.g., (FROM "sender-email@example.com" SUBJECT "the subject"). More complex queries are also supported; refer to the query syntax for examples.
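To illustrate how such a query string is put together, here is a small sketch that assembles IMAP search criteria from optional parts. The helper names `quote` and `build_search_query` are hypothetical and not part of the extractor; the resulting string is what you would paste into the Search query field.

```python
def quote(value):
    """Wrap a value in an IMAP quoted string, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_search_query(sender=None, subject=None, unseen_only=False):
    """Combine search keys; IMAP treats keys listed together as a logical AND."""
    parts = []
    if sender:
        parts += ["FROM", quote(sender)]
    if subject:
        parts += ["SUBJECT", quote(subject)]
    if unseen_only:
        parts.append("UNSEEN")
    if not parts:
        # ALL is the IMAP search key that matches every message.
        return "ALL"
    return "(" + " ".join(parts) + ")"


print(build_search_query("sender-email@example.com", "the subject"))
```

Calling the helper with only `unseen_only=True` yields `(UNSEEN)`, which restricts the search to messages not yet marked as seen.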
Specify the folder to fetch emails from. Defaults to the root folder INBOX. For example, in Gmail, a label can function as a folder.
When selected, emails that have been extracted will be marked as “seen” in the inbox.
Use this field to filter emails received since a specific date. The field supports fixed dates in the format YYYY-MM-DD as well as relative dates like yesterday, 1 month ago, 2 days ago, etc. To avoid missing data, set this to cover a buffer period, e.g., 2 days ago when running daily. The data is always incrementally upserted, so duplicates won’t appear in the resulting table.
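The effect of a relative date can be sketched as follows. This is an illustration only, assuming a simplified parser: it handles fixed dates, "yesterday", and "N days ago" or "N weeks ago", but not month-based values such as "1 month ago", and it is not how the extractor itself parses the field. The names `resolve_since` and `imap_since` are hypothetical.

```python
import re
from datetime import date, timedelta

# English month abbreviations, spelled out so the result does not depend on locale.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def resolve_since(value, today):
    """Turn a fixed or relative date string into a concrete date."""
    value = value.strip().lower()
    if value == "yesterday":
        return today - timedelta(days=1)
    match = re.fullmatch(r"(\d+) (day|week)s? ago", value)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return today - timedelta(days=amount * (7 if unit == "week" else 1))
    # Anything else is expected to be a fixed date in YYYY-MM-DD format.
    return date.fromisoformat(value)


def imap_since(day):
    """Format a date as an IMAP SINCE search key (DD-Mon-YYYY)."""
    return f"SINCE {day.day:02d}-{MONTHS[day.month - 1]}-{day.year}"


run_day = date(2024, 3, 10)
print(imap_since(resolve_since("2 days ago", run_day)))
```

With a daily schedule, "2 days ago" makes consecutive runs overlap by one day, so a message that arrives while a run is in progress is picked up by the next run.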
Select this option to download the email content.
When enabled, attachments are also downloaded. You may use a regex pattern to filter for attachments that match your definition. For example, to match only PDF files, use the pattern .+\.pdf. If left empty, all attachments are downloaded by default.
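A pattern can be tried out locally before it goes into the configuration. The sketch below assumes the pattern must match the whole file name and is case-sensitive; whether the extractor applies it in exactly this way is an assumption, and `filter_attachments` is a hypothetical helper.

```python
import re


def filter_attachments(filenames, pattern=""):
    """Return the file names matching `pattern`; an empty pattern keeps all."""
    if not pattern:
        return list(filenames)
    compiled = re.compile(pattern)
    # fullmatch requires the pattern to cover the entire file name (assumption).
    return [name for name in filenames if compiled.fullmatch(name)]


names = ["report.pdf", "data.csv", "scan.PDF"]
print(filter_attachments(names, r".+\.pdf"))
```

Under these assumptions `scan.PDF` is skipped because of its upper-case extension; a pattern such as `.+\.(pdf|PDF)` would cover both spellings.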
The files are saved in File Storage by default. Use processors to control the behaviour.
If your attachments are in CSV format, you can use this combination of processors to store them in Table Storage:
Set the folder parameter in the first processor to match the resulting table name.
{
  "before": [],
  "after": [
    {
      "definition": {
        "component": "keboola.processor-move-files"
      },
      "parameters": {
        "direction": "tables",
        "folder": "result_table"
      }
    },
    {
      "definition": {
        "component": "keboola.processor-create-manifest"
      },
      "parameters": {
        "delimiter": ",",
        "enclosure": "\"",
        "incremental": false,
        "primary_key": [],
        "columns_from": "header"
      }
    },
    {
      "definition": {
        "component": "keboola.processor-skip-lines"
      },
      "parameters": {
        "lines": 1
      }
    }
  ]
}
If your attachments are in XLSX format, you can use this combination of processors to store them in Table Storage:
{
  "before": [],
  "after": [
    {
      "definition": {
        "component": "kds-team.processor-xlsx2csv"
      },
      "parameters": {
        "addFileName": true,
        "selectSheets": [],
        "ignoreSheets": []
      }
    },
    {
      "definition": {
        "component": "keboola.processor-move-files"
      },
      "parameters": {
        "direction": "tables"
      }
    }
  ]
}
Use this processor to store attachments in File Storage with custom tags. It adds custom tags to the resulting files and offers additional options to create tags based on the file name.
{
  "before": [],
  "after": [
    {
      "definition": {
        "component": "kds-team.processor-create-file-manifest"
      },
      "parameters": {
        "tags": [
          "SOME_TAG"
        ],
        "is_permanent": false,
        "tag_functions": []
      }
    }
  ]
}
A single table named emails contains the email contents.
Results are inserted incrementally to avoid duplicates.
Columns: 'pk', 'uid', 'mail_box', 'date', 'from', 'to', 'body', 'headers', 'number_of_attachments', 'size'
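The deduplicating effect of incremental loading can be pictured with a small sketch. It assumes that rows are keyed by the `pk` column and that a newly loaded row replaces an existing row with the same key; the `upsert` helper and the sample rows are hypothetical and only model the behaviour, they are not Storage code.

```python
def upsert(existing_rows, new_rows):
    """Merge rows by primary key; a new row replaces an old one with the same pk."""
    by_pk = {row["pk"]: row for row in existing_rows}
    for row in new_rows:
        by_pk[row["pk"]] = row
    return list(by_pk.values())


# Two overlapping runs: the message with pk "b" is fetched both times.
first_run = [{"pk": "a", "uid": "1"}, {"pk": "b", "uid": "2"}]
second_run = [{"pk": "b", "uid": "2"}, {"pk": "c", "uid": "3"}]

table = upsert(first_run, second_run)
print([row["pk"] for row in table])
```

This is why a date filter that overlaps previous runs is safe: re-fetched messages overwrite their earlier copies instead of adding new rows.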
Attachments are stored by default in File Storage, with filenames prefixed by the generated message primary key, e.g., bb41793268d4a8710fb5ebd94eaed6bc_some_file.pdf.
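When processing the stored files downstream, the prefix can be used to link an attachment back to its row in the emails table. The sketch below assumes the key is always 32 lowercase hexadecimal characters followed by an underscore, as in the example above; the helper name `split_attachment_name` is hypothetical.

```python
import re

# Assumption: the prefix is a 32-character lowercase hex string, as in the example.
PREFIXED_NAME = re.compile(r"([0-9a-f]{32})_(.+)")


def split_attachment_name(stored_name):
    """Split a stored file name into (message pk, original attachment name)."""
    match = PREFIXED_NAME.fullmatch(stored_name)
    if match is None:
        raise ValueError(f"Unexpected attachment name: {stored_name!r}")
    return match.group(1), match.group(2)


pk, original = split_attachment_name("bb41793268d4a8710fb5ebd94eaed6bc_some_file.pdf")
print(pk, original)
```

The returned key can then be joined against the `pk` column of the emails table.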
Files include tags to distinguish their source:
Additional tags can be specified with the Create File Manifest processor. Attachments can also be further processed and stored in Table Storage using other processors.