This data source connector automatically retrieves email contents and/or attachments via the IMAP protocol. It supports incremental loads and IMAP search queries for filtering emails by specific criteria.
The connector provides the following features:
Feature | Note |
---|---|
Generic UI form | Dynamic UI form |
Row-based configuration | Allows execution of each row in parallel. |
Incremental loading | Allows fetching data in new increments. |
IMAP query syntax | Filter emails using standard IMAP queries. |
Download email contents | The full body of the email is downloaded into the Storage column. |
Download email attachments | All attachments are downloaded into File Storage by default. |
Filter email attachments | Download only attachments matching a specified regex pattern. |
Processors support | Use processors to modify the outputs before saving them to Storage, e.g. to process attachments so they are stored in the Table Storage. |
Enable the IMAP service on your email account. You will need the IMAP credentials (username, password) and the hostname and port of the IMAP server.
Please refer to your email provider for more information.
Note that the app fetches emails from the root `INBOX` folder. If you use labels and filters (in Gmail, for instance) that move messages to a different folder, please set the `imap_folder` configuration parameter.
Keboola data source connector
Gmail example:
- Fill in the `Username` field and the `Password` field.
- Use `imap.gmail.com` in the IMAP host field.
- Use port `993`.
Fill in the `Username`, `Password`, `Hostname` and `Port` of your provider's IMAP server. See the Gmail example for inspiration.
Click the `Add Row` button and name the row accordingly.
Fill in a `Search query` to select only the emails you want. By default, all emails are downloaded. The most common use case is to filter emails by subject and sender, e.g. `(FROM "sender-email@example.com" SUBJECT "the subject")`. You can create much more complex queries if needed.
Refer to the query syntax for more examples.
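A search query like the one above can be assembled programmatically. The following is a minimal sketch, not part of the connector: the helper name `build_search_query` and its parameters are hypothetical, and it covers only the `FROM` and `SUBJECT` search keys.

```python
from typing import Optional


def build_search_query(sender: Optional[str] = None, subject: Optional[str] = None) -> str:
    """Build an IMAP search query string from an optional sender and subject."""

    def quote(value: str) -> str:
        # Escape backslashes and double quotes so the value stays one quoted string.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    parts = []
    if sender:
        parts.append("FROM " + quote(sender))
    if subject:
        parts.append("SUBJECT " + quote(subject))
    # ALL is the IMAP search key that matches every message.
    return "(" + " ".join(parts) + ")" if parts else "ALL"


print(build_search_query("sender-email@example.com", "the subject"))
# (FROM "sender-email@example.com" SUBJECT "the subject")
```

The resulting string can be pasted into the `Search query` field.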
Folder to fetch the emails from. Defaults to the root folder `INBOX`. In Gmail, for example, a label name corresponds to a folder.
When checked, emails that have been extracted will be marked as seen in the inbox.
Use this field to fetch only emails received since the specified date. It supports fixed dates in the format `YYYY-MM-DD` as well as relative date periods, e.g. `yesterday`, `1 month ago`, `2 days ago`. We recommend setting a value that covers a safety interval, for example `2 days ago` when the configuration is scheduled to run every day. The data is always upserted incrementally, so there won't be any duplicates in the resulting table.
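To illustrate how a relative period turns into a concrete cut-off date, here is a simplified sketch. It is not the connector's own parser: the function `resolve_since` is hypothetical and handles only `today`, `yesterday`, `N days ago`, `N weeks ago` and fixed `YYYY-MM-DD` dates (month-based periods are left out).

```python
import re
from datetime import date, timedelta


def resolve_since(value: str, today: date) -> date:
    """Translate a fixed or relative date expression into a calendar date."""
    text = value.strip().lower()
    if text == "today":
        return today
    if text == "yesterday":
        return today - timedelta(days=1)
    match = re.fullmatch(r"(\d+) (day|week)s? ago", text)
    if match:
        count, unit = int(match.group(1)), match.group(2)
        return today - timedelta(days=count * (7 if unit == "week" else 1))
    # Anything else is treated as a fixed date in YYYY-MM-DD format.
    return date.fromisoformat(value.strip())


run_day = date(2024, 5, 10)
print(resolve_since("2 days ago", run_day))  # 2024-05-08
```

With a daily schedule and `2 days ago`, consecutive runs overlap by one day, so an email that arrives while a run is in progress is still picked up by the next run.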
Check this option to download email content.
When set to true, attachments are downloaded as well. You may use a regex pattern to download only the attachments that match it. For example, to match only PDF files, use the pattern `.+\.pdf`. If left empty, all attachments are downloaded.
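The effect of such a pattern can be checked locally before saving the configuration. The sketch below assumes that the pattern must match the whole file name and that matching is case-sensitive; the connector's exact matching mode is not stated here, and the helper `filter_attachments` is hypothetical.

```python
import re


def filter_attachments(names, pattern=""):
    """Return the attachment names matched by the pattern; an empty pattern keeps all."""
    if not pattern:
        return list(names)
    regex = re.compile(pattern)
    # fullmatch requires the entire file name to match (an assumption of this sketch).
    return [name for name in names if regex.fullmatch(name)]


files = ["report.pdf", "data.csv", "Invoice.PDF", "notes.pdf.txt"]
print(filter_attachments(files, r".+\.pdf"))  # ['report.pdf']
```

Under these assumptions, `Invoice.PDF` is skipped because of its upper-case extension; a pattern such as `.+\.(pdf|PDF)` would include it.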
By default, the files are downloaded into the File Storage. Use processors to control the behaviour.
If your attachments are in CSV format, you can use this combination of processors to store them in the Table Storage. The `folder` parameter of the first processor matches the resulting table name:

    {
      "before": [],
      "after": [
        {
          "definition": {
            "component": "keboola.processor-move-files"
          },
          "parameters": {
            "direction": "tables",
            "folder": "result_table"
          }
        },
        {
          "definition": {
            "component": "keboola.processor-create-manifest"
          },
          "parameters": {
            "delimiter": ",",
            "enclosure": "\"",
            "incremental": false,
            "primary_key": [],
            "columns_from": "header"
          }
        },
        {
          "definition": {
            "component": "keboola.processor-skip-lines"
          },
          "parameters": {
            "lines": 1
          }
        }
      ]
    }
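If you maintain several rows with different target tables, the CSV processor configuration above can be generated instead of edited by hand. This is only a convenience sketch: the function `csv_to_table_processors` is hypothetical, while the component names and parameters are copied from the configuration above.

```python
import json


def csv_to_table_processors(table_name: str, delimiter: str = ",") -> dict:
    """Return the CSV processor configuration with a chosen result table name."""
    return {
        "before": [],
        "after": [
            {
                "definition": {"component": "keboola.processor-move-files"},
                # The folder name becomes the name of the resulting table.
                "parameters": {"direction": "tables", "folder": table_name},
            },
            {
                "definition": {"component": "keboola.processor-create-manifest"},
                "parameters": {
                    "delimiter": delimiter,
                    "enclosure": "\"",
                    "incremental": False,
                    "primary_key": [],
                    "columns_from": "header",
                },
            },
            {
                "definition": {"component": "keboola.processor-skip-lines"},
                "parameters": {"lines": 1},
            },
        ],
    }


print(json.dumps(csv_to_table_processors("invoices"), indent=2))
```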
If your attachments are in XLSX format, you can use this combination of processors to store them in the Table Storage:

    {
      "before": [],
      "after": [
        {
          "definition": {
            "component": "kds-team.processor-xlsx2csv"
          },
          "parameters": {
            "addFileName": true,
            "selectSheets": [],
            "ignoreSheets": []
          }
        },
        {
          "definition": {
            "component": "keboola.processor-move-files"
          },
          "parameters": {
            "direction": "tables"
          }
        }
      ]
    }
Use this processor configuration to store the attachments in the File Storage with a custom set of tags:

    {
      "before": [],
      "after": [
        {
          "definition": {
            "component": "kds-team.processor-create-file-manifest"
          },
          "parameters": {
            "tags": [
              "SOME_TAG"
            ],
            "is_permanent": false,
            "tag_functions": []
          }
        }
      ]
    }
A single table named `emails` containing the email contents.
The results are always inserted incrementally to avoid duplicates.
Columns: ['pk', 'uid', 'mail_box', 'date', 'from', 'to', 'body', 'headers', 'number_of_attachments', 'size']
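The incremental behaviour can be pictured as an upsert keyed on the `pk` column. The sketch below is an illustration only: it assumes `pk` acts as the primary key of the table, and the `upsert` helper and sample rows are made up for the example.

```python
def upsert(table: dict, rows: list) -> dict:
    """Insert new rows and overwrite existing rows that share the same pk."""
    for row in rows:
        table[row["pk"]] = row
    return table


emails = {}
# First run fetches two messages.
upsert(emails, [{"pk": "a1", "from": "x@example.com"},
                {"pk": "b2", "from": "y@example.com"}])
# The next run overlaps with the first one and fetches b2 again plus a new message.
upsert(emails, [{"pk": "b2", "from": "y@example.com"},
                {"pk": "c3", "from": "z@example.com"}])
print(len(emails))  # 3
```

This is why an overlapping date window does not produce duplicate rows.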
Attachments are stored by default in the File Storage, prefixed by the generated message `pk`, for example `bb41793268d4a8710fb5ebd94eaed6bc_some_file.pdf`.
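When processing the stored files downstream, the prefix can be split off to link an attachment back to its row in the `emails` table. A minimal sketch, assuming the generated `pk` never contains an underscore (as in the example file name); the helper `split_attachment_name` is hypothetical.

```python
def split_attachment_name(stored_name: str):
    """Split a stored file name into the message pk and the original file name."""
    # Only the first underscore separates the pk; later ones belong to the file name.
    pk, _, original = stored_name.partition("_")
    return pk, original


print(split_attachment_name("bb41793268d4a8710fb5ebd94eaed6bc_some_file.pdf"))
# ('bb41793268d4a8710fb5ebd94eaed6bc', 'some_file.pdf')
```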
The files will contain additional tags to distinguish the source:
Additional tags can be specified by the Create File Manifest processor or further processed and stored in the Table Storage by other processors.