This data source connector uses the Instagram Graph API (built on the Facebook Graph API) to extract media objects, comments, insights and metrics from Instagram Business Accounts.
In order to access a business account’s data, users have to authorize a Facebook account and choose a Facebook page that is connected to an Instagram Business Account. The rest of the configuration process is almost identical to configuring the Facebook connector.
Before you begin, make sure you have a Facebook page, a role on that page, and an Instagram account. The Facebook page needs to be connected to the Instagram Business Account.
Create a new configuration of the Instagram connector.
Then click Authorize Account to authorize the configuration with a Facebook account that has access to the Facebook page connected to your Instagram Business Account. You will be asked for the instagram_basic, instagram_manage_insights, and pages_show_list permissions.
Optionally, you can use Direct token insert to specify a manually generated access token.
You can always revoke the authorization by removing the Keboola IG data source from the list in the Facebook apps tab (under settings).
From the list of fetched Instagram Business Accounts associated with the authorized Facebook account, select the one you want to extract from.
Create a new query and specify what data to extract. If you choose a preconfigured template, all necessary fields will be filled automatically.
The query describes the request that the connector sends to the Instagram Graph API, which is built on the Facebook Graph API. Knowing the API makes creating a query easier because all options except Name correspond to Facebook Graph API request parameters.
The Name option describes the query and is used to prefix all table names resulting from the query. One query can produce multiple tables. If a table name produced by the query matches the query name or its substring trimmed after the last occurrence of an underscore, then the output table name will not be prefixed and the query name will be used instead.
For example, if the query name is media_comments and the produced table name is media_comments, the output table name will be media_comments. If the query name is foo and the produced table name is insights, the output table name will be foo_insights.
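The naming rule can be sketched in a few lines of Python. This is only an illustration, not the connector's actual code: the function name output_table_name is hypothetical, and the sketch assumes that "its substring trimmed after the last occurrence of an underscore" means the part of the query name that follows the last underscore.

```python
def output_table_name(query_name: str, produced_name: str) -> str:
    """Return the output table name for a table produced by a query.

    Assumption: the "substring" in the rule is the part of the query
    name after its last underscore (e.g. "comments" for "media_comments").
    """
    suffix = query_name.rsplit("_", 1)[-1]
    if produced_name in (query_name, suffix):
        # No prefixing: the query name is used as the table name.
        return query_name
    return f"{query_name}_{produced_name}"


print(output_table_name("media_comments", "media_comments"))
print(output_table_name("foo", "insights"))
```

Under the stated assumption, a query named media_comments that produces a table named comments would also yield media_comments.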
The Endpoint option specifies the significant part of the URL of the request made to the Instagram Graph API. The absolute URL has the following form: https://graph.facebook.com/<api_version>/<endpoint>.
A typical example is the media endpoint.
If left empty, the Endpoint option references the data of the Instagram Business Account itself, which corresponds to the user endpoint.
The Fields option describes the data returned from the endpoint. Typically, it is a comma-separated list of fields, but it can also be used to parametrize the fields and to nest further endpoints.
The media endpoint returns all media objects created by an Instagram account. Each media object contains fields such as caption, comments_count, created_time and like_count. The fields parameter in such a case is caption,comments_count,created_time,like_count.
Fields/Endpoint Nesting: Media can contain comments, and those can be included in the fields as well: caption,message,created_time,like_count,comments{text,replies,timestamp,like_count,user}. The comma-separated list between the curly brackets {} specifies the fields of the “nested” comment field/endpoint for each media object. More endpoints can be nested this way, and there is no limit on the number of nesting levels.
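Because nested field lists are hard to write by hand, it can help to generate them from a structure. The following is a minimal sketch, not part of the connector: render_fields is a hypothetical helper, and the field names in spec are only sample values. A plain string is a field, and a (name, [subfields]) tuple is a nested field/endpoint.

```python
def render_fields(fields):
    """Render a nested field specification into a Graph API fields string."""
    parts = []
    for item in fields:
        if isinstance(item, str):
            parts.append(item)
        else:
            # A (name, subfields) pair becomes name{sub1,sub2,...}.
            name, nested = item
            parts.append(name + "{" + render_fields(nested) + "}")
    return ",".join(parts)


spec = [
    "caption",
    "like_count",
    ("comments", ["text", "like_count", ("replies", ["text"])]),
]
fields = render_fields(spec)
print(fields)
```

The recursion mirrors the nesting itself, so any depth of nested endpoints is rendered the same way.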
Fields Parametrization: Each field can be parametrized by appending a dot followed by a parameter/modifier name and a value in parentheses. Typical parameters are since, until, and limit, or modifiers that the particular endpoint offers, such as metric for the insights endpoint.
Examples of parametrized fields: comments.limit(10){text,like_count} or insights.period(lifetime).since(5 days ago).until(today).metric(impressions).
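The parametrization syntax can be composed mechanically. The sketch below is illustrative only: parametrized_field is a hypothetical helper, not a connector or Graph API function, and it simply concatenates the pieces in the order described above.

```python
def parametrized_field(name, params=None, subfields=None):
    """Build name.param(value)...{subfield,...} for the Fields option."""
    text = name
    # Each parameter/modifier is appended as .key(value), in insertion order.
    for key, value in (params or {}).items():
        text += f".{key}({value})"
    if subfields:
        text += "{" + ",".join(subfields) + "}"
    return text


comments = parametrized_field("comments", {"limit": 10}, ["text", "like_count"])
insights = parametrized_field(
    "insights",
    {"period": "lifetime", "since": "5 days ago", "until": "today", "metric": "impressions"},
)
print(comments)
print(insights)
```

Both calls reproduce the two example strings given in the text.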
The Instagram Business Account option specifies the Instagram Business Account that the query will be applied to. It can be chosen from the list of accounts selected after authorization. The All Instagram Business Accounts option applies the query to all selected accounts. The None option means that the query will not be applied to any account, which can be useful when extracting data about the authorized account itself. This option is represented by the Facebook Graph API parameter ids, which is a comma-separated list of page ids.
The Since and Until options represent the corresponding Facebook Graph API request parameters and specify the date range applied to the time-based data retrieved by the endpoint. For example, if the endpoint is media, then all media objects created within the specified since-until range will be retrieved.
The Since and Until values are parsed via the strtotime function and can be specified in yyyy-mm-dd format or as relative expressions such as 14 days ago or last month. For consistent results, specify both the since and until parameters.
The Limit option represents the Facebook Graph API request parameter limit; it is the maximum number of objects that may be returned in one page of the request. (The default is 25 and the maximum is 100.)
Lowering it is useful when the Facebook Graph API returns an error saying that too much data was requested; in such cases, lower the limit and run the query again.
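To see how the query options fit together, the sketch below assembles a request URL of the form https://graph.facebook.com/<api_version>/<endpoint> with the options as query parameters. It is a rough illustration under several assumptions, not the connector's implementation: build_request_url is a hypothetical helper, v19.0 and the account id are placeholder values, the access token is omitted, and how the connector actually serializes the parsed Since/Until values is not stated in this documentation.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"


def build_request_url(api_version, endpoint="", fields=None, ids=None,
                      since=None, until=None, limit=None):
    """Assemble <base>/<api_version>/<endpoint>?<params> from query options."""
    # An empty endpoint is skipped, leaving <base>/<api_version>.
    path = "/".join(part for part in (GRAPH_BASE, api_version, endpoint) if part)
    params = {
        "fields": fields,
        "ids": ",".join(ids) if ids else None,  # comma-separated list of ids
        "since": since,
        "until": until,
        "limit": limit,
    }
    # Options that were left empty are not sent at all.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{path}?{query}" if query else path


url = build_request_url(
    "v19.0", "media",                      # placeholder version, media endpoint
    fields="caption,like_count,comments.limit(10){text,like_count}",
    ids=["1234567890"],                    # placeholder account id
    since="2024-01-01", until="2024-01-31", limit=50,
)
print(url)
```

Note that urlencode percent-encodes the commas, braces, and parentheses in the fields value, which is why the printed URL looks different from the Fields string typed into the configuration.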
The output data represent a tree where each node is an array of objects returned from the Facebook Graph API. The tree is transformed into one or more CSV tables.
Each row of a table represents one object. Each table has its primary key auto-detected during the extraction, so the table data is imported incrementally. The columns of the output tables represent fields from the Fields query option. Moreover, each table will always contain the following basic set of columns:
id: the id returned by the Facebook Graph API.
ex_account_id: the id of the Instagram Business Account corresponding to the object stored in the row.
fb_graph_node: describes the “vertical position” of the object in the resulting tree. For example, for media objects it will be page_media, and for comments of media it will be page_media_comments.
parent_id: refers to the id column of a parent object represented by another row and table. For instance, if the row represents a comment object, its parent is a media object, and parent_id is the id of the media object. The parent object type can also be determined from the fb_graph_node column as the substring from the beginning up to the last occurrence of an underscore. For example, if fb_graph_node contains the value page_media_comments, the parent object type is page_media. The top parent is named page, and it represents the Instagram Business Account id.
Instagram Graph API versioning follows Facebook Graph API versioning. You can set the version of the Facebook Graph API that will be applied to all requests made to the API by the Instagram data source connector. Read more about Graph API versions in the Facebook Graph API documentation.
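The relationship between the output tree, the fb_graph_node column, and the parent_id column described above can be illustrated with a small sketch. It is not the connector's code: flatten and parent_type are hypothetical helpers, the sample response is made up, tables are keyed by node path purely for illustration, and using the account id as the parent_id of top-level media rows is an assumption. The sketch relies on the Graph API convention of returning nested edges as {"data": [...]}.

```python
from collections import defaultdict


def flatten(objects, account_id, node, parent_id, tables=None):
    """Collect one row per object, recursing into nested {"data": [...]} edges."""
    tables = defaultdict(list) if tables is None else tables
    for obj in objects:
        row = {"ex_account_id": account_id, "fb_graph_node": node,
               "parent_id": parent_id}
        for key, value in obj.items():
            if isinstance(value, dict) and isinstance(value.get("data"), list):
                # Nested edge: its objects go to a child node of this one.
                flatten(value["data"], account_id, f"{node}_{key}", obj["id"], tables)
            else:
                row[key] = value
        tables[node].append(row)
    return tables


def parent_type(fb_graph_node):
    """Parent object type: everything before the last underscore."""
    return fb_graph_node.rsplit("_", 1)[0]


media = [{"id": "m1", "caption": "hello",
          "comments": {"data": [{"id": "c1", "text": "nice"}]}}]
tables = flatten(media, "acc1", "page_media", "acc1")
for name, rows in tables.items():
    print(name, rows)
```

Each comment row carries the id of its media object in parent_id, and parent_type("page_media_comments") yields page_media, matching the rule stated for the fb_graph_node column.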