Triggers are how data pipelines are invoked. There are three types of trigger:
- A NewData trigger invokes a pipeline every time data is added to the data collection in the data lake specified in DataCollection.
- A Scheduled trigger runs on a schedule and invokes the pipeline only if new data was added to the data collection since the trigger was last executed.
- A DirectPost trigger invokes the pipeline when new data is posted directly to this trigger, bypassing the data collections.
A trigger specifies a list of DataSources, which define the input data of the pipeline using the data collections and/or directly posted data.
Syntax
YAML
Name: String
Type: TriggerType
CronSchedule: String
DataCollection: String
Filters:
- Filter
Pipeline: String
DataSources:
- DataSource
JSON
{
  "Name": String,
  "Type": TriggerType,
  "CronSchedule": String,
  "DataCollection": String,
  "Filters": [ Filter, ... ],
  "Pipeline": String,
  "DataSources": [ DataSource, ... ]
}
Properties
Name
The name has to be unique across all triggers of one user and is used to reference the trigger when updating or deleting it.
Required: No
Type: String
Default: Automatically generated name based on the pipeline
Type
The type specifies how the trigger is fired.
Required: Yes
Type: TriggerType
Options:
- NewData (0): This type of trigger listens to DataCollection and fires every time new data is added.
- Scheduled (1): This type of trigger runs on a schedule but fires only if new data was added to the DataCollection since the trigger last ran.
- DirectPost (2): This type of trigger runs when you directly post data to it at /data/DirectToTrigger/{triggerName}
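As an illustration of the DirectPost option, here is a minimal sketch of a trigger definition. Only the property names and the URL pattern come from this reference; the trigger name, pipeline name and input name are hypothetical, and writing the type by its name rather than its number is an assumption.

```yaml
# Hypothetical names; adjust to your own pipeline and inputs.
Name: sensor-direct                 # takes the place of {triggerName} in the post URL
Type: DirectPost                    # assumed to be accepted by name; the numeric value is 2
Pipeline: process-sensor-reading    # hypothetical pipeline name
DataSources:
  - Type: DirectPost                # only valid for triggers of type DirectPost
    Input: reading                  # hypothetical pipeline input name
```

With this definition, data would be posted to /data/DirectToTrigger/sensor-direct, following the URL pattern given above.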
DataCollection
The name of the data collection where the data should be taken from.
Required: Conditional, if Type is NewData or Scheduled
Type: String
Pipeline
The name of the pipeline that should be invoked by the trigger.
Required: Yes
Type: String
Filters
A list of filters, where each filter specifies a criterion the data item in question must meet for the trigger to use it. All filters in the list have to be satisfied. These filters are only used to decide whether the trigger should fire.
Required: No
Type: List of Filter
Default: Empty list
Filter/Template
The Liquid template applied to the data item. The variable data can be used to access the contents of the data item being filtered.
Required: Yes
Type: Liquid
Filter/ExpectedValue
The expected value that the evaluated template is compared to.
Required: Yes
Type: String
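To make the Template and ExpectedValue pair concrete, the following sketch shows a NewData trigger that fires only when both filters match. It assumes data items are objects with status and region fields, which is purely hypothetical, as are all names used; only the property names are taken from this reference.

```yaml
# Hypothetical example: fire only for new orders that are ready and in the EU region.
Name: orders-ready
Type: NewData
DataCollection: orders              # hypothetical data collection
Pipeline: fulfil-order              # hypothetical pipeline
Filters:
  # Templates are quoted because "{" would otherwise start a YAML flow mapping.
  - Template: "{{ data.status }}"   # assumes each data item has a status field
    ExpectedValue: "ready"
  - Template: "{{ data.region }}"   # assumes each data item has a region field
    ExpectedValue: "EU"
DataSources:
  - Type: LatestItem
    DataCollection: orders
    Input: order                    # hypothetical pipeline input name
```

Because all filters in the list have to be satisfied, an item with status ready but a different region would not fire this trigger.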
CronSchedule
A string representing a CRON schedule. The trigger is executed based on this schedule.
Required: Conditional, if Type is Scheduled
Type: String
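A Scheduled trigger could be sketched as follows. The schedule uses the common five-field cron form; which cron dialect is actually accepted is not stated here, so treat the expression as an assumption. All names are hypothetical.

```yaml
# Hypothetical example: run daily, but the pipeline is only invoked if new data arrived.
Name: nightly-report
Type: Scheduled
CronSchedule: "0 2 * * *"           # 02:00 every day in five-field cron; dialect is an assumption
DataCollection: measurements        # hypothetical data collection checked for new data
Pipeline: build-report              # hypothetical pipeline
DataSources:
  - Type: UnprocessedSet            # everything added since the trigger last executed
    DataCollection: measurements
    Input: items                    # hypothetical pipeline input name
```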
DataSources
Each data source wires data from a DataCollection or direct post to the Inputs of the Pipeline. If no source is defined, the trigger will not have any effect.
Required: No
Type: List of DataSource
Default: Empty list
DataSource/Type
Each type of data source has a different way to retrieve data from a DataCollection or direct post.
Required: Yes
Type: DataSourceType
Options:
- LatestItem (0): The latest item that was added to the data collection. If the trigger is of type NewData and this data source uses the same data collection with this DataSourceType, then the item passed to the pipeline will always be the new item.
- FixedSizeSet (1): A fixed-size data set of the newest items added to the data collection. The size of the set is specified in DataSetSize.
- UnprocessedSet (2): All data that was added to the data collection since this trigger last executed.
- DirectPost (3): Only valid for TriggerType DirectPost. The data that was directly posted to the trigger.
DataSource/DataCollection
The DataCollection that this data source retrieves data from.
Required: Conditional, if Type is LatestItem, FixedSizeSet or UnprocessedSet
Type: String
DataSource/DataSetSize
The number of newest items included in the data set of a FixedSizeSet data source.
Required: Conditional, if Type is FixedSizeSet
Type: Integer
DataSource/Input
The name of the pipeline input this data source should be wired to.
Required: Yes
Type: String
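The DataSource properties described so far can be combined as in this sketch, which wires two data sources from the same collection to two different pipeline inputs. The collection name, input names and set size are hypothetical.

```yaml
# Hypothetical DataSources fragment of a trigger definition.
DataSources:
  - Type: LatestItem
    DataCollection: measurements    # hypothetical data collection
    Input: current                  # hypothetical pipeline input for the newest item
  - Type: FixedSizeSet
    DataCollection: measurements
    DataSetSize: 10                 # required because Type is FixedSizeSet
    Input: history                  # hypothetical pipeline input for the last 10 items
```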
DataSource/Filters
A list of filters, where each filter specifies a criterion the data items in question must meet to be used. All filters in the list have to be satisfied. These filters filter the data for this DataSource and make sure only data meeting the filter criteria gets sent to the pipeline. Filtering generally does not affect the size of a FixedSizeSet DataSource; the set is smaller only if no more data is available.
Required: No
Type: List of Filter
Default: Empty list
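As a sketch of filtering at the data source level, the fragment below restricts a FixedSizeSet to items from one sensor. The sensorId field, its value and all names are hypothetical assumptions about the shape of the data.

```yaml
# Hypothetical DataSources fragment: only items from sensor A1 reach the pipeline.
DataSources:
  - Type: FixedSizeSet
    DataCollection: measurements      # hypothetical data collection
    DataSetSize: 10
    Input: history                    # hypothetical pipeline input name
    Filters:
      - Template: "{{ data.sensorId }}"   # assumes each data item has a sensorId field
        ExpectedValue: "A1"
```

Following the description above, the set passed to the pipeline would still contain 10 items unless no more data is available.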