Pipeline triggers

Triggers are how data pipelines are invoked. There are three types of trigger:

  • A NewData trigger invokes the pipeline every time data is added to the data lake collection specified in DataCollection.
  • A Scheduled trigger runs on a schedule and invokes the pipeline only if new data was added to the data collection since the trigger last executed.
  • A DirectPost trigger invokes the pipeline when new data is posted directly to the trigger, bypassing the data collections.

A trigger specifies a list of DataSources, which define the input data of the pipeline. Each data source draws on a data collection or on directly posted data.

Syntax

YAML

Name: String
Type: TriggerType
CronSchedule: String
DataCollection: String
Filters:
  - Filter
Pipeline: String
DataSources:
  - DataSource

JSON

{
  "Name": String,
  "Type": TriggerType,
  "CronSchedule": String,
  "DataCollection": String,
  "Filters": [ Filter, ... ],
  "Pipeline": String,
  "DataSources": [ DataSource, ... ]
}
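
Example

As an illustration, a filled-in NewData trigger might look like the sketch below. The trigger, collection, pipeline, and input names are hypothetical placeholders, and the sketch assumes that Type values are written by name rather than by number.

```yaml
# Hypothetical example: every name below is a placeholder.
Name: temperature-anomaly-trigger
Type: NewData
DataCollection: temperature-readings   # collection the trigger listens to
Pipeline: anomaly-detection            # pipeline invoked when the trigger fires
DataSources:
  - Type: LatestItem
    DataCollection: temperature-readings
    Input: reading                     # pipeline input that receives the item
```

Because the data source reads LatestItem from the same collection the trigger listens to, the pipeline receives the newly added item.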

Properties

Name

The name must be unique per user. It is used to reference the trigger when updating or deleting it.
Required: No
Type: String
Default: Automatically generated name based on the pipeline

Type

The type determines what causes the trigger to fire.
Required: Yes
Type: TriggerType
Options:

  • NewData (0): This type of trigger listens to DataCollection and fires every time new data is added.
  • Scheduled (1): This type of trigger runs on a schedule but fires only if new data was added to the DataCollection since the trigger last ran.
  • DirectPost (2): This type of trigger runs when you directly post data to it at /data/DirectToTrigger/{triggerName}
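
As a sketch of the DirectPost case, the trigger below has no DataCollection and wires the posted data straight to a pipeline input. The trigger, pipeline, and input names are hypothetical.

```yaml
# Hypothetical example: names are placeholders.
Name: manual-ingest
Type: DirectPost
Pipeline: anomaly-detection
DataSources:
  - Type: DirectPost        # only valid on a DirectPost trigger
    Input: reading
```

With this name, data would be posted to /data/DirectToTrigger/manual-ingest.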

DataCollection

The name of the data collection that the trigger monitors for new data.
Required: Conditional, if Type is NewData or Scheduled
Type: String

Pipeline

The name of the pipeline that should be invoked by the trigger.
Required: Yes
Type: String

Filters

A list of filters, where each filter specifies a criterion that a data item must meet for the trigger to use it. All filters in the list must be satisfied. These filters are used only to determine whether the trigger should fire.
Required: No
Type: List of Filter
Default: Empty list

Filter/Template

The Liquid template applied to the data item. Use the variable data to access the contents of the data item being filtered.
Required: Yes
Type: Liquid

Filter/ExpectedValue

The value that the evaluated template is compared to.
Required: Yes
Type: String
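
A minimal sketch of a Filters list follows. It assumes the data items carry the hypothetical fields sensorType and location.building, and that a filter is satisfied when the rendered template equals ExpectedValue as a string. The templates are quoted because an unquoted value starting with a brace would be parsed as a YAML mapping.

```yaml
# Hypothetical example: field names inside "data" are placeholders.
Filters:
  - Template: "{{ data.sensorType }}"
    ExpectedValue: temperature
  - Template: "{{ data.location.building }}"
    ExpectedValue: HQ
```

Since all filters in a list must be satisfied, a data item passes only if both templates render to their expected values.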

CronSchedule

A string representing a CRON schedule. The trigger is executed based on this schedule.
Required: Conditional, if Type is Scheduled
Type: String
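
The sketch below shows a Scheduled trigger that is checked at the top of every hour. It assumes the common five-field cron format (minute, hour, day of month, month, day of week); the cron variant actually accepted is not stated here, so treat the expression as an assumption. All names are hypothetical.

```yaml
# Hypothetical example: names are placeholders, cron format is assumed.
Name: hourly-batch
Type: Scheduled
CronSchedule: "0 * * * *"            # quoted so YAML keeps it as a string
DataCollection: temperature-readings
Pipeline: hourly-aggregation
DataSources:
  - Type: UnprocessedSet             # everything added since the last run
    DataCollection: temperature-readings
    Input: readings
```

If no data was added to temperature-readings during the hour, the trigger does not invoke the pipeline.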

DataSources

Each data source wires data from a data collection or a direct post to one of the inputs of the pipeline. If no data source is defined, the trigger has no effect.
Required: No
Type: List of DataSource
Default: Empty list

DataSource/Type

The type determines how the data source retrieves data from its data collection or from a direct post.
Required: Yes
Type: DataSourceType
Options:

  • LatestItem (0): The latest item that was added to the data collection. If the trigger is of type NewData and this data source uses the same data collection as the trigger, the item passed to the pipeline is always the new item.
  • FixedSizeSet (1): A fixed-size set of the newest items added to the data collection. The size of the set is specified in DataSetSize.
  • UnprocessedSet (2): All data that was added to the data collection since this trigger last executed.
  • DirectPost (3): The data that was posted directly to the trigger. Only valid for triggers of type DirectPost.

DataSource/DataCollection

The data collection that this data source retrieves data from.
Required: Conditional, if Type is LatestItem, FixedSizeSet or UnprocessedSet
Type: String

DataSource/DataSetSize

The number of newest items to include in the FixedSizeSet data set.
Required: Conditional, if Type is FixedSizeSet
Type: Integer

DataSource/Input

The name of the pipeline input this data source should be wired to.
Required: Yes
Type: String
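
To illustrate how several data sources map onto pipeline inputs, the sketch below feeds one input with the latest reading and a second input with the 50 newest calibration records. The collection names and the input names (reading, calibration) are hypothetical and would have to match inputs declared by the pipeline.

```yaml
# Hypothetical example: collection and input names are placeholders.
DataSources:
  - Type: LatestItem
    DataCollection: temperature-readings
    Input: reading
  - Type: FixedSizeSet
    DataCollection: calibration-records
    DataSetSize: 50                  # required for FixedSizeSet
    Input: calibration
```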

DataSource/Filters

A list of filters, where each filter specifies a criterion that a data item must meet to be used. All filters in the list must be satisfied. These filters apply to the data of this DataSource and ensure that only data meeting the filter criteria is sent to the pipeline. Filtering generally does not reduce the size of a FixedSizeSet data source; the set is smaller only when not enough data is available.
Required: No
Type: List of Filter
Default: Empty list
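
As a sketch of filtering at the data source level, the fragment below requests the 100 newest items whose hypothetical status field renders to "valid". The collection, input, and field names are assumptions made for illustration.

```yaml
# Hypothetical example: names and the "status" field are placeholders.
DataSources:
  - Type: FixedSizeSet
    DataCollection: temperature-readings
    DataSetSize: 100
    Input: readings
    Filters:
      - Template: "{{ data.status }}"
        ExpectedValue: valid
```

Following the note above, the pipeline would still receive 100 items as long as at least 100 matching items exist in the collection.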