Sifflet Agentᴮᴱᵀᴬ

You can connect Sifflet to sources running inside private networks with the Agent.

🚧

The Sifflet Agent is currently in private beta.

Sifflet will extend the capabilities of the Agent over time. Please expect some rough edges when running the Agent. We encourage all users of the Agent to share feedback and report any bugs to their Customer Success Manager or Sifflet support.

Please contact Sifflet if you want to turn on this feature on your instance.

The Sifflet Agent is a lightweight service that you run inside your own infrastructure. When using the Agent, your sources don't have to be exposed on the Internet.

Overview

The Sifflet Agent queries your Sifflet instance to find out which actions it needs to execute on the data sources, and then queries the data sources from inside the private network. As such, there is no need to open ports for incoming traffic: the Agent only needs to be able to open outgoing HTTPS connections to your Sifflet SaaS instance.

Supported data sources

Currently, the following data sources are supported by the Agent:

Deploying the Agent

Requirements

The Agent is packaged as a Docker image, and runs anywhere you can run Docker containers (including Kubernetes, services such as AWS ECS, or bare virtual machines or servers).

The Agent needs to:

  • Have access to the API of your Sifflet instance, through HTTPS. When using a Sifflet SaaS instance, if you access the Sifflet web application with https://example.siffletdata.com, your API URL is https://example.siffletdata.com/api.
  • Have network access to the data sources it needs to handle, with the credentials you provide in the Sifflet web UI.
  • Use a supported data source (see list above).
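Before deploying, it can help to confirm the first requirement from the machine that will host the Agent. The following is a minimal sketch, not an official check: the function names `sifflet_api_url` and `can_reach` are hypothetical, `curl` is assumed to be installed, and any HTTP response is treated as proof of connectivity because this page does not describe what the API root returns.

```shell
#!/bin/sh
# Derive the API URL from the web application URL by appending /api.
# A trailing slash on the input is tolerated.
sifflet_api_url() {
  printf '%s/api\n' "${1%/}"
}

# Return 0 if an outgoing HTTPS connection to the given URL can be made.
# Any HTTP status counts: only network reachability is checked here.
can_reach() {
  curl --silent --show-error --output /dev/null --max-time 10 "$1"
}

# Example (placeholder instance name):
#   can_reach "$(sifflet_api_url "https://example.siffletdata.com")" && echo "reachable"
```

A failure here points to a DNS, firewall, or proxy issue on outgoing traffic rather than to the Agent itself.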

Installation

Create an access token for the Agent

The first step is to create an Access Token with the "Agent Service Account" role, which the Agent will use to authenticate.

  • Follow the Access Tokens documentation to create the Access Token.
  • The Access Token needs to have the "Agent Service Account" role.
  • Make sure to safely store the Access Token on your side, as it will only be displayed once after creation.

Deploy the agent

Pull the image

The Agent's Docker image is available in the Amazon ECR Public Gallery under the URI public.ecr.aws/sifflet/agent.

For instance, here's how to pull the Agent image to your local machine:

docker pull public.ecr.aws/sifflet/agent:latest

Start the agent

To start the Agent, simply run the Docker image with the appropriate configuration. For example, with the Sifflet API URL https://example.siffletdata.com/api and the token example-token:

docker run public.ecr.aws/sifflet/agent:latest -u "https://example.siffletdata.com/api" -t "example-token"

Do not forget to give the Agent the necessary network permissions to connect to the Sifflet API and to query your data sources.

You will know that the Agent was successfully started when it logs the message Sifflet agent started.
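If you start the Agent from a script, you may want to wait for that log message before moving on. Here is a minimal sketch, assuming the container was started in the background under a name of your choosing (for example with `docker run -d --name sifflet-agent ...`); the helper names `agent_started` and `wait_for_agent` are hypothetical.

```shell
#!/bin/sh
# Read log lines on stdin and succeed if the startup message is present.
agent_started() {
  grep -q "Sifflet agent started"
}

# Poll the logs of a container until the startup message appears.
# $1: container name, $2: maximum number of attempts (default 30, one per second).
wait_for_agent() {
  container="$1"
  tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    if docker logs "$container" 2>&1 | agent_started; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

# Example:
#   wait_for_agent sifflet-agent 60 || echo "Agent did not start in time" >&2
```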

Create a source that runs on the agent

To check that the Agent works as expected, create a data source supported by the agent. You will have a checkbox giving you the option to use the Sifflet Agent on this data source.

Check this option, and then use the "Test Connection" button. If you get an error, check the Agent's logs to understand why the connection failed.

Configuration

Agent configuration

To configure the agent, there are three options:

  • Using the CLI options directly.
  • Using the CLI options in a configuration file.
  • Using environment variables.

Using CLI options directly

--instance-access-token, --token or -t (string or flag)

Your Sifflet instance access token. If used as a flag, this will prompt you for the token.

--instance-api-url, --url or -u (string)

Your Sifflet instance API URL.

--version (flag)

Display version information.

Example of usage:

docker run public.ecr.aws/sifflet/agent:latest --url "https://example.siffletdata.com/api" --token "example-token"

Refer to the --help option for details and advanced options.

docker run public.ecr.aws/sifflet/agent:latest --help

Using CLI options in a configuration file

You can use a file to pass the same options as presented above. The following is an example file named .options.cmd:

--url https://example.siffletdata.com/api
--token "example-token"

You can then reference it when running the Agent:

docker run public.ecr.aws/sifflet/agent:latest @.options.cmd
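Because the options file contains the access token, you may want to create it with restrictive permissions. The sketch below shows one way to do so; the function name `write_agent_options` is hypothetical, and the file format simply mirrors the example above.

```shell
#!/bin/sh
# Write an options file containing the API URL and the access token.
# The subshell keeps the restrictive umask from leaking into the caller,
# so the new file is readable and writable by its owner only.
# $1: path of the options file, $2: API URL, $3: access token.
write_agent_options() {
  (
    umask 077
    {
      printf '%s %s\n' '--url' "$2"
      printf '%s "%s"\n' '--token' "$3"
    } > "$1"
  )
}

# Example:
#   write_agent_options .options.cmd "https://example.siffletdata.com/api" "example-token"
```

Since the Agent runs inside a container, the file also has to be visible inside that container, for instance through a bind mount. The path to mount it at depends on the image's working directory, which this page does not specify.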

Using environment variables

All of the options can also be configured with environment variables. The name of each environment variable is the long name of the option, in upper case, with hyphens replaced by underscores and prepended with AGENT_. For example:

export AGENT_INSTANCE_ACCESS_TOKEN="example-token"
export AGENT_INSTANCE_API_URL="https://example.siffletdata.com/api"
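The naming rule can be sketched as a small helper, which is handy when looking up the variable for an option listed by --help. The function name `option_to_env` is hypothetical, and the sketch assumes the rule applies uniformly to every long option name, as the two examples above suggest.

```shell
#!/bin/sh
# Convert a long CLI option name to its environment variable name:
# strip the leading "--", upper-case the rest, replace "-" with "_",
# and prepend "AGENT_".
option_to_env() {
  name="${1#--}"
  printf 'AGENT_%s\n' "$(printf '%s' "$name" | tr 'a-z-' 'A-Z_')"
}

# Example:
#   option_to_env --instance-api-url
```

Variables exported in your shell are not passed to a container automatically. With Docker, `docker run -e AGENT_INSTANCE_ACCESS_TOKEN -e AGENT_INSTANCE_API_URL public.ecr.aws/sifflet/agent:latest` forwards the values from the current shell to the container.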

Use the agent in sources

When creating or updating a data source supported by the agent on the data source configuration page, you will have a checkbox giving you the option to use the Sifflet Agent on this data source.

You can then test the connection to the data source, and run the data source ingestion, as you would without the Agent.

Known limitations

For now, the Sifflet Agent has the following limitations:

  • The "List schemas" button on the source configuration page will return an error. This will be fixed soon.
  • Using the "Preview data" feature on data assets belonging to data sources configured with the Agent will return an error, as this feature is not yet supported with the Agent.
  • Viewing or downloading Failing Rows in a failing monitor linked to an asset from a data source configured with the Agent will return an error, as this feature is not yet supported with the Agent.
  • Images are only provided for x86 architectures.

Alternatives

If the Sifflet Agent does not meet your requirements, Sifflet also supports private connections through AWS PrivateLink and Azure Private Link.