URI sources

Concept

URI stands for Uniform Resource Identifier. Sifflet uses this concept to identify any object, inside or outside the application, with a universal identifier.

You can read more about the general concept of URIs on the dedicated Wikipedia page: https://en.wikipedia.org/wiki/Uniform_Resource_Identifier

URI in Sifflet

OpenLineage designed a standard URI model for datasets and jobs. Sifflet adapts and extends this standard to fit its technology requirements.

Every asset in the Sifflet catalog can be identified by a URI in the following format:

URI = scheme ":" ["//" authority] uniqueName

  • Scheme defines the identifier assignment specification. For data assets, it depends on the technology of the asset being identified: for example bigquery for BigQuery, mysql for MySQL, oracle for Oracle, and so on.
  • Authority defines the instance where the asset is located. Most commonly it combines the host and port of a service, but for some technologies it can simply be a workspace identifier or an account identifier. For a fully decentralized service it can be omitted entirely. A few examples:
    • An Oracle asset authority is defined by the address of the host of the Oracle server and its attached port.
    • A PowerBI asset authority is defined using the id of the workspace where the asset is located.
    • BigQuery assets do not have an authority: the BigQuery service is fully decentralized.
  • Unique Name defines the path to the asset inside an instance. In some cases it is simply the identifier or name of the asset; in others it follows the more complex namespace hierarchy used by the system. Here are some examples:
    • A PostgreSQL asset unique name is defined using the hierarchy database > schema > table. It is written databaseName.schemaName.tableName.
    • A Looker dashboard unique name is defined using the id of the dashboard inside the Looker instance it belongs to. There is no hierarchy to take into account given a dashboard id is unique inside a given instance.
    • A BigQuery asset unique name is defined using the hierarchy project > dataset > table. It is written projectName.datasetName.tableName.
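
The three components above can be sketched in a few lines of Python. This is only an illustration of the format scheme ":" ["//" authority] uniqueName, not Sifflet code: the helper names build_uri and split_uri are hypothetical, the host, project, and table names are placeholders, and the "/" placed between the authority and the unique name is an assumption borrowed from generic URI syntax (RFC 3986). Check the per-technology pages for the exact form.

```python
from typing import Optional, Tuple


def build_uri(scheme: str, unique_name: str, authority: Optional[str] = None) -> str:
    """Assemble scheme ":" ["//" authority] uniqueName (hypothetical helper)."""
    if not scheme or not unique_name:
        raise ValueError("scheme and unique_name are required")
    if authority is None:
        # No authority, e.g. a fully decentralized service such as BigQuery.
        return f"{scheme}:{unique_name}"
    # Assumption: "/" separates the authority from the unique name.
    return f"{scheme}://{authority}/{unique_name}"


def split_uri(uri: str) -> Tuple[str, Optional[str], str]:
    """Split a URI back into (scheme, authority, unique_name)."""
    scheme, sep, rest = uri.partition(":")
    if not sep or not scheme:
        raise ValueError(f"missing scheme in {uri!r}")
    if rest.startswith("//"):
        # Everything up to the first "/" after "//" is the authority.
        authority, _, unique_name = rest[2:].partition("/")
        return scheme, authority, unique_name
    return scheme, None, rest


# Placeholder values, for illustration only.
bigquery_uri = build_uri("bigquery", "my-project.sales.orders")
oracle_uri = build_uri("oracle", "SALES.ORDERS", authority="db.example.com:1521")
print(bigquery_uri)
print(oracle_uri)
print(split_uri(oracle_uri))
```

Keeping the authority optional mirrors the square brackets in the format: a BigQuery-style URI carries only a scheme and a unique name, while a host-based technology such as Oracle also carries the host and port.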

For the full URI definition of each technology, or to learn how to craft a generic specification URI, see the dedicated pages of the documentation.

Finding an asset's URI

You can get the URI of an existing asset from its asset page.

  1. Go to the details page of an asset
  2. Click the three-dot menu in the top-right corner of the page
  3. Click Copy Data Asset URI to copy the asset URI to your clipboard