Lineage is an important part of understanding your data ecosystem. On this page you will learn how to:
- Understand how different parts of your data relate to each other
- Filter your data lineage by Data Type, by Search Term, or by excluding a Search Term
Watch this video for detailed information on using the lineage feature, continue reading for an overview, or skip to the next section to read about lineage in more detail.
Select Star can show you column-level lineage for your data assets. The lineage view is designed to show where the data comes from and where it flows to, so you can find the dependencies of each table, column, or dashboard, and see how changes to your assets would impact your data environment.
When you connect a data source to Select Star, lineage is automatically generated by parsing the SQL statements that ran in your data source. Which statements we parse, and how we parse them, varies by source.
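To give a feel for what deriving lineage from SQL means, here is a minimal sketch in Python. It is not how Select Star parses your queries: the regular expressions, the `extract_table_lineage` function, and the table names are all hypothetical, and real SQL needs a proper parser. It only shows the idea of turning one statement into source-to-target relationships.

```python
import re

# Hypothetical, heavily simplified patterns (an assumption for illustration).
# They ignore subqueries, CTEs, quoted identifiers, and most SQL dialect details.
TARGET_RE = re.compile(
    r"\b(?:INSERT\s+INTO|CREATE\s+(?:OR\s+REPLACE\s+)?(?:TABLE|VIEW))\s+([\w.]+)",
    re.IGNORECASE,
)
SOURCE_RE = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)


def extract_table_lineage(sql: str) -> list[tuple[str, str]]:
    """Return (source, target) pairs found in a single SQL statement."""
    target_match = TARGET_RE.search(sql)
    if target_match is None:
        return []  # the statement does not write to a table or view
    target = target_match.group(1)
    sources = sorted(set(SOURCE_RE.findall(sql)))
    return [(source, target) for source in sources if source != target]


# Made-up statement and table names.
statement = """
CREATE TABLE analytics.daily_orders AS
SELECT o.order_date, c.region, o.amount
FROM raw.orders o
JOIN raw.customers c ON o.customer_id = c.id
"""
edges = extract_table_lineage(statement)
for source, target in edges:
    print(f"{source} -> {target}")
```

In this sketch, both `raw.orders` and `raw.customers` end up as upstream sources of `analytics.daily_orders`.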
Click on the Lineage tab from a table page to see how it connects to your other tables and dashboards.
The lineage modal opens with a graph of the Upstream Sources and Downstream Targets of the data asset you're looking at. There is also a sidebar on the left side of the page which shows the same information as a directory tree.
You can navigate through the lineage using either the graph or the sidebar. The sidebar may be easier to use when navigating large graphs which show lots of connected data.
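If it helps to think of the lineage graph in code, the sketch below models it as a list of directed edges and walks it in either direction. The `EDGES` list, the asset names, and the `reachable` function are assumptions made up for this example, not part of Select Star.

```python
from collections import deque

# Hypothetical edges: (source, target) means data flows from source to target.
EDGES = [
    ("raw.orders", "analytics.daily_orders"),
    ("raw.customers", "analytics.daily_orders"),
    ("analytics.daily_orders", "dashboard.revenue"),
]


def reachable(start, edges, direction="downstream"):
    """Collect every asset reachable from `start` in the given direction."""
    neighbors = {}
    for source, target in edges:
        if direction == "downstream":
            neighbors.setdefault(source, []).append(target)
        else:  # upstream: follow the edges backwards
            neighbors.setdefault(target, []).append(source)

    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


downstream = reachable("raw.orders", EDGES)
upstream = reachable("dashboard.revenue", EDGES, direction="upstream")
```

Here `downstream` holds the assets that a change to `raw.orders` could affect, and `upstream` holds everything that feeds `dashboard.revenue`.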
Make sure Dashboards is checked at the bottom of the lineage graph if you want to see downstream dashboards or reports.
Want to keep track of all the tables you've clicked on? Auto-close unrelated assets is checked by default to keep your lineage loading quickly. If you'd like to keep everything you've clicked on open on the graph, uncheck this option.
If in doubt, hover over any of the icons in the graph to show what will happen if you click on them.
To make it even easier to navigate lineage for tables or dashboards with many fields, you can search within a table or dashboard.
First click the item you want to search so the magnifying glass icon appears, then type your search to narrow the results.
If you need to narrow down results, use the filtering features on your upstream or downstream lineage. You can filter by:
- Data Type
- Search Term
- Exclude Search Term
Note that you can combine Data Type with either Search Term or Exclude Search Term, but Search Term and Exclude Search Term cannot be applied at the same time.
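The layering rule can be pictured with a small Python sketch. Everything here is hypothetical: the `LineageFilter` class, its field names, the asset names, and the data type labels are placeholders for illustration and do not reflect how Select Star implements filtering.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LineageFilter:
    """Hypothetical model of the three filter options."""
    data_type: Optional[str] = None
    search_term: Optional[str] = None
    exclude_term: Optional[str] = None

    def __post_init__(self):
        # Search Term and Exclude Search Term cannot be applied together.
        if self.search_term and self.exclude_term:
            raise ValueError("Search Term and Exclude Search Term cannot be combined")

    def matches(self, name: str, data_type: str) -> bool:
        if self.data_type and data_type.lower() != self.data_type.lower():
            return False
        if self.search_term and self.search_term.lower() not in name.lower():
            return False
        if self.exclude_term and self.exclude_term.lower() in name.lower():
            return False
        return True


# Placeholder assets as (name, data type) pairs.
assets = [
    ("orders_daily", "table"),
    ("orders_tmp", "table"),
    ("Orders Overview", "dashboard"),
]

# Data Type layered with Exclude Search Term.
layered = LineageFilter(data_type="table", exclude_term="tmp")
visible = [name for name, kind in assets if layered.matches(name, kind)]
```

With these placeholder assets, only `orders_daily` passes the layered filter, and building a filter with both a search term and an exclude term raises an error.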
Open your 🔍 Filtering Options to get started:
From there, you can Search by Term:
Search by Term and Filter by Data Type:
And Exclude Search Term:
When talking about lineage, we say that data is propagated downwards to a downstream data asset (another table, view, dashboard, etc.). Data can be propagated as follows:
AS IS: The data in the target is identical in value and format to that in the source.
AGGREGATED: The data in the target has been aggregated, and the value in the target may be different from the one at the source.
TRANSFORMED: The data in the target has been transformed, and the format and values might be different from the ones at the source.
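As a rough illustration of the three categories, the sketch below classifies a single SELECT expression. This is a simple heuristic written for this example, not Select Star's classification logic: the `classify_propagation` function, the list of aggregate functions, and the sample expressions are all assumptions.

```python
import re

# Assumed list of aggregate functions; real SQL dialects have many more.
AGGREGATES = ("SUM", "COUNT", "AVG", "MIN", "MAX")
BARE_COLUMN_RE = re.compile(r"^[\w.]+$")
AGGREGATE_RE = re.compile(r"\b(?:%s)\s*\(" % "|".join(AGGREGATES), re.IGNORECASE)


def classify_propagation(select_expression: str) -> str:
    """Guess how a source column reaches the target, based on its expression."""
    expr = select_expression.strip()
    if BARE_COLUMN_RE.match(expr):
        return "AS IS"        # plain column reference, copied unchanged
    if AGGREGATE_RE.search(expr):
        return "AGGREGATED"   # wrapped in an aggregate function
    return "TRANSFORMED"      # any other function or expression


examples = ["o.order_date", "SUM(o.amount)", "UPPER(c.region)"]
labels = {expr: classify_propagation(expr) for expr in examples}
```

In this sketch a plain column reference counts as AS IS, an aggregate call as AGGREGATED, and anything else as TRANSFORMED.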
When calculating lineage between your assets, we also automatically classify downstream propagation. You can see how a column is propagated by editing the column tags. Learn more about tagging in Tag Management.
Lineage refreshes approximately every 24 hours, after metadata sync is complete.
Select Star looks at DDL statements (used to build and modify the structure of your database) and DML statements (used to query and modify the data in your tables) to identify the lineage of your data.
Select Star adds new relationships to lineage based on both DDL (e.g. CREATE) and DML (e.g. INSERT/UPDATE) statements, but only removes lineage relationships when a new DDL statement is detected.
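One way to read this rule is sketched below in Python. It rests on an assumption not spelled out above: that a new DDL statement for a target replaces the relationships previously recorded for that target, while DML statements only ever add to them. The `apply_statement` function and the table names are hypothetical.

```python
def apply_statement(lineage, kind, target, sources):
    """Return the updated set of (source, target) edges.

    kind is "DDL" or "DML". Assumption for this sketch: DDL replaces the
    target's existing edges, DML only adds new ones.
    """
    new_edges = {(source, target) for source in sources}
    if kind == "DDL":
        kept = {edge for edge in lineage if edge[1] != target}
        return kept | new_edges
    return set(lineage) | new_edges


lineage = set()
# CREATE TABLE ... AS SELECT from two sources.
lineage = apply_statement(lineage, "DDL", "analytics.daily_orders",
                          {"raw.orders", "raw.customers"})
# A later INSERT from a third source only adds an edge.
lineage = apply_statement(lineage, "DML", "analytics.daily_orders", {"raw.refunds"})
after_dml = set(lineage)
# A new CREATE that reads from a single source removes the stale edges.
lineage = apply_statement(lineage, "DDL", "analytics.daily_orders", {"raw.orders"})
```

After the DML statement the target has three upstream sources. After the second DDL statement only `raw.orders` remains.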