Assets that are included in lineage reports
Jobs and mapping specifications are excluded from lineage by default. Assets of all other types are included in lineage by default. When you include or exclude an asset type for a report, the child assets of that asset type are also included in or excluded from the report. Virtual assets are also displayed in lineage reports. A virtual asset is an asset that was not imported into or created in the catalog but is accessed by a job.
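The default rule and the way a setting carries over to child assets can be sketched as follows. This is only an illustration, not the product's implementation: the `AssetType` class, the `resolve_inclusion` function, the `overrides` dictionary, and the job, stage, and stage column hierarchy used in the usage lines are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field

# Asset types that are excluded from lineage by default, per the text above.
DEFAULT_EXCLUDED_TYPES = {"job", "mapping specification"}


@dataclass
class AssetType:
    """Hypothetical model of an asset type and its child asset types."""
    name: str
    children: list = field(default_factory=list)


def resolve_inclusion(asset_type, overrides=None, parent_setting=None):
    """Return a dict that maps each type name to True (included) or False.

    Assumption for this sketch: an explicit setting in `overrides` wins,
    otherwise a child follows its parent, otherwise the default applies.
    """
    overrides = overrides or {}
    if asset_type.name in overrides:
        included = overrides[asset_type.name]
    elif parent_setting is not None:
        included = parent_setting
    else:
        included = asset_type.name not in DEFAULT_EXCLUDED_TYPES

    result = {asset_type.name: included}
    for child in asset_type.children:
        result.update(resolve_inclusion(child, overrides, included))
    return result


# Illustrative hierarchy only.
job = AssetType("job", [AssetType("stage", [AssetType("stage column")])])
print(resolve_inclusion(job))                 # defaults: job and children excluded
print(resolve_inclusion(job, {"job": True}))  # including jobs also includes children
```

In this sketch, including or excluding a parent type is enough to change the result for every type below it, which mirrors the behavior that the paragraph describes.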
Data lineage and business lineage reports
To see whether an asset type and its child assets can be included in business lineage reports, go to the Details page of the asset.
You can run data lineage and business lineage reports on the following assets and their child assets.
Asset type | Successive child assets |
---|---|
Application (extended data source) | |
Business intelligence (BI) | |
Database | |
Data file | |
File (extended data source) | None |
Stored procedure definition (extended data source) | |
MDM model | |
IDoc type | |
Data lineage reports
You can run data lineage, but not business lineage, reports on the following assets:
Asset | Successive child assets |
---|---|
Jobs from IBM® InfoSphere® DataStage® | |
MDM model | |
|
Mapping |
Virtual assets
At times, you might not want to, or might not be able to, import some data sources into the catalog as catalog assets. In this case, virtual assets let the report show uninterrupted lineage, even through data sources that are not imported. As a result, the lineage representation is complete despite incomplete metadata in the catalog.
Virtual assets display information about a data source from the properties of a stage when no corresponding data source exists in the catalog. If a matching data source does exist in the catalog, the actual data source rather than the virtual asset is displayed in the lineage report.
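The choice between an actual data source and a virtual asset can be pictured with the following minimal sketch. It assumes, for illustration only, that a match is an exact name comparison; the `ResolvedSource` class, the `resolve_data_source` function, and the sample names are hypothetical and are not part of the product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResolvedSource:
    """Hypothetical lineage node for a data source that a stage accesses."""
    name: str
    virtual: bool


def resolve_data_source(stage_source_name, catalog_source_names):
    """Prefer the catalog data source; fall back to a virtual asset.

    Assumption: matching is by exact name. The real matching rules are
    not described in this topic.
    """
    if stage_source_name in catalog_source_names:
        return ResolvedSource(stage_source_name, virtual=False)
    # No corresponding data source exists in the catalog, so the lineage
    # report shows a virtual asset built from the stage properties.
    return ResolvedSource(stage_source_name, virtual=True)


catalog = {"SALESDB", "HRDB"}  # illustrative catalog contents
print(resolve_data_source("SALESDB", catalog))
print(resolve_data_source("STAGING_FILES", catalog))
```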
Ideally, for design lineage, all parameters that are used in the data source identity definitions of stages should have realistic default values. This best practice applies to job parameters and to environment variables that are used as project-level parameters. When default values are missing, lineage reports display the job as linked to virtual data sources. These virtual assets are named according to how the stages address them. When the data source name is given explicitly or when default values exist, the virtual asset name is that data source name. When no default value exists, the virtual asset name is the parameter name enclosed in pound (#) signs, for example #some_param_name#.
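The naming rule can be sketched as a small function. This is a sketch under stated assumptions, not the product's logic: it assumes that a stage property refers to a parameter as `#name#`, and the function `virtual_asset_name`, the `defaults` dictionary, and the sample values are hypothetical.

```python
import re

# Assumed form of a parameter reference in a stage property: #name#
PARAM_REF = re.compile(r"#([^#]+)#")


def virtual_asset_name(source_identity, defaults):
    """Derive the name that a virtual asset would be displayed with.

    - An explicit data source name is returned unchanged.
    - A parameter reference is replaced by its default value, if one exists.
    - A parameter reference with no default value is kept as #name#.
    """
    def substitute(match):
        default = defaults.get(match.group(1))
        # Keep the #name# form when the default is missing or empty.
        return default if default else match.group(0)

    return PARAM_REF.sub(substitute, source_identity)


defaults = {"db_name": "SALESDB"}  # illustrative default values
print(virtual_asset_name("SALESDB", defaults))            # explicit name
print(virtual_asset_name("#db_name#", defaults))          # default value exists
print(virtual_asset_name("#some_param_name#", defaults))  # no default value
```

Under these assumptions, the first two calls both yield `SALESDB`, and the third yields `#some_param_name#`, which is the kind of name that signals a missing default value.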
You need to map data connection objects that are found and extracted from stage properties to databases that exist in the catalog. If this mapping is not done, stages and their jobs are displayed as connecting to a virtual asset rather than to an imported database.
Virtual assets have the following characteristics:
- You cannot search or query for virtual assets.
- Virtual assets are displayed only in lineage reports and in the Usage Information section of the Details page of jobs, stages, and stage columns.
- Lineage filtering cannot hide virtual assets.
- You cannot exclude virtual assets from a business lineage report.
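The effect of lineage filtering on virtual assets can be pictured with the following sketch. It is an illustration only: the `LineageNode` class, the `apply_lineage_filter` function, and the idea of filtering by a set of hidden asset types are assumptions made for the example, not a description of how the product filters reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageNode:
    """Hypothetical node in a lineage report."""
    name: str
    asset_type: str
    virtual: bool = False


def apply_lineage_filter(nodes, hidden_types):
    """Hide catalog assets of the given types; always keep virtual assets."""
    return [
        node for node in nodes
        if node.virtual or node.asset_type not in hidden_types
    ]


nodes = [
    LineageNode("SALESDB", "database"),
    LineageNode("#db_name#", "database", virtual=True),
    LineageNode("load_sales", "job"),
]
# Hiding databases removes SALESDB but leaves the virtual asset in place.
for node in apply_lineage_filter(nodes, {"database"}):
    print(node.name)
```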
The icons for virtual assets are lighter in color than the icons for assets in the catalog.