Manta Flow Informatica EDC User Documentation
Definition of Exported Entities
Manta Flow analyzes SQL scripts (procedures, view definitions, macros, ad-hoc scripts, etc.) as well as ETL, analytical, and reporting tools. It then exports to Informatica EDC the metadata about every relevant SQL statement (that is, every statement that directly transfers data, such as inserts, updates, and deletes), all ETL transformations, and all analytical models and reports. The export is based on the latest revision in the Manta repository. Manta Flow creates a new asset for each transformation described. These assets have hierarchical parents, so they are well organized and easy to find. They are also connected to the database or file columns whose data has been transferred.
The documentation below utilizes screenshots from IBM Automatic Data Lineage to illustrate which entities are and are not exported to EDC. The ones that are not exported are tagged with a red X. Please note that the lineage through a non-exported object is exported if the non-exported object is between two exported assets. See the first screenshot for an example.
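The contraction rule described above (lineage through a non-exported object is kept when that object lies between two exported assets) can be illustrated with a small graph walk. The following Python sketch is only an illustration of the idea, not Manta's implementation; the function name `contract_lineage` and the asset names are hypothetical.

```python
from collections import defaultdict


def contract_lineage(edges, exported):
    """Return lineage edges between exported nodes, skipping non-exported ones.

    edges    -- iterable of (source, target) pairs in the full lineage graph
    exported -- set of node names that are exported
    """
    successors = defaultdict(set)
    for source, target in edges:
        successors[source].add(target)

    contracted = set()
    for start in exported:
        # Walk downstream and stop at the first exported node on each path.
        stack = list(successors[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            if node in exported:
                contracted.add((start, node))
            else:
                # Non-exported node: look through it to its successors.
                stack.extend(successors[node])
    return contracted


# Hypothetical example: a function (not exported) sits between a table
# and the procedure that calls it.
edges = [
    ("SRC_TABLE", "get_rate"),
    ("get_rate", "load_orders"),
    ("load_orders", "TGT_TABLE"),
]
exported = {"SRC_TABLE", "load_orders", "TGT_TABLE"}
print(sorted(contract_lineage(edges, exported)))
```

In this example the edge through `get_rate` is replaced by a direct edge from `SRC_TABLE` to `load_orders`, while a path that ends in a non-exported node produces no edge at all.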
Entities Exported by Connector Type
- Database connectors (DB2, MS SQL, Netezza, Oracle, PostgreSQL, Teradata)
  - Transformation assets (such as procedures, functions, scripts, triggers) are exported.
    - If there is a sequence of transformations (e.g., a procedure calling a function that calls another function), only the last transformation applied is exported (in this case, the procedure) and the lineage through the functions is contracted without listing the specific intermediate transformation object.
    - As of EDC version 10.4.1, if context lineage export is turned on, all procedures and functions are exported.
    - Please note that for the transformation to be exported, a non-deduced source/target data asset (such as a table, view, or synonym) must exist in the EDC native resource.
  - The lineage between transformation assets and assets provided by EDC (tables, views, synonyms, files) is exported.
  - Standalone insert, delete, and truncate statements are exported. If the detailed lineage export mode is used, this lineage will not be shown when the lineage is visualized from a table. However, it should be available when the lineage diagram is executed from the transformation.
  - Master selects are exported (e.g., select used as a procedure output). If the detailed lineage export mode is used, this lineage will not be shown when the lineage is visualized from a table. However, it should be available when the lineage diagram is executed from the transformation.
- Data integration connectors (DataStage, IPC, ODI, SSIS, Talend)
  - Both the data integration assets and the lineage between them are exported as well as the source/target assets.
- Analytical connectors (SSAS)
  - Both the analytical assets and the lineage between them are exported as well as the source/target assets.
- Reporting connectors (Cognos, Excel, SSRS)
  - Both the reporting assets and the lineage between them are exported as well as the source/target assets.

Entities That Are Not Exported
- Tables, view assets (lineage for views is exported), synonym assets (lineage for synonyms is exported), and files (as they should be available in EDC by native scanners)
- Database transformation assets (such as procedures, functions, scripts, and triggers) without a non-deduced source/target data asset (such as a table, view, or synonym) in the EDC native resource
  - Note that items such as a cursor, BULK COLLECT INTO statement, variable, or pipelined function output parameter are not considered targets that would cause the transformation asset to be exported.
  - For databases, the source asset is optional.
  - Targets are required, with the following exceptions:
    - The transformation is a query, delete, or truncate table (PLSQL Query, PLSQL Delete, or PLSQL TruncateTable for Oracle).
    - Oracle-specific exception: external system cursors (created in external systems such as a Java application) don't require target assets.
- Pseudo-columns and lineage into pseudo-columns; an example of a pseudo-column is RowNumber (as pseudo-columns are not part of the native resources provided by EDC scanner)

- Deduced tables/views, lineage to/from deduced tables (as they are not available in EDC; all tables/views should be scanned from the database and be standard objects)

- Indirect (filter) lineage, i.e., lineage through the WHERE condition and the GROUP BY, HAVING, and JOIN clauses (for EDC versions prior to 10.4.0 EBF-17357, which do not support indirect lineage in Manta terminology)

Exceptions
- (MS SQL) Lineage to volatile and global temporary tables is not exported.
- (Oracle) External system cursors (created in external systems such as a Java application) are exported.
  - Both the cursor itself and the upstream lineage are exported.
  - The cursor is exported only if there is no downstream lineage from it.
- (PostgreSQL) Lineage to volatile and global temporary tables is not exported.
- (Teradata) Lineage to volatile tables is not exported.
- (Teradata) Downstream lineage from tables/views created in scripts is not exported.
Export Model
The diagrams below describe a general domain model of Informatica EDC — the classes and their associations that Automatic Data Lineage uses in EDC. Most of the classes are not used directly. Instead, inherited classes with more specific semantic meanings are used.
For EDC in versions prior to 10.4.1:

As of EDC version 10.4.1:

The diagrams below show the specific classes that Manta uses for export to EDC.
For EDC in versions prior to 10.4.1:

As of EDC version 10.4.1:

Export Modes
At the moment, two modes of export to EDC are available in Automatic Data Lineage.
Standard Export
All the data exported to EDC shares the same context, so all the data assets in EDC that are connected by lineage can ultimately be displayed in one lineage diagram in EDC. This approach has both pros and cons. It is easy to see all connected parts of the environment in one diagram. However, if the diagram is too large, it might take a long time to render, and it can be congested and hard to navigate.
If indirect lineage export is turned on (as of Informatica EDC version 10.4.0 EBF-17357), indirect lineage in Manta terminology (i.e., lineage through the WHERE condition, GROUP BY, HAVING, JOIN clauses) is exported as EDC control flow.
Below is an example of how assets are exported to EDC when standard export is used.

Detailed Lineage Export
If the parts of the environment that are connected are too large to be effectively rendered in EDC, it is better to use detailed lineage export mode. In this case, the lineage between data assets is exported in separate contexts. The lineage diagrams in EDC have two levels.
- Top-level diagram: Only data assets (database assets like tables and views, reports, and analytical tool assets) are visible, and the lineage between them is simplified: aggregation lineage associations are shown.
- Detailed lineage context diagram: This diagram shows the details of a particular aggregation lineage association. Apart from data assets, transformation assets (stored procedures, packages, ETL transformations, etc.) are also visible.
If indirect lineage export is turned on (as of Informatica EDC version 10.4.0 EBF-17357), indirect lineage in Manta terminology (i.e., lineage through the WHERE condition, GROUP BY, HAVING, JOIN clauses) is exported as EDC control flow.
Below is an example of how assets are exported to EDC when detailed lineage export is used.

Context Lineage Export
Context lineage export exports all database transformation objects (not just the last ones) and keeps the lineage for different calls of a transformation separate.
This feature requires EDC version 10.4.1 or newer.
Below is an example of how assets are exported to EDC when context lineage export is used.

Browse Assets
To find an EDC resource containing assets uploaded by Automatic Data Lineage, use the search options on the EDC main screen. At the moment, Automatic Data Lineage creates:
- Two resources in EDC when standard export mode is used. One resource (with the suffix Scripts) contains the transformation assets that Automatic Data Lineage loads into EDC and their parent-child associations. This resource uses the Manta custom metamodel. The second resource (with the suffix Links) contains the lineage associations. This resource uses the Custom Lineage metamodel.
- One resource in EDC when detailed lineage export mode is used. This resource (with the suffix Scripts) contains the transformation assets, their parent-child associations, and the lineage associations.
The data assets loaded by EDC (typically database data assets) are in the EDC native resource (with the same name without any suffix). The data assets loaded by Automatic Data Lineage (typically reports and analytical tool assets) are in the Manta Scripts resource in EDC.
If the export of node source code at input level (manta.iedc.exportInputLevelSourceCode) is turned on, an Expression attribute containing the SQL code is exported for procedures, functions, scripts, etc. that provide input-level code.
- Find the EDC resource.

- Choose the type of objects you want to start browsing (in this case Procedures).

- Find a procedure by name.

- Navigate to the child assets as long as needed.

- Some assets along the way have additional attributes filled in by Manta. Below is a database statement with an Expression attribute containing the SQL code of the statement and a statement column asset with the Expression attribute containing extracted transformation logic for a particular column.


- The following shows an Expression attribute containing the input-level SQL code of an Insert Into statement contained in an insert.sql script.

Lineage Diagrams
Lineage diagrams can be rendered for particular column-level or table-level data assets (viable for both standard export and detailed lineage export) or for leaf-level (transformation column) or operation-level (e.g., statement) transformation assets (viable only for standard export).
- Lineage diagrams are available under the Lineage and Impact tab on the asset overview screen.

- Based on the starting asset, a table-level or column-level lineage diagram is shown. By default, EDC will hide some of the assets that are part of the lineage graph.

- To see all the objects that are part of the diagram, adjust the Lineage (for upstream data lineage) and Impact (for downstream data lineage) sliders. The diagrams below show the full detail for standard export mode (1) and detailed lineage export mode (2).


- [Detailed lineage export mode only:] To see the detailed lineage context entry points, press the Show Transformation Logic in Lineage button.

- [Detailed lineage export mode only:] Then click on the orange circle to enter the detailed lineage context of a particular association.


Transformation Sources and Target View
As of Informatica EDC version 10.2.2hf1, a tabular summary of the source and target assets (tables, views, synonyms, etc.) for a selected transformation is available.
Limitations: This feature only relates to the table level, not to the column level. Only direct (immediate) sources and targets are displayed. Transitive sources and targets are not supported.
- Display the Lineage and Impact of a transformation asset at the table level.
- Click the Open the Tabular Asset Summary icon.

- Click Asset Lineage Summary to see a list of the source assets of the transformation.

- Click Asset Impact Summary to see a list of the target assets of the transformation.

Indirect Lineage View
In this section, "indirect lineage" means indirect lineage in Manta terminology (i.e., lineage through the WHERE condition, GROUP BY, HAVING, JOIN clauses) that is exported to EDC as control flow. Please do not confuse it with "indirect link" in EDC terminology, which means a link with at least one hidden node.
Indirect lineage is supported as of Informatica EDC version 10.4.0 EBF-17357.
Indirect lineage is not displayed in diagrams, but it is possible to view this type of lineage in an asset summary view.
- Display the Lineage and Impact of a starting asset at the table or column level.
- Click the Open the Tabular Asset Summary icon.

- Select the Asset Control Summary tab.

You will see the indirect lineage represented by a table of assets Controlling or Controlled by the starting one, where:
- Controlling means that the asset in the table is a source and the starting asset is a target of the indirect flow.
- Controlled means that the starting asset is a source and the asset in the table is a target of the indirect flow.
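The Controlling and Controlled labels depend only on which end of the indirect flow the starting asset sits on. The following Python sketch restates the two definitions above in code; it does not call any EDC API, and the function name `control_summary` and the column names are hypothetical.

```python
def control_summary(indirect_flows, starting_asset):
    """Label assets relative to the starting asset of an indirect flow.

    indirect_flows -- iterable of (source, target) pairs of indirect lineage
    starting_asset -- the asset whose Asset Control Summary is being viewed
    """
    rows = []
    for source, target in indirect_flows:
        if target == starting_asset:
            # The listed asset is the source of the indirect flow.
            rows.append((source, "Controlling"))
        elif source == starting_asset:
            # The listed asset is the target of the indirect flow.
            rows.append((target, "Controlled"))
    return rows


# Hypothetical flow: ORDERS.STATUS is used in a WHERE condition of a
# statement that writes to REPORT.AMOUNT.
flows = [("ORDERS.STATUS", "REPORT.AMOUNT")]
print(control_summary(flows, "REPORT.AMOUNT"))  # [('ORDERS.STATUS', 'Controlling')]
print(control_summary(flows, "ORDERS.STATUS"))  # [('REPORT.AMOUNT', 'Controlled')]
```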
Context Lineage View
Context lineage is supported as of Informatica EDC version 10.4.1.
- Find and display a contextual transformation (i.e., a transformation that is not the last one before a target asset) as previously described.
- Choose the context whose lineage you want to display and click Lineage and Impact.

You will see detailed lineage within the selected context.

Export Statistics
The export log file contains information about the sizes of the files generated for EDC. These sizes give a rough indication of how many assets have been exported and will be uploaded to EDC.
2020-05-11 20:15:24.588 0 INFO eu.profinit.manta.dataflow.repository.exporter.client.FileLogger Decompressed file exportReport.json with size 166 bytes.
2020-05-11 20:15:24.589 0 INFO eu.profinit.manta.dataflow.repository.exporter.client.FileLogger Decompressed file lineage.csv with size 104980 bytes.
2020-05-11 20:15:24.590 0 INFO eu.profinit.manta.dataflow.repository.exporter.client.FileLogger Decompressed file lineage.set.csv with size 9976 bytes.
2020-05-11 20:15:24.591 0 INFO eu.profinit.manta.dataflow.repository.exporter.client.FileLogger There is a compressed file objects.csv with uncompressed size 382533 bytes.
2020-05-11 20:15:24.592 0 INFO eu.profinit.manta.dataflow.repository.exporter.client.FileLogger There is a compressed file links.csv with uncompressed size 35548 bytes.
2020-05-11 20:15:24.592 0 INFO eu.profinit.manta.dataflow.repository.exporter.client.FileLogger There is a compressed file lineage.csv with uncompressed size 105092 bytes.
2020-05-11 20:15:24.593 0 INFO eu.profinit.manta.dataflow.repository.exporter.client.FileLogger Decompressed file objects.zip with size 17471 bytes.
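The size messages in the excerpt above use two fixed phrasings, so they can be collected with a regular expression. The following Python sketch assumes the log lines look exactly like the ones shown here; the function name `export_file_sizes` is hypothetical and not part of the product.

```python
import re

# Matches both message forms seen in the excerpt:
#   "Decompressed file <name> with size <n> bytes."
#   "There is a compressed file <name> with uncompressed size <n> bytes."
SIZE_LINE = re.compile(
    r"(?P<kind>Decompressed file|There is a compressed file) "
    r"(?P<name>\S+) with (?:uncompressed )?size (?P<size>\d+) bytes\."
)


def export_file_sizes(log_lines):
    """Return (message kind, file name, size in bytes) for each size message."""
    sizes = []
    for line in log_lines:
        match = SIZE_LINE.search(line)
        if match:
            sizes.append((match["kind"], match["name"], int(match["size"])))
    return sizes


sample = [
    "2020-05-11 20:15:24.588 0 INFO eu.profinit.manta.dataflow.repository."
    "exporter.client.FileLogger Decompressed file exportReport.json with size 166 bytes.",
    "2020-05-11 20:15:24.591 0 INFO eu.profinit.manta.dataflow.repository."
    "exporter.client.FileLogger There is a compressed file objects.csv with "
    "uncompressed size 382533 bytes.",
]
for kind, name, size in export_file_sizes(sample):
    print(f"{name}: {size} bytes ({kind})")
```

A list of tuples is returned rather than a dictionary because the same file name can appear more than once in the log, as lineage.csv does in the excerpt.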
The export to EDC generates a report of cases and numbers of objects for which applicable mappings and/or a default mapping were used. The report is saved as a JSON file located in mantaflow\cli\output\{technology}\{outputFolder}\iedc\exportReport.json.
Below is a sample report. It shows, first, the DWH and HR schemas of an ORCL database, which use the default mapping to an IEDC resource, and second, an INFASUPER schema, which uses a user-defined mapping.
{
  "Using the default mapping for assets with no user-defined mapping" : {
    "ORCL/DWH" : 804,
    "ORCL/HR" : 137
  },
  "Used IEDC mappings" : {
    "ORCL_INFASUPER|ORCL|INFASUPER|" : 12
  }
}
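To check quickly how many objects fell back to the default mapping, the report can be totaled per section. The following Python sketch assumes that the two top-level keys are exactly those in the sample above, which may differ between versions; the function name `summarize_export_report` is hypothetical.

```python
import json

# Top-level keys copied from the sample report; assumed to be stable.
DEFAULT_KEY = "Using the default mapping for assets with no user-defined mapping"
MAPPED_KEY = "Used IEDC mappings"


def summarize_export_report(report_text):
    """Total the object counts for default and user-defined mappings."""
    report = json.loads(report_text)
    default_counts = report.get(DEFAULT_KEY, {})
    mapped_counts = report.get(MAPPED_KEY, {})
    return {
        "default_total": sum(default_counts.values()),
        "mapped_total": sum(mapped_counts.values()),
    }


sample_report = """
{
  "Using the default mapping for assets with no user-defined mapping" : {
    "ORCL/DWH" : 804,
    "ORCL/HR" : 137
  },
  "Used IEDC mappings" : {
    "ORCL_INFASUPER|ORCL|INFASUPER|" : 12
  }
}
"""
print(summarize_export_report(sample_report))
# {'default_total': 941, 'mapped_total': 12}
```

In practice the text would be read from the exportReport.json file at the location given above instead of from an inline string.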