Example: Reporting on the metadata in your metadata repository
You can run reports that give you information about the content of your metadata repository.
- Preparing the implementation of a governance solution
- For example, understanding the type of relationships that are used to link terms or finding out whether a category or information governance policy hierarchy contains categories or information governance policies with the same name.
- Assessing compliance with corporate standards
- For example, checking whether all terms have short descriptions or all information governance policies have stewards.
- Understanding the status and progress of your governance program
- For example, determining what percentage of information governance policies have information governance rules that are assigned or what percentage of these information governance rules have data rules that are assigned.
Depending on the use case, the corresponding metrics might be used in a long-term information governance solution or might guide you through the implementation phase of your governance program. For an initial metadata assessment, an analytical query that produces tabular output might be sufficient. For compliance reporting, however, more sophisticated visualization techniques might be appropriate, addressing the needs of a more business-oriented user community.
Facts for metadata reporting are typically compound objects that result from outer joins of metadata views. You can take advantage of these facts by writing queries that count IDs directly on the compound fact or on related dimensions. These counts might be further evaluated by specific metrics.
Finding duplicate names in a hierarchy
Representing categories and information governance policies by their path in the hierarchy solves the problem, but might make reports difficult to use.
You can use the following query to determine whether your category hierarchy contains two or more categories with the same name. The query lists the names of these categories, the fully qualified names, and the RIDs. A SELECT DISTINCT statement and a corresponding GROUP BY clause are used as a simple way to aggregate related facts. In a real usage scenario, facts would typically be aggregated using computations, such as average or maximum.
WITH categoryUniqueName (UNIQUENAME , NAME, RID, SUPERCATEGORYRID, DEPTH) AS
(SELECT c.NAME, c.NAME, c.RID, c.SUPERCATEGORYRID, 0
FROM IGVIEWS.IGBUSINESSCATEGORY c
WHERE c.SUPERCATEGORYRID is NULL
UNION ALL
SELECT c1.NAME concat ':' concat c2.NAME, c2.NAME, c2.RID,
c2.SUPERCATEGORYRID, c1.DEPTH + 1
FROM categoryUniqueName c1, IGVIEWS.IGBUSINESSCATEGORY c2
WHERE c2.SUPERCATEGORYRID is not NULL AND
c1.RID = c2.SUPERCATEGORYRID AND DEPTH < 1000)
SELECT distinct
c1.NAME Category,
c1.UNIQUENAME FullyQualifiedName,
c1.RID RID
FROM categoryUniqueName c1
INNER JOIN IGVIEWS.IGBUSINESSCATEGORY c2
ON c1.NAME = c2.NAME
WHERE NOT c1.SUPERCATEGORYRID = c2.SUPERCATEGORYRID
GROUP BY c1.NAME, c1.UNIQUENAME, c1.RID
ORDER BY c1.NAME
The results from running the query might look like the data in the following table.
Category | FullyQualifiedName | RID |
---|---|---|
Contact Information | Customer Relations:Contact Information | 6662c0f2.ee6a64fe.8gckait1h.jde4ao4.hbg45a.41evkmliqcmat5ct877nu |
Contact Information | Supplier Relations:Contact Information | 6662c0f2.ee6a64fe.8gk48ictn.7len3a4.8gmso7.d5u5ipaphtng108oqdjm4 |
Depending on the number of categories with the same name, one or more of the following solutions are possible:
- Add the query that maps RIDs to unique names to analytical queries that involve names.
- Make names globally unique by changing them in the glossary.
- Create a name translation table that is based on the result of this query that maps non-unique names to unique names. Add the translation table as a bridge entity to your analytical queries.
A similar query can be used to identify information governance policies with the same name.
Determining at which levels stewards are assigned
The topic Example: Adding business context to your operational data describes a particular approach to tackle the problem that stewards might be assigned to different glossary objects. You can use a query like the following query to determine the percentage of information governance policies, information governance rules, and data rules that have stewards assigned to them. The query also returns the overall number of objects at each level to provide context for that percentage.
WITH IGPolicyCount (num) AS (
SELECT count(RID) from IGVIEWS.IGPOLICY),
StewardsForPolicyCount (num) AS (
SELECT count(distinct policy_dim.RID)
FROM IGVIEWS.IGPOLICY policy_dim
LEFT OUTER JOIN CMVIEWS.STEWARDSHIPASSIGNMENT bridge
ON bridge.ASSIGNEDFORCOMMONOBJECTRID = policy_dim.RID
WHERE bridge.ASSIGNSPRINCIPALRID IS NOT NULL),
IGRuleCount (num) AS (
SELECT count(RID) from IGVIEWS.IGBUSINESSRULE),
StewardsForIGRuleCount (num) AS (
SELECT count(distinct igrule_dim.RID)
FROM IGVIEWS.IGBUSINESSRULE igrule_dim
LEFT OUTER JOIN CMVIEWS.STEWARDSHIPASSIGNMENT bridge
ON bridge.ASSIGNEDFORCOMMONOBJECTRID = igrule_dim.RID
WHERE bridge.ASSIGNSPRINCIPALRID IS NOT NULL),
DataRuleCount (num) AS (
SELECT count(RID) from IAVIEWS.IARULE),
StewardsForDataRuleCount (num) AS (
SELECT count(distinct datarule_dim.RID)
FROM IAVIEWS.IARULE datarule_dim
LEFT OUTER JOIN CMVIEWS.STEWARDSHIPASSIGNMENT bridge
ON bridge.ASSIGNEDFORCOMMONOBJECTRID = datarule_dim.COMMONRULERID
WHERE bridge.ASSIGNSPRINCIPALRID IS NOT NULL)
SELECT GlossaryObjectName, OverallNumberOfGlossaryObjects,
PercentageWithStewards FROM (
SELECT
0 Level,
'Policy' GlossaryObjectName,
IGPolicyCount.num OverallNumberOfGlossaryObjects,
CASE
WHEN IGPolicyCount.num = 0 THEN 0
ELSE CAST(100 * StewardsForPolicyCount.num /
IGPolicyCount.num AS INTEGER)
END PercentageWithStewards
FROM IGPolicyCount, StewardsForPolicyCount
UNION
SELECT
1,
'Information Governance Rule',
IGRuleCount.num,
CASE
WHEN IGRuleCount.num = 0 THEN 0
ELSE CAST(100 * StewardsForIGRuleCount.num /
IGRuleCount.num AS INTEGER)
END
FROM IGRuleCount, StewardsForIGRuleCount
UNION
SELECT
2,
'Data Rule',
DataRuleCount.num,
CASE
WHEN DataRuleCount.num = 0 THEN 0
ELSE CAST(100 * StewardsForDataRuleCount.num /
DataRuleCount.num AS INTEGER)
END
FROM DataRuleCount, StewardsForDataRuleCount
)
ORDER BY Level
The following example output shows results for a glossary with 69 policies, 5% of which have a steward that is assigned. Of the 83 information governance rules, 97% have a steward that is assigned, but no stewards are assigned at the data rule level.
GlossaryObject | NumberOfGlossaryObjects | PercentageWithStewards |
---|---|---|
Policy | 69 | 5 |
Information Governance Rule | 83 | 97 |
Data Rule | 275 | 0 |
Investigating the linkage among information governance policies, information governance rules, and data rules
The linkage between information governance policies and information governance rules, and between information governance rules and data rules, is important for any report that puts data quality facts into a business context. You can use the following query to find out the degree to which this linkage is present in your glossary:
WITH
NumPolicies (num) AS (SELECT count(RID) FROM IGVIEWS.IGPOLICY),
NumUnLinkedPolicies (num) AS (
SELECT count(distinct rcp.POLICYRID)
FROM IGVIEWS.IGRULEREFBYCONTAINERPOLICY rcp
WHERE rcp.RULERID IS NULL),
NumFullyLinkedPolicies (num) AS (
SELECT count(distinct rcp.POLICYRID)
FROM IGVIEWS.IGRULEREFBYCONTAINERPOLICY rcp
INNER JOIN IGVIEWS.IGIMPLEMENTEDBYIARULE iar ON rcp.RULERID =
iar.BUSINESSRULERID
WHERE iar.IMPLEMENTATIONRULERID IS NOT NULL
AND (NOT EXISTS (SELECT *
FROM IGVIEWS.IGRULEREFBYCONTAINERPOLICY rcp2
WHERE rcp2.POLICYRID = rcp.POLICYRID AND
rcp2.RULERID IS NULL))
AND (NOT EXISTS (SELECT *
FROM IGVIEWS.IGRULEREFBYCONTAINERPOLICY rcp2
LEFT OUTER JOIN IGVIEWS.IGIMPLEMENTEDBYIARULE iar2 ON
rcp2.RULERID = iar2.BUSINESSRULERID
WHERE rcp2.POLICYRID = rcp.POLICYRID AND
iar2.IMPLEMENTATIONRULERID IS NULL))),
NumIGRules (num) AS (SELECT count(RID) FROM IGVIEWS.IGBUSINESSRULE),
NumUnLinkedIGRules (num) AS (
SELECT count(distinct rcp.RULERID)
FROM IGVIEWS.IGRULEREFBYCONTAINERPOLICY rcp
INNER JOIN IGVIEWS.IGIMPLEMENTEDBYIARULE iar ON rcp.RULERID =
iar.BUSINESSRULERID
WHERE rcp.POLICYRID IS NULL AND iar.IMPLEMENTATIONRULERID IS NULL
),
NumFullyLinkedIGRules (num) AS (
SELECT count(distinct rcp.RULERID)
FROM IGVIEWS.IGRULEREFBYCONTAINERPOLICY rcp
INNER JOIN IGVIEWS.IGIMPLEMENTEDBYIARULE iar ON rcp.RULERID =
iar.BUSINESSRULERID
WHERE (NOT EXISTS (SELECT *
FROM IGVIEWS.IGRULEREFBYCONTAINERPOLICY rcp2
WHERE rcp2.RULERID = rcp.RULERID AND rcp2.POLICYRID
IS NULL))
AND (NOT EXISTS (SELECT *
FROM IGVIEWS.IGIMPLEMENTEDBYIARULE iar2
WHERE iar2.BUSINESSRULERID = iar.BUSINESSRULERID AND
iar2.IMPLEMENTATIONRULERID IS NULL))),
NumDataRules (num) AS (SELECT count(RID) FROM IAVIEWS.IARULE),
NumUnLinkedDataRules (num) AS (
SELECT count(distinct iar.IMPLEMENTATIONRULERID)
FROM IGVIEWS.IGIMPLEMENTEDBYIARULE iar
WHERE iar.BUSINESSRULERID IS NULL),
NumFullyLinkedDataRules (num) AS (
SELECT count(distinct iar.IMPLEMENTATIONRULERID)
FROM IGVIEWS.IGIMPLEMENTEDBYIARULE iar
INNER JOIN IGVIEWS.IGRULEREFBYCONTAINERPOLICY rcp ON rcp.RULERID =
iar.BUSINESSRULERID
WHERE (NOT EXISTS (SELECT *
FROM IGVIEWS.IGIMPLEMENTEDBYIARULE iar2
WHERE iar2.IMPLEMENTATIONRULERID =
iar.IMPLEMENTATIONRULERID AND
iar2.BUSINESSRULERID IS NULL))
AND (NOT EXISTS (SELECT *
FROM IGVIEWS.IGIMPLEMENTEDBYIARULE iar2
LEFT OUTER JOIN
IGVIEWS.IGRULEREFBYCONTAINERPOLICY rcp2
ON rcp2.RULERID = iar2.BUSINESSRULERID
WHERE iar2.IMPLEMENTATIONRULERID IS NULL)))
SELECT
GlossaryObject, NumOverall, NumFullyLinked, NumPartiallyLinked, NumUnlinked
FROM (
SELECT
0 Level,
'Policies' GlossaryObject,
NumPolicies.num NumOverall,
NumFullyLinkedPolicies.num NumFullyLinked,
NumPolicies.num - NumFullyLinkedPolicies.num NumPartiallyLinked,
NumUnLinkedPolicies.num NumUnLinked
FROM
NumPolicies, NumFullyLinkedPolicies, NumUnlinkedPolicies
UNION SELECT
1,
'Information Governance Rules',
NumIGRules.num,
NumFullyLinkedIGRules.num,
NumIGRules.num - NumFullyLinkedIGRules.num,
NumUnLinkedIGRules.num
FROM
NumIGRules, NumFullyLinkedIGRules, NumUnLinkedIGRules
UNION SELECT
2,
'Data Rules',
NumDataRules.num,
NumFullyLinkedDataRules.num,
NumDataRules.num - NumFullyLinkedDataRules.num,
NumUnLinkedDataRules.num
FROM
NumDataRules, NumFullyLinkedDataRules, NumUnLinkedDataRules
)
ORDER BY LEVEL
The query investigates what percentage of information governance policies, information governance rules, and data rules is linked by any of the following relationships:
- Relationship between policy and information governance rules (referred to as R1 in the next list)
- Relationship between information governance rule and data rule ('implements'; referred to as R2 in the next list)
For information governance policies, information governance rules, and data rules, the query returns the following information:
- Overall number of entries
- Number of entries that are fully linked (R1 and R2)
- Number of entries that are partially linked (R1 or R2, but not both)
- Number of unlinked entries (neither R1 nor R2)
The notion of 'full linkage' requires all links from an object to anchor at a policy on one end and at a data rule on the other. In other words, a policy is fully linked if it has information governance rules that are assigned and if each of these information governance rules is implemented by at least one data rule. An information governance rule is fully linked if it is assigned to a policy and implemented by at least one data rule. A data rule is fully linked if it implements information governance rules each of which is assigned to a policy.
GlossaryObject | NumOverall | NumFullyLinked | NumberPartiallyLinked | NumUnlinked |
---|---|---|---|---|
Information governance policies | 69 | 5 | 1 | 63 |
Information governance rules | 43 | 9 | 1 | 33 |
Data rules | 28 | 23 | 0 | 5 |
A similar query might be used to determine the degree to which terms are linked with assets and the type of asset to which the linkage applies.
Reporting on the use of descriptions
A standard for entries in the catalog might require each entry to have a description. The following analytical query returns the overall percentage of entries that have a short description for all entries that are supported by the SQL views:
WITH
Hosts(num) AS (SELECT count(RID) FROM cmviews.PDRHostSystem),
DBCols(num) AS (SELECT count(RID) FROM cmviews.PDRDatabaseColumn),
DBConns(num) AS (SELECT count(RID) FROM cmviews.PDRDatabaseConnection),
DBs(num) AS (SELECT count(RID) FROM cmviews.PDRDatabase),
DBTabs(num) AS (SELECT count(RID) FROM cmviews.PDRDatabaseTable),
DBSchemas(num) AS (SELECT count(RID) FROM cmviews.PDRDatabaseSchema),
DBSProcs(num) AS (SELECT count(RID) FROM cmviews.PDRStoredProcedure),
DBViews(num) AS (SELECT count(RID) FROM cmviews.PDRDatabaseView),
IGRules(num) AS (SELECT count(RID) FROM igviews.IGBusinessRule),
IGLabels(num) AS (SELECT count(RID) FROM igviews.IGLabel),
IGExtAssts(num) AS (SELECT count(RID) FROM igviews.IGExternalAssetReference),
IGTerms(num) AS (SELECT count(RID) FROM igviews.IGBusinessTerm),
IGPolicies(num) AS (SELECT count(RID) FROM igviews.IGPolicy),
IGCats(num) AS (SELECT count(RID) FROM igviews.IGBusinessCategory),
IAMetrics(num) AS (SELECT count(RID) FROM iaviews.IAMetric),
IAProjects(num) AS (SELECT count(RID) FROM iaviews.IAProject),
IARSets(num) AS (SELECT count(RID) FROM iaviews.IARuleSetDefinition),
IAPubRules(num) AS (SELECT count(RID) FROM iaviews.IAPublicRule),
IARules(num) AS (SELECT count(RID) FROM iaviews.IARule),
IARuleDefs(num) AS (SELECT count(RID) FROM iaviews.IARuleDefinition),
HostsWithDescr (num) AS (SELECT count(RID)
FROM cmviews.PDRHostSystem
WHERE shortDescription IS NOT NULL),
DBColsWithDescr (num) AS (SELECT count(RID)
FROM cmviews.PDRDatabaseColumn
WHERE shortDescription IS NOT NULL),
DBConnsWithDescr (num) AS (SELECT count(RID)
FROM cmviews.PDRDatabaseConnection
WHERE shortDescription IS NOT NULL),
DBsWithDescr (num) AS (SELECT count(RID)
FROM cmviews.PDRDatabase
WHERE shortDescription IS NOT NULL),
DBTabsWithDescr (num) AS (SELECT count(RID)
FROM cmviews.PDRDatabaseTable
WHERE shortDescription IS NOT NULL),
DBSchemasWithDescr (num) AS (SELECT count(RID)
FROM cmviews.PDRDatabaseSchema
WHERE shortDescription IS NOT NULL),
DBSProcsWithDescr (num) AS (SELECT count(RID)
FROM cmviews.PDRStoredProcedure
WHERE shortDescription IS NOT NULL),
DBViewsWithDescr (num) AS (SELECT count(RID)
FROM cmviews.PDRDatabaseView
WHERE shortDescription IS NOT NULL),
IGRulesWithDescr (num) AS (SELECT count(RID)
FROM igviews.IGBusinessRule
WHERE shortDescription IS NOT NULL),
IGLabelsWithDescr (num) AS (SELECT count(RID)
FROM igviews.IGLabel
WHERE shortDescription IS NOT NULL),
IGExtAsstsWithDescr (num) AS (SELECT count(RID)
FROM igviews.IGExternalAssetReference
WHERE shortDescription IS NOT NULL),
IGTermsWithDescr (num) AS (SELECT count(RID)
FROM igviews.IGBusinessTerm
WHERE shortDescription IS NOT NULL),
IGPoliciesWithDescr (num) AS (SELECT count(RID)
FROM igviews.IGPolicy
WHERE shortDescription IS NOT NULL),
IGCatsWithDescr (num) AS (SELECT count(RID)
FROM igviews.IGBusinessCategory
WHERE shortDescription IS NOT NULL),
IAMetricsWithDescr (num) AS (SELECT count(RID)
FROM iaviews.IAMetric
WHERE shortDescription IS NOT NULL),
IAProjectsWithDescr (num) AS (SELECT count(RID)
FROM iaviews.IAProject
WHERE shortDescription IS NOT NULL),
IARSetsWithDescr (num) AS (SELECT count(RID)
FROM iaviews.IARuleSetDefinition
WHERE shortDescription IS NOT NULL),
IAPubRulesWithDescr (num) AS (SELECT count(RID)
FROM iaviews.IAPublicRule
WHERE shortDescription IS NOT NULL),
IARulesWithDescr (num) AS (SELECT count(RID)
FROM iaviews.IARule
WHERE shortDescription IS NOT NULL),
IARuleDefsWithDescr (num) AS (SELECT count(RID)
FROM iaviews.IARuleDefinition
WHERE shortDescription IS NOT NULL)
SELECT AVG(Percentage) PercentageGlossaryShortDescription FROM (
SELECT CASE WHEN Hosts.num = 0 THEN 0
ELSE CAST(100 * HostsWithDescr.num /
Hosts.num AS INTEGER) END Percentage
FROM Hosts, HostsWithDescr
UNION SELECT CASE WHEN DBCols.num = 0 THEN 0
ELSE CAST(100 * DBColsWithDescr.num /
DBCols.num AS INTEGER) END Percentage
FROM DBCols, DBColsWithDescr
UNION SELECT CASE WHEN DBConns.num = 0 THEN 0
ELSE CAST(100 * DBConnsWithDescr.num /
DBConns.num AS INTEGER) END Percentage
FROM DBConns, DBConnsWithDescr
UNION SELECT CASE WHEN DBs.num = 0 THEN 0
ELSE CAST(100 * DBsWithDescr.num /
DBs.num AS INTEGER) END Percentage
FROM DBs, DBsWithDescr
UNION SELECT CASE WHEN DBTabs.num = 0 THEN 0
ELSE CAST(100 * DBTabsWithDescr.num /
DBTabs.num AS INTEGER) END Percentage
FROM DBTabs, DBTabsWithDescr
UNION SELECT CASE WHEN DBSchemas.num = 0 THEN 0
ELSE CAST(100 * DBSchemasWithDescr.num /
DBSchemas.num AS INTEGER) END Percentage
FROM DBSchemas, DBSchemasWithDescr
UNION SELECT CASE WHEN DBSProcs.num = 0 THEN 0
ELSE CAST(100 * DBSProcsWithDescr.num /
DBSProcs.num AS INTEGER) END Percentage
FROM DBSProcs, DBSProcsWithDescr
UNION SELECT CASE WHEN DBViews.num = 0 THEN 0
ELSE CAST(100 * DBViewsWithDescr.num /
DBViews.num AS INTEGER) END Percentage
FROM DBViews, DBViewsWithDescr
UNION SELECT CASE WHEN IGRules.num = 0 THEN 0
ELSE CAST(100 * IGRulesWithDescr.num /
IGRules.num AS INTEGER) END Percentage
FROM IGRules, IGRulesWithDescr
UNION SELECT CASE WHEN IGLabels.num = 0 THEN 0
ELSE CAST(100 * IGLabelsWithDescr.num /
IGLabels.num AS INTEGER) END Percentage
FROM IGLabels, IGLabelsWithDescr
UNION SELECT CASE WHEN IGExtAssts.num = 0 THEN 0
ELSE CAST(100 * IGExtAsstsWithDescr.num /
IGExtAssts.num AS INTEGER) END Percentage
FROM IGExtAssts, IGExtAsstsWithDescr
UNION SELECT CASE WHEN IGTerms.num = 0 THEN 0
ELSE CAST(100 * IGTermsWithDescr.num /
IGTerms.num AS INTEGER) END Percentage
FROM IGTerms, IGTermsWithDescr
UNION SELECT CASE WHEN IGPolicies.num = 0 THEN 0
ELSE CAST(100 * IGPoliciesWithDescr.num /
IGPolicies.num AS INTEGER) END Percentage
FROM IGPolicies, IGPoliciesWithDescr
UNION SELECT CASE WHEN IGCats.num = 0 THEN 0
ELSE CAST(100 * IGCatsWithDescr.num /
IGCats.num AS INTEGER) END Percentage
FROM IGCats, IGCatsWithDescr
UNION SELECT CASE WHEN IAMetrics.num = 0 THEN 0
ELSE CAST(100 * IAMetricsWithDescr.num /
IAMetrics.num AS INTEGER) END Percentage
FROM IAMetrics, IAMetricsWithDescr
UNION SELECT CASE WHEN IAProjects.num = 0 THEN 0
ELSE CAST(100 * IAProjectsWithDescr.num /
IAProjects.num AS INTEGER) END Percentage
FROM IAProjects, IAProjectsWithDescr
UNION SELECT CASE WHEN IARSets.num = 0 THEN 0
ELSE CAST(100 * IARSetsWithDescr.num /
IARSets.num AS INTEGER) END Percentage
FROM IARSets, IARSetsWithDescr
UNION SELECT CASE WHEN IAPubRules.num = 0 THEN 0
ELSE CAST(100 * IAPubRulesWithDescr.num /
IAPubRules.num AS INTEGER) END Percentage
FROM IAPubRules, IAPubRulesWithDescr
UNION SELECT CASE WHEN IARules.num = 0 THEN 0
ELSE CAST(100 * IARulesWithDescr.num /
IARules.num AS INTEGER) END Percentage
FROM IARules, IARulesWithDescr
UNION SELECT CASE WHEN IARuleDefs.num = 0 THEN 0
ELSE CAST(100 * IARuleDefsWithDescr.num /
IARuleDefs.num AS INTEGER) END Percentage
FROM IARuleDefs, IARuleDefsWithDescr
)
The results of the query might look like the data in the following table.
PercentageGlossaryShortDescription |
---|
87 |
A more elaborate version of this query might also check the length of the description.