Data protection with data source definitions

Use data source definitions to monitor where your data is stored and to select a data protection solution for the data assets in a data source.

A data source definition is an asset that functions as a unique stable identifier for the location of a data source such as a relational database. Data source definitions use endpoints to identify the data source. For most data source types, an endpoint is the combination of the hostname or IP address, the port number, and the database name or instance identifier. In a multinode environment, a single data source definition is defined by a list of endpoints.
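The relationship between endpoints and a data source definition can be pictured as a small data model. The following Python sketch is illustrative only: the class names, field names, and the lowercase hostname comparison are assumptions made for this example, not the product's API or its actual matching rules.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Endpoint:
    """One network location of a data source (hypothetical model)."""
    host: str       # hostname or IP address
    port: int       # port number
    database: str   # database name or instance identifier

    def normalized(self) -> "Endpoint":
        # Assumption for this sketch: compare hostnames in lowercase.
        return Endpoint(self.host.lower(), self.port, self.database)


@dataclass
class DataSourceDefinition:
    """A stable identifier for one data source, described by its endpoints."""
    name: str
    data_source_type: str
    endpoints: List[Endpoint] = field(default_factory=list)

    def matches(self, endpoint: Endpoint) -> bool:
        # True when the given endpoint is one of the listed endpoints.
        wanted = endpoint.normalized()
        return any(e.normalized() == wanted for e in self.endpoints)


# A multinode data source: one definition, a list of endpoints.
sales_db = DataSourceDefinition(
    name="sales-db",
    data_source_type="db2",
    endpoints=[
        Endpoint("node1.example.com", 50000, "SALES"),
        Endpoint("node2.example.com", 50001, "SALES"),
    ],
)
```

In this model, a connection made through either node resolves to the same definition, which is what allows all of the data source's assets to be treated the same way.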

Data source definitions provide these benefits:

  • Identify multinode data sources. For example, in a multinode environment, users can connect to the same database through different hostnames and port numbers. You can create one data source definition that lists all of those endpoints so that all of the data source’s assets are treated the same way.

  • Eliminate the ambiguity that can result from multiple enforcement methods. For example, if different enforcement methods each mask the same numeric column with a different number, the results can be unpredictable. With a data source definition, you specify a single protection solution that is based on the data source type.

  • Group and manage multiple connections that point to the same data source. For example, if you have five different connections in different projects and catalogs that point to the same data source, you can view all of them in one location.

  • Identify the underlying data source for ambiguous connection types. For example, when different kinds of Db2 data sources share the same endpoints, or when you use a Generic JDBC connection, the data source definition can specify the data source type.

  • Apply the correct protection solution (enforcement engine) based on the data source definition.

Note:

Data source definitions do not support all data source types. For more information, see Connectors that support data source definitions.

You can create a data source definition from scratch. Alternatively, you can review the connections on the platform and create data source definitions for the connections that do not have one assigned.

Data source definitions are automatically assigned to all the connections and connected data assets in the account that match a data source definition’s endpoints. Connections include:

  • Connections in catalogs, including platform connections in the Platform assets catalog.
  • Connections created in a project, including connections that reference a platform connection with the From platform selection.
  • Connections in deployment spaces.

A data source definition is also assigned to the connected data assets of each matching connection, based on the same endpoints. If a connection already has an assigned data source definition, that definition is automatically assigned to any connected data asset that a user creates from the connection. This assignment ensures that the correct protection solution (enforcement engine) is used for policy evaluation.
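The assignment behavior described above can be sketched as a simple endpoint lookup. This Python example is a minimal illustration under stated assumptions: the dictionary layout, the names `find_definition` and `create_connected_asset`, and the sample connections are all hypothetical and do not reflect the platform's actual implementation or API.

```python
from typing import Dict, List, Optional, Tuple

# In this sketch, an endpoint is a (host, port, database) tuple.
Endpoint = Tuple[str, int, str]

# Hypothetical definitions: definition name -> list of endpoints.
definitions: Dict[str, List[Endpoint]] = {
    "sales-db": [
        ("node1.example.com", 50000, "SALES"),
        ("node2.example.com", 50001, "SALES"),
    ],
}


def find_definition(endpoint: Endpoint) -> Optional[str]:
    """Return the name of the definition that lists this endpoint, if any."""
    for name, endpoints in definitions.items():
        if endpoint in endpoints:
            return name
    return None


# Hypothetical connections in a catalog, a project, and a deployment space.
connections = [
    {"name": "catalog-conn", "container": "catalog",
     "endpoint": ("node1.example.com", 50000, "SALES")},
    {"name": "project-conn", "container": "project",
     "endpoint": ("node2.example.com", 50001, "SALES")},
    {"name": "space-conn", "container": "deployment space",
     "endpoint": ("other.example.com", 5432, "HR")},
]

# Assign a definition to every connection whose endpoint matches.
for conn in connections:
    conn["definition"] = find_definition(conn["endpoint"])


def create_connected_asset(asset_name: str, connection: dict) -> dict:
    """A new connected data asset takes the definition of its connection."""
    return {
        "name": asset_name,
        "connection": connection["name"],
        "definition": connection["definition"],
    }


orders = create_connected_asset("ORDERS", connections[1])

# Connections without a definition are candidates for a new one.
unassigned = [c["name"] for c in connections if c["definition"] is None]
```

Here the catalog and project connections both resolve to `sales-db` even though they use different nodes, the `ORDERS` asset inherits that definition from its connection, and `space-conn` remains unassigned until a definition that lists its endpoint is created.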
