CDC Replication security primer

You can operate CDC Replication in a secured environment by following these security standards and considerations. The security measures that you use should be based on your business requirements.

Work with your network administration team to evaluate and implement the appropriate method to connect with CDC Replication. Enabling encryption is outside of the scope of IBM® support for data replication.

Security is an important feature of an information system that protects its components from unauthorized access. The components include system and user data, installed software and hardware, and the networks that these components use.

Network security measures are needed to protect data during transmission (data in transit) and to guarantee that data transmissions are authentic.

Security measures are also required to prevent unauthorized access to the systems where applications and their associated data reside (data at rest).

CDC Replication can be configured to provide TLS encryption for data in transit but can also be configured without network encryption. When CDC Replication is not configured to provide TLS encryption, various credentials including those for CDC Replication and database users are transmitted without strong encryption methods.

CDC Replication provides no in-product security for data at rest. Various credentials including those for CDC Replication and database users are stored without strong encryption methods.

CDC Replication is typically deployed in highly secured data centers and relies on standard best practices that are appropriate to such realms for security.

Data in transit (network) security

Firewall-based security

Firewalls protect a network of computers from being compromised or subjected to denial of service and other attacks from outside. A firewall can be hardware, software, or a combination of both. A firewall must be connected to at least two network interfaces: one for the network that is to be protected (the internal network) and one for the network that is exposed to attacks (generally the Internet). A firewall can also be considered a gateway that is deployed between the two networks.

Four types of firewalls are generally used:

Packet filters: Control network access by monitoring outgoing and incoming packets and allowing them to pass or halt based on the source and destination IP addresses, protocols, and ports.
Stateful inspection: Monitors the state of active connections and uses this information to determine which network packets to allow through the firewall.
Proxy: Protects network resources by filtering messages at the application layer. A proxy firewall is also referred to as an application firewall or gateway firewall.
Hybrid: Combines the elements of other types of firewalls (packet filters, proxy gateways, etc.) to effectively protect systems.
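To illustrate the packet filter type described above, the following minimal Python sketch matches a packet against allow rules by source address, destination address, protocol, and port, and denies everything else. The `Rule` class, the `is_allowed` function, the addresses, and the port number are hypothetical placeholders chosen for this illustration; they are not part of CDC Replication or of any firewall product.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    """One allow rule: source network, destination network, protocol, port."""
    src: str
    dst: str
    protocol: str
    port: int


def is_allowed(rules, src_ip, dst_ip, protocol, port):
    """Return True if any rule matches the packet; otherwise deny by default."""
    for rule in rules:
        if (
            ip_address(src_ip) in ip_network(rule.src)
            and ip_address(dst_ip) in ip_network(rule.dst)
            and protocol == rule.protocol
            and port == rule.port
        ):
            return True
    return False


# Hypothetical rule: hosts in the source subnet may reach one target host
# on a single TCP port. The addresses and port are placeholders.
RULES = [Rule(src="10.0.1.0/24", dst="10.0.2.15/32", protocol="tcp", port=11111)]

print(is_allowed(RULES, "10.0.1.20", "10.0.2.15", "tcp", 11111))  # True
print(is_allowed(RULES, "192.0.2.7", "10.0.2.15", "tcp", 11111))  # False
```

A stateful firewall would additionally track which connections are already established; this sketch evaluates each packet in isolation.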

You can configure CDC Replication to use any of the above firewalls. Additionally, CDC Replication allows you to enable access controls at the port level. For more information, see https://ibm.ent.box.com/s/l96pfwu4bxvxpfyelymszcyik83hnpo8.
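After firewall rules are in place, you might want to confirm that the ports you opened are reachable from the peer system and that other ports are not. The following sketch is one generic way to do that with the Python standard library. The function name `port_is_reachable` is hypothetical, and the host and port that you pass in depend entirely on your own deployment.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, calling `port_is_reachable("target.example.com", 11111)` from the source system, where both values are placeholders, returns False if a firewall drops or rejects the connection.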

For more information about configuring firewalls for CDC Replication for IBM i, see https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_71/rzajb/rzajbrzajb0ippacketsecuritysd.htm.

For CDC for Db2® for z/OS® and Classic CDC for z/OS, you can enable firewalls by using IP filters. You can find more details on IP filtering and setup at https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.1.0/com.ibm.zos.v2r1.halz002/security_tcpip_resrcs_ip_filtering.htm.

Encryption-based security

Network encryption applies cryptographic services at the network transfer layer (above the data link level, but below the application level) to data and messages that are transmitted over a network. The encryption process is invisible to applications and end users and operates independently of any other encryption processes in use. Data is encrypted only while in transit and exists as plaintext on the originating and receiving hosts.

Network layer security can also be accomplished by using popular frameworks such as IPSec (Internet Protocol Security), which most VPNs use.

You can use the following methods to encrypt data that is being transmitted over a network:

TLS encryption

You can enable TLS encryption of communication between the source and target. This type of encryption requires a private keystore and a trust store. You can use self-signed certificates, a central self-signed certificate authority, or a public certificate authority to create these stores within CDC Replication. CDC Replication supports TLS 1.2. You can also use third-party tools such as openssl. For more information, see the "Creating a private keystore and a trust store for encryption" and "Managing encryption profiles" topics in the documentation for your CDC Replication engine. See also Enabling TLS encryption for communication between clients and Access Server and Enabling TLS encryption for communication between Access Server and datastores.
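To show what a trust store and a minimum protocol version mean in practice, the following sketch builds a generic TLS client context with the Python standard library. It is not CDC Replication configuration and does not use CDC Replication keystore files; the function name `build_client_context` and the `trust_store_path` parameter are hypothetical, and the path is assumed to point to a PEM file of CA certificates.

```python
import ssl


def build_client_context(trust_store_path=None):
    """Create a TLS client context that refuses anything older than TLS 1.2."""
    # PROTOCOL_TLS_CLIENT enables certificate and host name verification.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if trust_store_path is None:
        # Fall back to the CA certificates that the operating system trusts.
        context.load_default_certs()
    else:
        # Trust only the CA certificates in the given PEM file.
        context.load_verify_locations(cafile=trust_store_path)
    return context


context = build_client_context()
print(context.minimum_version.name)
print(context.verify_mode == ssl.CERT_REQUIRED)
```

With a self-signed certificate or a private certificate authority, the peer's certificate chain is accepted only if the corresponding CA certificate is in the trust store that you load.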

SSH tunnelling (port forwarding)

You can tunnel a TCP/IP session over SSH. The application data traffic is directed to flow inside an encrypted SSH connection so that it cannot be eavesdropped or intercepted while it is in transit. SSH tunnelling enables adding network security to legacy applications that do not natively support encryption.

To configure an SSH tunnel, see https://www.ssh.com/ssh/tunneling/#sec-How-to-configure-an-SSH-tunnel. You can view an example SSH tunnel setup for CDC Replication at https://ibm.ent.box.com/s/mbehhvialwodry94zqbhp1mendabnq3s.
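As a rough illustration of local port forwarding, the following sketch assembles an OpenSSH command that listens on a local port and forwards the traffic through an SSH gateway to a target host and port. The helper name `ssh_forward_command`, the host names, the user name, and the port numbers are all hypothetical placeholders; only the `ssh` options `-N` and `-L` are standard OpenSSH options.

```python
import shlex


def ssh_forward_command(local_port, target_host, target_port, gateway, user):
    """Build an OpenSSH local port-forwarding command as an argument list."""
    return [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local_port}:{target_host}:{target_port}",
        f"{user}@{gateway}",
    ]


# Placeholder values; substitute the hosts and ports of your own deployment.
command = ssh_forward_command(
    11111, "cdc-target.internal", 11111, "gateway.example.com", "cdcuser"
)
print(shlex.join(command))
```

With such a tunnel running, the application connects to the local port instead of connecting to the remote host directly, and the traffic between the local system and the gateway travels inside the SSH connection.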

SSL tunnelling

This method encapsulates application traffic inside SSL/TLS by using the OpenSSL library. Several software products provide tunnelling that is based on the SSL/TLS protocols, such as Stunnel, HAProxy, Nginx, and AT-TLS. Although these products serve different use cases, this document focuses mainly on Stunnel to encrypt CDC Replication connections. Customers often deploy more than one proxy (for example, AT-TLS to Stunnel) given the distributed nature of replication across platforms.
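As a rough sketch of the idea, a client-mode Stunnel service accepts plaintext connections on a local address and forwards them over TLS to a remote address. The following Python helper renders such a service section as text. The helper name `render_stunnel_client`, the service name, the addresses, and the file path are hypothetical; the option names (`client`, `accept`, `connect`, `CAfile`, `verifyChain`) are assumed to match the Stunnel version that you install, so check them against its documentation.

```python
def render_stunnel_client(service, accept, connect, ca_file):
    """Render a client-mode stunnel service section as configuration text."""
    lines = [
        f"[{service}]",
        "client = yes",           # this end initiates the TLS connection
        f"accept = {accept}",     # local plaintext listener
        f"connect = {connect}",   # remote TLS endpoint
        f"CAfile = {ca_file}",    # CA certificates used to verify the peer
        "verifyChain = yes",
    ]
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
print(render_stunnel_client(
    "cdc-tunnel", "127.0.0.1:11111", "target.example.com:11112", "/etc/stunnel/ca.pem"
))
```

The remote system would run a matching server-mode service that terminates TLS and forwards the plaintext traffic to the local replication port.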

Stunnel creates an SSL tunnel that can carry almost any traffic. It is free software that is distributed under the GNU GPL version 2 or later with an OpenSSL exception. Stunnel is not a community project. It is FIPS-140-2 compliant. You can find additional information on Stunnel and installation steps at the following links:

For CDC Replication for IBM i, many server applications are enabled for SSL/TLS. For external applications that do not have built-in SSL/TLS, you can use tools such as stunnel (only IBM i access for Linux®) that could be installed and configured to encrypt network traffic. More details and example SSL/TLS setup can be found at these links:

For CDC for Db2 for z/OS and Classic CDC for z/OS, z/OS has an encryption solution called Application Transparent TLS (AT-TLS) that you can leverage to secure TCP/IP connections. Many applications can run without even being aware that the connection is using TLS. Remote clients cannot distinguish between "normal" TLS (where the application is doing the socket calls that are needed for TLS) and AT-TLS (where the TCP layer handles the connection).

More details on AT-TLS and setup instructions are available at these links:

Virtual Private Network (VPN)

VPN creates a secure private connection, essentially by creating a private “tunnel” over a public network. It allows you to create a secure end-to-end path between any combination of host and gateway. VPN uses authentication methods, encryption algorithms, and other precautions to ensure that data sent between the two endpoints of its connection remains secure.

VPN includes different security protocols, such as IPSec, TLS/SSL, and SSTP, that operate at different levels to secure data networks. VPN uses a combination of dedicated connections and encryption protocols to generate virtual point-to-point connections. Standard and dedicated VPNs are FIPS-140-2 compliant.

You can find examples to set up a VPN at these links:

Data at rest (disk) security

Disk encryption

Linux, UNIX, and Windows

Hardware or, in rare circumstances, software disk encryption is typically employed to secure data at rest on Linux, UNIX, and Windows systems. Whole disk encryption protects not only the CDC Replication data at rest – transaction queues, staging store, trace files, configuration information, fast load files, flat files for DataStage® and related targets, and potentially some LOB data – but also the database logs that CDC Replication reads from to replicate transactions.

The event logs and trace files that are produced by CDC Replication on both the source and target could potentially contain sensitive data. As such, you can also secure these in the CDC instance directory. As mentioned previously, access to commands should be restricted. The dmshowevents and dmsupportinfo commands both provide access to the event logs.
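One way to check that files in a directory are restricted to their owner is to inspect the permission bits. The following sketch, which assumes a POSIX file system such as Linux or UNIX, lists files that grant any access to the group or to other users. The function name `find_exposed_files` is hypothetical, and the directory that you pass in is whatever location holds your instance data.

```python
import os
import stat


def find_exposed_files(directory):
    """Return paths under directory that group or other users can access."""
    exposed = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            mode = os.stat(path).st_mode
            # Any read, write, or execute bit for group or others counts.
            if mode & (stat.S_IRWXG | stat.S_IRWXO):
                exposed.append(path)
    return sorted(exposed)
```

An empty result means that only the owning user, and administrators, can read the files through normal file system access. The check does not cover access control lists or directory permissions.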

Tools like IBM Guardium® (https://www.ibm.com/security/data-security/guardium) secure the data from databases to big data, cloud, file systems, and more with GDPR compliance.

z/OS

For the CDC for z/OS source engine, data is staged in a VSAM data set if you are using a log cache. You need to take appropriate measures to secure all the VSAM data sets.

During a refresh that uses the Db2® LOAD utility, CDC for z/OS on the target system writes the table data to disk data sets. If a refresh that uses the Db2 LOAD utility fails, these data sets remain. Take appropriate measures to secure these data sets.

The spooled output that is produced by the source and target CDC for z/OS engines could also contain sensitive information and should be secured.

Most of the CDC metadata is maintained in Db2 for z/OS, which also should be secured.

All of these data sets should be secured through a security mechanism such as hardware encryption deployed at the data set level. z/OS has a feature called "pervasive encryption" that can provide low-cost protected access to all the data sets.

IBM i

For the CDC Replication for IBM i source engine, data is staged for building the transactions in a *USRQ. As such, you need to take appropriate measures to secure the *USRQs in the product library. The event logs and trace files that CDC Replication produces on both the source and target could potentially contain sensitive data such as information on users who log in and table names. The events are contained in an *MSGQ in the product library and should be secured. When traces are produced, they are placed in IFS under /home/IBM/CDC/Trace, which should be secured.

Access to all the above data should be controlled through standard OS security controls in conjunction with database encryption.

Authentication/authorization-based security

LDAP

Many organizations manage their user credentials, security policies and access rights in a central repository by implementing a Lightweight Directory Access Protocol (LDAP) compliant directory service such as IBM Tivoli®, Microsoft Active Directory, and Apache Directory Services.

Organizations also prefer business software to leverage these directory services rather than using decentralized, individually managed user credentials, security policies, or access rights that could potentially be created for each piece of software deployed.

To help cater to the security needs of digital businesses, CDC Replication supports integration with LDAP directory services. Traditionally, the Access Server authenticates users, stores user credentials and data access information, and acts as the centralized communicator between all replication agents and Management Console clients.

You can now choose to have an LDAP server manage your CDC Replication user credentials, user authentication, and data store access information to help you conform to LDAP-based centralized security architecture.

For more information and how to configure LDAP with CDC, see https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.cdcdoc.installingasandmc.doc/concepts/ldapovu.html.

Kerberos

Kerberos is a protocol for authenticating service requests between trusted hosts across an untrusted network, such as the internet. Kerberos authentication uses conventional shared secret cryptography to prevent packets that travel across the network from being read or changed and to protect messages from eavesdropping and replay attacks. Kerberos is built into all major operating systems.

You can configure the CDC Replication Engine for Kafka to use Kerberos to communicate with a Kafka cluster that expects Kerberos authentication.

Operational considerations

System controls: Limit access to the operating system user under which CDC Replication runs to as few people as possible. On Linux, UNIX, and Windows, CDC Replication stores metadata such as event messages and database connection information, as well as potential customer data in staging stores and transaction queues. For this reason, file system level access should also be limited to as few people as possible.
Password change management: If the passwords for users and databases that are associated with the CDC Replication engine on the source systems (z/OS and Linux, UNIX, and Windows) are changed, then the details must be changed in the engine before the next time the engine is restarted. Similarly, if the password for a user that is specified in the Access Server credentials is changed, you must update the connection credentials in Access Server. For further details for Linux, UNIX, and Windows see http://www-01.ibm.com/support/docview.wss?uid=swg22009709.
Notifications: You can configure CDC Replication to notify operators of abnormal events. When this function is used, notifications are typically enabled only for high-level messages. Lower-level messages can provide detailed information regarding specific rows that are failing to replicate. While useful for the purposes of problem identification and resolution, this detail might not be appropriate for a notification mechanism such as email.
Data store connection parameters: The default configuration for CDC Replication stores datastore connection credentials for users. You can configure CDC Replication to not store these credentials and instead request users to always provide their connection credentials. In Access Server this is the Always show connection dialog option.
CHCCLP variable substitution: CDC Replication provides the CHCCLP scripting environment to enable automation of configuration, operational, and monitoring tasks. CHCCLP variables are useful for externalizing both the differences between various deployments – development, staging, and production, for instance – as well as the credentials that are necessary to execute the operations. See Running a script file with substitution variables for examples of substituting credentials and deployment-specific configuration information.
Database and system access: CDC Replication requires specific operating system and database user account permissions to be successfully installed, configured, and started. For example, a new or existing UNIX user account (other than root) is required to manage the replication software. Similarly, depending on the database type, the database user should have system or database administrator privileges to read from the database logs and apply operations to the target database. For more information see User account access requirements.
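The idea behind the variable substitution item above, keeping credentials and deployment-specific values outside the script itself, can be sketched generically in Python. This is an analogy only: the template line is a made-up stand-in and is not CHCCLP syntax, and the setting names `CDC_HOST`, `CDC_USER`, and `CDC_PASSWORD` are hypothetical.

```python
from string import Template

# Hypothetical script line; this is NOT CHCCLP syntax, only a stand-in.
SCRIPT_TEMPLATE = Template("login host=${host} user=${user} password=${password}")


def render_script(settings):
    """Fill the template from a mapping of settings; raise KeyError if one is missing."""
    return SCRIPT_TEMPLATE.substitute(
        host=settings["CDC_HOST"],
        user=settings["CDC_USER"],
        password=settings["CDC_PASSWORD"],
    )


# Demonstration with placeholder values. In practice the mapping could be
# os.environ, so that no credential is stored in the script file.
demo = {"CDC_HOST": "staging.example.com", "CDC_USER": "operator", "CDC_PASSWORD": "example"}
print(render_script(demo))
```

Because the script text contains only variable names, the same file can be used for development, staging, and production, and the credentials can be supplied at run time by whoever is authorized to hold them.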