Security mechanisms enable you to protect sources from
unauthorized searching and restrict administrative functions to specific
users.
With IBM® Content Analytics with Enterprise Search, users
can query a wide range of data sources. To ensure that only users
who are authorized to query content do so, and to ensure that only
authorized users are able to access the administration console, the
system coordinates and enforces security at several levels.
- Web application server
- The first level of security is the web application server,
either through the embedded web application server or through WebSphere® Application
Server global security settings.
You can configure the system to use an LDAP registry and allow only
registered users to log in to applications or the administration console.
You can also configure the system to use an LTPA key file to provide
single sign-on (SSO) authentication support to application users.
When you set up security controls, the procedures differ depending on
whether your applications run on the embedded web application
server or on WebSphere Application Server, especially
if you plan to implement support for single sign-on (SSO) authentication.
- System-level security
- At the system level, you can assign users to administrative roles
and authenticate users who administer the system. When a user logs
in to the administration console, only the functions and collections
that the user is authorized to administer are available to that user.
You can also assign privileges to users and groups to control application
functions. For example, you can limit the ability to export documents
from an application to specific users.
You can also configure credentials
that enable crawlers to access the data sources that you include in
collections. Other system components also need these credentials.
For example, to verify that users are authorized to see documents
in the search results, the search servers can use the credentials
to connect to a data source and check the current access control lists.
- Collection-level security
- When you create a collection, you can enable security at the collection
level. You cannot change this setting after the collection is created.
If you do not enable collection-level security, you cannot later specify
document-level security controls.
When collection-level security
is enabled:
- The global analysis processes apply different rules for indexing
duplicate documents.
- You can configure options to enforce document-level security.
- You can enforce security by mapping applications (not
individual users) to the collections that they can access. You then
use standard access control mechanisms to permit or deny users access
to applications.
- You can configure the system to use the identity management
component, which enables application users to be authenticated without
configuring an application profile.
There is a trade-off between enabling collection-level security
and search quality. Enabling collection-level security reduces the information
that is indexed for each document. As a side effect, some queries
return fewer results.
- Document-level security
- When you configure crawlers for a collection, you can enable document-level
security. For example, you can specify options to associate security
tokens with data as the data is collected by crawlers. Your applications
can use these tokens, which are stored with documents in the index,
to pre-filter the results and ensure that only users with the proper
credentials are able to query the data and view documents.
For certain
types of data sources, you can configure options to validate a user's
login credentials against current access controls during query processing.
This extra layer of post-filtering security ensures that a user's
privileges are validated in real time with the data source. This capability
can protect against instances in which a user's credentials change
after a document and its security tokens are indexed.
The anchor
text processing phase of global analysis normally associates text
that appears in one document (the source document) with another document
(the target document) in which that text does not necessarily appear.
When you configure a Web crawler, you can specify whether you want
to exclude the anchor text from the index if the link connects to
a document that the Web crawler is not allowed to crawl.
- Encryption
- To protect sensitive data, encryption is used to encode the authentication
data portion of all messages that are transmitted through the system.
The password for the default IBM Content
Analytics with Enterprise Search administrator
is stored in an encrypted format. Passwords that users specify in
user profiles and passwords that are stored by the system (in configuration
files, the internal databases, and so on) are also encrypted. Encryption
incurs little overhead because only the authentication IDs and passwords
are encrypted.
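The pre-filtering and post-filtering steps described under document-level security can be sketched as follows. This is a minimal illustration, not the product's implementation: the names IndexedDocument, pre_filter, post_filter, secure_results, and check_current_access, the token strings, and the in-memory access list are all hypothetical stand-ins for the index and the live data source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedDocument:
    doc_id: str
    # Security tokens captured by the crawler and stored with the document
    # in the index (the token format here is a made-up example).
    tokens: frozenset


def pre_filter(results, user_tokens):
    """Keep only documents whose indexed tokens overlap the user's tokens."""
    user_tokens = set(user_tokens)
    return [doc for doc in results if doc.tokens & user_tokens]


def post_filter(results, user_id, check_current_access):
    """Re-validate each remaining document against current access controls."""
    return [doc for doc in results if check_current_access(doc, user_id)]


def secure_results(results, user_id, user_tokens, check_current_access):
    """Apply the cheap index-side filter first, then the live check."""
    candidates = pre_filter(results, user_tokens)
    return post_filter(candidates, user_id, check_current_access)


# Hypothetical index contents, as they looked when the documents were crawled.
index = [
    IndexedDocument("doc1", frozenset({"group:sales"})),
    IndexedDocument("doc2", frozenset({"group:hr"})),
    IndexedDocument("doc3", frozenset({"group:sales", "group:hr"})),
]

# Hypothetical current access list at the data source. The user "alice"
# lost access to doc3 after it was indexed.
current_acl = {"doc1": {"alice"}, "doc2": {"bob"}, "doc3": {"bob"}}


def check_current_access(doc, user_id):
    return user_id in current_acl.get(doc.doc_id, set())


visible = secure_results(index, "alice", {"group:sales"}, check_current_access)
print([doc.doc_id for doc in visible])  # ['doc1']
```

In this sketch, pre-filtering alone would still return doc3 because its indexed tokens are stale. The post-filtering step removes it, which is the case that real-time validation is meant to cover.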
Security for your collections extends beyond the authentication
and access control mechanisms that the system can use to protect indexed
content. Safeguards also exist to prevent a malicious or unauthorized
user from gaining access to data while it is in transit. For example,
the search servers use protocols such as Secure Sockets Layer
(SSL), Secure Shell (SSH), and Hypertext Transfer Protocol Secure
(HTTPS) to communicate with the controller server and your applications.
For increased security, you need to ensure that the server hardware
is appropriately isolated and secure from unauthorized intrusion.
By installing a firewall, you can protect the servers from intrusion
through another part of your network. Also ensure that no unnecessary
ports are open on the servers. Configure the system so that it listens
for requests only on ports that are explicitly assigned to IBM Content
Analytics with Enterprise Search activities and applications.
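One way to check the port guidance above is to compare the ports that accept connections on a server with an allow-list of ports that you assigned. The following sketch uses only the Python standard library. The function names open_ports and unexpected_ports are hypothetical, and the allow-list must come from your own configuration; no product port numbers are assumed here.

```python
import socket


def open_ports(host, ports, timeout=0.5):
    """Return the ports in `ports` that accept a TCP connection on `host`."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found


def unexpected_ports(host, ports_to_scan, allowed_ports):
    """Return open ports that are not on the allow-list, in ascending order."""
    return sorted(set(open_ports(host, ports_to_scan)) - set(allowed_ports))
```

For example, unexpected_ports("127.0.0.1", range(1, 1025), {22, 443}) lists any open port below 1025 other than the two placeholder values in the allow-list. Run a check like this only against servers that you are authorized to scan.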