IBM® InfoSphere® Information
Server provides
a unified architecture that works with all types of information integration.
Common services, unified parallel processing, and unified metadata
are at the core of the server architecture.
The architecture is service oriented, enabling IBM InfoSphere Information
Server to
work within evolving enterprise service-oriented architectures. A
service-oriented architecture also connects the individual suite product
modules of InfoSphere Information Server.
By eliminating duplication of functions, the architecture efficiently
uses hardware resources and reduces the amount of development and
administrative effort that is required to deploy an integration solution.
Figure 1 shows the InfoSphere Information Server architecture.
Figure 1. InfoSphere Information Server high-level architecture
- Unified parallel processing engine
- Much of the work that InfoSphere Information Server does
takes place within the parallel processing engine. The engine handles
data processing needs as diverse as performing analysis of large databases
for IBM InfoSphere Information Analyzer,
data cleansing for IBM InfoSphere QualityStage®,
and complex transformations for IBM InfoSphere DataStage®.
This parallel processing engine is designed to deliver the following
benefits:
- Parallelism and data pipelining to complete increasing volumes
of work in decreasing time windows
- Scalability by adding hardware (for example, processors or nodes
in a grid) with no changes to the data integration design
- Optimized database, file, and queue processing to handle large
files that cannot fit in memory all at once, or large numbers
of small files
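The pipelining and partitioning ideas in the list above can be illustrated with a small Python sketch. This is not the InfoSphere engine or its API. The stage names (`read_rows`, `cleanse`, `transform`), the row layout, and the hash-by-id partitioning are all assumptions made for illustration. Generators stand in for pipelined stages, and a thread pool stands in for engine nodes. Python threads do not give true CPU parallelism, so treat this as a structural sketch only.

```python
from concurrent.futures import ThreadPoolExecutor

def read_rows(n):
    """Source stage (hypothetical): yield rows one at a time."""
    for i in range(n):
        yield {"id": i, "name": f" customer {i} "}

def cleanse(rows):
    """Cleansing stage: trim whitespace as each row arrives."""
    for row in rows:
        yield {**row, "name": row["name"].strip()}

def transform(rows):
    """Transformation stage: derive an upper-case name column."""
    for row in rows:
        yield {**row, "name_upper": row["name"].upper()}

def run_partition(partition_rows):
    """Pipelining: each row flows through both stages before the next is read."""
    return list(transform(cleanse(partition_rows)))

def run_parallel(rows, partitions):
    """Partitioning: split rows by id, then process the partitions concurrently."""
    buckets = [[] for _ in range(partitions)]
    for row in rows:
        buckets[row["id"] % partitions].append(row)
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        results = pool.map(run_partition, buckets)
    return [row for part in results for row in part]

# The stage definitions stay the same whether 2 or 4 partitions are used.
result = run_parallel(read_rows(8), partitions=4)
```

Changing the `partitions` argument changes only how the work is spread out, which mirrors the idea of adding hardware without changing the data integration design.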
- Common connectivity
- InfoSphere Information Server connects
to information sources whether they are structured, unstructured,
on the mainframe, or applications. Metadata-driven connectivity is
shared across the suite components, and connection objects are reusable
across functions.
- Connectors provide design-time importing of metadata, data browsing
and sampling, run-time dynamic metadata access, error handling, and
high-function, high-performance run-time data access. Prebuilt
interfaces for packaged applications, called packs, provide
adapters to SAP, Siebel, Oracle, and others, enabling integration
with enterprise applications and associated reporting and analytical
systems.
- Unified metadata
- InfoSphere Information Server is
built on a unified metadata infrastructure that enables shared understanding
between business and technical domains. This infrastructure reduces
development time and provides a persistent record that can improve
confidence in information. All functions of InfoSphere Information Server share
the same metamodel, making it easier for different roles and functions
to collaborate.
- A common metadata repository provides persistent storage for all InfoSphere Information Server suite
components. All of the products depend on the repository to navigate,
query, and update metadata. The repository contains two kinds of metadata:
- Dynamic
- Dynamic metadata includes design-time information.
- Operational
- Operational metadata includes performance monitoring, audit and
log data, and data profiling sample data.
Because the repository is shared by all suite components,
profiling information that is created by InfoSphere Information Analyzer is
instantly available to users of InfoSphere DataStage and InfoSphere QualityStage,
for example.
The repository is a J2EE application that uses a standard
relational database such as IBM DB2®, Oracle,
or SQL Server for persistence (DB2 is
provided with InfoSphere Information Server).
These databases provide backup, administration, scalability, parallel
access, transactions, and concurrent access.
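The idea of one repository that holds both kinds of metadata and is read by every component can be sketched as follows. This is a minimal illustration, not the actual repository schema or metamodel. The `metadata` table, its columns, the `publish` and `lookup` helpers, and the sample values are hypothetical, and an in-memory SQLite database stands in for DB2, Oracle, or SQL Server.

```python
import sqlite3

# SQLite in memory stands in for the relational database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE metadata (
           asset    TEXT NOT NULL,
           kind     TEXT NOT NULL CHECK (kind IN ('dynamic', 'operational')),
           producer TEXT NOT NULL,
           detail   TEXT NOT NULL
       )"""
)

def publish(asset, kind, producer, detail):
    """Store one metadata record; the transaction commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO metadata VALUES (?, ?, ?, ?)",
            (asset, kind, producer, detail),
        )

def lookup(asset, kind):
    """Return (producer, detail) pairs for an asset, in insertion order."""
    cur = conn.execute(
        "SELECT producer, detail FROM metadata "
        "WHERE asset = ? AND kind = ? ORDER BY rowid",
        (asset, kind),
    )
    return cur.fetchall()

# A design tool records design-time information (dynamic metadata).
publish("CUSTOMER.EMAIL", "dynamic", "designer", "column type VARCHAR(254)")
# A profiling tool records sample results (operational metadata).
publish("CUSTOMER.EMAIL", "operational", "profiler", "3% null values in sample")

# Any other component can read the profiling result from the same store.
profile = lookup("CUSTOMER.EMAIL", "operational")
```

Because both writers and readers go through the same store, a record written by one tool is visible to the others as soon as the transaction commits.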
- Common services
- InfoSphere Information Server is
built entirely on a set of shared services that centralize core tasks
across the platform. These include administrative tasks such as security,
user administration, logging, and reporting. Shared services allow
these tasks to be managed and controlled in one place, regardless
of which suite component is being used. The common services also include
the metadata services, which provide standard service-oriented access
and analysis of metadata across the platform. In addition, the common
services tier manages how services are deployed from any of the product
functions, allowing cleansing and transformation rules or federated
queries to be published as shared services within an SOA, using a
consistent and easy-to-use mechanism.
- InfoSphere Information Server products
can access three general categories of service:
- Design
- Design services help developers create function-specific services
that can also be shared. For example, InfoSphere Information Analyzer calls
a column analyzer service that was created for enterprise data analysis
but can be integrated with other parts of InfoSphere Information Server because
it exhibits common SOA characteristics.
- Execution
- Execution services include logging, scheduling, monitoring, reporting,
security, and Web framework.
- Metadata
- Metadata services enable metadata to be shared across tools so
that changes made in one InfoSphere Information Server component
are instantly visible across all of the suite components. Metadata
services are integrated with the metadata repository. Metadata services
also enable you to exchange metadata with external tools.
The common services tier is deployed on J2EE-compliant
application servers such as IBM WebSphere® Application Server,
which is included with InfoSphere Information Server.
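As a rough illustration of how shared services centralize tasks such as security and logging, the sketch below routes every call through one registry. It is an assumption-laden toy, not the InfoSphere Information Server services API. The `SharedServices` class, the service name `cleanse.trim_phone`, and the user names are all hypothetical.

```python
class SharedServices:
    """Hypothetical registry: one place for publishing, security, and logging."""

    def __init__(self):
        self._services = {}   # service name -> (callable, set of allowed users)
        self.audit_log = []   # single log shared by every service

    def publish(self, name, func, allowed_users):
        """Register a rule under a name so that any component can call it."""
        self._services[name] = (func, set(allowed_users))
        self.audit_log.append(f"published {name}")

    def invoke(self, user, name, *args):
        """Check access, log the attempt, then run the service."""
        func, allowed = self._services[name]
        if user not in allowed:
            self.audit_log.append(f"denied {user} -> {name}")
            raise PermissionError(f"{user} may not call {name}")
        self.audit_log.append(f"invoked {user} -> {name}")
        return func(*args)

services = SharedServices()

# A cleansing rule is published once and then reused by name.
services.publish(
    "cleanse.trim_phone",
    lambda s: "".join(ch for ch in s if ch.isdigit()),
    allowed_users={"etl_job", "analyst"},
)

cleaned = services.invoke("etl_job", "cleanse.trim_phone", "(555) 010-1234")
```

The access check and the audit entry live in `invoke`, so they apply uniformly regardless of which component published or called the rule.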
- Unified user interface
- The face of InfoSphere Information Server is
a common graphical interface and tool framework. Shared interfaces
such as the IBM InfoSphere Information
Server console and
the IBM InfoSphere Information Server Web console provide
a common interface, visual controls, and user experience
across products. Common functions such as catalog browsing, metadata
import, query, and data browsing all expose underlying common services
in a uniform way. InfoSphere Information Server provides
rich client interfaces for highly detailed development work and thin
clients that run in Web browsers for administration.
Application
programming interfaces (APIs) support a variety of interface styles
that include standard request-reply, service-oriented, event-driven,
and scheduled task invocation.
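Three of the invocation styles named above (request-reply, event-driven, and scheduled task) can be contrasted with a short sketch that calls the same operation in each style. None of this reflects the real InfoSphere Information Server APIs. The `standardize_name` function, the `record.created` topic, and the `subscribe` and `emit` helpers are hypothetical, and the service-oriented style is left out because it would need a network endpoint.

```python
import sched
import time
from collections import defaultdict

def standardize_name(name):
    """The shared operation: collapse whitespace and title-case a name."""
    return " ".join(name.split()).title()

# 1. Request-reply: the caller blocks and receives the result directly.
reply = standardize_name("  ada   LOVELACE ")

# 2. Event-driven: handlers subscribe to a topic and run when an event fires.
handlers = defaultdict(list)
event_results = []

def subscribe(topic, handler):
    handlers[topic].append(handler)

def emit(topic, payload):
    for handler in handlers[topic]:
        handler(payload)

subscribe(
    "record.created",
    lambda name: event_results.append(standardize_name(name)),
)
emit("record.created", "grace  hopper")

# 3. Scheduled task: a scheduler invokes the operation after a short delay.
scheduled_results = []
scheduler = sched.scheduler(time.monotonic, time.sleep)
scheduler.enter(
    0.01, 1, lambda: scheduled_results.append(standardize_name("alan TURING"))
)
scheduler.run()  # blocks until every queued task has run
```

In all three cases the operation itself is unchanged; only the way the caller reaches it differs.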