IBM InfoSphere Information Server architecture and concepts
IBM® InfoSphere® Information Server provides a unified architecture that works with all types of information integration. Common services, unified parallel processing, and unified metadata are at the core of the server architecture.
The architecture is service-oriented, enabling IBM InfoSphere Information Server to work within evolving enterprise service-oriented architectures. A service-oriented architecture also connects the individual suite product modules of InfoSphere Information Server.
By eliminating duplication of functions, the architecture uses hardware resources efficiently and reduces the development and administrative effort that is required to deploy an integration solution.
The following diagram shows the InfoSphere Information Server architecture.
- Unified parallel processing engine
- Much of the work that InfoSphere Information Server does takes place within the parallel processing engine. The engine handles data processing needs as diverse as analysis of large databases for IBM InfoSphere Information Analyzer, data cleansing for IBM InfoSphere QualityStage®, and complex transformations for IBM InfoSphere DataStage®. This parallel processing engine is designed to deliver the following benefits:
- Parallelism and data pipelining to complete increasing volumes of work in decreasing time windows
- Scalability by adding hardware (for example, processors or nodes in a grid) with no changes to the data integration design
- Optimized database, file, and queue processing to handle large files that cannot fit in memory all at once, or large numbers of small files
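The first two benefits can be illustrated with a minimal Python sketch. It is not the engine's implementation or API: the stage names, the row layout, and the modulo partitioning scheme are all assumptions made for illustration. Generators stand in for pipelined stages (each row flows to the next stage as soon as it is ready), and a thread pool stands in for partition parallelism. For simplicity, the sketch materializes each partition in memory, which a real engine would avoid for large inputs.

```python
from concurrent.futures import ThreadPoolExecutor


def read_rows(n_rows):
    """Hypothetical source stage: yield rows one at a time."""
    for i in range(n_rows):
        yield {"id": i, "name": f"  customer {i} "}


def cleanse(rows):
    """Hypothetical cleansing stage: trim whitespace as each row streams by."""
    for row in rows:
        yield {**row, "name": row["name"].strip()}


def transform(rows):
    """Hypothetical transformation stage: uppercase the cleansed name."""
    for row in rows:
        yield {**row, "name": row["name"].upper()}


def partition(rows, num_partitions):
    """Split rows into partitions by key (simple modulo scheme, an assumption)."""
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[row["id"] % num_partitions].append(row)
    return parts


def run_partition(rows):
    """Run the same pipelined stages over one partition."""
    return list(transform(cleanse(iter(rows))))


def run_job(n_rows, num_partitions):
    """Run the unchanged pipeline design over any number of partitions."""
    parts = partition(read_rows(n_rows), num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        results = list(pool.map(run_partition, parts))
    merged = [row for part in results for row in part]
    return sorted(merged, key=lambda row: row["id"])
```

Calling `run_job(8, 1)` and `run_job(8, 4)` produces the same rows. Only the partition count changes, which mirrors the idea of scaling by adding hardware without changing the integration design. Python threads do not provide true CPU parallelism, so the sketch shows the structure rather than the performance.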
- Common connectivity
- InfoSphere Information Server connects to structured, unstructured, mainframe, and application information sources. Metadata-driven connectivity is shared across the suite components, and connection objects are reusable across functions.
- Connectors provide design-time importing of metadata, data browsing and sampling, runtime dynamic metadata access, error handling, and high-functionality, high-performance runtime data access. Prebuilt interfaces for packaged applications, called packs, provide adapters to SAP, Siebel, Oracle, and others, enabling integration with enterprise applications and associated reporting and analytical systems.
- Unified metadata
- InfoSphere Information Server is built on a unified metadata infrastructure that enables shared understanding between business and technical domains. This infrastructure reduces development time and provides a persistent record that can improve confidence in information. All functions of InfoSphere Information Server share the same metamodel, making it easier for different roles and functions to collaborate.
- A common metadata repository provides persistent storage for all InfoSphere Information Server suite components. All of the products depend on the repository to navigate, query, and update metadata. The repository contains two kinds of metadata:
- Dynamic
- Dynamic metadata includes design-time information.
- Operational
- Operational metadata includes performance monitoring, audit and log data, and data profiling sample data.
The repository is a J2EE application that uses a standard relational database such as IBM DB2®, Oracle, or SQL Server for persistence (DB2 is provided with InfoSphere Information Server). These databases provide backup, administration, scalability, parallel access, transactions, and concurrent access.
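The idea of one relational store that holds both kinds of metadata can be sketched as follows. This is an illustration only and not the product's schema: the table names, columns, and helper functions are hypothetical, and Python's built-in `sqlite3` module stands in for the relational database.

```python
import sqlite3


def open_repository():
    """Create an in-memory stand-in for the shared repository (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dynamic_metadata (
            asset_name TEXT PRIMARY KEY,
            asset_type TEXT NOT NULL,
            definition TEXT NOT NULL
        );
        CREATE TABLE operational_metadata (
            asset_name     TEXT NOT NULL REFERENCES dynamic_metadata(asset_name),
            event          TEXT NOT NULL,
            rows_processed INTEGER NOT NULL
        );
        """
    )
    return conn


def save_design(conn, name, asset_type, definition):
    """Store design-time information for an asset."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO dynamic_metadata VALUES (?, ?, ?)",
            (name, asset_type, definition),
        )


def record_run(conn, name, event, rows_processed):
    """Store one operational record (for example, a finished run)."""
    with conn:
        conn.execute(
            "INSERT INTO operational_metadata VALUES (?, ?, ?)",
            (name, event, rows_processed),
        )


def run_summary(conn, name):
    """Join both kinds of metadata: asset type, run count, total rows."""
    return conn.execute(
        """
        SELECT d.asset_type, COUNT(o.event), COALESCE(SUM(o.rows_processed), 0)
        FROM dynamic_metadata d
        LEFT JOIN operational_metadata o ON o.asset_name = d.asset_name
        WHERE d.asset_name = ?
        GROUP BY d.asset_name, d.asset_type
        """,
        (name,),
    ).fetchone()
```

Because both tables live in one database, a tool that designs an asset and a tool that monitors it can query the same record, and the database supplies the transactions and concurrent access described above.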
- Common services
- InfoSphere Information Server is built entirely on a set of shared services that centralize core tasks across the platform. These include administrative tasks such as security, user administration, logging, and reporting. Shared services allow these tasks to be managed and controlled in one place, regardless of which suite component is being used. The common services also include the metadata services, which provide standard service-oriented access and analysis of metadata across the platform. In addition, the common services tier manages how services are deployed from any of the product functions, allowing cleansing and transformation rules or federated queries to be published as shared services within an SOA, using a consistent and easy-to-use mechanism.
- InfoSphere Information Server products can access three general categories of service:
- Design
- Design services help developers create function-specific services that can also be shared. For example, InfoSphere Information Analyzer calls a column analyzer service that was created for enterprise data analysis but can be integrated with other parts of InfoSphere Information Server because it exhibits common SOA characteristics.
- Execution
- Execution services include logging, scheduling, monitoring, reporting, security, and the web framework.
- Metadata
- Metadata services enable metadata to be shared across tools so that changes made in one InfoSphere Information Server component are instantly visible across all of the suite components. Metadata services are integrated with the metadata repository. Metadata services also enable you to exchange metadata with external tools.
- Unified user interface
- The face of InfoSphere Information Server is a common graphical interface and tool framework. Shared interfaces such as the IBM InfoSphere Information Server console and the IBM InfoSphere Information Server Web console provide a common interface, visual controls, and user experience across products. Common functions such as catalog browsing, metadata import, query, and data browsing all expose underlying common services in a uniform way. InfoSphere Information Server provides rich client interfaces for highly detailed development work and thin clients that run in web browsers for administration.
Application programming interfaces (APIs) support a variety of interface styles that include standard request-reply, service-oriented, event-driven, and scheduled task invocation.
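To make the invocation styles concrete, the following sketch calls one function in three of them: request-reply, event-driven, and scheduled. It does not use any InfoSphere Information Server API. The `cleanse_name` service, the `EventBus` class, and the topic name are hypothetical, and the scheduler is Python's standard `sched` module.

```python
import sched
import time


def cleanse_name(raw):
    """Hypothetical shared service: normalize whitespace and capitalization."""
    return " ".join(raw.split()).title()


class EventBus:
    """Minimal in-process publish/subscribe bus (an assumption for illustration)."""

    def __init__(self):
        self._handlers = {}

    def subscribe(self, topic, handler):
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic, payload):
        for handler in self._handlers.get(topic, []):
            handler(payload)


def demo():
    results = {}

    # Request-reply: the caller invokes the service and waits for the answer.
    results["request_reply"] = cleanse_name("  ada   LOVELACE ")

    # Event-driven: the service runs when a matching event is published.
    def on_record(raw):
        results["event_driven"] = cleanse_name(raw)

    bus = EventBus()
    bus.subscribe("record.arrived", on_record)
    bus.publish("record.arrived", "grace  hopper")

    # Scheduled: the service runs when the scheduler reaches the queued task.
    def scheduled_task():
        results["scheduled"] = cleanse_name("alan TURING")

    scheduler = sched.scheduler(time.monotonic, time.sleep)
    scheduler.enter(0, 1, scheduled_task)  # delay of 0 seconds, priority 1
    scheduler.run()  # blocks until all queued tasks have run

    return results
```

The same service logic is reused unchanged in each style. Only the trigger differs, which is the property that lets one shared service back several kinds of client.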