Managing data collectors
Monitor the status of data collectors and complete administrative tasks such as deploying more data collectors, upgrading your data collectors, and assigning data collectors to monitor specific devices.
What is a data collector?
Data collectors are lightweight applications that are deployed on servers or virtual machines in your data centers. They collect capacity, configuration, and performance metadata about your monitored devices and send it over HTTPS connections to your IBM Storage Insights service for analysis. You can install a data collector in a matter of minutes. After you add the devices that you want to monitor, you get the capacity and performance insights that you need to monitor your data center.
The data collector also collects the metadata that IBM® Support needs to investigate and close tickets. You can upload logs when you create or update tickets, and IBM Support can access and investigate the metadata to resolve any issues that you might have.
Security
Protecting information about your devices and storage metadata is critical. The data collector initiates outbound-only connections over HTTPS to transmit metadata to your unique instance of IBM Storage Insights in the IBM Cloud data center.
The data collector collects metadata about your devices and storage, but doesn’t access application, personal, or identity data.
Deployment planning
Meeting the minimum RAM and disk space requirements on the server or virtual machine where you install a data collector is critical to ensuring that metadata collection runs smoothly. In addition to the RAM and disk space that the host operating system requires, you must provide at least 1 GB of RAM and 3 GB of disk space to run the data collector.
Learn more about how disk space is used during service outages.
- You must be an administrator to deploy a data collector.
- To ensure the availability of metadata collection and to help balance workload, deploy two or more data collectors on separate servers in each of your data centers.
- The servers or virtual machines where you install data collectors must be able to access the devices that you want to monitor. Because data collectors connect directly to the devices, the servers or virtual machines must be able to support these connections.
- Your organization's firewall must be configured to allow the servers or virtual machines where you install data collectors to send outbound traffic on port 443 over TCP to IBM Storage Insights.
- When you install data collectors, you can connect to a server or a proxy server.
- The AIX-based data collector requires IBM® XL C/C++ Runtime for AIX 16.1.0.1 or later to be installed on the operating system.
- Windows Server 2016 or later
- POWER6 or later systems that use AIX® 7.x or later
- Red Hat® Enterprise Linux® 7 or later on x86-64 only
- CentOS Linux 7 or later on x86-64 only
- Red Hat Enterprise Linux 7.x on PPC64LE (POWER8 only). Restriction: The data collector on Linux PPC64LE has the additional limitation that you cannot monitor IBM FlashSystem A9000, XIV, IBM Storage Accelerate, and non-IBM devices.
- Schedule OS updates and restarts at different times on the servers or virtual machines where data collectors are installed. By staggering when these updates occur, you can better avoid interrupting metadata collection across all your monitored devices.
- You can install data collectors on virtual machines in a cloud environment if they meet the previous requirements. If your data collectors are installed in the cloud, it's recommended that you also install at least one data collector in your network for redundancy.
- It's recommended that you do not install a data collector on a laptop or personal workstation. Shutting down a laptop or personal workstation, or putting it into sleep mode, interrupts data collection. The server or virtual machine where you install a data collector must be available 24x7.
- Don't install the data collector to a file system that was mounted or mapped under a user login session. These file systems can be temporary and might not persist after the user logs out of the network. Note that you can install a data collector to a system-mounted or mapped file system because those file systems are typically more permanent.
- Avoid using system-reserved directory paths to install or extract the data collectors. Always create a new directory to install the data collectors.
- If server change control is implemented within your organization, ensure that your server team has a process for updating data collectors. This process must include the ability to download the data collector package from IBM Storage Insights and deploy it within your environment.
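Some of the planning requirements above can be checked before you install a data collector. The following Python sketch is illustrative only and is not part of IBM Storage Insights: the function names are hypothetical, 3 GB is interpreted as 3 x 1024^3 bytes, and the host name that you pass to `can_reach` must be the address of your own IBM Storage Insights instance, which is not given here. The RAM requirement is not checked because the Python standard library has no portable way to read it.

```python
import shutil
import socket

# Assumption: "3 GB of disk space" is interpreted as 3 * 1024**3 bytes.
MIN_FREE_DISK_BYTES = 3 * 1024**3


def has_enough_disk(install_dir: str, required: int = MIN_FREE_DISK_BYTES) -> bool:
    """Return True if the file system that holds install_dir has enough free space."""
    return shutil.disk_usage(install_dir).free >= required


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if an outbound TCP connection to host:port can be opened.

    This only tests TCP reachability through the firewall. It does not
    perform a TLS handshake or authenticate to any service.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, you might call `has_enough_disk` with the new directory that you created for the data collector, and `can_reach` with the host name of your IBM Storage Insights instance, on each server or virtual machine where you plan to install a data collector.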
Deployment best practices
- Redundancy
- To make your data collection services more robust, install two or more data collectors on separate servers or virtual machines in each of your data centers.
- Monitoring devices in multiple data centers
- To avoid high network latency and interruptions in the collection of metadata when you monitor devices in data centers in different locations, install two or more data collectors on separate servers in each data center.
- Large environments
- If your organization uses more than 25 devices (which can include a combination of storage
systems, VMware hosts, and switches), or your storage systems manage more than 50,000 volumes, then
your environment is considered "large".
- The best practice for large environments is to deploy one data collector for every 25 devices that you monitor.
- The number of volumes that your storage systems manage also determines the number of data collectors that you need to deploy. For example, if you add 10 storage systems that manage 50,000 volumes, you must deploy more data collectors to manage the collection of metadata.
- Proxy servers
- When you install the data collector, you can connect to a proxy server.
To connect to the proxy server, you need its host name and port number. If you connect to a secure proxy server, you also need a user name and password.
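The redundancy and large-environment guidance above can be turned into a rough sizing calculation. The following Python sketch is an illustration under stated assumptions, not an official sizing tool: the function names are hypothetical, and the volume count is used only to classify an environment as large because no volumes-per-collector ratio is given.

```python
import math

DEVICES_PER_COLLECTOR = 25          # one data collector for every 25 devices
LARGE_DEVICE_THRESHOLD = 25         # more than 25 devices is "large"
LARGE_VOLUME_THRESHOLD = 50_000     # more than 50,000 volumes is "large"
MIN_COLLECTORS_PER_DATA_CENTER = 2  # two or more collectors for redundancy


def is_large_environment(devices: int, volumes: int) -> bool:
    """Return True if the device or volume count exceeds the stated thresholds."""
    return devices > LARGE_DEVICE_THRESHOLD or volumes > LARGE_VOLUME_THRESHOLD


def collectors_for_devices(devices: int) -> int:
    """Estimate the data collectors to deploy in one data center.

    Applies the one-per-25-devices ratio, but never returns fewer than
    the two collectors that are recommended for redundancy.
    """
    by_ratio = math.ceil(devices / DEVICES_PER_COLLECTOR)
    return max(MIN_COLLECTORS_PER_DATA_CENTER, by_ratio)
```

With these assumptions, a data center with 60 monitored devices needs three data collectors, and a data center with 10 devices still needs two. Treat the result as a lower bound when your storage systems manage a large number of volumes.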
Information on the Data Collectors page
- Name
- The name of the data collector. You can change the default display name, which is the hostname of the data collector, to a name that is easily identifiable within your organization. For example, you can add a geographic component to the name, such as chicago.storage.collector1, chicago.storage.collector2, newyork.storage.collector1, and so on. For more information, see Renaming data collectors.
- Status
- Connected
- The data collector is working correctly and is communicating with IBM Storage Insights.
- Not Connected
- The data collector is stopped on the server and cannot collect metadata. Start the data collection service on the server to resume metadata collection. This status might also be shown if the data collector is not communicating with IBM Storage Insights because of network or firewall issues, or if the data collector is uninstalled on the server but has not been removed from the GUI.
- Failed
- The data collector is not working.
- Down-level
- The latest version of the data collector is not deployed on the server. Enable automatic upgrades to ensure that you always have the most up-to-date data collectors with the latest fixes.
- Downloading
- The current version of the data collector is in the process of being downloaded from IBM.
- Performance Manager caching upload
- The upgrade process is waiting for the Performance Manager data to finish uploading to IBM Storage Insights before continuing with the upgrade process.
- Extracting
- The downloaded data collector is in the process of being extracted onto the server or virtual machine.
- Replacing
- The old data collector files are being replaced by new ones before restarting the data collector.
- Upgrade failed
- The upgrade failed and IBM Storage Insights can't communicate with the data collector.
- Candidates
- The number of devices that the data collector is assigned to. On the Assignments tab, you can assign a data collector to one or more devices. If a data collector that is monitoring a device fails, the next available candidate starts monitoring the device in its place.
- Monitored devices
- The number of devices that the data collector is currently monitoring.
- Unreachable Devices
- The number of candidate devices that the data collector cannot reach because of connection issues.
- IP Address
- The IP address of the device that the data collector is assigned to.
- Version
- The version number of the data collector that is currently running.
- Time Zone
- The time zone where the device that the data collector is assigned to is located.
- Last Start Time
- The date and time that the data collector started.
- OS Type, OS Version, and OS Architecture
- Details of the operating system that is running on the device that the data collector is monitoring.
- Upgrade
- The date, time, and status of the last attempted upgrade of the data collector.
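The candidate behavior described above, where the next available candidate takes over when the monitoring data collector fails, can be pictured with a small model. The following Python sketch is a simplified illustration and not how IBM Storage Insights is implemented: the `Collector` class and `pick_monitor` function are hypothetical, "available" is assumed to mean a status of Connected, and "next" is assumed to mean list order.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Collector:
    name: str
    status: str  # a value from the Status column, for example "Connected" or "Failed"


def pick_monitor(
    candidates: Sequence[Collector], current: Optional[str] = None
) -> Optional[Collector]:
    """Choose which candidate data collector monitors a device.

    Assumption: a collector is available only when its status is "Connected".
    """
    # Keep the current collector while it is still connected.
    for collector in candidates:
        if collector.name == current and collector.status == "Connected":
            return collector
    # Otherwise fall back to the first connected candidate in list order.
    for collector in candidates:
        if collector.status == "Connected":
            return collector
    # No candidate is available, so the device is not monitored.
    return None
```

In this model, assigning only one candidate to a device leaves no fallback if that data collector fails, which is why two or more data collectors are recommended in each data center.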
Information on the details page for a data collector
From the Configuration menu, click Data Collectors. Then, double-click the data collector that you want to view. The following information is shown for each data collector:
- Connection status
- The status of the connection between the data collector and the devices that are visible to it. Use this information to quickly verify which devices the data collector can connect to. You can also identify the devices that are candidates to be monitored but are not yet monitored by the data collector. For example, devices with a Reachable status are candidates for monitoring.
- Last contact
- The most recent date and time when a device was successfully contacted by the data collector. You can use this information to ensure that a connection status is up to date and to identify when and for how long unreachable devices could not be contacted.
For devices that are monitored by the data collector, this value is updated frequently, such as when probes and performance monitors collect metadata, credentials are changed, connections are tested manually, and other operations.
For devices that are not monitored but are candidates to be monitored, the connection between a device and the data collector is tested every 10 minutes. Candidates are devices that can be accessed by the collector and have the value Allowed in the State column. Tip: For information about how to change the frequency of the connection test between a candidate device and the data collector, see Configure how often connection status is updated.
- Name
- The name of a device that is visible to the data collector on your network. Only devices that are accessible to the data collector are shown. A data collector can monitor multiple devices at the same time, but those devices must be accessible from the server or virtual machine where the data collector is installed.
- State
- The most recent date and time when a device was successfully contacted by the data collector.
For devices that are monitored by the data collector, this value is updated frequently, such as when
probes and performance monitors collect metadata, credentials are changed, connections are tested
manually, and other operations. For devices that are not monitored but are candidates to be
monitored, the connection between a device and the data collector is tested every 10 minutes.
Candidates are devices with a State of Allowed.
You can use this information to ensure that a connection status is up to date and identify when and how long unreachable devices could not be contacted.
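If you export or record the Last contact values, you can reason about how current a connection status is. The following Python sketch is a hedged illustration: the function names are hypothetical, the 10-minute interval is the default candidate test frequency described above, and the tolerance of two missed tests is an assumption chosen for the example.

```python
from datetime import datetime, timedelta, timezone

# Default interval between connection tests for candidate devices.
CANDIDATE_TEST_INTERVAL = timedelta(minutes=10)


def time_since_contact(last_contact: datetime, now: datetime) -> timedelta:
    """Return how long ago the device was last contacted (never negative)."""
    return max(now - last_contact, timedelta(0))


def is_stale(
    last_contact: datetime,
    now: datetime,
    interval: timedelta = CANDIDATE_TEST_INTERVAL,
    missed_tests: int = 2,  # assumption: tolerate two missed tests
) -> bool:
    """Return True if more than missed_tests intervals passed without contact."""
    return time_since_contact(last_contact, now) > interval * missed_tests
```

For example, with the default values, a candidate device whose last contact was 25 minutes ago is treated as stale, while one contacted 15 minutes ago is not. Pass timezone-aware `datetime` values for both arguments so that the subtraction is well defined.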