Updated: 15 July 2024
Contributors: Stephanie Susnjara, Ian Smalley

What is data storage?

Data storage refers to magnetic, optical or mechanical media that record and preserve digital information for ongoing or future operations.

There are two types of digital information: input and output data. Users provide the input data, and computers provide the output data. However, a computer's CPU can’t compute anything or produce output data without the user's input. 

Users can enter the input data directly into a computer. However, early in the computer era, they found that continually entering data by hand is prohibitively time- and energy-consuming. One short-term solution is computer memory, also known as random access memory (RAM). However, its storage capacity and memory retention are limited. Read-only memory (ROM), as the name suggests, holds data that can be read but not ordinarily edited. It controls a computer's basic functions.

Although computer scientists made significant advances in computer memory with the development of dynamic RAM (DRAM) and synchronous DRAM (SDRAM), these technologies are still limited by cost, space and memory retention. When a computer powers down, RAM loses the data it holds. The solution? Data storage.

With data storage space, users can save data onto a device. Should the computer power down, the data is retained. Instead of manually entering data into a computer, users can instruct the computer to pull data from storage devices. Computers can read input data from various sources as needed, and they can then create and save the output to the same sources or other storage locations. Users can also share data storage with others. 
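The difference between memory and storage can be sketched in a few lines of Python. This is only a minimal illustration: the file name `readings.json`, the sample data and the helper functions are assumptions made for the example, and a temporary directory stands in for a storage device.

```python
import json
import tempfile
from pathlib import Path


def save_to_storage(path: Path, data: dict) -> None:
    """Write data to a file so it survives after the program ends."""
    path.write_text(json.dumps(data), encoding="utf-8")


def load_from_storage(path: Path) -> dict:
    """Read previously saved data back instead of re-entering it by hand."""
    return json.loads(path.read_text(encoding="utf-8"))


# Hypothetical input data held in memory (lost when the process exits).
readings = {"sensor": "A1", "values": [20.5, 21.0, 19.8]}

# Persist it to a storage device (here, a temporary directory on disk).
storage_dir = Path(tempfile.mkdtemp())
storage_file = storage_dir / "readings.json"
save_to_storage(storage_file, readings)

# Simulate a power-down by discarding the in-memory copy.
del readings

# The data can still be pulled from storage.
restored = load_from_storage(storage_file)
print(restored["values"])
```

Once the file is written, any later run of the program (or another user with access to the same storage) could call `load_from_storage` instead of entering the values again.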

Today, organizations and users require data storage to meet high-level computational needs for big data analytics, artificial intelligence (AI), machine learning (ML) and the Internet of Things (IoT). The other side of requiring vast data storage is protecting against data loss due to disaster, failure or fraud. So, to avoid data loss, organizations can also employ data storage as a backup and restore solution.

How does data storage work?

In simple terms, modern computers or terminals connect to storage devices either directly or through a network. Users instruct computers to access data from and store data on these storage devices. However, at a fundamental level, there are two foundations to data storage: the form in which data is taken and the devices on which it is recorded and stored.

Data storage devices

To store data, regardless of form, users need storage devices. Data storage devices come in two main categories: direct area storage and network-based storage. 

Direct area storage, also known as direct-attached storage (DAS), is what the name implies: storage that is often in the immediate area and directly connected to the computing machine accessing it. Often, that machine is the only one connected to it. DAS can also provide decent local backup services, but sharing is limited. DAS devices include diskettes, optical discs such as compact discs (CDs) and digital video discs (DVDs), hard disk drives (HDDs), flash drives and solid-state drives (SSDs).

Network-based storage allows multiple computers to access it through a network, making it better for data sharing and collaboration. Its off-site storage capability is also better suited for backups and data protection. Two standard network-based storage setups are network-attached storage (NAS) and storage area network (SAN). 

NAS is often a single device made up of redundant storage containers or a redundant array of independent disks (RAID). SAN storage can be a network of multiple devices of various types, including SSD and flash storage, hybrid storage, hybrid cloud storage, cloud storage and backup software and appliances.

What's the difference between NAS and SAN?

Here's how NAS and SAN differ:

NAS

  • Single storage device or RAID
  • File storage system
  • TCP/IP Ethernet network
  • Limited users
  • Limited speed
  • Limited expansion options
  • Lower cost and easy setup

SAN

  • Network of multiple devices
  • Block storage system
  • Fibre Channel network
  • Optimized for multiple users
  • Faster performance
  • Highly expandable
  • Higher cost and complex setup

Types of storage devices and systems

SSD and flash storage

Flash storage is a solid-state technology that uses flash memory chips to write and store data. A solid-state drive (SSD) stores data by using flash memory. Compared to hard disk drives (HDDs), a solid-state system has no moving parts and less latency, so fewer SSDs are needed. Because most modern SSDs are flash-based, flash storage is often treated as synonymous with solid-state storage.

Hybrid storage

SSDs and flash offer higher throughput than HDDs, but all-flash arrays can be more expensive. Many organizations adopt a hybrid approach, mixing the speed of flash with the storage capacity of hard disk drives. A balanced storage infrastructure enables companies to apply specific technology to meet different storage needs. Hybrid storage offers an economical way to transition from traditional HDDs without going entirely to flash.
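One way to picture a hybrid approach is as a placement rule that sends frequently read data to flash and everything else to hard disk drives. The sketch below is illustrative only: the `Dataset` type, the `reads_per_day` measure and the threshold of 1,000 are assumptions, not values from any product or recommendation.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    reads_per_day: int  # hypothetical measure of how "hot" the data is


def choose_tier(dataset: Dataset, hot_threshold: int = 1000) -> str:
    """Place frequently read data on flash and everything else on HDD.

    The threshold is an illustrative assumption, not a recommended value.
    """
    return "flash" if dataset.reads_per_day >= hot_threshold else "hdd"


datasets = [
    Dataset("orders-db", reads_per_day=50_000),
    Dataset("2019-archive", reads_per_day=3),
]
placement = {d.name: choose_tier(d) for d in datasets}
print(placement)  # {'orders-db': 'flash', '2019-archive': 'hdd'}
```

Real tiering systems weigh more factors, such as write patterns, latency targets and cost per gigabyte, but the basic idea of matching data to the medium that suits it is the same.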

Cloud storage

Cloud storage delivers a cost-effective, scalable alternative to storing files on on-premises hard disks or storage networks. Cloud service providers (CSPs) such as Google Cloud, Microsoft Azure, IBM Cloud® and Amazon Web Services (AWS) allow you to save data and files in an off-site location that you can access through the public internet or a dedicated private network connection. The provider hosts, secures, manages and maintains the servers and associated infrastructure and ensures you can access the data whenever needed.

Hybrid cloud storage

Hybrid cloud storage combines private and public cloud elements. With hybrid cloud storage, organizations can choose which cloud to store data in. For instance, highly regulated data subject to strict archiving and replication requirements is more suited to a private cloud environment, while less sensitive data can be stored in the public cloud. Some organizations use hybrid clouds to supplement their internal storage networks with public cloud storage.

Storage backup software and appliances

Backup storage and appliances protect against data loss from disaster, failure or fraud. They make periodic data and application copies to a separate, secondary device and then use those copies for disaster recovery. Backup appliances range from HDDs and SSDs to tape drives and servers.

Cloud service providers (CSPs) also offer backup storage as a service called backup-as-a-service (BaaS). Like most as-a-service solutions, BaaS provides a low-cost option to protect data, saving it in a remote location that can scale as needed.
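The copy-then-restore cycle that backup systems perform can be sketched with Python's standard library. This is a simplified illustration, not how any backup product works: temporary directories stand in for the primary and secondary devices, and the `back_up` and `restore` helpers and the `snapshot-001` label are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def back_up(source: Path, backup_root: Path, label: str) -> Path:
    """Copy the source directory to a separate location under a label."""
    target = backup_root / label
    shutil.copytree(source, target)
    return target


def restore(backup: Path, destination: Path) -> None:
    """Replace the destination with the contents of a backup copy."""
    if destination.exists():
        shutil.rmtree(destination)
    shutil.copytree(backup, destination)


# Temporary directories stand in for a primary and a secondary device.
primary = Path(tempfile.mkdtemp()) / "data"
secondary = Path(tempfile.mkdtemp())
primary.mkdir()
(primary / "report.txt").write_text("quarterly numbers", encoding="utf-8")

snapshot = back_up(primary, secondary, label="snapshot-001")

# Simulate data loss on the primary device, then recover from the copy.
shutil.rmtree(primary)
restore(snapshot, primary)
print((primary / "report.txt").read_text(encoding="utf-8"))
```

A production backup would typically run on a schedule, keep several labeled copies and verify them, but the separation between the primary data and its copy is the essential point.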

Forms of data storage

Data can be recorded and stored in three primary forms: file storage, block storage and object storage. 

For a deeper comparison of the types of data storage, see “Object versus File versus Block Storage: What’s the Difference?”

File storage

File storage, or file-based storage, is a hierarchical storage methodology used to organize and store data. In other words, data is stored in files, which are organized in folders, which are organized under a hierarchy of directories and subdirectories.
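The hierarchy of directories, subdirectories and files can be seen directly in how a path is built. In this small sketch, a temporary directory stands in for the root of a file system, and the directory and file names are made up for illustration.

```python
import tempfile
from pathlib import Path

# A temporary directory stands in for the root of a file system.
root = Path(tempfile.mkdtemp())

# Hypothetical directory and file names, chosen only for illustration.
report = root / "finance" / "2024" / "q2-report.txt"
report.parent.mkdir(parents=True)
report.write_text("revenue summary", encoding="utf-8")

# A file is located by walking the hierarchy from directory to subdirectory.
relative_path = report.relative_to(root)
print(relative_path.parts)  # ('finance', '2024', 'q2-report.txt')

# Listing every file under the root follows the same tree structure.
all_files = sorted(
    p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
)
print(all_files)  # ['finance/2024/q2-report.txt']
```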

Block storage

Block storage, sometimes called block-level storage, is a technology for storing data in blocks. The blocks are then stored as separate pieces, each with a unique identifier. Developers favor block storage for computing situations that require fast, efficient and reliable data transfer.
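As a rough sketch of the idea, the following Python splits a byte string into fixed-size blocks, gives each block a unique identifier and reassembles the data from those identifiers. The 4-byte block size and the function names are assumptions chosen to keep the example readable; real block devices use much larger blocks and far more elaborate addressing.

```python
BLOCK_SIZE = 4  # bytes; tiny on purpose so the example is easy to follow


def write_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Split data into fixed-size blocks, each keyed by a unique identifier."""
    return {
        block_id: data[offset:offset + block_size]
        for block_id, offset in enumerate(range(0, len(data), block_size))
    }


def read_blocks(blocks: dict[int, bytes]) -> bytes:
    """Reassemble the original data by reading the blocks in identifier order."""
    return b"".join(blocks[block_id] for block_id in sorted(blocks))


blocks = write_blocks(b"hello storage")
print(blocks)  # {0: b'hell', 1: b'o st', 2: b'orag', 3: b'e'}
print(read_blocks(blocks))  # b'hello storage'
```

Because each block is addressed independently, a system can read or rewrite one block without touching the others, which is part of why block storage suits workloads that need fast, efficient transfers.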

Object storage

Object storage, often called object-based storage, is a data storage architecture for handling large amounts of unstructured data. This data doesn't conform to—or can't be organized easily into—a traditional relational database with rows and columns. Examples include email, videos, photos, web pages, audio files, sensor data and other media and web content (textual or nontextual). Other use cases include building cloud-native applications or transforming legacy applications into next-generation cloud applications by using cloud-based object storage as a persistent data store.
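In contrast to a folder hierarchy, object stores generally keep each item in a flat namespace, addressed by a unique identifier and stored together with descriptive metadata. The toy class below illustrates that model in memory; the `ObjectStore` name, its methods and the sample metadata are assumptions for the example and do not represent any provider's API.

```python
import uuid


class ObjectStore:
    """A toy in-memory object store: a flat namespace of ID -> (data, metadata)."""

    def __init__(self) -> None:
        self._objects: dict[str, tuple[bytes, dict]] = {}

    def put(self, data: bytes, metadata: dict) -> str:
        """Store the data with its metadata and return a unique object ID."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> tuple[bytes, dict]:
        """Retrieve an object and its metadata by ID; there are no folders."""
        return self._objects[object_id]


object_store = ObjectStore()
# Placeholder bytes standing in for the contents of an image file.
photo_id = object_store.put(b"\x89PNG...", {"content_type": "image/png"})
data, metadata = object_store.get(photo_id)
print(metadata["content_type"])  # image/png
```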

Storage area networks and data storage for business

Computer memory and local storage might not provide enough capacity, protection, multiuser access, speed and performance for enterprise applications. So, most organizations employ some form of a storage area network (SAN) in addition to a network-attached storage (NAS) system.

Sometimes called the network behind the servers, a storage area network (SAN) is a specialized, high-speed network that attaches servers and storage devices. It consists of a communication infrastructure that provides physical connections, allowing any device to connect to any other device across the network through interconnected elements, such as switches and directors.

The SAN can also be viewed as an extension of the storage bus concept. This concept enables storage devices and servers to interconnect by using elements similar to those in local area networks (LANs) and wide-area networks (WANs). A SAN also includes a management layer that organizes the connections, storage elements and computer systems. This layer ensures secure and robust data transfers.

Traditionally, only a limited number of storage devices could attach to a server. In contrast, a SAN introduces networking flexibility, enabling one server or many heterogeneous servers across multiple data centers to share a common storage utility. The SAN eliminates the traditional dedicated connection between a server and storage. It also eliminates the concept that the server effectively owns and manages the storage devices. So, a network might include many storage devices, including disks, magnetic tape and optical storage, and the storage utility might be located far from the servers that use it.

SAN components

The storage infrastructure is the foundation on which information relies. Therefore, it must support the company's business objectives and business model. A SAN infrastructure provides enhanced network availability, data accessibility and system manageability. In this environment, simply deploying more and faster storage devices is not enough. A good SAN begins with a good design. 

Fibre Channel

The first element to consider in any SAN implementation is the connectivity of the storage and server components, which typically use Fibre Channel, a high-speed data transfer technology. SANs, like LANs, interconnect the storage interfaces into many network configurations and across longer distances.

Server infrastructure

The server infrastructure is the underlying reason for all SAN solutions, and this infrastructure includes a mix of server platforms. Initiatives like server consolidation and ecommerce increase the need for SANs, making network storage more critical.

Storage system

A storage system can consist of disk systems and tape systems. The disk system can include HDDs, SSDs or flash drives. The tape system can consist of tape drives, tape autoloaders and tape libraries.

Network system

SAN connectivity comprises hardware and software components that interconnect storage devices and servers. Hardware can include hubs, switches, directors and routers.

Software-defined storage (SDS) and related technologies

Today, data storage has evolved toward a software approach that revolves around software-defined storage (SDS) and related technologies that increase agility and efficiency in data management. According to a report from Technavio, the global SDS market is estimated to grow by USD 105.07 billion from 2024 to 2029.

Software-defined storage (SDS)

Software-defined storage (SDS) is a type of data storage in which a software layer decouples storage resources from their underlying physical storage hardware infrastructure. SDS uses virtualization to create a unified pool of storage resources that can be dynamically allocated through automation or manually through an API dashboard. 

Unlike traditional NAS or SAN systems, SDS offers the flexibility to respond to the complex digital transformation process. For instance, SDS can significantly streamline storage management-related tasks by automating workloads related to provisioning, monitoring and troubleshooting.
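The idea of a unified pool that is allocated on demand can be sketched as a small Python class. This is a toy model under stated assumptions: the `StoragePool` class, the device names `array-a` and `array-b` and the first-fit allocation rule are all invented for illustration and do not describe any SDS product.

```python
class StoragePool:
    """Toy model of a software layer that pools capacity from several devices."""

    def __init__(self, devices: dict[str, int]) -> None:
        # Map of hypothetical device name -> free capacity in GB.
        self.free = dict(devices)

    def total_free(self) -> int:
        """Capacity the pool can still hand out, regardless of device."""
        return sum(self.free.values())

    def allocate(self, size_gb: int) -> dict[str, int]:
        """Carve a volume out of the pool, spanning devices if necessary."""
        if size_gb > self.total_free():
            raise ValueError("not enough capacity in the pool")
        allocation: dict[str, int] = {}
        remaining = size_gb
        for name in self.free:
            take = min(self.free[name], remaining)
            if take:
                allocation[name] = take
                self.free[name] -= take
                remaining -= take
            if remaining == 0:
                break
        return allocation


pool = StoragePool({"array-a": 100, "array-b": 250})
volume = pool.allocate(150)
print(volume)             # {'array-a': 100, 'array-b': 50}
print(pool.total_free())  # 200
```

The caller asks only for 150 GB and never names a device; the software layer decides where the capacity comes from, which is the decoupling that SDS provides.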

Storage virtualization

Storage virtualization refers to pooling physical storage resources from multiple storage systems so that all storage appears to reside on a single device. In contrast, SDS abstracts the storage services and separates them from the device itself. Users manage storage virtualization via a console to ensure the security, reliability and efficiency of their data and storage resources for virtualized server and desktop environments.

Hyperconverged storage

Hyperconverged storage is a data storage architecture in which SDS resources are pooled and managed within a hyperconverged infrastructure (HCI).

Hyperconverged storage integrates all storage directly into the HCI stack, along with computing and networking functions. Through virtualization, HCI untethers storage resources from individual pieces of hardware, making hyperconverged storage far more flexible and scalable than traditional storage solutions.

Data storage security

Data storage security protects data on-premises and in cloud-based environments against data breaches, cyberattacks and other security threats.

Data breaches are costly and present an ongoing risk for enterprise businesses. According to the IBM Cost of a Data Breach Report 2023, the global average cost of a data breach in 2023 was USD 4.45 million, a 15% increase over three years. The report also revealed that organizations that use security AI and automation extensively save an average of USD 1.76 million compared to organizations that don't.

Enterprises deploy data security measures to enhance visibility into data storage. Storage security hardware and software features include special permissions, encryption, data masking and redaction of sensitive files. The latest security storage software solutions also help to automate reporting to streamline audits and adhere to regulatory requirements.
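Data masking, one of the features listed above, can be illustrated with a short Python function that hides all but the last four digits of a payment card number. The pattern below is a simplified assumption that only recognizes 16-digit numbers written in groups of four; real masking tools handle many more formats and data types.

```python
import re

# Illustrative pattern for 16-digit card numbers written in groups of four.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[- ]){3}(\d{4})\b")


def mask_card_numbers(text: str) -> str:
    """Replace all but the last four digits of each card number."""
    return CARD_PATTERN.sub(lambda m: "****-****-****-" + m.group(1), text)


print(mask_card_numbers("Paid with 1234-5678-9012-3456 on Monday."))
# Paid with ****-****-****-3456 on Monday.
```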

Moreover, cyber resilience, an organization's ability to prevent, withstand and recover from cybersecurity incidents, has become an integral part of data storage security. Cyber resilience takes data security to a new level by combining business continuity and disaster recovery (BCDR), information systems security and organizational resilience to help organizations ward off threats and safeguard their data.

Today, industries that need to preserve records and maintain data integrity (for example, healthcare and government) can opt for immutable storage, which protects stored data by preventing any changes or alterations for a set or indefinite amount of time. These file systems allow stored data to be accessed repeatedly once created but not modified, and they can help protect data from tampering, cyberattacks and ransomware.
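The write-once, read-many behavior of immutable storage can be modeled in a few lines of Python. This is a conceptual sketch only: the `ImmutableStore` class, its retention rule and the sample record are assumptions for illustration, and real immutable storage enforces these guarantees at the storage-system level rather than in application code.

```python
import time


class ImmutableStore:
    """Toy write-once, read-many store with a per-record retention period."""

    def __init__(self) -> None:
        self._records: dict[str, tuple[bytes, float]] = {}

    def write(self, key: str, data: bytes, retain_seconds: float) -> None:
        """Store data once; later writes to the same key are rejected."""
        if key in self._records:
            raise PermissionError(f"{key!r} is immutable and cannot be modified")
        self._records[key] = (data, time.time() + retain_seconds)

    def read(self, key: str) -> bytes:
        """Reads are always allowed."""
        return self._records[key][0]

    def delete(self, key: str) -> None:
        """Deletion is allowed only after the retention period has passed."""
        if time.time() < self._records[key][1]:
            raise PermissionError(f"{key!r} is still under retention")
        del self._records[key]


worm = ImmutableStore()
worm.write("audit-log-001", b"user=alice action=login", retain_seconds=3600)
print(worm.read("audit-log-001"))

try:
    worm.write("audit-log-001", b"tampered", retain_seconds=0)
except PermissionError as error:
    print("rejected:", error)
```

Passing `float("inf")` as `retain_seconds` would model an indefinite retention period, since the deletion check would then never succeed.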
