
Published: 3 July 2024
Contributors: Mesh Flinders, Ian Smalley

What is parallel computing?

Parallel computing, also known as parallel programming, is a process where large compute problems are broken down into smaller problems that can be solved simultaneously by multiple processors. 

The processors communicate through shared memory, and their partial results are combined into a final solution. Parallel computing is significantly faster than serial computing (also known as serial computation), its predecessor, in which a single processor solves problems in sequence.

When computers were first built in the late 1940s and 1950s, software solved problems in sequence: algorithms were implemented as a series of instructions executed one at a time on a central processing unit (CPU), and a new instruction could begin only after the previous one had finished. This approach restricted processing speed.

Starting in the 1950s, parallel computing allowed computers to run code faster and more efficiently by breaking compute problems into smaller, similar subproblems. Parallel algorithms then distributed these subproblems across multiple processors and combined the results.

Today, parallel systems have evolved to the point where they are used in various computers, making everyday tasks like checking email or sending a text message hundreds of times faster than if they were performed using serial computing. In addition to powering personal devices like laptops and smartphones, parallel systems also power the most advanced supercomputers and cutting-edge technologies like artificial intelligence (AI) and the Internet of Things (IoT).

Serial computing versus parallel computing

Serial computing, also known as sequential computing, is a type of computing where instructions for solving compute problems are followed one at a time or sequentially. The fundamentals of serial computing require systems to use only one processor rather than distributing problems across multiple processors. 

As computer science evolved, parallel computing was introduced to overcome the speed limits of serial computing. Operating systems that support parallel programming allow computers to run processes and perform calculations simultaneously, a technique known as parallel processing.

Parallel processing versus parallel computing

Parallel processing and parallel computing are very similar terms, but some differences are worth noting. Parallel processing, or parallelism, splits a task at run time into smaller parts that are performed independently and simultaneously on more than one processor. A computer with multiple processors, or a network of computers, then reassembles the partial results once they have been computed.

While parallel processing and parallel computing are sometimes used interchangeably, parallel processing refers to the hardware side (the number of cores and CPUs working inside a computer), while parallel computing refers to how the software divides and coordinates the work across them.


Why is parallel computing important?

Parallel computing’s speed and efficiency power some of the most important tech breakthroughs of the last half century, including smartphones, high performance computing (HPC), AI and machine learning (ML). By enabling computers to solve more complex problems faster and with fewer resources, parallel computing is also a crucial driver of digital transformation for many enterprises.  

History and development

Interest in parallel computing began as computer programmers and manufacturers started to look for ways to build more powerful, efficient processors. In the 1950s, 1960s and 1970s, science and engineering leaders built computers that, for the first time, used shared memory space and ran parallel operations on datasets. These efforts culminated in the groundbreaking Caltech Concurrent Computation project, which introduced a new type of parallel computing in the 1980s using 64 Intel processors.

In the 1990s, the ASCI Red supercomputer, built with massively parallel processors (MPPs), achieved an unprecedented trillion operations per second, ushering in an era of MPP dominance in computing power.¹ At around the same time, clusters, a form of parallel computing that links computers, or "nodes," over a commercial network, were introduced to the market, eventually superseding MPPs for many applications.

Parallel computing, particularly in the form of multicore processors and graphics processing units (GPUs), remains a critical part of computer science today. GPUs are often deployed alongside CPUs to expand data throughput and run more calculations at once, speeding up many modern business applications.

Enterprise benefits of parallel computing
1. Cost reduction

Before parallel computing, serial computing forced single processors to solve complex problems one step at a time, adding minutes and hours to tasks that parallel computing might accomplish in a few seconds. For example, the first iPhones used serial computing and might take a minute to open an app or email. Today, parallel computing—first used in iPhones in 2011—significantly speeds those tasks. 

2. Complex problem solving

As computing matures and tackles more complex problems, systems need to perform thousands, even millions, of tasks in an instant. Today’s ML models rely heavily on parallel computing, using highly complex algorithms deployed over multiple processors. Using serial computing, ML tasks would take much longer due to bottlenecks caused by only being able to perform one calculation at a time on a single processor. 

3. Faster analytics

Parallel computing and parallel processing supercharge number crunching on large datasets, enabling the interactive queries behind data analysis. With more than a quintillion bytes of information generated every day, companies can struggle to sift through it all for relevant insights. Parallel processing applies many cores to a dataset at once and sifts through the data much faster than a serial computer could.

4. Increased efficiencies

Armed with parallel computing, computers can use resources far more efficiently than their serial computing counterparts. Today’s most cutting-edge computer systems deploy multiple cores and processors, enabling them to run multiple programs at once and perform more tasks concurrently.

How does parallel computing work?

Parallel computing refers to many kinds of devices and computer architectures, from supercomputers to the smartphone in your pocket. At its most complex, parallel computing uses hundreds of thousands of cores to tackle problems like finding a new cancer drug or aiding in the search for extraterrestrial intelligence (SETI). At its simplest, parallel computing helps you send an email from your phone faster than if you used a serial computing system. 

Broadly speaking, three architectures are used in parallel computing: shared memory, distributed memory and hybrid memory. Distributed and hybrid memory systems commonly coordinate their processors through the Message Passing Interface (MPI), a standard for writing message-passing programs in languages such as C, C++ and Fortran. Open-source MPI implementations have been key to the development of new applications and software that rely on parallel computing capabilities.
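
To make the model concrete, here is a minimal sketch of a message-passing program in C++. It is illustrative rather than production code: it assumes an MPI implementation such as Open MPI or MPICH is installed and uses only standard MPI calls. Each process sums part of a range of integers, and the partial sums are combined on one process.

    // Minimal MPI sketch: each process sums part of a range, then the
    // partial sums are combined on rank 0.
    // Build: mpicxx sum.cpp -o sum     Run: mpirun -np 4 ./sum
    #include <mpi.h>
    #include <iostream>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);

        int rank = 0, size = 1;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   // this process's ID
        MPI_Comm_size(MPI_COMM_WORLD, &size);   // total number of processes

        const long n = 1000000;                 // sum the integers 1..n
        long local = 0;
        for (long i = rank + 1; i <= n; i += size) {
            local += i;                         // each rank takes every size-th value
        }

        long total = 0;
        MPI_Reduce(&local, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            std::cout << "Sum of 1.." << n << " = " << total << std::endl;
        }

        MPI_Finalize();
        return 0;
    }

The same pattern, dividing work by process rank and combining results with a collective operation, underlies much larger distributed and hybrid memory applications.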

Different parallel computing architectures

Shared memory

Shared memory architecture is used in common, everyday applications of parallel computing, such as laptops and smartphones. In a shared memory architecture, parallel computers rely on multiple processors that access the same shared memory resource.
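
As a rough illustration of the shared memory model, the following C++ sketch (using only the standard library) starts several threads that read the same in-memory array and each sum their own slice of it. Because the array lives in memory that every thread can access directly, no data needs to be copied between processors.

    // Shared-memory sketch: several threads on one machine work on the same
    // array, which lives in memory that all of them can access directly.
    #include <algorithm>
    #include <iostream>
    #include <numeric>
    #include <thread>
    #include <vector>

    int main() {
        const std::size_t n = 1000000;
        std::vector<double> data(n, 1.0);            // shared data

        const unsigned workers = std::max(1u, std::thread::hardware_concurrency());
        std::vector<double> partial(workers, 0.0);   // one result slot per thread
        std::vector<std::thread> pool;

        for (unsigned t = 0; t < workers; ++t) {
            pool.emplace_back([&, t] {
                // Each thread sums its own slice of the shared array.
                const std::size_t begin = t * n / workers;
                const std::size_t end = (t + 1) * n / workers;
                for (std::size_t i = begin; i < end; ++i) {
                    partial[t] += data[i];
                }
            });
        }
        for (auto& th : pool) th.join();

        std::cout << std::accumulate(partial.begin(), partial.end(), 0.0) << "\n";
    }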

Distributed memory

Distributed memory is used in cloud computing architectures, making it common in many enterprise applications. In a distributed system for parallel computing, multiple processors with their own memory resources are linked over a network. 

Hybrid memory

Today’s supercomputers rely on hybrid memory architectures, parallel computing systems that combine shared memory computers on a distributed memory network. Connected CPUs in a hybrid memory environment can access the shared memory on their own node and coordinate with tasks assigned to other nodes on the same network.

Specialized architectures

In addition to the three main architectures, other less common parallel computer architectures are designed to tackle larger problems or highly specialized tasks. These include vector processors, which operate on arrays of data called "vectors," and processors for general-purpose computing on graphics processing units (GPGPU). CUDA, a proprietary GPGPU application programming interface (API) developed by Nvidia, is critical to deep learning (DL), the technology that underpins most AI applications.

 

Types of parallel computing

There are four types of parallel computing, each with its own specific purposes. 

 

1. Bit-level parallelism

Bit-level parallelism relies on increasing the processor's word size, which decreases the number of instructions the processor must run to solve a problem. Until 1986, computer architecture advanced largely by increasing bit-level parallelism, as 4-bit processors gave way to 8-bit, 16-bit, 32-bit and eventually 64-bit designs, with each generation outperforming the last. Perhaps the most famous example of an advance in bit-level parallelism was the Nintendo 64, one of the first mainstream consumer devices built around a 64-bit processor.
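
A small, hypothetical worked example (in C++) shows why a wider word size means fewer instructions: adding two 32-bit numbers on an 8-bit processor requires four separate byte additions plus carry handling, which the helper below simulates, while a 32-bit processor performs the same addition in a single instruction.

    // Illustrative sketch: what an 8-bit CPU must do, in several steps,
    // to add two 32-bit numbers that a 32-bit CPU adds in one instruction.
    #include <cstdint>
    #include <iostream>

    std::uint32_t add32_in_8bit_steps(std::uint32_t a, std::uint32_t b) {
        std::uint32_t result = 0;
        unsigned carry = 0;
        for (int byte = 0; byte < 4; ++byte) {          // four 8-bit limbs
            unsigned x = (a >> (8 * byte)) & 0xFF;
            unsigned y = (b >> (8 * byte)) & 0xFF;
            unsigned sum = x + y + carry;
            carry = sum >> 8;                           // carry into the next limb
            result |= (sum & 0xFFu) << (8 * byte);
        }
        return result;
    }

    int main() {
        std::uint32_t a = 305419896, b = 2271560481;    // arbitrary test values
        std::cout << add32_in_8bit_steps(a, b) << " == " << a + b << "\n";
    }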

2. Instruction-level parallelism

Instruction-level parallelism (ILP) is a type of parallel computing in which the processor determines which of a program's instructions can be executed at the same time. In ILP, processors are built to overlap independent operations, improving resource utilization and increasing throughput.
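
The contrast below is a simple illustration, not tied to any particular processor: the first function is a chain of dependent multiplications that must execute one after another, while the second contains four independent multiplications that a superscalar processor can issue in the same clock cycle.

    // Illustrative sketch of instruction-level parallelism.
    double dependent_chain(double x) {
        double a = x * 1.1;     // each step needs the previous result,
        double b = a * 1.2;     // so there is little instruction-level
        double c = b * 1.3;     // parallelism for the hardware to exploit
        return c * 1.4;
    }

    double independent_ops(double x) {
        double a = x * 1.1;     // these multiplications do not depend on
        double b = x * 1.2;     // one another, so a superscalar processor
        double c = x * 1.3;     // can execute several of them at once
        double d = x * 1.4;
        return a + b + c + d;
    }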

3. Task parallelism

Task parallelism is a type of parallel computing that runs different tasks on several processors at the same time, often over the same data. It reduces overall run time by executing tasks concurrently rather than in sequence, as in pipelining, where a series of different operations is applied to a single stream of data.
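
A minimal sketch in C++ (using std::async from the standard library) shows the idea: two different tasks, a sum and a maximum search, run concurrently over the same read-only dataset.

    // Task-parallelism sketch: two different tasks run at the same time
    // over the same shared, read-only dataset.
    #include <algorithm>
    #include <future>
    #include <iostream>
    #include <numeric>
    #include <vector>

    int main() {
        std::vector<double> data(1000000, 2.5);     // shared input data

        // Task 1: compute the sum of the data.
        auto sum_task = std::async(std::launch::async, [&data] {
            return std::accumulate(data.begin(), data.end(), 0.0);
        });

        // Task 2: find the maximum element, running concurrently with task 1.
        auto max_task = std::async(std::launch::async, [&data] {
            return *std::max_element(data.begin(), data.end());
        });

        std::cout << "sum = " << sum_task.get()
                  << ", max = " << max_task.get() << "\n";
    }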

4. Superword-level parallelism

More advanced than ILP, superword-level parallelism (SLP) is a vectorization technique applied to straight-line code. Vectorization is a parallel computing process that performs multiple similar operations at once, saving time and resources. SLP identifies groups of similar, independent scalar instructions in a code block and combines them into a single, wider "superword" operation.
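
As a rough sketch of what an SLP vectorizer produces (shown here with x86 SSE intrinsics, assuming a processor that supports them), four adjacent scalar additions are packed into a single four-wide superword operation.

    // Illustrative sketch of superword-level parallelism: four similar
    // scalar additions become one 4-wide SIMD addition (x86 SSE shown).
    #include <immintrin.h>

    void scalar_adds(float* a, const float* b, const float* c) {
        a[0] = b[0] + c[0];     // four similar, independent scalar instructions...
        a[1] = b[1] + c[1];
        a[2] = b[2] + c[2];
        a[3] = b[3] + c[3];
    }

    void superword_add(float* a, const float* b, const float* c) {
        __m128 vb = _mm_loadu_ps(b);              // load four floats at once
        __m128 vc = _mm_loadu_ps(c);
        _mm_storeu_ps(a, _mm_add_ps(vb, vc));     // ...become one superword operation
    }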

Parallel computing use cases

From blockchains to smartphones, gaming consoles and chatbots, parallel computing is an essential part of many of the technologies that drive our world. Here are some examples: 

Smartphones

Many smartphones rely on parallel processing to accomplish their tasks faster and more efficiently. For example, the iPhone 14 has a 6-core CPU and a 5-core GPU that power its market-leading ability to perform 17 trillion operations per second, an unthinkable level of performance using serial computing.

Blockchains

Blockchain, the technology that underpins cryptocurrencies, voting machines, healthcare systems and many other advanced digital applications, relies on parallel computing to connect multiple computers that validate transactions and inputs. Parallel computing enables transactions in a blockchain to be processed simultaneously instead of one at a time, increasing throughput and making the system highly scalable and efficient.

Laptop computers

Today’s most powerful laptops, including MacBooks, Chromebooks and ThinkPads, use chips with multiple processing cores, a foundation of parallelism. Multicore processors such as the Intel Core i5, and multiprocessor workstations such as the HP Z8, allow users to edit video in real time, render 3D graphics and perform other complex, resource-intensive tasks.

Internet of Things

The Internet of Things (IoT) relies on data that is collected from sensors, which are connected to the internet. Once that data has been collected, parallel computing is required to analyze it for insights and help complex systems, such as power plants, dams and traffic systems, function. Traditional serial computing can’t sift through data fast enough for IoT to work, making parallel computing crucial to the advancement of IoT technology.

Artificial intelligence and machine learning

Parallel computing plays a critical role in the training of ML models for AI applications, such as facial recognition and natural language processing (NLP). By performing operations simultaneously, parallel computing significantly reduces the time that it takes to train ML models accurately on data.

The space shuttle

The space shuttle's avionics relied on five IBM® AP-101 computers operating in parallel to control the vehicle and monitor data in real time. These powerful machines, which were also used in fighter jets, could run almost 500,000 instructions per second.

Supercomputers

Supercomputers rely so heavily on parallel computing that they are often called parallel computers. The American Summit supercomputer, for example, processes 200 quadrillion operations per second to help humans better understand physics and the natural environment.²

Related solutions
IBM® z/OS® operating system

Explore a highly secure and scalable operating system for running mission-critical applications. IBM® z/OS® is an operating system (OS) for IBM Z® mainframes, suitable for continuous, high-volume operations with high security and stability.

Explore the IBM z/OS operating system

IBM® Netezza® Performance Server

See how IBM® Netezza® Performance Server, a cloud-native enterprise data warehouse, helps operationalize deep analytics, business intelligence and AI/machine learning (ML) workloads by making data unified, accessible and scalable, anywhere.

Discover IBM Netezza Performance Server

IBM Spectrum Symphony®

IBM Spectrum Symphony® software delivers powerful enterprise-class management for running compute-intensive and data-intensive distributed applications on a scalable, shared grid.

Learn more about IBM Spectrum Symphony

Resources

What is high-performance computing?

Learn more about high-performance computing (HPC), a technology that uses clusters of powerful processors that work in parallel to process massive multi-dimensional data sets.

What is cloud computing?

Discover cloud computing, the on-demand access of computing resources—physical servers or virtual servers—over the internet with pay-per-use pricing.

What is supercomputing?

See how supercomputing, a form of high-performance computing, uses a powerful computer to reduce overall time to solution.

What is artificial intelligence (AI)?

Find out how artificial intelligence or AI enables computers and machines to simulate human intelligence and problem-solving capabilities.

What is IT infrastructure?

Learn more about information technology (IT) infrastructure, the combined components needed for the operation and management of enterprise IT services.

What is AI infrastructure?

Learn more about AI (artificial intelligence) infrastructure, also known as an AI stack, a term that refers to the hardware and software needed to create and deploy AI-powered applications and solutions.

Take the next step

IBM Z and Cloud Modernization Stack combines the power of IBM Z and the strength of Red Hat OpenShift Container Platform to deliver a modern, managed as-a-service model that complements your most resilient production workloads.

 

Explore Z and Cloud Modernization Stack
Footnotes

1. "Reflecting on the 25th anniversary of ASCI Red" (link resides outside ibm.com), HPC Wire, 26 April 2022.

2. "The US Again Has the World's Most Powerful Supercomputer" (link resides outside ibm.com), Wired, 8 June 2018.