Parallel computing

parallel, parallel processing, parallelism, parallel computer, parallelization, parallel programming, parallel computation, parallel computers, in parallel, parallel processor
Parallel computing is a type of computation in which many calculations or the execution of processes are carried out simultaneously.

Task parallelism

thread-level parallelism, task-parallel, tasks
There are several different forms of parallel computing: bit-level, instruction-level, data, and task parallelism.
Task parallelism (also known as function parallelism and control parallelism) is a form of parallelization of computer code across multiple processors in parallel computing environments.
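
A minimal sketch of task parallelism, assuming Python's standard threading module: two unrelated functions run as separate threads at the same time. The names compress_logs and index_documents are illustrative placeholders, not functions from any real codebase.

    import threading

    def compress_logs():
        # One task: placeholder for some CPU- or I/O-bound work.
        print("compressing logs")

    def index_documents():
        # A different, independent task running alongside the first.
        print("indexing documents")

    # Task parallelism: different code paths execute at the same time.
    threads = [threading.Thread(target=compress_logs),
               threading.Thread(target=index_documents)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()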

Bit-level parallelism

bit-level
There are several different forms of parallel computing: bit-level, instruction-level, data, and task parallelism. Parallel computing is closely related to concurrent computing—they are frequently used together, and often conflated, though the two are distinct: it is possible to have parallelism without concurrency (such as bit-level parallelism), and concurrency without parallelism (such as multitasking by time-sharing on a single-core CPU).
Bit-level parallelism is a form of parallel computing based on increasing processor word size.
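
A hedged illustration of why word size matters: on a 32-bit processor, adding two 64-bit integers requires two additions plus carry handling, whereas a 64-bit processor does it in a single instruction. The sketch below emulates the 32-bit case in Python.

    MASK32 = (1 << 32) - 1

    def add64_with_32bit_ops(a, b):
        """Add two 64-bit integers using only 32-bit-wide additions,
        as a 32-bit processor must: low halves first, then high halves
        plus the carry. A 64-bit processor needs one instruction."""
        low = (a & MASK32) + (b & MASK32)
        carry = low >> 32
        high = ((a >> 32) + (b >> 32) + carry) & MASK32
        return (high << 32) | (low & MASK32)

    assert add64_with_32bit_ops(2**63, 2**62) == 2**63 + 2**62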

Multi-core processor

dual-core, multi-core, quad-core
As power consumption (and consequently heat generation) by computers has become a concern in recent years, parallel computing has become the dominant paradigm in computer architecture, mainly in the form of multi-core processors. Parallel computers can be roughly classified according to the level at which the hardware supports parallelism, with multi-core and multi-processor computers having multiple processing elements within a single machine, while clusters, MPPs, and grids use multiple computers to work on the same task.
The instructions are ordinary CPU instructions (such as add, move data, and branch) but the single processor can run instructions on separate cores at the same time, increasing overall speed for programs that support multithreading or other parallel computing techniques.

Data parallelism

data parallel, data-parallel, data
There are several different forms of parallel computing: bit-level, instruction-level, data, and task parallelism.
Data parallelism is parallelization across multiple processors in parallel computing environments.
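
A minimal data-parallel sketch using Python's standard concurrent.futures: the same operation (here a partial sum of squares) is applied to different slices of one array, and the partial results are combined at the end.

    from concurrent.futures import ProcessPoolExecutor

    def partial_sum(chunk):
        # The same operation, applied to a different slice of the data.
        return sum(x * x for x in chunk)

    if __name__ == "__main__":
        data = list(range(1_000_000))
        workers = 4
        step = len(data) // workers
        chunks = [data[i * step:(i + 1) * step] for i in range(workers)]
        # Any remainder elements go to the last chunk.
        chunks[-1].extend(data[workers * step:])
        with ProcessPoolExecutor(max_workers=workers) as pool:
            total = sum(pool.map(partial_sum, chunks))
        print(total)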

Concurrent computing

concurrent, concurrent programming, concurrency
Parallel computing is closely related to concurrent computing—they are frequently used together, and often conflated, though the two are distinct: it is possible to have parallelism without concurrency (such as bit-level parallelism), and concurrency without parallelism (such as multitasking by time-sharing on a single-core CPU).
The concept of concurrent computing is frequently confused with the related but distinct concept of parallel computing, although both can be described as "multiple processes executing during the same period of time".

Grid computing

grid, grids, computing grid
Parallel computers can be roughly classified according to the level at which the hardware supports parallelism, with multi-core and multi-processor computers having multiple processing elements within a single machine, while clusters, MPPs, and grids use multiple computers to work on the same task.
For certain applications, distributed or grid computing can be seen as a special type of parallel computing that relies on complete computers (with onboard CPUs, storage, power supplies, network interfaces, etc.) connected to a computer network (private or public) by a conventional network interface, such as Ethernet.

Distributed computing

distributed, distributed systems, distributed system
In contrast, in concurrent computing, the various processes often do not address related tasks; when they do, as is typical in distributed computing, the separate tasks may have a varied nature and often require some inter-process communication during execution.

Speedup

speed up, speed-up, linear speedup
A theoretical upper bound on the speed-up of a single program as a result of parallelization is given by Amdahl's law.
The notion of speedup was established by Amdahl's law, which was particularly focused on parallel processing.
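
As a reminder of the standard definition (not quoted from this article's text, but uncontroversial): if $T_1$ is the execution time on one processor and $T_p$ the time on $p$ processors, then

    S_p = \frac{T_1}{T_p}, \qquad \text{and linear (ideal) speedup means } S_p = p.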

Embarrassingly parallel

embarrassingly parallel problem, Embarrassingly Parallel, embarrassingly parallel
An application exhibits fine-grained parallelism if its subtasks must communicate many times per second; it exhibits coarse-grained parallelism if they do not communicate many times per second, and it exhibits embarrassing parallelism if they rarely or never have to communicate.
Parallel computing, a paradigm in computing which has multiple tasks running simultaneously, might contain what is known as an embarrassingly parallel workload or problem (also called perfectly parallel, delightfully parallel or pleasingly parallel).
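
A sketch of an embarrassingly parallel workload, assuming Python's multiprocessing module: every work item is processed independently and the workers never communicate with each other. The function hash_item and the inputs are illustrative placeholders.

    import hashlib
    from multiprocessing import Pool

    def hash_item(item):
        # Each input is handled completely independently:
        # no shared state, no communication between workers.
        return hashlib.sha256(item.encode()).hexdigest()

    if __name__ == "__main__":
        items = [f"record-{i}" for i in range(10_000)]
        with Pool() as pool:
            digests = pool.map(hash_item, items)
        print(len(digests))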

Thread (computing)

thread, threads, multithreading
Subtasks in a parallel program are often called threads.
On a multiprocessor or multi-core system, multiple threads can execute in parallel, with every processor or core executing a separate thread simultaneously; on a processor or core with hardware threads, separate software threads can also be executed concurrently by separate hardware threads.

Barrier (computer science)

barrier, rendezvous, barriers
For that, some means of enforcing an ordering between accesses is necessary, such as semaphores, barriers or some other synchronization method.
In parallel computing, a barrier is a type of synchronization method.
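
A minimal barrier sketch using Python's threading.Barrier: each thread finishes phase 1, waits until all threads have reached the barrier, and only then starts phase 2.

    import threading

    NUM_WORKERS = 4
    barrier = threading.Barrier(NUM_WORKERS)

    def worker(worker_id):
        print(f"worker {worker_id}: phase 1 done")
        barrier.wait()   # block until all NUM_WORKERS threads arrive here
        print(f"worker {worker_id}: phase 2 starts")

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(NUM_WORKERS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()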

Operating system

operating systems, OS, computer operating system
An operating system can ensure that different tasks and user programs are run in parallel on the available cores.
Hardware features were added that enabled the use of runtime libraries, interrupts, and parallel processing.

Deadlock

livelock, deadlocks, deadly embrace
Locking multiple variables using non-atomic locks introduces the possibility of program deadlock.
Deadlock is a common problem in multiprocessing systems, parallel computing, and distributed systems, where software and hardware locks are used to arbitrate shared resources and implement process synchronization.
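
A hedged sketch of the classic two-lock deadlock and its usual remedy: if one thread takes lock_a then lock_b while another takes lock_b then lock_a, each can end up waiting forever for the lock the other holds. Acquiring the locks in one agreed-upon order removes the cycle; the code below shows only the safe ordering.

    import threading

    lock_a = threading.Lock()
    lock_b = threading.Lock()

    # Deadlock scenario (not run here): thread 1 holds lock_a and waits for
    # lock_b, while thread 2 holds lock_b and waits for lock_a, so neither
    # can proceed. The fix is a global lock order: always lock_a first.

    def update_shared_state(n):
        with lock_a:          # every thread acquires lock_a first ...
            with lock_b:      # ... and lock_b second, so no cycle can form
                print(f"thread {n} holds both locks")

    threads = [threading.Thread(target=update_shared_state, args=(i,))
               for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()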

Serial computer

bit-serial, bit-serial mode, serial computation
Traditionally, computer software has been written for serial computation.
Serial computers required much less hardware than their parallel counterparts, but were much slower.

Amdahl's law

applications do not scale horizontally, overall speedup, speedup
A theoretical upper bound on the speed-up of a single program as a result of parallelization is given by Amdahl's law.
Amdahl's law is often used in parallel computing to predict the theoretical speedup when using multiple processors.
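
Stated as a formula (the standard form of the law): if a fraction $P$ of a program's work can be parallelized and $N$ processors are used, the predicted speedup is

    S(N) = \frac{1}{(1 - P) + \frac{P}{N}}, \qquad \lim_{N \to \infty} S(N) = \frac{1}{1 - P}.

For example, with $P = 0.9$ the speedup is bounded by 10, no matter how many processors are added.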

Systolic array

systolic array, KressArray, multicellular
While computer architectures to deal with this were devised (such as systolic arrays), few applications that fit this class materialized.
In parallel computer architectures, a systolic array is a homogeneous network of tightly coupled data processing units (DPUs) called cells or nodes.

Process (computing)

process, processes, processing
Parallel computing is a type of computation in which many calculations or the execution of processes are carried out simultaneously.
A multitasking operating system may just switch between processes to give the appearance of many processes executing simultaneously (that is, in parallel), though in fact only one process can be executing at any one time on a single CPU (unless the CPU has multiple cores, in which case multithreading or similar technologies can be used).

Parallel slowdown

This problem, known as parallel slowdown, can be improved in some cases by software analysis and redesign.
Parallel slowdown is a phenomenon in parallel computing where parallelization of a parallel algorithm beyond a certain point causes the program to run slower (take more time to run to completion).

Sequential algorithm

sequential, serial algorithm
In some cases parallelism is transparent to the programmer, such as in bit-level or instruction-level parallelism, but explicitly parallel algorithms, particularly those that use concurrency, are more difficult to write than sequential ones, because concurrency introduces several new classes of potential software bugs, of which race conditions are the most common.
In computer science, a sequential algorithm or serial algorithm is an algorithm that is executed sequentially – once through, from start to finish, without other processing executing – as opposed to concurrently or in parallel.
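
A small sketch of the race condition mentioned above, assuming CPython's threading module: the read-modify-write in counter += 1 is not atomic, so two threads interleaving on it can lose updates, and the final count may come out below the expected 200,000 (whether it actually does depends on the interpreter's thread switching).

    import threading

    counter = 0

    def bump(times):
        global counter
        for _ in range(times):
            # Not atomic: read counter, add 1, write back. Another thread
            # can interleave between the read and the write, losing updates.
            counter += 1

    threads = [threading.Thread(target=bump, args=(100_000,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(counter)  # expected 200000; a race can make it smaller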

Central processing unit

CPU, processor, processors
These instructions are executed on a central processing unit on one computer.
These newer concerns are among the many factors causing researchers to investigate new methods of computing such as the quantum computer, as well as to expand the usage of parallelism and other methods that extend the usefulness of the classical von Neumann model.

Superscalar processor

superscalar, superscalar architecture, superscalar execution
These processors are known as superscalar processors.
A superscalar processor is a CPU that implements a form of parallelism called instruction-level parallelism within a single processor.

Frequency scaling

Dynamic frequency scaling (DFS), frequency, frequency scaled
Parallelism has long been employed in high-performance computing, but it has gained broader interest due to the physical constraints preventing frequency scaling.
The end of frequency scaling as the dominant cause of processor performance gains has caused an industry-wide shift to parallel computing in the form of multicore processors.

Dataflow architecture

Dataflow, data flow, data flow computer architecture
Dataflow theory later built upon these, and Dataflow architectures were created to physically implement the ideas of dataflow theory.
It is also very relevant in many software architectures today including database engine designs and parallel computing frameworks.

Uniform memory access

UMA, Heterogeneous Unified Memory Access, hUMA
Computer architectures in which each element of main memory can be accessed with equal latency and bandwidth are known as uniform memory access (UMA) systems.
Uniform memory access (UMA) is a shared memory architecture used in parallel computers.

Intel

Intel Corporation, Intel Corp., Intel Inside
Increasing processor power consumption led ultimately to Intel's May 8, 2004 cancellation of its Tejas and Jayhawk processors, which is generally cited as the end of frequency scaling as the dominant computer architecture paradigm.
The Intel Scientific Computers division was founded in 1984 by Justin Rattner, to design and produce parallel computers based on Intel microprocessors connected in hypercube internetwork topology.