A parallel computer is a set of processors that are able to work
cooperatively to solve a computational problem. This definition is broad enough
to include parallel supercomputers that have hundreds or thousands of
processors, networks of workstations, multiple-processor workstations, and
embedded systems. Parallel computers are interesting because they offer the
potential to concentrate computational resources---whether processors, memory,
or I/O bandwidth---on important computational problems.
Parallelism has sometimes been viewed as a rare and exotic subarea of
computing, interesting but of little relevance to the average programmer. A
study of trends in applications, computer architecture, and networking shows
that this view is no longer tenable. Parallelism is becoming ubiquitous, and
parallel programming is becoming central to the programming enterprise.
Distributed Processing
Distributed computing is a computing concept that, in its most general
sense, refers to multiple computer systems working on a single problem. In
distributed computing, a single problem is divided into many parts, and each
part is solved by a different computer. As long as the computers are networked,
they can communicate with each other to solve the problem. If done properly,
the computers perform like a single entity.
The ultimate goal of distributed computing is to maximize performance
by connecting users and IT resources in a cost-effective, transparent and
reliable manner. It also provides fault tolerance, keeping resources accessible
in the event that one of the components fails.
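As a minimal sketch of this idea (assuming an MPI installation such as Open MPI
or MPICH, and a made-up task of summing the integers 1..N), each process below
stands in for one networked computer, solves its own part of the problem, and
the partial results are combined over the network:

// distributed_sum.cpp -- illustrative sketch only: each MPI process (rank)
// stands in for one networked computer and solves one part of the problem.
// Build/run (assuming MPI is installed): mpicxx distributed_sum.cpp && mpirun -np 4 ./a.out
#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   // which part of the problem is ours
    MPI_Comm_size(MPI_COMM_WORLD, &size);   // how many computers are cooperating

    // Hypothetical problem: sum the integers 1..N, split into `size` parts.
    const long N = 1000000;
    long local = 0;
    for (long i = rank + 1; i <= N; i += size)
        local += i;                          // each process solves its own part

    // Communicate over the network to combine the partial answers on rank 0.
    long total = 0;
    MPI_Reduce(&local, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        std::printf("total = %ld\n", total); // the processes acted like a single entity

    MPI_Finalize();
    return 0;
}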
Parallel Computer
The Flynn taxonomy of parallel machines classifies processors according to how many instruction and data streams they can handle simultaneously:
- Single or multiple instruction streams.
- Single or multiple data streams.
1. SISD machine: An ordinary serial computer.
At any given time, at most one instruction is being executed, and the
instruction affects at most one set of operands (data).
2. SIMD machine: An array processor. Several identical ALUs may process, for
example, a whole array at once; however, the same instruction must be performed
on all data items.
3. MISD machine: Several instructions operate simultaneously on each operand.
Generally unrealistic for parallel computers.
4. MIMD machine: Several complete processors connected via an interconnection
network to form a multiprocessor, so that they can cooperate during the
computation. The processors need not be identical, and an MIMD machine can
handle a greater variety of tasks than an array processor (see the sketch after
this list).
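As a rough illustration of the MIMD idea, the sketch below (plain C++ threads
standing in for separate processors; the task names are made up) runs two
entirely different instruction streams on different data at the same time,
something an SIMD array processor cannot do:

// mimd_sketch.cpp -- illustrative only: two independent instruction streams
// (different functions) operate on different data at the same time, which is
// the defining property of an MIMD machine.
#include <thread>
#include <vector>
#include <numeric>
#include <string>
#include <cstddef>
#include <cstdio>

// Task A: one "processor" sums a vector (its own instructions, its own data).
void sum_task(const std::vector<int>& v) {
    long s = std::accumulate(v.begin(), v.end(), 0L);
    std::printf("sum = %ld\n", s);
}

// Task B: another "processor" scans a string (completely different work).
void search_task(const std::string& text, char c) {
    std::size_t n = 0;
    for (char x : text) if (x == c) ++n;
    std::printf("found '%c' %zu times\n", c, n);
}

int main() {
    std::vector<int> data(1000);
    std::iota(data.begin(), data.end(), 1);
    std::string text = "multiple instruction, multiple data";

    std::thread t1(sum_task, data);          // instruction stream 1
    std::thread t2(search_task, text, 'm');  // instruction stream 2

    t1.join();
    t2.join();   // in SIMD, by contrast, every unit would run the SAME instruction
    return 0;
}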
Threads Programming
Threads are a relatively lightweight way to implement multiple paths
of execution inside an application. At the system level, programs run side
by side, with the system doling out execution time to each program based on its
needs and the needs of other programs. Inside each program, however, exists one
or more threads of execution, which can be used to perform different tasks simultaneously
or in a nearly simultaneous manner. The system itself actually manages these
threads of execution, scheduling them to run on the available cores and
preemptively interrupting them as needed to allow other threads to run.
From a technical standpoint, a thread is a combination of the
kernel-level and application-level data structures needed to manage the
execution of code. The kernel-level structures coordinate the dispatching of
events to the thread and the preemptive scheduling of the thread on one of the
available cores. The application-level structures include the call stack for
storing function calls and the structures the application needs to manage and
manipulate the thread’s attributes and state.
In a non-concurrent application, there is only one thread of
execution. That thread starts and ends with your application’s main routine and
branches one-by-one to different methods or functions to implement the
application’s overall behavior. By contrast, an application that supports
concurrency starts with one thread and adds more as needed to create additional
execution paths. Each new path has its own custom start routine that runs
independently of the code in the application’s main routine. Having multiple
threads in an application provides two very important potential advantages:
- Multiple threads can improve an application’s perceived responsiveness.
- Multiple threads can improve an application’s real-time performance on multicore systems.
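The sketch below illustrates this structure (using C++11 std::thread and a
made-up background task; a real application would typically use its platform's
threading API): the application starts with one thread in main, adds a second
execution path with its own start routine, and the main thread stays responsive
while the work runs:

// threads_sketch.cpp -- illustrative only: the application starts with one
// thread (main) and adds a second execution path with its own start routine.
#include <thread>
#include <atomic>
#include <chrono>
#include <cstdio>

std::atomic<bool> done{false};

// Custom start routine: runs independently of main's code on its own call stack.
void background_work() {
    long checksum = 0;
    for (long i = 0; i < 50000000; ++i)      // stand-in for a long-running task
        checksum += i % 7;
    std::printf("background result: %ld\n", checksum);
    done = true;
}

int main() {
    std::thread worker(background_work);     // second thread of execution begins here

    // Meanwhile, the main thread stays responsive (here it just reports progress).
    while (!done) {
        std::printf("main thread still responsive...\n");
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }

    worker.join();                           // wait for the extra execution path to finish
    return 0;
}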
CUDA GPU
CUDA (Compute Unified Device Architecture) is a parallel computing
platform and programming model created by NVIDIA and implemented by the
graphics processing units (GPUs) that they produce. CUDA gives program
developers direct access to the virtual instruction set and memory of the
parallel computational elements in CUDA GPUs.
Using CUDA, the GPUs can be used for general purpose processing (i.e.,
not exclusively graphics); this approach is known as GPGPU. Unlike CPUs,
however, GPUs have a parallel throughput architecture that emphasizes executing
many concurrent threads slowly, rather than executing a single thread very
quickly. Here is one example:
Analyze air traffic flow: The National Airspace System manages the
nationwide coordination of air traffic flow. Computer models help identify new
ways to alleviate congestion and keep airplane traffic moving efficiently.
Using the computational power of GPUs, a team at NASA obtained a large
performance gain, reducing analysis time from ten minutes to three seconds.
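The short CUDA sketch below (an assumed vector-addition kernel, not one of the
applications above) shows the programming model in its simplest form: the same
kernel function is executed by many lightweight GPU threads at once, each
handling a single array element:

// vector_add.cu -- illustrative sketch only: compile with `nvcc vector_add.cu`.
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: every GPU thread executes this same function on a different element.
__global__ void vector_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;                       // one million elements
    size_t bytes = n * sizeof(float);

    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);                // unified memory visible to CPU and GPU
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // Launch many concurrent threads: 256 per block, enough blocks to cover n.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vector_add<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();                     // wait for the GPU to finish

    std::printf("c[0] = %f\n", c[0]);            // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}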
References:
http://www.mcs.anl.gov/~itf/dbpp/text/node7.html
http://www.techopedia.com/definition/7/distributed-computing-system
people.engr.ncsu.edu/efg/506/sum99/001/lec1-intro.pdf
https://developer.apple.com/library/mac/Documentation/Cocoa/Conceptual/Multithreading/AboutThreads/AboutThreads.html
http://en.wikipedia.org/wiki/CUDA
http://www.nvidia.com/object/cuda_home_new.html