
Thursday, June 19, 2014

Parallel Processing

Parallelism Concept
A parallel computer is a set of processors that are able to work cooperatively to solve a computational problem. This definition is broad enough to include parallel supercomputers that have hundreds or thousands of processors, networks of workstations, multiple-processor workstations, and embedded systems. Parallel computers are interesting because they offer the potential to concentrate computational resources (whether processors, memory, or I/O bandwidth) on important computational problems.
Parallelism has sometimes been viewed as a rare and exotic subarea of computing, interesting but of little relevance to the average programmer. A study of trends in applications, computer architecture, and networking shows that this view is no longer tenable. Parallelism is becoming ubiquitous, and parallel programming is becoming central to the programming enterprise.

Distributed Processing
Distributed computing is a computing concept that, in its most general sense, refers to multiple computer systems working on a single problem. In distributed computing, a single problem is divided into many parts, and each part is solved by different computers. As long as the computers are networked, they can communicate with each other to solve the problem. If done properly, the computers perform like a single entity.
The ultimate goal of distributed computing is to maximize performance by connecting users and IT resources in a cost-effective, transparent, and reliable manner. It also provides fault tolerance, keeping resources accessible even when one of the components fails.
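As a rough illustration of these ideas, the sketch below splits one problem (summing the integers 1..N) among several processes that combine their partial results over the network. It assumes an MPI installation (e.g., MPICH or Open MPI); MPI is not mentioned in the text above and is used here only as a representative message-passing library.

// distributed_sum.c -- illustrative sketch only: divide one problem
// (summing the integers 1..N) among several networked processes.
// Typical build/run:  mpicc distributed_sum.c -o distributed_sum
//                     mpirun -np 4 ./distributed_sum
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which part am I? */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many parts in total? */

    const long long N = 1000000;            /* the whole problem: sum 1..N */

    /* Each process solves only its own slice of the problem. */
    long long local = 0;
    for (long long i = rank + 1; i <= N; i += size)
        local += i;

    /* The processes communicate to combine their partial results. */
    long long total = 0;
    MPI_Reduce(&local, &total, 1, MPI_LONG_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum(1..%lld) = %lld\n", N, total);

    MPI_Finalize();
    return 0;
}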

Parallel Computer
Flynn's taxonomy classifies parallel machines according to how many instruction streams and data streams they can handle simultaneously:
  • Single or multiple instruction streams.
  • Single or multiple data streams.
1. SISD machine: An ordinary serial computer.
At any given time, at most one instruction is being executed, and the instruction affects at most one set of operands (data).

2. SIMD machine: A typical example is an array processor.
Several identical ALUs can process, for example, a whole array at once; however, the same instruction must be performed on all data items (see the sketch after this list).

3. MISD machine: Several instructions operate simultaneously on each operand.
Generally unrealistic for parallel computers!

4. MIMD machine: Several complete processors connected together to form a multiprocessor.
The processors are connected via an interconnection network that provides a means of cooperating during the computation. The processors need not be identical, so a MIMD machine can handle a greater variety of tasks than an array processor.
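To make the SISD/SIMD distinction concrete, here is a small sketch (not taken from the referenced lecture notes; it assumes an x86 CPU with SSE and a compiler that provides <immintrin.h>) that adds two arrays first one element at a time, SISD style, and then four elements per instruction, SIMD style.

// sisd_vs_simd.cpp -- illustrative sketch: the same array addition done
// one element at a time (SISD) and four elements per instruction (SIMD).
// Assumes an x86 CPU with SSE and a compiler providing <immintrin.h>.
#include <immintrin.h>
#include <cstdio>

int main() {
    alignas(16) float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    alignas(16) float b[8] = {10, 20, 30, 40, 50, 60, 70, 80};
    alignas(16) float c[8], d[8];

    // SISD: one instruction stream, one data item per operation.
    for (int i = 0; i < 8; ++i)
        c[i] = a[i] + b[i];

    // SIMD: still one instruction stream, but each add instruction
    // operates on four floats at once (the same operation on all
    // data items in the vector register).
    for (int i = 0; i < 8; i += 4) {
        __m128 va = _mm_load_ps(&a[i]);
        __m128 vb = _mm_load_ps(&b[i]);
        __m128 vc = _mm_add_ps(va, vb);
        _mm_store_ps(&d[i], vc);
    }

    for (int i = 0; i < 8; ++i)
        std::printf("%g %g\n", c[i], d[i]);
    return 0;
}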

Threads Programming
Threads are a relatively lightweight way to implement multiple paths of execution inside an application. At the system level, programs run side by side, with the system doling out execution time to each program based on its needs and the needs of other programs. Inside each program, however, exist one or more threads of execution, which can be used to perform different tasks simultaneously or in a nearly simultaneous manner. The system itself actually manages these threads of execution, scheduling them to run on the available cores and preemptively interrupting them as needed to allow other threads to run.
From a technical standpoint, a thread is a combination of the kernel-level and application-level data structures needed to manage the execution of code. The kernel-level structures coordinate the dispatching of events to the thread and the preemptive scheduling of the thread on one of the available cores. The application-level structures include the call stack for storing function calls and the structures the application needs to manage and manipulate the thread’s attributes and state.
In a non-concurrent application, there is only one thread of execution. That thread starts and ends with your application’s main routine and branches one-by-one to different methods or functions to implement the application’s overall behavior. By contrast, an application that supports concurrency starts with one thread and adds more as needed to create additional execution paths. Each new path has its own custom start routine that runs independently of the code in the application’s main routine. Having multiple threads in an application provides two very important potential advantages:
  • Multiple threads can improve an application’s perceived responsiveness.
  • Multiple threads can improve an application’s real-time performance on multicore systems.
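As a minimal sketch of these ideas, the example below uses standard C++ std::thread (rather than the Cocoa API described in the Apple reference). The names background_work and the data used are illustrative only: a second thread with its own start routine runs independently of the code in main, and main waits for it to finish.

// two_threads.cpp -- illustrative sketch: an extra execution path with its
// own start routine, running alongside the application's main routine.
// Uses standard C++ threads rather than any platform-specific API.
#include <thread>
#include <functional>
#include <vector>
#include <numeric>
#include <cstdio>

// Start routine for the secondary thread: it does its work independently
// of whatever main() is doing at the same time.
void background_work(const std::vector<int>& data, long long& result) {
    result = std::accumulate(data.begin(), data.end(), 0LL);
}

int main() {
    std::vector<int> data(1000000, 1);
    long long sum = 0;

    // Create a second thread of execution; the system schedules it on an
    // available core, preempting it as needed so other threads can run.
    std::thread worker(background_work, std::cref(data), std::ref(sum));

    // Meanwhile, the main thread stays free to do other work, which is
    // what keeps an application responsive.
    std::printf("main thread continues while the worker runs...\n");

    worker.join();   // wait for the secondary execution path to finish
    std::printf("worker computed sum = %lld\n", sum);
    return 0;
}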

CUDA GPU

CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that they produce. CUDA gives program developers direct access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs.

Using CUDA, GPUs can be used for general-purpose processing (i.e., not exclusively graphics); this approach is known as GPGPU. Unlike CPUs, GPUs have a parallel throughput architecture that emphasizes executing many concurrent threads slowly rather than executing a single thread very quickly. Here is one example:
Analyze air traffic flow: The National Airspace System manages the nationwide coordination of air traffic flow. Computer models help identify new ways to alleviate congestion and keep airplane traffic moving efficiently. Using the computational power of GPUs, a team at NASA obtained a large performance gain, reducing analysis time from ten minutes to three seconds.
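A minimal CUDA sketch of this throughput-oriented model (an illustrative example, not taken from the references above): the kernel below launches one lightweight thread per array element, so many slow threads together process the whole array at once. File and variable names are my own.

// saxpy.cu -- illustrative CUDA sketch: many concurrent threads, each
// handling one array element (compile with: nvcc saxpy.cu -o saxpy).
#include <cstdio>

// Kernel: runs on the GPU; every thread computes one element of y.
__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                       // guard against extra threads
        y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    // Allocate and fill host arrays.
    float *x = (float*)malloc(bytes), *y = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    // Allocate device memory and copy the inputs to the GPU.
    float *dx, *dy;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, x, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, y, bytes, cudaMemcpyHostToDevice);

    // Launch enough blocks of 256 threads to cover all n elements.
    int blocks = (n + 255) / 256;
    saxpy<<<blocks, 256>>>(n, 2.0f, dx, dy);

    // Copy the result back and check one value (2*1 + 2 = 4).
    cudaMemcpy(y, dy, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", y[0]);

    cudaFree(dx); cudaFree(dy); free(x); free(y);
    return 0;
}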

References:
http://www.mcs.anl.gov/~itf/dbpp/text/node7.html
http://www.techopedia.com/definition/7/distributed-computing-system
people.engr.ncsu.edu/efg/506/sum99/001/lec1-intro.pdf
https://developer.apple.com/library/mac/Documentation/Cocoa/Conceptual/Multithreading/AboutThreads/AboutThreads.html
http://en.wikipedia.org/wiki/CUDA
http://www.nvidia.com/object/cuda_home_new.html