
HIGH PERFORMANCE CLUSTER COMPUTING


#The Need for Alternative Supercomputing Resources:
-Cannot afford to buy “Big Iron” machines
-due to their high cost and short life span
-funding cut-backs
-“Big Iron” machines do not “fit” well into today’s funding model

#Paradox:
-the time required to develop a parallel application for solving a Grand Challenge Application (GCA) is roughly equal to:
-the half-life of parallel supercomputers

#Clusters are the best alternative:

-Supercomputing-class commodity components are available
-They “fit” very well with today’s/future funding model
-Can leverage future technological advances
-VLSI, CPUs, networks, disks, memory, cache, OS, programming tools, applications, ...

#There are 3 ways to improve performance:

-Use faster hardware
-Optimize the algorithms and techniques used to solve computational tasks
-Use multiple computers to solve a particular task

#Rapid technical advances:
-the recent advances in VLSI technology
-advances in software technology
-OS, programming languages, development methodologies, & tools
-grand challenge applications have become the main driving force

#Parallel computing:

-one of the best ways to overcome the speed bottleneck of a single processor
-a small cluster-based parallel computer offers a good price/performance ratio

#Scalable Parallel Computer Architectures:

-Massively Parallel Processors (MPP)
-Symmetric Multiprocessors (SMP)
-Cache-Coherent Nonuniform Memory Access (CC-NUMA)
-Distributed Systems
-Clusters
#MPP:
-a large parallel processing system with a shared-nothing architecture
-consists of several hundred nodes with a high-speed interconnection network/switch
-each node consists of main memory & one or more processors
-each node runs a separate copy of the OS

#SMP:

-2-64 processors today
-shared-everything architecture
-all processors share all the global resources available
-a single copy of the OS runs on these systems (a minimal shared-memory sketch follows this list)

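To make the shared-everything idea concrete, here is a minimal sketch (not part of the original notes) of an SMP-style computation using POSIX threads in C: every thread of one process reads the same global array through the single shared address space. The thread count, array size, and names are arbitrary illustrative choices.

/* Minimal sketch of shared-everything parallelism on an SMP:
 * threads of a single process sum disjoint parts of one shared array.
 * NTHREADS and N are arbitrary illustrative values. */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define N 1000000

static double data[N];              /* visible to every thread (shared memory) */
static double partial[NTHREADS];    /* one result slot per thread */

static void *worker(void *arg)
{
    long id = (long)arg;
    long lo = id * (N / NTHREADS);
    long hi = (id == NTHREADS - 1) ? N : lo + N / NTHREADS;
    double s = 0.0;
    for (long i = lo; i < hi; i++)
        s += data[i];
    partial[id] = s;                /* each thread writes only its own slot */
    return NULL;
}

int main(void)
{
    pthread_t tid[NTHREADS];
    for (long i = 0; i < N; i++)
        data[i] = 1.0;
    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&tid[i], NULL, worker, (void *)i);
    double total = 0.0;
    for (long i = 0; i < NTHREADS; i++) {
        pthread_join(tid[i], NULL);
        total += partial[i];
    }
    printf("sum = %f\n", total);    /* expected: 1000000.000000 */
    return 0;
}

Compiled with something like cc -pthread, the threads need no explicit data movement on an SMP, which is exactly the contrast with the shared-nothing MPP and cluster models.
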
#CC-NUMA:
-a scalable multiprocessor system having a cache-coherent nonuniform memory access architecture
-every processor has a global view of all of the memory
#Distributed systems:

-conventional networks of independent computers
-have multiple system images, as each node runs its own OS
-the individual machines could be combinations of MPPs, SMPs, clusters, & individual computers

#Clusters:

-a collection of workstations or PCs that are interconnected by a high-speed network
-work as an integrated collection of resources
-have a single system image spanning all the nodes

#Towards Low Cost Parallel Computing:

#Parallel processing:
-linking together 2 or more computers to jointly solve a computational problem
-since the early 1990s, an increasing trend to move away from expensive and specialized proprietary parallel supercomputers towards networks of workstations
-driven by the rapid improvement in the availability of commodity high-performance components for workstations and networks
-low-cost commodity supercomputing
-a shift from specialized traditional supercomputing platforms to cheaper, general-purpose systems consisting of loosely coupled components built from single- or multiprocessor PCs or workstations
-needs standardization of many of the tools and utilities used by parallel applications, e.g. MPI and HPF (a minimal MPI sketch follows this list)
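
To show what programming against one of these standards looks like, below is a minimal MPI sketch in C (an illustration, not from the original notes): each process, typically one per cluster node or CPU, contributes a value and rank 0 prints the global sum. The per-rank value simply stands in for real local work.

/* Minimal MPI sketch: every process contributes one value and
 * rank 0 prints the global sum gathered with MPI_Reduce. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

    int local = rank + 1;     /* placeholder for the result of real local work */
    int global = 0;
    MPI_Reduce(&local, &global, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("%d processes, global sum = %d\n", size, global);

    MPI_Finalize();
    return 0;
}

With a typical MPICH or Open MPI installation this would be built with mpicc and launched with mpirun (e.g. mpirun -np 4), letting the same program run unchanged on a network of workstations or an MPP.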



#Cluster Computer and its Architecture:
-A cluster is a type of parallel or distributed processing system, which consists of a collection of interconnected stand-alone computers cooperatively working together as a single, integrated computing resource.
#A node:
-a single or multiprocessor system with memory, I/O facilities, & OS
-generally 2 or more such computers (nodes) are connected together
-in a single cabinet, or physically separated & connected via a LAN
-they appear as a single system to users and applications
-provide a cost-effective way to gain features and benefits

#Prominent Components of Cluster Computers:

A:

-Multiple High Performance Computers
-PCs
-Workstations
-SMPs (CLUMPS)
-Distributed HPC Systems leading to Metacomputing

B:


#State-of-the-art Operating Systems:

-OpenSolaris and Solaris
-Linux (Beowulf)
-Microsoft NT (Illinois HPVM)
-IBM AIX (IBM SP2)
-HP UX (Illinois - PANDA)
-Mach (microkernel-based OS) (CMU)
-Cluster operating systems (Solaris MC, SCO UnixWare, MOSIX (academic project))
-OS gluing layers (Berkeley GLUnix)


C:


#High Performance Networks/Switches:
-Ethernet (10 Mbps)
-Fast Ethernet (100 Mbps)
-Gigabit Ethernet (1 Gbps)
-SCI (Dolphin, ~12 microsecond MPI latency)
-ATM
-Myrinet (1.2 Gbps)
-Digital Memory Channel
-FDDI

#Components for Clusters:

#Solaris:

-UNIX-based multithreaded and multiuser OS
-supports Intel x86 & SPARC-based platforms
-real-time scheduling feature critical for multimedia applications
-supports two kinds of threads (see the threading sketch after this list):
-Light Weight Processes (LWPs)
-user-level threads
-supports both BSD and several non-BSD file systems:
-CacheFS
-AutoClient
-TmpFS: uses main memory to contain a file system
-Proc file system
-Volume file system
-supports distributed computing & is able to store & retrieve distributed information
-OpenWindows allows applications to be run on remote systems
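
To make the LWP vs. user-level thread distinction concrete, here is a minimal sketch (not from the original notes) using the classic Solaris threads API from libthread: an unbound thread is multiplexed by the library onto a pool of LWPs, while the THR_BOUND flag ties a thread permanently to its own LWP. Recent Solaris releases use a 1:1 model and also support POSIX threads, so treat this purely as an illustration of the two-level design.

/* Sketch of Solaris two-level threading with the native thr_create() API:
 * an unbound thread is scheduled by the threads library onto a pool of LWPs,
 * a THR_BOUND thread gets its own LWP (kernel-scheduled).
 * Build on Solaris with something like: cc threads.c -lthread */
#include <thread.h>
#include <stdio.h>

static void *work(void *arg)
{
    printf("running: %s\n", (const char *)arg);
    return arg;
}

int main(void)
{
    thread_t unbound, bound;

    /* unbound user-level thread: multiplexed over the process's LWPs */
    thr_create(NULL, 0, work, "unbound user-level thread", 0, &unbound);

    /* bound thread: permanently attached to its own LWP */
    thr_create(NULL, 0, work, "bound thread (THR_BOUND)", THR_BOUND, &bound);

    thr_join(unbound, NULL, NULL);
    thr_join(bound, NULL, NULL);
    return 0;
}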
