HIGH PERFORMANCE CLUSTER COMPUTING
#The Need for Alternative Supercomputing Resources:
-Cannot afford to buy “Big Iron” machines
-due to their high cost and short life span.
-cut-down of funding
-don’t “fit” well into today’s funding model.
#Paradox:
-the time required to develop a parallel application for solving a Grand Challenge Application (GCA) is roughly equal to:
-the half-life of parallel supercomputers.
#Clusters are the best alternative:
-Supercomputing-class commodity components are available
-They “fit” very well with today’s/future funding model.
-Can leverage upon future technological advances
-VLSI, CPUs, Networks, Disk, Memory, Cache, OS, programming tools, applications,...
#There are 3 ways to improve performance:
-Using faster hardware
-Optimized algorithms and techniques used to solve computational tasks
-Using multiple computers to solve a particular task.
#Rapid technical advances
-the recent advances in VLSI technology
-software technology
-OS, PL, development methodologies, & tools
-grand challenge applications have become the main driving force.
#Parallel computing
-one of the best ways to overcome the speed bottleneck of a single processor
-good price/performance ratio of a small cluster-based parallel computer.
#Scalable Parallel Computer Architectures:
-Massively Parallel Processors (MPP)
-Symmetric Multiprocessors (SMP)
-Cache-Coherent Nonuniform Memory Access (CC-NUMA)
-Distributed Systems
-Clusters.
#MPP
-A large parallel processing system with a shared-nothing architecture
-Consist of several hundred nodes with a high-speed interconnection network/switch
-Each node consists of a main memory & one or more processors
-Runs a separate copy of the OS.
#SMP
-2-64 processors today
-Shared-everything architecture
-All processors share all the global resources available
-Single copy of the OS runs on these systems.
#CC-NUMA
-a scalable multiprocessor system having a cache-coherent nonuniform memory access architecture
-every processor has a global view of all of the memory.
#Distributed systems
-considered conventional networks of independent computers
-have multiple system images as each node runs its own OS
-the individual machines could be combinations of MPPs, SMPs, clusters, & individual computers.
#Clusters
-a collection of workstations or PCs that are interconnected by a high-speed network
-work as an integrated collection of resources
-have a single system image spanning all its nodes.
#Towards Low Cost Parallel Computing:
#Parallel processing:
-linking together 2 or more computers to jointly solve some computational problem
-since the early 1990s, an increasing trend to move away from expensive and specialized proprietary parallel supercomputers towards networks of workstations
-the rapid improvement in the availability of commodity high performance components for workstations and networks
-Low-cost commodity supercomputing
- from specialized traditional supercomputing platforms to cheaper, general purpose systems consisting of loosely coupled components built up from single or multiprocessor PCs or workstations
-need for standardization of many of the tools and utilities used by parallel applications (e.g., MPI, HPF)
#Cluster Computer and its Architecture:
-A cluster is a type of parallel or distributed processing system, which consists of a collection of interconnected stand-alone computers cooperatively working together as a single, integrated computing resource.
#A node:
-a single or multiprocessor system with memory, I/O facilities, & OS
#A cluster:
-generally 2 or more computers (nodes) connected together
-in a single cabinet, or physically separated & connected via a LAN
-appears as a single system to users and applications
-provides a cost-effective way to gain features and benefits.
#Prominent Components of Cluster Computers:
A:
-Multiple High Performance Computers
-PCs
-Workstations
-SMPs (CLUMPS)
-Distributed HPC Systems leading to Metacomputing.
B:
#State of the art Operating Systems:
-Solaris and OpenSolaris
-Linux (Beowulf)
-Microsoft NT (Illinois HPVM)
-IBM AIX (IBM SP2)
-HP UX (Illinois - PANDA)
-Mach (Microkernel based OS) (CMU)
-Cluster Operating Systems (Solaris MC, SCO UnixWare, MOSIX (academic project))
-OS gluing layers (Berkeley Glunix)
C:
#High Performance Networks/Switches:
-Ethernet (10Mbps),
-Fast Ethernet (100Mbps),
-Gigabit Ethernet (1Gbps)
-SCI (Dolphin - MPI - 12 µs latency)
-ATM
-Myrinet (1.2Gbps)
-Digital Memory Channel
-FDDI
#Components for Clusters:
#Solaris
-UNIX-based multithreaded and multiuser OS
-supports Intel x86 & SPARC-based platforms
-Real-time scheduling feature critical for multimedia applications
-Supports two kinds of threads
-Light Weight Processes (LWPs)
-User level thread
-Supports both BSD and several non-BSD file systems
-CacheFS
-AutoClient
-TmpFS: uses main memory to contain a file system
-Proc file system
-Volume file system
-Supports distributed computing & is able to store & retrieve distributed information
-OpenWindows allows applications to be run on remote systems.