Up: Designing and Building Parallel Programs
- *Lisp
- Chapter Notes
- Abstract processors in HPF
- 7.3.1 Processors
- Actor model
- Chapter Notes
- Agglomeration
- 2.1 Methodical Design, (, )
- and granularity
- 2.4.1 Increasing Granularity, Surface-to-Volume Effects.
- design checklist
- 2.4.4 Agglomeration Design Checklist
- for atmosphere model
- Agglomeration.
- for floorplan optimization
- Agglomeration.
- for Fock matrix problem
- Communication and Agglomeration.
- in data-parallel model
- 7.1.3 Design
- AIMS performance tool
- 9.4.7 AIMS, Chapter Notes
- Amdahl's law
- application to HPF
- 7.7.2 Sequential Bottlenecks
- definition
- 3.2.1 Amdahl's Law, Chapter Notes
- Applied Parallel Research
- Chapter Notes
- ARPANET
- Chapter Notes
- Asymptotic analysis
- limitations of
- 3.2.3 Asymptotic Analysis
- reference
- Chapter Notes
- Asynchronous communication
- 2.3.4 Asynchronous Communication
- in CC++
- 5.6 Asynchronous Communication
- in FM
- 6.5 Asynchronous Communication
- in MPI
- 8.4 Asynchronous Communication
- Asynchronous Transfer Mode
- 1.2.2 Other Machine Models
- Atmosphere model
- basic equations
- 2.6.1 Atmosphere Model Background
- description
- (, )
- parallel algorithms
- (, )
- references
- Chapter Notes
- BBN Butterfly
- Chapter Notes
- Bisection bandwidth
- Exercises
- Bisection width
- Exercises, Chapter Notes
- Bitonic mergesort
- Chapter Notes
- Bottlenecks in HPF
- 7.7.2 Sequential Bottlenecks
- Branch-and-bound search
- description
- 2.7.1 Floorplan Background, Chapter Notes
- in MPI
- 8.1 The MPI Programming
- Breadth-first search
- Partition.
- Bridge construction problem
- definition
- 1.3.1 Tasks and Channels
- determinism
- 1.3.1 Tasks and Channels
- in CC++
- 5.2 CC++ Introduction
- in Fortran M
- 6.1 FM Introduction, 6.4.3 Dynamic Channel Structures
- in MPI
- 8.2 MPI Basics
- Bubblesort
- Exercises
- Bucketsort
- Chapter Notes
- Bus-based networks
- Bus-based Networks.
- Busy waiting strategy
- 6.5 Asynchronous Communication
- Butterfly
- bandwidth competition on
- Multistage Interconnection Networks.
- description
- Replicating Computation.
- hypercube formulation
- Hypercube Network.
- C*
- Chapter Notes, 7 High Performance Fortran, Chapter Notes
- C++
- Chapter Notes
- classes
- 5.1.2 Classes
- constructor functions
- 5.1.2 Classes
- default constructors
- 5.1.2 Classes
- inheritance
- 5.1.3 Inheritance
- member functions
- 5.1.2 Classes
- overloading
- 5.1.1 Strong Typing and
- protection
- 5.1.2 Classes
- virtual functions
- 5.1.3 Inheritance
- Cache effect
- 3.6.2 Speedup Anomalies
- Cache memory
- 1.2.2 Other Machine Models, Bus-based Networks.
- CC++
- Part II: Tools
- asynchronous communication
- 5.6 Asynchronous Communication
- basic abstractions
- 5.2 CC++ Introduction
- channel communication
- 5.5.2 Synchronization
- communication costs
- 5.10 Performance Issues
- communication structures
- 5.5 Communication
- compiler optimization
- 5.10 Performance Issues
- concurrency
- 5.3 Concurrency
- library building
- 5.11 Case Study: Channel
- locality
- 5.4 Locality
- mapping
- (, )
- modularity
- 5.9 Modularity
- nondeterministic interactions
- 5.7 Determinism
- sequential composition
- 5.9 Modularity
- synchronization mechanisms
- 5.5.2 Synchronization, 5.5.3 Mutual Exclusion
- threads
- 5.2 CC++ Introduction
- tutorial
- Chapter Notes
- unstructured parallelism
- 5.3 Concurrency
- CHAMMP climate modeling program
- Chapter Notes
- Channels
- 1.3.1 Tasks and Channels
- and data dependencies
- 1.3.1 Tasks and Channels
- connecting outport/inport pairs
- 1.3.1 Tasks and Channels
- creation in Fortran M
- 6.3.1 Creating Channels
- dynamic in Fortran M
- 6.4.3 Dynamic Channel Structures
- for argument passing in Fortran M
- 6.7 Argument Passing
- in communication
- 2.3 Communication
- in CSP
- Chapter Notes
- Checkpointing
- 3.8 Input/Output, Chapter Notes
- CHIMP
- Chapter Notes
- Classes in C++
- 5.1.2 Classes
- Climate modeling
- 1.1.1 Trends in Applications, 2.2.2 Functional Decomposition, 2.6 Case Study: Atmosphere , 9.4.1 Paragraph
- in CC++
- 5.8.2 Mapping Threads to
- in Fortran M
- 6.8.3 Submachines
- in MPI
- 8.8 Case Study: Earth
- Clock synchronization
- 9.3.2 Traces, Chapter Notes
- CM Fortran
- Chapter Notes
- Collaborative work environments
- 1.1.1 Trends in Applications
- Collective communication
- 8.3 Global Operations, 9.4.2 Upshot
- Collocation of arrays
- 7.3.2 Alignment
- Combining scatter
- 7.6.3 HPF Features Not
- Communicating Sequential Processes
- Chapter Notes
- Communication
- (, )
- and channels
- 2.3 Communication
- collective
- 8.1 The MPI Programming , 8.3 Global Operations
- design checklist
- 2.3.5 Communication Design Checklist
- disadvantages of local
- 2.3.2 Global Communication
- for atmosphere model
- Communication.
- for floorplan optimization
- Communication.
- for Fock matrix problem
- Communication and Agglomeration.
- in CC++
- 5.5 Communication
- in data-parallel model
- 7.1.3 Design
- in Fortran M
- 6.3 Communication
- in MPI
- 8.1 The MPI Programming
- synchronous
- 6.4.3 Dynamic Channel Structures, 8.6.2 MPI Features Not
- Communication costs
- Communication Time.
- bandwidth competition
- 3.7 A Refined Communication
- in CC++
- 5.10 Performance Issues
- in HPF
- 7.7.3 Communication Costs
- in MPI
- 8.7 Performance Issues
- of unaligned array mapping
- 7.7.3 Communication Costs
- with cyclic distribution
- 7.7.3 Communication Costs
- Communication patterns
- 2.3 Communication
- asynchronous
- 2.3 Communication, 2.3.4 Asynchronous Communication, 6.5 Asynchronous Communication, 7.6.2 Storage and Sequence , 8.4 Asynchronous Communication
- dynamic
- 2.3 Communication, 2.3.3 Unstructured and Dynamic
- local
- 2.3 Communication
- many-to-many
- 6.4.2 Many-to-Many Communication
- many-to-one
- 1.4.4 Parameter Study, 6.4.1 Many-to-One Communication
- point-to-point
- 8.1 The MPI Programming
- static
- 2.3 Communication
- structured
- 2.3 Communication
- synchronous
- 2.3 Communication, 6.4.3 Dynamic Channel Structures
- unstructured
- 2.3.3 Unstructured and Dynamic , 6.4 Unstructured Communication
- Communication time
- Communication Time.
- Communication/computation ratio
- Surface-to-Volume Effects.
- Communicators
- see MPI
- Competition for bandwidth
- examples
- Multistage Interconnection Networks.
- idealized model of
- 3.7.1 Competition for Bandwidth
- impact
- 3.7 A Refined Communication
- Compilers
- data-parallel
- 7.1.3 Design, 7.7.1 HPF Compilation
- for CC++
- 5.10 Performance Issues
- for Fortran M
- 6.10 Performance Issues
- for HPF
- 7.7.1 HPF Compilation, Chapter Notes
- Composition
- concurrent
- 4.2 Modularity and Parallel , 4.2.4 Concurrent Composition
- definition
- 4 Putting Components Together
- parallel
- 4.2 Modularity and Parallel
- sequential
- 4.2 Modularity and Parallel , 4.2.2 Sequential Composition
- Compositional C++
- see CC++
- Computation time
- Computation Time.
- Computational chemistry
- 2.8 Case Study: Computational , Chapter Notes
- Computational geometry
- 12 Further Reading
- Computer architecture
- 1.2.2 Other Machine Models, 3.7.2 Interconnection Networks
- references
- Chapter Notes, Chapter Notes, 12 Further Reading
- trends
- 1.1.4 Summary of Trends
- Computer performance improvement
- 1.1.2 Trends in Computer
- Computer trends
- 1.1.4 Summary of Trends
- Computer vision
- Chapter Notes, 12 Further Reading
- Computer-aided diagnosis
- 1.1.1 Trends in Applications
- Concert C
- Chapter Notes
- Concurrency
- explicit vs. implicit
- 7.1.1 Concurrency
- in CC++
- 5.3 Concurrency
- in data-parallel programs
- 7.1.1 Concurrency
- in Fortran M
- 6.2 Concurrency
- parallel software requirement
- 1.1.2 Trends in Computer
- Concurrent C
- Chapter Notes
- Concurrent composition
- 4.2 Modularity and Parallel , 4.2.4 Concurrent Composition
- benefits
- 4.2.4 Concurrent Composition
- cost
- 4.2.4 Concurrent Composition
- example
- 4.2.4 Concurrent Composition
- in CC++
- 5.8.2 Mapping Threads to
- in Fortran M
- 6.8.3 Submachines
- tuple space example
- 4.5 Case Study: Tuple
- Concurrent Computation Project
- 12 Further Reading
- Concurrent data structures
- 12 Further Reading
- Concurrent logic programming
- Chapter Notes
- Conferences in parallel computing
- 12 Further Reading
- Conformality
- definition
- 7.1.1 Concurrency
- in Fortran M
- 6.3.1 Creating Channels
- of array sections
- 7.2.1 Array Assignment Statement
- Constructor functions in C++
- 5.1.2 Classes
- Convolution algorithm
- application in image processing
- 4.4 Case Study: Convolution
- components
- 4.4.1 Components
- parallel 2-D FFTs
- 4.4.1 Components
- parallel composition
- 4.4.2 Composing Components
- sequential composition
- 4.4.2 Composing Components
- COOL
- Chapter Notes
- Cosmic Cube
- Chapter Notes, Chapter Notes, Chapter Notes
- Counters
- 9.1 Performance Analysis, 9.2.2 Counters
- Cray T3D
- 1.2.2 Other Machine Models, Chapter Notes
- Crossbar switching network
- Crossbar Switching Network.
- Cycle time trends
- 1.1.2 Trends in Computer
- Cyclic mapping
- Cyclic Mappings., Mapping., Mapping., Chapter Notes
- in HPF
- 7.3.3 Distribution, 7.7.3 Communication Costs, 7.8 Case Study: Gaussian
- Data collection
- (, )
- basic techniques
- 9.1 Performance Analysis
- counters
- 9.2.2 Counters
- process
- 9.2.4 Summary of Data
- traces
- 9.2.3 Traces
- Data decomposition
- see Domain decomposition
- Data dependency
- 1.3.1 Tasks and Channels
- Data distribution
- at module boundaries
- 4.2.1 Data Distribution
- dynamic
- 7.6.3 HPF Features Not
- in data-parallel languages
- 7.1.2 Locality
- in HPF
- (, )
- Data distribution neutrality
- benefits
- 4.2.1 Data Distribution
- example
- (, )
- in ScaLAPACK
- 4.2.2 Sequential Composition
- in SPMD libraries
- Chapter Notes
- Data fitting
- 3.5.3 Fitting Data to
- Data parallelism
- 1.3.2 Other Programming Models, 7 High Performance Fortran
- and Fortran 90
- 7.1.4 Data-Parallel Languages, 7.2.2 Array Intrinsic Functions
- and HPF
- 7.1.4 Data-Parallel Languages
- and modular design
- 7.1.3 Design
- and task parallelism
- Chapter Notes
- for irregular problems
- Chapter Notes
- languages
- 7.1.4 Data-Parallel Languages, 9.3.3 Data-Parallel Languages
- Data reduction
- 9.3.1 Profile and Counts, 9.3.2 Traces
- Data replication
- 3.9.3 Shortest-Path Algorithms Summary
- Data transformation
- 9.1 Performance Analysis
- Data visualization
- 9.1 Performance Analysis, 9.3.2 Traces
- Data-parallel C
- Chapter Notes, Chapter Notes
- Data-parallel languages
- 7.1.4 Data-Parallel Languages, 9.3.3 Data-Parallel Languages
- Data-parallel model
- 1.3.2 Other Programming Models, 7.1.3 Design
- Databases
- Chapter Notes, Chapter Notes, 4.5.1 Application
- Deadlock detection
- 12 Further Reading
- Decision support
- 1.1.1 Trends in Applications
- Dense matrix algorithms
- 12 Further Reading
- Depth-first search
- Agglomeration.
- Design checklists
- agglomeration
- 2.4.4 Agglomeration Design Checklist
- communication
- 2.3.5 Communication Design Checklist
- mapping
- 2.5.3 Mapping Design Checklist
- modular design
- Design checklist.
- partitioning
- 2.2.3 Partitioning Design Checklist
- Determinism
- 1.3.1 Tasks and Channels
- advantages
- 1.3.1 Tasks and Channels, Chapter Notes
- in CC++
- 5.7 Determinism
- in Fortran M
- 6.6 Determinism
- in MPI
- 8.2.2 Determinism
- Diagonalization
- Exercises, 9.4.2 Upshot
- Diameter of network
- 3.7.1 Competition for Bandwidth
- Dijkstra's algorithm
- 3.9.2 Dijkstra's Algorithm, 3.9.3 Shortest-Path Algorithms Summary
- DINO
- Chapter Notes
- DISCO
- Communication and Agglomeration.
- Distributed computing
- 1.1.3 Trends in Networking
- Distributed data structures
- Fock matrix
- 2.8 Case Study: Computational
- for load balancing
- Decentralized Schemes.
- implementation
- (, )
- in CC++
- 5.12 Case Study: Fock
- in Fortran M
- 6.11 Case Study: Fock
- in MPI
- 8.4 Asynchronous Communication
- tuple space
- 4.5 Case Study: Tuple
- Divide-and-conquer
- Uncovering Concurrency: Divide
- Domain decomposition
- 2.2 Partitioning, 2.2.1 Domain Decomposition
- communication requirements
- 2.3 Communication
- for atmosphere model
- 2.6 Case Study: Atmosphere
- for Fock matrix problem
- Partition.
- Efficiency
- 3.3.2 Efficiency and Speedup
- Embarrassingly parallel problems
- 1.4.4 Parameter Study
- Entertainment industry
- 1.1.1 Trends in Applications
- Environmental enquiry, in MPI
- 8.6.2 MPI Features Not
- Ethernet
- 1.2.2 Other Machine Models, Chapter Notes
- performance
- Communication Time., Ethernet., Multistage Interconnection Networks.
- Event traces
- 9.1 Performance Analysis, 9.3.2 Traces
- Execution profile
- 3.4.3 Execution Profiles, 3.6 Evaluating Implementations
- Execution time
- (, )
- as performance metric
- 3.3 Developing Models
- limitations of
- 3.3.2 Efficiency and Speedup
- Exhaustive search
- 2.7.1 Floorplan Background
- Experimental calibration
- 3.5.1 Experimental Design, 3.5.3 Fitting Data to
- Express
- Part II: Tools, 8 Message Passing Interface, Chapter Notes, Chapter Notes
- Fairness
- in CC++
- 5.10 Performance Issues
- in Fortran M
- 6.10 Performance Issues
- Fast Fourier transform
- 4.4 Case Study: Convolution
- in convolution
- (, )
- in HPF
- 7.4.2 The INDEPENDENT Directive
- performance
- Multistage Interconnection Networks.
- using hypercube
- Chapter Notes
- Fine-grained decomposition
- 2.2 Partitioning
- Finite difference algorithm
- computation cost
- 3.5.3 Fitting Data to
- efficiency
- 3.3.2 Efficiency and Speedup
- execution time
- Idle Time.
- in CC++
- 5.9 Modularity
- in Fortran 90
- 7.2.2 Array Intrinsic Functions
- in Fortran M
- 6.9 Modularity
- in HPF
- 7.3.3 Distribution
- in MPI
- 8.3.3 Reduction Operations
- isoefficiency analysis
- 3.4.2 Scalability with Scaled
- Finite element method
- 2.3.3 Unstructured and Dynamic
- Fixed problem analysis
- 3.4.1 Scalability with Fixed
- Floorplan optimization problem
- description
- (, )
- parallel algorithms
- (, )
- Floyd's algorithm
- (, ), (, )
- Fock matrix problem
- algorithms for
- Chapter Notes
- description
- (, )
- in CC++
- 5.12 Case Study: Fock
- in Fortran M
- 6.11 Case Study: Fock
- in MPI
- 8.4 Asynchronous Communication, 8.6.1 Derived Datatypes
- performance
- 9.4.2 Upshot
- Fortran 90
- array assignment
- 7.2.1 Array Assignment Statement, 7.4 Concurrency
- array intrinsics
- 7.2.2 Array Intrinsic Functions
- as basis for HPF
- 7.1.4 Data-Parallel Languages
- conformality
- 7.1.1 Concurrency, 7.2.1 Array Assignment Statement
- CSHIFT function
- 7.2.2 Array Intrinsic Functions
- explicit parallelism in
- 7.1.1 Concurrency
- finite difference problem
- 7.2.2 Array Intrinsic Functions
- implicit parallelism in
- 7.1.1 Concurrency
- inquiry functions
- 7.6.1 System Inquiry Intrinsic
- limitations as data-parallel language
- 7.2.2 Array Intrinsic Functions
- SIZE function
- 7.6.1 System Inquiry Intrinsic
- transformational functions
- 7.2.2 Array Intrinsic Functions
- WHERE
- 7.2.1 Array Assignment Statement
- Fortran D
- Chapter Notes
- Fortran M
- Part II: Tools, (, )
- and SPMD computations
- 6.9 Modularity
- argument passing
- 6.7 Argument Passing
- busy waiting strategy
- 6.5 Asynchronous Communication
- communication
- 6.3 Communication
- compiler optimization
- 6.10 Performance Issues
- concurrency
- 6.2 Concurrency
- conformality
- 6.3.1 Creating Channels
- determinism
- 6.6 Determinism, 6.7.1 Copying and Determinism
- distribution of data
- 6.5 Asynchronous Communication
- list of extensions
- 6.1 FM Introduction
- mapping
- (, )
- message passing
- 6.9 Modularity
- modularity
- 6.1 FM Introduction
- performance analysis
- 6.10 Performance Issues
- port variables
- 6.2.1 Defining Processes
- process creation
- 6.2.2 Creating Processes
- quick reference
- 6.12 Summary
- sequential composition
- 6.9 Modularity
- tree-structured computation
- 6.3.3 Receiving Messages
- Fujitsu VPP 500
- Crossbar Switching Network.
- Functional decomposition
- (, )
- appropriateness
- 2.2.2 Functional Decomposition
- communication requirements
- 2.3 Communication
- complement to domain decomposition
- 2.2.2 Functional Decomposition
- design complexity reduced by
- 2.2.2 Functional Decomposition
- for climate model
- 2.2.2 Functional Decomposition
- for Fock matrix problem
- Partition.
- Functional programming
- Chapter Notes, 12 Further Reading
- Gantt chart
- 9.3.2 Traces, 9.4.1 Paragraph, 9.4.2 Upshot
- Gauge performance tool
- 9.4.4 Gauge, Chapter Notes
- Gauss-Seidel update
- 2.3.1 Local Communication
- Gaussian elimination
- 7.8 Case Study: Gaussian , 9.3.3 Data-Parallel Languages
- Genetic sequences
- 4.5.1 Application
- GIGAswitch
- Crossbar Switching Network.
- Global communication
- 2.3.2 Global Communication
- Grand Challenge problems
- Chapter Notes
- Granularity
- 2.2 Partitioning
- agglomeration used to increase
- 2.4 Agglomeration
- flexibility related to
- 2.2 Partitioning
- of modular programs
- 4.3 Performance Analysis
- Handles in MPI
- 8.2.1 Language Bindings
- Hash tables
- 4.5.2 Implementation
- High Performance Fortran
- see HPF
- Histograms
- 9.3.1 Profile and Counts
- HPF
- Part II: Tools, (, )
- abstract processors
- 7.3.1 Processors
- advantages
- 7.9 Summary
- collocation of arrays
- 7.3.2 Alignment
- compilation
- 7.7.1 HPF Compilation
- data distribution
- (, )
- extrinsic functions
- 7.6.3 HPF Features Not
- language specification
- Chapter Notes
- mapping inquiry functions
- 7.6.3 HPF Features Not
- modularity
- 7.5 Dummy Arguments and
- pure functions
- 7.6.3 HPF Features Not
- remapping of arguments
- Strategy 1: Remap
- sequence association
- 7.6.2 Storage and Sequence
- storage association
- 7.6.2 Storage and Sequence
- subset (official)
- 7.1.4 Data-Parallel Languages
- system inquiry functions
- 7.6.1 System Inquiry Intrinsic
- Hypercube algorithms
- (, )
- all-to-all communication
- 11 Hypercube Algorithms
- matrix transposition
- 11.3 Matrix Transposition
- parallel mergesort
- 11.4 Mergesort
- template for
- 11 Hypercube Algorithms
- vector broadcast
- 11.2 Vector Reduction
- vector reduction
- 11.2 Vector Reduction
- Hypercube network
- Hypercube Network.
- I/O, parallel
- applications requiring
- 3.8 Input/Output, Chapter Notes
- performance issues
- 3.8 Input/Output
- two-phase strategy
- 3.8 Input/Output, Chapter Notes
- IBM RP3
- Chapter Notes
- IBM SP
- Chapter Notes
- Idle time
- Idle Time., 4.3 Performance Analysis
- Image processing
- Exercises, 4.4 Case Study: Convolution
- Immersive virtual environments
- 9.4.3 Pablo
- Incremental parallelization
- 3.2.1 Amdahl's Law
- Information hiding
- Ensure that modules
- Inheritance in C++
- 5.1.3 Inheritance
- Intel DELTA
- 3.6.2 Speedup Anomalies, Multistage Interconnection Networks., Chapter Notes
- Intel iPSC
- Chapter Notes, Chapter Notes
- Intel Paragon
- 1.2.2 Other Machine Models, Chapter Notes, 9.4.5 ParAide
- Intent declarations
- 6.7.2 Avoiding Copying
- Interconnection Networks
- see Networks
- IPS-2 performance tool
- Chapter Notes
- Isoefficiency
- 3.4.2 Scalability with Scaled , Chapter Notes
- J machine
- Chapter Notes
- Jacobi update
- 2.3.1 Local Communication
- Journals in parallel computing
- 12 Further Reading
- Kali
- Chapter Notes
- Latency
- 3.1 Defining Performance
- Leapfrog method
- 10.3.2 The Leapfrog Method, 10.3.3 Modified Leapfrog
- Least-squares fit
- 3.5.3 Fitting Data to
- scaled
- 3.5.3 Fitting Data to
- simple
- 3.5.3 Fitting Data to
- Linda
- Chapter Notes
- and tuple space
- 4.5 Case Study: Tuple , Chapter Notes
- types of parallelism with
- Chapter Notes
- Load balancing
- cyclic methods
- Cyclic Mappings.
- dynamic methods
- 2.5 Mapping
- local methods
- 2.5 Mapping, Local Algorithms.
- manager/worker method
- Manager/Worker.
- probabilistic methods
- 2.5 Mapping, Probabilistic Methods.
- recursive bisection methods
- Recursive Bisection.
- Local area network
- 1.2.2 Other Machine Models
- Local communication
- definition
- 2.3.1 Local Communication
- finite difference example
- (, )
- Locality
- and task abstraction
- 1.3.1 Tasks and Channels
- definition
- 1.2.1 The Multicomputer
- in CC++
- 5.4 Locality
- in data-parallel programs
- 7.1.2 Locality, 7.8 Case Study: Gaussian
- in multicomputers
- 1.2.1 The Multicomputer
- in PRAM model
- 1.2.2 Other Machine Models
- Locks
- 1.3.2 Other Programming Models
- Machine parameters
- Communication Time.
- Mapping
- 2.1 Methodical Design, (, )
- design rules
- 2.5.3 Mapping Design Checklist
- in CC++
- 5.8 Mapping, 5.8.2 Mapping Threads to
- in data-parallel model
- 7.1.3 Design
- in Fortran M
- 6.8 Mapping
- Mapping independence
- 1.3.1 Tasks and Channels
- MasPar MP
- 1.2.2 Other Machine Models
- Matrix multiplication
- (, )
- 1-D decomposition
- 4.6.1 Parallel Matrix-Matrix Multiplication
- 2-D decomposition
- 4.6.1 Parallel Matrix-Matrix Multiplication
- and data distribution neutral libraries
- 4.6 Case Study: Matrix
- communication cost
- 4.6.2 Redistribution Costs
- communication structure
- 4.6.1 Parallel Matrix-Matrix Multiplication
- systolic communication
- 4.6.3 A Systolic Algorithm
- Matrix transpose
- see Transpose
- Meiko CS-2
- 1.2.2 Other Machine Models
- Member functions in C++
- 5.1.2 Classes
- Mentat
- Chapter Notes
- Mergesort
- (, )
- parallel
- Compare-Exchange.
- parallel algorithms
- (, )
- performance
- Performance
- references
- Chapter Notes
- sequential algorithm
- 11.4 Mergesort
- Mesh networks
- Mesh Networks.
- Message Passing Interface
- see MPI
- Message-passing model
- description
- Chapter Notes
- in HPF
- 7.7 Performance Issues
- task/channel model comparison
- 1.3.2 Other Programming Models
- MIMD computers
- 1.2.2 Other Machine Models
- Modular design
- and parallel computing
- 1.3 A Parallel Programming , (, )
- design checklist
- Design checklist.
- in CC++
- 5.9 Modularity
- in Fortran M
- 6.9 Modularity
- in HPF
- 7.1.3 Design, 7.5 Dummy Arguments and
- in MPI
- 8.5 Modularity
- in task/channel model
- 1.3.1 Tasks and Channels
- performance analysis
- 4.3 Performance Analysis
- principles
- (, )
- Monte Carlo methods
- Chapter Notes
- MPI
- Part II: Tools, (, )
- basic functions
- 8.2 MPI Basics
- C binding
- C Language Binding.
- collective communication functions
- (, )
- communicators
- 8.5 Modularity, 8.5.1 Creating Communicators
- derived datatypes
- 8.6.1 Derived Datatypes
- determinism
- 8.2.2 Determinism
- environmental enquiry
- 8.6.2 MPI Features Not
- Fortran binding
- Fortran Language Binding.
- handles
- 8.2.1 Language Bindings
- message tags
- 8.2.2 Determinism
- modularity
- (, )
- MPMD model
- 8.1 The MPI Programming
- performance issues
- 8.7 Performance Issues
- probe operations
- 8.4 Asynchronous Communication
- starting a computation
- 8.2 MPI Basics
- MPI Forum
- Chapter Notes
- MPMD model
- 8.1 The MPI Programming
- MPP Apprentice
- Chapter Notes
- Multicomputer model
- 1.2.1 The Multicomputer, 3.3 Developing Models
- and locality
- 1.2.1 The Multicomputer
- early examples
- Chapter Notes
- Multicomputer Toolbox
- Chapter Notes
- Multiprocessors
- 1.2.2 Other Machine Models
- Multistage networks
- Multistage Interconnection Networks.
- nCUBE
- 1.2.2 Other Machine Models, Chapter Notes, Chapter Notes
- NESL
- Chapter Notes
- Networks
- ATM
- 1.2.2 Other Machine Models
- bus-based
- Bus-based Networks.
- crossbar switch
- Crossbar Switching Network.
- Ethernet
- Ethernet.
- hypercube
- Hypercube Network.
- LAN
- 1.2.2 Other Machine Models
- shared memory
- Bus-based Networks.
- torus
- Mesh Networks.
- trends in
- 1.1.3 Trends in Networking
- WAN
- 1.2.2 Other Machine Models
- Nondeterminism
- from random numbers
- 3.5.2 Obtaining and Validating
- in Fortran M
- 6.6 Determinism
- in message-passing model
- 8.2.2 Determinism
- in MPI
- 8.2.2 Determinism
- in parameter study problem
- 1.4.4 Parameter Study
- Notation
- Terminology
- Numerical analysis
- 12 Further Reading
- Object-oriented model
- 1.3.1 Tasks and Channels
- Objective C
- Chapter Notes
- Out-of-core computation
- 3.8 Input/Output
- Overhead anomalies
- 3.6.1 Unaccounted-for Overhead
- Overlapping computation and communication
- 2.4.2 Preserving Flexibility, Idle Time.
- Overloading in C++
- 5.1.1 Strong Typing and
- Owner computes rule
- 7 High Performance Fortran, 7.1.1 Concurrency, 7.8 Case Study: Gaussian
- P++ library
- Chapter Notes
- p4
- Part II: Tools, 8 Message Passing Interface, Chapter Notes
- Pablo performance tool
- 9.4.3 Pablo, Chapter Notes
- Pairwise interactions
- (, )
- in Fortran M
- 6.3.3 Receiving Messages
- in HPF
- 7.3.3 Distribution
- in MPI
- Fortran Language Binding., 8.2.2 Determinism
- Paragraph performance tool
- 9.4.1 Paragraph, Chapter Notes
- ParAide performance tool
- 9.4.5 ParAide, Chapter Notes
- Parallel algorithm design
- bibliography
- 12 Further Reading
- and performance
- 3.10 Summary
- case studies
- (, )
- methodology
- 2.1 Methodical Design, 2.9 Summary
- Parallel algorithms
- branch and bound search
- 2.7.1 Floorplan Background
- convolution
- 4.4 Case Study: Convolution
- fast Fourier transform
- 4.4 Case Study: Convolution
- Gaussian elimination
- 7.8 Case Study: Gaussian , 9.3.3 Data-Parallel Languages
- matrix multiplication
- (, )
- mergesort
- 11.4 Mergesort
- parallel prefix
- 7.6.3 HPF Features Not
- parallel suffix
- 7.6.3 HPF Features Not
- quicksort
- Chapter Notes
- random number generation
- 10 Random Numbers
- reduction
- 2.3.2 Global Communication
- search
- Chapter Notes
- shortest paths
- 3.9 Case Study: Shortest-Path
- spectral transform
- Multistage Interconnection Networks.
- transpose
- Exercises, 11.3 Matrix Transposition
- vector reduction
- 11.1 The Hypercube Template
- Parallel composition
- 4.2 Modularity and Parallel , (, )
- in CC++
- 5.9 Modularity
- in convolution algorithm
- 4.4.2 Composing Components
- in Fortran M
- 6.1 FM Introduction
- in MPI
- 8.5 Modularity
- load imbalances due to
- 4.3 Performance Analysis
- task parallel approach
- 8.5.2 Partitioning Processes
- vs. SPMD model
- 4.2.3 Parallel Composition
- Parallel computers
- applications
- 1.1.1 Trends in Applications
- architecture
- (, ), (, )
- definition
- 1.1 Parallelism and Computing
- performance trends
- 1.1.2 Trends in Computer
- Parallel computing conferences
- 12 Further Reading
- Parallel computing journals
- 12 Further Reading
- Parallel database machines
- Chapter Notes
- Parallel I/O
- see I/O, parallel
- Parallel prefix
- 7.6.3 HPF Features Not
- Parallel programming models
- message passing
- 1.3.2 Other Programming Models
- data parallelism
- 1.3.2 Other Programming Models
- MPMD
- 8.1 The MPI Programming
- shared memory
- 1.3.2 Other Programming Models
- SPMD
- 1.3.2 Other Programming Models
- survey
- Chapter Notes
- task/channel
- 1.3.1 Tasks and Channels
- Parallel software requirements
- concurrency
- 1.1.2 Trends in Computer , 1.1.4 Summary of Trends
- locality
- 1.2.1 The Multicomputer
- modularity
- 1.3 A Parallel Programming
- scalability
- 1.1.4 Summary of Trends
- Parallel suffix
- 7.6.3 HPF Features Not
- Parallelism trends
- in applications
- 1.1.1 Trends in Applications
- in computer design
- 1.1.2 Trends in Computer
- Parameter study problem
- 1.4.4 Parameter Study
- PARMACS
- Part II: Tools, 8 Message Passing Interface, Chapter Notes
- Partitioning
- and domain decomposition
- 2.2 Partitioning
- and functional decomposition
- 2.2 Partitioning
- design checklist
- 2.2.3 Partitioning Design Checklist
- Partitioning algorithms
- 2.5.1 Load-Balancing Algorithms
- pC++
- Chapter Notes, 7 High Performance Fortran, 7.1.1 Concurrency, Chapter Notes
- PCAM
- 2.1 Methodical Design, 7.1.3 Design
- Per-hop time
- Exercises
- Per-word transfer time
- Communication Time.
- Performance modeling
- Amdahl's law
- 3.2.1 Amdahl's Law
- asymptotic analysis
- 3.2.3 Asymptotic Analysis
- design considerations
- 3.4 Scalability Analysis
- empirical studies
- 3.2.2 Extrapolation from Observations, 3.4 Scalability Analysis, 3.5 Experimental Studies
- for evaluation of algorithm implementation
- 3.6 Evaluating Implementations
- for I/O
- 3.8 Input/Output
- impact of interconnection networks
- 3.7.2 Interconnection Networks
- methodology
- Chapter Notes, 9.1 Performance Analysis, 9.2.2 Counters
- metrics
- 3.3 Developing Models, 3.3.2 Efficiency and Speedup
- qualitative analysis
- 3.4 Scalability Analysis
- with multiple modules
- 4.3 Performance Analysis
- Performance tools
- see Tools, performance
- Performance trends
- in networking
- 1.1.3 Trends in Networking
- in parallel computers
- 1.1.1 Trends in Applications
- Performance, definition
- 3.1 Defining Performance
- Performance, metrics
- 3.1 Defining Performance
- PETSc
- Chapter Notes
- PICL
- Chapter Notes, Chapter Notes
- Pipelining
- Avoiding Communication., 4.4.1 Components
- Poison pill technique
- 4.5.1 Application
- Polling
- 2.3.4 Asynchronous Communication
- costs
- 2.3.4 Asynchronous Communication
- for load balancing
- Decentralized Schemes.
- in CC++
- 5.6 Asynchronous Communication
- in Fortran M
- 6.5 Asynchronous Communication
- in MPI
- (, )
- Ports
- 1.3.1 Tasks and Channels, 6.2.1 Defining Processes
- PRAM model
- 1.2.2 Other Machine Models, Chapter Notes, Chapter Notes, 3.2.3 Asymptotic Analysis
- Prefetching
- 1.4.4 Parameter Study, Manager/Worker.
- Prefix product
- Communication.
- Prism performance tool
- 9.3.3 Data-Parallel Languages, Chapter Notes
- Probabilistic methods for load balancing
- Probabilistic Methods.
- Probe effect
- 9.2.3 Traces
- Processes in MPI
- 8.1 The MPI Programming
- Production systems
- 12 Further Reading
- Profiles
- 9.1 Performance Analysis, 9.2.1 Profiles
- advantages
- 9.1 Performance Analysis, 9.2.1 Profiles
- data reduction techniques
- 9.3.1 Profile and Counts
- disadvantages
- 9.2.1 Profiles
- sampling approach
- Chapter Notes
- Protection in C++
- 5.1.2 Classes
- Pruning
- 2.7.1 Floorplan Background
- Pseudo-random numbers
- see Random numbers
- PVM
- Part II: Tools, 8 Message Passing Interface, Chapter Notes
- Quick references
- for CC++
- 5.13 Summary
- for Fortran M
- 6.12 Summary
- Quicksort
- Chapter Notes
- Random numbers
- centralized generators
- 10.2 Parallel Random Numbers
- distributed generators
- 10.2 Parallel Random Numbers
- leapfrog method
- 10.3.2 The Leapfrog Method
- linear congruential generators
- 10.1 Sequential Random Numbers, 10.3.1 The Random Tree
- modified leapfrog method
- 10.3.3 Modified Leapfrog
- parallel
- 10 Random Numbers, 10.2 Parallel Random Numbers
- period of the generator
- 10.1 Sequential Random Numbers
- random tree method
- 10.3 Distributed Random Generators, 10.3.1 The Random Tree
- replicated generators
- 10.2 Parallel Random Numbers
- sequential
- 10 Random Numbers, 10.1 Sequential Random Numbers
- tests for generators
- Chapter Notes
- use with Monte Carlo methods
- Chapter Notes
- Random tree method
- 10.3.1 The Random Tree
- Real-time applications
- Chapter Notes
- Receiver-initiated strategy
- Chapter Notes
- Recursive bisection
- (, )
- coordinate
- Recursive Bisection., Chapter Notes
- graph
- Recursive Bisection.
- spectral
- Recursive Bisection.
- unbalanced
- Recursive Bisection.
- Recursive halving algorithm
- 11.2 Vector Reduction
- Red-black algorithm
- 2.3.1 Local Communication
- Reduction
- 2.3.2 Global Communication
- in Fortran 90
- 7.2.2 Array Intrinsic Functions
- in MPI
- 8.3 Global Operations, 8.3.3 Reduction Operations
- Remote procedure call
- 5.12 Case Study: Fock
- Replication
- of computation
- Replicating Computation.
- of data
- Communication and Agglomeration.
- Ring pipeline
- see Pairwise interactions
- RPC
- 5.12 Case Study: Fock
- Scalability
- 1.3.1 Tasks and Channels
- Scalability analysis
- (, )
- ScaLAPACK
- 4.2.2 Sequential Composition, Chapter Notes
- Scale analysis
- 3.3 Developing Models
- Scaled speedup
- Chapter Notes
- Search
- Chapter Notes
- Self Describing Data Format
- 9.2.3 Traces, 9.4.3 Pablo
- Self-consistent field method
- Exercises
- Semaphores
- 1.3.2 Other Programming Models
- Sender-initiated strategy
- Chapter Notes
- Sequence association
- 7.6.2 Storage and Sequence
- Sequent Symmetry
- 1.2.2 Other Machine Models
- Sequential bottlenecks in HPF
- 7.7.2 Sequential Bottlenecks
- Sequential composition
- 4.2 Modularity and Parallel , (, )
- advantages
- 4.2.2 Sequential Composition
- and parallel libraries
- 4.2.2 Sequential Composition
- convolution example
- 4.4.2 Composing Components
- example
- 4.2.2 Sequential Composition
- in CC++
- 5.9 Modularity
- in Fortran M
- 6.9 Modularity
- in HPF
- 7.5 Dummy Arguments and
- in MPI
- 8.5 Modularity
- Sets, distributed
- 4.5 Case Study: Tuple
- Shared-memory model
- 1.3.2 Other Programming Models, Chapter Notes, Bus-based Networks.
- Shortest-path problem
- (, )
- algorithm comparison
- 3.9.3 Shortest-Path Algorithms Summary
- all-pairs
- 3.9.1 Floyd's Algorithm
- Dijkstra's algorithm
- 3.9.2 Dijkstra's Algorithm
- Floyd's algorithm
- 3.9.1 Floyd's Algorithm
- requirements
- 3.9 Case Study: Shortest-Path
- single-source
- 3.9 Case Study: Shortest-Path , 3.9.2 Dijkstra's Algorithm
- Silicon Graphics Challenge
- 1.2.2 Other Machine Models
- SIMD computer
- 1.2.2 Other Machine Models, Chapter Notes
- Single program multiple data
- see SPMD model
- Single-assignment variable
- Chapter Notes
- SISAL
- 12 Further Reading
- Sorting
- 11.4 Mergesort, Chapter Notes
- Space-time diagrams
- 9.3.2 Traces
- Sparse matrix algorithms
- 12 Further Reading
- Spectral bisection
- Recursive Bisection., Chapter Notes
- Spectral transform
- Multistage Interconnection Networks.
- Speed of light
- 3.7.1 Competition for Bandwidth
- Speedup
- absolute
- 3.3.2 Efficiency and Speedup
- anomalies
- 3.6.2 Speedup Anomalies, Chapter Notes
- relative
- 3.3.2 Efficiency and Speedup
- superlinear
- 3.6.2 Speedup Anomalies, Chapter Notes
- SPMD model
- 1.4 Parallel Algorithm Examples
- agglomeration phase
- 2.4 Agglomeration
- and parallel composition
- 4.2.3 Parallel Composition
- and PCAM methodology
- 2.1 Methodical Design
- and sequential composition
- 4.2.2 Sequential Composition
- in CC++
- 5.8.2 Mapping Threads to
- in Fortran M
- 6.9 Modularity
- in HPF
- 7.1.1 Concurrency
- in MPI
- 8.1 The MPI Programming
- limitations
- 1.3.2 Other Programming Models
- Startup time
- Communication Time.
- Stencil of grid point
- 2.3.1 Local Communication
- Storage association
- 7.6.2 Storage and Sequence
- Superlinear speedup
- 3.6.2 Speedup Anomalies
- arguments against
- Chapter Notes
- Surface-to-volume effect
- Surface-to-Volume Effects., 3.4.2 Scalability with Scaled
- Synchronization
- 1.4.1 Finite Differences, 5.5.2 Synchronization, 8.3.1 Barrier, 11.2 Vector Reduction
- Systolic communication
- 4.6.3 A Systolic Algorithm
- t_s
- see startup time
- t_w
- see per-word transfer time
- t_h
- see per-hop time
- Task parallelism
- 8.5.2 Partitioning Processes
- Task scheduling
- decentralized control
- Decentralized Schemes.
- for floorplan optimization
- Mapping.
- for short-lived tasks
- 2.5 Mapping
- hierarchical
- Hierarchical Manager/Worker.
- manager/worker
- Manager/Worker.
- problem allocation
- 2.5.2 Task-Scheduling Algorithms
- termination detection
- Termination Detection.
- with task pool
- 2.5.2 Task-Scheduling Algorithms
- Task/channel model
- (, )
- data-parallel model comparison
- 7.1.3 Design
- description
- Chapter Notes
- determinism
- 1.3.1 Tasks and Channels
- locality
- 1.3.1 Tasks and Channels
- mapping
- 1.3.1 Tasks and Channels
- message-passing model comparison
- 1.3.2 Other Programming Models
- modularity
- 1.3.1 Tasks and Channels
- object-oriented model comparison
- 1.3.1 Tasks and Channels
- performance
- 1.3.1 Tasks and Channels
- scalability
- 1.3.1 Tasks and Channels
- Template
- definition
- 11 Hypercube Algorithms
- for hypercube
- 11 Hypercube Algorithms
- in HPF
- 7.6.3 HPF Features Not
- Termination detection
- Decentralized Schemes., Chapter Notes
- Terminology
- Terminology
- Thinking Machines CM2
- Chapter Notes
- Thinking Machines CM5
- 1.2.2 Other Machine Models
- Threads in CC++
- 5.2 CC++ Introduction
- Throughput
- 3.1 Defining Performance
- Timers
- 9.2.2 Counters
- Timing variations
- 3.5.2 Obtaining and Validating
- Tools, performance
- AIMS
- Chapter Notes
- customized
- 9.4.8 Custom Tools
- Gauge
- 9.4.4 Gauge
- IPS-2
- Chapter Notes
- MPP Apprentice
- Chapter Notes
- Pablo
- 9.4.3 Pablo
- Paragraph
- 9.4.1 Paragraph
- ParAide
- Chapter Notes
- Prism
- 9.3.3 Data-Parallel Languages, Chapter Notes
- selection of
- 9.1 Performance Analysis
- standards lacking for
- 9 Performance Tools
- Upshot
- 9.4.2 Upshot
- VT
- 9.4.6 IBM's Parallel Environment
- Torus networks
- Mesh Networks.
- Traces
- (, ), 9.3.2 Traces
- disadvantages
- 9.2.3 Traces
- standards lacking for
- 9.2.3 Traces
- Transformation of data
- 9.1 Performance Analysis
- Transpose
- Exercises
- hypercube algorithm
- 11.3 Matrix Transposition
- in convolution
- 4.4.1 Components
- Tree search
- 1.4.3 Search
- in CC++
- 5.4.3 Thread Placement
- in Fortran M
- 6.3.3 Receiving Messages
- Trends
- in applications
- 1.1.1 Trends in Applications
- in computer design
- 1.1.2 Trends in Computer
- Tuple space
- 4.5 Case Study: Tuple , 4.5.2 Implementation
- Ultracomputer
- Chapter Notes, Chapter Notes
- Unbalanced recursive bisection
- Recursive Bisection.
- Unity
- Chapter Notes
- Unstructured communication
- 2.3.3 Unstructured and Dynamic
- Upshot performance tool
- Chapter Notes
- state data analysis
- 9.4.2 Upshot
- use with MPI
- 9.4.2 Upshot
- Vector broadcast algorithm
- 11.2 Vector Reduction
- Vector reduction
- 11.2 Vector Reduction
- Video servers
- 1.1.1 Trends in Applications
- Vienna Fortran
- Chapter Notes
- Virtual computers
- 6.8.1 Virtual Computers
- Virtual functions in C++
- 5.1.3 Inheritance
- Visualization of performance data
- 9.3.2 Traces, Chapter Notes
- VLSI design
- 1.1.2 Trends in Computer , 2.7.1 Floorplan Background, Chapter Notes, 12 Further Reading
- Von Neumann computer
- derivation
- Chapter Notes
- exposition on
- Chapter Notes
- illustration
- 1.2 A Parallel Machine
- model
- 1.2 A Parallel Machine
- program structure
- 1.3 A Parallel Programming
- VT performance tool
- 9.4.6 IBM's Parallel Environment
- Wide area network
- 1.2.2 Other Machine Models
- Zipcode
- Chapter Notes
© Copyright 1995 by Ian Foster