References
- 1
-
ACM.
Resources in Parallel and Concurrent Systems.
ACM Press, 1991.
- 2
-
G. Adams, D. Agrawal, and H. Siegel.
A survey and comparison of fault-tolerant multistage interconnection
networks.
IEEE Trans. Computs., C-20(6):14--29, 1987.
- 3
-
J. Adams, W. Brainerd, J. Martin, B. Smith, and J. Wagener.
The Fortran 90 Handbook.
McGraw-Hill, 1992.
- 4
-
A. Aggarwal and J. S. Vitter.
The input/output complexity of sorting and related problems.
Commun. ACM, 31(9):1116--1127, 1988.
- 5
-
G. Agha.
Actors.
MIT Press, 1986.
- 6
-
G. Agrawal, A. Sussman, and J. Saltz.
Compiler and runtime support for structured and block structured
applications.
In Proc. Supercomputing '93, pages 578--587, 1993.
- 7
-
A. Aho, J. Hopcroft, and J. Ullman.
The Design and Analysis of Computer Algorithms.
Addison-Wesley, 1974.
- 8
-
S. Akl.
The Design and Analysis of Parallel Algorithms.
Prentice-Hall, 1989.
- 9
-
S. G. Akl and K. A. Lyons.
Parallel Computational Geometry.
Prentice-Hall, 1993.
- 10
-
E. Albert, J. Lukas, and G. Steele.
Data parallel computers and the FORALL statement.
J. Parallel and Distributed Computing, 13(2):185--192, 1991.
- 11
-
G. S. Almasi and A. Gottlieb.
Highly Parallel Computing.
Benjamin/Cummings, second edition, 1994.
- 12
-
G. Amdahl.
Validity of the single-processor approach to achieving large-scale
computing capabilities.
In Proc. 1967 AFIPS Conf., volume 30, page 483. AFIPS Press,
1967.
- 13
-
S. Anderson.
Random number generators.
SIAM Review, 32(2):221--251, 1990.
- 14
-
G. R. Andrews.
Concurrent Programming: Principles and Practice.
Benjamin/Cummings, 1991.
- 15
-
G. R. Andrews and R. A. Olsson.
The SR Programming Language: Concurrency in Practice.
Benjamin/Cummings, 1993.
- 16
-
ANSI X3J3/S8.115.
Fortran 90, 1990.
- 17
-
S. Arvindam, V. Kumar, and V. Rao.
Floorplan optimization on multiprocessors.
In Proc. 1989 Intl Conf. on Computer Design, pages 109--113.
IEEE Computer Society, 1989.
- 18
-
W. C. Athas and C. L. Seitz.
Multicomputers: Message-passing concurrent computers.
Computer, 21(8):9--24, 1988.
- 19
-
J. Auerbach, A. Goldberg, G. Goldszmidt, A. Gopal, M. Kennedy, J. Rao, and
J. Russell.
Concert/C: A language for distributed programming.
In Winter 1994 USENIX Conference. Usenix Association, 1994.
- 20
-
A. Averbuch, E. Gabber, B. Gordissky, and Y. Medan.
A parallel FFT on an MIMD machine.
Parallel Computing, 15:61--74, 1990.
- 21
-
D. Bailey.
FFTs in external or hierarchical memory.
J. Supercomputing, 4:23--35, 1990.
- 22
-
J. Bailey.
First we reshape our computers, then they reshape us: The broader
intellectual impact of parallelism.
Daedalus, 121(1):67--86, 1992.
- 23
-
H. E. Bal, J. G. Steiner, and A. S. Tanenbaum.
Programming languages for distributed computing systems.
ACM Computing Surveys, 21(3):261--322, 1989.
- 24
-
V. Bala and S. Kipnis.
Process groups: A mechanism for the coordination of and
communication among processes in the Venus collective communication
library.
Technical report, IBM T. J. Watson Research Center, 1992.
- 25
-
V. Bala, S. Kipnis, L. Rudolph, and M. Snir.
Designing efficient, scalable, and portable collective communication
libraries.
Technical report, IBM T. J. Watson Research Center, 1992.
Preprint.
- 26
-
P. Banerjee.
Parallel Algorithms for VLSI Computer-Aided Design.
Prentice-Hall, 1994.
- 27
-
U. Banerjee.
Dependence Analysis for Supercomputing.
Kluwer Academic Publishers, 1988.
- 28
-
S. Barnard and H. Simon.
Fast multilevel implementation of recursive spectral bisection for
partitioning unstructured problems.
Concurrency: Practice and Experience, 6(2):101--117, 1994.
- 29
-
J. Barton and L. Nackman.
Scientific and Engineering C++.
Addison-Wesley, 1994.
- 30
-
K. Batcher.
Sorting networks and their applications.
In Proc. 1968 AFIPS Conf., volume 32, page 307. AFIPS Press,
1968.
- 31
-
BBN Advanced Computers Inc.
TC-2000 Technical Product Summary, 1989.
- 32
-
M. Ben-Ari.
Principles of Concurrent and Distributed Programming.
Prentice-Hall, 1990.
- 33
-
M. Berger and S. Bokhari.
A partitioning strategy for nonuniform problems on multiprocessors.
IEEE Trans. Computs., C-36(5):570--580, 1987.
- 34
-
F. Berman and L. Snyder.
On mapping parallel algorithms into parallel architectures.
J. Parallel and Distributed Computing, 4(5):439--458, 1987.
- 35
-
D. Bertsekas and J. Tsitsiklis.
Parallel and Distributed Computation: Numerical Methods.
Prentice-Hall, 1989.
- 36
-
D. P. Bertsekas, C. Ozveren, G. D. Stamoulis, P. Tseng, and J. N. Tsitsiklis.
Optimal communication algorithms for hypercubes.
J. Parallel and Distributed Computing, 11:263--275, 1991.
- 37
-
G. Blelloch.
Vector Models for Data-Parallel Computing.
MIT Press, 1990.
- 38
-
F. Bodin, P. Beckman, D. B. Gannon, S. Narayana, and S. Yang.
Distributed pC++: Basic ideas for an object parallel language.
In Proc. Supercomputing '91, pages 273--282, 1991.
- 39
-
S. Bokhari.
On the mapping problem.
IEEE Trans. Computs., C-30(3):207--214, 1981.
- 40
-
G. Booch.
Object-Oriented Design with Applications.
Benjamin/Cummings, 1991.
- 41
-
R. Bordawekar, J. del Rosario, and A. Choudhary.
Design and evaluation of primitives for parallel I/O.
In Proc. Supercomputing '93, pages 452--461. ACM, 1993.
- 42
-
Z. Bozkus, A. Choudhary, G. Fox, T. Haupt, and S. Ranka.
Fortran 90D/HPF compiler for distributed memory MIMD computers:
Design, implementation, and performance results.
In Proc. Supercomputing '93. IEEE Computer Society, 1993.
- 43
-
W. Brainerd, C. Goldberg, and J. Adams.
Programmer's Guide to Fortran 90.
McGraw-Hill, 1990.
- 44
-
R. Butler and E. Lusk.
Monitors, messages, and clusters: The p4 parallel programming
system.
Parallel Computing, 20:547--564, 1994.
- 45
-
D. Callahan and K. Kennedy.
Compiling programs for distributed-memory multiprocessors.
J. Supercomputing, 2:151--169, 1988.
- 46
-
G. F. Carey, editor.
Parallel Supercomputing: Methods, Algorithms and Applications.
Wiley, 1989.
- 47
-
N. Carriero and D. Gelernter.
Linda in context.
Commun. ACM, 32(4):444--458, 1989.
- 48
-
N. Carriero and D. Gelernter.
How to Write Parallel Programs.
MIT Press, 1990.
- 49
-
N. Carriero and D. Gelernter.
Tuple analysis and partial evaluation strategies in the Linda
pre-compiler.
In Languages and Compilers for Parallel Computing.
MIT Press, 1990.
- 50
-
R. Chandra, A. Gupta, and J. Hennessy.
COOL: An object-based language for parallel programming.
Computer, 27(8):14--26, 1994.
- 51
-
K. M. Chandy and I. Foster.
A deterministic notation for cooperating processes.
IEEE Trans. Parallel and Distributed Syst., 1995.
To appear.
- 52
-
K. M. Chandy, I. Foster, K. Kennedy, C. Koelbel, and C.-W. Tseng.
Integrated support for task and data parallelism.
Intl J. Supercomputer Applications, 8(2):80--98, 1994.
- 53
-
K. M. Chandy and C. Kesselman.
CC++: A declarative concurrent object-oriented programming
notation.
In Research Directions in Concurrent Object-Oriented
Programming. MIT Press, 1993.
- 54
-
K. M. Chandy and J. Misra.
Parallel Program Design.
Addison-Wesley, 1988.
- 55
-
K. M. Chandy and S. Taylor.
An Introduction to Parallel Programming.
Jones and Bartlett, 1992.
- 56
-
B. Chapman, P. Mehrotra, and H. Zima.
Programming in Vienna Fortran.
Scientific Programming, 1(1):31--50, 1992.
- 57
-
B. Chapman, P. Mehrotra, and H. Zima.
Extending HPF for advanced data-parallel applications.
IEEE Parallel and Distributed Technology, 2(3):15--27, 1994.
- 58
-
D. Y. Cheng.
A survey of parallel programming languages and tools.
Technical Report RND-93-005, NASA Ames Research Center, Moffett
Field, Calif., 1993.
- 59
-
J. Choi, J. Dongarra, and D. Walker.
PUMMA: Parallel Universal Matrix Multiplication
Algorithms on distributed memory concurrent computers.
Concurrency: Practice and Experience, 6, 1994.
- 60
-
A. Choudhary.
Parallel I/O systems, guest editor's introduction.
J. Parallel and Distributed Computing, 17(1--2):1--3, 1993.
- 61
-
S. Chowdhury.
The greedy load-sharing algorithm.
J. Parallel and Distributed Computing, 9(1):93--99, 1990.
- 62
-
M. Colvin, C. Janssen, R. Whiteside, and C. Tong.
Parallel Direct-SCF for large-scale calculations.
Technical report, Center for Computational Engineering, Sandia
National Laboratories, Livermore, Cal., 1991.
- 63
-
D. Comer.
Internetworking with TCP/IP.
Prentice-Hall, 1988.
- 64
-
S. Cook.
The classification of problems which have fast parallel algorithms.
In Proc. 1983 Intl Foundation of Computation Theory Conf.,
volume 158, pages 78--93. Springer-Verlag LNCS, 1983.
- 65
-
T. Cormen, C. Leiserson, and R. Rivest.
Introduction to Algorithms.
MIT Press, 1990.
- 66
-
B. Cox and A. Novobilski.
Object-Oriented Programming: An Evolutionary Approach.
Addison-Wesley, 1991.
- 67
-
D. Culler et al.
LogP: Towards a realistic model of parallel computation.
In Proc. 4th Symp. Principles and Practice of Parallel
Programming, pages 1--12. ACM, 1993.
- 68
-
G. Cybenko.
Dynamic load balancing for distributed memory multiprocessors.
J. Parallel and Distributed Computing, 7:279--301, 1989.
- 69
-
W. Dally.
A VLSI Architecture for Concurrent Data Structures.
Kluwer Academic Publishers, 1987.
- 70
-
W. Dally and C. L. Seitz.
The torus routing chip.
J. Distributed Systems, 1(3):187--196, 1986.
- 71
-
W. Dally and C. L. Seitz.
Deadlock-free message routing in multiprocessor interconnection
networks.
IEEE Trans. Computs., C-36(5):547--553, 1987.
- 72
-
W. J. Dally et al.
The message-driven processor.
IEEE Micro, 12(2):23--39, 1992.
- 73
-
C. R. Das, N. Deo, and S. Prasad.
Parallel graph algorithms for hypercube computers.
Parallel Computing, 13:143--158, 1990.
- 74
-
C. R. Das, N. Deo, and S. Prasad.
Two minimum spanning forest algorithms on fixed-size hypercube
computers.
Parallel Computing, 15:179--187, 1990.
- 75
-
A. L. DeCegama.
The Technology of Parallel Processing: Parallel Processing
Architectures and VLSI Hardware: Volume 1.
Prentice-Hall, 1989.
- 76
-
J. del Rosario and A. Choudhary.
High-Performance I/O for Parallel Computers: Problems and
Prospects.
Computer, 27(3):59--68, 1994.
- 77
-
J. W. Demmel, M. T. Heath, and H. A. van der Vorst.
Parallel numerical linear algebra.
Acta Numerica, 10:111--197, 1993.
- 78
-
P. M. Dew, R. A. Earnshaw, and T. R. Heywood.
Parallel Processing for Computer Vision and Display.
Addison-Wesley, 1989.
- 79
-
D. DeWitt and J. Gray.
Parallel database systems: The future of high-performance database
systems.
Commun. ACM, 35(6):85--98, 1992.
- 80
-
E. W. Dijkstra.
A note on two problems in connexion with graphs.
Numerische Mathematik, 1:269--271, 1959.
- 81
-
E. W. Dijkstra, W. H. J. Feijen, and A. J. M. van Gasteren.
Derivation of a termination detection algorithm for a distributed
computation.
Information Processing Letters, 16(5):217--219, 1983.
- 82
-
J. Dongarra, I. Duff, D. Sorensen, and H. van der Vorst.
Solving Linear Systems on Vector and Shared Memory Computers.
SIAM, 1991.
- 83
-
J. Dongarra, R. Pozo, and D. Walker.
ScaLAPACK++: An object-oriented linear algebra library for scalable
systems.
In Proc. Scalable Parallel Libraries Conf., pages 216--223.
IEEE Computer Society, 1993.
- 84
-
J. Dongarra, R. van de Geijn, and D. Walker.
Scalability issues affecting the design of a dense linear algebra
library.
J. Parallel and Distributed Computing, 22(3):523--537, 1994.
- 85
-
J. Dongarra and D. Walker.
Software libraries for linear algebra computations on high
performance computers.
SIAM Review, 1995.
To appear.
- 86
-
J. Drake, I. Foster, J. Hack, J. Michalakes, B. Semeraro, B. Toonen,
D. Williamson, and P. Worley.
PCCM2: A GCM adapted for scalable parallel computers.
In Proc. 5th Symp. on Global Change Studies, pages 91--98.
American Meteorological Society, 1994.
- 87
-
R. Duncan.
A survey of parallel computer architectures.
Computer, 23(2):5--16, 1990.
- 88
-
R. Duncan.
Parallel computer architectures.
In Advances in Computers, volume 34, pages 113--152. Academic
Press, 1992.
- 89
-
D. L. Eager, J. Zahorjan, and E. D. Lazowska.
Speedup versus efficiency in parallel systems.
IEEE Trans. Computs., C-38(3):408--423, 1989.
- 90
-
Edinburgh Parallel Computing Centre, University of Edinburgh.
CHIMP Concepts, 1991.
- 91
-
Edinburgh Parallel Computing Centre, University of Edinburgh.
CHIMP Version 1.0 Interface, 1992.
- 92
-
M. A. Ellis and B. Stroustrup.
The Annotated C++ Reference Manual.
Addison-Wesley, 1990.
- 93
-
V. Faber, O. Lubeck, and A. White.
Superlinear speedup of an efficient parallel algorithm is not
possible.
Parallel Computing, 3:259--260, 1986.
- 94
-
T. Y. Feng.
A survey of interconnection networks.
IEEE Computer, 14(12):12--27, 1981.
- 95
-
J. Feo, D. Cann, and R. Oldehoeft.
A report on the SISAL language project.
J. Parallel and Distributed Computing, 12(10):349--366, 1990.
- 96
-
M. Feyereisen and R. Kendall.
An efficient implementation of the Direct-SCF algorithm on
parallel computer architectures.
Theoretica Chimica Acta, 84:289--299, 1993.
- 97
-
H. P. Flatt and K. Kennedy.
Performance of parallel processors.
Parallel Computing, 12(1):1--20, 1989.
- 98
-
R. Floyd.
Algorithm 97: Shortest path.
Commun. ACM, 5(6):345, 1962.
- 99
-
S. Fortune and J. Wyllie.
Parallelism in random access machines.
In Proc. ACM Symp. on Theory of Computing, pages 114--118.
ACM, 1978.
- 100
-
I. Foster.
Task parallelism and high performance languages.
IEEE Parallel and Distributed Technology, 2(3):39--48, 1994.
- 101
-
I. Foster, B. Avalani, A. Choudhary, and M. Xu.
A compilation system that integrates High Performance Fortran
and Fortran M.
In Proc. 1994 Scalable High-Performance Computing Conf., pages
293--300. IEEE Computer Society, 1994.
- 102
-
I. Foster and K. M. Chandy.
Fortran M: A language for modular parallel programming.
J. Parallel and Distributed Computing, 25(1), 1995.
- 103
-
I. Foster, M. Henderson, and R. Stevens.
Data systems for parallel climate models.
Technical Report ANL/MCS-TM-169, Mathematics and Computer Science
Division, Argonne National Laboratory, Argonne, Ill., 1991.
- 104
-
I. Foster, C. Kesselman, and S. Taylor.
Concurrency: Simple concepts and powerful tools.
Computer J., 33(6):501--507, 1990.
- 105
-
I. Foster, R. Olson, and S. Tuecke.
Productive parallel programming: The PCN approach.
Scientific Programming, 1(1):51--66, 1992.
- 106
-
I. Foster, R. Olson, and S. Tuecke.
Programming in Fortran M.
Technical Report ANL-93/26, Mathematics and Computer Science
Division, Argonne National Laboratory, Argonne, Ill., 1993.
- 107
-
I. Foster and S. Taylor.
Strand: New Concepts in Parallel Programming.
Prentice-Hall, 1989.
- 108
-
I. Foster, J. Tilson, A. Wagner, R. Shepard, R. Harrison, R. Kendall, and
R. Littlefield.
High performance computational chemistry: (I) Scalable Fock
matrix construction algorithms.
Preprint, Mathematics and Computer Science Division, Argonne National
Laboratory, Argonne, Ill., 1994.
- 109
-
I. Foster and B. Toonen.
Load-balancing algorithms for climate models.
In Proc. 1994 Scalable High-Performance Computing Conf., pages
674--681. IEEE Computer Society, 1994.
- 110
-
I. Foster and P. Worley.
Parallel algorithms for the spectral transform method.
Preprint MCS-P426-0494, Mathematics and Computer Science Division,
Argonne National Laboratory, Argonne, Ill., 1994.
- 111
-
G. Fox et al.
Solving Problems on Concurrent Processors.
Prentice-Hall, 1988.
- 112
-
G. Fox, S. Hiranandani, K. Kennedy, C. Koelbel, U. Kremer, C. Tseng, and M. Wu.
Fortran D language specification.
Technical Report TR90-141, Dept. of Computer Science, Rice
University, 1990.
- 113
-
G. Fox, R. Williams, and P. Messina.
Parallel Computing Works!
Morgan Kaufmann, 1994.
- 114
-
P. Frederickson, R. Hiromoto, T. Jordan, B. Smith, and T. Warnock.
Pseudo-random trees in Monte Carlo.
Parallel Computing, 1:175--180, 1984.
- 115
-
H. J. Fromm, U. Hercksen, U. Herzog, K. H. John, R. Klar, and W. Kleinöder.
Experiences with performance measurement and modeling of a processor
array.
IEEE Trans. Computs., C-32(1):15--31, 1983.
- 116
-
K. Gallivan, R. Plemmons, and A. Sameh.
Parallel algorithms for dense linear algebra computations.
SIAM Review, 32(1):54--135, 1990.
- 117
-
N. Gehani and W. Roome.
The Concurrent C Programming Language.
Silicon Press, 1988.
- 118
-
G. A. Geist, M. T. Heath, B. W. Peyton, and P. H. Worley.
A user's guide to PICL: A portable instrumented communication
library.
Technical Report TM-11616, Oak Ridge National Laboratory,
1990.
- 119
-
A. Gibbons and W. Rytter.
Efficient Parallel Algorithms.
Cambridge University Press, 1990.
- 120
-
G. A. Gibson.
Redundant Disk Arrays: Reliable, Parallel Secondary Storage.
MIT Press, 1992.
- 121
-
H. Goldstine and J. von Neumann.
On the principles of large-scale computing machines.
In Collected Works of John von Neumann, Vol. 5. Pergamon, 1963.
- 122
-
G. H. Golub and J. M. Ortega.
Scientific Computing: An Introduction with Parallel
Computing.
Academic Press, 1993.
- 123
-
A. Gottlieb, R. Grishman, C. P. Kruskal, K. P. McAuliffe, L. Rudolph, and
M. Snir.
The NYU ultracomputer: Designing a MIMD, shared memory parallel
computer.
IEEE Trans. Computs., C-32(2):175--189, 1983.
- 124
-
S. Graham, P. Kessler, and M. McKusick.
gprof: A call graph execution profiler.
In Proc. SIGPLAN '82 Symposium on Compiler Construction, pages
120--126. ACM, 1982.
- 125
-
A. S. Grimshaw.
An introduction to parallel object-oriented programming with
Mentat.
Technical Report 91 07, University of Virginia, 1991.
- 126
-
W. Gropp, E. Lusk, and A. Skjellum.
Using MPI: Portable Parallel Programming with the Message
Passing Interface.
MIT Press, 1995.
- 127
-
W. Gropp and B. Smith.
Scalable, extensible, and portable numerical libraries.
In Proc. Scalable Parallel Libraries Conf., pages 87--93. IEEE
Computer Society, 1993.
- 128
-
A. Gupta.
Parallelism in Production Systems.
Morgan Kaufmann, 1987.
- 129
-
J. L. Gustafson.
Reevaluating Amdahl's law.
Commun. ACM, 31(5):532--533, 1988.
- 130
-
J. L. Gustafson, G. R. Montry, and R. E. Benner.
Development of parallel methods for a 1024-processor hypercube.
SIAM J. Sci. and Stat. Computing, 9(4):609--638, 1988.
- 131
-
A. Hac.
Load balancing in distributed systems: A summary.
Performance Evaluation Review, 16(2):17--19, 1989.
- 132
-
G. Haring and G. Kotsis, editors.
Performance Measurement and Visualization of Parallel Systems.
Elsevier Science Publishers, 1993.
- 133
-
P. Harrison.
Analytic models for multistage interconnection networks.
J. Parallel and Distributed Computing, 12(4):357--369, 1991.
- 134
-
P. Harrison and N. M. Patel.
The representation of multistage interconnection networks in queuing
models of parallel systems.
J. ACM, 37(4):863--898, 1990.
- 135
-
R. Harrison et al.
High performance computational chemistry: (II) A scalable SCF
code.
Preprint, Mathematics and Computer Science Division, Argonne National
Laboratory, Argonne, Ill., 1994.
- 136
-
P. Hatcher and M. Quinn.
Data-Parallel Programming on MIMD Computers.
MIT Press, 1991.
- 137
-
P. Hatcher, M. Quinn, et al.
Data-parallel programming on MIMD computers.
IEEE Trans. Parallel and Distributed Syst., 2(3):377--383,
1991.
- 138
-
M. Heath.
Recent developments and case studies in performance visualization
using ParaGraph.
In Performance Measurement and Visualization of Parallel
Systems, pages 175--200. Elsevier Science Publishers, 1993.
- 139
-
M. Heath and J. Etheridge.
Visualizing the performance of parallel programs.
IEEE Software, 8(5):29--39, 1991.
- 140
-
M. Heath, E. Ng, and B. Peyton.
Parallel algorithms for sparse linear systems.
SIAM Review, 33(3):420--460, 1991.
- 141
-
M. Heath, A. Rosenberg, and B. Smith.
The physical mapping problem for parallel architectures.
J. ACM, 35(3):603--634, 1988.
- 142
-
W. Hehre, L. Radom, P. Schleyer, and J. Pople.
Ab Initio Molecular Orbital Theory.
John Wiley and Sons, 1986.
- 143
-
R. Hempel.
The ANL/GMD macros (PARMACS) in Fortran for portable parallel
programming using the message passing programming model -- users' guide and
reference manual.
Technical report, GMD, Postfach 1316, D-5205 Sankt Augustin
1, Germany, 1991.
- 144
-
R. Hempel, H.-C. Hoppe, and A. Supalov.
PARMACS 6.0 library interface specification.
Technical report, GMD, Postfach 1316, D-5205 Sankt Augustin
1, Germany, 1992.
- 145
-
M. Henderson, B. Nickless, and R. Stevens.
A scalable high-performance I/O system.
In Proc. 1994 Scalable High-Performance Computing Conf., pages
79--86. IEEE Computer Society, 1994.
- 146
-
P. Henderson.
Functional Programming.
Prentice-Hall, 1980.
- 147
-
J. Hennessy and N. Jouppi.
Computer technology and architecture: An evolving interaction.
Computer, 24(9):18--29, 1991.
- 148
-
V. Herrarte and E. Lusk.
Studying parallel program behavior with upshot.
Technical Report ANL-91/15, Mathematics and Computer Science
Division, Argonne National Laboratory, Argonne, Ill., 1991.
- 149
-
High Performance Fortran Forum.
High Performance Fortran language specification, version 1.0.
Technical Report CRPC-TR92225, Center for Research on Parallel
Computation, Rice University, Houston, Tex., 1993.
- 150
-
W. D. Hillis.
The Connection Machine.
MIT Press, 1985.
- 151
-
W. D. Hillis and G. L. Steele.
Data parallel algorithms.
Commun. ACM, 29(12):1170--1183, 1986.
- 152
-
S. Hiranandani, K. Kennedy, and C. Tseng.
Compiling Fortran D for MIMD distributed-memory machines.
Commun. ACM, 35(8):66--80, 1992.
- 153
-
C. A. R. Hoare.
Quicksort.
Computer J., 5(1):10--15, 1962.
- 154
-
C. A. R. Hoare.
Communicating Sequential Processes.
Prentice-Hall, 1984.
- 155
-
G. Hoffmann and T. Kauranne, editors.
Parallel Supercomputing in the Atmospheric Sciences.
World Scientific, 1993.
- 156
-
K. Hwang.
Advanced Computer Architecture: Parallelism, Scalability,
Programmability.
McGraw-Hill, 1993.
- 157
-
J. JáJá.
An Introduction to Parallel Algorithms.
Addison-Wesley, 1992.
- 158
-
J. Jenq and S. Sahni.
All pairs shortest paths on a hypercube multiprocessor.
In Proc. 1987 Intl Conf. on Parallel Processing, pages
713--716, 1987.
- 159
-
S. L. Johnsson.
Communication efficient basic linear algebra computations on
hypercube architectures.
J. Parallel and Distributed Computing, 4(2):133--172, 1987.
- 160
-
S. L. Johnsson and C.-T. Ho.
Optimum broadcasting and personalized communication in hypercubes.
IEEE Trans. Computs., C-38(9):1249--1268, 1989.
- 161
-
M. Jones and P. Plassmann.
Parallel algorithms for the adaptive refinement and partitioning of
unstructured meshes.
In Proc. 1994 Scalable High-Performance Computing Conf., pages
478--485. IEEE Computer Society, 1994.
- 162
-
R. Kahn.
Resource-sharing computer communication networks.
Proc. IEEE, 60(11):1397--1407, 1972.
- 163
-
M. Kalos.
The Basics of Monte Carlo Methods.
J. Wiley and Sons, 1985.
- 164
-
L. N. Kanal and V. Kumar.
Search in Artificial Intelligence.
Springer-Verlag, 1988.
- 165
-
A. Karp and R. Babb.
A comparison of twelve parallel Fortran dialects.
IEEE Software, 5(5):52--67, 1988.
- 166
-
A. H. Karp.
Programming for parallelism.
IEEE Computer, 20(9):43--57, 1987.
- 167
-
A. H. Karp and H. P. Flatt.
Measuring parallel processor performance.
Commun. ACM, 33(5):539--543, 1990.
- 168
-
R. Katz, G. Gibson, and D. Patterson.
Disk system architectures for high performance computing.
Proc. IEEE, 77(12):1842--1858, 1989.
- 169
-
W. J. Kaufmann and L. L. Smarr.
Supercomputing and the Transformation of Science.
Scientific American Library, 1993.
- 170
-
B. Kernighan and D. Ritchie.
The C Programming Language.
Prentice-Hall, second edition, 1988.
- 171
-
J. Kerrigan.
Migrating to Fortran 90.
O'Reilly and Associates, 1992.
- 172
-
C. Kesselman.
Integrating Performance Analysis with Performance Improvement in
Parallel Programs.
PhD thesis, UCLA, 1991.
- 173
-
L. Kleinrock.
On the modeling and analysis of computer networks.
Proc. IEEE, 81(8):1179--1191, 1993.
- 174
-
D. Knuth.
The Art of Computer Programming: Volume 3, Sorting and
Searching.
Addison-Wesley, 1973.
- 175
-
D. Knuth.
The Art of Computer Programming: Volume 2, Seminumerical
Algorithms.
Addison-Wesley, 1981.
- 176
-
C. Koelbel, D. Loveman, R. Schreiber, G. Steele, and M. Zosel.
The High Performance Fortran Handbook.
MIT Press, 1994.
- 177
-
S. Koonin and D. Meredith.
Computational Physics.
Addison-Wesley, 1990.
- 178
-
J. S. Kowalik.
Parallel Computation and Computers for Artificial Intelligence.
Kluwer Academic Publishers, 1988.
- 179
-
V. Kumar, A. Grama, A. Gupta, and G. Karypis.
Introduction to Parallel Computing.
Benjamin/Cummings, 1993.
- 180
-
V. Kumar, A. Grama, and V. Rao.
Scalable load balancing techniques for parallel computers.
J. Parallel and Distributed Computing, 22(1):60--79, 1994.
- 181
-
V. Kumar and V. Rao.
Parallel depth-first search, part II: Analysis.
Intl J. of Parallel Programming, 16(6):479--499, 1987.
- 182
-
V. Kumar and V. Singh.
Scalability of parallel algorithms for the all-pairs shortest-path
problem.
J. Parallel and Distributed Computing, 13(2):124--138, 1991.
- 183
-
T. Lai and S. Sahni.
Anomalies in parallel branch-and-bound algorithms.
Commun. ACM, 27(6):594--602, 1984.
- 184
-
S. Lakshmivarahan and S. K. Dhall.
Analysis and Design of Parallel Algorithms: Arithmetic and
Matrix Problems.
McGraw-Hill, 1990.
- 185
-
L. Lamport.
Time, clocks, and the ordering of events in a distributed system.
Commun. ACM, 21(7):558--565, 1978.
- 186
-
H. Lawson.
Parallel Processing in Industrial Real-time Applications.
Prentice-Hall, 1992.
- 187
-
F. T. Leighton.
Introduction to Parallel Algorithms and Architectures.
Morgan Kaufmann, 1992.
- 188
-
M. Lemke and D. Quinlan.
P++, a parallel C++ array class library for
architecture-independent development of structured grid applications.
In Proc. Workshop on Languages, Compilers, and Runtime
Environments for Distributed Memory Computers. ACM, 1992.
- 189
-
E. Levin.
Grand challenges in computational science.
Commun. ACM, 32(12):1456--1457, 1989.
- 190
-
F. C. H. Lin and R. M. Keller.
The gradient model load balancing method.
IEEE Trans. Software Eng., SE-13(1):32--38, 1987.
- 191
-
V. Lo.
Heuristic algorithms for task assignment in distributed systems.
IEEE Trans. Computs., C-37(11):1384--1397, 1988.
- 192
-
C. Van Loan.
Computational Frameworks for the Fast Fourier Transform.
SIAM, 1992.
- 193
-
D. Loveman.
High Performance Fortran.
IEEE Parallel and Distributed Technology, 1(1):25--42, 1993.
- 194
-
E. Lusk, R. Overbeek, et al.
Portable Programs for Parallel Processors.
Holt, Rinehart, and Winston, 1987.
- 195
-
U. Manber.
On maintaining dynamic information in a concurrent environment.
SIAM J. Computing, 15(4):1130--1142, 1986.
- 196
-
O. McBryan.
An overview of message passing environments.
Parallel Computing, 20(4):417--444, 1994.
- 197
-
O. A. McBryan and E. F. Van de Velde.
Hypercube algorithms and implementations.
SIAM J. Sci. and Stat. Computing, 8(2):227--287, 1987.
- 198
-
S. McConnell.
Code Complete: A Practical Handbook of Software Construction.
Microsoft Press, 1993.
- 199
-
C. Mead and L. Conway.
Introduction to VLSI Systems.
Addison-Wesley, 1980.
- 200
-
P. Mehrotra and J. Van Rosendale.
Programming distributed memory architectures using Kali.
In Advances in Languages and Compilers for Parallel Computing.
MIT Press, 1991.
- 201
-
J. D. Meindl.
Chips for advanced computing.
Scientific American, 257(4):78--88, 1987.
- 202
-
Message Passing Interface Forum.
Document for a standard message-passing interface.
Technical report, University of Tennessee, Knoxville, Tenn.,
1993.
- 203
-
Message Passing Interface Forum.
MPI: A message passing interface.
In Proc. Supercomputing '93, pages 878--883. IEEE Computer
Society, 1993.
- 204
-
M. Metcalf and J. Reid.
Fortran 90 Explained.
Oxford Science Publications, 1990.
- 205
-
R. Metcalfe and D. Boggs.
Ethernet: Distributed packet switching for local area networks.
Commun. ACM, 19(7):711--719, 1976.
- 206
-
J. Michalakes.
Analysis of workload and load balancing issues in the NCAR
community climate model.
Technical Report ANL/MCS-TM-144, Mathematics and Computer Science
Division, Argonne National Laboratory, Argonne, Ill., 1991.
- 207
-
B. Miller et al.
IPS-2: The second generation of a parallel program measurement
system.
IEEE Trans. Parallel and Distributed Syst., 1(2):206--217,
1990.
- 208
-
E. Miller and R. Katz.
Input/output behavior of supercomputing applications.
In Proc. Supercomputing '91, pages 567--576. ACM, 1991.
- 209
-
R. Miller and Q. F. Stout.
Parallel Algorithms for Regular Architectures.
MIT Press, 1992.
- 210
-
R. Milner.
Calculi for synchrony and asynchrony.
Theoretical Computer Science, 25:267--310, 1983.
- 211
-
nCUBE Corporation.
nCUBE 2 Programmers Guide, r2.0, 1990.
- 212
-
nCUBE Corporation.
nCUBE 6400 Processor Manual, 1990.
- 213
-
D. M. Nicol and J. H. Saltz.
An analysis of scatter decomposition.
IEEE Trans. Computs., C-39(11):1337--1345, 1990.
- 214
-
N. Nilsson.
Principles of Artificial Intelligence.
Tioga Publishers, 1980.
- 215
-
Grand challenges: High performance computing and communications.
A Report by the Committee on Physical, Mathematical and
Engineering Sciences, NSF/CISE, 1800 G Street NW, Washington,
DC 20550, 1991.
- 216
-
D. Nussbaum and A. Agarwal.
Scalability of parallel machines.
Commun. ACM, 34(3):56--61, 1991.
- 217
-
R. Paige and C. Kruskal.
Parallel algorithms for shortest paths problems.
In Proc. 1989 Intl Conf. on Parallel Processing, pages 14--19,
1989.
- 218
-
C. Pancake and D. Bergmark.
Do parallel languages respond to the needs of scientific programmers?
Computer, 23(12):13--23, 1990.
- 219
-
Parasoft Corporation.
Express Version 1.0: A Communication Environment for Parallel
Computers, 1988.
- 220
-
D. Parnas.
On the criteria to be used in decomposing systems into modules.
Commun. ACM, 15(12):1053--1058, 1972.
- 221
-
D. Parnas.
Designing software for ease of extension and contraction.
IEEE Trans. Software Eng., SE-5(2):128--138, 1979.
- 222
-
D. Parnas and P. Clements.
A rational design process: How and why to fake it.
IEEE Trans. Software Eng., SE-12(2):251--257, 1986.
- 223
-
D. Parnas, P. Clements, and D. Weiss.
The modular structure of complex systems.
IEEE Trans. Software Eng., SE-11(3):259--266, 1985.
- 224
-
J. Patel.
Analysis of multiprocessors with private cache memories.
IEEE Trans. Computs., C-31(4):296--304, 1982.
- 225
-
J. Pearl.
Heuristics---Intelligent Search Strategies for Computer Problem
Solving.
Addison-Wesley, 1984.
- 226
-
G. F. Pfister, W. C. Brantley, D. A. George, S. L. Harvey, W. J. Kleinfelder,
K. P. McAuliffe, E. A. Melton, V. A. Norton, and J. Weiss.
The IBM research parallel processor prototype (RP3):
Introduction and architecture.
In Proc. 1985 Intl Conf. on Parallel Processing, pages
764--771, 1985.
- 227
-
P. Pierce.
The NX/2 operating system.
In Proc. 3rd Conf. on Hypercube Concurrent Computers and
Applications, pages 384--390. ACM Press, 1988.
- 228
-
J. Plank and K. Li.
Performance results of ickp---A consistent checkpointer on
the iPSC/860.
In Proc. 1994 Scalable High-Performance Computing Conf., pages
686--693. IEEE Computer Society, 1994.
- 229
-
J. Pool et al.
Survey of I/O intensive applications.
Technical Report CCSF-38, CCSF, California Institute of Technology,
1994.
- 230
-
A. Pothen, H. Simon, and K. Liou.
Partitioning sparse matrices with eigenvectors of graphs.
SIAM J. Matrix Anal. Appl., 11(3):430--452, 1990.
- 231
-
D. Pountain.
A Tutorial Introduction to OCCAM Programming.
INMOS Corporation, 1986.
- 232
-
A research and development strategy for high performance computing.
Office of Science and Technology Policy, Executive Office of the
President, 1987.
- 233
-
The federal high performance computing program.
Office of Science and Technology Policy, Executive Office of the
President, 1989.
- 234
-
M. Quinn.
Analysis and implementation of branch-and-bound algorithms on a
hypercube multicomputer.
IEEE Trans. Computs., C-39(3):384--387, 1990.
- 235
-
M. Quinn.
Parallel Computing: Theory and Practice.
McGraw-Hill, 1994.
- 236
-
M. Quinn and N. Deo.
Parallel graph algorithms.
ACM Computing Surveys, 16(3):319--348, 1984.
- 237
-
M. Quinn and N. Deo.
An upper bound for the speedup of parallel best-bound
branch-and-bound algorithms.
BIT, 26(1):35--43, 1986.
- 238
-
S. Ranka and S. Sahni.
Hypercube Algorithms for Image Processing and Pattern
Recognition.
Springer-Verlag, 1990.
- 239
-
V. Rao and V. Kumar.
Parallel depth-first search, part I: Implementation.
Intl. J. of Parallel Programming, 16(6):501--519, 1987.
- 240
-
D. A. Reed.
Experimental performance analysis of parallel systems:
Techniques and open problems.
In Proc. 7th Intl Conf. on Modeling Techniques and Tools for
Computer Performance Evaluation, 1994.
- 241
-
D. A. Reed, R. A. Aydt, R. J. Noe, P. C. Roth, K. A. Shields, B. W. Schwartz,
and L. F. Tavera.
Scalable performance analysis: The Pablo performance
analysis environment.
In Proc. Scalable Parallel Libraries Conf., pages 104--113.
IEEE Computer Society, 1993.
- 242
-
D. A. Reed and R. M. Fujimoto.
Multicomputer Networks: Message-Based Parallel Processing.
MIT Press, 1989.
- 243
-
A. Reinefeld and V. Schnecke.
Work-load balancing in highly parallel depth-first search.
In Proc. 1994 Scalable High-Performance Computing Conf., pages
773--780. IEEE Computer Society, 1994.
- 244
-
B. Ries, R. Anderson, W. Auld, D. Breazeal, K. Callaghan, E. Richards, and
W. Smith.
The Paragon performance monitoring environment.
In Proc. Supercomputing '93, pages 850--859. IEEE Computer
Society, 1993.
- 245
-
A. Rogers and K. Pingali.
Process decomposition through locality of reference.
In Proc. SIGPLAN '89 Conf. on Program Language Design and
Implementation. ACM, 1989.
- 246
-
K. Rokusawa, N. Ichiyoshi, T. Chikayama, and H. Nakashima.
An efficient termination detection and abortion algorithm for
distributed processing systems.
In Proc. 1988 Intl Conf. on Parallel Processing: Vol. I,
pages 18--22, 1988.
- 247
-
M. Rosing, R. B. Schnabel, and R. P. Weaver.
The DINO parallel programming language.
Technical Report CU-CS-501-90, Computer Science Department,
University of Colorado at Boulder, Boulder, Col., 1990.
- 248
-
Y. Saad and M. H. Schultz.
Topological properties of hypercubes.
IEEE Trans. Computs., C-37:867--872, 1988.
- 249
-
Y. Saad and M. H. Schultz.
Data communication in hypercubes.
J. Parallel and Distributed Computing, 6:115--135, 1989.
- 250
-
P. Sadayappan and F. Ercal.
Nearest-neighbor mapping of finite element graphs onto processor
meshes.
IEEE Trans. Computs., C-36(12):1408--1424, 1987.
- 251
-
J. Saltz, H. Berryman, and J. Wu.
Multiprocessors and runtime compilation.
Concurrency: Practice and Experience, 3(6):573--592, 1991.
- 252
-
J. Schwartz.
Ultracomputers.
ACM Trans. Program. Lang. Syst., 2(4):484--521, 1980.
- 253
-
C. L. Seitz.
Concurrent VLSI architectures.
IEEE Trans. Computs., C-33(12):1247--1265, 1984.
- 254
-
C. L. Seitz.
The cosmic cube.
Commun. ACM, 28(1):22--33, 1985.
- 255
-
C. L. Seitz.
Multicomputers.
In C.A.R. Hoare, editor, Developments in Concurrency and
Communication. Addison-Wesley, 1991.
- 256
-
M. S. Shephard and M. K. Georges.
Automatic three-dimensional mesh generation by the finite octree
technique.
Int. J. Num. Meth. Engng., 32(4):709--749, 1991.
- 257
-
J. Shoch, Y. Dalal, and D. Redell.
Evolution of the Ethernet local computer network.
Computer, 15(8):10--27, 1982.
- 258
-
H. Simon.
Partitioning of unstructured problems for parallel processing.
Computing Systems in Engineering, 2(2/3):135--148, 1991.
- 259
-
J. Singh, J. L. Hennessy, and A. Gupta.
Scaling parallel programs for multiprocessors: Methodology and
examples.
IEEE Computer, 26(7):42--50, 1993.
- 260
-
M. Singhal.
Deadlock detection in distributed systems.
Computer, 22(11):37--48, 1989.
- 261
-
P. Sivilotti and P. Carlin.
A tutorial for CC++.
Technical Report CS-TR-94-02, Caltech, 1994.
- 262
-
A. Skjellum.
The Multicomputer Toolbox: Current and future directions.
In Proc. Scalable Parallel Libraries Conf., pages 94--103. IEEE
Computer Society, 1993.
- 263
-
A. Skjellum, editor.
Proc. 1993 Scalable Parallel Libraries Conf.
IEEE Computer Society, 1993.
- 264
-
A. Skjellum, editor.
Proc. 1994 Scalable Parallel Libraries Conf.
IEEE Computer Society, 1994.
- 265
-
A. Skjellum, N. Doss, and P. Bangalore.
Writing libraries in MPI.
In Proc. Scalable Parallel Libraries Conf., pages 166--173.
IEEE Computer Society, 1993.
- 266
-
A. Skjellum, S. Smith, N. Doss, A. Leung, and M. Morari.
The design and evolution of Zipcode.
Parallel Computing, 20:565--596, 1994.
- 267
-
J. R. Smith.
The Design and Analysis of Parallel Algorithms.
Oxford University Press, 1993.
- 268
-
L. Snyder.
Type architectures, shared memory, and the corollary of modest
potential.
Ann. Rev. Comput. Sci., 1:289--317, 1986.
- 269
-
H. S. Stone.
High-Performance Computer Architectures.
Addison-Wesley, third edition, 1993.
- 270
-
B. Stroustrup.
The C++ Programming Language.
Addison-Wesley, second edition, 1991.
- 271
-
C. Stunkel, D. Shea, D. Grice, P. Hochschild, and M. Tsao.
The SP1 high-performance switch.
In Proc. 1994 Scalable High-Performance Computing Conf., pages
150--157. IEEE Computer Society, 1994.
- 272
-
R. Suaya and G. Birtwistle, editors.
VLSI and Parallel Computation.
Morgan Kaufmann, 1990.
- 273
-
J. Subhlok, J. Stichnoth, D. O'Hallaron, and T. Gross.
Exploiting task and data parallelism on a multicomputer.
In Proc. 4th ACM SIGPLAN Symp. on Principles and Practice of
Parallel Programming. ACM, 1993.
- 274
-
X.-H. Sun and L. M. Ni.
Scalable problems and memory-bounded speedup.
J. Parallel and Distributed Computing, 19(1):27--37, 1993.
- 275
-
V. Sunderam.
PVM: A framework for parallel distributed computing.
Concurrency: Practice and Experience, 2(4):315--339, 1990.
- 276
-
Supercomputer Systems Division, Intel Corporation.
Paragon XP/S Product Overview, 1991.
- 277
-
P. Swarztrauber.
Multiprocessor FFTs.
Parallel Computing, 5:197--210, 1987.
- 278
-
D. Tabak.
Advanced Multiprocessors.
McGraw-Hill, 1991.
- 279
-
A. Tantawi and D. Towsley.
Optimal load balancing in distributed computer systems.
J. ACM, 32(2):445--465, 1985.
- 280
-
R. Taylor and P. Wilson.
Process-oriented language meets demands of distributed processing.
Electronics, Nov. 30, 1982.
- 281
-
Thinking Machines Corporation.
The CM-2 Technical Summary, 1990.
- 282
-
Thinking Machines Corporation.
CM Fortran Reference Manual, version 2.1, 1993.
- 283
-
Thinking Machines Corporation.
CMSSL for CM Fortran Reference Manual, version 3.0, 1993.
- 284
-
A. Thomasian and P. F. Bay.
Analytic queuing network models for parallel processing of task
systems.
IEEE Trans. Computs., C-35(12):1045--1054, 1986.
- 285
-
E. Tufte.
The Visual Display of Quantitative Information.
Graphics Press, 1983.
- 286
-
J. Ullman.
Computational Aspects of VLSI.
Computer Science Press, 1984.
- 287
-
Building an advanced climate model: Program plan for the CHAMMP climate
modeling program.
U.S. Department of Energy, 1990.
Available from National Technical Information Service, U.S. Dept of
Commerce, 5285 Port Royal Rd, Springfield, VA 22161.
- 288
-
L. Valiant.
A bridging model for parallel computation.
Commun. ACM, 33(8):103--111, 1990.
- 289
-
R. A. van de Geijn.
Efficient global combine operations.
In Proc. 6th Distributed Memory Computing Conf., pages
291--294. IEEE Computer Society, 1991.
- 290
-
E. F. van de Velde.
Concurrent Scientific Computing.
Number 16 in Texts in Applied Mathematics. Springer-Verlag, 1994.
- 291
-
Y. Wallach.
Parallel Processing and Ada.
Prentice-Hall, 1991.
- 292
-
W. Washington and C. Parkinson.
An Introduction to Three-Dimensional Climate Modeling.
University Science Books, 1986.
- 293
-
R. Williams.
Performance of dynamic load balancing algorithms for unstructured
mesh calculations.
Concurrency: Practice and Experience, 3(5):457--481, 1991.
- 294
-
S. Wimer, I. Koren, and I. Cederbaum.
Optimal aspect ratios of building blocks in VLSI.
In Proc. 25th ACM/IEEE Design Automation Conf., pages 66--72,
1988.
- 295
-
N. Wirth.
Program development by stepwise refinement.
Commun. ACM, 14(4):221--227, 1971.
- 296
-
M. Wolfe.
Optimizing Supercompilers for Supercomputers.
MIT Press, 1989.
- 297
-
P. H. Worley.
The effect of time constraints on scaled speedup.
SIAM J. Sci. and Stat. Computing, 11(5):838--858, 1990.
- 298
-
P. H. Worley.
Limits on parallelism in the numerical solution of linear PDEs.
SIAM J. Sci. and Stat. Computing, 12(1):1--35, 1991.
- 299
-
J. Worlton.
Characteristics of high-performance computers.
In Supercomputers: Directions in Technology and its
Applications, pages 21--50. National Academy Press, 1989.
- 300
-
X3J3 Subcommittee.
American National Standard Programming Language Fortran
(X3.9-1978).
American National Standards Institute, 1978.
- 301
-
J. Yan, P. Hontalas, S. Listgarten, et al.
The Automated Instrumentation and Monitoring System (AIMS)
reference manual.
NASA Technical Memorandum 108795, NASA Ames Research Center, Moffett
Field, Calif., 1993.
- 302
-
H. Zima, H.-J. Bast, and M. Gerndt.
SUPERB: A tool for semi-automatic MIMD/SIMD parallelization.
Parallel Computing, 6:1--18, 1988.
- 303
-
H. Zima and B. Chapman.
Supercompilers for Parallel and Vector Computers.
Addison-Wesley, 1991.
© Copyright 1995 by Ian Foster