
Runtime Libraries and HPF Language Extensions  

The last word on runtime support for HPF (besides that which handles I/O) must surely be the many-authored report by Fox et al. [73]. The (as yet unmet) goal of the Parallel Compiler Runtime Consortium is to develop a public domain library of program transformation and runtime components supporting data parallelism; address translation and data movement will be among the functions supported. (This BAA was funded and may produce results in the 1995-6 timeframe.) It appears that much of this work is being wrapped into the PORTS effort (Portable Run Time System).

Agrawal, Sussman and Saltz [11] (also discussed in [157]) describe the Multiblock Parti library, which helps to parallelize multiblock and multigrid codes on distributed memory machines. This is important because HPF does not allow you to map arrays (or templates) to subsets of the processor space, which one sometimes needs to do to keep communication overheads low while maintaining load balance. A related problem is mapping an array in non-constant chunks, as sketched below.
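A minimal sketch of the limitation, using standard HPF directives (the processor and array names are illustrative):

      REAL A(1000)
!HPF$ PROCESSORS P(8)
!HPF$ DISTRIBUTE A(BLOCK) ONTO P
!     A is spread BLOCK-wise across all eight processors.  Standard
!     HPF provides no way to map A onto only a subset such as P(1:4),
!     and a plain BLOCK distribution cannot give each processor a
!     chunk of a different size.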

The PARTI library has evolved into a more dynamic library called CHAOS [168], which can be used for adaptive irregular problems; HPF/Fortran 90D at Syracuse served as its testbed. The authors extend the language as well, with constructs like REDUCE(APPEND...).
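The kind of loop CHAOS targets is one whose communication pattern is known only at run time. A minimal sketch (the names and sizes are illustrative; the inspector/executor calls that CHAOS would insert are elided into comments):

      PROGRAM IRREG
      INTEGER, PARAMETER :: NNODES = 100, NEDGES = 300
      REAL X(NNODES), Y(NEDGES)
      INTEGER IX(NEDGES), I
      CALL RANDOM_NUMBER(X)
      Y = 0.0
      READ *, IX        ! indirection array known only at run time
!     Off-processor references X(IX(I)) cannot be analyzed at compile
!     time; CHAOS builds a communication schedule in an inspector
!     phase and reuses it each time the executor loop below runs.
      DO I = 1, NEDGES
         Y(I) = Y(I) + X(IX(I))
      END DO
      END PROGRAM IRREG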

A handy repository is LPAC, which has coded some of the HPF intrinsics in Fortran 90. It offers prefix_sum and grade_down, plus much more, though the implementations are not known to be very efficient.
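As an illustration of what such a library must supply, here is a minimal serial Fortran 90 scan (a sketch of the idea only, not LPAC's code):

      FUNCTION PREFIX_SUM(A) RESULT(S)
!        Running sum: S(I) = A(1) + ... + A(I).
         REAL, INTENT(IN) :: A(:)
         REAL :: S(SIZE(A))
         INTEGER :: I
         IF (SIZE(A) == 0) RETURN
         S(1) = A(1)
         DO I = 2, SIZE(A)
            S(I) = S(I-1) + A(I)
         END DO
      END FUNCTION PREFIX_SUM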

An efficient, HPF-callable ScaLAPACK library is available from IBM (PESSL [100]). HPF directives partition the arrays, and a USE PESSL_HPF statement then lets you call PESSL, which implements its functions efficiently using MPI [68].
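The usage pattern looks roughly as follows. The distribution directives are standard HPF, but the solver call itself is elided into a comment, since the actual PESSL entry points should be taken from the product documentation:

      PROGRAM SOLVE
      USE PESSL_HPF
      REAL(8) :: A(1000,1000), B(1000)
!HPF$ DISTRIBUTE A(CYCLIC,CYCLIC)
!HPF$ ALIGN B(I) WITH A(I,*)
      CALL RANDOM_NUMBER(A)
      CALL RANDOM_NUMBER(B)
!     ... CALL an HPF-callable PESSL routine on A and B here; PESSL
!     operates on the distributed pieces, communicating via MPI.
      END PROGRAM SOLVE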

Several extensions to the HPF language add task parallelism. An interesting one is Fx from CMU [82,173], which uses a HeNCE-like graph to connect tasks (subroutines) to be run in parallel; the extended HPF compiler takes care of shipping data among them.

Subhlok et al. describe an algorithm for mapping sequences of data parallel tasks onto a set of processors, so that the sequence executes in a pipelined fashion. This is a ``pre-tool'' for the Fx compiler [173]. In addition, the compiler can, through analysis, expose implicit task parallelism and schedule it [82].
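Schematically, the code Fx pipelines is an ordinary chain of data parallel calls. The sketch below is plain Fortran 90 with placeholder stages, not Fx's actual directive syntax:

      PROGRAM PIPE
      INTEGER, PARAMETER :: N = 64, NFRAMES = 100
      REAL :: FRAME(N,NFRAMES), OUT(N,NFRAMES), SPEC(N), FSPEC(N)
      INTEGER :: I
      CALL RANDOM_NUMBER(FRAME)
      DO I = 1, NFRAMES
         CALL STAGE1(FRAME(:,I), SPEC)    ! e.g. an FFT
         CALL STAGE2(SPEC, FSPEC)         ! e.g. a filter
         CALL STAGE3(FSPEC, OUT(:,I))     ! e.g. an inverse FFT
      END DO
!     Mapped by Fx onto disjoint processor subsets, the three stages
!     run concurrently: while stage 3 finishes frame I, stage 1 can
!     already be working on frame I+2.
      CONTAINS
         SUBROUTINE STAGE1(X, Y)
            REAL, INTENT(IN)  :: X(:)
            REAL, INTENT(OUT) :: Y(:)
            Y = X            ! placeholder for a real kernel
         END SUBROUTINE STAGE1
         SUBROUTINE STAGE2(X, Y)
            REAL, INTENT(IN)  :: X(:)
            REAL, INTENT(OUT) :: Y(:)
            Y = 2.0 * X      ! placeholder
         END SUBROUTINE STAGE2
         SUBROUTINE STAGE3(X, Y)
            REAL, INTENT(IN)  :: X(:)
            REAL, INTENT(OUT) :: Y(:)
            Y = X            ! placeholder
         END SUBROUTINE STAGE3
      END PROGRAM PIPE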

Alternatively, you could couple several different HPF programs using the runtime library described by Ranganathan et al. [160].

There is also a proposed Kernel HPF, which contains ``only the most efficient language constructs of HPF'' [139]. Meltzer [138] subsequently proposed HPF_SPMD (later to become HPF_CRAFT), which extends the HPF_LOCAL extrinsic by incorporating Kernel HPF features; HPF_SPMD is put forward as a new hybrid language, heavy on the concept of private data.

Mahéo and Pazat [130] have built an interesting runtime library that supports HPF data distributions via paging. Their compiler generates calls to runtime routines that manage arrays by pages, in order to achieve efficient execution on distributed memory systems.
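The idea, in a minimal sketch (the page size, table layout, and routine names are illustrative assumptions, not the authors' actual interface): each distributed array is cut into fixed-size pages, and a small table maps a page to its owning process, so a global index resolves to an owner plus a local offset.

      INTEGER FUNCTION PAGE_OWNER(GIDX, PAGESZ, PTABLE)
!        Process owning the page that holds global index GIDX.
         INTEGER, INTENT(IN) :: GIDX, PAGESZ, PTABLE(:)
         PAGE_OWNER = PTABLE((GIDX - 1) / PAGESZ + 1)
      END FUNCTION PAGE_OWNER

      INTEGER FUNCTION PAGE_OFFSET(GIDX, PAGESZ)
!        Position of the element within its page; each owner stores
!        its pages contiguously in local memory.
         INTEGER, INTENT(IN) :: GIDX, PAGESZ
         PAGE_OFFSET = MOD(GIDX - 1, PAGESZ) + 1
      END FUNCTION PAGE_OFFSET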

