The basic idea underlying modular design is to organize a complex system (such as a large program, an electronic circuit, or a mechanical device) as a set of distinct components that can be developed independently and then plugged together. Although this may appear a simple idea, experience shows that the effectiveness of the technique depends critically on the manner in which systems are divided into components and the mechanisms used to plug components together. The following design principles are particularly relevant to parallel programming.
Simple interfaces reduce the number of interactions that must be considered when verifying that a system performs its intended function. Simple interfaces also make it easier to reuse components in different circumstances. Reuse is a major cost saver. Not only does it reduce time spent in coding, design, and testing, but it also allows development costs to be amortized over many projects. Numerous studies have shown that reusing software is by far the most effective technique for reducing software development costs.
As an example, a modular implementation of a climate modeling system (Figure 2.3) may define distinct modules concerned with atmosphere modeling, ocean modeling, etc. The interfaces to each module can comprise a small set of procedures that access boundary data, advance the simulation, and so on. Hence, there is no need for the user to become familiar with the implementation of the various modules, which collectively may comprise hundreds of procedures and tens of thousands of lines of code.
The benefits of modularity do not follow automatically from the act of subdividing a program. The way in which a program is decomposed can make an enormous difference to how easily the program can be implemented and modified. Experience shows that each module should encapsulate information that is not available to the rest of a program. This information hiding reduces the cost of subsequent design changes. For example, a module may encapsulate
Notice that we do not say that a module should contain functions that are logically related because, for example, they solve the same part of a problem. This sort of decomposition does not normally facilitate maintenance or promote code reuse.
While modular designs can in principle be implemented in any programming language, implementation is easier if the language supports information hiding by permitting the encapsulation of code and data structures. Fundamental mechanisms in this regard include the procedure (or subroutine or function) with its locally scoped variables and argument list, used to encapsulate code; the user-defined datatype, used to encapsulate data; and dynamic memory allocation, which allows subprograms to acquire storage without the involvement of the calling program. These features are supported by most modern languages (e.g., C++ , Fortran 90, and Ada) but are lacking or rudimentary in some older languages (e.g., Fortran 77).
The following design checklist can be used to evaluate the success of a modular design. As usual, each question should be answered in the affirmative.
We use a simple example to illustrate how information hiding considerations can influence design. To search a database for records matching a specified search pattern, we must read the database, search the database, and write any matching records found. One possible decomposition of this problem defines input, search, and output modules with the following interfaces.
input(in_file, database)search(database, search_pattern, matches)
output(out_file, database, matches)
An examination of what must be done to read a database, perform a search, and so on could then lead us to define the procedures that comprise the input, search, and output modules.
This design provides simple interfaces. However, all three modules depend on the internal representation used for the database, and hence must be modified if this representation is changed. In addition, each module probably duplicates database access functions.
An alternative decomposition, driven by information hiding concerns, focuses on the internal representation of the database as something potentially complex, likely to change, and common to many components. Hence, this information is hidden in a single database module that provides operations such as the following.
read_record(file, id, record)add_record(id, record, database)
get_record(id, record, database)
write_record(file, id, record)
The rest of the program can now be written without knowing anything about how the database is implemented. To change the internal representation of the database, we need simply to substitute a different implementation of the database module, which furthermore is ideally suited for reuse in other applications.
© Copyright 1995 by Ian Foster