Abstracts for Tech Reports for 1991

TR29-1991 Refinement of the Process-Channel-Process Model for Distributed Implementation of Hierarchical Flush Channels by Keith Shafer & Mohan Ahuja. PAPER COPY ONLY While communication in asynchronous distributed systems is often modeled as process-to-process communication, it actually occurs as process-to-channel(agent)-to-process (PCP) communication. The PCP model allows for the isolation of communication channel properties for implementation and comparison purposes while removing concerns for the implementation details from the communicating processes. In this paper, we continue the refinement of the PCP model and explore distributed, numerical channel implementations based on the PCP model and augmented data structures. Key Words: communication channels, PCP communication model, Hierarchical F-Channels, numerical channel implementation.

TR28-1991 Clarifying Some Fundamental Concepts in Software Testing by Allen S. Parrish & Stuart H. Zweben. PAPER COPY ONLY At the foundational level, software testing theory consists of a set of concepts and definitions that are generally accepted by the research community. Unfortunately, this classical framework does not properly support the introduction of certain assumptions about how the testing process is conducted. In this paper, we will show that certain extensions to the classical definitions admit proper formal analysis under these particular assumptions. We use this extended framework to motivate how certain properties of adequacy criteria should be developed and applied. This enables us to resolve a conflict from the testing literature between two apparently fundamental properties of criteria called monotonicity and the correctness condition. The framework allows the characterization of competing models of the testing process, and motivates differing properties of adequacy criteria under different models. In general, the extended framework should provide a more useful basis for future theoretical research in testing. Index Terms: Adequacy criteria, Formal systems, Fundamental properties, Software testing.

TR27-1991 PR Banyan: A Packet Switch with a Pseudo Randomizer for Nonuniform Traffic by Young Man Kim & Kyungsook Lee. PAPER COPY ONLY Buffered banyan networks are very effective as a packet switch under uniform traffic. However, they are highly vulnerable to nonuniform traffic, due to path sharing as well as the provision of a single path per network input-output pair. Improving upon an earlier packet distribution network, which is itself a banyan network [2,3,12], a single-stage packet-scattering hardware unit, called the pseudo randomizer (PR), is proposed. The PR-banyan, the PR followed by a buffered banyan, is analyzed under nonuniform traffic and is shown to be highly effective. The analytic results are shown to match the simulation results very closely. Key Words: Buffered banyan networks, Nonuniform traffic, Packet distribution, Randomization network, Pseudo Randomizer, Performance Analysis.

TR26-1991 Functional Specification and Semantics of Nondeterministic Process Networks by Prasad Vishnubhotla. PAPER COPY ONLY We define a class of nondeterministic process networks and show that they can be specified by input-output functions, i.e., functions that map input channels to output channels. To program nondeterminism, we use the notion of a multiplexor channel, which connects multiple senders to a single receiver so that the receiver gets a nondeterministic merge of the sequences sent by the senders. We model the history of a multiplexor channel using a tree structured trace, which records messages in the order they are sent rather than in the order they are received. A tree structured trace does not hide the parallelism between the messages sent on the channel and provides an exact representation of their causal relationships. A process is specified by a set of equations of the form di = fi(c1, ..., cm), where the ci's are input channels, the di's are output channels, and the fi's are continuous functions. The channel names represent tree structured traces. The system of equations has a unique minimal solution, the least fixpoint, which defines the semantics of the process. Key Phrases: Processor Networks, Nondeterminism, Specification, Verification, Semantics.
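
Standard fixpoint theory, which the abstract leaves implicit, is what continuity buys here: for a continuous function f over the ordered domain of tree structured traces, the least fixpoint exists and is the limit of iteration from the empty trace,

    \[ \mathrm{fix}(f) \;=\; \bigsqcup_{n \ge 0} f^{n}(\bot) \]

(Kleene's fixpoint theorem), so the semantics of a process is the least upper bound of its finite approximations. This is general background, not a formula taken from the report.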

TR25-1991 Benefits of Using an Integrated Architecture for Data Translation by Sandra Mamrak. PAPER COPY ONLY The Chameleon Research Project has demonstrated that for one particular class of encoding schemes the problem of data translation can be solved in a general, elegant way. A translation technology, the Integrated Chameleon Architecture, unique in its combined functionality and integration, automates the generation of translation code. The technology eliminates coding errors, reduces iterations to achieve correct translations, and increases productivity. In this manuscript we describe how the architecture is used to specify translators, and we report on our experience using the technology to develop translators for a document of type 'book' and for bibliographic databases. Keywords: code generation, data translation, intermediate form, SGML.

TR24-1991 Numerical Properties of the Matrix Sign Function Solution of Algebraic Riccati Equations by Judy Gardiner & Andy Pitonyak. PAPER COPY ONLY Because of its suitability for parallel computation, the matrix sign function has recently received attention as an algorithm for solving the algebraic Riccati equation. It has, however, been considered numerically suspect and not competitive with the more mature orthogonal transformation methods. This paper presents some enhancements to the sign function algorithm, along with the results of extensive testing of the sign function algorithm and comparisons with the orthogonal transformation algorithm embodied in RICPACK. We conclude that when used with an iterative refinement loop and appropriate stopping criteria, the sign function has numerical properties comparable to or better than those of the RICPACK algorithm. The sign function is also faster than RICPACK in most cases. We conclude that the matrix sign function provides an efficient and reliable method for solving algebraic Riccati equations on both standard and parallel computers. Key words: computational methods for control, parallel algorithms, algebraic Riccati equations, matrix sign function.
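
The heart of the sign function method is a Newton iteration: starting from the Hamiltonian matrix, S_0 = H, iterate S_{k+1} = (S_k + S_k^{-1})/2, which under the standard assumptions converges to sign(H); the stabilizing Riccati solution is then recovered from the stable invariant subspace of H. The sketch below is not code from the report (the names and tolerance are assumptions); it shows the same iteration for a real scalar, where the matrix inverse degenerates to a reciprocal.

    /* Illustrative sketch only: the Newton iteration behind the matrix
       sign function, shown for a real scalar x != 0, where sign(x) is
       +1 or -1.  The matrix version replaces the reciprocal by a matrix
       inverse: S_{k+1} = (S_k + S_k^{-1}) / 2. */
    #include <cmath>
    #include <cstdio>

    double scalar_sign(double x, int max_iter = 50, double tol = 1e-12) {
        double s = x;                                /* S_0 = x          */
        for (int k = 0; k < max_iter; ++k) {
            double next = 0.5 * (s + 1.0 / s);       /* Newton step      */
            if (std::fabs(next - s) < tol) return next;
            s = next;
        }
        return s;                                    /* roughly +1 or -1 */
    }

    int main() {
        std::printf("%f %f\n", scalar_sign(3.7), scalar_sign(-0.2));
        return 0;
    }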

TR23-1991 Query Processing by Cascading Predicates for Databases Supporting Procedures by Yao-Nan Lien and Shu-Shang Wei. PAPER COPY ONLY Processing a query for databases supporting procedures may induce many secondary queries. A predicate-cascading approach that follows the "multiple query optimization" paradigm is proposed. This approach evaluates each data unit brought into internal memory against all applicable secondary queries simultaneously. Neither the implication relationship among queries nor a complicated searching process to determine the best query plan is required. With an appropriate concurrency control, this approach can be applied to the general multiple query optimization problem as well. Moreover, the proposed approach can easily be extended to a parallel or distributed version. The preliminary results of experiments show that this approach can effectively reduce the I/O overhead.

TR22-1991 The Extended Low Hierarchy is an Infinite Hierarchy by Ming-Jye Sheu and Timothy J. Long NOTE: This abstract has a number of subscripts and superscripts, more than enough to make the more command and emacs give up. Much of it can't even be done on MacWrite. And putting it into TeX rather defeats the purpose of this abstracts list. To give you an idea of the content, what follows are a few selected parts of the abstract: Balcazar, Book, and Schoening introduced the extended low hierarchy based on the (Greek letter) Sigma-levels of the polynomial-time hierarchy. Allender and Hemachandra and Long and Sheu introduced refinements of the extended low hierarchy based on the Delta- and Theta-levels, respectively, of the polynomial-time hierarchy. In this paper we show that the extended low hierarchy is properly infinite. Our proofs use the circuit lower bound techniques of Hastad and Ko. Key words: the extended low hierarchy, circuit lower bounds, the polynomial-time hierarchy, relativizations.

TR21-1991 Algorithms for Polynomial Real Root Isolation by Jeremy R. Johnson. PAPER COPY ONLY Ph.D. Dissertation: This thesis investigates algorithms for polynomial real root isolation of polynomials with integer and real algebraic number coefficients. A real root isolation algorithm computes isolating intervals (intervals containing exactly one root) for all the real roots of a polynomial. Improved algorithms are derived and implemented for both integral and algebraic polynomials. Some of the improvements include fast algorithms for polynomial permutation and transformation, improved algorithms for computing the sign of a real algebraic number, algorithms for computing a root bound of a polynomial with real algebraic number coefficients, the use of multiple extensions, and the use of interval arithmetic. Algorithms based on Sturm sequences, Rolle's theorem and the derivative sequence, and Descartes' rule of signs and polynomial transformations are considered. The algorithms are carefully compared both theoretically and empirically. Improved computing time bounds are obtained, strong conjectures are provided for the average computing times, and insight is given into the performance of the various algorithms. This thesis contains the first careful comparison and implementation of these algorithms for polynomials with real algebraic number coefficients. Real root isolation is an essential and time-consuming part of G.E. Collins' cylindrical algebraic decomposition (CAD) based quantifier elimination (QE) algorithm. The improved algorithms presented in this thesis significantly reduce the computing time of Collins' algorithm. The different root isolation algorithms are compared using inputs that arise in the CAD algorithm. The CAD algorithm is also used to study the effect of perturbation of the coefficients of a polynomial on the number of real roots of the polynomial. Results from this study have implications for algorithms that use interval arithmetic.
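
The Descartes-rule-based isolators mentioned above rest on a simple primitive: the number of sign variations in a polynomial's coefficient sequence bounds its number of positive real roots, and a transformed polynomial whose count is 0 or 1 certifies an interval as containing no root or exactly one root. A minimal sketch of that primitive (my own illustration, not code from the thesis):

    /* Descartes' rule of signs: count sign variations between consecutive
       nonzero coefficients.  The count is an upper bound on the number of
       positive real roots and exceeds it by an even number, so a count of
       0 or 1 is exact. */
    #include <vector>

    int sign_variations(const std::vector<double>& coeff) {
        int count = 0;
        double prev = 0.0;                 /* last nonzero coefficient */
        for (double c : coeff) {
            if (c == 0.0) continue;
            if (prev != 0.0 && (prev > 0.0) != (c > 0.0))
                ++count;                   /* sign changed             */
            prev = c;
        }
        return count;
    }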

TR20-1991 Workstations in Engineering Education: Benefits and Required Infrastructure by Mervin E. Muller. PAPER COPY ONLY Summary: The remarkable evolution of digital computers which has led to the availability of engineering workstations is summarized. A case is made that engineering workstations not only offer opportunities to enhance education in developed countries but also offer a practical way to enhance education in Third World countries. The term "engineering workstation" is explained. Opportunities for substantially enhancing engineering education through the deployment of interconnected engineering workstations are identified. A description is given of the necessary infrastructure and commitments needed to realize the benefits that can result from using workstations in the education of engineers. A description is also given of the Interactive Instructional Computing Facilities (IICF), which are based upon the use of interconnected workstations, in the Department of Computer and Information Science in the College of Engineering at The Ohio State University.

TR19-1991 Modeling Asynchronous Distributed Communication by Keith Shafer & Mohan Ahuja. PAPER COPY ONLY While communication in asynchronous distributed systems occurs as process-to-channel-to-process (PCP) communication, it is often modeled as process-to-process communication. That is, the communication channels are not explicitly mentioned. Here, we pursue the notion that asynchronous distributed communication should be modeled as PCP communication. In addition to providing a framework for implementation, this approach leads to a greater understanding of the power of different channel types. To demonstrate the model, we provide an implementation of a Hierarchical F-Channel, a channel based on the hierarchical model of system development, hierarchical communication speeds, and F-channels. Key Words: communication channels, PCP communication model, hierarchical communication channels.

TR18-1991 Communication-Free Hyperplane Partitioning of Nested Loops by C.-H. Huang & P. Sadayappan. PAPER COPY ONLY This paper addresses the problem of partitioning the iterations of nested loops and the data arrays accessed by the loops. Hyperplane partitions of disjoint subsets of data arrays and loop iterations that result in the elimination of communication are sought. A characterization of necessary and sufficient conditions for communication-free hyperplane partitioning is provided.
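
A small example of the kind of partition being characterized (the loop is my own illustration, not one from the paper): in the nest below, iteration (i, j) touches only row i of A and row i of B, so the hyperplanes i = constant partition both the iteration space and the arrays into disjoint pieces that can be placed on different processors with no communication at all.

    /* Hypothetical loop nest admitting a communication-free hyperplane
       partition: every access in iteration (i, j) lies in row i, so the
       iterations sharing a value of i and the matching rows of A and B
       can be assigned to one processor. */
    const int N = 1024;
    double A[N][N], B[N][N];

    void relax_rows() {
        for (int i = 0; i < N; ++i)          /* distribute over processors */
            for (int j = 1; j < N; ++j)
                A[i][j] = A[i][j - 1] + B[i][j];
    }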

TR17-1991 A Study on the Structure of Linear Recursion by Wen Yu Lu and Dik Lun Lee. PAPER COPY ONLY We study a general class of single linear recursions and the properties of their expansions by analyzing the structures of the recursions. We show that the expansions of a linear recursion of this class are very regular in that the variable connections are heavily shared and change periodically with respect to the expansions. The variable connections can be precisely characterized as static bindings and chain connections. We conclude that a single linear recursion under our assumptions is either bounded or can be expressed as chain recursions. This study contributes to query processing, since it provides the basis for rule compilation as a general and powerful technique for query processing. Combined with query information, the expansion properties of the recursion provide optimized query processing plans.

TR16-1991 Compiling Loops for Hierarchical Memory Multiprocessors by J.D. Martens and D.N. Jayasimha. PAPER COPY ONLY In this paper we investigate issues related to programming on and compiling for hierarchical shared memory multiprocessors. A method of parallelizing loops which takes advantage of hierarchical memory is introduced, and the expected performance of the hierarchical machine is compared with that of a conventional architectural model. The comparison emphasizes synchronization and communication costs. Key Words: parallelizing compilers, hierarchical memory, shared memory multiprocessing, iterative loops, latency.

TR15-1991 What is an Effective Schedule? by D.R. Lutz & D.N. Jayasimha. PAPER COPY ONLY Parallel algorithms are more difficult to analyze than sequential algorithms, in part because we have to account for communication between the processors in addition to the usual measures of time and space. To find lower bounds on communication, we have to examine processor schedules, and to find meaningful lower bounds, we have to restrict the kinds of schedules that we examine. The traditional restriction placed on schedules is to require that all processors perform equal work. We show that this restriction is insufficient, but we use it as a starting point, and by successively refining our intuitive notions of what good and bad schedules are, we arrive at the concept of an effective schedule. We then apply the concept of effective schedules to the analysis of four parallel algorithms represented as directed acyclic graphs (DAGs): the binary tree, the diamond DAG, the triangular solver DAG, and the FFT DAG. We pay particular attention to a communication/time tradeoff result which has been shown by Papadimitriou and Ullman for the diamond DAG [PaUl87] and by Jayasimha and Loui for the triangular solver DAG [JaLo88]. We find that all known examples of this tradeoff occur only with ineffective schedules. Key words: parallel algorithmic analysis, communication complexity, parallel time complexity, tradeoff, effective schedules.

TR14-1991 Distributed Rule Processing in a Database Environment by Ing-Miin Hsu, Mukesh Singhal, Ming T. Liu. PAPER COPY ONLY Processing rules in a distributed database environment involves three design issues: how to decompose rules, how to distribute rules to sites, and how to evaluate distributed rules correctly. In this paper, we study these three issues for complicated rules, which are complex and time consuming to evaluate. We propose a new relational operator, AND, and the associated algebraic manipulations of this operator to find independent parts of a rule query, which can be distributed among sites. Due to problems introduced by the geographical distribution of sites in a distributed system, such as the lack of a global clock and communication delays, correct evaluation of distributed rules is not an easy task. We propose a distributed evaluation algorithm which guarantees the correctness of the evaluation result of the distributed rule by collecting consistent local results from sites to form the global view.

TR13-1991 Timestamping Events for Inferring "Affects" Relation and Potential Causality by Mohan Ahuja, Timothy Carlson, Ashwani Gahlot & Deborah Shands. PAPER COPY ONLY We define a relation "Affects" between every pair of events based on a relation "Locally Affects" for each process between each pair of events on the process. We define a mechanism for timestamping events such that the "Affects" relation, and hence potential concurrency, between events can be inferred from their timestamps. We give a timestamping mechanism such that potential concurrency can be inferred partially or completely, where the extent of such inference depends on the costs associated with the mechanism. The inferred "Affects" relation can be used for debugging and for determining potential concurrency between events in traces of executions. Key Words: Logical clocks, timestamps, event ordering, event causality, debugging and inferring causality.
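
The report defines its own timestamping mechanism; purely as a point of reference, the sketch below shows the familiar vector-clock scheme, under which one event can affect another exactly when the first event's timestamp is componentwise less than or equal to the second's and the two timestamps differ.

    /* Reference sketch only: the standard vector-clock scheme, not the
       mechanism defined in the report.  Each of n processes keeps an
       n-entry vector of counters. */
    #include <vector>
    #include <algorithm>

    struct VectorClock {
        std::vector<long> v;
        explicit VectorClock(int n) : v(n, 0) {}

        void local_event(int self) { ++v[self]; }     /* internal event   */

        std::vector<long> send(int self) {            /* stamp a message  */
            ++v[self];
            return v;
        }

        void receive(int self, const std::vector<long>& msg) {
            for (std::size_t i = 0; i < v.size(); ++i)
                v[i] = std::max(v[i], msg[i]);        /* componentwise max */
            ++v[self];
        }
    };

    /* true iff the event stamped a can affect the event stamped b */
    bool affects(const std::vector<long>& a, const std::vector<long>& b) {
        bool strictly_less = false;
        for (std::size_t i = 0; i < a.size(); ++i) {
            if (a[i] > b[i]) return false;
            if (a[i] < b[i]) strictly_less = true;
        }
        return strictly_less;
    }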

TR12-1991 A Framework for a Comprehensive Analysis of Navigation in Hierarchy Topologies by S.A. Mamrak and C.S. O'Connell. PAPER COPY ONLY Even though navigation strategies and mechanisms have been the target of much of the research and development for hyperbase applications, little synthesis or integration of results has been provided to the designer of hyperbase software. One effect of this lack of an encompassing evaluative framework is that little that is already known about navigation is making its way into the design of software packages in any generally accepted or standard way. In this manuscript we propose a general, comprehensive framework for analyzing navigation in the topologies occurring in hyperbases and demonstrate a method for using this framework to evaluate navigation in hierarchy topologies. The approach appears to be promising as a foundation for providing guidance to designers of navigation mechanisms for hyperbase software. Index Terms: disorientation, electronic manuscripts, hierarchy, hypertext, navigation, topology.

TR11-1991 On Multi-Threaded List-Processing and Garbage Collection by Wolfgang Kuechlin and Nicholas J. Nevin PAPER COPY ONLY We present the S-threads system for multi-threaded list-processing in C, and we discuss its unique techniques of parallel garbage collection. The system is implemented on top of C Threads on an Encore Multimax running Mach. It was developed for the parallelization of the SAC-2 Computer Algebra system, but S-threads is general enough to support the parallelization of other C-based list-processing systems. An S-thread is a C Thread extended by a private heap segment with a private list of available cells. Private heaps are dynamically allocated by claiming pages of cells from a global shared heap. S-threads can thus cons list cells concurrently, and the applications library can build other parallel list-processing on top. Our main discovery is that the exclusive use of the portable C Threads system has profound implications for garbage collection. We can no longer efficiently stop all processing at a barrier, which precludes us from using many popular parallel garbage compaction algorithms. However, threads can now perform preventive garbage collection upon exiting, where they free their private heap segment at the small cost of copying their results. This proves very effective in our experience, and appears especially promising for real-time symbolic computation. Under some restrictions, threads can also run a sequential garbage collection algorithm, such as mark-and-sweep, on their private heaps. On separate threads, multiple garbage collections and ordinary processing can then occur concurrently. Under a largely functional style of programming, there is then no need for a new parallel garbage collection algorithm, which is important for full upward compatibility of sequential software. We present empirical results from testing our garbage collection techniques on a parallel algorithm for polynomial g.c.d. calculation.

TR10-1991 Optimal Parallel Time Bounds for the Maximum Clique Problem on Intervals by Lin Chen. PAPER COPY ONLY In this paper, we give the optimal time bounds for solving the maximum clique problem on a set of intervals. The model of computation used is the parallel random access machine. A constant-time optimal parallel algorithm for finding the maximum of polynomially bounded integers is also included. Keywords: intervals, lower and upper bounds, maximum clique, parallel algorithms, PRAM.
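
For intervals, a maximum clique is a largest set of pairwise overlapping intervals, and its size equals the maximum number of intervals covering a single point. The sequential sketch below (my own illustration; the report's contribution is the optimal parallel bounds on a PRAM, not this loop) computes that number by sweeping sorted endpoints.

    /* Sequential sketch: size of a maximum clique of closed intervals,
       i.e. the largest number of intervals covering a common point.
       Left endpoints are processed before right endpoints at the same
       coordinate so touching intervals count as overlapping. */
    #include <vector>
    #include <algorithm>
    #include <utility>

    int max_clique_size(const std::vector<std::pair<double, double>>& iv) {
        std::vector<std::pair<double, int>> events;    /* (x, +1 or -1)   */
        for (const auto& i : iv) {
            events.push_back({i.first, +1});           /* interval opens  */
            events.push_back({i.second, -1});          /* interval closes */
        }
        std::sort(events.begin(), events.end(),
                  [](const auto& a, const auto& b) {
                      if (a.first != b.first) return a.first < b.first;
                      return a.second > b.second;      /* open before close */
                  });
        int depth = 0, best = 0;
        for (const auto& e : events) {
            depth += e.second;
            best = std::max(best, depth);
        }
        return best;
    }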

TR9-1991 NO LONGER AVAILABLE

TR8-1991 A Transfer Policy for Global Scheduling Algorithms to Schedule Tasks with Deadlines by Niranjan G. Shivaratri & Mukesh Singhal. PAPER COPY ONLY Two important components of a global scheduling algorithm are its transfer policy and its location policy. While the transfer policy determines whether a task should be transferred, the location policy determines where it should be transferred. Many global scheduling algorithms have been proposed to schedule tasks with deadline constraints. These algorithms try to transfer tasks only when tasks' deadlines cannot be met locally or the local load is high (i.e., they take only corrective measures). It is more appropriate for a scheduling algorithm to take preventive measures in addition to corrective measures to avoid potential deadline misses. In this paper, we present: a) a new load index, which characterizes the system state and is more conducive to preventive and corrective measures; and b) a new transfer policy, which takes preventive measures by doing anticipatory task transfers in addition to corrective measures. Key features of our transfer policy are: 1) it is general and can be used in conjunction with a broad range of existing location policies; 2) it does not require the capability to transfer partially executed tasks; 3) it adapts better to the system state by looking ahead. A simulation study shows that an algorithm making use of the new transfer policy and the new load index reduces the number of deadline misses significantly when compared to algorithms taking only corrective measures.
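
To make the corrective/preventive distinction concrete, the following sketch is purely hypothetical: the load index, look-ahead window, and threshold below are illustrative assumptions and not the definitions proposed in the report.

    /* Hypothetical transfer-policy sketch.  Corrective: the arriving
       task's deadline cannot be met behind the current backlog.
       Preventive: accepting the task would push a look-ahead load index
       past a threshold, so it is transferred before any deadline is
       actually in danger.  None of these definitions come from the
       report. */
    #include <vector>

    struct Task {
        double remaining_work;   /* processing time still needed */
        double deadline;         /* absolute deadline            */
    };

    /* assumed load index: work due within the look-ahead window,
       normalized by the window length */
    double load_index(const std::vector<Task>& q, double now, double window) {
        double demand = 0.0;
        for (const Task& t : q)
            if (t.deadline <= now + window)
                demand += t.remaining_work;
        return demand / window;
    }

    bool should_transfer(std::vector<Task> q, const Task& arriving,
                         double now, double window, double threshold) {
        double backlog = arriving.remaining_work;
        for (const Task& t : q)
            backlog += t.remaining_work;
        bool corrective = (now + backlog > arriving.deadline);

        q.push_back(arriving);                        /* pretend to accept */
        bool preventive = (load_index(q, now, window) > threshold);

        return corrective || preventive;
    }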

TR7-1991 Shape Transformation by Boundary Representation Interpolation: A Recursive Approach to Establishing Face Correspondences by Richard E. Parent. PAPER COPY ONLY The issues relating to the shape transformation problem are discussed and a new algorithm is presented for computing the transformation of one shape into another. In this algorithm, the boundary definitions of the two initial shapes are used and a mapping is established between the vertices and edges of the respective objects. New vertices and edges are introduced into the object definitions when necessary to establish a one-to-one vertex correspondence and to match connectivity relationships between vertices. These can then be used to do a vertex-to-vertex interpolation that maintains valid polyhedral topologies for all of the in-between shapes. The algorithm presented works for objects that are topologically equivalent to spheres and can be applied to other pairs of objects topologically equivalent to each other once the user specifies an initial decomposition of the surfaces of the objects. CR Categories: I.3.5 - Computational Geometry and Object Modeling (Surface and Object Representation). I.3.7 - Three Dimensional Graphics and Realism (Animation). Additional Keywords: shape change, shape interpolation, design, animation, solid modeling.

TR6-1991 A Refinement of the Low and High Hierarchies by Tim Long and Ming-Jye Sheu. Based on the Sigma- and Delta-classes of the polynomial-time hierarchy, Schoening introduced low and high hierarchies within NP. Several classes of sets have been located in the bottom few levels of these hierarchies. Most results placing sets in the Delta-levels of the low hierarchy are related to sparse sets, and the proof techniques employed involve deterministic enumeration of sparse sets. Balcazar, Book, and Schoening introduced extended low hierarchies, involving sets outside of NP, based on the Sigma- and Delta-classes of the polynomial-time hierarchy. Several classes of sets have been located in the Delta-levels of these hierarchies as well, and once again most such results involve sparse sets. In this paper we introduce a refinement of the low and high hierarchies and of the extended low hierarchies. Our refinement is based on the Theta-classes of the polynomial-time hierarchy. We show that almost all of the classes of sets that are known to belong to the Delta-levels of the low and extended low hierarchies actually belong to the newly defined Theta-levels of these hierarchies. Our proofs use Kadin's technique of computing the census of a sparse set first, followed by a nondeterministic enumeration of the set. This leads to the sharper lowness results. We also consider the optimality of these new lowness results. For sets in the Theta-levels of the low hierarchy we have oracle results indicating that substantially stronger results are not possible through use of Kadin's technique. For sets in the Theta-classes of the extended low hierarchy we have tight absolute lower bounds, that is, lower bounds without oracles. These bounds are slightly stronger than similar bounds appearing in [AH92].

TR5-1991 The Power of Witness Reduction by Sanjay Gupta We identify "witness reduction" as the underlying theme of several recent results in complexity theory. These include Toda's result and Toda and Ogiwara's results (the latter "collapses" PH into various counting classes with high probability), and the hard functions for #P studied by Ogiwara and Hemachandra. Ogiwara and Hemachandra have shown that subtraction, division, etc. are hard for #P in the sense that if #P is closed under composition with subtraction and division, then it is closed under composition with every other #P function. We identify witness reduction as the reason for the existence of hard functions. To support our thesis we define new function classes which are a priori closed under composition with subtraction and division, implying that subtraction and division do not reduce witnesses in these classes. However, proper subtraction and integer division do reduce witnesses in the new classes and turn out to be hard for these classes. We also introduce and formalize the notion of random witness reduction and randomly hard functions and obtain results analogous to those of Ogiwara and Hemachandra.

TR4-1991 Newton's Method by Anthony Leclerc. PAPER COPY ONLY Newton's method has evolved considerably since its conception in 1669. This paper first considers the graphical and algebraic formulations of Newton's method, including error analyses. Later, the issue of convergence is investigated from a novel graphical/fractal perspective. Weaknesses with the traditional point conception of Newton's method are pointed out, and a completely reliable interval version of Newton's method is motivated and spotlighted. Finally, a problem with the naive extension of the one-dimensional interval Newton's method to multiple dimensions is revealed. This problem, arising from the difficulty in inverting an interval matrix, is circumvented by an ingenious variation due to Krawczyk. C++ code is included demonstrating the reliability of interval Newton's method on several classical test problems.
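
The one-dimensional interval step that makes the method reliable is easy to state: with X an interval thought to contain a root, m its midpoint, and F'(X) an interval enclosure of the derivative over X, form N(X) = m - f(m)/F'(X) and intersect it with X. An empty intersection proves X contains no root, while containment of N(X) in X certifies a unique root. The sketch below is my own illustration of that step; it is not the C++ code distributed with the report, and it omits outward rounding and extended interval division.

    /* One step of one-dimensional interval Newton.  f and df are
       interval extensions of the function and its derivative supplied
       by the caller.  Outward rounding is omitted for brevity; when the
       derivative enclosure contains zero, this sketch simply bisects
       rather than using extended interval division. */
    #include <algorithm>

    struct Interval { double lo, hi; };

    Interval sub(double a, Interval b) { return {a - b.hi, a - b.lo}; }

    Interval divi(Interval a, Interval b) {      /* requires 0 not in b */
        double c[4] = {a.lo / b.lo, a.lo / b.hi, a.hi / b.lo, a.hi / b.hi};
        return {*std::min_element(c, c + 4), *std::max_element(c, c + 4)};
    }

    /* Returns false when the step proves X contains no root; otherwise
       X is replaced by X intersected with N(X) = m - f(m)/F'(X). */
    template <class F, class DF>
    bool newton_step(F f, DF df, Interval& X) {
        Interval d = df(X);                      /* enclosure of f' on X    */
        if (d.lo <= 0.0 && 0.0 <= d.hi) {        /* derivative may vanish:  */
            X.hi = 0.5 * (X.lo + X.hi);          /* bisect; the caller must */
            return true;                         /* also examine the other  */
        }                                        /* half of the interval    */
        double m = 0.5 * (X.lo + X.hi);
        Interval fm = f(Interval{m, m});         /* thin interval at m   */
        Interval N = sub(m, divi(fm, d));
        double lo = std::max(X.lo, N.lo), hi = std::min(X.hi, N.hi);
        if (lo > hi) return false;               /* empty: no root in X  */
        X = {lo, hi};
        return true;
    }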

TR3-1991 Flush Message Passing in Communicating Sequential Processes by Mohan Ahuja, Kannan Varadhan, & Amitabh Sinha. PAPER COPY ONLY In this paper we examine a model of concurrency amongst concurrent sequential processes with asynchronous message sending done using ordinary and Flush message passing primitives. We show, using two paradigms, that Flush primitives permit more accurate modeling of problem situations than synchronous message passing. We also show that Flush message passing primitives allow for greater concurrency than synchronous message passing primitives. Keywords: F-channels, parallel architectures, synchronisation, asynchronous networks, concurrency, distributed systems, algorithms, measuring concurrency.

TR2-1991 On the Multi-Threaded Computation of Integral Polynomial Greatest Common Divisors by Wolfgang Kuechlin PAPER COPY ONLY We report experiences and practical results from parallelizing the Brown-Collins polynomial g.c.d. algorithm, starting from Collins' SAC-2 implementation IPGCDC. The parallelization environment is PARSAC-2, a multi-threaded version of SAC-2 programmed in C with the parallelization constructs of the C Threads library. IPGCDC computes the g.c.d. and its co-factors of two polynomials in Z[z1, ..., zr], computing modular g.c.d.'s and then recovering the result by Chinese remaindering. After studying timings of the SAC-2 algorithm, we first parallelize the Chinese remainder algorithm, and then we parallelize the main loop of IPGCDC by executing the modular g.c.d. computations concurrently. Finally, we determine speed-ups and speed-up efficiencies of our parallel algorithms over a wide range of polynomials. Our experiments were conducted on a 12-processor Encore Multimax 320 under the Mach operating system.

TR1-1991 Hierarchy of Communication Speeds for Designing Concurrent Systems by Mohan Ahuja. PAPER COPY ONLY One reason for the popularity of the layered approach to system design has been the ease of designing correct systems because of the ease of visualizing the execution, which becomes a sequence of processes, each resulting from the execution of a layer. We observe that sequentialization of processes is a heavier-weight approach than is needed to gain the required ease of visualization, and we aim at a lighter-weight approach. We define the notion of communication speed as it is relevant to distributed systems and then define and propose a hierarchy of communication speeds as an alternative to sequentialization of processes. We illustrate how the concept of a hierarchy of communication speeds and Flush-channels can be combined to give useful synchronization abstractions. These abstractions provide the same ease as sequentialization of processes and are expected to improve system efficiency in a layered system; in this context they can be viewed as hierarchical Flush-channels. They are also useful for designing concurrent programs, where again they can be viewed as Hierarchical Flush-channels. Index terms: Flush-channels, Communication Speeds, Asynchronous Distributed Systems, Communication Architectures, Protocol Engineering, Distributed Algorithms, Programming Techniques for Protocols/Algorithms, and Synchronization.