Technical Report Series Computer and Information Science Research Center Department of Computer and Information Science The Ohio State University 2015 Neil Avenue, Columbus, OH 43210-1277 **********************NOTE********************* Abstracts for 1996 NOTE: some titles may be available ONLY via paper (This is noted after the title/author section.) If you have trouble accessing a file, please send email to tfletch@cis.ohio-state.edu 1996 TR67 Object Selection Based on Oscillatory Correlation by DeLiang Wang One of the classical topics in neural networks is winnter-take-all (WTA), which has been widely used in unsupervised (competitive) learning, cortical processing, and attentional control. Because of global connectivity, WTA networks, however, do not encode spatial relations in the input, and thus cannot support sensory and perceptual processing where spatial relations are important. We propose a new architecture that maintains spatial relations between input features. This selection network builds on LEGION (Locally Excitatory Globally Inhibitory Oscillator Networks) dynamics and slow inhibition. In an input scene with many objects (patterns), the network selects the largest object. This system can be easily adjusted to select several largest objects, which then alternate in time. We further show that a two-stage selection network gains efficiency by combining selection with parallel removal of noisy regions. The network is applied to select the most salient object in real images. As a special case, the selection network without local excitation gives rise to a new form of oscillatory WTA. TR66 Time Constrained Bandwidth Smoothing for Interactive Video-On- Demand Systems by Wu-chi Feng The use of a client-side buffer in the delivery of compressed prerecorded video can be an effective tool for removing the burstiness required of the underlying server and network by smoothing the bandwidth requirements for continuous delivery. Given a fixed-size smoothing buffer, several bandwidth smoothing algorithms have been introduced in the literature that are provably optimal under certain constraints, typically requiring large buffer residency times to realize their optimal properties. The large buffer residency times, however, make providing VCR functions harder to support. In this paper, we introduce two new algorithms that, in addition to the size of the client side buffer, use a time constraint as a parameter in the bandwidth plan creation, making the plans more amenable to supporting VCR interativity. Our results show that the buffer residency times can be reduced, while still allowing the bandwidth allocation to be smoother for continuous video delivery. TR63 Inheriting and Modifying Behavior by Neelam Soundararajan and Stephen Fridella In many languages, the mechanism of protected members is critical to making inheritance work. It allows access to some of the internals of a base class B to the designer of a derived class D, while denying it to clients of B. We develop an axiomatic approach in which class behavior is specified in terms of an abstract specification AND a concrete specification, the former for use by a client, the latter containing information about how the protected members change, for use by a client, the latter containing information about how the protected members change, for use by a derived class designer. We present proof rules that allow us to establish that a class meets both specifications. TR62 Logical Firewalls: A Mechanism for Security in Future Networking Environments by Jun Xu and Mukesh Singhal Although current firewalls serve the purpose of current network security needs, they have several drawbacks and are not suitable for future networking environments. This paper examines the state of the art of the firewall technology and points out deficiencies in the current designs. It advocates software firewalls for solving the security problems in future networking environements. It explains the design of such software firewalls, called logical firewalls. Keywords: Firewalls, network security, logical firewalls, protected group, protected network, protected host. TR61 Efficient Schemes for LImited Directory-based DSMs Using Multidestination Message Passing by Donglai Dai and Dhabaleswar K. Panda Many limited directory schemes have been proposed in the literature to build cost-effective DSM systems. However, all these schemes are based on networks supporting only point-to-point (unicast) message passing. New generation networks are providing architectural support to implement collective communication operations (broadcast, multicast, reduction, etc.) with reduced latency. This paper explores the impact of such architectural support in designing efficient and more cost-effective limited directory schemes. The study is carried out for womrhole k-ary n-cube networks supporting multidestination message passing mechanism. The study in this paper is carried out along two major directions. First, a variation of the diriB scheme to work with multidestination-based broadcast and gather operations is proposed. This scheme is defined as m_diriB scheme. Next, two coarse vector schemes (m_diriCB1 and m_diriCB2) are proposed to work efficiently with selective broadcast operations with multidestination messages. These schemes are evaluated for 32 and 64 processor systems with SPLASH2 benchmarks and compared with limited directory schemes using unicast message passing ((u_diriB and u_diriCV). It is shown that m_dir1B (with a single pointer can reduce the program execution time up to a factor of 3.47 compared to u_dir1B scheme. Similary m-Idr1B scheme performs closer to u_dir3CV scheme (with three pointers) and fully-mapped scheme. For 1-3 pointers, the m_diriCB1 and m_diriCB2 are shown to further reduce program execution time compared to the m_diriB scheme. For 1 pointer, m_dir1CB1 performs better than the u_dir1CV scheme by a factor of up to 1.66. These results indicate that limited directory schemes with only a single pointer can be designed to work efficiently on new generation wormhole networks with multidestination message passing. Such results provide strong guidelines to build future scalable DSM systems in a cost-effective manner. Keywords: Distributed shared memory, wormhole routing, directory-based protocols, cache coherency, meshes, parallel architecture, interprocessor communication, multidestination mechanism, and path-based routing. TR60 A Coherent Family of Analyzable Graphical Representations for Object-Oriented Software by Mary Jean Harold and Gregg Rothermel Many software engineering tools and techniques rely on graphical representations of software, such as control flow graphs, program dependence graphs, or system dependence graphs. To apply these tools and techniques to object-oriented software, we cannot necessarily use existing graphical representatios of procedural software. Representations of object-oriented software, like those for procedural software, must depict individual procedures (methods) and entire programs; however, they must also depict classes and their interactions, and account for the effects of inheritance, polymorphism, and aggregation. These representations should facilitate the use of existing program analysis tools and techniques, rather than requiring us to create new techniques. A system for constructing and managing these representatinos should take advantage of the code reuse inherent in object- oriented software, by reusing representational components. In this paper, we describe a coherent family of graphical representations of object-oriented software that meets the fore-going requirements, and we outline the architecture of an efficient, extensible system for constructing and managing those representations. TR59 Implementing Multidestination Worms in Switch Based Parallel Systems: Architectural Alternatives and their Impact by Craig B. Stunkel, Rajeev Sivaram, and Dhabaleswar K. Panda Multidestination message passing has been proposed as an attractive mechanism for efficiently implementing multicast and other collective operations on direct networks. However, applying this mechanism to switch based parallel systems is non-trivial. In this paper we propose alternative switch architectures with differing buffer organizations to implement multidestination worms on switch based parallel systems. First we discuss issues related to such implementation (deadlock-freedom, replication mechanisms, header encoding, and routing). Next, we demonstrate how an existing central queue based switch architecture supporting unicast message passing can be enhanced fairly easily to accommodate multidestination message passing. Similarly, implementing multidestination worms on an input buffer based switch architecture is discussed. Both these implementations are evaluated against each other as well as against a software-based scheme using the central queue organization. Simulation experiments under a range of traffic (multiple multicast, bimodal, varying degree of multicast, and message length) and system size are used for evaluation. The study demonstrates the superiority of the central queue based switch architecture for efficient implementation of multidestination worms. Such an implementation can perform well without saturating for applied loads of 0.8 or greater. It also indicates that under bimodal traffic the central-queue based hardware multicast implementation affects background unicast traffic less adversely compared to a software-based multicast implementation. Thus, multidestination message passing can easily be applied to switch-based parallel systems to deliver good collective communication performance. Keywords: Parallel computer architecture, wormhole routing, cut-through routing, multicast, broadcast, collective communication, multistage interconnection networks, performance evaluation. TR58 Analysis of Hierarchical Reliable Multicast Transport Protocols for Dissemination and Collaborative Communications by Walid Mostafa and Mukesh Singhal Sender-initiated reliable multicast protocols rely on positive acknowledge- ments (ACKs) that lead to an ACK implosion as the number of receivers increases. Receiver-initiated protocols rely on negative acknowledgements (NAKs) that result in longer delays to detect missing packets by a receiver. Hierarchical approaches realize scalable communication by distributing error control functions (e.g., packet loss detection and retransmission) among a hierarchy of nodes. In this report we derive analytical expressions for the throughput of different node types in hierarchical multicast control trees and analyze their sensitivity to configuration parameters. We also derive expressions for the maximum throughput of collaborative multicast communication among R nodes where each node is actively involved in sending and receiving data. We show that protocol variations that improve the throughput of 1xR dissemination communications can degrade the throughput of RxR collaborative communications. Keywords: Hierarchical Reliable Multicast, Transport Protocols, Restricted Multicast, Collaborative Communication, and Dissemination. TR57 A Distributed Fault-Detection and Recovery Protocol for Reliable Multicast Collaborative Communications by Walid Mostafa and Mukesh Singhal Reliable multicast transport protocols support dissemination communication and their throughputs are either limited by the sender or by intermediate nodes that consolidate acknowledgements and retransmissions. RMSP is a portable reliable multicast protocol that supports collaborative communication at the session-layer level and provides higher throughput by maintaining two transport connections per node irrespective of the number of nodes. However, the throughput only of RMSP may degrade if one or more nodes or the centralized connection manager that monitors them fails. In this report, we propose a fully distributed fault-tolerant protocol, DFT-RMSP, that detects and isolates faulty nodes and reinstates the multicast collaborative communications. We derive analytic expresions for throughput degradation and analyze its sensitivity to node-failure rates, network delay, and multicast group size. We show tha DFT-RMSP provides less down-time and achieves graceful throuput degradation under high failure rates compared with RMSP, simultaneous node failures. Keywords: Distributed Faul-Tolerance, Reliable Multicast, Session Protocols, and Collaborative Communication TR56 Distributed Dynamic Carrier Allocation in Mobile Cellular Networks: Search vs. Update by Xuefeng Dong and Ten H. Lai There are two approaches for distributed dynamic carrier allocation algorithms: update and search. This paper investigates the fundamental differences between the two approaches. We present two simple, representative schemes, namely the basic update scheme and the basic search scheme. We will argue that for two reasons the update approach is more appropriate for mobile cellular networks that support real-time communications. First, it is hard to support carrier reallocations with the search approach. Second, the search approach usually incurs longer delay during carrier acquisitions than the update approach. We then propose a new update scheme that is more efficient than the basic update scheme in terms of message complexity and carrier acquisition delay. Our simulation results show that, compared to the basic update scheme, the new update scheme not only cuts the average carrier acquisition delay by 60% to 95%, but also reduces the average number of messages required per carrier acquisition by almost 50%. TR55-DIR This entry is a number of slides used in lecturing on ATM. Each set of slides has its own filename within the subdirectory 1996/TR55-DIR. The abstracts for them appear below: ----------------------------------------------------------------------- o ATM Forum/96-1294: Performance of TCP over ABR on ATM backbone and with various VBR traffic patterns (October 1996) tcp_vbr.ps We extend our earlier study of buffer requirements of TCP over ABR in two directions. First, we study the performance of TCP over ABR in an ATM backbone. On the backbone, the TCP queues are at the edge router and not inside the ATM network. The router requires buffer equal to the sum of the receiver window sizes of the participating TCP connections. Second, we introduce various patterns of VBR background traffic. The VBR background introduces variance in the ABR capacity and the TCP traffic introduces variance in the ABR demand. Some simple switch schemes are unable to keep up with the combined effect of highly varying demands and highly varying ABR capacity. We present our experiences with refining the ERICA+ algorithm to handle these conditions. ----------------------------------------------------------------------- o ATM Forum/96-1270: Tutorial Paper on ABR Source Behavior (October 1996) atm96-1270.ps ----------------------------------------------------------------------- o ATM Forum/96-1267: ABR Switch Algorithm Testing: A Case Study with ERICA (October 1996): atm96-1267.ps This contribution discusses the requirements of ATM Available Bit Rate switch algorithms, and demonstrates how each of these requirements can be tested. As a case study, the performance of the ERICA switch algorithm is evaluated and the effect of some features and options of the algorithm is examined. The requirements tested include: efficiency, delay, fairness, transient and steady state performance, scalability, and adaptation to variable capacity and various source traffic models. The performance of the algorithm is evaluated for various configurations and background traffic patterns. ----------------------------------------------------------------------- o ATM Forum/96-1269: Performance of TCP over UBR+ (October 1996) : atm96-1269.ps ATM switches respond to UBR congestion by dropping cells when their buffers become full. TCP connections using the UBR service experience low throughput and low fairness. For 100% TCP throughput each switch needs buffers equal to the sum of the window sizes of all the TCP connections. Intelligent drop policies can improve the performance of TCP over UBR with limited buffers. The UBR+ service consists of enhancements to UBR for intelligent drop. We found that Early Packet Discard (EPD) improves throughput but does not improve fairness. Selective packet drop based on per-connection buffer occupancy improves fairness. The Fair Buffer Allocation scheme further improves both throughput and fairness. ----------------------------------------------------------------------- o ATM Forum/96-1172: ERICA Switch Algorithm: A Complete Description (August 1996) : atm96-1172.ps The ERICA switch algorithm has been discussed extensively in TM group in the past. However, over the last two years, the algorithm has been substantially modified. This contribution describes the current version of ERICA switch algorithm in complete detail. The algorithm achieves both efficiency and fairness, and exhibits a fast transient response. The development of the algorithm is traced, and the new approaches it uses to achieve its objectives are highlighted. Several design and implementation aspects of the algorithm are examined. In addition, several enhancements to the algorithm are proposed and studied. ----------------------------------------------------------------------- o ATM Forum/96-0517: Buffer Requirements for TCP over ABR (April 1996) : af_abr22.ps In our previous study, it was shown that cell loss due to limited buffering may degrade throughput considerably. The key question is how much buffering is required to avoid cell loss. This contribution attempts to answer that question. We show that the maximum buffers required at the switch is proportional to the maximum round trip time of any VC through the link. The number of round-trips depends upon the the switch algorithm used. With our ERICA (modified) switch algorithm, we found that the buffering required is independent of the number of TCP sources. These observations are valid even when VBR or two- way traffic is used. We substantiate our arguments with simulation results. Our simulations are carried out with a modified version of the ERICA algorithm. We also give a brief description of the modifications to ERICA which include avoidance of unnecessary spikes, correct counting of bursty sources and enhanced fairness. ----------------------------------------------------------------------- o ATM Forum/96-0518: Performance of TCP over UBR and buffer requirements (April 1996): af_ubr22.ps We study TCP throughput and fairness over UBR for several buffer and maximum window sizes. TCP performs best when there is no loss. The performance is severely degraded under loss (regardless of UBR or ABR). Therefore, we study the switch buffer requirements for zero loss. The required buffers are found to be the function of number of VCs and their round-trip times. ----------------------------------------------------------------------- o ATM Forum/96-0177: TBE and TCP/IP traffic (February 1996) : af_rl6b2.ps The effect of TCP traffic over ATM ABR is studied with the ERICA switch algorithm. ABR implements its rate-based traffic control at switches (via ER algorithms like ERICA) and at sources (via source rules, such as, Rule 6, which uses TBE parameter). TCP implements its own traffic controls via slow start window control. We study the interaction between the two mechanisms. In particular, this contribution concentrates on the effect of Transient Buffer Exposure (TBE) parameter and Source End System Rule 6 on TCP/IP connections in a Wide Area Network (WAN). ----------------------------------------------------------------------- o ATM Forum/96-0178: Comments on "Use-it or Lose-it" (Annex E of TM4.0) (February 1996) : af_rl5b2.ps Annex E of T4.0 describes several optional NIC implementations for the use-it or lose-it policies. A minor change in one of the options will result in significant performance improvements. ----------------------------------------------------------------------- o ATM Forum/95-1660: A Fix for Source End System Rule 5 (December 1995) : af_rl52.ps In the October 1995 meeting of the ATM Forum, it was found that source end system rule 5 does not handle ACR retention problem correctly. It causes undesirable oscillations in many cases and significantly reduces the performance of the bursty sources. In this contribution, we analyze the solution that was proposed in October 1995 meeting and finally present a better solution. While the case of small bursts needs to be tackled, the rule works well for other simple configurations. ----------------------------------------------------------------------- o ATM Forum/95-1661: More Strawvote Comments: TBE vs Queue sizes (December 1995) : af_rl62.ps We analyze the role of Source End System rule 6 in depth. We find that the queue lengths can grow considerably in some cases even if the value of TBE (a.k.a. CIF) is small. ----------------------------------------------------------------------- o ATM Forum/95-1346: ERICA+: Extensions to the ERICA Switch Algorithm (October 1995) : af_erc22.ps Several enhancements to ERICA are described. The resulting scheme, ERICA+ provides congestion avoidance with 100% utilization. Even at 100% utilization, the queue lengths are bounded. The scheme gives better control of end-to-end delay compared to other schemes and keeps link utilization of expensive links high despite idle periods in the input load. The scheme responds well to VBR traffic by estimating the ABR capacity and employing a scheduling strategy between ABR and VBR. Simulation results on a set of representative configurations are presented. ----------------------------------------------------------------------- o ATM Forum/95-1345: Bursty ABR Sources (October 1995) : af_brst2.ps Bursty traffic introduces idle times in the traffic pattern and this has interesting effects on the source rules. We examine this through a definition of a new bursty model and a new metric. ----------------------------------------------------------------------- o ATM Forum/95-1344: New Source Rules and Satellite links (October 1995) : af_cif2.ps The effect of new CIF and ICR formulae on satellite networks is studied. Source rule 5 (TOF dectease) seems to have mostly negative effect on performance. Rescheduling, on the other hand, has positive effect. ----------------------------------------------------------------------- o ATM Forum/95-1343: Straw-Vote comments on TM 4.0 R8 (October 1995): af_stvc2.ps Several problems with Xrm and ICR computation using RTT and use of rule 5 are pointed out. ----------------------------------------------------------------------- o ATM Forum/95-973R1: Out-of-Rate RM Cell Issues and Effect of Trm, TOF, and TCR (August 1995): af_trm2.ps In this contribution, we examine the transient response time in going from a very low rate to a high rate. The frequency of feedback is low for low rate sources and this affects transients. Parameter Trm can improve the frequency of feedback. The rise is also controlled by the Tof parameter which may trigger an unnecessary decrease. We also observe that out-of-rate cells are not optional for NICs because of the ACR=0 case. We propose corrections to the End System pseudo code to correct these and other discrepancies. Note: Most of these corrections have been adopted in revision 8 of the TM 4.0 spec. ----------------------------------------------------------------------- o ATM Forum/95-972R1: Parameter Values for Satellite Links (August 1995) : af_xrm2.ps In this contribution, we examine the effect of the source parameters Xrm and XDF on long delay paths like satellite paths. We observe low throughput in the absence of congestion even for a simple one source configuration. We propose an increase in the length of the Xrm parameter from 8 bits to 32 bits to account for large bandwidth-delay product links. Note: this proposal was accepted in the revision 8 of TM version 4.0 ----------------------------------------------------------------------- o ATM Forum/95-0467: Simulation Results for ERICA Switch Algorithm with VBR+ABR traffic (April 1995): af_vbr2.ps This contribution discusses the issues raised by the presence of VBR and presents simulation results for the ERICA switch algorithm. ----------------------------------------------------------------------- o ATM Forum/95-0178R1: A Sample Switch Algorithm, (February 1995): af_swco2.ps Describes the switch algorithm developed at OSU. The algorithm is fully compatible with the latest source and switch requirements and has been thoroughly tested. The text to be included in the TM document appendix is presented. ----------------------------------------------------------------------- o ATM Forum/95-0179: Simulation Results for the Sample Switch Algorithm (February 1995): af_swsi2.ps Simulation results for various configuration for the sample switch algorithm are presented. ----------------------------------------------------------------------- o ATM Forum/94-1175R1: Current Default Proposal: Unresolved Issues (November 94): af_iss22.ps Problems with the default baseline text as circulated after the October interim meeting are pointed out. ----------------------------------------------------------------------- o ATM Forum/94-1173: Transient Performance of EPRCA and EPRCA++ (November 94) : af_trans.ps Transient performance of EPRCA and EPRCA++ (a modification of EPRCA+) is compared. ----------------------------------------------------------------------- o ATM Forum/94-0987: BECN: Why we need a timestamp or sequence nuber in the RM Cell (October 1994): af_becn.ps BECN cells and FECN cells can arrive at the source out of order and the effect of a BECN can be nullified by subsequent FECN cells that actually bring older information. Knowing which RM Cell represents newer information helps get full advantage of BECN. This can be done by having a sequence number or time stamp field in the RM cell. The Ordered BECN option of the OSU scheme is described and the effect of BECN with and without this order is shown. ----------------------------------------------------------------------- o ATM Forum/94-0988: Simulation Results: The EPRCA+ Scheme (October 1994) : af_osuc2.ps Simulation results comparing EPRCA and EPRCA+ are presented. In particular, EPRCA+ has fast transient performance, offers congestion avoidance, has few parameters, and is parameter insensitive. ----------------------------------------------------------------------- o ATM Forum/94-0882: Rate Based Schemes: Mistakes to Avoid (September 1994): af_9409-mistakes2.ps One key advantage of the rate-based approach in general and the explicit-rate approach in particular is that it provides considerable freedom to switch manufacturers in terms of specific algorithms to use. However, it is easy to misdesign a scheme unless the overload is measured properly and the feedback is related to the control properly. This contribution points out examples of such mistakes. ----------------------------------------------------------------------- o ATM Forum/94-0883: The OSU Scheme for Congestion Avoidance using Explicit Rate Indication (September 1994): af_osu2.ps An explicit rate indication scheme for congestion avoidance in ATM networks is proposed. The sources monitor their load and provide the information periodically to the switches. The switches, in turn, compute the load level and ask the sources to adjust their rates up or down. The scheme achieves high link utilization, low delay, fair allocation of rates among contending sources, provides quick convergence and works for bursty traffic. TR54 Source Behavior for ATM ABR Traffic Management: An Explanation by Raj Jain, Shivkumar Kalyanaraman, Sonia Fahmy and Rohit Goyal This paper is a tutorial on TM4.0 source behavior. The Available Bit Rate (ABR) service has been developed to support data applications over Asynchronous Transfer Transfer Mode (ATM) networks. The network continuously monitors its traffic and provides feedback to the source end systems. TR53 ABR Switch Algorithm Testing: A Case Study with ERICA by Raj Jain, Sonia Fahmy, Shivkumar Kalyanaraman, and Rohit Goyal This contribution discusses the requirements of ATM Available Bit Rate switch algorithms, and demonstrates how each of these requirements can be tested. As a case study, the performance of the ERICA switch algorithm [1] is evaluated and the effect of some features and options of the algorithm is examined. The requirements tested include: efficiency, delay, fairness, transient and steady state performance, scalability, and adaptation to variable capacity and various source traffic models. The performance of the algorithm is evaluated for various configurations and background traffic patterns. TR52 Performance of TCP over UBR+ by Rohit Goyal, Raj Jain, Shiv Kalyanaraman, and Sonia Fahmy ATM switches respond to UBR congestion by dropping cells when their buffers become full. TCP connections using the UBR service experience low throughput and low fairness. For 100% TCP throughput each switch needs buffers equal to the sum of the window sizes of all the TCP connections. Intelligent drop policies can improve the performance of TCP over UBR with limited buffers. The UBR+ service consists of enhancements to UBR for intelligent drop. We found that Early Packet Discard (EPD) improves throughput but does not improve fairness. Selective packet drop based on per-connection buffer occupancy improves fairness. The Fair Buffer Allocation scheme further improves both throughput and fairness. TR51 ERICA Switch Algorithm: A Complete Description by Raj Jain, Shiv Kalyanaraman, Rohit Goyal, Sonia Fahmy, and Ram Viswanathan The ERICA switch algorithm has been discussed extensively in TM group in the past. However, over the last two years, the algorithm has been substantially modified. This contribution describes the current version of ERICA switch algorithm in complete detail. The algorithm achieves both efficiency and fairness, and exhibits a fast transient response. The development of the algorithm is traced, and the new approaches it uses to achieve its objectives are highlighted. Several design and implementation aspects of the algorithmn are examined. In addition, several enhancements to the algorithm are proposed and studied. TR50 Performance of TCP over ABR on ATM backbone and with various VBR traffic patterns by Shiv Kalyanaraman, Raj Jain, Sonia Fahmy, Rohit Goyal, and Jianping Jiang We extend our earlier studies of buffer requirements of TCP over ABR [10, 11,12] in two directions. First, we study the performance of TCP over ABR in an ATM backbone. On the backbone, the TCP queues are at the edge router and not inside the ATM network. The router requires buffer equal to the sum of the receiver window sizes of the participating TCP connections. Second, we introduce various patterns of VBR background trafic. The VBR background introduces variance in the ABR capacity and the TCP traffic introduces variance in the ABR demand. Some simple switch schemes are unable to keep up with the combined effect of highly varyijng demands and highly varying ABR capacity. We present our experiences with refining the ERICA + switch scheme to handle these conditions. TR49 Range Image Segmentation Using a LEGION Network by Xiuwen Liu and DeLiang Wang This paper presents a neural network approach for range image segmentation. We use a locally excitatory globally inhibitory oscillator network (LEGION) as the framework for range image segmentation. Each oscillator in the LEGION network has excitatory lateral connections to the oscillators in its neighborhood as well as a connection with a global inhibitor. The feature detector associated with each oscillator estimates the surface normal and curvature at its corresponding pixel location. The lateral connection between two oscillators is established based on a similarity measure of their feature vectors. The emegent behavior of the LEGION network gives rise to the segmentation result. Unlike other methods, our scheme needs no assumption about the underlying structures in image data and no prior knowledge regarding the number of regions. Experimental results using real range images are presented. TR48 Dynamic Carrier Allocation Strategies for Mobile Cellular Networks by Xuefeng Dong and Ten H. Lai In dynamic carrier allocation (DCA) strategies, carriers are assigned priorities. Based upon how priorities are assigned to carriers, DCA strategies are classified into three classes: static-priority, dynamic- praiority, and hybrid-priority strategies. In this paper, we lay out a theoretical foundation for DCA strategies: we first develop the concept of optimal reuse pattern and then propose a model for resource planning as well as three guiding principles for priority assignment in static- and hybrid- priority DCA strategies. These theoretical results will shed light on the strength and weakness of existing DCA strategies and lead us to developing more efficient strategies. Specificially, we propose three new DCA strategies: Column-Wise static-priority (CWSP), Column-Wise hybrid-priority (CWHP), and Two-Step dynamic-priority (TSDP). The CWSP amd CWHP strategies are obtained from the Geometric strategy by following our guiding principles for priority assignment. The TSDP strategy, on the other hand, was derived from the Nanda-Goodman strategy by applying our notion of optimal reuse patterns. It not only attempts to reuse carriers as compactly as possible (just as other dynamic-priority strategies do), but also goes one step further by bringing the actual carrier reuse pattern closer to the optimal pattern. The performance of the three new DCA strategies are simulated under both uniform and non-uniform traffic distributions. The simulation results show that each one of the three new DCA strategies outperforms existing strategies in its own class, and the TSDP strategy has the best performance among all DCA strategies. TR47 Relaxation Oscillators with Time Delay Coupling by Shannon R. Campbell and DeLiang Wang We study networks of relaxation oscillators coupled with time delay synapses. A pair of oscillators is analyzed and shown to attain loosely synchronous solutions for a wide range of initial conditions and time delays. Simulations of one and two dimensional oscillator networks indicate that locally coupled oscillators are also loosely synchronous. Desynchronous solutions are possible when system parameters are varied. To characterize loosely synchronous networks, we introduce a measure of synchrony, the maximum time difference between any two oscillators. In locally excitatory globally inhibitory oscillator networks with time delays, we find that desynchronous solutions for different groups of oscillators are maintained, and the number of groups that can be segregated is related to the maximum time difference within each group. To examine the maximum time difference, we display its histograms for oscillator networks in one and two dimensions. Also, a range of initial conditions is given so that the maximum time difference is contained as the system evolves. Keywords: neural networks, relaxation oscillators, time delays, biological oscillations, synchronization TR46 Efficient Scatter Communication in Wormhole k-ary n-cubes with Multidestination Message Passing by Mohammad Banikazemi and Dhabaleswar K. Panda This paper presents efficient ways to implement one-to-all scatter communication on k-ary n-cubes with multidestination message- passing. Earlier researchers have applied multidestination message-passing to support non- personalized communications. In this paper, for the first time in the literature, we demonstrate that multidestination message-passing can also be used to support efficient personalized communication. Current parallel systems use either a sequential tree-based (ST) scheme or a binomial tree- based (BT) scheme with unicast message-passing to implement scatter. First, we show that the ST scheme achieves the lower bound just for a narrow range of message lengths. Next, we show that the BT scheme has a better overall performance compared to ST but, it can never achieve the lower bound. Finally, we proposed a sequential multidestinatin tree-based (SMT) scheme using multidestination messages which minimizes the scatter latency as well as achieves the lower bound. This scheme allows data for multiple destinations to be grouped together and sent as a single message using multidestination multicast/broadcast worms. All three schemes are evaluated for a range of system sizes (k,n), communication start-up times (ts), propagation times (tp), and message lengths (l). For small message sizes on 8x8 torus, it is shown that the SMT scheme can reduce the scatter latency up to a factor of 8 compared to the ST scheme and up to a factor of 1.6 compared to the BT scheme. For a range of system sizes and (ts/tp) ratios, it is shown that the SMT scheme performs better than the ST and BT schemes for a wide range of message lengths. Thus, these results demonstrate significant potential to be applied to design scalable collective communication libraries for current and future generation wormhole systems. Keywords: Collecive communication, scatter, personalized communication, wormhole routing, k-ary n-cubes, path-based routing, and multidestination message passing. TR45 Reliable Hardware Barrier Synchronization Schemes by Rajeev Sivaram, Craig B. Stunkel and Dhabaleswar K. Panda Barrier synchronization is a crucial operation for parallel systems. Many schemes have been proposed in the literature to achieve fast barrier through software, hardware, or a combination of these mechanisms. However, none of these schemes emphasize fault-tolerant barrier operations. In this paper, we describe inexpensive support that can be added to network switches for achieving reliable hardware-based barrier synchronization while recovering from lost or corrupted messages. Necessary modifications to the switch architecture and the associated fault-tolerant message-passing protocols are presented. Thes protocols are optimized for the no-fault case while providing means to detect the failure of any step of the operation and to recover from it. The proposed scheme is evaluated with and without specialized support at the network interface and compared with similar approaches using software- based schemes. It promises significant potential to applied to switch-based parallel systems, espeically the emerging networks of workstations. Keywords: Parallel computer architecture, collective communication, barrier synchronization, interprocessor communication, fault-tolerance, and networks of workstations. TR44 Classes as Assertions by Neelam Soundararajan How do we formally specify the relation between a base class and a derived class? This question has two parts, a syntactic one, and a semantic one. The syntactic part is of course the easier of the two and the answer to that part is the standard contra/co- variance requirement on the arguments and result of any base class method redefined in the derived class. Our concern in the current paper is with the semantic part of the question, i.e., how do we specify the behavioral relation between the base class and the derived class? We show that the standard answer - which is the semantic counterpart of contra/co-variance - is too rigid, and does not allow some natural and common forms of inheritance. We then propose a more flexible way to specify the relation, and show how different types of behavioral relations between base classes and derived classes may be specified using our notation. TR43 An Efficient Coterie-Based Mutual Exclusion Scheme With Fault- Tolerance Capability by Guohong Cao and Mukesh Singhal The performance of a mutual exclusion algorithm is measured by the number of messages exchanged per critical section execution and the delay between successive executions of the critical section. There is a message complexity and synchronization delay trade-off in mutual exclusion algorithms. Maekawa's algorithm reduces message complexity to O(Square root of N); however, it increases the synchronization delay to 2T. This paper presents an algorithm which reduces the synchronization delay to T (average message delay) and still has the low message complexity O(K) (K is the size of the quorum, which can be as low as log N)> A correctness proof and detailed performance analysis are provided. The paper identifies one possible deadlock in Maekawa's algorithm and proposes a fix for it. Key words: Coteries, mutual exclusion, fault-tolerance. TR42 RMSP: A Reliable Multicast Session Protocol for Collaborative Continuous-Feed Applications by Walid Mostafa and Mukesh Singhal This report presents a portable reliable multicast session protocol for collaborative continuous-feed applications. The protocol supports NxN group multicast communications at the session layer and runs on the top of existing reliable transport protocols (e.g., TCP). It does not assume multicast support at the network layer. The protocol is portable as a network-layer-independent library that is suitable for both LAN and WAN environments. It multiplexes messages fropm different group members into transport packets to reduce the transport overhead per message. It requires only two transport-level connections per node, irrespective of the multicast group size, to reduce the overhead of acknowledgements and retransmissions per node. We compare the throughput of this protocol with the throughput of general sender-initiated and receiver-initiated reliable multicast transport protocols. Keywords: Reliable Multicast, Transport Protocols, Session Multiplexing, Group Communication, and Distributed Systems. TR41 Alleviating Consumpton Channel Bottleneck in Wormhole-Routed k-ary n-cube Systems by Debashis Basak and Dhabaleswar K. Panda NOTE: This tech report is substantially different from a tech report of the same name published in 9/95, TR36. This paper identifies performance degradation in wormhole routed k-ary n-cube networks due to limited number of router-to-processor consumption channels at each node. Many recent research in wormhole routing have advocated the advantages of adaptive routing and virtual channel flow control schemes to deliver better network performance. This paper indicates that the advantages associated with these schemes can not be realized with limited consumption capacity. To alleviate such performance bottleneck, a new network interface design using multiple consumption channels is proposed. To match virtual multiplexing on network channels, we also propose each consumption channel to support multiple virtual consumption channels. The impact of message arrival rate at a node on the required number of consumption channel is studied analytically. It is shown that wormhole networks with higher routing adaptivity, dimensionality, degree of hot-spot traffic, and number of virtual lanes have to take advantage of multiple consumption channels to deliver better performance. The interplay between system topology, routing algorithm, nuymber of virtual lanes, messaging overheads, and communication traffic is studied through simulation to derive the effective number of consumption channels required in a system. Using the on-going technological trend, it is shown that wormhole-routed systems can use up to 2-4 consumption channels per node to deliver better system performance. Keywords: Parallel computer architecture, wormhole routing, k-ary n-cube, consumption Channel, virtual Channel, deterministic routing, adaptive routing, hot-spot traffic, performance evaluation, and interprocessor communication. TR40 Multicast on Irregular Switch-based Networks with Wormhole Routing by Ram Kesavan, Kiran Bondalapati, and D.K. Panda This paper presents efficient multicasting with reduced contention on irregular networks with switch-based wormhole interconnection and unicast message passing. First, it is proved that for an arbitrary irregular network with a typical deadlock-free, adaptive routing, it may not be possible to create an ordered list of nodes to implement an arbitrary multicast in a contention-free manner with minimal number of communication steps. Next, three different multicast algorithms are proposed with their respective node orderings to reduce contention: switch-based ordering (SO), switch-based hierarchical ordering (SHO), and chain concatenation ordering (CCO). A variation of a binomial tree-based communication pattern with unicast message passing is used on the above ordered lists to implement multicast. The proposed multicast algorithms are compared with each other as well as with their naive random ordering (RO) algorithm for a range of system sizes, switch sizes, message lengths, degrees of connectivity, destination set sizes, and communication start-up times. With larger system size and larger destination set size, the CCO algorithm is shown to be the best to implement multicast with reduced contention and minimum latency. The SHO algorithm emerges as the second best. Such results related to multicast on irregular networks are the first of their kind in the wormhole literature. Thus, these results demonstrate significant potential to be applied to current and future generation NOW systems with irregular interconnection. Keywords: Collective communication, Multicase, Broadcast, Wormhole Routing, Irregular Networks, Link Concention, and Networks of Workstations. TR39 An Efficient Protocol for Call Setup and Path Migration in IEEE 802.6 Based Personal Communication Networks by Xuefeng Dong and Ten-Hwang Lai Recently, DQDB (IEEE 802.6) MAN has been proposed as a component of the Personal Communication Networks, in which base statinos of wireless infra- structures are connected by a number of DQDBs which in turn are connected by bridges. We propose a protocol for call setup and path migration in a cluster of DQDBs. To pursue simplicity, efficiency, and fastness in path setup, we use a link-state-like routing method for path selection, and a source-routing-based protocol for path establishment. The feasibility of the source routing protocol in a DQDB cluster is guaranteed by our novel labeling scheme. Each label represents the routing direction from one DQDB to another such that a path can be represented by a sequence of labels. Because of the locality property of the labels, a few bits are enough to denote the labels unambiguously. Using multi-bit labels instead of multi-bye physical addresses, a path can be specified in a 53-octet DQDB slot. TR38 Dependency Sequences and Hierarchical Clocks: Efficient Alternatives to Vector Clocks for Mobile Computing Systems by Ravi Prakash and Mukesh Singhal Vector clock has been used to capture causal dependencies between processes in distributed computing systems. It is not suitable for mobile computing systems due to (i) lack of scalability: its size is equal to the number of nodes, and (ii) its inability to cope with fluctuatinos in the number of nodes. This paper presents two efficient alternatives to vector clock, namely, sets of dependency sequences, and hierarchical clock. Both the alternatives are scalable, and are immune to fluctuations in the number of nodes in the systems. TR37 Multitolerance (Extended Abstract) by Anish Arora and Sandeep S. Kulkarni Multitolerance refers to the ability of a system to tolerate multiple classes of faults, each in a possibly different way. Achieving this ability is essential, not only because increasingly systems are subject to multiple classes of faults, but because designing a uniform type of tolerance to all their fault-classes is often inappropriate or inefficient. In this paper, we give a formal definition of what it means for a system to be multitolerant and we propose a method for the design of multitolerant systems. Our method simplifies the design task by proceeding in a stepwise fashion: In each step, tolerance to a new fault-class is added while preserving tolerance to the previously simple theory is exploited to ensure mutual interference-freedom among the detectors, correctors, and underlying system. We demonstrate the method by designing novel, fully distributed, multitolerant programs. Keywords: multiple fault-tolerances, design methods, detectors, correctors, interference-freedom, distributed systems software TR36 Failure Recovery based on Quasi-Syhnchronous Checkpointing in Mobile Computing Systems by D. Manivannan and M. Singhal Mobile computing systems are expected to revolutionize the way computers are used. Mobile hosts have small memory, a relatively slow processor and low power batteries, and commmunicate over low bandwidth wireless communica- tion links. In this paper, we address the problem of failure recovery in mobile computing systems. Any recovery method for mobile computing systems should take into consideration energy and communication bandwidth constraints under which mobile computers have to operate. Synchronous checkpointing is not suitable for mobile systems since it involves high communication cost over a low bandwidth network. Asynchronous checkpointing is not suitable because multiple checkpoints need to be stored in the stable storage and also some or all of the checkpoints taken may be useless for constructing consistent global checkpoints. In this paper, we propose a low-overhead recovery algorithm based on a quasi-synchronous checkpointing algorithm for mobile computing systems. The checkpointing algorithm preserves process autonomy by allowing them to take checkpoints asynchronously and uses communication-induced checkpoint coordination for the progression of the recovery line which helps bound rollback propagation during a recovery. Thus, it has the easeness and low overhead of asynchronous checkpointing and the recovery time advantages of synchronous checkpointing. The checkpointing algorithm ensures the existence of a recovery line consistent with any checkpoint of any process all the time. The recovery algorithm exploits this feature to restore the system to a state consistent with the latest checkpoint of a failed process, in the event of a failure. It uses selective pessimistic message logging at the receiver end to handle the messages lost due to rollback. Keywords: Distributed checkpointing, mobile computing, failure recovery, fault-tolerance. TR35 Benefits of Processor Clustering in Designing Parallel Systems: When and How? by Debahis Basak and Dhabaleswar K. Panda NOTE: A similar title was used for 1995/TR41 but this is a new report. Advances in multiprocessor interconnect technology are leading to high performance networks. however, software overheads associated with message passing are limiiting the processors to get maximum performance from these networks, leading to under-utilization of network resources. Several research studies are ongoing for designing messaging protocols and hardware to reduce such overheads. However, with such protocols messaging overheads cannot be eliminated. Even though the overheads can be lowered, these will continue to be reasonably high compared to the network speed. In this paper we take an alternative approach and analyze on how to exploit emerging processor-cluster technology for designing high-performance and cost-effective parallel systems given that messaging overheads can limit maximum performance. Though processor-clusters are being used in some systems in an ad hoc manner, there is no formal analysis in the literature to show when and how processor clusters benefit in designing high performance and scalable systems. In this paper we present a design-space-graph framework for analyzing and solving this problem by considering processor- clustering, messaging overheads, and network performance in an integrated manner. Our analysis establishes the following three design guidelines. With messaging overheads constraining performance in a system, processor clustering can be used to build a) an equal-sized system with a smaller netowrk or b) a larger system with an equal-sized network. With messaging overheads not being a constraint, a combination of processor clustering and wider channels can be used to build a range of larger-sized systems. These guidelines are validated through simulation and experimentation. All these guidelines lead to designing cost-effective and scalable parallel systems while delivering high performance. Keywords: parallel architectures, interconnection networks, clustered architectures, hierarchical architectures, scalability, messaging overheads. TR34 Modeling Modular Software Structure for Human Understanding by Stephen H. Edwards People form internal mental models of the things they interact with in order to understand those interactions. This psychological insight has been used by the human-computer interaction (HCI) community to build software systems that are more intuitive for end users, but it has only been informally applied to the problems of software designers, programmers, and maintainers. Conventional programming languages still do little to help client programmers develop good mental models of software subsystems. To address this problem, we have developed the Abstract and Concrete Temnplates and Instances (ACTI) model of modular, parameterized software subsystems. This model of software structuer addresses the needs of human software engineers who must reason about collections of interacting software parts during design, maintenance, and evolution. ACTI is different from other module systems and models of software in several ways. In ACTI, a subsystem never has any implicit dependencies, and never depends directly on any external definitions - all external dependencies are described through an explicit interface. In addition, a subsystem specitifcation is meaningful by iteself, even without respect to any implementation. Finally, a subsystem is more than just a collection of types and operations; it also includes: an explicit model of behavior, an explicit model of all external dependencies, a collection of definitions used to contstruct and describe these models, and (potentially complex) substruc- ture. There are strong parallels between ACTI and other research on the understanding of modularly structured physical devices, particularly Functional Representation. Keywords: Mental model, model-based specification, interfaces, bindings, generics. TR33 Quasi-Synchronous Checkpointing: Models, Characterization, and Classification by D. Manivannan and Mukesh Singhal Checkpointing algorithms are classified as synchronous and asynchronous in the literature. In synchronous checkpointing, processes synchronize their checkpointing activities so that a globally consistent set of checkpoints is always maintained in the system. Synchronizing checkpointing activity involves message overhead and process execution may have to be suspended during the checkpointing coordination, resulting in performance degradation. In asynchronous checkpointing, processes take checkpoints without any coordination with others. Asynchronous checkpointing provides maximum autonomy for processes to take checkpoints; however, some of the checkpoints taken may not lie on any consistent global checkpoint, thus making the checkpointing efforts useless. Asynchronous checkpointing algorithms in the literature can reduce the number of useless checkpoints by making processes take communication induced checkpoints, besides asynchronous checkpoints. We call such algorithms, quasi-synchronous. Quasi-synchronous checkpointing algorithms are attractive because they improve the performance without introducing undesirable effects. in this paper, we present a theoretical framework for characterizing and classifying such algorithms. The theory not only helps to classify and characterize the quasi-synchronous checkpointing algorithms, but also helps analyzing the properties and limitations of the algorithms belonging to each class; it also provides guidelines for designing and evaluating such algorithms. This classification also sheds light on some open problems that remain to be solved. Keywords: Causality, distributed checkpointing, consistent global checkpoint, failure recovery, fault-tolerance, zigzag paths. TR32 Syntax-Directed Construction of Program Dependence Graphs by Mary Jean Harrold and Gregg Rothermel We present an algorithm that constructs program dependence graphs as a program is parsed. For programs that contain only structured transfers of control, our algorithm does not require explicit control flow or postdominator information to compute exact control dependencies. For programs that contain explicit transfers of control, our algorithm can determine whether these transfers of control are used in a structured way, and if so, compute control dependencies without explicit control flow or postdominator information. When transfers of control are ill-behaved, our algorithm adjusts the control dependence information computed during the parse, to obtain exact control dependencies. For many programs, our algorithm provides savings in time and memory, because it does not require prior computation of control flow or postdominator information. However, our algorithm also calculates control flow information during the parse, and incorporates this information into the program dependence graphs that it constructs; the resulting graphs have a wider range of applicability than graphs that do not contain this information. Keywords: control dependence, control flow, data flow, program dependence graphs, software testing. TR31 Covariance, Contravariance, and Synchronization Constraints by Neelam Soundararajan Object Orientation and concurrent programming are a natural match. Objects correspond to processes in a concurrent program; a message from one object to another invoking a method of the latter, corresponds naturally to (message passing) interaction between processes in a concurrent program. Despite this close correspodence, progress in developing an OO approach to concurrency has been limited. One important problem has been the so-called inheritance anomaly which is concerned with how and how easily synchronization constraints specified in a base class may be modified in a derived class. Our concern in the current paper is more basic. We address the question of what types of modifications in the synchronization constraints inherited from the base class ought to be permitted to be made in the derived class. We identify the modifications that are reasonable, especially when the problem is viewed from the point of view of a client of the class. We point out that when viewed from this perspective, some of the current proposals for expressing synchronization constraints allow precisely the wrong types of modifications to be made in the derived classes. We also analyze the relation between this question and the problems that inheritance encounters already in the sequential case, such as the covariance versus contravariance conflict. TR30 Flexible Object Reconstruction from Temporal Image Series by Jeremy Loomis, Zhaohua Ding, Xiuwen Liu, Kikuo Fujimura, and Hideo Ishikawa The problem considered is flexible object reconstruction from a temporal series of orthogonal silhouette image pairs. The subject of study is a growing plant. Three approaches are examined, namely, (i) a bottom-up method, (ii) a spatial model-based method, and (iii) a spatiotemporal model-based method and their experimental results are presented. For all the three approaches, the plant is encoded by free-form splines, but each uses different construction methods. Merits of the approaches are examined in terms of robustness and accuracy. Keywords: Shape representation and recovery, image sequence analysis, nonrigid motion, model-based vision. TR29 Hardware Assisted Volume Rendering of Unstructured Grids by Incremental Slicing by Roni Yagel, David M. Reed, Asish Law, Po-Wen Shih, and Naeem Shareef Some of the more important research results in computational science rely on the use of simulation methods that operate on unstructured grids. However, these grids, composed of a set of convex polyhedra, introduce exceptional problems with respect to data visualization. Volume rendering techniques, originally developed to handle rectangular grids, show significant promise for general use with unstructured grids as well. The main disadvantage of this approach, compared to isosurfaces, particles or other visualization tools is its non-interactive performance. We describe an efficient method for rendering unstructured grids that is based on incremental slicing and hardware polygon rendering. For a given view direction, the grid vertices are transformed to image space using available graphics hardware. We then incrementally compute the 2D polygon-meshes that result from letting a set of equidistant planes, parallel to the screen plane, intersect (slice) the transformed grid. Finally, we use the graphics hardware to render (interpolate-fill) the polygon-meshes and composite them in a front-to-back order. We show that, in addition to being much faster than existing methods, our approach also provides adaptive control and progressive image generation. TR28 Potential Networking Applications of Global Positioning Systems (GPS) by Gopal Dommety and Raj Jain Global Positioning System (GPS) Technology allows precise determination of location, velocity, direction, and time. The price of GPS receivers is falling rapidly and the applications are growing. PCMCIA receivers that can connected to any notebook personal computers are available for $300-400 to end consumers. The main goal of this study was to survey current applications of GPS to distributed systems and networks. While GPS is appearing in the computer magazines very often and while many computer companies have announced GPS related efforts, most such efforts are in providing navigational guidance to drivers. Digitized city maps along with a GPS sensor on a mobile computer provide directions to drivers. A number of consortiums have been formed by companies such as IBM, Apple, Toshiba, Sony, and others. Currently, applications of GPS for distributed computation and networking are limited to measuring delays in Wolter and Golderman's DA-30 network analyzers and in clock synchronization in synchronous optical networks (SONET) used in telecommunication networks. We have identified twenty new applications of GPS for distributed computing and networking. These applications include circuit switching using syn- chronized clocks, synchronous slotted systems, clock synchronization in distributed systems, database synchronization, connectionless real-time communication, one-way delay, delay based routing, time to live, resource location, location adaptive protocols, home vs office vs car, electronic fence, handoffs in wireless networks, prescheduled handoffs based on velocity and direction, adaptive transmission power control algorithms, directional antennas, temporary cell partitioning for congestion avoidance, peer-to-peer routing with limited range receivers, email delivery based on geographic location, distributed robot control and navigation, and equipment location marking for maintenance crew. There two main obstacles to applications of GPS. First is that GPS antennas must point to open sky. They cannot be used directly inside a building. Unless this is avoided by new antenna designs or by rebroadcasting GPS data inside a building, the use of GPS techniques is limited to improving the performance of distributed systems rather than proper operation. This is similar to cache memories in computer systems. Systems can operate without cache but perform better with cache. The second obstacle is the pseudo- random noise introduced in the GPS signals by the defense department to disallow other governments from using the full precision of GPS. This is called selective availability. Fortunately, this can be easily overcome by differential GPS techniques. Detailed lists of GPS products, current applications, addresses of manufacturers, and sources for further information are included in this report. TR27 Classification and Local Error Estimation of Interpolation and Derivative Filters for Volume Rendering by Torsten Moeller, Raghu Machiraju, Klaus Mueller and Roni Yagel We describe a new method for analyzing, classifying, and evaluating filters, that can be applied to inerpolation filters, first derivative filters, and high order derivative filters. Our analysis is based on the Taylor series expansion of a convolution sum and some assumptions on the behavior of the data function. These assumptions are reasonable based on the frequency behavior of common sampled volume data, which can be approximated by the 1/[(omega)^r] in frequency space. As one result of our analysis, we derive the need and the method for normalization of derivative filter coefficients. As an example, we demonstrate the utilizatino of our methods to the analysis of the class of cardinal cubic filters. Since our technique is not restricted to interpolation filters, we can show that the Catmull-Rom spline filter and its derivative are the most accurate reconstruction and derivative filter among this class of filters. We show that the derivative filter has a much higher impact on the rendered volume than the interpolation filter. We demonstrate the use of these optimal filters for accurate interpolation and gradient estimation in volume rendering. TR26 Multiple Multicast with Minimized Node Contention on Wormhole k-ary n-cube Networks by Ram Kesavan and D.K. Panda This paper presents a new approach to minimize node contention while performing multiple multicast/broadcast on wormhole k-ary n-cube networks with overlapped destination sets. The existing multicast algorithms in the literature deliver poor performance underr multiple multicast because these algorithms have been designed with only single multicast in mind. The new algorithms introduced in this paper do not use any global knowledge about the respective destination sets of the concurrent multicasts. Instead, only local information and source-specific partitioning approach are used. For systems supporting unicast message-passing a new SPUmesh (Source-Partitioned- Umesh) algorithm is proposed and shown to be superior than the conventional Umesh algorithm [14] for multiple multicast. Two different algorithms, SQHL (Source-Quadrant Hierarchical Leader) and SCHL (Source-Centered Hierarchical Leader), are proposed for systems with multidestination message- passing and shown to be superior than the HL scheme [18]. All these algorithms perform 1) 5-10 times faster than the existing algorithms under multicast and 2) as fast as existing algorithms under single multicast. Furthermore, the SCHL scheme demonstrates that the latency of multiple multicast can, in fact, be reduced as the degree of multicast increases beyond a certain number. Such results related to multiple multicast are the first of their kind in the wormhole literature. Thus, these algorithms demonstrate significant potential to be used for designing fast and scalable collective communication libraries on current and future generation wormhole systems. Keywords: Collective communication, multicast, broadcast, wormhole routing, meshes, k-ary n-cube, unicast and multidestination message passing, and path-based routing. TR25 A Safe, Efficient Regressin Test Selection Technique by Gregg Rothermel and Mary Jean Harrold Regression testing is an expensive but necessary maintenance activity performed on modified software to provide confidence that changes are correct and do not adversely affect other portions of the software. A regression test selection technique chooses, from an existing test set, tests that are deemed necessary to validate modified software. Most regression test selection techniques depend on a particular test adequacy criterion or require prior knowledge of where code has been modified. We present a new technique for regression test selection that is neither adequacy-based, nor requires prior knowledge of modifications. Our algorithms construct control flow graphs for a procedure or program and its modified version, and use these graphs to select tests, from the original test set, that execute changed code. We prove that under certain conditions, the set of tests our algorithms select includes every test, from the original test suite, that can expose faults in the modified procedure or program. Thus, under these conditions, the algorithms are safe. Moreover, although our algorithms may select some tests that cannot expose faults, they are at least as precise as other safe regression test selection algorithms. Unlike many other regression test selection algorithms, our algorithms handle all language constructs, and all types of program modifications. We have implemented our algorithms; initial empirical studies indicate that our technique can significantly reduce the cost of regresion testing modified software. Keywords: software maintenance, regression testing, selective retest, regression test selection. TR24 Reducing Cache Invalidation Overheads in Wormhole Routed DSMs Using Multidestination Message Passing by Donglai Dai and D.K. Panda Directory-based distributed shared memory (DSM) systems have drawn high interests in parallel computing research and industry in recent years. Current generation systems are limited to using point-to-point messages for cache invalidation requests and associated acknowledgments. Such an approach incurs a large number of control messages, heavy network traffic, and high occupancy at home nodes. In this paper we present a new approach to reduce cache invalidation overheads. We introduce a set of multi- destination-based reservation and gather worms to distribute invalidation requests and collect acknowledgments. Six different grouping schemes are investigated in generating such multidestination worms in order to implement fully-mapped directory-based cache coherence on networks using deterministic (e-cube) or adaptive (turn-model) wormhole routing. These schemes are evaluated for different applications and 2D mesh configurations through execution-driven simulation. The interplay among overall execution time, the number of invalidation messages, total network traffic, routing adaptivity is studied. The results indicate that up to 15% reduction in overall execution time can be achieved by using multidestination messages. It is observed that turn-model routing with a density-dependent column grouping leads to maximum benefits. The results demonstrate that current and future DSM systems can take advantage of these schemes to deliver better performance. Keywords: Distributed shared memory systems, wormhole routing, directory- based protocols, cache coherence, meshes, multidestination mechanism, and path-based Routing. TR23 Analyzing Regression Test Selectino Techniques by Gregg Rothermel and Mary Jean Harrold Regression testing is a necessary but expensive maintenance activity aimed at showing that code has not been adversely affected by changes. Regression test selection techniques reuse tests from an existing test suite to test a modified program. Many regression test selection techniques have been proposed; however, it is difficult to compare and evaluate these techniques because they have different goals. This paper outlines the issues relevant to regression test selection techniques, and uses these issues as the basis for a framework within which to evaluate the techniques. We illustrate the application of our framework by using it to evaluate existing regression test selection techniques. The evaluation reveals the strengths and weakness of existing techniques, and highlights some problems that future work in this area should address. Keywords: software maintenance, regression testing, selective retest, regression test selection. TR22 A Dynamic Approach to Location Management in Mobile Computing Systems by Ravi Prakash and Mukesh Singhal Managing location information of mobile nodes is an important issue in mobile computing systems. There is a trade-off between location update effort (when a node moves) and node finding effort. In this paper we present a dynamic location management strategy that has the following features: (i) all location servers need not maintain location information about every mobile node (ii) a coterie based approach is adopted for location update and find, (iii) every node move does not result in location updates, (iv) location updates are done at a subset of location servers, (v) a subset of location servers, corresponding to a mobile node, for location update and find operations is dynamic, (vii) the dynamic nature of these sets helps alleviate situations of heavy on some location servers, when a large number of mobile nodes are concentrated in a small geographical area. Thus, location management is done efficiently, and responsibility is shard fairly among location servers. Keywords: mobile computing, location management, hashing. TR21 Building Efficient LImited Directory-Based DSMs: A Multidestination Message Passing Based Approach by Donglai Dai and D.K. Panda A cost-effective distributed shared memory (DSM) system typically uses a limited directory protocol to enforce cache coherence. This paper presents a new family of protocols, called Limited directory with Region- based Broadcast (Limited-RB), to efficiently implement cache coherence in wormhole routed DSM systems. This protocol family uses multidestination- based cache invalidation mechanisms to distribute invalidation requests to and collect the associated acknowledgments from separate regions. As a result, a write invalidation can be accomplished with fewer messages, less network traffic, and reduced occupancy at home nodes. These reductions contribute to decreasing invalidation latency and improving overall system performance. Directory organizatino under this new protocol is developed for 2D systems with e-cube routing and evaluated through simulations for a set of applications. The results indicate that with a small directory storage, the Limited-RB protocol using a directory storage equivalent to 3 pointers outperforms the LimitLESS protocol using 5 pointers. The results demonstrate that faster and less expensive DSM systems can be built using the new approach. Keywords: Distributed shared memory systems, wormhole routing, directory- based protocols, cache coherency, meshes, multidestination mechanism, and path-based routing. TR20 A Foundation for Designing Deadlock-free Routing Algorithms in Wormhole Networks by D.N. Jayasimha, D. Manivannan, Jeff A. May, Loren Schwiebert, and Stephen L. Hary This paper provides necessary and sufficient conditions for deadlock-free unicast and multicast routing with the path-based routing model in interconnection networks which use the wormhole as switching technique. The theory is developed around three central concepts: channel waiting, False Resource Cycles, and valid destination sets. The first two concepts are suitable extensions to those developed for unicast routing by two authors of this paper; the third concept has been developed by Lin and Ni. The necessary and sufficient conditions can be applied in a straightforward manner to prove deadlock freedom and to find more adaptive routing algorithms for collective communication. The latter point is illustrated by developing two routing algorithms for multicast communication in 2-D mesh architectures. The first algorithm uses fewer resources (channels) than a proposed algorithm in literature but achieves the same adaptivity. The second achieves full adaptivity for multicast routing. TR19 Universal Constructs in Distributed Computations Ajay D. Kshemkalyani and Mukesh Singhal This paper presents two classes of universal constructs that occur in distributed computations and explores their properties. It first examines a pair of universal constructs termed IO and OI intervals that occur at nodes in distributed computations. These universal constructs are used as building blocks to design two global universal constructs, termed segments and paths, that occur across nodes in distributed computations. These constructs are a generalization of causal chains in a distributed computation. While a causal chain only captures the causal relation, it turns out that message chains in a distributed computation that do not capture causality play a significant role in the analysis of a distributed computation. We show that a number of key concepts and structures characterizing distributed computations are special cases of and can be expressed in terms of the two proposed constructs. Key words: Causality, Crown, Distributed computation, Path, Segment, Zigzag path. TR17 Stepwise Design of Tolerances in Barrier Computations Sandeep S. Kulkarni and Anish Arora We design a multitolerant program for synchronizing the phases of processes in a distributed system. The tolerances of our program enable the system processes to: (i) compute all phases correctly in the presence of faults that corrupt variables in a detectable manner, and (ii) compute at most a bounded number of phases incorrectly in the presence of faults that corrupt variable sin an undetectable manner. Our design proceeds in a stepwise fashion: In each step, the states of the program are augmented - by adding new variables - and the actions of the program are transformed - by restricting old actions and adding new actions. We show that the program resulting from each step not only provides added functionality but also provides added (i) masking tolerance to the detectable corruptions of the newly added variables, and (ii) time-optimal nonmasking tolerance to undetectable corruptions of the newly added variables. Since the program thus tolerates the corruptions of all of its variables, it thus automatically possesses stabilizing tolerance. Keywords: multitolerance; masking, nonmasking, and stabilizing tolerance; detectable and undetectable faults; stepwise design; distributed systems. TR16 Finding Consistent Global Checkpoints in a Distributed Computation D. Manivannan, Robert H.B. Netzer, and Mukesh Singhal Finding consistent global checkpoints of a distributed computation is important for analyzing, testing, or verifying properties of these computations. In this paper we present a theoretical foundation for finding consistent global checkpoints. Given an arbitrary set S of local checkpoints, we prove exactly which sets of the local checkpoints can be combined with S to build consistent global checkpoints, and we present an algorithm for finding all such global checkpoints. The minimal and maximal consistent global checkpoints are presented as special cases. The results are based on the notion of zigzag paths introduced by Netzer and Xu [4]. We also present a method for finding zigzag paths using the rollback- dependency graph introduced by Wang [17, 16]. Keywords: casuality, distributed checkpointing, consistent global checkpoint collection, failure recovery, fault tolerance TR15 Efficient Broadcast and Multicast on Multistage Interconnection Networks using Multiport Encoding by Rajeev Sivaram, Dhabaleswar K. Panda, and Craig B. Stunkel This paper proposes a new approach to implement fast multicast and broadcast in multistage interconnection networks (MINs) with multiport encoded multidestination worms. For a MIN with k x k switches and n stages such worms use n header flits each. One flit is used for each stage of the network and it indicates the output ports to which a multicast message needs to be replicated. A multiport encoded worm with (d1, d2 . . ., dn with di in the closed interval 1 to k) degrees of replication for the respective stages is capable of covering (d1 x d2 x . . . x dn) destinations with a single communication start-up. Grouping algorithms are presented to derive the associated multiport encoded worms for any multicast with an arbitrary set of destinations. Using these worms a multinomial tree-based scheme is proposed to implement the multicast. This scheme significantly reduces broadcast/multicast latency compared to schemes using unicast messages [27]. Simulation studies indicate that improvement in broadcast/multicast latency up to a factor of 8 is feasible using the new approach. Interestingly, this approach is able to implement multicast with reduced latency as the number tof destinations increases beyond a certain number. Compared to implementing unicast messages this approach requires little additional logic at switches. Thus, the scheme demonstrates significant potential for implementing efficient collective communication operations on current and future MIN- based systems. Keywords: Parallel architectures, multistage interconnection networks, interprocessor communication, collective communication, broadcast and multicast algorithms, wormhole routing, and virtual cut-through. TR14 Bandwidth-Optimal Complete Exchange on Wormhole-Routed 2D/3D Torus Networks: A Diagonal-Propagation Approach by Yu-Chee Tseng, Ting- Hsien Lin, Sandeep K.S. Gupta, and Dhabaleswar K. Panda All-to-all personalized communication, or complete exchange, is at the heart of numerous applications in parallel computing. Several complete exchange algorithms have been proposed in the literature for wormhole meshes. However, these algorithms, when applied to tori, can not take advantage of wrap-around interconnections to implement complete exchange with reduced latency. In this paper, a new diagonal-propagation approach is proposed to develop a set of complete exchange algorithms for 2D and 3D tori. This approach exploits the symmetric interconnections of tori and allows to develop a communication schedule consisting of several contention- free phases. These algorithms are indirect in nature and they use message combining to reduce the number of phases (message start-ups). It is shown that these algorithms effectively use the bisection bandwidth of a torus which is twice that for an equal sized mesh, to achieve complete exchange in time which is almost half of the best known complete exchange time on an equal sized mesh. The effectiveness of these algorithms are verified through simulation studies for varying system and technological parameters. It is also demonstrated that synchronous implementation of these algorithms (by introducing barriers between phases) leads to reduced latency for complete exchange with large messages. Asynchronous implementations are shown to be better for smaller messages. Key Words: Collective communication, complete exchange, distributed memory systems, interprocessor communication, parallel computing, torus, wormhole routing. TR13 Multitolerance in Distributed Reset by Sandeep S. Kulkarni and Anish Arora We design in this paper a multitolerant, distributed program that enables a process in a distributed system to reset the state of all system processes as and when necessary. More specifically, our distributed program exhibits two tolerances: (i) in the presence of any finite number of fail-stop and repair faults of processes, it is "masking" tolerant, by which we mean that it completes every distributed reset correctly; (ii) in the presence of transient faults, it is "stabilizing" tolerant, by which we mean that upon starting from an arbitrary state it eventually reaches a state from where it completes every distributed resset correctly. These tolerances are designed one by one: We first augment a fault-intolerant distributed reset program so that it becomes masking tolerant to fail-stops and repairs. We then augment the resulting program so that it becomes stabilizing tolerant to transients, while still retaining its masking tolerance to fail-stops and repairs. Of special note is the fact that our program admits a bounded space implementation. Keywords: robustness, distributed systems, masking, nonmasking, and stabilizing fault-tolerance. TR12 Performance Study of Buffer Management Schemes under Multicasting Traffic in ATM Switching Nodes by Khalid H. Sheta and Mukesh Singhal Multimedia applications in broadband ISDN present various types and classes of traffics such as video teleconferencing, broadband telephony and large file transfers. Multicasting is important in multimedia applications. Since cell loss in a multicast packet will cause the retransmission of the entire packet causing excessive load in the network, the ATM switch should provide for the reduction of cell loss of multicast traffic. The reduction inn cell loss of a particular type of traffic should be done in coordination with the QoS parameters required from other classes within each traffic type. Buffering schemes are used in ATM switches to achieve the required levels of QoS. In this paper, we develop a new buffering scheme which considers traffic types that include multicasting and nonmulticasting components with two existing schemes, No Priority and Push Out. We compare the new buffering scheme with two existing schemes, No Priority and Push Out. We show that the new buffering scheme performs better than the two other schemes by reducing cell loss of multicast cells. We also introduce a new cell scheduling policy that produces lower delay for the delay sensitive class. The new buffering and scheduling schemes provide the ability to tune the cell loss and delay of each class of cells within each traffic type. Key words: Multicast, ATM switching and Buffer Schemes. TR11 An Optimal Ray Traversal Scheme for Visualizing Colossal Medical Volumes by Asish Law and Roni Yagel Modern comptuers are unable to store the complete data of high resolution medical images in main memory. Even on secondary memory (disk), such large datasets are sometimes stored in a compressed form. At rendering time, parts of the volume are requested by the ray tracing algorithm and are loaded from disk. If one is not careful, the same regions may be (decom- pressed and) loaded to memory several times. Instead, a coherent algorithm should be designed that minimizes this thrashing and optimizes the time and effort spent to (uncompress and) load the volume. We present an algorithm that divides the volume into cubic cells, each (compressed and) stored on disk, in contrast to the more common slice-based storage. At rendering time, each cell is allocated a queue of rays. For a sequence of images, all rays are spawned and queued at the cells they intersect first. Cells are loaded, one at a time, in front-to-back (FTB) order. A loaded cell is rendered by all rays found in its queue. We analyze the algorithm in detail and demonstrate its advantages over existing ray casting volume rendering. TR10 The Active-Ray Approach to Rendering on Distributed Memory Multiprocessors by Asish Law and Roni Yagel Object dataflow is a popular approach used in parallel rendering. The data representing the 3D scene is statically distributed among processors and objects are fetched and cached only on demand. Most previous methods were implemented on shared memory architectures and exploited only object- space coherency to reduce cache misses. In this paper, we propose an efficient model for object dataflow on distributed memory machines. The 'Active Ray Tracing' algorithm is introduced and its ray-stacking mechanism is used to support latency hiding by postponing computation on inactive rays. We also optimize memory usage by letting objects migrate and replicate at different processors rather than the common static assignments. Our cache- only-memory approach uses a distributed directory scheme to trace the location of objects at other nodes. A mechanism to minimize network congestion was implemented which optimizes channel utilization. Unlike previous methods, our approach can benefit from temporal coherence and effecatively eliminates communication costs in successive frames. We implemented a volume ray casting instance of the algorithm on the Cray T3D and achieved higher efficiency and scalability than existing algorithms. We achieve interactive frame rates of 20 Hz for 128 (to the third power) volume, and 3 Hz for 256 (to the third power) volume on 128 processors. Keywords: parallel rendering, distributed systems, CRAY T3D, ray tracing volume rendering. TR09 Wavelet Based Feature Driven Identification and Enhancement of Medical Images by Raghu Machiraju, Ajeetkumar Gaddipati, and Roni Yagel Much information in an image or a volume is concentrated in a few regions. These regions are dominated by discontinuities in the intensity of the image. Much has been done to analyze and represent images by multiresolution approaches. Wavelet based approaches are gaining popularity although they are far from bein extremely viable. In this paper we present a wavelet based image analysis algorithm that identifies features persisting through the scales. This is achieved through the notion of sub-band combining which essentially computes a function of the sub-band (Details) information at each point. This sub-band information is obtained through the use of the wavelet transform. We then show how the method can be employed to enhance or smooth features in an image. Also we allude to the use of this method to compression and volume rendering. TR08 Reconstruction Error Characterization and Control: A Sampling Theory Approach by Raghu Machiraju and Roni Yagel Reconstruction is prerequisite whenever a discrete signal needs to be resampled as a result of transformation such as texture mapping, image manipulation, volume slicing, and rendering. We present a new method for the characterization and measurement of reconstruction error in spatial domain. Our method uses the Classical Shannon's Sampling Theorem as a basis to develop error bounds. We use this formulation to provide, for the first time, an efficient way to guarantee an error bound at every point by varying the size of the reconstruction filter. We go further to support position-adaptive reconstruction and data-adaptive reconstruction which adjust filter size to the location of reconstruction point and to the data values in its vicinity. We demonstrate the effectiveness of our methods with 1D signals, 2D signals (images), and 3D signals (volumes). TR07 Multimedia Database Systems: Challenges and Opportunities by Yelena Yesha and Mukesh Singhal Since multimedia systems typically require storage, retrieval, and manipulation of huge amounts of information, database systems play a key role in the design of high-performance multimedia systems. This article examines the issues in the design of multimedia database systems and explores the current state of the art. Key Words: Multimedia systems, multimedia databases, information systems. TR06 The Intractability of Face Reachability in 3D Visit-Once Piecewise- Constant Derivative Systems by John F. Kolen and Feng Zhao We consider a class of three-dimensional dynamical systems whose state spaces are defined by piecewise constant vector fields and whose trajectories never return to a state-space region once they exit the region. We demonstrate that the reachability problem for this class of systems is decidable while the computation is provably intractable. We prove the intractability (i.e., PSPACE-completeness) via reduction of satisfiability of quantified boolean formulas to this reachability problem. This result extends the work of Asarin, Maler, and Pnueli (1995) that proves computational undecidability for three-dimensional constant-derivative systems. TR05 Designing Processor-cluster Based Systems: Interplay Between Cluster Organizations and Collective Communication Algorithms by D. Basak and D.K. Panda Advancements in VLSI has made it attractive to package multiple processors into a single multichip or a board module. There is an increasing trend towards using such processor-clusters in large multiprocessor design. Past research on designing processor-cluster based systems has focused mainly in studying the packaging technologies affecting the inter-cluster network. To make processor-cluster based multiprocessor design more attractive, there is a strong need to understand the details about the topology inside the cluster, its memory organization, and the impact of this organization on system performance. In this paper we focus on such aspects of processor- cluster design with an overall objective to support a logically shared address programming model. We analyze the communication costs for accessing inter- cluster and intra-cluster memories under different cluster organizations. The merits of these organizations are evaluated based on the performance of collective communication algorithms, which occur frequently in applications. In this paper we focus on implementing the broadcast collective communication algorithm, Umesh, on clustered systems. Our results indicate that cluster organizations like bus and crossbar which allow memory inside a cluster to be accessed without messaging overheads, outperform other organizations because of faster intra-cluster access. We also demonstrate that such faster access can be exploited to design better algorithms on clustered systems. We propose a new algorithm - clus_mesh for broadcasting on clustered meshes. For reasonably faster communication within clusters, this algorithm can outperform the existing umesh algorithm by up to 20%. Keywords: parallel architectures, interconnection networks, clustered architectures, hierarchical architectures, collective communication, broadcasting. TR04 Spatial Aggregation: language and applications by Christopher Bailey-Kellogg, Feng Zhao, and Kenneth Yip. This paper describes the spatial aggregation language and its applications. Spatial aggregation is a framework for organizing computations around image-like, analogue representations of physical processes in data interpretation and control tasks; it transforms a numerical input field to successively higher-level descriptions by applying a small, identical set of operators to each layer given a metric, neighborhood relation and equiv- alence relation. The spatial aggregation language provides two abstract data types (ADTs) - neighborhood graph and field - and a set of interface operators for con- structing the transformations of the field. The language consists of a library of component implementations from which a user can mix-and-match and specialize for a particular application. The modular design of the ADTs supports language extensions and user control over tradeoffs such as efficiency vs. generality. We illustrate the use of the language with several examples ranging from region growing in image analysis to trajectory grouping in dynamics interpretation. Programs for these different task domains can be written in a modular, concise fashion in the spatial aggregation language. The language allows users to isolate and express important computational ideas while hiding low-level details. Content Areas: Qualitative Reasoning, geometric/spatial reasoning, programming language, ontologies, applications. TR03 An Optimal Algorithm for Generalized Causal Message Ordering by Ajay D. Kshemkalyani and Mukesh Singhal This paper presents an optimal algorithm for enforcing causal message ordering. The algorithm works with non-FIFO channels and allows a process to multicast to arbitrary and dynamically changing process groups. The algorithm achieves optimality by transmitting the bare minimum causal dependency information and using an encoding scheme to represent and transmit this information. The algorithm is shown to be optimal both in space complexity of message overheads and in space complexity of message logs. Key Words: Causal message ordering, distributed systems, optimal synchronization, concurrency. TR02 The OSU Scheme for Congestion Avoidance in ATM Networks Using Explicit Rate Indication by Raj Jain, Shiv Kalyanaraman and Ram Viswanathan An explicit rate indication scheme for congestion avoidance in computer and telecommunication networks is proposed. The sources monitor their load and provide the information periodically to the switches. The switches, in turn, compute the load level and ask the sources to adjust their rates up or down. The scheme achieves high link utilization, fair allocation of rates among contending sources and provides quick convergence. A backward congestion notification option is also provided. The conditions under which this option is useful are indicated. TR01 An Information System to Support Collaborative Brain-Tumor Research by Sandra A. Mamrak, John Boyd, Ivan, Ordonez, Sandra L. Cottingham, and Allan J. Yates A Neuro-Oncology Information System (NOIS) has been developed to support brain tumor research. The system has been adopted for use by a group of researchers at several centers in the U.S.A. who are collaborating to input, store, and retrieve information concerning tissue specimens, corresponding clinical information, and eventual research results. The NOIS project is targeted to support medical research, rather than clinical treatment or organ registry. Thus, the context in which the NOIS is used is characterized by an ever-changing set of needs and requirements. The NOIS has been implemented to continually accommodate these changing needs. In particular, the project follows the software life-cycle approach to motivate and guide the continuing design, implementation and evaluation of the system. Keywords for indexers: Brain Neoplasms, Computer Communications Networks, Computing Methodologies, Glioma, Information Systems, Medical Informatics Applications, Medical Informatics Computing, Medical Oncology, Medical Research, Tissue Banks