RED Project

RED stands for reverse engineering of UML sequence diagrams. The goal of this project is to develop state-of-the-art algorithms for extracting sequence diagrams from Java code.

Reverse-Engineered UML Sequence Diagrams

Sequence diagrams play a central role in UML modeling of object interactions. Reverse engineering of sequence diagrams allows the automatic extraction of such diagrams from existing code. This is often necessary during iterative development. A typical scenario is to perform design recovery through reverse engineering of class diagrams and sequence diagrams at the beginning of the current iteration, based on the last iteration's code. Additional reverse engineering is also usually necessary during an iteration.

Software maintenance can also benefit from reverse-engineered sequence diagrams. These diagrams are particularly well-suited for representing interactions in object-oriented software. Automatic design recovery of object interactions for software understanding and maintenance requires effective reverse engineering of sequence diagrams.

Object interactions are an essential consideration for testing of object-oriented software. Various testing approaches consider the interactions represented by sequence diagrams as part of their coverage requirements. These coverage goals can be defined with respect to different elements of statically-constructed sequence diagrams which are extracted from the code under test. Subsequent run-time analysis during test execution can be used to determine the coverage of these diagram elements and to highlight potential test weaknesses.

Static Analyses for Reverse-Engineered Sequence Diagrams

The work on RED revealed various challenging research problems. The difficulty of these problems becomes clear when one considers commercial tools that perform reverse engineering of UML sequence diagrams from Java code. UML modeling tools can produce reverse-engineered diagrams that are incorrect or incomplete. These are not low-quality tools; in fact, they are mature, well-designed, high-quality commercial tools that are very useful during software development. However, the conceptual problems related to correct reverse engineering are complicated and require advanced static analysis techniques. We have solved several of these problems with state-of-the-art static analyses for reverse engineering of UML sequence diagrams.

The following challenges were addressed:

Given some call site in the analyzed code, how should the receiver object(s) at this site be represented in the diagram? Our efforts to answer this question produced a novel object naming analysis [ICSE05].
How should the intraprocedural flow of control in the code be represented in the reverse-engineered sequence diagrams? To answer this question, we defined a new control-flow analysis for mapping a method's control-flow graph to UML 2.0 interaction fragments [PASTE05]. This work systematically revealed the limited expressive power of UML control-flow primitives, and the tradeoffs that tool builders need to face when handling the intraprocedural flow of control.
How should infeasible call chains be eliminated from the diagrams? Our work on call chain analysis showed that infeasible chains in the call graph occur even when using precise call graph construction algorithms [ISSTA04, PASTE04].
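
To make the second challenge concrete, the following is a minimal sketch of how structured control flow could be mapped to UML 2.0 interaction fragments (opt, alt, loop). It is not the analysis from [PASTE05], which works on a method's control-flow graph; this sketch assumes an already-structured method body with no break, continue, or early return. The class FragmentSketch, its node types, and the textual output format are all hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration: maps a simplified, structured method body
// to nested UML 2.0 interaction fragments rendered as indented text.
class FragmentSketch {

    /** Marker for nodes of the simplified method body (assumed model). */
    interface Node {}

    /** A call site: a message sent to some receiver object. */
    static final class Call implements Node {
        final String receiver;
        final String method;
        Call(String receiver, String method) {
            this.receiver = receiver;
            this.method = method;
        }
    }

    /** An if statement; an empty elseBody means there is no else branch. */
    static final class If implements Node {
        final String guard;
        final List<Node> thenBody;
        final List<Node> elseBody;
        If(String guard, List<Node> thenBody, List<Node> elseBody) {
            this.guard = guard;
            this.thenBody = thenBody;
            this.elseBody = elseBody;
        }
    }

    /** A while loop with a guard and a body. */
    static final class While implements Node {
        final String guard;
        final List<Node> body;
        While(String guard, List<Node> body) {
            this.guard = guard;
            this.body = body;
        }
    }

    /** Returns one line per message or fragment boundary. */
    static List<String> render(List<Node> body) {
        List<String> out = new ArrayList<>();
        render(body, "", out);
        return out;
    }

    private static void render(List<Node> body, String indent, List<String> out) {
        for (Node n : body) {
            if (n instanceof Call) {
                Call c = (Call) n;
                out.add(indent + c.receiver + "." + c.method + "()");
            } else if (n instanceof If) {
                If i = (If) n;
                // if without else -> opt; if/else -> alt with two operands
                boolean hasElse = !i.elseBody.isEmpty();
                out.add(indent + (hasElse ? "alt" : "opt") + " [" + i.guard + "]");
                render(i.thenBody, indent + "  ", out);
                if (hasElse) {
                    out.add(indent + "[else]");
                    render(i.elseBody, indent + "  ", out);
                }
                out.add(indent + "end");
            } else if (n instanceof While) {
                While w = (While) n;
                // while loop -> loop fragment guarded by the loop condition
                out.add(indent + "loop [" + w.guard + "]");
                render(w.body, indent + "  ", out);
                out.add(indent + "end");
            }
        }
    }

    public static void main(String[] args) {
        List<Node> body = Arrays.<Node>asList(
            new While("it.hasNext()", Arrays.<Node>asList(
                new Call("it", "next"),
                new If("item.isValid()",
                    Arrays.<Node>asList(new Call("cart", "add")),
                    Arrays.<Node>asList(new Call("log", "warn"))))),
            new If("cart.isEmpty()",
                Arrays.<Node>asList(new Call("ui", "showEmpty")),
                Collections.<Node>emptyList()));
        for (String line : render(body)) {
            System.out.println(line);
        }
    }
}
```

Structured statements map cleanly onto fragments in this way. Unstructured jumps such as an early return inside a loop do not, which is one place where the limited expressive power of the UML control-flow primitives becomes visible.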

Testing Based on Reverse-Engineered Sequence Diagrams

Based on the reverse-engineered diagrams, we defined several control-flow coverage criteria for testing the interactions among a set of collaborating objects. The sequences of messages in the diagrams were used to define the coverage goals for the family of criteria, by generalizing traditional techniques such as branch coverage and path coverage. We also defined a run-time analysis that gathers coverage measurements for the criteria. A novel approach (based on integer linear programming) was used to estimate the complexity of each criterion. The results of this work compared different approaches for testing of object interactions and provided new insights for testers and for builders of test coverage tools [FASE05].
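
The difference between a branch-like and a path-like criterion can be illustrated with a small sketch. It is not the family of criteria or the run-time analysis from [FASE05]; it assumes a hypothetical diagram containing one alt fragment followed by one opt fragment, and it encodes each start-to-end path as a list of fragment outcomes. The class CoverageSketch and the outcome labels are names invented for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: coverage goals derived from the paths through
// a sequence diagram, measured against paths observed during test runs.
class CoverageSketch {

    /** Branch-like goals: every individual fragment outcome on any path. */
    static Set<String> outcomes(List<List<String>> paths) {
        Set<String> result = new LinkedHashSet<>();
        for (List<String> path : paths) {
            result.addAll(path);
        }
        return result;
    }

    /** Path-like goals: every complete start-to-end sequence of outcomes. */
    static Set<String> wholePaths(List<List<String>> paths) {
        Set<String> result = new LinkedHashSet<>();
        for (List<String> path : paths) {
            result.add(String.join(" > ", path));
        }
        return result;
    }

    /** Fraction of goals that were observed; 1.0 when there are no goals. */
    static double coverage(Set<String> goals, Set<String> observed) {
        if (goals.isEmpty()) {
            return 1.0;
        }
        int hit = 0;
        for (String goal : goals) {
            if (observed.contains(goal)) {
                hit++;
            }
        }
        return (double) hit / goals.size();
    }

    public static void main(String[] args) {
        // All paths through the assumed diagram: alt (then/else), then opt (taken/skipped).
        List<List<String>> diagramPaths = Arrays.asList(
            Arrays.asList("alt:then", "opt:taken"),
            Arrays.asList("alt:then", "opt:skipped"),
            Arrays.asList("alt:else", "opt:taken"),
            Arrays.asList("alt:else", "opt:skipped"));

        // Paths observed while executing two hypothetical tests.
        List<List<String>> runs = Arrays.asList(
            Arrays.asList("alt:then", "opt:skipped"),
            Arrays.asList("alt:else", "opt:taken"));

        System.out.println("branch-like: "
            + coverage(outcomes(diagramPaths), outcomes(runs)));
        System.out.println("path-like: "
            + coverage(wholePaths(diagramPaths), wholePaths(runs)));
    }
}
```

With these two runs, every fragment outcome is exercised, so the branch-like measure is 1.0, while only two of the four complete paths are exercised, so the path-like measure is 0.5. The number of goals grows much faster for path-like criteria, which is why estimating the complexity of each criterion matters.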

Visualization of Reverse-Engineered Sequence Diagrams

Effective visualization of large, complex reverse-engineered sequence diagrams is challenging. Due to their large size and inefficient spatial design, the diagrams could easily become useless to software engineers. We investigated the visual limitations of UML sequence diagrams and developed a set of techniques for overcoming these limitations [VISSOFT05].
