Analogical Reuse of Object-Oriented Analysis Models
Abstract
Software reuse involves using again software artifacts that have been successfully built before. To be successful with software reuse, techniques for reuse must be integrated into both the information system development process and the programming environment. If potential reuse can be identified early in an information system development process, the gain in development time can be substantial. Techniques to automatically identify reuse candidates, incorporated in software development tools, would increase the benefits for software development even more.

In this dissertation the incorporation of reuse techniques based on analogical reasoning (AR) in tools for software development is proposed. These techniques use information about the structure and semantics of a model from the analysis of a software system to try to identify potential analogous models.

Analogical reasoning is typically described as consisting of a set of phases. Although other AR phases are equally important, the focus of this thesis is on the retrieval and mapping phases of AR. The proposed approach is demonstrated using OOram role models. OOram is an object-oriented modelling notation resembling UML sequence diagrams. OOram models were chosen because they focus entirely on the analysis of a problem and do not take into consideration what objects will play the various roles in the system. The findings in this thesis are also applicable to such models.

A user creates a role model during the analysis of a new project. To prevent too much detailed work at this stage, it would be advantageous if a tool could support the process by identifying reusable candidates from a software development repository. The proposed approach implements support for a tool that can search a repository for models that are analogous to the model being created. The user must then evaluate the identified models to see if they are suitable within the project.

AR is used to identify similar cases from different problem domains.
A similarity model for OOram role models that uses a combination of structural and semantic information about the models to identify similarities is proposed. At the time the ROSA project was initiated, this was a natural choice.

The similarity model is required to distinguish potentially useful models from those that cannot be reused. In the approach suggested in this thesis, each named component in the model repository is linked to a word meaning in a term space. This term space is modelled after WordNet, an electronic lexical database.

During retrieval, information about the structure and semantics of the models is used. All new role models are given a structure description before they are stored in the repository. During retrieval, this information is used as an index. Semantic similarity among models is found during retrieval by measuring distance in the semantic network. An upper bound for the semantic similarity between a target model and each of the base models in the repository is identified, and this result is combined with a structural similarity, based on the structure descriptions, to form a retrieval similarity.

During the mapping phase, the most promising base models after retrieval are compared to the target model. Mapping between a target and each of the retrieved base models is done using a genetic algorithm that tries to optimize the mapping between the two models based on their structure and semantics, resulting in a mapping similarity. The balance between semantics and structure in the similarity model is vital both during retrieval and mapping.

Experiments are described in which analogies are identified between a target model and the models in a repository containing 133 models. In this context a good analogy for a role model is a role model for which we calculate a high mapping similarity.
This implies that the models have similar structure, and that roles at comparable positions in the structures have similar semantics.

In 21 of 24 cases, the model with the highest mapping similarity is identified from among the top 30 ranked models during retrieval. Experiments also show that when the 5 highest-ranked models according to mapping similarity are considered in each of the 24 cases, more than 85 % of them are found among the top 30 ranked models after retrieval. The findings reported show that the suggested approach is viable, although further studies are necessary. The top-ranked model may prove not to be the best analogy after further analysis. The user must evaluate the mappings.
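
The retrieval step described above can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the thesis: the toy term graph `TERM_LINKS` (a stand-in for a WordNet-like term space), the similarity formula `1 / (1 + distance)`, the structure description as a `(role count, link count)` tuple, the equal weighting of structure and semantics, and all function names. The semantic upper bound is formed here by letting every target term pick its best base term without requiring a one-to-one mapping, so under this toy measure no one-to-one mapping can score higher.

```python
from collections import deque

# Hypothetical term space: an undirected graph of word meanings (toy data, not WordNet).
TERM_LINKS = {
    "entity": ["person", "artifact"],
    "person": ["entity", "customer", "clerk"],
    "artifact": ["entity", "document"],
    "customer": ["person"],
    "clerk": ["person"],
    "document": ["artifact", "invoice", "order"],
    "invoice": ["document"],
    "order": ["document"],
}


def term_distance(a, b):
    """Shortest path length between two terms (breadth-first search), or None."""
    if a == b:
        return 0
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in TERM_LINKS.get(node, []):
            if nxt == b:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def term_similarity(a, b):
    """Assumed measure: 1 / (1 + path distance); 0.0 if the terms are unconnected."""
    dist = term_distance(a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


def semantic_upper_bound(target_terms, base_terms):
    """Average best match per target term, ignoring the one-to-one constraint."""
    if not target_terms or not base_terms:
        return 0.0
    best = [max(term_similarity(t, b) for b in base_terms) for t in target_terms]
    return sum(best) / len(best)


def structural_similarity(target_desc, base_desc):
    """Compare structure descriptions given as (role count, link count) tuples."""
    ratios = [
        min(x, y) / max(x, y) if max(x, y) else 1.0
        for x, y in zip(target_desc, base_desc)
    ]
    return sum(ratios) / len(ratios)


def retrieval_similarity(target, base, weight=0.5):
    """Weighted combination of structural similarity and the semantic upper bound."""
    sem = semantic_upper_bound(target["terms"], base["terms"])
    struct = structural_similarity(target["structure"], base["structure"])
    return weight * struct + (1.0 - weight) * sem


def retrieve(target, repository, k=30, weight=0.5):
    """Return the k base models with the highest retrieval similarity."""
    ranked = sorted(
        repository,
        key=lambda base: retrieval_similarity(target, base, weight),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical models: each has role terms and a (roles, links) structure description.
target = {"name": "billing", "terms": ["customer", "invoice"], "structure": (2, 1)}
repository = [
    {"name": "generic", "terms": ["entity"], "structure": (4, 6)},
    {"name": "ordering", "terms": ["clerk", "order"], "structure": (2, 1)},
]
candidates = retrieve(target, repository, k=1)
```

In this sketch only the retained candidates would be passed on to the more expensive mapping phase, which the thesis performs with a genetic algorithm. The `weight` parameter stands in for the balance between structure and semantics that the abstract describes as vital.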