
Oscar Nierstrasz is a Professor of Computer Science at the Institute of Computer Science (IAM) of the University of Bern, where he founded the Software Composition Group in 1994. Prof. Nierstrasz is the author of over a hundred publications and co-author of the book Object-Oriented Reengineering Patterns (Morgan Kaufmann, 2003). The Software Composition Group carries out research in diverse aspects of how to make systems more flexible with respect to changing requirements. Current research is focussed on (i) programming languages and mechanisms to support software evolution, and (ii) tools and environments to support the reverse- and re-engineering of complex software systems. Prof. Nierstrasz has been active in the international object-oriented research community, serving on the programme committees of the ECOOP, OOPSLA, ESEC and many other conferences, and as the Programme Chair of ECOOP '93, ESEC/FSE '99 and MoDELS '06.

Yann-Gaël Guéhéneuc has been an assistant professor at the Department of Informatics and Operations Research (Software Engineering Group) of the University of Montreal since September 2003, where he leads the Ptidej Team. He received a PhD in software engineering from the University of Nantes, France (under Professor Pierre Cointe's supervision) in 2003 and an Engineering Diploma from École des Mines of Nantes in 1998. His PhD thesis was funded by Object Technology International, Inc. (now IBM OTI Labs.), where he worked in 1999 and 2000. His research interests are program understanding and program quality during development and maintenance, in particular through reverse engineering and the identification of recurring patterns. He is also interested in empirical software engineering and in software laws and theories. He has published papers in international journals and conferences and is the main developer of the Ptidej tool suite, which evaluates and enhances the quality of object-oriented programs by promoting the use of patterns. He has served on the organising and/or programme committees of several conferences, including ECOOP, ICPC, ICSM, LMO, QAOOSE, WCRE and WOOR.

2. History of the collaboration

3.1. On the existing collaboration with your partner

Our joint work experience has shown the importance of exchanges between post-docs and of meetings between the teams. We believe that an INRIA Associate Team project will do much to consolidate this long-standing collaboration. It will reduce the effect of the geographical distance between the participants and will be a clear incentive to collaborate. This collaboration will benefit existing as well as future PhD students.

The collaboration will lead to stronger interoperability between the two reengineering platforms Moose and Ptidej. In particular, a new version of FAMIX, a language-independent meta-model for code representation, is currently being designed, and the goal is to ensure that it is actually used as an interchange meta-model between several tools. Several PhD students will visit the INRIA Team and bring their knowledge into the team. This will help the future maintenance team to build momentum and will help promote the FAMIX meta-model in North America. In parallel, strong research collaborations will be possible thanks to the unique perspectives on languages, remodularisation, program understanding, and software analyses developed by the teams.

Furthermore, the collaboration between the ADAM Team and the Software Composition Group is also embodied in an ongoing submission for an FET Open project named "Dynamic Reconfiguration with Systematic Reuse: Pushing Traits to the Limits". The project focuses on traits from a programming language perspective.

3.2. On collaboration with other INRIA projects

Other INRIA projects, such as Obasco or Triskell, could become involved in later stages of the collaboration.

Apart from INRIA projects, the planned work relates to advanced research in software and language engineering and could lead to a collaboration with the ADELE Team in Grenoble and the D'OC Team at the University of Montpellier (Données Objets Composants pour les systèmes complexes). A further step could be the proposal of a federating project involving members of these different teams, such as an ARC (Action de Recherche Coopérative).

The ADELE Team of the LSR has worked on component evolution based on static analysis. It currently works on evolution based on service definitions in the context of the OSGi platform. The D'OC Team has expertise in the restructuring of class hierarchies using formal concept analysis. The INRIA LANDE Team has expertise in software verification based on program slicing.

3.3. On collaboration with other teams of the foreign partner organisation.
An INRIA Associate Team between the ADAM Team and the University of Bern will pave the way for future collaborations with other teams in Switzerland, in particular the Seal group at the University of Zurich and colleagues at the University of Lugano.
The contacts with the Ptidej Team will also strengthen the existing contacts with the Verso Team led by H. Sahraoui at the University of Montreal.

4. Miscellaneous:

The benefits for the INRIA team are the following:

Expand inside the INRIA team the knowledge built around the new versions of the Moose reengineering environment developed at the University of Bern. Since S. Ducasse was one of the original Moose developers and Moose is used by PhD students in the ADAM Team, we want to establish ourselves as one of the main contributors to the Moose environment. Moose is currently being developed in the context of a three-year research project, funded by the Hasler Foundation and carried out at the University of Bern.
Have access to two complementary reengineering environments and be able to compose them.
Develop new analyses based on the bridging of these two environments: for example, combining the visualization power of Moose with the explanation-based constraint programming of Ptidej.
Establish contact with PhD students of the other teams, which will help bootstrap the reengineering group inside the ADAM Team.

II. FORECAST FOR 2008

Work programme

The goal of this Associate Team is twofold: first, to create synergy between two reengineering environments, Moose (developed at the University of Bern) and Ptidej (developed at the University of Montreal); and second, to develop new analyses for the remodularisation of applications. We will collaborate on two distinct yet complementary tracks: identifying modularisation problems, and providing language support to solve these modularisation problems.

Context

The work will be carried out in the context of the Moose and Ptidej reengineering environments.

Moose was conceived in 1997 in the context of the FAMOOS European project (ESPRIT Project 21975: Framework-based Approach for Mastering Object-Oriented Software Evolution, Sept. 1996-Sept. 1999), and it has since evolved continually to support research in reverse engineering and quality assurance within the Software Composition Group and other research groups across Europe [Nier05c, Duca05a]. Moose provides a flexible framework on which various tools have been implemented to support state-of-the-art analyses:

CodeCrawler is a general purpose visualization tool that supports the concept of polymetric views [Lanz03d].
ConAn is a concept analysis tool that was used for various concept location detections, including Traits mining [Lien05a].
Chronia is a tool for analyzing CVS repositories to understand how teams work [Girb05c].
Hapax is a tool to analyze the linguistic information from the source code for identifying implementation concepts [Kuhn07a].
Mondrian is an engine to allow for fast visualization prototyping through scripts [Meye06a].
SmallDude is a tool for duplication detection to reveal how developers copy from each other [Bali06a].
Softwarenaut is an interactive visualization tool [Lung06a].
CodeCity is a 3D visualization tool [Wett07a].
TraceScraper is a tool for analyzing dynamic information [Gree06b].
Van extends Moose with the Hismo meta-model and provides various evolution analyses [Girb06a].

Ptidej started in 1999 as a meta-model to describe design motifs (the solutions of design patterns) and evolved into a tool suite providing a reverse-engineering environment, first during the PhD theses of Hervé Albin-Amiot and Yann-Gaël Guéhéneuc at École des Mines de Nantes and then at the University of Montreal under Yann-Gaël Guéhéneuc's direction. Ptidej is divided into four main parts: (1) the PADL meta-model to describe motifs and object-oriented software systems; (2) parsers to reverse engineer systems in AOL, AspectJ, C++, MSE, Java and soon C#; (3) analyses to enrich and study models; (4) a user interface to handle models. The current analyses include:

Identification of binary class relationships [Guéh04a].
Identification of UML specialised constituents (Operation, Type, Implementation Class...) [Guéh04b].
Computation of the differences between models [Anto06a].
Specification and detection of design defects [Guéh01a, Moha05a, Moha06a].
Identification of design patterns [Kacz06a,Guéh07a].
Mining of metric values and lexicons [Anto07a].

1. Identifying Modularisation Problems

The first steps for package re-modularisation consist of understanding the package structure and of identifying the mismatched classes and methods. We intend to employ various techniques such as software metrics, clustering analysis, constraint programming, and program visualization to identify modularisation problems.
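To make the notion of a "mismatched class" concrete, the following minimal Python sketch flags a class when strictly more of its dependencies point into another package than into its own. This heuristic, the function name `mismatched_classes` and the example data are assumptions made for illustration only; it stands in for the simplest metric-based technique and is not the explanation-based constraint approach of [Guéh01a].

```python
from collections import Counter


def mismatched_classes(package_of, dependencies):
    """Suggest a better package for classes that mostly depend on another one.

    package_of:   dict mapping class name -> package name
    dependencies: dict mapping class name -> list of class names it uses
    Returns a dict mapping each flagged class to the suggested package.
    """
    suggestions = {}
    for cls, targets in dependencies.items():
        # Count how many dependencies fall into each package.
        counts = Counter(package_of[t] for t in targets if t in package_of)
        if not counts:
            continue  # no known dependencies, nothing to decide
        best_package, best_count = counts.most_common(1)[0]
        home = package_of[cls]
        # Flag only when another package strictly dominates the home package.
        if best_package != home and best_count > counts[home]:
            suggestions[cls] = best_package
    return suggestions


# Hypothetical example data.
EXAMPLE_PACKAGES = {
    "Order": "sales",
    "Invoice": "sales",
    "TaxTable": "util",
    "Logger": "util",
}
EXAMPLE_DEPENDENCIES = {
    "TaxTable": ["Order", "Invoice", "Logger"],
    "Order": ["Invoice", "Logger"],
    "Logger": [],
}

if __name__ == "__main__":
    print(mismatched_classes(EXAMPLE_PACKAGES, EXAMPLE_DEPENDENCIES))
```

Ties are deliberately left unflagged, so that a class is only reported when the evidence favours another package.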

First year. During the first year, the activities will aim at building a bridge between the environments of the three teams, at using this bridge, and at starting to collaborate on package re-modularisation. Thus, the first year will include the following activities:

Releasing the next version of FAMIX as a standard meta-model for reverse engineering. FAMIX is a language-independent meta-model built in Moose. A joint effort is well under way to improve the current version 2.2 and to provide a mechanism that encourages its use by a wider community.
Building a model exchange bridge between Moose and Ptidej to allow models of software systems to be exchanged between and analyzed using the two tools. The exchange is to be performed through the MSE format currently developed in the context of Moose.
Studying the re-modularisation of classes in packages by extending the work done in Montreal on using explanation-based constraint programming to identify mismatched classes [Guéh01a].

Second year. In the second year we will build on the results obtained in the first year, and we will concentrate on various analyses:

Feasibility study of the application of explanation-based constraints on large to very large systems.
Studying the use of explanation-based constraint programming to solve the problem of creating new packages and moving the mismatched classes/methods into the right (possibly newly created) packages.
Devising interactive visualizations for understanding and navigating the package structure.

Third year. The third year will conclude the activities performed in the first two years. The activities include:

Validating experimentally the results of the constraint systems to re-modularise packages on several small-to-very-large systems.
Analyzing and identifying the differences between the package structure of J2EE systems and that of regular Java systems.

In parallel with these activities, we intend to publish at appropriate conferences such as ICSM, WCRE or CSMR.

2. Language Support to Solve Modularisation Problems

In this track, we will assess the use of traits as language support to solve the modularisation problems identified in the first track. We will then perform a large case study and refine the semantics of the existing constructs. Traits are first-class groups of methods and represent fine-grained reuse constructs [Duca06a]. Orthogonally to class inheritance, classes may reuse and compose behavior defined in traits. Thus, traits are fine-grained units used to compose classes, while avoiding many of the problems of multiple inheritance and mixins.
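The idea of traits as composable groups of methods can be illustrated with a minimal Python sketch. It is not the Squeak implementation and covers stateless traits only; the helper names (`compose`, `exclude`, `uses`) and the stream example are hypothetical. In this sketch a trait is a plain dictionary of methods, composition does not depend on order, a name provided by two traits is a conflict that must be resolved explicitly (here by exclusion), and methods defined in the class itself take precedence over trait methods.

```python
class TraitConflict(Exception):
    """Raised when two composed traits provide the same method name."""


def compose(*traits):
    """Merge traits symmetrically; a duplicated name is a conflict."""
    merged = {}
    for trait in traits:
        for name, method in trait.items():
            if name in merged and merged[name] is not method:
                raise TraitConflict(name)
            merged[name] = method
    return merged


def exclude(trait, *names):
    """Return a copy of the trait without the given method names."""
    return {n: m for n, m in trait.items() if n not in names}


def uses(*traits):
    """Class decorator: install trait methods unless the class defines them."""
    def decorate(cls):
        for name, method in compose(*traits).items():
            if name not in cls.__dict__:  # class methods take precedence
                setattr(cls, name, method)
        return cls
    return decorate


# Two hypothetical traits; both require methods supplied by the using class.
def read_all(self):
    return [self.next() for _ in range(self.size())]


def write_all(self, items):
    for item in items:
        self.put(item)


TReadable = {"read_all": read_all, "describe": lambda self: "readable"}
TWritable = {"write_all": write_all, "describe": lambda self: "writable"}


# The 'describe' conflict is resolved by excluding it from TReadable.
@uses(exclude(TReadable, "describe"), TWritable)
class MemoryStream:
    def __init__(self):
        self._items = []

    def size(self):
        return len(self._items)

    def next(self):
        return self._items.pop(0)

    def put(self, item):
        self._items.append(item)
```

Composing `TReadable` and `TWritable` without the exclusion raises `TraitConflict`, which mirrors the requirement that conflicts be resolved by the composing class rather than by linearisation order, as happens with mixins.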

Previous research evaluated the usefulness of traits by refactoring the Smalltalk collection and stream libraries, which showed up to 12% gain in terms of code reuse [Blac03a]. Other research tried to semi-automatically identify traits in existing libraries [Lien05a]. However, such refactorings, while valuable, may not address all the problems, since the hierarchies were previously expressed within single inheritance and following certain patterns due to the presence of inheritance. We want to evaluate how traits enable reuse, and what problems could be encountered when building a library using traits from scratch, taking into account that traits are units of reuse.

First year. A case study was conducted [Cass07a] that evaluates the expressiveness and reuse promoted by traits by redesigning a stream library from scratch. We would now like to develop another large case study, so that the two cases can serve as benchmarks for future analysis in the context of the automatic extraction of traits. As the second case study, we plan to redesign the collection hierarchy of Squeak from the ground up. The Smalltalk collection hierarchy has been used in several independent experiments on code reuse and restructuring, e.g. for refactoring [Blac03a], as previously mentioned. It is rich and complex enough to make a good case study. Such a large redesign (not refactoring) will help us build a reference benchmark and also reassess the key characteristics of trait composition.

Second year. Based on the two case studies then available, we will perform an analysis of the different trait models: stateless and stateful [Berg07a]. We will re-evaluate the need for certain operators and identify the real benefits of stateful traits. We will keep only the features that are needed most of the time and eliminate special cases.

We also plan to develop several automated techniques to identify traits in existing code. FCA (Formal Concept Analysis) has already been used to identify coherent groups of methods or to suggest refactorings that correct design defects [Moha06b]. We want to build on this experience and on the benchmarks to improve the quality of the results. We also want to assess whether clustering techniques are adequate.

Third year. In the third year, we plan to bridge the gap between identifying modularisation problems and solving these problems using the language support provided by traits. We will enhance the automatic detection so that it suggests possible refactorings of existing single-inheritance code towards traits. These refactorings will be based on the experience gathered from performing the case studies.

Provisional budget

1. Co-funding

- Does this cooperation already receive financial support from INRIA, from the foreign partner organisation, or from a third party (European project, NSF, ...)?
- Should your proposal be accepted, do you think it likely that the foreign partner organisation would provide matching financial support?

We have submitted a proposal for an FET Open project with the University of Bern. We are awaiting the results of the first phase.

Research around Moose is carried out at the University of Bern in the context of a three-year research project funded by the Hasler Foundation. The project is concerned with modeling and analyzing J2EE systems.

ESTIMATED CO-FUNDING
Organisation	Amount
Total

2. Exchanges

Every year, we would like to organize a four-day workshop at INRIA with participants from the three teams. The first workshop will be held in June 2008. In addition, over the year we want to have 10-day visits of students between INRIA and the other teams.

ESTIMATED EXPENSES		Amount
	Number	Hosting	Missions	Total
Senior researchers	6	2	4
Post-docs	1	1
PhD students	11	8	3
Interns
Other (specify):
Total				24300
		- total co-funding		24300
	"Associate Team" funding requested

Remarks or observations:

Details of the scheduled trips

ADAM (INRIA-Futurs)
Ducasse: 2 trips of 3 days to Bern
200 + 3 * 200 = 800 euros/trip
Bergel: 2 trips of 3 days to Bern
200 + 3 * 200 = 800 euros/trip
Suen (PhD): 2 trips of 3 days to Bern
200 + 3 * 200 = 800 euros/trip
Abdeen (PhD): 2 trips of 3 days to Bern
200 + 3 * 200 = 800 euros/trip
= 6400 Euros
Ducasse: 1 trip of 10 days to Montreal
600 + 10 * 240 = 3000 euros/trip
Abdeen (PhD): 1 trip of 10 days to Montreal
600 + 10 * 240 = 3000 euros/trip
Bergel: 1 trip of 10 days to Montreal
600 + 10 * 240 = 3000 euros/trip
= 9000 Euros
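The ADAM figures above all follow the same pattern: a flat amount per trip plus a daily amount multiplied by the number of days. The short Python sketch below reproduces the two ADAM subtotals under that reading. Interpreting the first term as the fare and the second as a per-diem rate is an assumption, and the names `Trip`, `ADAM_TRIPS` and `subtotal` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trip:
    traveller: str
    destination: str
    count: int      # number of trips of this kind
    days: int       # length of each trip
    fare: int       # flat amount per trip, in euros (assumed to be the fare)
    per_diem: int   # daily amount, in euros (assumed to be a per-diem rate)

    def cost_per_trip(self) -> int:
        return self.fare + self.days * self.per_diem

    def total(self) -> int:
        return self.count * self.cost_per_trip()


# The ADAM trips as listed above.
ADAM_TRIPS = [
    Trip("Ducasse", "Bern", 2, 3, 200, 200),
    Trip("Bergel", "Bern", 2, 3, 200, 200),
    Trip("Suen", "Bern", 2, 3, 200, 200),
    Trip("Abdeen", "Bern", 2, 3, 200, 200),
    Trip("Ducasse", "Montreal", 1, 10, 600, 240),
    Trip("Abdeen", "Montreal", 1, 10, 600, 240),
    Trip("Bergel", "Montreal", 1, 10, 600, 240),
]


def subtotal(trips, destination):
    """Sum the cost of all trips to one destination."""
    return sum(t.total() for t in trips if t.destination == destination)


if __name__ == "__main__":
    for city in ("Bern", "Montreal"):
        print(city, subtotal(ADAM_TRIPS, city))
```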

SCG (Bern)
Nierstrasz: 1 trip of 3 days to Lille
200 + 3 * 120 = 550 euros/trip
Denker (PhD): 1 trip of 10 days to Lille
200 + 10 * 120 = 1400 euros/trip
Girba (PostDoc): 2 trips of 3 days to Lille
200 + 3 * 120 = 550 euros/trip
Kuhn (PhD): 1 trip of 10 days to Lille
200 + 10 * 120 = 1400 euros/trip
Renggli (PhD): 1 trip of 10 days to Lille
200 + 10 * 120 = 1400 euros/trip
Röthlisberger (PhD): 1 trip of 10 days to Lille
200 + 10 * 120 = 1400 euros/trip
Verwaest (PhD): 1 trip of 10 days to Lille
200 + 10 * 120 = 1400 euros/trip
= 8100 Euros

Ptidej (Montreal)
Guéhéneuc: 1 trip of 10 days to Lille
600 + 10 * 120 = 1800 euros/trip
Foutse (PhD): 1 trip of 10 days to Lille
600 + 10 * 120 = 1800 euros/trip
Naouel (PhD): 1 trip of 10 days to Lille
600 + 10 * 120 = 1800 euros/trip
Salima/Adnane (PhD): 1 trip of 10 days to Lille
600 + 10 * 120 = 1800 euros/trip
=7200 Euros

References

[Anto06a] Giuliano Antoniol and Yann-Gaël Guéhéneuc, "Feature Identification: An Epidemiological Metaphor," Transactions on Software Engineering, 32(9):627-641, September 2006.

[Anto07a] Giuliano Antoniol, Yann-Gaël Guéhéneuc, Ettore Merlo, and Paolo Tonella, "Mining the Lexicon Used by Programmers during Software Evolution," Proceedings of the 23rd International Conference on Software Maintenance, October 2007. IEEE Computer Society Press.

[Bali06a] Mihai Balint, Tudor Gîrba and Radu Marinescu, "How Developers Copy," Proceedings of International Conference on Program Comprehension (ICPC 2006), 2006, pp. 56-65.

[Berg07a] Alexandre Bergel, Stéphane Ducasse, Oscar Nierstrasz and Roel Wuyts, "Stateful Traits and their Formalization," Journal of Computer Languages, Systems and Structures, 2007, to appear.

[Berg05a] Alexandre Bergel, Stéphane Ducasse, Oscar Nierstrasz and Roel Wuyts, "Classboxes: Controlling Visibility of Class Extensions," Computer Languages, Systems and Structures, vol. 31, no. 3-4, December 2005, pp. 107-126.

[Berg05b] Alexandre Bergel, Stéphane Ducasse and Oscar Nierstrasz, "Classbox/J: Controlling the Scope of Change in Java," Proceedings of 20th International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA'05), ACM Press, New York, NY, USA, 2005, pp. 177-189.

[Blac03a] Andrew P. Black, Nathanael Schärli and Stéphane Ducasse, "Applying Traits to the Smalltalk Collection Hierarchy," Proceedings of 17th International Conference on Object-Oriented Programming Systems, Languages and Applications (OOPSLA'03), vol. 38, October 2003, pp. 47-64.

[Cass07a] Damien Cassou, Stéphane Ducasse and Roel Wuyts, "Redesigning with Traits: the Nile Stream trait-based Library," International Conference on Dynamic Languages 2007, 2007, pp. 50-79.

[Duca05a] Stéphane Ducasse and Michele Lanza, "The Class Blueprint: Visually Supporting the Understanding of Classes," Transactions on Software Engineering (TSE), vol. 31, no. 1, January 2005, pp. 75-90.

[Duca05b] Stéphane Ducasse, Tudor Gîrba, Michele Lanza and Serge Demeyer, "Moose: a Collaborative and Extensible Reengineering Environment," Tools for Software Maintenance and Reengineering, pp. 55-71, Franco Angeli, Milano, 2005.

[Duca06a] Stéphane Ducasse, Oscar Nierstrasz, Nathanael Schärli, Roel Wuyts and Andrew Black, "Traits: A Mechanism for fine-grained Reuse," ACM Transactions on Programming Languages and Systems (TOPLAS), vol. 28, no. 2, March 2006, pp. 331-388.

[Girb05c] Tudor Gîrba, Adrian Kuhn, Mauricio Seeberger and Stéphane Ducasse, "How Developers Drive Software Evolution," Proceedings of International Workshop on Principles of Software Evolution (IWPSE 2005), IEEE Computer Society Press, 2005, pp. 113-122.

[Girb06a] Tudor Gîrba and Stéphane Ducasse, "Modeling History to Analyze Software Evolution," Journal of Software Maintenance: Research and Practice (JSME), vol. 18, 2006, pp. 207-236.

[Gree06b] Orla Greevy, Stéphane Ducasse and Tudor Gîrba, "Analyzing Software Evolution through Feature Views," Journal of Software Maintenance and Evolution: Research and Practice (JSME), vol. 18, no. 6, 2006, pp. 425-456.

[Guéh01a] Yann-Gaël Guéhéneuc and Hervé Albin-Amiot, "Using Design Patterns and Constraints to Automate the Detection and Correction of Inter-Class Design Defects," TOOLS (39) 2001: 296-306.

[Guéh04a] Yann-Gaël Guéhéneuc and Hervé Albin-Amiot, "Recovering Binary Class Relationships: Putting Icing on the UML Cake," Proceedings of the 19th Conference on Object-Oriented Programming, Systems, Languages, and Applications, pages 301-314, October 2004. ACM Press.

[Guéh04b] Yann-Gaël Guéhéneuc, "A Systematic Study of UML Class Diagram Constituents for their Abstract and Precise Recovery," Proceedings of the 11th Asia-Pacific Software Engineering Conference, pages 265-274, November-December 2004. IEEE Computer Society Press.

[Guéh07a] Yann-Gaël Guéhéneuc and Giuliano Antoniol, "A Multi-layered Framework for Design Pattern Identification," Transactions on Software Engineering, December 2007. (Under revision.)

[Kacz06a] Olivier Kaczor, Yann-Gaël Guéhéneuc, and Sylvie Hamel, "Efficient Identification of Design Patterns with Bit-vector Algorithm," Proceedings of the 10th Conference on Software Maintenance and Reengineering, pages 173-182, March 2006. IEEE Computer Society Press.

[Kuhn07a] Adrian Kuhn, Stéphane Ducasse and Tudor Gîrba, "Semantic Clustering: Identifying Topics in Source Code," Information and Software Technology, vol. 49, no. 3, March 2007, pp. 230-243.

[Lanz03d] Michele Lanza and Stéphane Ducasse, "Polymetric Views - A Lightweight Visual Approach to Reverse Engineering," Transactions on Software Engineering (TSE), vol. 29, no. 9, September 2003, pp. 782-795.

[Lien05a] Adrian Lienhard, Stéphane Ducasse and Gabriela Arévalo, "Identifying Traits with Formal Concept Analysis," Proceedings of 20th Conference on Automated Software Engineering (ASE'05), IEEE Computer Society, November 2005, pp. 66-75.

[Lung06a] Mircea Lungu, Michele Lanza and Tudor Gîrba, "Package Patterns for Visual Architecture Recovery," Proceedings of CSMR 2006 (10th European Conference on Software Maintenance and Reengineering), IEEE Computer Society Press, Los Alamitos CA, 2006, pp. 185-196.

[Meye06a] Michael Meyer, Tudor Gîrba and Mircea Lungu, "Mondrian: An Agile Visualization Framework," ACM Symposium on Software Visualization (SoftVis 2006), ACM Press, New York, NY, USA, 2006, pp. 135-144.

[Moha05a] Naouel Moha and Yann-Gaël Guéhéneuc, "On the Automatic Detection and Correction of Design Defects," Proceedings of the 6th ECOOP workshop on Object-Oriented Reengineering, July 2005. Springer-Verlag.

[Moha06a] Naouel Moha, Yann-Gaël Guéhéneuc, and Pierre Leduc, "Automatic Generation of Detection Algorithms for Design Defects," Proceedings of the 21st Conference on Automated Software Engineering, pages 297-300, September 2006. IEEE Computer Society Press.

[Moha06b] Naouel Moha, Jihene Rezgui, Yann-Gaël Guéhéneuc, Petko Valtchev, and Ghizlane El Boussaidi, "Using FCA to Suggest Refactorings to Correct Design Defects," Proceedings of the 4th International Conference on Concept Lattices and their Applications, pages 297-302, September 2006. IEEE Computer Society Press.

[Nier05c] Oscar Nierstrasz, Stéphane Ducasse and Tudor Gîrba, "The Story of Moose: an Agile Reengineering Environment," Proceedings of the European Software Engineering Conference (ESEC/FSE 2005), ACM Press, New York NY, 2005, pp. 1-10, Invited paper.

[Wett07a] Richard Wettel and Michele Lanza, "Program Comprehension through Software Habitability," Proceedings of ICPC 2007 (15th International Conference on Program Comprehension), IEEE CS Press, 2007, pp. 231-240.

ASSOCIATE TEAM	Remoos
selection	2008

INRIA Project-Team: ADAM	Foreign partner organisations: University of Bern and University of Montreal
INRIA Research Centre: INRIA Futurs INRIA Theme: COM A	Countries: Switzerland, Canada

	French coordinator	Foreign coordinator	Foreign coordinator
Surname, first name	Stéphane Ducasse	Oscar Nierstrasz	Yann-Gaël Guéhéneuc
Rank/status	Research Director	Full Professor at the Computer Science Department	Assistant Professor at the Department of Informatics and Operations Research
Home organisation (specify the department and/or laboratory)	INRIA-Futurs, Université des Sciences et Technologies de Lille, LIFL-CNRS UMR 8022-INRIA	Universität Bern, Institut für Informatik und angewandte Mathematik	Faculté des Arts et des Sciences, Université de Montréal
Postal address	40, avenue Halley, Parc Scientifique de la Haute Borne, Bat.A, Park Plaza, Villeneuve d'Ascq 59650 France	Neubruckstrasse 10, 3012 Bern, Switzerland	CP 6128 succ. Centre-Ville Montreal, Quebec, H3C 3J7 Canada
URL	adam.lifl.fr, stephane.ducasse.free.fr	www.iam.unibe.ch/~oscar	www.iro.umontreal.ca/~guehene
Telephone	33 3 59 57 78 66	+41 31 631 4618	1-514-343-6782
Fax	33 3 59 57 78 50	+41 31 631 3355	1-514-343-5834
E-mail	stephane.ducasse@inria.fr	oscar.nierstrasz@iam.unibe.ch	guehene@iro.umontreal.ca

INRIA "Associate Teams" Programme

I. DEFINITION

Presentation of the Associate Team

1. Presentation of the foreign coordinators