Seminar "Mining Development Email Archives"

On 18 March 2013, from 11:30 to 13:30, Alberto Bacchelli (PhD student, University of Lugano) will present the seminar "Mining Development Email Archives".

When: 18/03/2013, from 11:30 to 13:30

Where: Dipartimento di Informatica - Scienza e Ingegneria, Room E1, Mura Anteo Zamboni 2/b, Bologna

On 18 March 2013, from 11:30 to 13:30, in Room E1 of the Dipartimento di Informatica - Scienza e Ingegneria (Mura Anteo Zamboni 2/b, Bologna), Alberto Bacchelli (PhD student, University of Lugano) will present the following seminar as part of the Software Engineering course.

The availability of large amounts of recorded data, produced during software development, has led to a research area called mining software repositories (MSR). Researchers mine software archives both to support software understanding, development, and evolution, and to empirically validate novel ideas and techniques. Most MSR research focuses on mining archives of data that is either written by humans for a computer (e.g., source code) or generated by a computer for humans (e.g., execution traces). All this data has an easily parsable structure that allows precise fact extraction, and it concerns the end product of software development. Nevertheless, by focusing only on structured data we risk leaving people in the background and omitting a deep investigation of human factors, which are known to be crucial in software engineering.

Other software repositories archive data that is both produced and consumed by humans: information written in natural language and used to exchange knowledge among people, such as emails or change comments. By mining this people-centric information, we have the chance to gain insights into the human factors revolving around a software project, so that we can better understand and support software development from a different perspective. Nevertheless, the noisy and unstructured nature of people-centric information leaves this form of data underexploited.

In our work we focus on mining people-centric information, in particular the email communication occurring among people involved in a software project. We have been implementing the necessary tools and investigating methods for exploring, exposing, and exploiting such unstructured data. We found two main problems in using email data to support program comprehension and software development: the factual gap between emails and development artifacts, and the noisy and mixed-language nature of email content. In the first part of our work we tackled the former issue; we are currently addressing the latter.

Alberto Bacchelli obtained his Bachelor's and Master's degrees in Computer Science from the University of Bologna, Italy. He is currently a Ph.D. student in the Faculty of Informatics at the University of Lugano, where he works under the supervision of Prof. Michele Lanza in the REVEAL (Reverse Engineering, Visualization, Evolution Analysis Lab) research group. His research interests include empirical software engineering, mining software repositories, unstructured data mining, software quality, and development tools.