Scaling clustering to high dimensional data

19/06/2012, from 11:30 to 13:00

Where: Room "Nadia Busi" - Department of Computer Science, Mura Anteo Zamboni 7, 40127 Bologna


Prof. Ira Assent (Department of Computer Science, Aarhus University, DK)


Clustering is an established data mining technique for automatically grouping objects based on mutual similarity. Today, we face increasingly high-dimensional data, i.e. data objects described by many attributes. Effects attributed to the "curse of dimensionality" mean that in high-dimensional spaces, traditional clustering methods fail to identify meaningful clusters. In little more than a decade, the research field of subspace clustering has established methods for identifying clusters in subsets of the attributes in such high-dimensional spaces. As the number of possible subsets is exponential in the number of attributes, efficient algorithms are crucial. This talk discusses approaches for efficient and scalable subspace clustering following the density-based clustering paradigm.


Ira Assent is an associate professor in the Data-Intensive Systems group in the Department of Computer Science at Aarhus University. Before joining Aarhus University, she was with Aalborg University, Denmark, and RWTH Aachen University, Germany. Her research interests are in data mining and data management, with a particular focus on efficient algorithms and scalability to high dimensions and large data volumes. She has co-authored more than 60 publications, many in top venues in data management and data mining. Ira Assent was awarded the 2008 Teacher of the Year award by the Study Board of Computer Science at Aalborg University, the 2008 BTW Dissertation award by the database section of the German computer science association, and the Borchers Medal by RWTH Aachen University for her dissertation with distinction. She has successfully attracted funding for research projects by Danish research councils for work on outlier detection and efficient query processing.