Cluster analysis is an important step toward gaining insight into new data. Ensemble procedures have been designed to obtain improved partitions of a data set. Previous work in the domain, mostly empirical, shows that accuracy and a limited diversity are mandatory features for successful ensemble construction. This paper presents a method that integrates unsupervised feature selection with ensemble clustering in order to deliver more accurate partitions. The efficiency of the method is studied on real data sets.
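To make the general idea concrete, the following is a minimal sketch of how unsupervised feature selection can be chained with ensemble clustering. It is not the method of the paper, whose selection criterion and consensus function are not given in this abstract. The sketch assumes three stand-ins: variance ranking as the feature selection criterion, k-means runs with different random initializations as the ensemble members, and a thresholded co-association matrix as the consensus step. All function names and the toy data are hypothetical.

```python
import random
from statistics import pvariance


def select_features(X, n_keep):
    """Return the indices of the n_keep highest-variance columns.

    Assumption: variance ranking stands in for the unsupervised
    feature selection criterion, which the abstract does not specify.
    """
    d = len(X[0])
    variances = [pvariance([row[j] for row in X]) for j in range(d)]
    ranked = sorted(range(d), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:n_keep])


def kmeans(X, k, rng, n_iter=20):
    """Plain Lloyd's k-means; initial centers are k points drawn at random."""
    centers = rng.sample(X, k)
    labels = [0] * len(X)
    for _ in range(n_iter):
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centers[c])))
            for x in X
        ]
        for c in range(k):
            members = [x for x, lab in zip(X, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def co_association(partitions, n):
    """Fraction of partitions in which each pair of points shares a cluster."""
    M = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    M[i][j] += 1.0 / len(partitions)
    return M


def consensus(M, threshold=0.5):
    """Label the connected components of the graph that links two points
    when they are co-clustered in more than `threshold` of the partitions."""
    n = len(M)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and M[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def ensemble_cluster(X, k, n_keep, n_members=10, seed=0):
    """Select features, build the ensemble, and return a consensus partition."""
    rng = random.Random(seed)
    cols = select_features(X, n_keep)
    X_reduced = [[row[j] for j in cols] for row in X]
    partitions = [kmeans(X_reduced, k, rng) for _ in range(n_members)]
    return consensus(co_association(partitions, len(X)))


# Toy data (hypothetical): column 0 separates two groups, columns 1-2 are noise.
X_demo = [
    [0.0, 0.10, 0.30], [0.5, 0.20, 0.10], [0.2, 0.15, 0.20], [0.8, 0.05, 0.25],
    [10.0, 0.10, 0.20], [10.4, 0.20, 0.30], [9.7, 0.12, 0.10], [10.9, 0.18, 0.15],
]
demo_labels = ensemble_cluster(X_demo, k=2, n_keep=1)
print(demo_labels)
```

In this sketch the diversity of the ensemble comes only from the random initialization of each k-means run. Other sources of diversity, such as different feature subsets or different numbers of clusters per member, could be substituted without changing the overall structure.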
This article is co-authored by Synbrain data scientists and collaborators.