Outlier mining is an important task for finding anomalous objects. In practice, however, there is not always a clear distinction between outliers and regular objects as objects have different roles w.r.t. different attribute sets. An object may deviate in one subspace, i.e. a subset of attributes. And the same object might appear perfectly regular in other subspaces. One can think of subspaces as multiple views on one database. Traditional methods consider only one view (the full attribute space). Thus, they miss complex outliers that are hidden in multiple subspaces. In this work, we propose Outrank, a novel outlier ranking concept. Outrank exploits subspace analysis to determine the degree of outlierness. It considers different subsets of the attributes as individual outlier properties. It compares clustered regions in arbitrary subspaces and derives an outlierness score for each object. Its principled integration of multiple views into an outlierness measure uncovers outliers that are not detectable in the full attribute space. Our experimental evaluation demonstrates that Outrank successfully determines a high quality outlier ranking, and outperforms state-of-the-art outlierness measures.
Proceedings of the 12th Ieee International Conference on Data Mining, Icdm, 2012, p. 529-538
clusterings; multiple subspaces; outlier ranking
Main Research Area:
IEEE International Conference on Data MiningIEEE International Conference on Data Mining, 2012