Fig. 8

Clustering performance, measured by the Adjusted Rand Index (ARI), of unsupervised random forests built on the top k features (k ranging from 2 to 12) for 5 smaller datasets (left) and 5 larger datasets (right). Features are selected using our proposed greedy-graph and brute-graph approaches, along with three state-of-the-art methods: classification-based, phylogeny-based, and leave-one-variable-out.