D. Karthika, M. Prakash
Feature selection for high-dimensional data clustering is a difficult problem because the ground-truth class labels that could guide the selection are unavailable in clustering. Moreover, the data may contain a large number of features, and the irrelevant ones can mislead the clustering. A novel feature weighting scheme is proposed, in which the weight of each feature measures its contribution to the clustering task. A well-defined objective function is given, which can be solved explicitly in an iterative way. The fast clustering-based feature selection algorithm (FAST) works in two steps. In the first step, graph-theoretic clustering methods are used to divide the features into clusters. In the second step, the feature most strongly related to the target classes is selected from each cluster to form the final subset of features; features in different clusters are relatively independent. The efficiency of FAST is ensured by the Minimum Spanning Tree (MST) based clustering method, and the accuracy of image comparison is improved by using a semantic similarity algorithm.
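The two-step procedure can be illustrated with a minimal Python sketch, assuming discretized features, class labels available for the second step, and symmetric uncertainty (SU) as the correlation measure. The function names, the su_cut threshold, and the SciPy/scikit-learn utilities are illustrative choices, not part of the original method.

import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components
from sklearn.metrics import mutual_info_score

def entropy(x):
    # Shannon entropy (in bits) of a discrete variable.
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def symmetric_uncertainty(x, y):
    # SU(x, y) = 2 * I(x; y) / (H(x) + H(y)), normalized to [0, 1].
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mi = mutual_info_score(x, y) / np.log(2)  # convert nats to bits
    return 2.0 * mi / (hx + hy)

def cluster_features_mst(X, su_cut=0.1):
    # Step 1 (sketch): build an MST over features with edge weight 1 - SU,
    # remove edges between weakly related features, and return the clusters.
    n_features = X.shape[1]
    dist = np.zeros((n_features, n_features))
    for i in range(n_features):
        for j in range(i + 1, n_features):
            su = symmetric_uncertainty(X[:, i], X[:, j])
            dist[i, j] = 1.0 - su            # low SU -> features are "far apart"
    mst = minimum_spanning_tree(dist).toarray()
    mst[mst > 1.0 - su_cut] = 0.0            # cut edges where SU < su_cut
    n_clusters, labels = connected_components(mst, directed=False)
    return [np.where(labels == c)[0] for c in range(n_clusters)]

def select_features(X, y, clusters):
    # Step 2 (sketch): from each cluster keep the feature with the highest SU
    # with the target classes, yielding the final feature subset.
    return [max(c, key=lambda f: symmetric_uncertainty(X[:, f], y)) for c in clusters]

In this sketch the MST plays the role described above: correlated features end up connected in the tree, cutting weak edges yields the feature clusters, and one representative per cluster forms the selected subset.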