Hyunki Lim
| 2024, 29(11)
| pp.21~30
High-dimensional data makes machine learning difficult because of long computation times and large memory requirements. In multi-label settings in particular, the complexity grows further with the number of labels. This paper proposes a feature selection method to improve classification performance in multi-label settings. The method considers three types of relationships: between features, between features and labels, and among the labels themselves. To capture these relationships, a regression-based objective function is designed.
This objective function models the linear relationships between features and labels and uses mutual information to quantify the relationships between features and between labels. Minimizing the objective function yields the optimal weights for feature selection. A gradient descent method is applied to the optimization, giving a fast-converging algorithm. Experimental results on six multi-label datasets show that the proposed method outperforms existing multi-label feature selection techniques. Averaged over the six datasets, the proposed method achieved a Hamming loss of 0.1285, a ranking loss of 0.1811, and a multi-label accuracy of 0.6416. Compared with the AMI (Approximating Mutual Information) algorithm, these results are better by 0.0148, 0.0435, and 0.0852, respectively.
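The three evaluation measures named above can be illustrated with a minimal sketch. It assumes the commonly used definitions: Hamming loss as the fraction of mispredicted label slots, ranking loss as the average fraction of misordered (relevant, irrelevant) label pairs, and multi-label accuracy as the mean Jaccard overlap between true and predicted label sets. The abstract does not state the exact formulas, so these definitions, the function names, and the toy data are all assumptions made for illustration.

```python
# Illustrative sketch only: standard textbook definitions, not taken from the paper.

def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for t_row, p_row in zip(y_true, y_pred)
                for t, p in zip(t_row, p_row))
    return wrong / total


def ranking_loss(y_true, scores):
    """Average fraction of (relevant, irrelevant) label pairs ranked in the wrong order."""
    losses = []
    for t_row, s_row in zip(y_true, scores):
        pos = [s for t, s in zip(t_row, s_row) if t == 1]
        neg = [s for t, s in zip(t_row, s_row) if t == 0]
        if not pos or not neg:
            continue  # assumption: samples with no such pairs are skipped
        bad = sum(1 for p in pos for n in neg if p <= n)
        losses.append(bad / (len(pos) * len(neg)))
    return sum(losses) / len(losses) if losses else 0.0


def multilabel_accuracy(y_true, y_pred):
    """Mean Jaccard overlap between the true and predicted label sets."""
    values = []
    for t_row, p_row in zip(y_true, y_pred):
        inter = sum(1 for t, p in zip(t_row, p_row) if t == 1 and p == 1)
        union = sum(1 for t, p in zip(t_row, p_row) if t == 1 or p == 1)
        values.append(inter / union if union else 1.0)  # assumption: two empty sets match
    return sum(values) / len(values)


# Hypothetical toy data: 2 samples, 3 labels.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
scores = [[0.9, 0.5, 0.4], [0.1, 0.8, 0.6]]

print(hamming_loss(y_true, y_pred))         # 2 of 6 slots wrong
print(ranking_loss(y_true, scores))         # 0.25
print(multilabel_accuracy(y_true, y_pred))  # 0.5
```

Under these definitions, lower values are better for Hamming loss and ranking loss, while higher values are better for multi-label accuracy.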