
Feature Selection Method Based on Coefficient of Variation and Maximum Feature Tree

Journal of Nanjing Normal University (Natural Science Edition) [ISSN:1001-4616/CN:32-1239/N]

Issue:
2021, Issue 01
Page:
111-118
Research Field:
Computer Science and Technology
Publishing date:

Info

Title:
Feature Selection Method Based on Coefficient of Variation and Maximum Feature Tree
Author(s):
Xu Haifeng, Zhang Yan, Liu Jiang, Lü Danjv
School of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming 650224, China
Keywords:
feature selection; feature contribution scoring; coefficient of variation; mutual
PACS:
TP3-0
DOI:
10.3969/j.issn.1001-4616.2021.01.016
Abstract:
Feature selection is a key process in data mining, and feature contribution scoring and feature optimization are its core parts. This paper proposes CVMI (coefficient of variation and mutual information), a method that uses the coefficient of variation to measure intraclass distance and mutual information to measure interclass distance, and then applies the algorithm to an embedded feature selection method. The experiments use four UCI data sets, one set of remote sensing data, and bird sound data, and test seven different feature contribution scoring methods. The results show that the CVMI method is more consistent with the objective law of feature contribution evaluation and achieves better results than the other feature scoring methods. The paper also proposes CVMI-RRMFT (remove redundancy of maximum feature tree), a feature optimization method based on CVMI that constructs a maximum feature tree and removes redundancy using the two-neighborhood. Experimental results demonstrate that this feature optimization method effectively reduces data dimensionality and improves classification accuracy.
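The abstract names the two quantities behind CVMI but does not give the scoring formula, so the following is only a minimal sketch of those two ingredients, not the paper's method. It computes, for a single feature, a within-class coefficient of variation and the mutual information between the binned feature and the class labels. The function names, the equal-width binning, the averaging of the per-class values, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def within_class_cv(x, y):
    """Mean coefficient of variation (std / |mean|) of one feature over the classes."""
    cvs = []
    for label in np.unique(y):
        values = x[y == label]
        mean = values.mean()
        # Guard against division by zero when a class is centred at 0.
        cvs.append(values.std() / abs(mean) if mean != 0 else np.inf)
    return float(np.mean(cvs))

def mutual_information(x, y, bins=4):
    """Mutual information (nats) between an equal-width-binned feature and the labels."""
    edges = np.histogram_bin_edges(x, bins=bins)
    x_binned = np.digitize(x, edges[1:-1])       # bin indices 0 .. bins-1
    _, y_idx = np.unique(y, return_inverse=True)  # class indices 0 .. n_classes-1
    joint = np.zeros((bins, y_idx.max() + 1))
    np.add.at(joint, (x_binned, y_idx), 1)        # joint count table
    joint /= joint.sum()                          # joint probabilities
    px = joint.sum(axis=1, keepdims=True)         # marginal of the binned feature
    py = joint.sum(axis=0, keepdims=True)         # marginal of the labels
    nonzero = joint > 0
    return float((joint[nonzero] * np.log(joint[nonzero] / (px @ py)[nonzero])).sum())

# Hypothetical toy data: feature_a separates the two classes, feature_b does not.
labels = np.array([0, 0, 0, 1, 1, 1])
feature_a = np.array([1.0, 1.1, 0.9, 5.0, 5.1, 4.9])
feature_b = np.array([1.0, 5.0, 3.0, 1.0, 5.0, 3.0])

for name, feature in [("a", feature_a), ("b", feature_b)]:
    print(name, within_class_cv(feature, labels), mutual_information(feature, labels))
```

On this toy data the class-separating feature has the lower within-class coefficient of variation and the higher mutual information, which is the kind of contrast a contribution score built on these two quantities would reward. How CVMI actually combines them is not stated in the abstract.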


Memo

Last Update: 2021-03-15