Tumor Gene Selection Based on F-score and Binary Grey Wolf Optimization

Journal of Nanjing Normal University (Natural Science Edition) [ISSN: 1001-4616 / CN: 32-1239/N]

Issue:
2024, Issue 01
Page:
111-120
Research Field:
Computer Science and Technology
Publishing date:

Info

Title:
Tumor Gene Selection Based on F-score and Binary Grey Wolf Optimization
Author(s):
Mu Xiaoxia, Zheng Lijing
(College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China)
Keywords:
tumor gene; Fisher-score; Spearman correlation coefficient; binary grey wolf optimization algorithm; feature selection
PACS:
TP311
DOI:
10.3969/j.issn.1001-4616.2024.01.013
Abstract:
To address the high dimensionality, noise, and redundancy of tumor gene data, this paper improves the F-score algorithm with the Spearman correlation coefficient, optimizes the binary grey wolf algorithm, and proposes a gene feature selection algorithm that combines the improved F-score with the binary grey wolf algorithm. First, to account for the correlation between features, the F-score of each feature and the absolute value of the Spearman correlation coefficient between features are calculated. Second, a weight coefficient is computed to derive a weight for each feature; the features are ranked by importance according to these weights, and a preliminary feature subset is selected. Finally, the binary grey wolf algorithm is optimized by adjusting the proportion of global search to local search, which strengthens the global search capability and speeds up local search; this reduces time overhead and selects the optimal feature subset, improving both classification performance and the efficiency of feature selection. The proposed algorithm is tested on nine tumor gene datasets and evaluated on two metrics: classification accuracy and the number of selected features. Compared with four other algorithms, the experimental results show that the proposed algorithm performs well, reduces the dimensionality of the gene data, and achieves better classification accuracy.
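
The filter stage described in the abstract (per-feature F-score, absolute Spearman correlation between features, then a weight used for ranking) can be illustrated with a minimal sketch. The abstract does not give the paper's weight coefficient, so the formula in `rank_features` below (F-score divided by one plus the mean absolute Spearman correlation with the other features) is purely an assumption for illustration, as are all function names. The F-score used here is the common multi-class form (between-class scatter of the class means over the sum of within-class sample variances), and ranks are computed without tie handling.

```python
import numpy as np


def f_score(X, y):
    """Multi-class F-score of every feature (column) of X.

    Numerator: squared distance of each class mean from the overall mean,
    summed over classes. Denominator: sum of within-class sample variances.
    """
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        between += (Xc.mean(axis=0) - overall_mean) ** 2
        within += Xc.var(axis=0, ddof=1)  # needs >= 2 samples per class
    return between / (within + 1e-12)  # small constant avoids division by zero


def spearman_abs(X):
    """Absolute Spearman correlation between all pairs of features.

    Ranks come from a double argsort, so ties are broken arbitrarily
    (a simplification; constant columns are not supported).
    """
    ranks = X.argsort(axis=0).argsort(axis=0).astype(float)
    return np.abs(np.corrcoef(ranks, rowvar=False))


def rank_features(X, y, k):
    """Return the indices of the k highest-weighted features.

    ASSUMPTION: the weight below is a hypothetical stand-in for the paper's
    weight coefficient, which the abstract does not specify.
    """
    scores = f_score(X, y)
    corr = spearman_abs(X)
    np.fill_diagonal(corr, 0.0)  # ignore each feature's self-correlation
    redundancy = corr.sum(axis=1) / (X.shape[1] - 1)
    weights = scores / (1.0 + redundancy)
    return np.argsort(weights)[::-1][:k]
```

Calling `rank_features(X, y, k)` on an expression matrix `X` (samples by genes) with class labels `y` yields a preliminary subset of `k` gene indices; in the paper's pipeline, such a subset would then be refined by the optimized binary grey wolf search, which is not sketched here.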

Memo

Last Update: 2024-03-15