
Mining and Decision-Making of Breast Cancer Medical Record Text Based on Decision Tree

Journal of Nanjing Normal University (Natural Science Edition) [ISSN: 1001-4616 / CN: 32-1239/N]

Issue:
2019, Issue 03
Page:
42-51
Research Field:
Special column: papers from the National Conference on Machine Learning

Info

Title:
Mining and Decision-Making of Breast Cancer Medical Record Text Based on Decision Tree
Author(s):
Gong Lejun (1), Zhang Lipeng (1), Li Yuxi (1), Wu Xianghui (1), Gao Zhihong (2), Pan Chuandi (2), Yang Geng (1)
(1. Jiangsu Key Lab of Big Data Security & Intelligent Processing, School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China) (2. Zhejiang Engineering Research Center of Intelligent Medicine, Wenzhou 325035, China)
Keywords:
breast cancer; natural language processing; decision tree; text mining; Neo4j
PACS:
TP391
DOI:
10.3969/j.issn.1001-4616.2019.03.006
Abstract:
Breast cancer is one of the most common malignant tumors in women and seriously threatens women's health worldwide. Clinical medical records carry diagnostic information from experienced doctors, and mining these records can yield information on breast cancer-related conditions. This paper presents a method that uses a data mining algorithm, the decision tree, to process medical records and obtain breast cancer-related information through text processing. We make TNM and clinical staging decisions for breast cancer and validate the decision results. We also use the Neo4j graph database to build a breast cancer TNM-clinical staging knowledge graph. An example shows that the method can obtain TNM and clinical staging results for breast cancer, which suggests that it could be used to assist doctors in making decisions.
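
As a rough illustration of the pipeline the abstract describes (text processing to obtain TNM information, then a tree of decisions leading to a clinical stage), the sketch below may help. It is not the authors' implementation: the paper's tree is built with a data mining algorithm, whereas this one is written by hand. The regular expression, the function names `extract_tnm` and `clinical_stage`, and the sample record text are all hypothetical. The stage groups are a simplified reading of the commonly published anatomic groupings (subcategories such as N1mi are ignored) and should be checked against the official staging manual before any real use.

```python
import re

# Hypothetical pattern for compact TNM strings such as "T2N1M0".
# Real medical records are far more varied than this.
TNM_PATTERN = re.compile(r"T(is|[0-4])\s*N([0-3])\s*M([01])", re.IGNORECASE)


def extract_tnm(text):
    """Return (T, N, M) as strings from the first TNM match, or None."""
    match = TNM_PATTERN.search(text)
    if match is None:
        return None
    t, n, m = match.groups()
    return t.lower(), n, m


def clinical_stage(t, n, m):
    """Hand-written decision tree from T, N, M to a simplified stage group.

    Assumption: simplified anatomic stage groups, ignoring subcategories.
    Returns None for combinations this sketch does not cover.
    """
    if m == "1":                      # distant metastasis
        return "IV"
    if n == "3":                      # any T with N3
        return "IIIC"
    if t == "is":                     # carcinoma in situ
        return "0" if n == "0" else None
    if t == "4":
        return "IIIB"
    if n == "2":
        return "IIIA"
    if t == "3":
        return "IIIA" if n == "1" else "IIB"
    if t == "2":
        return "IIB" if n == "1" else "IIA"
    if n == "1":                      # T0 or T1 with N1
        return "IIA"
    return "IA" if t == "1" else None  # T0 N0 M0 has no stage group


if __name__ == "__main__":
    record = "Pathology summary: invasive carcinoma, T2N1M0."  # made-up text
    tnm = extract_tnm(record)
    if tnm is not None:
        print(tnm, clinical_stage(*tnm))
```

Each (T, N, M, stage) combination produced this way could also be stored as nodes and relationships in a graph database such as Neo4j, which is the role the abstract assigns to the knowledge graph.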

Last Update: 2019-09-30