[1] Yin Di, Zhou Junsheng, Qu Weiguang. Chinese Nested Named Entity Recognition Using a Joint Model[J]. Journal of Nanjing Normal University (Natural Science Edition), 2014, 37(03): 29.

Chinese Nested Named Entity Recognition Using a Joint Model

Journal of Nanjing Normal University (Natural Science Edition) [ISSN: 1001-4616 / CN: 32-1239/N]

Volume:
Vol. 37
Issue:
2014, No. 03
Pages:
29
Section:
Computer Science
Publication date:
2014-09-30

Article Info

Title:
Chinese Nested Named Entity Recognition Using a Joint Model
Author(s):
Yin Di (尹迪), Zhou Junsheng (周俊生), Qu Weiguang (曲维光)
School of Computer Science and Technology, Nanjing Normal University, Nanjing 210023, Jiangsu, China
Keywords:
nested named entity recognition; sequence labeling models; joint models; perceptron algorithm
CLC number:
TP391
Document code:
A
Abstract:
Chinese nested named entity recognition is a difficult problem in natural language processing. To address the shortcomings of traditional sequence labeling methods, this paper presents a novel method based on a joint model, which treats the recognition of Chinese nested named entities as a joint segmentation and labeling task. The method uses an improved beam search algorithm for decoding and the averaged perceptron, an online learning algorithm, for training, which gives fast convergence and good recognition performance. Experimental results show that the joint model outperforms two baseline systems based on traditional sequence labeling models.
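
The abstract names two components, a beam search decoder and averaged perceptron training, without giving their details. The following is a minimal sketch of how such a joint segmentation-and-labeling model can be put together, not the authors' implementation. The label set, the feature templates in `seg_feats`, the beam size, and the maximum segment length are all hypothetical, the decoder is a plain beam search rather than the paper's improved variant, and only a single flat layer of segments is produced (how nesting is encoded is not stated in the abstract).

```python
from collections import defaultdict

LABELS = ["O", "PER", "LOC", "ORG"]  # hypothetical label set


def seg_feats(chars, start, end, label, prev_label):
    """Hypothetical feature templates for one labeled segment chars[start:end]."""
    word = chars[start:end]
    return [
        ("w", word, label),
        ("first", word[0], label),
        ("last", word[-1], label),
        ("len", min(end - start, 5), label),
        ("trans", prev_label, label),
    ]


def all_feats(chars, segs):
    """Collect the features of a full analysis: a list of (start, end, label)."""
    feats, prev = [], "<s>"
    for start, end, label in segs:
        feats.extend(seg_feats(chars, start, end, label, prev))
        prev = label
    return feats


def decode(chars, weights, beam_size=8, max_len=6):
    """Beam search over joint segmentations and labelings of chars."""
    # agenda[i] keeps at most beam_size analyses that cover chars[:i]
    agenda = [[] for _ in range(len(chars) + 1)]
    agenda[0] = [(0.0, [])]
    for end in range(1, len(chars) + 1):
        cands = []
        for start in range(max(0, end - max_len), end):
            for score, segs in agenda[start]:
                prev = segs[-1][2] if segs else "<s>"
                for label in LABELS:
                    feats = seg_feats(chars, start, end, label, prev)
                    s = score + sum(weights.get(f, 0.0) for f in feats)
                    cands.append((s, segs + [(start, end, label)]))
        cands.sort(key=lambda c: c[0], reverse=True)  # stable on ties
        agenda[end] = cands[:beam_size]
    return agenda[-1][0][1]


def train(data, epochs=10, beam_size=8):
    """Averaged perceptron: data is a list of (chars, gold_segments) pairs."""
    weights = defaultdict(float)  # current weight vector
    totals = defaultdict(float)   # sum of weight vectors over all steps
    steps = 0
    for _ in range(epochs):
        for chars, gold in data:
            pred = decode(chars, weights, beam_size)
            if pred != gold:
                # reward gold features, penalize predicted features
                for f in all_feats(chars, gold):
                    weights[f] += 1.0
                for f in all_feats(chars, pred):
                    weights[f] -= 1.0
            steps += 1
            for f, w in weights.items():
                totals[f] += w
    # average over all steps (naive but simple averaging)
    return {f: t / steps for f, t in totals.items()}
```

A call such as `model = train([("在南京", [(0, 1, "O"), (1, 3, "LOC")])])` followed by `decode("在南京", model)` returns a list of `(start, end, label)` triples. Averaging the weights over every training step is what distinguishes the averaged perceptron from the plain perceptron; production systems usually compute the average lazily instead of summing the full weight vector after each example as done here.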

References:

[1] Liu Feifan, Zhao Jun, Xu Bo. Research on multi-level nested recognition of entity mentions[J]. Journal of Chinese Information Processing, 2007, 21(2): 14-21. (in Chinese)
[2] Fu Chunyuan, Fu Guohong. Morpheme-based Chinese nested named entity recognition[C]//The 9th International Conference on Fuzzy System and Knowledge Discovery. Chengdu: IEEE, 2012: 2546-2550.
[3] Zhou Junsheng, Dai Xinyu, Yin Cunyan, et al. Automatic recognition of Chinese organization names based on cascaded conditional random fields[J]. Acta Electronica Sinica, 2006, 34(5): 804-808. (in Chinese)
[4] Alex B, Haddow B, Grover C. Recognising nested named entities in biomedical text[C]//Biological, Translational, and Clinical Language Processing, 2007: 65-72.
[5] Jenny Rose Finkel, Christopher D. Manning. Nested named entity recognition[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing. Singapore: ACL, 2009: 141-150.
[6] Lafferty J, McCallum A, Pereira F. Conditional random fields: Probabilistic models for segmenting and labeling sequence data[C]//Proceedings of the International Conference on Machine Learning. Williamstown: Morgan Kaufmann, 2001: 282-289.
[7] Yue Zhang, Stephen Clark. Joint word segmentation and POS tagging using a single perceptron[C]//Proceedings of ACL-HLT. Columbus: ACL, 2008: 888-896.
[8] Yue Zhang, Stephen Clark. A fast decoder for joint word segmentation and POS-tagging using a single discriminative model[C]//Proceedings of EMNLP. Cambridge: ACL, 2010: 843-852.
[9] Liang Huang, Suphan Fayong, Yang Guo. Structured perceptron with inexact search[C]//Proceedings of NAACL. Canada: ACL, 2012: 142-151.

Memo

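Received: 2014-02-18.
Funding: National Natural Science Foundation of China (61272221, 61472191); Jiangsu Provincial Social Science Foundation (12YYA002).
Corresponding author: Zhou Junsheng, Ph.D., associate professor, master's supervisor; research interest: natural language processing. E-mail: zhoujs@njnu.edu.cn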
Last Update: 2014-09-30