[1]Zhang Xiaoming, Shen Qing, Wang Fang, et al. Research on the Evolution Technology of Livelihood Topics in District Areas from Multiple Data Resources[J]. Journal of Nanjing Normal University (Natural Science Edition), 2023, 46(03): 105-111. [doi:10.3969/j.issn.1001-4616.2023.03.014]
Research on the Evolution Technology of Livelihood Topics in District Areas from Multiple Data Resources
Journal of Nanjing Normal University (Natural Science Edition) [ISSN: 1001-4616 / CN: 32-1239/N]

Volume:
Vol. 46
Issue:
2023, No. 03
Pages:
105-111
Section:
Computer Science and Technology
Publication date:
2023-09-15

Article Info

Title:
Research on the Evolution Technology of Livelihood Topics in District Areas from Multiple Data Resources
Article ID:
1001-4616(2023)03-0105-07
Author(s):
Zhang Xiaoming, Shen Qing, Wang Fang, Zhao Peisen, Yu Zhanlu
(College of Information Engineering, Beijing Institute of Petrochemical Technology, Beijing 102617, China)
Keywords:
topic evolution; livelihood; evolution rate; evolution index; COVID-19 epidemic
CLC number:
TP391
DOI:
10.3969/j.issn.1001-4616.2023.03.014
Document code:
A
Abstract:
People's livelihood has always been a key social topic, and epidemic prevention and control over the past two years has added new content to how such topics come into focus and evolve. Based on a large volume of regional livelihood data, this paper analyzes the perplexity of LDA models and shows that topics drawn from multi-source text are more comprehensive than those drawn from a single source. It then proposes a technical framework for livelihood topic evolution and designs calculation methods and algorithms for the heat evolution rate (ER) and the keyword ER. Building on the HTDI model and the keyword ER, it designs a composite livelihood topic evolution index (LTEI). The experimental data were collected from official Weibo accounts and Baidu Tieba for Daxing District, Beijing. The results show that the TF-IDF model is better suited than the TextRank model to computing the keyword ER, and that, compared with the HTDI index, the LTEI index fits the actual topic evolution trend more closely and is better suited to analyzing the evolution of regional livelihood topics.
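
The abstract names a TF-IDF-based keyword evolution rate but does not give its formula. The sketch below is only an illustration of the general idea, under stated assumptions: it ranks the terms of each time window by TF-IDF and measures the change between two adjacent windows as one minus the Jaccard similarity of their top-k keyword sets. The function names `top_keywords` and `keyword_turnover`, the Jaccard-based measure, and the toy documents are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def top_keywords(window_docs, all_docs, k=3):
    """Return the k highest-scoring TF-IDF terms for one time window.

    window_docs: tokenized documents (lists of str) in the window.
    all_docs: every tokenized document across all windows, used for IDF.
              It must include the documents in window_docs.
    """
    n_docs = len(all_docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in all_docs for term in set(doc))
    # Term frequency pooled over all documents in the window.
    tf = Counter(term for doc in window_docs for term in doc)
    total = sum(tf.values())
    scores = {
        term: (count / total) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }
    # Highest score first; alphabetical order breaks ties deterministically.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return set(ranked[:k])


def keyword_turnover(prev_keywords, curr_keywords):
    """Assumed change measure: 1 - Jaccard similarity of two keyword sets."""
    union = prev_keywords | curr_keywords
    if not union:
        return 0.0
    return 1.0 - len(prev_keywords & curr_keywords) / len(union)


# Toy, made-up data: two time windows of already tokenized posts.
window_1 = [["school", "bus", "route"], ["school", "enrollment"]]
window_2 = [["vaccine", "site", "school"], ["vaccine", "queue"]]
corpus = window_1 + window_2

kw_1 = top_keywords(window_1, corpus)
kw_2 = top_keywords(window_2, corpus)
rate = keyword_turnover(kw_1, kw_2)
```

A value of `rate` near 0 means the two windows share most of their top keywords, and a value near 1 means the keyword set has largely been replaced. Swapping TextRank in for TF-IDF would only change `top_keywords`, which is one way to compare the two keyword extractors mentioned in the abstract.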

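Memo:
Received: 2022-08-08.
Funding: Beijing Outstanding Talents Project (ZZB2019005); Beijing Science and Technology Plan General Project (KM202010017011).
Corresponding author: Zhang Xiaoming, Ph.D., professor. Research interests: network information hiding, big data technology, and intelligent computing. E-mail: zhangxiaoming@bipt.edu.cn
Last Update: 2023-09-15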