
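A Word2Vec-Based Spelling Error Detection Algorithm for Programming-Domain Words
Cite this article: Liu Junsong, Tang Mingjing, Xue Gang, Yang Chengrong. A Word2Vec-Based Spelling Error Detection Algorithm for Programming-Domain Words[J]. Computer Applications and Software, 2022, 39(3): 277-284. DOI: 10.3969/j.issn.1000-386x.2022.03.045
Authors: Liu Junsong  Tang Mingjing  Xue Gang  Yang Chengrong
Affiliations: School of Software, Yunnan University, Kunming 650000, Yunnan; School of Life Sciences, Yunnan Normal University, Kunming 650000, Yunnan; Liupanshui Normal University, Liupanshui 553004, Guizhou
Abstract: Stack Overflow is a question-and-answer community for computer programming whose text holds a large amount of valuable information to mine, but the text itself contains many misspelled words, which interferes with text analysis. To address this, an automatic word detection and correction algorithm is proposed. It uses word-vector techniques, with semantic similarity at the core, to analyze erroneous words, and combines them with an improved edit distance algorithm to detect and correct errors in text automatically. Experimental results show that the algorithm can, for such...

Keywords: word vector  edit distance  spelling correction  Word2Vec  Stack Overflow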

SPELLING CHECK ALGORITHM OF WORDS IN THE FIELD OF PROGRAMMING BASED ON WORD2VEC
Liu Junsong, Tang Mingjing, Xue Gang, Yang Chengrong. SPELLING CHECK ALGORITHM OF WORDS IN THE FIELD OF PROGRAMMING BASED ON WORD2VEC[J]. Computer Applications and Software, 2022, 39(3): 277-284. DOI: 10.3969/j.issn.1000-386x.2022.03.045
Authors: Liu Junsong  Tang Mingjing  Xue Gang  Yang Chengrong
Affiliation: (School of Software, Yunnan University, Kunming 650000, Yunnan, China; School of Life Sciences, Yunnan Normal University, Kunming 650000, Yunnan, China; Liupanshui Normal University, Liupanshui 553004, Guizhou, China)
Abstract: Stack Overflow is a question-and-answer community for computer programming whose text contains a great deal of valuable information to mine. However, the text also contains many misspelled words, which hampers analysis. This paper proposes an algorithm that automatically detects and corrects such words. It analyzes erroneous words using word vectors, with semantic similarity at its core, and combines this with a modified edit distance algorithm to detect and correct errors in the text automatically. Experiments show that the proposed algorithm can automatically detect and correct errors in text from specialized domains such as this one, and that it restores the semantics of the standard text well.
Keywords: Word vector  Edit distance  Spelling correction  Word2Vec  Stack Overflow
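
The general idea in the abstract, pairing word-vector similarity with edit distance, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe their modified edit distance, so the standard Levenshtein distance is used here instead. The toy vectors are hand-written stand-ins for trained Word2Vec embeddings, and the function names, vocabulary, and thresholds are hypothetical.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Standard Levenshtein distance: insert, delete, substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cosine(u, v) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def suggest_correction(word, vectors, vocabulary,
                       max_distance=2, min_similarity=0.8):
    """Return the best in-vocabulary correction for `word`, or None.

    A candidate must be close in spelling (edit distance) and close in
    meaning (cosine similarity). Both thresholds are assumed values.
    """
    if word in vocabulary or word not in vectors:
        return None  # already a known word, or no vector to compare with
    best, best_key = None, None
    for candidate in vocabulary:
        dist = levenshtein(word, candidate)
        if dist > max_distance:
            continue
        sim = cosine(vectors[word], vectors[candidate])
        if sim < min_similarity:
            continue
        key = (-sim, dist)  # prefer higher similarity, then smaller distance
        if best_key is None or key < best_key:
            best, best_key = candidate, key
    return best


# Hand-made toy vectors standing in for embeddings trained on forum text.
TOY_VECTORS = {
    "javascript": [0.90, 0.10, 0.20],
    "javascirpt": [0.88, 0.12, 0.21],  # misspelling seen in similar contexts
    "java": [0.20, 0.90, 0.10],
    "python": [0.10, 0.20, 0.90],
}
VOCABULARY = {"javascript", "java", "python"}

print(suggest_correction("javascirpt", TOY_VECTORS, VOCABULARY))
```

The sketch relies on the assumption that a misspelled word tends to occur in the same contexts as its correct form, so their embeddings end up close together. The edit distance filter then excludes words that are semantically related but spelled quite differently.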
This article is indexed in databases including VIP and Wanfang Data.