Similar literature
19 similar documents found (search time: 112 ms)
1.
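Automatic Identification of Chinese Names   Total citations: 48, self-citations: 16, citations by others: 32
Identifying Chinese names is of great significance to research on automatic Chinese word segmentation. This paper proposes an algorithm for automatically identifying Chinese names in Chinese text. We randomly selected 300 sentences containing Chinese names from the Xinhua News Agency news corpus as test samples. Experimental results show that recall reached 99.77%.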

2.
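A Surname-Driven Method for Automatic Recognition of Chinese Names   Total citations: 3, self-citations: 3, citations by others: 3
Based on surname-driven processing and contextual information, and using a large amount of statistical data obtained from a database of real name samples and a text corpus, this paper proposes a hierarchical weighted filtering model for Chinese name recognition. A recognition algorithm based on this model, together with a conflict-resolution strategy, achieves automatic recognition of Chinese personal names. Tests on 500 sentences containing personal names, randomly selected from the People's Daily, show a recall of 89.2% and a precision of 93.15% for Chinese names.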

3.
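Based on statistics gathered from a large-scale name sample database and a name corpus, this paper classifies the surnames in the surname database by priority and studies the contextual character patterns of the top 300 surnames when they are used as ordinary single characters. It treats the process of determining true surnames and recognizing names as a process of partitioning the set of surname characters in a sentence, and designs and implements an experimental system for automatic Chinese name recognition. It also proposes the concept of multi-level thresholds, in which both the right-boundary threshold of a name and the recognition threshold are functions of priority. In open testing, the system's recall and precision were 80.62% and 89.27%, respectively.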

4.
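The existing functional modules of the Jilin Unicom BSS system provide no ID card scanning or image storage. ID card numbers and names are entered manually by front-desk staff, photocopies are kept on paper, images cannot be stored digitally, and it is not easy to check whether a customer presented the original of a valid document when handling business. A system is therefore needed that automatically scans ID cards and recognizes the ID card number and name. When the Jilin Unicom BSS system collects customer information, an ID card scanner scans the customer's ID card and pushes the information into the corresponding input text boxes of the existing BSS system. The customer's ID card number, name, and other information are saved to the BSS back-end database, and the scanned ID card image is saved as a file on the FTP server in the Jilin Unicom back-end system, with support for querying and statistics by ID card number or name.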

5.
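Clustering-Based Fuzzy Fault Diagnosis for Diesel Engines   Total citations: 1, self-citations: 1, citations by others: 0
In research on fault diagnosis for diesel engine systems, with the aim of predicting fault types safely and accurately, a method is proposed for extracting fuzzy rules from historical diesel engine fault data samples. It converts the expert knowledge implicit in the samples into more easily understood fuzzy rules and uses them to build a fuzzy fault classifier and a fuzzy fault diagnosis system. Subtractive clustering is first used to obtain representative samples from the fault data; fuzzy rules are generated from these samples and optimized, and then, through rule synthesis...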

6.
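Automatic Recognition of Chinese Names Based on Associativity   Total citations: 7, self-citations: 1, citations by others: 6
In Chinese word segmentation systems, name recognition has always been a relatively difficult part. This paper takes the associativity between a name and the words that precede and follow it as its point of entry, and marks names during segmentation preprocessing. Test results on the Xinhua News Agency corpus are satisfactory, and the system also has distinctive openness and a self-learning capability.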

7.
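Chinese names are recognized using a statistical method. The method divides recognition into two stages: name candidate generation and name confirmation. A hidden Markov model (HMM) classifier proposes candidate names from unsegmented strings of Chinese characters. The mutual information between a personal name and its context words is then used for final confirmation of the candidates. The method is fully data-driven and requires no name recognition templates or rules. Experimental results show a recall of 82.7% and a precision of 89.6%.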
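The confirmation stage described above relies on mutual information between a candidate name and its context words. The abstract does not give the exact formula, so the following is only a minimal sketch that assumes standard pointwise mutual information; the function `pmi_table` and the toy `observations` are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def pmi_table(pairs):
    """Pointwise mutual information for each observed (candidate, context_word) pair."""
    pair_counts = Counter(pairs)
    cand_counts = Counter(cand for cand, _ in pairs)
    word_counts = Counter(word for _, word in pairs)
    total = len(pairs)
    return {
        (cand, word): math.log2(
            (count / total)
            / ((cand_counts[cand] / total) * (word_counts[word] / total))
        )
        for (cand, word), count in pair_counts.items()
    }


# Hypothetical toy data: each tuple is (candidate string, neighbouring context word).
observations = [("A", "x"), ("A", "x"), ("B", "y"), ("B", "x")]
scores = pmi_table(observations)
for pair, value in sorted(scores.items()):
    print(pair, round(value, 3))
```

A positive score means the candidate and the context word co-occur more often than chance would predict; a confirmation step could, for example, accept a candidate whose score with a title or speech verb exceeds a tuned threshold.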

8.
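This paper proposes a Chinese name recognition method based on statistics and rules. Statistical information is used to obtain candidate Chinese names, which are then filtered using related information such as boundary-indicator words and titles that appear before and after a name, completing the recognition. Experiments show that the method's precision and recall are relatively high. Because Chinese names account for a large proportion of out-of-vocabulary words, the method can also help further improve the results of automatic Chinese word segmentation.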

9.
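A recognition and verification system for second-generation ID cards is introduced. To address the problem that only a single ID photo sample is available, the system proposes a method that virtually expands the single second-generation ID card photo into multiple samples. The system reduces, to some extent, the effect of face pose variation on the recognition rate, and the effectiveness of the method is verified on a database of actually collected data.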

10.
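This paper describes a Japanese kanji name input system that converts Japanese kana to kanji. A name entered on a kana keyboard is converted into a set of possible homophonic kanji names, which are displayed on a CRT monitor. The operator then selects the name actually intended. For fast and accurate operation, the system's main features are a two-level selection list and a priority tree structure for the conversion table (based on frequency of occurrence). The system is suitable for personnel information retrieval.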

11.
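To address the problem of complex feature extraction in sentiment analysis of online reviews, a rough-set-based sentiment analysis model for online reviews is proposed. Traditional bag-of-words features are analyzed, the role of fixed-collocation features in sentiment polarity classification is pointed out, and the rough set method is used to mine fixed-collocation features from online reviews and incorporate them into sentiment analysis models such as SVM and Naive Bayes. Sentiment analysis results on real hotel reviews show that, after the rough rules are added, the sentiment classification accuracy of both the SVM and Naive Bayes models improves.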

12.
Context: Object-oriented software undergoes continuous changes, often made without consideration of the software's overall structure and design rationale. Hence, over time, the design quality of the software degrades, causing software aging or software decay. Refactoring offers a means of restructuring software design to improve maintainability. In practice, the effort that can be invested in refactoring is limited; the problem therefore calls for a method for identifying cost-effective refactorings that efficiently improve maintainability. The cost-effectiveness of applied refactorings can be expressed as maintainability improvement over invested refactoring effort (cost). The more cost-effective the refactorings applied to a system, the more its maintainability improves. Several studies support the argument that changes are more likely to occur in the pieces of code most frequently used by users; hence, applying refactorings to these parts would quickly improve the maintainability of the software. For this reason, dynamic information is needed to identify the entities involved in given scenarios/functions of a system, and refactoring candidates need to be extracted from within these entities.
Objective: This paper provides an automated approach to identifying cost-effective refactorings using dynamic information in object-oriented software.
Method: To perform cost-effective refactoring, refactoring candidates are extracted in a way that reduces dependencies; these dependencies are referred to as the dynamic information. The dynamic profiling technique is used to obtain the dependencies of entities based on dynamic method calls. Based on those dynamic dependencies, refactoring-candidate extraction rules are defined, and a maintainability evaluation function is established. Then, refactoring candidates are extracted and assessed using the defined rules and the evaluation function, respectively. The best refactoring (i.e., the one that most improves maintainability) is selected from among the refactoring candidates; candidate extraction and assessment are then repeated to select the next refactoring, and the identification process is iterated until no more refactoring candidates that improve maintainability are found.
Results: We evaluate our proposed approach on three open-source projects. The first set of results shows that dynamic information is helpful in identifying cost-effective refactorings that quickly improve maintainability, and that considering dynamic information in addition to static information provides even more opportunities to identify cost-effective refactorings. The second set of results shows that dynamic information is helpful in extracting refactoring candidates in the classes where real changes had occurred; in addition, the results offer promising support for the contention that using dynamic information helps to extract refactoring candidates from highly ranked, frequently changed classes.
Conclusion: Our proposed approach helps to identify cost-effective refactorings and supports an automated refactoring identification process.
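The iterative selection loop in the Method section can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the only refactoring considered is a hypothetical "move method", maintainability is approximated by the number of profiled calls that cross class boundaries, and every move is assumed to have the same cost. The names `cross_class_calls`, `greedy_refactor`, `owner`, and `calls` are invented for this example.

```python
def cross_class_calls(owner, calls):
    """Count profiled calls whose caller and callee live in different classes."""
    return sum(n for (src, dst), n in calls.items() if owner[src] != owner[dst])


def greedy_refactor(owner, calls):
    """Repeatedly apply the single move that most reduces cross-class calls."""
    owner = dict(owner)  # work on a copy so the input mapping is untouched
    applied = []
    while True:
        current = cross_class_calls(owner, calls)
        best = None
        for method in owner:
            for target in sorted(set(owner.values()) - {owner[method]}):
                trial = {**owner, method: target}
                gain = current - cross_class_calls(trial, calls)
                if gain > 0 and (best is None or gain > best[0]):
                    best = (gain, method, target)
        if best is None:  # no candidate improves the score: stop iterating
            return owner, applied
        _, method, target = best
        owner[method] = target
        applied.append((method, target))


# Hypothetical profile: method -> owning class, (caller, callee) -> call count.
owner = {"load": "Store", "save": "Store", "total": "Store",
         "format": "Report", "render": "Report"}
calls = {("load", "save"): 3, ("format", "total"): 4, ("render", "format"): 2}
final_owner, applied = greedy_refactor(owner, calls)
print(applied)
print(cross_class_calls(final_owner, calls))
```

Each pass re-extracts and re-scores all candidates against the current design, which mirrors the "select the best, then repeat" structure of the abstract; a fuller version would divide each gain by an effort estimate to rank by cost-effectiveness.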

13.
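With the rapid development of social media, information overload is becoming increasingly serious, so discovering and mining hot topics or hot events from massive, short, and noisy social media data has become an important problem. Taking into account that social media data are real-time, geographic, and rich in metadata, a hot-topic mining method that combines user behavior analysis with text content analysis is proposed. In content analysis, clustering at the finer granularity of words is proposed in place of the classical approach of clustering at message granularity. To improve topic keyword extraction, word-vector techniques are introduced and hot topics are mined by semantic clustering. Experimental results on real datasets show that the keywords extracted by this method are strongly related semantically, that topics are well separated, and that the method outperforms traditional hot-topic mining methods on the main metrics.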

14.
Video surveillance is an active research topic in computer vision. In this paper, a human and car identification technique suitable for real-time video surveillance systems is presented. The proposed technique includes background subtraction, foreground segmentation, shadow removal, feature extraction, and classification. Features of the extracted foreground objects are computed via a new set of affine moment invariants based on a statistical method, and these are used to identify humans or cars. When partial occlusion occurs, although features of the full body cannot be extracted, the proposed technique extracts the features of the head and shoulders. It can identify humans from the head-shoulder region under up to 60%–70% occlusion. Thus, it offers better classification and addresses the issue of the loss of property arising from humans being easily occluded in practical applications. The whole system runs at approximately 16 to 29 fps and is thus suitable for real-time applications. The accuracy of the proposed technique is 98.33% for identifying humans and 94.41% for identifying cars. The overall accuracy in identifying humans and cars is 98.04%. The experimental results show that this method is effective and robust.

15.
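Identifying the query intent of search engine users is a much-studied topic in information retrieval. This paper proposes a method that fuses multiple types of features to identify Web query intent. Web query intent identification is treated as a classification problem, and effective classification features are extracted from different types of resources, including the query text, the content returned by the search engine, and Web query logs. Query intent identification experiments using this method on a manually annotated corpus of real Web queries show that each type of feature helps to some degree, and that when these features are combined, 88.5% of test queries receive a correct intent identification.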

16.
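Corpus-Based Automatic Chinese Word Segmentation in a Text-to-Speech System   Total citations: 9, self-citations: 0, citations by others: 9
Based on a practical text-to-speech system, this paper introduces some processing methods. An improved maximum matching method is adopted that can segment out all overlapping ambiguities, and a statistical-model-based algorithm is proposed to handle fields containing multiple overlapping ambiguities. From a practical standpoint, exhaustive enumeration combined with some simple rules is used to resolve the variant readings of polyphonic characters and to recognize Chinese names automatically. Together these address Chinese segmentation ambiguity, polyphonic word handling, and automatic Chinese name recognition, in support of text-to-speech conversion.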
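For orientation, the baseline that the abstract improves on can be sketched as plain forward maximum matching plus a simple scan for overlapping ambiguities. This is not the paper's improved variant, whose details the abstract does not give; the lexicon, the `max_len` limit, and both function names are assumptions made for the example.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy left-to-right segmentation: take the longest lexicon word at each position."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest window first and shrink until a lexicon word matches;
        # a single character is always accepted as a fallback.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


def overlap_ambiguities(text, lexicon, max_len=4):
    """Return (left, right) pairs of multi-character lexicon words that overlap in text."""
    spans = [
        (i, j)
        for i in range(len(text))
        for j in range(i + 2, min(i + max_len, len(text)) + 1)
        if text[i:j] in lexicon
    ]
    return [
        (text[a:b], text[c:d])
        for a, b in spans
        for c, d in spans
        if a < c < b < d
    ]


# Hypothetical four-word lexicon and a classic ambiguous string.
lexicon = {"研究", "研究生", "生命", "起源"}
segmented = forward_max_match("研究生命起源", lexicon)
overlaps = overlap_ambiguities("研究生命起源", lexicon)
print(segmented)
print(overlaps)
```

On this input the greedy pass commits to "研究生" and strands "命", while the overlap scan flags the competing pair ("研究生", "生命"); a statistical model of the kind the abstract mentions would then choose between the alternative segmentations.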

17.
This paper emphasizes the importance of making field measurements for effective and realistic dependability evaluations. Two examples are given, both based on real data from IBM mainframes. The first evaluates the impact of the operating environment on system failure characteristics, and the second shows how an accurate model depicting this interaction can be extracted from real data.

18.
19.
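Computer-Vision-Based Analysis of Plant Black Rot Lesions   Total citations: 1, self-citations: 0, citations by others: 1
A method for analyzing plant black rot lesions based on image processing and neural network techniques is presented. Image processing is used to extract the geometric and color features of a lesion: geometric features are extracted from the lesion shape, and color moments extracted in HSV space are combined with red-green color features to form the lesion's color features. A neural network then performs recognition to determine the growth stage of the lesion. The overall design and implementation of the system are given. The results show that the system achieves fairly good detection performance. This approach offers a new method for plant disease detection and analysis.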
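The HSV color moments mentioned above can be illustrated with a short sketch. It assumes the common definition of the first three color moments (mean, standard deviation, and the signed cube root of the third central moment) and treats hue as a linear rather than circular quantity, which is a simplification; the abstract does not say which definition the authors used, and `hsv_color_moments` and `pixels` are hypothetical names.

```python
import colorsys
import math


def hsv_color_moments(rgb_pixels):
    """Return (mean, std, skewness) for the H, S and V channels of 8-bit RGB pixels."""
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in rgb_pixels]
    moments = []
    for channel in zip(*hsv):  # iterate over the H, S, V value sequences in turn
        n = len(channel)
        mean = sum(channel) / n
        var = sum((v - mean) ** 2 for v in channel) / n
        third = sum((v - mean) ** 3 for v in channel) / n
        # Signed cube root keeps the skewness in the same units as the channel.
        skew = math.copysign(abs(third) ** (1 / 3), third)
        moments.append((mean, math.sqrt(var), skew))
    return moments


# Hypothetical two-pixel "lesion": one black pixel and one white pixel.
pixels = [(0, 0, 0), (255, 255, 255)]
h_moments, s_moments, v_moments = hsv_color_moments(pixels)
print(v_moments)
```

The nine resulting numbers form a compact color descriptor that could be concatenated with geometric features before being passed to a classifier.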
