1.
2.
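Li Siyu 《电子技术与软件工程》 (Electronic Technology & Software Engineering) 2016,(19):71
Studies show that duplicated code is widespread in software systems; to maintain and refactor a system, developers need code similarity detection tools to find it. Most traditional code similarity detection methods and tools are tied to a specific programming language and target platform, yet many software systems contain source code written in several languages. To free code similarity detection from any particular language or platform, this paper proposes a detection method based on intermediate representation (IR). The IR captures the essential information about a program's execution and is largely unaffected by language and platform, so an IR-based method can measure code similarity effectively while working across languages and platforms. The core idea is to compile the source code to IR with a compiler, compare the IR as text, and then use locality-sensitive hashing to detect similar code pairs efficiently. Experiments show that the IR-based method achieves higher precision than other methods.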
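The last step of the pipeline described in this abstract (text comparison of IR followed by locality-sensitive hashing) can be illustrated with a minimal MinHash sketch. The abstract does not say which LSH family, shingle size, or banding scheme the author used, so everything below is an assumption: the function names, the 3-token shingles, the 32 hash functions, and the 8 bands are hypothetical choices for illustration, and the IR is treated as whitespace-separated text.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(ir_text, k=3):
    """Return the set of k-token shingles of an IR listing (k is an assumed value)."""
    tokens = ir_text.split()
    return {" ".join(tokens[i:i + k]) for i in range(max(len(tokens) - k + 1, 1))}


def _hash(seed, shingle):
    """One member of a seeded hash family, built on BLAKE2b with the seed as salt."""
    digest = hashlib.blake2b(
        shingle.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(shingle_set, num_hashes=32):
    """MinHash signature: the minimum hash value per seeded hash function."""
    return [min(_hash(seed, s) for s in shingle_set) for seed in range(num_hashes)]


def candidate_pairs(signatures, bands=8):
    """Banding step of LSH: two items become a candidate pair if any band matches.

    signatures maps an item name to its MinHash signature.
    """
    rows = len(next(iter(signatures.values()))) // bands
    buckets = defaultdict(set)
    for name, sig in signatures.items():
        for b in range(bands):
            buckets[(b, tuple(sig[b * rows:(b + 1) * rows]))].add(name)
    pairs = set()
    for names in buckets.values():
        pairs.update(combinations(sorted(names), 2))
    return pairs


def jaccard(a, b):
    """Exact Jaccard similarity, used to confirm the candidates that LSH proposes."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

In this sketch LSH only narrows the search to candidate pairs; the exact Jaccard score on the shingle sets is then computed for those candidates alone, which avoids comparing every pair of IR files.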
3.
4.
5.
6.
7.
8.
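In Web usage mining, sequence pattern clustering is an important topic, and its primary problem is measuring the similarity between Web sequence patterns. Most earlier methods measure only the sequences themselves, ignoring both the relationships among the resources in the system and the time factor of users' resource accesses. To address this, a similarity measure for Web access sequence patterns is proposed that takes resource similarity into account and also considers the time factor of users' resource accesses. Tests show that it reflects actual conditions effectively and faithfully.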
9.
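A Group-Like Signature Scheme Based on the Idea of Ring Signatures  Total citations: 10, self-citations: 0, citations by others: 10
Group signature schemes suffer from giving the manager too much power, while ring signature schemes cannot trace the signer's identity. Using the idea of ring signatures, this paper proposes a new group-signature-like anonymous signature scheme that resolves this conflict. Compared with existing group signature schemes, it retains some properties of ring signatures and therefore has the following advantages: (1) the manager's authority is limited, since he must cooperate with the signature recipient to trace the signer's identity; (2) the signer can flexibly and actively choose the scope of anonymity, that is, he can pick any d valid public keys and show that his own is among them; (3) adding and revoking users is particularly convenient, as the manager only needs to publish or delete the member's data on the bulletin board.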
10.
11.
12.
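SBHCF: A Hybrid Collaborative Filtering Recommendation Algorithm Based on Singular Value Decomposition  Total citations: 1, self-citations: 1, citations by others: 0
To address the low recommendation accuracy caused by unreasonable nearest-neighbor search in traditional collaborative filtering, a hybrid similarity algorithm based on matrix factorization is proposed. The method combines the model-based singular value decomposition algorithm with the Pearson correlation coefficient used in neighborhood-based collaborative filtering, and introduces a threshold and the Jaccard coefficient to correct the similarity. Experiments on public data sets show that the mean absolute error of the proposed algorithm is at least 7.7% lower than that of traditional recommendation algorithms, effectively improving recommendation accuracy.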
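The similarity correction mentioned in this abstract (Pearson correlation adjusted by a threshold and the Jaccard coefficient) can be sketched as follows. The abstract does not give the actual SBHCF formula, so the combination used here, Pearson multiplied by Jaccard with a minimum number of co-rated items, is only one plausible reading and is labeled as an assumption. The singular value decomposition stage is omitted, and the names `corrected_similarity` and `min_common` are hypothetical.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation over the items both users rated (dicts: item -> rating)."""
    common = sorted(set(u) & set(v))
    if len(common) < 2:
        return 0.0
    mean_u = sum(u[i] for i in common) / len(common)
    mean_v = sum(v[i] for i in common) / len(common)
    num = sum((u[i] - mean_u) * (v[i] - mean_v) for i in common)
    var_u = sum((u[i] - mean_u) ** 2 for i in common)
    var_v = sum((v[i] - mean_v) ** 2 for i in common)
    den = sqrt(var_u * var_v)
    return num / den if den else 0.0


def jaccard(u, v):
    """Share of rated items that the two users have in common."""
    union = set(u) | set(v)
    return len(set(u) & set(v)) / len(union) if union else 0.0


def corrected_similarity(u, v, min_common=2):
    """Assumed correction: drop pairs below the threshold, then scale Pearson by Jaccard."""
    if len(set(u) & set(v)) < min_common:
        return 0.0
    return pearson(u, v) * jaccard(u, v)
```

The point of the correction in this sketch is that two users who agree perfectly on a handful of shared items, but have rated mostly different items, receive a lower similarity than raw Pearson correlation would give them.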
13.
Yuki Sakumichi Masanori Akiyoshi Masaki Samejima Hironori Oka 《Electronics and Communications in Japan》2014,97(3):38-44
This paper discusses how to detect inquiry e‐mails corresponding to predefined FAQs (frequently asked questions). Web‐based interactions such as ordering and registration forms on a Web page are usually provided with FAQ pages to help users. However, most users submit their inquiry e‐mails without checking such pages. This causes help desk operators to process large numbers of e‐mails even if some contents match FAQs. Automatic detection of such e‐mails is proposed based on an SVM (support vector machine) and a specific Jaccard coefficient based on positive and negative already‐received inquiry e‐mails. Experimental results show its effectiveness, and we also discuss future work to improve our method.
14.
Jun Liu Nan Chang Sanguo Zhang Zhenming Lei 《International Journal of Communication Systems》2015,28(12):1884-1897
The user clients for accessing the Internet are increasingly shifting from desktop computers to cellular devices. To be competitive in the rapidly changing market, operators, Internet service providers and application developers are required to have the capability of recognizing the models of cellular devices and understanding the traffic dynamics of cellular data networks. In this paper, we propose a novel Jaccard measurement‐based method to recognize cellular device models from network traffic data. This method is implemented as a scalable parallel MapReduce program and achieves a high accuracy, 91.5%, in the evaluation with 2.9 billion traffic records collected from the real network. Based on the recognition results, we conduct a comprehensive study of three characteristics of network traffic from a device model perspective: the network access time, the traffic volume, and the diurnal patterns. The analysis results show that the distribution of network access time can be modeled by a two‐component Gaussian mixture model, and the distribution of traffic volumes is highly skewed and follows the power law. In addition, seven distinct diurnal patterns of cellular device usage are identified by applying an unsupervised clustering algorithm to the collected massive traffic data. Copyright © 2014 John Wiley & Sons, Ltd.
15.
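In recent years, the flood of spam SMS from fake base stations has been a major reason why spam SMS cannot be eradicated. In contrast to the currently common approach of combating fake-base-station spam from the terminal side or the network side alone, this paper proposes a method in which the terminal and service sides cooperate: the service side digitally signs port-type SMS messages and the terminal side verifies them. Without affecting the user's SMS experience, this achieves 100% identification and interception on smart terminals of port-type SMS sent by fake base stations.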
16.
17.
18.
The ability to recognise mobile devices accurately and scalably is critically important for mobile network operators and ISPs to understand their customers’ behaviours and enhance their user experience. In this paper, we propose a novel method for mobile device model recognition by using statistical information derived from large amounts of mobile network traffic data. Specifically, we create a Jaccard-based coefficient measure method to identify a proper keyword representing each mobile device model from massive unstructured textual HTTP access logs. To handle the large amount of traffic data generated from large mobile networks, this method is designed as a set of parallel algorithms, and is implemented through the MapReduce framework, which is a distributed parallel programming model with proven low-cost and high-efficiency features. Evaluations using real data sets show that our method can accurately recognise mobile client models while meeting the scalability and producer-independency requirements of large mobile network operators. Results show that a 91.5% accuracy rate is achieved for recognising mobile client models from 2 billion records, which is dramatically higher than existing solutions.
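The keyword-selection idea in this abstract (a Jaccard-based measure that picks a representative keyword per device model from HTTP access logs) can be illustrated with a small single-process sketch. The abstract does not define the measure or its inputs, so this is an assumption throughout: it presumes a sample of log lines already tagged with a model label, scores each token by the Jaccard overlap between "records containing the token" and "records of the model", and ignores the MapReduce parallelisation. The function name and record layout are hypothetical.

```python
from collections import defaultdict


def best_keyword_per_model(records):
    """Pick, for each model, the token whose record set best overlaps the model's.

    records is a list of (model_label, log_text) pairs (an assumed input format).
    Returns {model_label: (keyword, jaccard_score)}.
    """
    ids_by_keyword = defaultdict(set)
    ids_by_model = defaultdict(set)
    for idx, (model, text) in enumerate(records):
        ids_by_model[model].add(idx)
        for token in set(text.lower().split()):
            ids_by_keyword[token].add(idx)

    best = {}
    for model, model_ids in ids_by_model.items():
        # Jaccard = |records with keyword AND model| / |records with keyword OR model|
        scored = (
            (len(ids & model_ids) / len(ids | model_ids), keyword)
            for keyword, ids in ids_by_keyword.items()
        )
        score, keyword = max(scored)
        best[model] = (keyword, score)
    return best
```

A token that appears in every record (such as a browser name) scores low for every model because the union is large, while a token that appears in exactly the records of one model scores 1.0, which is the behaviour a representative keyword should have under this reading.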
19.