1.

Collusion set detection using graph clustering

Girish Keshav Palshikar Manoj M. Apte 《Data mining and knowledge discovery》2008,16(2):135-164

Many malpractices in stock market trading (e.g., circular trading and price manipulation) use the modus operandi of collusion. Informally, a set of traders is a candidate collusion set when they have “heavy trading” among themselves, as compared to their trading with others. We formalize the problem of detecting collusion sets, if any, in a given trading database. We show that naïve approaches are inefficient for real-life situations. We adapt and apply two well-known graph clustering algorithms to this problem. We also propose a new graph clustering algorithm, specifically tailored for detecting collusion sets. A novel feature of our approach is the use of the Dempster–Shafer theory of evidence to combine the candidate collusion sets detected by individual algorithms. Treating individual experiments as evidence, this approach allows us to quantify the confidence (or belief) in the candidate collusion sets. We present detailed simulation experiments to demonstrate the effectiveness of the proposed algorithms.
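
The abstract does not say how the outputs of the individual algorithms are encoded as evidence, so the following is only a minimal sketch of Dempster's rule of combination, the core operation of Dempster–Shafer theory. The two-hypothesis frame, the mass values, and the names `combine`, `m_alg1` and `m_alg2` are illustrative assumptions, not taken from the paper.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1].
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass that falls on contradictory evidence
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalise so the remaining masses sum to 1.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

# Hypothetical frame: is a given trader set a collusion set or not?
YES = frozenset({"collusion"})
NO = frozenset({"no_collusion"})
EITHER = YES | NO  # mass left uncommitted

m_alg1 = {YES: 0.6, EITHER: 0.4}            # assumed evidence from one algorithm
m_alg2 = {YES: 0.5, NO: 0.2, EITHER: 0.3}   # assumed evidence from another
belief = combine(m_alg1, m_alg2)
```

With these assumed masses, the combined mass on `YES` is higher than either source assigned alone, which is the sense in which agreement between algorithms raises confidence in a candidate set.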

2.

Autonomous clustering using rough set theory (Total citations: 1, self-citations: 0, citations by others: 1)

Charlotte Bean Chandra Kambhampati 《International Journal of Automation and Computing》2008,5(1):90-102

This paper proposes a clustering technique that minimizes the need for subjective human intervention and is based on elements of rough set theory (RST). The proposed algorithm is unified in its approach to clustering and makes use of both local and global data properties to obtain clustering solutions. It handles single-type and mixed attribute data sets with ease. The results from three data sets of single and mixed attribute types are used to illustrate the technique and establish its efficiency.

3.

A new fuzzy C-means clustering algorithm for interval data

Yue Mingdao 《Computer Engineering and Applications》2011,47(13):157-160

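Building on the traditional fuzzy C-means clustering algorithm, a new fuzzy clustering algorithm for interval-valued data is proposed. An interval-partitioning strategy is used to improve the formula for the distance between intervals, resolving the defects of existing interval distance calculations. A mathematical model for fuzzy clustering of interval-valued data is presented, and the fuzzy C-means algorithm is extended to cluster interval-valued data. Simulations verify the effectiveness of the proposed algorithm.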

4.

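Research on a network security evaluation method based on BP neural networks (Total citations: 2, self-citations: 0, citations by others: 2)

Yu Qun, Feng Ling 《Computer Engineering and Design》2008,29(8):1963-1966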

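Network security spans many fields, including computing, communications, physics, mathematics, biology, management, and society, and is a complex systems engineering problem. A scientifically sound assessment therefore requires a systems engineering approach to evaluating the security status of the whole network. Based on the analytic hierarchy process, a BP neural network evaluation model is constructed that can be used for comprehensive evaluation of network security levels, yielding more scientific and reasonable evaluation results. The work offers a new approach to comprehensively evaluating the security status of computer networks, with considerable theoretical value and broad application prospects for network security assessment and certification.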

5.

Interval iterations for including a set of solutions

Prof. Dr. R. Krawczyk 《Computing》1984,32(1):13-31

To include a set of solutions of a function strip, we apply interval iterations. To construct a sequence of intervals converging to a fix-interval or a pseudofix-interval, we make use of three kinds of interval operators with different properties and two iteration methods which are in accordance with the assumptions.

6.

A dual clustering method for mixed-attribute data sets

Chen Xinquan 《Computer Engineering & Science》2013,35(2):127-132

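To meet data preprocessing needs in complex information environments, a dual clustering method that can handle mixed-attribute data sets is proposed. The method is implemented by a construction algorithm for the dual nearest-neighbor undirected graph (or its improved variant), together with one of three dual nearest-neighbor graph clustering algorithms: one based on merging disjoint sets, one based on breadth-first search, or one based on depth-first search. Simulation experiments on synthetic data sets and standard UCI data sets confirm that, although the three clustering algorithms use different search strategies, their final results are identical. The experiments also show that, for data sets with a clearly clustered structure and no interference from neighboring noise, the method often achieves better clustering accuracy than the K-means and AP algorithms, which indicates that it is effective to some degree. To promote the method further and uncover its practical value, some directions for future research are given.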

7.

A rough set approach for selecting clustering attribute

Tutut Herawan Mustafa Mat Deris Jemal H. Abawajy 《Knowledge》2010,23(3):220-231

A few clustering techniques for categorical data exist to group objects having similar characteristics. Some are able to handle uncertainty in the clustering process while others have stability issues. However, the performance of these techniques is an issue due to low accuracy and high computational complexity. This paper proposes a new technique called maximum dependency attributes (MDA) for selecting a clustering attribute. The proposed approach is based on rough set theory and takes into account the dependency of attributes in the database. We analyze and compare the performance of the MDA technique with the bi-clustering, total roughness (TR) and min–min roughness (MMR) techniques based on four test cases. The results establish the better performance of the proposed approach.
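
The abstract names attribute dependency as the basis of MDA but does not give the selection rule itself. As a hedged illustration, the sketch below computes only the standard rough-set dependency degree of one attribute on another, that is, the fraction of objects whose equivalence class under the source attribute fits entirely inside one class of the target attribute. The toy table and the function names `partition` and `dependency` are assumptions made for this example.

```python
from collections import defaultdict

def partition(rows, attr):
    """Group row indices into equivalence classes by their value of attr."""
    classes = defaultdict(set)
    for i, row in enumerate(rows):
        classes[row[attr]].add(i)
    return list(classes.values())

def dependency(rows, target, source):
    """Degree to which `target` depends on `source`: |POS_source(target)| / |U|."""
    target_classes = partition(rows, target)
    positive = 0
    for block in partition(rows, source):
        # A source class counts only if it lies wholly inside one target class.
        if any(block <= t for t in target_classes):
            positive += len(block)
    return positive / len(rows)

# Hypothetical categorical table with two attributes.
rows = [
    {"color": "red", "size": "small"},
    {"color": "red", "size": "small"},
    {"color": "blue", "size": "large"},
    {"color": "blue", "size": "small"},
    {"color": "green", "size": "large"},
]
size_on_color = dependency(rows, "size", "color")
color_on_size = dependency(rows, "color", "size")
```

Here `size` depends on `color` more strongly than `color` depends on `size`; a method that ranks attributes by such degrees would compare values like these across all attribute pairs.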

8.

Semi-supervised kernel clustering based on a seed set (Total citations: 1, self-citations: 1, citations by others: 1)

Li Kunlun, Zhang Chao, Cao Zheng, Liu Ming 《Computer Engineering and Applications》2009,45(20):154-157

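A new semi-supervised kernel clustering algorithm, the SKK-means algorithm, is proposed. The algorithm uses a number of labeled samples to form a seed set, which serves as supervisory information to initialize the cluster centers of the K-means algorithm, guide the clustering process, and constrain the partition of the data. It also uses the kernel method to map the input data into a high-dimensional feature space and computes distances between samples with a kernel function. Numerical experiments were carried out on UCI data sets, and the algorithm was compared with the K-means and kernel K-means algorithms.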
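
The seed-set idea can be sketched in a few lines. This is a simplified illustration rather than the SKK-means algorithm itself: it uses plain Euclidean distance in the input space instead of a kernel-induced distance, and it assumes that "constraining the partition" means seeds keep their given labels. The function name `seeded_kmeans` and the sample points are hypothetical.

```python
def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sq_dist(p, q):
    """Squared Euclidean distance (a kernel-induced distance would go here)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def seeded_kmeans(points, seeds, iterations=20):
    """points: list of tuples; seeds: dict mapping point index -> cluster label."""
    labels = sorted(set(seeds.values()))
    # Initialise each centre from the mean of the seeds carrying that label.
    centres = {c: mean([points[i] for i, l in seeds.items() if l == c]) for c in labels}
    assign = {}
    for _ in range(iterations):
        # Seeds keep their label; every other point goes to the nearest centre.
        new_assign = {
            i: seeds[i] if i in seeds else min(labels, key=lambda c: sq_dist(p, centres[c]))
            for i, p in enumerate(points)
        }
        if new_assign == assign:
            break  # converged
        assign = new_assign
        centres = {c: mean([points[i] for i in assign if assign[i] == c]) for c in labels}
    return assign, centres

# Hypothetical data: two well-separated groups, one labeled seed in each.
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
seeds = {0: "a", 3: "b"}
assign, centres = seeded_kmeans(points, seeds)
```

Because every cluster always contains at least its own seeds, no centre is ever computed from an empty set, which avoids one of the usual failure modes of randomly initialized K-means.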

9.

Semi-supervised ISODATA clustering based on set pair analysis

Wei Xiaotao 《Computer Engineering and Applications》2009,45(36):99-100

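A semi-supervised ISODATA clustering algorithm based on set pair analysis is proposed for network anomaly detection. It improves on ISODATA in three respects. First, the algorithm handles data with mixed character and numeric attributes directly, using set pair analysis to compute the distance between data records. Second, it processes labeled and unlabeled data together and uses a small amount of labeled data to guide the splitting process. Finally, the number of input parameters is reduced to two. Experiments on the KDD99 intrusion detection data set show that the algorithm achieves a detection rate of 95.62% and a false alarm rate of 1.29%.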

10.

A clustering performance measure based on fuzzy set decomposition (Total citations: 4, self-citations: 0, citations by others: 4)

Backer E Jain AK 《IEEE transactions on pattern analysis and machine intelligence》1981,(1):66-75

Clustering is primarily used to uncover the true underlying structure of a given data set and, for this purpose, it is desirable to subject the same data to several different clustering algorithms. This paper attempts to put an order on the various partitions of a data set obtained from different clustering algorithms. The goodness of each partition is expressed by means of a performance measure based on a fuzzy set decomposition of the data set under consideration. Several experiments reported here show that the proposed performance measure puts an order on different partitions of the same data which is consistent with the error rate of a classifier designed on the basis of the obtained cluster labelings.

11.

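A hierarchical clustering algorithm based on the covering rough set model

Gong Kehua, Qiu Taorong, Xiong Shujie, Xu Su 《Computer Engineering and Design》2009,30(22)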

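Most current clustering algorithms are suited only to numerical data with single-valued attributes. A new clustering algorithm based on rough set theory is introduced that can cluster not only single-valued numerical data but also multi-valued non-numerical data. The algorithm uses the minimal attribute covering based on a tolerance relation to obtain the object-attribute information granule for each attribute of an object. On this basis, the tolerance granule of each object is constructed through operations on the object-attribute information granules and the rough similarity between objects. Finally, objects with the same tolerance granule are treated as one equivalence class, which yields a clustering of the universe and in turn a hierarchical clustering of the data objects. Experimental results show that the algorithm is feasible.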

12.

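Research on an ant colony spatial clustering method based on fuzzy sets (Total citations: 1, self-citations: 1, citations by others: 0)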

Chen Yingxian 《Computer Engineering and Applications》2011,47(2):5-7

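The average distance between objects is defined and taken as the universe of discourse for object similarity. A membership function maps the similarity between objects to a fuzzy subset of this universe. For a given confidence level λ, the fuzzy set is reduced to a crisp set, which decides whether an ant picks up or drops an object and thereby clusters the spatial data. Using actual mine survey measurements as the spatial data source, the basic ant colony clustering algorithm and the fuzzy ant colony spatial clustering algorithm are each applied. A comparison of the experimental results of the two algorithms shows that the improved algorithm gives better clustering.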
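
Reducing a fuzzy set to a crisp set at confidence level λ is the standard λ-cut: keep every element whose membership degree is at least λ. The sketch below shows only that step. The membership values and names are hypothetical, and the abstract does not state the membership function or how the ant's pick-up and drop decisions use the resulting set.

```python
def lambda_cut(membership, lam):
    """Return the crisp set of elements whose membership degree is at least lam."""
    return {x for x, mu in membership.items() if mu >= lam}

# Hypothetical similarity degrees between one object and its neighbours.
similarity = {"p1": 0.92, "p2": 0.55, "p3": 0.71, "p4": 0.30}
similar_enough = lambda_cut(similarity, 0.7)
```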

13.

Hybrid strategy for selecting compact set of clustering partitions

《Applied Soft Computing》2020

The selection of the most appropriate clustering algorithm is not a straightforward task, given that there is no clustering algorithm capable of determining the actual groups present in any dataset. A potential solution is to use different clustering algorithms to produce a set of partitions (solutions) and then select the best partition produced according to a specified validation measure; these measures are generally biased toward one or more clustering algorithms. Nevertheless, in several real cases, it is important to have more than one solution as the output. To address these problems, we present a hybrid partition selection algorithm, HSS, which accepts as input a set of base partitions potentially generated from clustering algorithms with different biases and aims, to return a reduced and yet diverse set of partitions (solutions). HSS comprises three steps: (i) the application of a multiobjective algorithm to a set of base partitions to generate a Pareto Front (PF) approximation; (ii) the division of the solutions from the PF approximation into a certain number of regions; and (iii) the selection of a solution per region by applying the Adjusted Rand Index. We compare the results of our algorithm with those of another selection strategy, ASA. Furthermore, we test HSS as a post-processing tool for two clustering algorithms based on multiobjective evolutionary computing: MOCK and MOCLE. The experiments revealed the effectiveness of HSS in selecting a reduced number of partitions while maintaining their quality.
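
Step (iii) relies on the Adjusted Rand Index (ARI), which scores the agreement between two partitions of the same objects and corrects for chance. The following is a minimal sketch of the standard ARI formula only; how HSS applies it within each region is not described in the abstract, and the function name is an assumption.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """ARI between two labelings of the same objects (1.0 means identical partitions)."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two labelings of the same length, at least 2 objects")
    # Pairs of objects placed together in both partitions, and in each one alone.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are a single cluster
    return (sum_cells - expected) / (max_index - expected)

same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])     # same grouping, relabeled
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])  # groupings cut across each other
```

The index ignores label names, so two partitions that group the objects identically score 1.0 even when their cluster identifiers differ.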

14.

The query clustering problem: a set partitioning approach

Gopal R.D. Ramesh R. 《Knowledge and Data Engineering, IEEE Transactions on》1995,7(6):885-899

In this research, we address the query clustering problem, which involves determining globally optimal execution strategies for a set of queries. The need to process a set of queries together often arises in deductive database systems, scientific database systems, large bibliographic retrieval systems and several other database applications. We address the optimization problem from the perspective of overlaps in data requirements, and model the batched operations using a set-partitioning approach. In this model, we first consider the case of m queries each involving a two-way join operation. We develop a recursive methodology to determine all the processing strategies in this case. Next, we establish certain dominance properties among the strategies, and develop exact as well as heuristic algorithms for selecting an appropriate strategy. We extend this analysis to a clustering approach, and outline a framework for optimizing multiway joins. The results show that the proposed approach is viable and efficient, and can easily be incorporated into the query processing component of most database systems.

15.

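A multi-attribute grey relational clustering method based on decision-theoretic rough sets

Liu Yong, Wang Dongdong, Zhou Ting 《Control and Decision》2017,32(11):2034-2038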

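To address threshold determination in multi-attribute grey relational clustering, the decision-theoretic rough set method is used: two threshold parameters are introduced to define the possible relations and sets among decision objects. These replace the either-or relation of conventional grey relational clustering, giving a multi-attribute grey relational clustering method based on decision-theoretic rough sets, and Bayesian inference is used to examine how the clustering thresholds are computed. Finally, a case study verifies the effectiveness and rationality of the proposed method. The results show that the method extends and generalizes classical grey relational clustering and can determine the thresholds of multi-attribute grey relational clustering objectively and scientifically.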

16.

Rough set rule extraction using dynamic clustering

Song Yunxue, Yu Hongchao, Shi Yongsheng 《Computer Engineering and Design》2010,31(13)

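For aero-engine fault samples, a rough set rule extraction algorithm based on dynamic clustering is proposed. The model of the algorithm is given, the dynamic clustering method and the generalized Euclidean distance are described, and the algorithm is illustrated with an example. A neural network is trained on the samples to verify that the reduction is correct. The results show that dynamic clustering improves the classification, makes the final core and reducts more precise, and removes the influence of interfering information, raising the fault recognition accuracy while maintaining diagnostic precision.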

17.

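An interval probability stochastic multi-criteria decision-making method based on set pair analysis (Total citations: 1, self-citations: 0, citations by others: 1)

Wang Jianqiang, Gong Lan 《Control and Decision》2009,24(12)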

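Interval probability spaces and interval probability random variables are defined. For multi-criteria decision-making problems in which the criteria weights are known exactly and the criteria values are interval probability random variables, a decision method based on set pair analysis is proposed. The method first determines the probability of each random variable by maximizing deviations, converting the interval probability problem into a classical problem with definite probabilities. It then uses set pair analysis to build a programming model, expresses the interval state values as connection numbers, and ranks the alternatives by the set pair potential ordering criterion. Finally, an example illustrates the effectiveness and feasibility of the method.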

18.

Clustering analysis techniques for search text under a fuzzy strategy

Wan Hongxin, Peng Yun 《Computer Engineering and Applications》2009,45(33):135-137

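Existing search text contains a large amount of uncertain structure and content, which makes conventional clustering algorithms hard to apply; moreover, text search results are not clustered, so the result sets are very large. A fuzzy-set-based clustering analysis method for text search is proposed. Processing heterogeneous data with fuzzy techniques can improve the time and space complexity of the algorithm, reduce the dimensionality of text processing, and improve robustness. A worked example of the algorithm's implementation is given. Comparison with measured data from other clustering algorithms verifies the accuracy and efficiency of the algorithm.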

19.

A rough-set fuzzy clustering analysis method for persona models

Wu Kan 《Computer Engineering and Applications》2013,49(11):31-34

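Persona analysis can meet the need for user model construction in personalized product design systems. A rough-set-based fuzzy clustering method for persona analysis and model construction is proposed. By constructing a rough-set-based fuzzy similarity matrix and determining a fuzzy similarity clustering analysis method for persona attributes, typical user attribute features are extracted from user research data and persona models are built. The method improves the way persona analysis is applied in product design and helps to generate conceptual product design models and schemes quickly.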

20.

A grid clustering measurement partitioning method for extended target tracking

Tang Mengqi, Li Bo, Hao Lijun 《CAAI Transactions on Intelligent Systems》2022,17(4):806-813

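To address the difficulty of partitioning the measurement set and the inaccurate estimation of target number in extended target tracking, a grid clustering method for measurement set partitioning is proposed. First, using the spatio-temporal correlation between targets, the measurements at the current time are divided into measurements from surviving targets and measurements from newborn targets. Then, for the Gaussian mixture probability hypothesis density filter and the extended-target Gaussian mixture probability hypothesis density filter, an improved fuzzy C-means algorithm and an improved grid clustering algorithm are derived to partition the surviving-target and newborn-target measurement sets, respectively. Simulation results show that the method partitions the measurement set accurately, tracks extended targets effectively, and avoids missed and excess detections.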