Similar Literature
 20 similar documents found (search time: 46 ms)
1.
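In speech synthesis research, emotional speech synthesis is a current focus. Among the factors studied, building an appropriate prosody model and selecting good prosodic parameters are the key: how accurately they are described directly affects the output quality of emotional speech synthesis. To address the difficulty of improving the naturalness of emotional speech, the prosodic parameters that affect emotional speech synthesis are analyzed, and a prosodic fundamental-frequency (F0) model for emotional speech based on association rules is built. By studying association rules and improving the Apriori data mining algorithm, the paper derives rules of F0 variation among the prosodic parameters, which provide guidance for unit selection in emotional speech synthesis.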

2.
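A Fast Algorithm for Data Mining Based on Association Rules   Cited by: 11 (self-citations: 1, citations by others: 11)
周剑雄, 王明哲. 《计算机工程》 (Computer Engineering), 2003, 29(12): 48-49, 92
Proposes a data mining scheme based on an improved Apriori algorithm and discusses how several key steps, including generating candidate frequent itemsets and generating strong association rules, can be implemented in standard SQL.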

3.
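Association rule mining discovers knowledge about relationships among itemsets in large volumes of data. This information describes the overall characteristics of the data and predicts its trends, so it is a valuable reference for decision making. The paper introduces the concepts of data mining and association rule mining, then analyzes and improves the classic data mining algorithm Apriori. The improvement effectively reduces the number of database scans and makes mining more efficient.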

4.
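贾桂霞, 张永. 《计算机工程与设计》 (Computer Engineering and Design), 2006, 27(12): 2175-2177, 2186
In data mining, association rule mining and decision rule extraction based on rough set theory are two quite different methods, yet in a statistical sense the rules they produce are essentially the same. Combining the strengths of both, the paper proposes an optimized algorithm based on Apriori that obtains non-redundant decision rules satisfying given support and confidence thresholds, improving the performance of rough-set attribute value reduction algorithms.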

5.
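A Survey of Association Rule Algorithms   Cited by: 1 (self-citations: 0, citations by others: 1)
Introduces the concept of association rules and the mining process, uses the Apriori algorithm as an example to explain the idea behind the algorithm and its optimizations, and describes applications of association rules in everyday social settings.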
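
As a rough illustration of the level-wise idea behind Apriori (join frequent (k-1)-itemsets into k-item candidates, then prune any candidate that has an infrequent subset), here is a minimal Python sketch. It is not taken from the surveyed paper; the function name `apriori`, the basket data, and the 0.6 support threshold are all assumptions chosen for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset (illustrative sketch)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical market baskets, used only for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
frequent_itemsets = apriori(baskets, min_support=0.6)
```

With these baskets, the four single items and the pair {bread, milk} meet the threshold. The sketch rescans the data for every candidate, which is the cost that the optimizations discussed in these abstracts try to reduce.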

6.
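The paper first briefly introduces the concept, basic principles, and classification of association rules. It then discusses the basic principles of the Apriori algorithm in detail and points out some of its shortcomings. Solutions to these shortcomings are proposed and several improved algorithms are listed. Finally, development trends in association rule data mining are outlined.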

7.
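With the progress of science and technology, data resources keep growing, yet the predicament of being information-poor persists. This has driven a search for new data analysis methods and tools that extract useful knowledge from massive data. The efficiency bottleneck of the Apriori algorithm calls for strategies to improve it. The paper briefly analyzes optimization strategies for the Apriori association rule mining algorithm, aiming to provide a useful reference.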

8.
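Research and Application of the Apriori Algorithm for Association Rule Data Mining   Cited by: 2 (self-citations: 0, citations by others: 2)
At present, data mining technology is not widely studied or applied in China. Most databases support only lower-level functions such as data entry, querying, and statistics, and cannot discover the useful information hidden in the data. Association-rule-based data mining is mainly used to discover relationships among items in a data set. Taking supermarket shopping as an example, the goal is to find the inherent associations among the goods that customers buy. The Apriori principle is used to reduce the number of candidates searched when finding frequent itemsets, and, based on an abstraction of the customers' purchase model, a system implemented with VC++ and an Access database simulates and analyzes the associations among the purchased goods. Goods with high confidence are identified from the resulting data; placing these goods together can increase revenue, which demonstrates the practicality of the improved Apriori algorithm.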
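
To make the supermarket example concrete, the sketch below computes support and confidence for rules between pairs of goods. It is only an illustration in Python, not the VC++ and Access system the abstract describes; the function name `pair_rules`, the baskets, and both thresholds are assumptions.

```python
from itertools import combinations


def pair_rules(transactions, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) rules between pairs of items."""
    n = len(transactions)
    item_count = {}
    pair_count = {}
    for basket in transactions:
        for item in basket:
            item_count[item] = item_count.get(item, 0) + 1
        # Sort so that each unordered pair is counted under a single key.
        for pair in combinations(sorted(basket), 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Highest-confidence rules first.
    return sorted(rules, key=lambda rule: -rule[3])


# Hypothetical shopping baskets, used only for illustration.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"milk"},
]
rules = pair_rules(baskets, min_support=0.5, min_confidence=0.7)
```

Here only the bread and butter pair is frequent, and it yields two rules: butter -> bread (confidence 1.0) and bread -> butter (confidence 0.75). Goods linked by such high-confidence rules are the ones the abstract suggests shelving together.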

9.
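Improvement of an Association Rule Mining Algorithm   Cited by: 2 (self-citations: 1, citations by others: 2)
To provide a more accurate and efficient association rule algorithm, the ideas of divide and conquer and of weighting are introduced into the traditional Apriori algorithm. The database is first partitioned into disjoint blocks; subsets of interest to the user are generated from each block according to a requirements analysis; all subsets are merged into the mining target; an ordinary association rule algorithm then generates frequent itemsets; and finally weighted frequent itemsets are generated from those itemsets. The algorithm largely overcomes the shortcomings of traditional Apriori, greatly improving computational efficiency, resolving the "itemset generation bottleneck" as far as possible, and making the generated association rules more sound and accurate.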

10.
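To provide a more accurate and efficient association rule algorithm, the ideas of divide and conquer and of weighting are introduced into the traditional Apriori algorithm. The database is first partitioned into disjoint blocks; subsets of interest to the user are generated from each block according to a requirements analysis; all subsets are merged into the mining target; an ordinary association rule algorithm then generates frequent itemsets; and finally weighted frequent itemsets are generated from those itemsets. The algorithm largely overcomes the shortcomings of traditional Apriori, greatly improving computational efficiency, resolving the "itemset generation bottleneck" as far as possible, and making the generated association rules more sound and accurate.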

11.
Most systems from active research in Host- and Network-based Intrusion Detection (ID) and Intrusion Prevention (IP) can only detect and prevent attacks on computer systems and attacks at the Network Layer. They are not adequate to counter XML-related attacks. Furthermore, although research has been conducted to counter Web application attacks, it is still not adequate in countering SOAP or XML-based attacks. In this paper, a predictive fuzzy association rule model aimed at segregating known attack patterns (such as SQL injection, buffer overflow and SOAP oversized payload) and anomalies is developed. First, inputs are validated using business policies. The validated input is then fed into our fuzzy association rule model (FARM). Consequently, 20 fuzzy association rule patterns matching input attributes with 3 decision outcomes are discovered with at least 99% confidence. These fuzzy association rule patterns will enable the identification of frequently occurring features, useful to the security administrator in prioritizing which feature to focus on in the future, hence addressing the feature selection problem. Data simulated using a Web service e-commerce application are collected and tested on our model. Our model's detection or prediction rate is close to 100% and the false alarm rate is less than 1%. Compared to other classifiers, our model's classification accuracy using random forests achieves the best results with RMSE close to 0.02 and time to build the model within 0.02 s for each data set with sample size of more than 600 instances. Thus, our novel fuzzy association rule model provides a viable added layer of security protection for Web service and Business Intelligence-based applications.

12.
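Chinese prosodic features, when affected by contextual parameters, are strongly influenced by the preceding and following syllables. Multidimensional association rule mining is applied to Chinese prosody to obtain rules of interest to researchers, and a neural network is then used to produce more accurate prosodic parameters, yielding a prosody model for a speech synthesis system. The model was tested in simulation on a software platform, and the experiments show that it meets the requirements of a speech synthesis system.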

13.
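Support-based association rule mining algorithms cannot find itemsets that are infrequent but have high utility, while utility-based association rule mining misses itemsets whose utility is modest but which occur frequently and have a large product of support and utility (the incentive). The paper formulates the problem of incentive-based association rule mining and proposes a bottom-up mining algorithm, HM-miner. Incentive combines the advantages of support and utility and measures both the statistical and the semantic importance of an itemset. HM-miner prunes using an upper-bound property of incentive and can mine high-incentive itemsets effectively.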
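
The abstract defines incentive as the product of support and utility. The Python sketch below computes that product for a few itemsets by brute force. It does not reproduce HM-miner or its upper-bound pruning, which the abstract does not specify. The utility definition used here (quantity times unit profit, summed over the transactions that contain the whole itemset) and all names and data are assumptions made for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    hits = sum(1 for t in transactions if all(item in t for item in itemset))
    return hits / len(transactions)


def utility(itemset, transactions, unit_profit):
    """Assumed utility: profit of the itemset's items in transactions containing it."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def incentive(itemset, transactions, unit_profit):
    """Incentive = support x utility, as described in the abstract."""
    return support(itemset, transactions) * utility(itemset, transactions, unit_profit)


# Hypothetical data: each transaction maps an item to the quantity bought.
unit_profit = {"pen": 1, "paper": 2, "laptop": 500}
transactions = [
    {"pen": 2, "paper": 1},
    {"pen": 1, "paper": 3},
    {"pen": 3},
    {"laptop": 1, "pen": 1},
]

pen_incentive = incentive({"pen"}, transactions, unit_profit)
laptop_incentive = incentive({"laptop"}, transactions, unit_profit)
pen_paper_incentive = incentive({"pen", "paper"}, transactions, unit_profit)
```

In this toy data the pen appears in every transaction but earns little, while the laptop appears once and earns a lot. Multiplying support by utility places both on a single scale, which is the trade-off the incentive measure is meant to capture.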

14.
Existing parallel algorithms for association rule mining have a large inter-site communication cost or require a large amount of space to maintain the local support counts of a large number of candidate sets. This study proposes a de-clustering approach for distributed architectures, which eliminates the inter-site communication cost, for most of the influential association rule mining algorithms. To de-cluster the database into similar partitions, an efficient algorithm is developed to approximate the shortest spanning path (SSP) to link transaction data together. The SSP obtained is then used to evenly de-cluster the transaction data into subgroups. The proposed approach guarantees that all subgroups are similar to each other and to the original group. Experimental results show that data size and the number of items are the only two factors that determine the performance of de-clustering. Additionally, based on the approach, most of the influential association rule mining algorithms can be implemented in a distributed architecture to obtain a drastic increase in speed without losing any frequent itemsets. Furthermore, the data distribution in each de-clustered participant is almost the same as that of a single site, which implies that the proposed approach can be regarded as a sampling method for distributed association rule mining. Finally, the experimental results show that the original inadequate mining results can be improved to an almost perfect level.

15.
This paper introduces a new approach to a problem of data sharing among multiple parties, without disclosing the data between the parties. Our focus is data sharing among parties involved in a data mining task. We study how to share private or confidential data in the following scenario: multiple parties, each having a private data set, want to collaboratively conduct association rule mining without disclosing their private data to each other or any other parties. To tackle this demanding problem, we develop a secure protocol for multiple parties to conduct the desired computation. The solution is distributed, i.e., there is no central, trusted party having access to all the data. Instead, we define a protocol using homomorphic encryption techniques to exchange the data while keeping it private.

16.
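曾安平. 《计算机应用》 (Journal of Computer Applications), 2012, 32(8): 2198-2201
To address the weak correlation and limited variety of rules produced by traditional association rule algorithms, a multi-class association algorithm is proposed that incorporates the Spearman rank correlation coefficient. Starting from the strong rules produced by a traditional algorithm, it uses Spearman rank correlation to compute correlations, such as synchronous and asynchronous ones, between the products in each rule. Using this as an interestingness threshold, the algorithm can simultaneously generate four classes of association rules (synchronous positive, asynchronous positive, synchronous negative, and asynchronous negative), and the rules are closely related. Experimental results show the effectiveness and superiority of the algorithm.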
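
For readers unfamiliar with the Spearman rank correlation coefficient mentioned above, the following Python sketch computes it from scratch as the Pearson correlation of the ranks. How the paper maps the coefficient onto its synchronous and asynchronous rule classes is not stated in the abstract, so this only shows the coefficient itself. The product names and sales figures are hypothetical.

```python
def ranks(values):
    """Assign 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical weekly sales of three products.
tea = [10, 20, 30, 40, 50]
sugar = [1, 3, 2, 5, 4]
coffee = [50, 40, 30, 20, 10]

rho_tea_sugar = spearman(tea, sugar)
rho_tea_coffee = spearman(tea, coffee)
```

In this data, tea and sugar sales mostly rise together (rho = 0.8), while tea and coffee move in exactly opposite order (rho = -1). A sign and magnitude of this kind could serve as the interestingness threshold the abstract refers to.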

17.
This paper considers a problem of finding predictive and useful association rules with a new Web mining algorithm, a streaming association rule (SAR) model. We first adopt a weighted order-dependent scheme (assigning more weights for early visited pages) rather than taking a traditional Boolean scheme (assigning 1 for visited and 0 for non-visited pages). This way, we intend to improve the limited representation of navigation patterns in previous association rule mining (ARM) algorithms. We also note that most traditional association rule models are not scalable because they require multiple scans of all records to re-calibrate a predictive model when there are new updates in original databases. The proposed SAR model takes a “divide-and-conquer” approach and requires only a single scan of data sets to avoid the curse of dimensionality. Through comparative experiments on a real-world data set, we show that prediction models based on a weighted order-dependent representation are more accurate in predicting the next moves of Web navigators than models based on a Boolean representation. In particular, when combined with several heuristics developed to eliminate redundant association rules, SAR models show a very comparable prediction accuracy while maintaining a small fraction of association rules compared to traditional ARM models. Finally, we quantify and graphically show the significance or contribution of each page to forming unique rule sets in each database segment.
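
The contrast between the Boolean and the weighted order-dependent representations can be sketched as follows in Python. The abstract does not give the actual weighting formula, so the linearly decreasing weight used here, along with the function names and the sample session, is purely an assumption for illustration.

```python
def boolean_vector(session, pages):
    """Boolean scheme: 1 for a visited page, 0 for a non-visited page."""
    visited = set(session)
    return [1 if page in visited else 0 for page in pages]


def weighted_vector(session, pages):
    """Order-dependent scheme (assumed formula): earlier visits weigh more."""
    n = len(session)
    weights = {}
    for position, page in enumerate(session):
        # setdefault keeps the weight of the first visit if a page is revisited.
        weights.setdefault(page, (n - position) / n)
    return [weights.get(page, 0.0) for page in pages]


# Hypothetical site pages and one navigation session, in visit order.
pages = ["home", "search", "product", "cart"]
session = ["home", "product", "cart"]

boolean_repr = boolean_vector(session, pages)
weighted_repr = weighted_vector(session, pages)
```

The Boolean vector records only which pages were visited, so two sessions that visit the same pages in a different order look identical. The weighted vector keeps the order: here "home" gets weight 1, "product" 2/3, and "cart" 1/3.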

18.
A multiobjective genetic algorithm (GA) is introduced to identify both the neighborhood and the rule set in the form of a parsimonious Boolean expression for both one- and two-dimensional cellular automata (CA). Simulation results illustrate that the new algorithm performs well even when the patterns are corrupted by static and dynamic noise.

19.
The ensembling of classifiers tends to improve predictive accuracy. To obtain an ensemble with N classifiers, one typically needs to run N learning processes. In this paper we introduce and explore Model Jittering Ensembling, where one single model is perturbed in order to obtain variants that can be used as an ensemble. We use as base classifiers sets of classification association rules. The two methods of jittering ensembling we propose are Iterative Reordering Ensembling (IRE) and Post Bagging (PB). Both methods start by learning one rule set over a single run, and then produce multiple rule sets without relearning. Empirical results on 36 data sets are positive and show that both strategies tend to reduce error with respect to the single model association rule classifier. A bias–variance analysis reveals that while both IRE and PB are able to reduce the variance component of the error, IRE is particularly effective in reducing the bias component. We show that Model Jittering Ensembling can represent a very good speed-up w.r.t. multiple model learning ensembling. We also compare Model Jittering with various state of the art classifiers in terms of predictive accuracy and computational efficiency.

20.
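Association rules are a common form of knowledge representation. The paper introduces the extraction patterns for association rules and the shortcomings of extraction patterns based on the PS framework, and presents definitions of association rule interestingness, including objective, subjective, and combined interestingness.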
