Found 20 similar documents (search time: 0 ms)
1.
Wolff R. Schuster A. 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2004,34(6):2426-2438
We extend the problem of association rule mining, a key data mining problem, to systems in which the database is partitioned among a very large number of computers that are dispersed over a wide area. Such computing systems include grid computing platforms, federated database systems, and peer-to-peer computing environments. The scale of these systems poses several difficulties, such as the impracticality of global communications and global synchronization, dynamic topology changes of the network, on-the-fly data updates, the need to share resources with other applications, and the frequent failure and recovery of resources. We present an algorithm by which every node in the system can reach the exact solution, as if it were given the combined database. The algorithm is entirely asynchronous, imposes very little communication overhead, transparently tolerates network topology changes and node failures, and quickly adjusts to changes in the data as they occur. Simulations of up to 10,000 nodes show that the algorithm is local: all rules, except for those whose confidence is about equal to the confidence threshold, are discovered using information gathered from a very small vicinity, whose size is independent of the size of the system.
2.
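Log mining provides a data basis for operating WAP value-added services and adjusting their strategy. This paper introduces log preprocessing for WAP value-added services, brings the concept of association rules into the mining of WAP value-added service logs, and analyzes the classic Apriori data mining algorithm. The algorithm is improved in two respects: a pruning technique generates the candidate 2-itemsets from the frequent 1-itemsets, which eliminates a large number of candidate 2-itemsets; and scanning memory replaces scanning the database, which greatly reduces scan time. Experiments show that these two improvements can quickly complete the mining of associations among content items in WAP value-added services.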
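The two Apriori improvements summarized in entry 2 (building candidate 2-itemsets only from frequent 1-itemsets, and counting over an in-memory copy of the log instead of rescanning the database) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `frequent_pairs`, the absolute-count `min_support` threshold, and the sample log items are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return {pair: count} for 2-itemsets whose count meets min_support.

    `transactions` is an in-memory list of item lists (hypothetical log
    sessions); `min_support` is an absolute count, chosen for simplicity.
    """
    # Pass 1: count single items over the in-memory transaction list.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Pruning: candidate pairs are built only from frequent single items,
    # so pairs containing an infrequent item are never generated or counted.
    pair_counts = Counter()
    for t in transactions:
        kept = sorted(set(t) & frequent_items)
        pair_counts.update(combinations(kept, 2))

    return {p: c for p, c in pair_counts.items() if c >= min_support}


# Hypothetical WAP content log: each inner list is one session.
logs = [
    ["ringtone", "wallpaper", "game"],
    ["ringtone", "wallpaper"],
    ["ringtone", "game"],
    ["wallpaper", "news"],
]
result = frequent_pairs(logs, min_support=2)
print(result)
```

Because "news" appears only once, no pair containing it is ever generated, which is the saving the pruning step is meant to provide.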
3.
《微型机与应用》2017,(19):16-18
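Instantaneous fetal heart rate is an important means of monitoring fetal health. Monitoring the fetal heart rate is currently an important and complex task, and correct automated classification and rule extraction are essential. Automated medical diagnosis systems not only strengthen health care but can also reduce costs. An effective rule-mining system is designed that predicts the fetal risk level from given parameters. C4.5, Classification and Regression Tree (CART), and random forest classifiers are used for a systematic comparison. System performance is evaluated by classification accuracy and the number of rules generated. Experimental results show that the system based on the random forest classifier has the potential to predict fetal health status with high accuracy (99.4%), while the rules it generates are few in number and usable by physicians for decision-making.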
4.
This paper examines the existence of gender differences in computer mediated (CM) negotiations where “gender differences” refers to the differential patterns of behavior of males and females proposed by Rubin and Brown (Rubin, J. Z., & Brown, B. R. (1975). Bargainers as individuals. In The social psychology of bargaining and negotiation (pp. 157–196). New York: Academic Press). Namely, males are more profit oriented and females are more relationship oriented. External manipulations encouraging cooperativeness with other negotiators either by profitable or social incentives were inserted in the negotiations performed within the Colored Trails (CT) game framework. The negotiators included 27 females and 33 males who negotiated in foursomes via computers. In the first study we focused on independent negotiators whose success was not crucially dependent on the other party. In the second study negotiators were dependent upon one another, encouraging integrative solutions. The findings reveal that the social incentive (team factor) positively affected the females’ cooperativeness in contrast to males who were slightly less cooperative. On the other hand, profitable incentive influenced the males’ cooperativeness level, while no change was shown by females, which is consistent with Rubin and Brown’s distinction. These tendencies were reduced when playing with a non-reciprocal simulated agent. The causes for gender differences in CM as well as in face-to-face (FTF) negotiations are discussed.
5.
In the field of data mining, an important issue for association rules generation is frequent itemset discovery, which is the key factor in implementing association rule mining. Therefore, this study considers the user’s assigned constraints in the mining process. Constraint-based mining enables users to concentrate on mining itemsets that are interesting to themselves, which improves the efficiency of mining tasks. In addition, in the real world, users may prefer recording more than one attribute and setting multi-dimensional constraints. Thus, this study intends to solve the multi-dimensional constraints problem for association rules generation. The ant colony system (ACS) is one of the newest meta-heuristics for combinatorial optimization problems, and this study uses the ant colony system to mine a large database to find the association rules effectively. If this system can consider multi-dimensional constraints, the association rules will be generated more effectively. Therefore, this study proposes a novel approach of applying the ant colony system for extracting the association rules from the database. In addition, the multi-dimensional constraints are taken into account. The results using a real case, the National Health Insurance Research Database, show that the proposed method is able to provide more condensed rules than the Apriori method. The computational time is also reduced.
6.
《Ergonomics》2012,55(9):1021-1028
This study investigated the external power output in kg m s⁻¹ and vertical velocity in m s⁻¹ attained by 24 female and 24 male subjects during the following stair run tests: 2 m run-up negotiating two steps at a time, 6 m run-up negotiating three steps at a time, 2 m run-up negotiating three steps at a time and 6 m run-up negotiating two steps at a time. The steps were approximately 16.5 cm in height. Two timers were connected to photoelectric beam circuits and switchmats which were placed on the 8th and 12th steps when the subjects ran up the steps two at a time and on the 3rd and 9th steps when they ran up the steps three at a time. The photoelectric beam circuit and switchmat data were analysed separately for each sex by a 3 (high, medium and low leg length) × 2 (2 and 6 m run-up) × 2 (2 and 3 steps at a time) ANOVA repeated measures factorial design. In each of the eight analyses, significantly (p < 0.05) greater scores were attained with a 6 as compared to a 2 m run-up and with negotiating three as compared to two steps at a time. These main effects must be interpreted in conjunction with a significant run-up × steps interaction which indicated that the length of the run-up had no significant effect when two steps were negotiated at a time. However, increasing the length of the run-up resulted in a significant increase when three steps were negotiated at a time. There was a significant main effect for leg length with the external power output in kg m s⁻¹ for the males. Those subjects in the high group scored significantly greater than those in the low group. A similar, though non-significant trend (p < 0.07), was observed with females. Photoelectric beam circuits yielded significantly higher scores than switchmats. Of the four stair run protocols investigated in this study, the highest scores occurred with a 6 m run-up, negotiating three 16.5 cm steps at a time and placing photoelectric beam circuits connected to a timer on the 3rd and 9th steps.
7.
Work-related neck disorders are common among various occupational groups. Despite clear epidemiological evidence for the association of these disorders with forceful arm exertions, the effect of such exertions on the biomechanical behavior of the neck muscles is currently not well understood. In this study, the effect of lifting tasks on the biomechanical loading of neck muscles was investigated for males and females. Twenty-six participants (13 males and 13 females) performed bi-manual isometric lifting tasks at knuckle, elbow, shoulder, and overhead heights by exerting 25%, 50%, and 75% of their maximum strength. The activity of the cervical trapezius and sternocleidomastoid muscles was recorded bilaterally using surface electromyography. Higher activity of the cervical trapezius muscle (10% MVC–43% MVC) compared to the sternocleidomastoid muscle (4% MVC–18% MVC) was observed. Females tended to use the sternocleidomastoid muscle to a greater extent than males, whereas higher cervical trapezius muscle activation was observed for males than for females. The main effects of weight and height, and the weight by height interaction, on the activity of neck muscles were statistically significant (all p < 0.001). The results of this study demonstrate that the neck muscles play an active role during lifting activities and may influence development of musculoskeletal disorders due to resulting physiological changes.
8.
《Ergonomics》2012,55(10):1226-1239
This paper combines epidemiological data on musculoskeletal morbidity in 40 female and 15 male occupational groups (questionnaire data 3720 females, 1241 males, physical examination data 1762 females, 915 males) in order to calculate risk for neck and upper limb disorders in repetitive/constrained vs. varied/mobile work and further to compare prevalence among office, industrial and non-office/non-industrial settings, as well as among jobs within these. Further, the paper aims to compare the risk of musculoskeletal disorders from repetitive/constrained work between females and males. Prevalence ratios (PR) for repetitive/constrained vs. varied/mobile work were in neck/shoulders: 12-month complaints females 1.2, males 1.1, diagnoses at the physical examination 2.3 and 2.3. In elbows/hands PRs for complaints were 1.7 and 1.6, for diagnoses 3.0 and 3.4. Tension neck syndrome, cervicalgia, shoulder tendonitis, acromioclavicular syndrome, medial epicondylitis and carpal tunnel syndrome showed PRs > 2. In neck/shoulders PRs were similar across office, industrial and non-office/non-industrial settings, in elbows/hands, especially among males, somewhat higher in industrial work. There was a heterogeneity within the different settings (estimated by bootstrapping), indicating higher PRs for some groups. As in most studies, musculoskeletal disorders were more prevalent among females than among males. Interestingly, though, the PRs for repetitive/constrained work vs. varied/mobile were for most measures approximately the same for both genders. In conclusion, repetitive/constrained work showed elevated risks when compared to varied/mobile work in all settings. Females and males showed similar risk elevations. This article enables comparison of risk of musculoskeletal disorders among many different occupations in industrial, office and other settings, when using standardised case definitions. 
It confirms that repetitive/constrained work is harmful not only in industrial but also in office and non-office/non-industrial settings. The reported data can be used for comparison with future studies.
9.
10.
11.
Wei Ding Christoph F. Eick Xiaojing Yuan Jing Wang Jean-Philippe Nicot 《GeoInformatica》2011,15(1):1-28
The motivation for regional association rule mining and scoping is driven by the facts that global statistics seldom provide
useful insight and that most relationships in spatial datasets are geographically regional, rather than global. Furthermore,
when using traditional association rule mining, regional patterns frequently fail to be discovered due to insufficient global
confidence and/or support. In this paper, we systematically study this problem and address the unique challenges of regional
association mining and scoping: (1) region discovery: how to identify interesting regions from which novel and useful regional
association rules can be extracted; (2) regional association rule scoping: how to determine the scope of regional association
rules. We investigate the duality between regional association rules and regions where the associations are valid: interesting
regions are identified to seek novel regional patterns, and a regional pattern has a scope of a set of regions in which the
pattern is valid. In particular, we present a reward-based region discovery framework that employs a divisive grid-based supervised
clustering for region discovery. We evaluate our approach in a real-world case study to identify spatial risk patterns from
arsenic in the Texas water supply. Our experimental results confirm and validate research results in the study of arsenic
contamination, and our work leads to the discovery of novel findings to be further explored by domain scientists.
12.
An architecture for making recommendations to courseware authors using association rule mining and collaborative filtering (total citations: 1; self-citations: 0; citations by others: 1)
Enrique García Cristóbal Romero Sebastián Ventura Carlos de Castro 《User Modeling and User-Adapted Interaction》2009,19(1-2):99-132
Nowadays we find more and more applications for data mining techniques in e-learning and web-based adaptive educational systems. The useful information discovered can be used directly by the teacher or author of the course in order to improve instructional/learning performance. This can, however, imply a lot of work for the teacher who can greatly benefit from the help of educational recommender systems for doing this task. In this paper we propose a system oriented to find, share and suggest the most appropriate modifications to improve the effectiveness of the course. We describe an iterative methodology to develop and carry out the maintenance of web-based courses to which we have added a specific data mining step. We apply association rule mining to discover interesting information through students’ usage data in the form of IF-THEN recommendation rules. We have also used a collaborative recommender system to share and score the recommendation rules obtained by teachers with similar profiles along with other experts in education. Finally, we have carried out experiments with several real groups of students using a web-based adaptive course. The results obtained demonstrate that the proposed architecture constitutes a good starting point to future investigations in order to generalize the results over many course contents.
13.
Empirical Bayes and Fully Bayes procedures to detect high-risk areas in disease mapping (total citations: 1; self-citations: 0; citations by others: 1)
Disease mapping studies have experienced an enormous development in the last twenty years. Both an Empirical Bayes (EB) and a Fully Bayes (FB) approach have been used for smoothing purposes. However, an excess of smoothing might hinder the detection of true high-risk areas. Identifying these extreme regions minimizing the misclassification of background or normal areas, and then, avoiding false alarms is crucial in epidemiology. Bayesian decision rules, based on the posterior distribution of the relative risks, have been investigated for this task, but no similar studies have been conducted under the EB approach. Within this framework, second order correct estimators of the MSE of the log-relative risk predictor can be used to build appropriate confidence intervals for the relative risks. Their ability to detect high-risk areas is investigated through a simulation study using the geographical structure of the well-known Scottish lip cancer data. Bayesian credibility intervals and decision rules, based on the posterior distribution of the relative risks, are also investigated to check if any of the approaches outperforms the others when classifying high-risk regions. The conclusion is that Bayesian decision rules, exploiting the posterior distribution of the relative risks, are more powerful to detect high-risk areas than EB confidence intervals, but no general rules can be defined as a global criterion to be routinely applied in every real setting.
14.
Linux malware can pose a significant threat—its (Linux) penetration is exponentially increasing—because little is known or
understood about Linux OS vulnerabilities. We believe that now is the right time to devise non-signature based zero-day (previously
unknown) malware detection strategies before Linux intruders take us by surprise. Therefore, in this paper, we first do a
forensic analysis of Linux executable and linkable format (ELF) files. Our forensic analysis provides insight into different
features that have the potential to discriminate malicious executables from benign ones. As a result, we can select a feature
set of 383 features that are extracted from ELF headers. We quantify the classification potential of features using information
gain and then remove redundant features by employing preprocessing filters. Finally, we do an extensive evaluation among classical
rule-based machine learning classifiers—RIPPER, PART, C4.5 Rules, and decision tree J48—and bio-inspired classifiers—cAnt
Miner, UCS, XCS, and GAssist—to select the best classifier for our system. We have evaluated our approach on an available
collection of 709 Linux malware samples from VX Heavens and Offensive Computing. Our experiments show that ELF-Miner provides more than 99% detection accuracy with a false alarm rate of less than 0.1%.
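Entry 14 ranks features by information gain before filtering redundant ones. The quantity itself can be sketched in a few lines, assuming discrete feature values; numeric header fields would need to be discretized first, a step not shown here. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data: four files, two classes (illustrative only).
labels = ["malware", "malware", "benign", "benign"]
separating_feature = [1, 1, 0, 0]   # splits the classes perfectly
useless_feature = [1, 0, 1, 0]      # carries no class information

gain_separating = information_gain(separating_feature, labels)
gain_useless = information_gain(useless_feature, labels)
print(gain_separating, gain_useless)
```

A feature that separates the two classes perfectly recovers the full one bit of label entropy, while a feature that is independent of the class yields a gain of zero; ranking by this score is what lets low-value features be discarded.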
15.
16.
Samet Çokpınar Taflan İmre Gündem 《Expert systems with applications》2012,39(8):7503-7511
In recent years, data mining has become one of the most popular techniques for data owners to determine their strategies. Association rule mining is a data mining approach that is used widely in traditional databases, usually to find positive association rules. However, there are other challenging rule mining topics, such as data stream mining and negative association rule mining. In addition, organizations want to concentrate on their own business and outsource the rest of their work. This approach is named the “database as a service” concept; it provides many benefits to the data owner but, at the same time, raises some security problems. In this paper, a rule mining system is proposed that provides an efficient and secure solution to positive and negative association rule computation on XML data streams in the database as a service concept. The system is implemented, and several experiments have been done with different synthetic data sets to show the performance and efficiency of the proposed system.
17.
18.
While data mining is well established in practice, opinion mining is still in its infancy, with issues in particular around the development of methodologies which effectively extract accurate, reliable, influential and useful information from the raw opinion data collected from informal product reviews. Current approaches adopt a single-variable approach, focusing on individual metrics (word length, the presence of keywords, or the overall semantic orientation of terms within the data) while neglecting to evaluate whether these individual artifacts are indicative of the tone of a given review. This approach has significant limitations when we move from trying to merely evaluate whether an online opinion is positive or negative, to trying to evaluate how likely it is that the opinion will influence others. Given this issue, one promising avenue would be to evaluate the general analysis approaches utilized by opinion mining algorithms and identified in the literature in terms of how accurately they reflect how people actually interpret and are influenced by electronic online reviews. Through interviewing and a follow up survey of 136 participants, the validity of the approach in terms of ascertaining the tone of a piece of text can be evaluated, as well as the identification of measurable factors within text which make a given opinionated text more or less influential in an online context, further facilitating the development of more effective multivariate opinion mining approaches. Furthermore, the identification of factors which make an online opinion text more or less persuasive helps to facilitate the development of opinion mining approaches which can evaluate how likely a review is to affect an individual’s decision making.
19.
《微型机与应用》2017,(3):79-81
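Based on text topic models and eye-tracking technology, this paper studies text content extraction from the objective perspective of topic mining and the subjective perspective of reading interest. Traditional text mining mostly relies on objective factors such as the content of the text itself, whereas subjective orientation, an important factor, rarely plays a role. Using eye tracking, the paper first converts eye-movement data into subjective results such as reading interest and uses the LDA (Latent Dirichlet Allocation) model to extract objective topics from the text; it then compares the eye-movement data with the topic modeling results to extract and analyze how subjective and objective factors affect text mining. Eye-tracking and topic-extraction experiments on a news dataset show the specific differences and similarities in how subjective and objective factors influence the results; combining the two and tuning their ratio could serve as a basic direction for improving text mining in the future.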
20.
Daytime running lamps (DRL) on vehicles have proven to be an effective measure to prevent accidents during the daytime, particularly when pedestrians and cyclists are involved. However, there are negative interactions of DRL with other functions in automotive lighting, such as delays in pedestrians’ visual reaction time (VRT) when turn indicators are activated in the presence of DRL. These negative interactions need to be reduced. This work analyses the influence of variables inherent to pedestrians, such as height, gender and visual defects, on the VRT using a classification and regression tree as an exploratory analysis and a generalized linear model to validate the results. Some pedestrian characteristics, such as gender, alone or combined with the DRL colour, and visual defects, were found to have a statistically significant influence on VRT and, hence, on traffic safety. These results and conclusions concerning the interaction between pedestrians and vehicles are presented and discussed.
Practitioner Summary: Visual interactions of vehicle daytime running lamps (DRL) with other functions in automotive lighting, such as turn indicators, have an important impact on a vehicle’s conspicuity for pedestrians. Depending on several factors inherent to pedestrians, the visual reaction time (VRT) can be remarkably delayed, which has implications in traffic safety.