Similar literature
1.
Data collection is a necessary step in the data mining process. Privacy concerns make it difficult to collect data from different parties, because they may prevent the parties from directly sharing the data or certain types of information about it. How multiple parties can collaboratively conduct data mining without breaching data privacy is therefore a challenge. The objective of this paper is to provide solutions for privacy-preserving collaborative data mining problems. In particular, we illustrate how to conduct privacy-preserving naive Bayesian classification, which is one of the data mining tasks. To measure the privacy level of privacy-preserving schemes, we propose a definition of privacy and show that our solutions preserve data privacy.

2.
The ability to exploit students’ sentiments using different machine learning techniques is considered an important strategy for planning and manoeuvring in a collaborative educational environment. The advancement of machine learning technology is energised by the healthy growth of big data technologies. This helps applications based on Sentiment Mining (SM) using big data to become a common platform for data mining activities. However, sentiment applications that use the large amount of available educational data have received very little study. Therefore, this paper attempts to mine academic data using different efficient machine learning algorithms. The contribution of this paper is two-fold: (i) studying the sentiment polarity (positive, negative and neutral) of students’ data using machine learning techniques, and (ii) modelling and predicting students’ emotions (Amused, Anxiety, Bored, Confused, Enthused, Excited, Frustrated, etc.) using big data frameworks. The developed SM techniques using big data frameworks can be scaled and made adaptable for source variation, velocity and veracity to maximise value mining for the benefit of students, faculties and other stakeholders.

3.
The purpose of this study was to explore sequences of social regulatory processes during a computer-supported collaborative learning task and their relationship to group performance. Analogous to self-regulation during individual learning, we conceptualized social regulation both as individual and as collaborative activities of analyzing, planning, monitoring and evaluating cognitive and motivational aspects during collaborative learning. We analyzed the data of 42 participants working together in dyads. They had 90 min to develop a common handout on a statistical topic while communicating only via chat and a shared editor. The log files of the chat and the editor were coded with regard to activities of social regulation. Results show that participants in dyads with higher group performance (N = 20) did not differ from participants in dyads with lower group performance (N = 22) in the frequencies of regulatory activities. In an exploratory way, we used process mining to identify process patterns for high versus low group performance dyads. The resulting models show clear parallels between high and low achieving dyads in a double loop of working on the task, monitoring, and coordinating. Moreover, there are no major differences in the process of high versus low achieving dyads. Both results are discussed with regard to theoretical and empirical issues. Furthermore, the method of process mining is discussed.

4.
In this paper, we use data mining techniques for the automatic discovery of useful temporal abstractions in reinforcement learning. The idea is motivated by the ability of data mining algorithms to discover structures and patterns automatically when applied to large data sets. The state transitions and action trajectories of the learning agent are stored as the data sets for the data mining techniques. The proposed state clustering algorithms partition the state space into different regions. Policies for reaching different parts of the space are learned separately and added to the model in the form of options (macro-actions). The main idea of the proposed action sequence mining is to search for patterns that occur frequently within an agent’s accumulated experience. The mined action sequences are also added to the model in the form of options. Our experiments with different data sets indicate a significant speedup of the Q-learning algorithm when it uses the options discovered by the state clustering and action sequence mining algorithms.
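
The action sequence mining idea in the abstract above can be illustrated with a minimal sketch. It assumes that trajectories are plain lists of action labels, that only contiguous subsequences of one fixed length are considered, and that a sequence is counted at most once per trajectory. The function name and parameters are hypothetical, and the paper's actual mining algorithm is not specified here.

```python
from collections import Counter


def frequent_action_sequences(trajectories, length, min_count):
    """Return contiguous action sequences of the given length that
    appear in at least min_count trajectories (illustrative only)."""
    counts = Counter()
    for traj in trajectories:
        # Collect the distinct windows of this trajectory so that each
        # sequence is counted once per trajectory, not once per occurrence.
        seen = set()
        for i in range(len(traj) - length + 1):
            seen.add(tuple(traj[i:i + length]))
        counts.update(seen)
    return {seq: c for seq, c in counts.items() if c >= min_count}


# Hypothetical trajectories: U/D/L/R stand for grid-world moves.
trajectories = [list("UURRD"), list("LURRD"), list("URRUU")]
candidates = frequent_action_sequences(trajectories, length=3, min_count=2)
```

Each returned sequence is a candidate macro-action; turning it into an option would still require an initiation set and a termination condition, which this sketch leaves out.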

5.
The paper presents some contemporary approaches to spatial environmental data analysis. The main topics concentrate on the decision-oriented problems of environmental spatial data mining and modeling: valorization and representativity of data with the help of exploratory data analysis, spatial predictions, probabilistic and risk mapping, and the development and application of conditional stochastic simulation models. The innovative part of the paper presents an integrated/hybrid model: machine learning (ML) residuals sequential simulations (MLRSS). The models are based on multilayer perceptron and support vector regression ML algorithms, which are used for modeling long-range spatial trends, and on sequential simulations of the residuals. ML algorithms deliver non-linear solutions for spatially non-stationary problems, which are difficult for the geostatistical approach. Geostatistical tools (variography) are used to characterize the performance of the ML algorithms by analyzing the quality and quantity of the spatially structured information that the ML algorithms extract from the data. Sequential simulations provide an efficient assessment of uncertainty and spatial variability. A case study from the Chernobyl fallout illustrates the performance of the proposed model. It is shown that probability mapping, provided by the combination of data-driven ML and model-based geostatistical approaches, can be used efficiently in the decision-making process.

6.
Recommendation systems have been investigated and implemented in many ways. In a collaborative filtering system in particular, the most important issue is how to manipulate the personalized recommendation results for better user understandability and satisfaction. A collaborative filtering system predicts items of interest for users based on predictive relationships discovered between each item and others. This paper proposes a categorization for grouping associative items discovered by mining, with the aim of improving the accuracy and performance of item-based collaborative filtering. If an associative item is required to be simultaneously associated with all other groups in which it occurs, the proposed method can collect associative items into relevant groups. In addition, the proposed method can improve predictive performance under sparse data and under cold-start conditions, where collaborative filtering starts from a small number of items. It can also increase prediction accuracy and scalability because it removes the noise generated by ratings on items of dissimilar content or level of interest. The approach is empirically evaluated by comparison with k-means, average link, and robust, using the MovieLens dataset. The method was found to outperform existing methods significantly.
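
For readers unfamiliar with the baseline that the abstract above builds on, a minimal sketch of plain item-based collaborative filtering follows. It assumes ratings are stored as a dict of `{item: {user: rating}}`, uses cosine similarity over co-rating users, and predicts with a similarity-weighted mean. The names `cosine` and `predict` are hypothetical, and the grouping of associative items proposed in the paper is not reproduced.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two items, each a dict {user: rating},
    computed over the users who rated both."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(a[u] ** 2 for u in common))
    norm_b = sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(ratings, user, target):
    """Predict the user's rating of target as the similarity-weighted
    mean of the user's ratings of other items; None if no evidence."""
    num = den = 0.0
    for item, by_user in ratings.items():
        if item == target or user not in by_user:
            continue
        sim = cosine(ratings[target], by_user)
        if sim > 0:
            num += sim * by_user[user]
            den += sim
    return num / den if den else None


# Hypothetical toy data: three items, three users.
ratings = {
    "A": {"u1": 5, "u2": 3},
    "B": {"u1": 5, "u2": 3, "u3": 4},
    "C": {"u1": 1, "u2": 5, "u3": 2},
}
estimate = predict(ratings, "u3", "A")
```

With sparse data most item pairs share few users, which is why the similarity estimates become noisy; that is the situation the abstract's grouping step is meant to address.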

7.
This paper focuses on modeling collaborative interaction in a Ubiquitous Learning Environment (ULE), based on the assumption that collaborative interaction can be perceived through interpersonal interactions, which can be described as local dynamic behaviors of the team. The collaborative interaction is collected from an experiment with 50 students in teams of 5 members. The interaction is then coded with 16 participation shifts (P-shifts) from 5 different types of turns (turn receiving, turn claiming, turn usurping, turn continuing, and turn noreturning) to represent the participation status of each member. The three types of participation status used in this paper are the contributor, the target and the unaddressed recipient. The discovered local dynamic behavior is then used to construct a model by agent-based modeling. The model consists of student agents working together according to the discovered behavior. The constructed model is verified by comparing the actual behavior with the simulated behavior. Finally, the comparison shows that the constructed model can reasonably serve as a model of collaborative interaction in ULE.

8.
Models represent a set of generic patterns to test hypotheses. This paper presents the CogMoLab student model in the context of an integrated learning environment. Three aspects are discussed: diagnostic modeling and predictive modeling, both with respect to the issues of credit assignment and scalability, and compositional modeling of the student profile in the context of an intelligent tutoring system/adaptive hypermedia learning system architectural pattern. The SOM–PCA, a collaborative-based data mining approach, is shown to be reusable for all three purposes above, enabling fast, objective implementations without requiring much intensive data collection.

9.
The technology-driven economy has inevitably increased the importance of intellectual property protection through patents. Recent global pro-patenting shifts have further resulted in high technology overlaps. Technology components are now spread across a huge corpus of patent documents, making their interpretation a knowledge-intensive engineering activity. Intelligent collaborative patent mining facilitates the integration of inputs from patented technology components held by diverse stakeholders. Topic generative models are powerful natural language tools used to decompose the topics of a data corpus and the associated word bag distributions. This research develops and validates a superior text mining methodology, called Excessive Topic Generation (ETG), as a preprocessing framework for topic analysis and visualization. The ETG methodology adapts the topic generation characteristics of Latent Dirichlet Allocation (LDA) and adds the capability to generate word distance relationships among key terms. The novel ETG approach is used as the core process for intelligent collaborative patent mining. In a case study, 741 global Industrial Immersive Technology (IIT) patents covering inventive and novel concepts of Virtual Reality (VR), Augmented Reality (AR), and Brain Machine Interface (BMI) are systematically processed and analyzed using the proposed methodology. Based on the discovered topics of the IIT patents, patent classification (IPC/CPC) predictions are analyzed to validate the superior ETG results.

10.
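Chen Lin, Deng Wanyu, Wang Xin. Computer Engineering and Design (计算机工程与设计), 2011, 32(4): 1430-1433, 1437
Collaborative filtering is an effective personalized recommendation technique. As the numbers of users and resources grow, however, the high dimensionality and sparsity of the data seriously degrade recommendation quality and slow down computation. To address this problem, a collaborative filtering method based on an extreme-speed neural network (极速神经网络) is studied and implemented. Principal component analysis is used to handle the high dimensionality and sparsity of the data, and the extreme-speed neural network is used to address the slow computation. Experimental results show that the method has good generalization performance and learning speed, and that it meets the needs of personalized resource recommendation well.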

11.
Universal Access in the Information Society - Student modeling approaches are important to identify students’ needs, learning styles, and to monitor their improvements for individual modules....

12.
13.
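With the rapid modernization of China, the volume of data keeps growing, and how to process these data has drawn increasing attention. Because big data is both voluminous and varied in type, data mining is becoming ever more important. Self-learning can analyze data and find the relevant patterns in them, and it is therefore widely used in the commercial domain. This paper studies self-learning algorithms in data mining, examines their characteristics, and analyzes the fields in which they can be applied in practice.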

14.
Some methods from statistical machine learning and from robust statistics have two drawbacks. Firstly, they are so computer-intensive that they can hardly be used for massive data sets, say with millions of data points. Secondly, robust and non-parametric confidence intervals for the predictions of the fitted models are often unknown. A simple but general method is proposed to overcome these problems in the context of huge data sets. An implementation of the method scales to the memory of the computer and can be distributed over several processors to reduce the computation time. The method offers distribution-free confidence intervals for the median of the predictions. The main focus is on general support vector machines (SVM) based on minimizing regularized risks. As an example, a combination of two methods from modern statistical machine learning, i.e. kernel logistic regression and ε-support vector regression, is used to model a data set from several insurance companies. The approach can also be helpful for fitting robust estimators in parametric models for huge data sets.
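
The abstract above does not spell out how its distribution-free interval is constructed. One standard construction, the order-statistic confidence interval for a median, can be sketched as follows. The sketch assumes the inputs (for example, predictions obtained from models fitted on disjoint subsets of the data) are independent draws from a continuous distribution. The function name `median_ci` is hypothetical, and the code is not claimed to be the paper's procedure.

```python
from math import comb


def median_ci(sample, alpha=0.05):
    """Distribution-free confidence interval for the median.

    Returns (x_(k), x_(n-k+1)), where k is the largest rank such that
    P(Bin(n, 1/2) <= k - 1) <= alpha / 2, so the coverage
    1 - 2 * P(Bin(n, 1/2) <= k - 1) is at least 1 - alpha.
    """
    xs = sorted(sample)
    n = len(xs)
    k = 0
    cdf = 0.0
    for j in range(n // 2):
        cdf += comb(n, j) / 2 ** n  # cdf is now P(Bin(n, 1/2) <= j)
        if cdf > alpha / 2:
            break
        k = j + 1
    if k == 0:
        raise ValueError("sample too small for the requested confidence level")
    # Convert the 1-based ranks k and n - k + 1 to 0-based indices.
    return xs[k - 1], xs[n - k]


# Ten hypothetical values: the interval runs from the 2nd to the 9th
# smallest value.
low, high = median_ci(range(1, 11))
```

The interval depends only on ranks, which is what makes it valid without distributional assumptions beyond continuity and independence.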

15.
Software and Systems Modeling - The need for real-time collaborative solutions in model-driven engineering has been increasing over the past years. Conflict-free replicated data types (CRDT)...

16.
The human visual sense has a unique status, offering a very broadband channel for information flow. Visual approaches to analysis and mining attempt to take advantage of our ability to perceive pattern and structure in visual form and to make sense of, or interpret, what we see. Visual data mining techniques have proven to be of high value in exploratory data analysis, and they also have high potential for mining large databases. In this work, we investigate and expand the area of visual data mining by proposing new techniques for the visualization of mining outcomes.

17.
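Wang Xin, Liu Fang'ai. Journal of Computer Applications (计算机应用), 2016, 36(7): 1988-1992
Existing algorithms for mining collaborative frequent itemsets over multiple data streams suffer from high memory usage and low efficiency in finding frequent itemsets. To address these problems, an improved algorithm for collaborative frequent itemset mining over multiple data streams, MCMD-Stream, is proposed. First, the algorithm uses a byte-sequence sliding-window mining algorithm that scans the database only once to find the potential frequent itemsets and the frequent itemsets in the data streams. Second, it builds a compressed frequent pattern tree (CP-Tree), similar to the frequent pattern tree (FP-Tree), to store the potential frequent itemsets and frequent itemsets found so far, and it updates the frequent item counts in the logarithmic tilted-time table generated for each node of the CP-Tree. Finally, through summary analysis it obtains the valuable frequent itemsets that occur repeatedly across multiple data streams, that is, the collaborative frequent itemsets. Compared with the A-Stream and H-Stream algorithms, MCMD-Stream not only improves the efficiency of collaborative frequent itemset mining over multiple data streams but also reduces memory usage. Experimental results show that MCMD-Stream can be applied effectively to collaborative frequent itemset mining over multiple data streams.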

18.
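Drawing on the application of Web data mining in E-learning platforms, this paper analyzes the basic process and key techniques of Web data mining, proposes a model of a personalized learning platform based on Web mining, and describes the application of Web mining in the platform and the implementation of its personalized search engine.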

19.
Implement web learning environment based on data mining   Cited by: 2 (self-citations: 0, citations by others: 2)
Qinglin Guo, Ming Zhang. Knowledge, 2009, 22(6): 439-442
Providing learners with web-based learning content that matches their accessibility needs and preferences, and matching learning content to users’ devices, have been identified as important issues in accessible educational environments. In a web-based, open and dynamic learning environment, personalized support for learners becomes even more important. To achieve optimal efficiency in a learning process, the individual learner’s cognitive learning style should be taken into account. Because different types of learners use these systems, it is necessary to provide them with an individualized learning support system. However, the design and development of web-based learning environments for people with special abilities has so far been addressed by developing hypermedia and multimedia based on educational content. In this paper, a framework for an individual web-based learning system is presented that focuses on the learner’s cognitive learning process, learning pattern and activities, as well as the technology support needed. Based on the learner-centered mode and cognitive learning theory, we demonstrate the design and development of an online course that gives students learning flexibility and adaptability. The proposed framework uses a data mining algorithm to represent and extract a dynamic learning process and learning pattern in order to support students’ deep learning, efficient tutoring and collaboration in a web-based learning environment. Experiments show that it is feasible to use the method to develop an individual web-based learning system, which merits further and deeper study.

20.
A core issue of the association rule extraction process in the data mining field is finding the frequent patterns in a database of operational transactions. If these patterns are discovered, decision making and strategy setting in organizations can be accomplished with greater precision. A frequent pattern is a pattern seen in a significant number of transactions. Because such data are produced at high speed and without bound, they cannot be stored in memory, so it is necessary to develop techniques that process them online and find the recurring patterns. Several mining methods have been proposed in the literature that attempt to efficiently extract a complete or a closed set of different types of frequent patterns from a dataset. In this paper, a method based on Cellular Learning Automata (CLA) is presented for mining frequent itemsets. The proposed method is compared with the Apriori, FP-Growth and BitTable methods, and it is concluded that frequent itemset mining can be achieved in less running time. The experiments are conducted on several data sets with different minsup values, for each of the compared algorithms as well as for the presented method. The results point to the effectiveness of the proposed method.
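
To make the notions of frequent itemset and minsup in the abstract above concrete, here is a minimal sketch of the level-wise Apriori baseline that the paper compares against. It assumes transactions are small in-memory collections of hashable items and that minsup is an absolute transaction count. The function name is hypothetical, and the CLA-based method itself is not reproduced.

```python
from itertools import combinations


def frequent_itemsets(transactions, minsup):
    """Return {itemset: support count} for every itemset contained in
    at least minsup transactions (level-wise Apriori search)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= minsup}
    result = dict(current)

    k = 2
    while current:
        # Join step: unions of two frequent (k-1)-itemsets of size k.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Count step: one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= minsup}
        result.update(current)
        k += 1
    return result


# Hypothetical toy database of five transactions.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent = frequent_itemsets(db, minsup=3)
```

The repeated passes over the whole database in the count step are what make this baseline expensive on large or streaming data, which is the cost that alternative methods aim to reduce.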
