Similar documents
 Found 20 similar documents (search time: 0 ms)
1.
We present some basic concepts of a modelling environment for data integration in business analytics. The main emphasis is on defining a process model for the activities involved in data integration, which later allows the quality of the data to be assessed. The model combines knowledge and techniques from statistical metadata management and from workflow processes. The modelling concepts are presented in a problem-oriented formulation. The approach is embedded in an open model framework that aims to provide a modelling platform for all kinds of models useful in business applications.

2.
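MOOCs have developed rapidly in recent years. In use, their large numbers of learners and massive volumes of teaching resources have accumulated a huge amount of learning-behavior data. Big data analysis based on MOOCs has therefore become an emerging research hotspot. Its analysis framework involves four core questions: where the big data is obtained (Where), what types of MOOC big data exist (What), how the big data is analyzed (How), and how the analysis is applied (Do). By analyzing the current state of MOOCs and reviewing their characteristics and categories, this paper proposes a Where-What-How-Do big data analysis framework and discusses and answers the four core questions. Finally, cluster analysis and multiple regression analysis are performed on the Canvas Network dataset, yielding some insights and applications regarding MOOC data.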

3.
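A Survey of Visual Analytics for Big Data   Cited by: 8 (self-citations: 0, citations by others: 8)
任磊, 杜一, 马帅, 张小龙, 戴国忠. 《软件学报》 (Journal of Software), 2014, 25(9): 1909-1936
Visual analytics is an important method for big data analysis. Big data visual analytics aims to exploit the automated analysis capabilities of computers while fully tapping human cognitive strengths in processing visual information, organically combining the respective strengths of humans and machines. With interactive analysis methods and interaction techniques, it helps people gain insight into the information, knowledge, and wisdom behind big data more intuitively and efficiently. Taking the integrated perspective of cognition, visualization, and human-computer interaction that the visual analytics field emphasizes, the paper analyzes the basic theories supporting big data visual analytics, including cognitive theory supporting the analysis process, information visualization theory, and human-computer interaction and user interface theory. On this basis, it discusses information visualization techniques for mainstream big data applications: visualization of text, networks (graphs), spatio-temporal data, and multi-dimensional data. It also examines the human-computer interaction techniques that support visual analytics, including interface metaphors and interaction components supporting the visual analytics process, multi-scale/multi-focus/multi-facet interaction techniques, and natural interaction techniques for Post-WIMP interfaces. Finally, it points out the bottlenecks and technical challenges facing big data visual analytics.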

4.
Big data analytics applications are increasingly deployed on cloud computing infrastructures, and it is still a big challenge to pick the optimal cloud configurations in a cost-effective way. In this paper, we address this problem with high accuracy and low overhead. We propose Apollo, a data-driven approach that can rapidly pick the optimal cloud configurations by reusing data from similar workloads. We first classify 12 typical workloads in BigDataBench by characterizing pairwise correlations in our offline benchmarks. When a new workload arrives, we run it with several small datasets to rank its key characteristics and find its similar workloads. Based on the rank, we then limit the search space of cloud configurations through a classification mechanism. Finally, we leverage a hierarchical regression model to measure which cluster is more suitable and use a local search strategy to pick the optimal cloud configurations in a few extra tests. Our evaluation on 12 typical workloads in HiBench shows that, compared with state-of-the-art approaches, Apollo can improve search accuracy by up to 30% while reducing the overhead of picking the optimal cloud configurations by as much as 50%.
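
The idea of finding similar workloads through pairwise correlations can be illustrated with a minimal Python sketch. This is not Apollo's implementation: the abstract does not say which characteristics are profiled or which correlation measure is used, so the four-value resource profiles, the workload names, and the use of Pearson correlation are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_similar(new_profile, known_profiles):
    """Return known workload names, most correlated with new_profile first."""
    scores = {name: pearson(new_profile, profile)
              for name, profile in known_profiles.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical profiles: (CPU, memory, disk I/O, network) utilization
# measured offline. The numbers are invented for this example.
known = {
    "wordcount": [0.9, 0.3, 0.2, 0.1],
    "sort": [0.4, 0.5, 0.9, 0.6],
    "pagerank": [0.7, 0.8, 0.3, 0.5],
}

# Profile of a new workload, as if measured on a few small datasets.
new_workload = [0.8, 0.35, 0.25, 0.15]
ranking = rank_similar(new_workload, known)
print(ranking)
```

In a scheme like this, the configurations that worked well for the top-ranked workloads would be tried first, which is one way to narrow the search space.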

5.
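In the big data era, more and more fields need to process massive, high-speed data in real time. Extracting useful information from big data streams and applying it across industries is becoming increasingly important. Traditional batch machine learning techniques face many limitations when applied to big data analysis. Online learning adopts a streaming computation model and processes data in real time directly in memory, providing a useful tool for learning from streaming data. This paper introduces the motivation and background of big data analysis and presents classic and recent online learning methods and algorithms; this online learning framework is promising for addressing the difficulties and challenges of various big data mining tasks. The technical content covers three areas: 1) online learning of linear models; 2) kernel-based online learning of nonlinear models; 3) non-traditional online learning methods. Detailed models and pseudocode are given for each class of methods where possible, and key issues in the research and application of large-scale machine learning for big data analysis are discussed. Three typical application scenarios of big data online learning are presented, and directions for further research in online learning, now and in the future, are explored.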
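
As an illustration of the first category (online learning of linear models), the following is a minimal sketch of the classic perceptron, which updates its weights one example at a time instead of training on a stored batch. It is a textbook example chosen here for illustration, not an algorithm taken from the surveyed paper, and the small data stream is invented.

```python
def perceptron_update(weights, bias, features, label, lr=1.0):
    """Process one streamed example (label is +1 or -1).

    Returns the possibly updated (weights, bias). The model changes only
    when the current example is misclassified or lies on the boundary.
    """
    score = sum(w * x for w, x in zip(weights, features)) + bias
    if label * score <= 0:
        weights = [w + lr * label * x for w, x in zip(weights, features)]
        bias = bias + lr * label
    return weights, bias


def predict(weights, bias, features):
    """Classify a feature vector as +1 or -1 with the current model."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else -1


# Invented stream of (features, label) pairs; each is seen once, in order.
stream = [
    ([2.0, 1.0], 1),
    ([-1.0, -2.0], -1),
    ([1.5, 0.5], 1),
    ([-2.0, -0.5], -1),
]

weights, bias = [0.0, 0.0], 0.0
for features, label in stream:
    weights, bias = perceptron_update(weights, bias, features, label)

print(weights, bias)
```

Only the current weights and bias are kept in memory, so the cost per example is independent of how much data has already been seen.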

6.
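Computational Intelligence in Big Data Analysis: Research Status and Prospects   Cited by: 2 (self-citations: 0, citations by others: 2)
郭平, 王可, 罗阿理, 薛明志. 《软件学报》 (Journal of Software), 2015, 26(11): 3010-3025
With the explosive growth of data volumes in industry and science, big data technologies and applications have attracted widespread attention. How to analyze big data and fully exploit its potential value has become a scientific question that requires in-depth study. Computational intelligence is an effective means of solving complex problems in scientific research and engineering practice and an important research direction in artificial intelligence and information science; applying computational intelligence methods to big data analysis has great potential. This paper surveys computational intelligence methods in big data analysis. In light of the characteristics of big data, it discusses the open problems and further research directions of computational intelligence in big data analysis, addresses the issue of data source sharing, and recommends using data from data-intensive basic research fields, typified by astronomy, to carry out big data analysis research.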

7.
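Learning analytics is the focus of big data applications in education. This paper gives a technical analysis of the core stages of learning analytics, reviews the main learning analytics tools, and uses an empirical study to show how learning analytics techniques are applied from the perspectives of three kinds of users: course developers, teaching administrators, and tutors. Using learning-behavior data from a course platform as the sample, the study applies statistics, visualization, clustering, association rules, and other methods with tools such as Excel, SPSS, and Weka to analyze the access frequency of course modules, understand the effect of different teaching groups on the number of weeks in which students log in, characterize student categories, and uncover hidden regularities. The study shows that learning analytics techniques fully realize the value of educational big data, making data an important basis for teaching interventions and decision making.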

8.
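吴悦文, 吴恒, 任杰, 张文博, 魏峻, 王焘, 钟华. 《软件学报》 (Journal of Software), 2020, 31(6): 1860-1874
Cloud computing has become the mainstream runtime environment for big data analysis jobs, and choosing suitable cloud resources to optimize their performance is a major challenge. Current research mainly considers the diversity of big data analysis frameworks (such as Hadoop and Spark) and uses machine learning methods for resource provisioning, but with few samples these methods easily fall into local optima. This paper proposes RP-CH, a heuristic cloud resource provisioning method based on workload classification for big data environments. Exploiting the shared nature of cloud resources, it collects runtime monitoring data and cloud resource configuration information from other big data analysis jobs, builds heuristic rules that relate workload classes to optimized cloud resource configurations, and applies these rules to the acquisition function of a Bayesian optimization algorithm. Results on the HiBench and SparkBench benchmarks show that, compared with the existing method CherryPick, RP-CH improves the performance of big data analysis jobs by 58% on average and reduces cost by 44% on average.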

9.
This paper presents an integrated framework that includes an automatic weighting method for assessing the data quality (DQ) of the framework, so as to better support business intelligence (BI) usage. Specifically, we use business process modeling (BPM) notation and the information product map and frame them into a hierarchical mapping structure. Furthermore, we develop and demonstrate an automatic weight-assignment method for evaluating critical DQ dimensions (i.e., completeness and accuracy) of the integrated framework. Following a design science paradigm, the effectiveness of the framework and the associated DQ weighting method has been rigorously validated by faculty management users of a university. The framework, together with the DQ weighting method, builds user confidence by enhancing the traceability of a BI product. The automatic DQ weight assignment also improves time efficiency because the weight of each data attribute is determined automatically based on its usage on the BI dashboard.
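
The idea of deriving attribute weights from dashboard usage can be sketched in Python as follows. The abstract does not give the actual weighting formula, so this sketch makes two assumptions: weights are proportional to how often each attribute appears on the dashboard, and completeness is the fraction of non-missing values. The records and usage counts are invented.

```python
def usage_weights(usage_counts):
    """Normalize per-attribute usage counts into weights that sum to 1."""
    total = sum(usage_counts.values())
    if total == 0:
        raise ValueError("no attribute is used on the dashboard")
    return {attr: count / total for attr, count in usage_counts.items()}


def completeness(records, attr):
    """Fraction of records in which attr is present and not None."""
    if not records:
        return 0.0
    filled = sum(1 for record in records if record.get(attr) is not None)
    return filled / len(records)


def weighted_completeness(records, usage_counts):
    """Usage-weighted completeness score over all attributes in use."""
    weights = usage_weights(usage_counts)
    return sum(weight * completeness(records, attr)
               for attr, weight in weights.items())


# Invented faculty records; None marks a missing value.
records = [
    {"name": "A", "dept": "CS", "rank": None},
    {"name": "B", "dept": None, "rank": None},
    {"name": "C", "dept": "EE", "rank": "Prof"},
    {"name": "D", "dept": "CS", "rank": "Lect"},
]

# Hypothetical count of dashboard widgets that use each attribute.
usage = {"name": 2, "dept": 1, "rank": 1}

score = weighted_completeness(records, usage)
print(score)
```

Under these assumptions, a gap in a heavily used attribute lowers the score more than the same gap in a rarely used one, and no weights have to be set by hand.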

10.
Business Intelligence: An Analysis of the Literature   Cited by: 1 (self-citations: 0, citations by others: 1)
This research collects, synthesizes, and analyzes 167 articles on a variety of topics closely related to business intelligence (BI) published from 1997 to 2006 in ten leading Information Systems (IS) journals. We found a generally increasing level of activity during the 10-year period and a focus on exploratory research methodologies. We noted that several methodologies were either underrepresented or absent from the pool of BI research. We also identified several subject areas that need further exploration.

11.
In recent years, huge volumes of healthcare data have been generated in various forms. Advances in medical imaging have been tremendous, making biomedical image acquisition easier and quicker. With such massive generation of big data, new methods based on Big Data Analytics (BDA), Machine Learning (ML), and Artificial Intelligence (AI) have become essential. In this context, the current research work develops a new Big Data Analytics with Cat Swarm Optimization based Deep Learning (BDA-CSODL) technique for medical image classification in an Apache Spark environment. The aim of the proposed BDA-CSODL technique is to classify medical images and diagnose disease accurately. The BDA-CSODL technique involves several stages: preprocessing, segmentation, feature extraction, and classification. It uses a multi-level thresholding-based image segmentation approach to detect infected regions in a medical image. A deep convolutional neural network-based Inception v3 method is used as the feature extractor, and Stochastic Gradient Descent (SGD) is used for parameter tuning. Furthermore, a CSO with Long Short-Term Memory (CSO-LSTM) model is employed as the classification model to assign the appropriate class labels. Both the SGD and CSO design approaches help improve the overall image classification performance of the proposed BDA-CSODL technique. A wide range of simulations was conducted on benchmark medical image datasets, and the comprehensive comparative results demonstrate the superiority of the proposed BDA-CSODL technique under different measures.

12.
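First, the Internet environment of broad and deep mass interaction in the big data era is analyzed. Second, network collective intelligence is proposed and defined, and is shown to be characterized by being "driven by network data, complex in its forms of interaction, subject to strong network effects, oriented mainly toward knowledge production, and concerned with cognition under uncertainty". Then, a research method for network collective intelligence is proposed. Guided by the methodology of complexity science and adhering to coherentism, the method takes complexity science, networked data mining, and artificial intelligence with uncertainty as its supporting theories and methods. It emphasizes the distinctive features of network collective intelligence and interdisciplinary research, and combines system analysis, modeling analysis, and simulation analysis to study the scientific problems of network collective intelligence at multiple scales and levels from the perspectives of structure and dynamics. This addresses the lack of theories and methods for research on network collective intelligence and deepens the understanding of network collective intelligence and social computing.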

13.
Internet of Things (IoT) applications now involve millions of structured and unstructured data items and face numerous problems in data organization, production, and capture. Big data analytics is the most suitable technology for addressing these shortcomings. Even though big data and IoT can make human life more convenient, those benefits come at the expense of security. To manage such threats, intrusion detection systems (IDS) have been widely applied to identify malicious network traffic, particularly once preventive techniques fail at the level of endpoint IoT devices. As cyberattacks targeting IoT have gradually become stealthier and more sophisticated, IDS must continually evolve to manage emerging security threats. This study devises a Big Data Analytics with Internet of Things Assisted Intrusion Detection using Modified Buffalo Optimization Algorithm with Deep Learning (IDMBOA-DL) algorithm. In the presented IDMBOA-DL model, the Hadoop MapReduce tool is used to manage big data. The MBOA algorithm is applied to select an optimal subset of features. Finally, the sine cosine algorithm (SCA) with a convolutional autoencoder (CAE) mechanism is used to recognize and classify intrusions in the IoT network. A wide range of simulations was conducted to demonstrate the enhanced results of the IDMBOA-DL algorithm. The comparison outcomes highlighted the better performance of the IDMBOA-DL model over other approaches.

14.
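Research on Business Intelligence Based on Web Services   Cited by: 1 (self-citations: 0, citations by others: 1)
In the process of informatization, enterprises need to find information models accurately and in real time, and the need to share and exchange knowledge, information, and intelligent analysis capabilities between enterprises is becoming increasingly urgent. This paper analyzes the dynamic and real-time advantages of Web services, and proposes and describes the architecture and implementation of a business intelligence network based on Web services. This combines the forward-looking nature of business intelligence with the timeliness of Web services and improves the decision-making capability of enterprises.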

15.
Collective intelligence has been an important research topic in many AI communities. With the big data phenomenon, we face many research problems on how to integrate big data with collective intelligence. This special issue has selected 9 high-quality papers covering various research issues.

16.
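[Objective] This paper analyzes the main challenges that artificial intelligence and big data applications pose to computer systems as data volumes grow rapidly and, in view of trends in computer system development, identifies several research directions in high-efficiency computing for artificial intelligence and big data that urgently need to be addressed. [Literature scope] The paper broadly reviews recent research, in China and abroad, on big data and artificial intelligence computing on supercomputing and high-performance computing platforms, and the challenging problems that this research has solved. [Methods] Big data provides artificial intelligence with increasingly rich training datasets, but it also places higher demands on the computing power of computer systems. In recent years China's supercomputers have ranked among the world's leaders, providing strong computing platform support for large-scale applications of big data and artificial intelligence. [Results] Most current high-performance computing platforms, typified by supercomputers, are heterogeneous parallel computing systems composed of CPUs plus accelerators, whose numerous computing cores can provide powerful computing capability for artificial intelligence and big data applications. [Limitations] Because of the complexity of the architecture, fully exploiting the computing capability and improving computing efficiency are major challenges. For artificial intelligence and big data in particular, which differ from scientific computing, improving parallel computing efficiency is even harder. [Conclusions] Research on high-efficiency computing is therefore needed, ranging from low-level resource management, task scheduling, basic algorithm design, and communication optimization to upper-level model parallelization and parallel programming, in order to comprehensively improve the computing energy efficiency of artificial intelligence and big data applications on high-performance computing platforms.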

17.
In the digital era, the Internet of Things (IoT) and connected objects generate a huge quantity of data traffic, which feeds big data analytics models that discover hidden patterns and detect abnormal traffic. Though IoT networks are popular and widely employed in real-world applications, security in IoT networks remains a challenging problem. Conventional intrusion detection systems (IDS) cannot be employed in IoT networks owing to limitations in resources and complexity. Therefore, this paper concentrates on the design of an intelligent metaheuristic optimization based feature selection with deep learning (IMFSDL) classification model, called IMFSDL-IDS, for IoT networks. In the proposed IMFSDL-IDS model, data is first collected using the IoT devices and then preprocessed in two stages: data transformation and data normalization. The Hadoop ecosystem is employed to manage big data. The IMFSDL-IDS model also includes hill climbing with moth flame optimization (HCMFO) for feature subset selection, to reduce complexity and increase the overall detection efficiency. Moreover, the beetle antenna search (BAS) with variational autoencoder (VAE) technique, called BAS-VAE, is applied to detect intrusions in the feature-reduced data. The BAS algorithm is integrated into the VAE to tune its parameters properly and thereby raise classification performance. To validate the intrusion detection performance of the IMFSDL-IDS system, a set of experiments was carried out on standard IDS datasets, and the results were examined under distinct aspects. The experimental values showed the superiority of the IMFSDL-IDS model over the compared models, with maximum accuracies of 95.25% and 97.39% on the NSL-KDD and UNSW-NB15 datasets respectively.
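
The hill-climbing part of feature subset selection can be illustrated with a minimal Python sketch. It shows plain hill climbing only, not the paper's HCMFO hybrid with moth flame optimization. The scoring function is a toy stand-in for a real measure such as validation accuracy, and the usefulness values and penalty are invented.

```python
def hill_climb(n_features, score, start=None):
    """Greedy hill climbing over feature subsets.

    A subset is a list of booleans (True = feature kept). Each step moves
    to the best single-feature flip and stops when no flip improves the score.
    """
    current = list(start) if start is not None else [False] * n_features
    current_score = score(current)
    while True:
        best_neighbour, best_score = None, current_score
        for i in range(n_features):
            neighbour = current.copy()
            neighbour[i] = not neighbour[i]  # add or drop feature i
            candidate_score = score(neighbour)
            if candidate_score > best_score:
                best_neighbour, best_score = neighbour, candidate_score
        if best_neighbour is None:  # local optimum reached
            return current, current_score
        current, current_score = best_neighbour, best_score


# Toy objective: each feature contributes an invented usefulness value
# and costs a fixed penalty, standing in for a trained model's score.
USEFULNESS = [0.30, 0.02, 0.25, 0.01, 0.10]
PENALTY = 0.05


def toy_score(mask):
    return sum(u - PENALTY for u, keep in zip(USEFULNESS, mask) if keep)


selected, best = hill_climb(len(USEFULNESS), toy_score)
print(selected, best)
```

Plain hill climbing can stop at a local optimum, which is a common reason for pairing it with a population-based metaheuristic.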

18.
19.
Big data analytics and business analytics are a disruptive technology and an innovative solution for enterprise development. However, what is the relationship between business analytics, big data analytics, and enterprise information systems (EIS)? How can business analytics enhance the development of EIS? How can analytics be incorporated into EIS? These are still big issues. This article addresses these three issues by proposing an ontology of business analytics, presenting an analytics service-oriented architecture (ASOA), and applying ASOA to EIS, where our analysis of surveyed data showed that the proposed ASOA is viable for developing EIS. This article then examines the incorporation of business analytics into EIS by proposing a model for business analytics service-based EIS, or ASEIS for short. The proposed approach might facilitate the research and development of EIS, business analytics, big data analytics, and business intelligence.

20.
This paper focuses on facilitating state-of-the-art applications of big data analytics (BDA) architectures and infrastructures in the telecommunications (telecom) industrial sector. Telecom companies deal with terabytes to petabytes of data on a daily basis, and IoT applications in telecom are further contributing to this data deluge. Recent advances in BDA have exposed new opportunities to get actionable insights from telecom big data. These benefits and the fast-changing BDA technology landscape make it important to investigate existing BDA applications in the telecom sector. To this end, we first identify published research on BDA applications in telecom through a systematic literature review, in which we filter 38 articles and categorize them into frameworks, use cases, literature reviews, white papers, and experimental validations. We also discuss the benefits and challenges mentioned in these articles. We find that the experiments are all proofs of concept (POC) on a severely limited BDA technology stack (compared with the available technology stack); that is, we did not find any work focusing on a full-fledged BDA implementation in an operational telecom environment. To facilitate these applications at the research level, we propose a state-of-the-art lambda architecture for BDA pipeline implementation (called Lambda Tel), based entirely on open source BDA technologies and the standard Python language, along with relevant guidelines. We found only one research paper that presented a relatively limited lambda architecture, which used the proprietary AWS cloud infrastructure. We believe Lambda Tel presents a clear roadmap for telecom industry practitioners to implement and enhance BDA applications in their enterprises.
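
The general lambda architecture pattern mentioned in this abstract (a batch layer, a speed layer, and a serving layer that merges the two) can be shown with a small in-memory Python sketch. It illustrates the pattern only and is not the Lambda Tel pipeline: the class name, the event format (a cell identifier per event), and the counting workload are all hypothetical.

```python
from collections import Counter


class LambdaPipeline:
    """Toy lambda architecture that counts events per cell identifier."""

    def __init__(self):
        self.master_log = []         # append-only raw events (batch layer input)
        self.batch_view = Counter()  # recomputed from the full log
        self.speed_view = Counter()  # incremental counts since the last batch run

    def ingest(self, cell_id):
        """Store the raw event and update the real-time view immediately."""
        self.master_log.append(cell_id)
        self.speed_view[cell_id] += 1

    def run_batch(self):
        """Recompute the batch view from all raw data, then reset the speed view."""
        self.batch_view = Counter(self.master_log)
        self.speed_view.clear()

    def query(self, cell_id):
        """Serving layer: merge the batch result with recent real-time counts."""
        return self.batch_view[cell_id] + self.speed_view[cell_id]


pipeline = LambdaPipeline()
for event in ["cell-1", "cell-2", "cell-1"]:
    pipeline.ingest(event)

pipeline.run_batch()       # periodic batch recomputation
pipeline.ingest("cell-1")  # arrives after the batch run

print(pipeline.query("cell-1"), pipeline.query("cell-2"))
```

In a real deployment each layer would typically be a separate distributed system; the sketch keeps only the data flow between the layers.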

