Similar Documents
20 similar documents found.
1.
Based on the background and characteristics of the Data Science and Big Data Technology program, and centered on its training objectives, this entry discusses the positioning, syllabus, teaching methods, practical teaching, and course-website construction of the program's introductory course, Data Science and Analytics, and describes a blended online/offline teaching practice.

2.
As an emerging program, Data Science and Big Data Technology is of great significance to the development of information technology and to overall national competitiveness in China. The article first identifies the main problems the program faces in faculty, research, and teaching; then, guided by the concepts of "New Engineering" and "engineering accreditation", it reports practice in two areas, the talent-training model and innovation in course teaching, formulates a curriculum system for the program, and gives a competence-training matrix for its students.

3.
尹波. 《计算机时代》2021, (7): 98-100, 103
Data Science and Big Data Technology is a typical New Engineering program, and the curriculum system is the core of its construction. The article analyzes the main problems in building the program and, given that its curriculum currently lacks a unified standard, formulates a curriculum system in line with New Engineering requirements. Taking Changsha University of Science and Technology as an example, it details teaching reform and practice in training objectives, course offerings, training directions, and practical teaching, providing a reference for building the Data Science and Big Data program.

4.
熊斐. 《互联网周刊》2023, (17): 58-60
With the explosive growth of society's demand for Big Data talent, training Data Science and Big Data Technology professionals who meet that demand has become an urgent problem for universities. By analyzing the main problems currently facing Big Data talent training and considering the characteristics of the Big Data program, the article explores a Big Data talent-training system for ordinary application-oriented undergraduate institutions from multiple dimensions, and argues that such institutions should build an application-led, distinctive Big Data talent-training system around their own specialties.

5.
In line with the requirements of curriculum-based ideological and political education and the characteristics of the course Introduction to Data Science and Big Data Technology, this entry proposes a construction and implementation plan guided by engineering education accreditation and centered on the OBE concept. It presents an overall teaching design that integrates curriculum-based ideological and political education with engineering education accreditation across course content, course objectives, teaching process, teaching methods and modes, and teaching-quality evaluation with continuous improvement, providing a reference for such integration in other courses.

6.
Building on a survey of Data Science and Big Data Technology programs at home and abroad, this entry proposes the goal of training outstanding Big Data talent with industry characteristics and sustainable competitiveness. It describes how to construct a curriculum system for end-to-end competence training, a school-enterprise collaborative education system, a multi-level integrated experimental environment, a teaching staff, and a continuous teaching-quality improvement system, thereby forming a multi-level, multi-type, complete system for training outstanding talent.

7.
According to the training objectives and methods of the Data Science and Big Data program, this entry proposes a teaching-experiment approach built on a Big Data platform, describes the platform construction and the experimental course system, summarizes the experience and shortcomings of course teaching practice, and offers design suggestions for the practice-oriented courses the Big Data program needs.

8.
Against the background of computational thinking and the Big Data era, computer education should keep pace with developments in computing theory and technology and with social demand; follow principles such as emphasizing substance, strengthening characteristics, highlighting advantages, and positioning accurately; and comprehensively reform and innovate teaching methods, teaching teams, teaching management, course materials, and training models, so as to build advanced educational concepts and a new, high-level, distinctive talent-training model. On this basis, this article analyzes comprehensive reform methods and strategies for the Computer Science and Technology program in the Big Data era.

9.
In the context of Big Data, computational thinking, and CS2013, this entry argues for the necessity of comprehensive reform of the Computer Science and Technology program, and elaborates a reform plan covering basic concepts, adjustment of specialization directions, teaching-team building, deeper enterprise cooperation, improvement of courses and teaching resources, reform of teaching methods, reform of practical teaching, and reform of teaching management.

10.
In response to the contemporary demand for data talent, and aiming to train Computer Science and Technology students with a data science and engineering profile, this entry proposes optimizing the traditional Computer Science and Technology curriculum and, taking distinctive research-oriented courses as the starting point, building a "Data Analysis and Mining" course cluster. It details the reform's goals and approach, and explains the implementation in connection with Shanxi University's disciplinary platform.

11.
Ren Rui, Cheng Jiechao, He Xi-Wen, Wang Lei, Zhan Jian-Feng, Gao Wan-Ling, Luo Chun-Jie. 《计算机科学技术学报》2019, 34(6): 1167-1184
Journal of Computer Science and Technology - With tremendously growing interest in Big Data, improving the performance of Big Data systems becomes more and more important. Among many steps, the...

12.
This paper presents CirroData, a high-performance SQL-on-Hadoop system designed for Big Data analytics workloads. As a home-grown, enterprise-level online analytical processing (OLAP) system with more than seven years of research and development (R&D) experience behind it, we share with the community the design details of how CirroData achieves high performance. Multiple optimization techniques are discussed in the paper; the effectiveness and efficiency of all of them have been proved by our customers' daily usage. Benchmark-level studies, as well as several real application case studies of CirroData, are presented. Our evaluations show that CirroData can outperform various types of counterpart database systems in the community, such as "Spark+Hive", "Spark+HBase", Impala, DB-X/Y, Greenplum, HAWQ, and others. CirroData achieves up to 4.99x speedup compared with Greenplum, HAWQ, and Spark on the standard TPC-H queries. Application-level evaluations demonstrate that CirroData outperforms "Spark+Hive" and "Spark+HBase" by up to 8.4x and 38.8x, respectively. Meanwhile, for some application workloads, CirroData achieves speedups of up to 20x, 100x, 182.5x, 92.6x, and 55.5x compared with Greenplum, DB-X, Impala, DB-Y, and HAWQ, respectively.

13.
On High Dimensional Projected Clustering of Data Streams
The data stream problem has been studied extensively in recent years because of the great ease of collecting stream data. The nature of stream data makes it essential to use algorithms that require only one pass over the data. Recently, single-scan stream analysis methods have been proposed in this context. However, a lot of stream data is high-dimensional in nature, and high-dimensional data is inherently more complex to cluster, classify, and search for similarity. Recent research discusses methods for projected clustering over high-dimensional data sets, but these methods are difficult to generalize to data streams because of their complexity and the large volume of the streams. In this paper, we propose a new high-dimensional projected data stream clustering method, called HPStream. The method incorporates a fading cluster structure and the projection-based clustering methodology. It is incrementally updatable, is highly scalable in both the number of dimensions and the size of the data streams, and achieves better clustering quality than previous stream clustering methods. Our performance study with both real and synthetic data sets demonstrates the efficiency and effectiveness of our proposed framework and implementation methods.

Charu C. Aggarwal received his B.Tech. degree in Computer Science from the Indian Institute of Technology (1993) and his Ph.D. degree in Operations Research from the Massachusetts Institute of Technology (1996). He has been a Research Staff Member at the IBM T. J. Watson Research Center since June 1996. He has applied for or been granted over 50 US patents, and has published over 75 papers in numerous international conferences and journals. He has twice been designated Master Inventor at IBM Research, in 2000 and 2003, for the commercial value of his patents. His contributions to the Epispire project on real-time attack detection were awarded the IBM Corporate Award for Environmental Excellence in 2003. He has been program chair of DMKD 2003, chair of all workshops organized in conjunction with ACM KDD 2003, and is an associate editor of the IEEE Transactions on Knowledge and Data Engineering. His current research interests include algorithms, data mining, privacy, and information retrieval.

Jiawei Han is a Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. He has been working on research into data mining, data warehousing, stream and RFID data mining, spatiotemporal and multimedia data mining, biological data mining, social network analysis, text and Web mining, and software bug mining, with over 300 conference and journal publications. He has chaired or served on many program committees of international conferences and workshops, including ACM SIGKDD Conferences (2001 best paper award chair, 1996 PC co-chair), SIAM Data Mining Conferences (2001 and 2002 PC co-chair), ACM SIGMOD Conferences (2000 exhibit program chair), International Conferences on Data Engineering (2004 and 2002 PC vice-chair), and International Conferences on Data Mining (2005 PC co-chair). He has also served or is serving on the editorial boards of Data Mining and Knowledge Discovery, IEEE Transactions on Knowledge and Data Engineering, Journal of Computer Science and Technology, and Journal of Intelligent Information Systems. He is currently serving on the Board of Directors for the Executive Committee of the ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD). Jiawei has received three IBM Faculty Awards, the Outstanding Contribution Award at the 2002 International Conference on Data Mining, the ACM Service Award (1999), and the ACM SIGKDD Innovation Award (2004). He has been an ACM Fellow since 2003. He is the first author of the textbook "Data Mining: Concepts and Techniques" (Morgan Kaufmann, 2001).

Jianyong Wang received the Ph.D. degree in computer science in 1999 from the Institute of Computing Technology, Chinese Academy of Sciences. Since then, he has worked as an assistant professor in the Department of Computer Science and Technology, Peking (Beijing) University, in the areas of distributed systems and Web search engines (May 1999-May 2001), and visited the School of Computing Science at Simon Fraser University (June 2001-December 2001), the Department of Computer Science at the University of Illinois at Urbana-Champaign (December 2001-July 2003), and the Digital Technology Center and Department of Computer Science and Engineering at the University of Minnesota (July 2003-November 2004), mainly working in the area of data mining. He is currently an associate professor in the Department of Computer Science and Technology, Tsinghua University, Beijing, China.

Philip S. Yu is the manager of the Software Tools and Techniques group at the IBM Thomas J. Watson Research Center. The current focuses of the project include the development of advanced algorithms and optimization techniques for data mining, anomaly detection, and personalization, and the enabling of Web technologies to facilitate E-commerce and pervasive computing. Dr. Yu's research interests include data mining, Internet applications and technologies, database systems, multimedia systems, parallel and distributed processing, disk arrays, computer architecture, performance modeling, and workload analysis. Dr. Yu has published more than 340 papers in refereed journals and conferences, holds or has applied for more than 200 US patents, and is an IBM Master Inventor. Dr. Yu is a Fellow of the ACM and a Fellow of the IEEE. He became the Editor-in-Chief of IEEE Transactions on Knowledge and Data Engineering in January 2001. He is an associate editor of ACM Transactions on Internet Technology and of the Knowledge and Information Systems Journal, a member of the IEEE Data Engineering steering committee, and a member of the steering committee of the IEEE International Conference on Data Mining. He received an IEEE Region 1 Award for "promoting and perpetuating numerous new electrical engineering concepts". Philip S. Yu received the B.S. degree in E.E. from National Taiwan University, Taipei, Taiwan, the M.S. and Ph.D. degrees in E.E. from Stanford University, and the M.B.A. degree from New York University.
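To make the fading-cluster idea concrete, here is a minimal sketch in the spirit of HPStream, not the authors' implementation: each cluster keeps exponentially decayed sums of its points, and an arriving point is assigned by distance over the cluster's lowest-variance (projected) dimensions. The decay rate, projection rule, and radius threshold below are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch of a fading, projected cluster structure in the spirit of
# HPStream; this is NOT the paper's implementation. Decay rate, projection rule,
# and the assignment radius are assumptions made for the example.

class FadingCluster:
    def __init__(self, point, t, decay=0.01):
        self.decay = decay          # lambda in the 2^(-lambda * dt) fading function
        self.last_t = t
        self.w = 1.0                # decayed point count
        self.ls = point.copy()      # decayed linear sum per dimension
        self.ss = point ** 2        # decayed squared sum per dimension

    def fade(self, t):
        f = 2.0 ** (-self.decay * (t - self.last_t))
        self.w *= f; self.ls *= f; self.ss *= f
        self.last_t = t

    def add(self, point, t):
        self.fade(t)
        self.w += 1.0; self.ls += point; self.ss += point ** 2

    def centroid(self):
        return self.ls / self.w

    def projected_dims(self, k):
        # keep the k dimensions with the smallest decayed variance
        var = self.ss / self.w - self.centroid() ** 2
        return np.argsort(var)[:k]

def assign(clusters, point, t, k=2, radius=3.0):
    """Assign a streaming point to its nearest projected cluster, else seed a new one."""
    best, best_d = None, np.inf
    for c in clusters:
        dims = c.projected_dims(k)                       # distance only over selected dims
        d = np.linalg.norm(point[dims] - c.centroid()[dims])
        if d < best_d:
            best, best_d = c, d
    if best is not None and best_d <= radius:
        best.add(point, t)
    else:
        clusters.append(FadingCluster(point, t))

# Toy usage on a random high-dimensional stream.
rng = np.random.default_rng(0)
clusters = []
for t, x in enumerate(rng.normal(size=(200, 10))):
    assign(clusters, x, t)
print(len(clusters), "clusters after 200 points")
```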

14.
Cloud computing offers the massive scalability and elasticity required by many scientific and commercial applications. Combining the computational and data-handling capabilities of clouds with parallel processing also has the potential to tackle Big Data problems efficiently. Science gateway frameworks and workflow systems enable application developers to implement complex applications and make them available to end-users via simple graphical user interfaces. Integrating such frameworks with Big Data processing tools in the cloud opens new opportunities for application developers. This paper investigates how workflow systems and science gateways can be extended with Big Data processing capabilities. A generic approach based on infrastructure-aware workflows is suggested, and a proof of concept is implemented based on the WS-PGRADE/gUSE science gateway framework and its integration with the Hadoop parallel data processing solution, based on the MapReduce paradigm, in the cloud. The analysis demonstrates that the described methods for integrating Big Data processing with workflows and science gateways work well in different cloud infrastructures and application scenarios, and can be used to create massively parallel applications for the scientific analysis of Big Data.
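As an illustration of the MapReduce paradigm referenced above (the WS-PGRADE/gUSE integration itself is not shown), a minimal Hadoop Streaming word count might look like the following; the script name and the local test command are assumptions.

```python
#!/usr/bin/env python3
# wordcount.py -- a minimal Hadoop Streaming word count illustrating the
# MapReduce paradigm mentioned above (not the WS-PGRADE/gUSE integration).
# Local test: cat input.txt | python3 wordcount.py map | sort | python3 wordcount.py reduce
import sys

def mapper():
    # emit "word<TAB>1" for every token read from stdin
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

def reducer():
    # stdin arrives sorted by key, so all counts for one word are adjacent
    current, count = None, 0
    for line in sys.stdin:
        word, n = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = word, 0
        count += int(n)
    if current is not None:
        print(f"{current}\t{count}")

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```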

15.
Beyond the hype of Big Data, something within business intelligence projects is indeed changing, mainly because Big Data is not only about data: it is about a complete conceptual and technological stack including raw and processed data, storage, ways of managing data, processing, and analytics. A challenge that becomes even trickier is managing the quality of the data in Big Data environments. More than ever, the need to assess Quality-in-Use gains importance, since the real contribution of data (its business value) can only be estimated in its context of use. Although different Data Quality models exist for assessing the quality of regular data, none of them has been adapted to Big Data. To fill this gap, we propose the "3As Data Quality-in-Use model", composed of three Data Quality characteristics for assessing the levels of Data Quality-in-Use in Big Data projects: Contextual Adequacy, Operational Adequacy, and Temporal Adequacy. The model can be integrated into any sort of Big Data project, as it is independent of any preconditions or technologies. The paper shows how to use the model with a working example. The model addresses every challenge of a Data Quality program aimed at Big Data. The main conclusion is that the model is an appropriate way to obtain the Quality-in-Use levels of the input data of a Big Data analysis, and those levels can be understood as indicators of the trustworthiness and soundness of the analysis results.
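The abstract gives no concrete scoring formulas, so purely as a hypothetical illustration of how three adequacy scores could be combined into a Quality-in-Use level, consider the sketch below; every function, field, weight, and threshold in it is invented for illustration and is not the paper's model.

```python
# Hypothetical illustration of the "3As" idea: three adequacy scores in [0, 1]
# combined into one Quality-in-Use level. All scoring rules and weights here
# are invented for illustration; the paper's actual model may differ.
from dataclasses import dataclass

@dataclass
class DatasetProfile:
    domain_match: float      # fraction of attributes relevant to the task context
    completeness: float      # fraction of non-missing values
    freshness_days: float    # age of the newest record, in days

def contextual_adequacy(p: DatasetProfile) -> float:
    return p.domain_match

def operational_adequacy(p: DatasetProfile) -> float:
    return p.completeness

def temporal_adequacy(p: DatasetProfile, horizon_days: float = 30.0) -> float:
    return max(0.0, 1.0 - p.freshness_days / horizon_days)

def quality_in_use(p: DatasetProfile) -> float:
    # equal weights, purely as an example
    return (contextual_adequacy(p) + operational_adequacy(p) + temporal_adequacy(p)) / 3.0

profile = DatasetProfile(domain_match=0.8, completeness=0.95, freshness_days=10)
print(f"Quality-in-Use level: {quality_in_use(profile):.2f}")  # 0.81
```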

16.
Forensic examiners are in an uninterrupted battle with criminals over the use of Big Data technology. The underlying storage system is the main scene for tracing criminal activity, and the Big Data storage system has been identified as an emerging challenge for digital forensics, requiring the development of a sound methodology for its investigation. Since the use of Hadoop as a Big Data storage system continues to grow rapidly, an investigation process model for forensic analysis of Hadoop storage and attached client devices is needed. Moreover, forensic analysis of a Hadoop Big Data storage system may take additional time if it is not known where data remnants can reside. In this paper, a new forensic investigation process model for the Hadoop Big Data storage system is proposed and the discovered data remnants are presented. The data remnants resulting from this forensic research on Hadoop assist forensic examiners and practitioners in generating evidence.

17.
More and more Internet-of-Things data is high-dimensional. Because current sensor-data anomaly detection algorithms struggle with online detection over high-dimensional data, this entry proposes a high-dimensional sensor-data anomaly detection algorithm based on a deep belief network (DBN). The DBN first extracts features from the high-dimensional data to reduce its dimensionality, and anomaly detection is then performed on the reduced data. During detection, a QSSVM (Quarter-Sphere Support Vector Machine) is combined with a sliding-window model to achieve online anomaly detection. Extensive experiments on four real sensor data sets, compared against earlier anomaly detection algorithms, show that the new algorithm uses only 50% of the computation time of OCSVM (One-Class Support Vector Machine) while improving detection accuracy by about 20%.
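A rough sketch of the described pipeline follows, with two substitutions because scikit-learn provides neither a QSSVM nor a full DBN: OneClassSVM stands in for the quarter-sphere SVM, and a single BernoulliRBM stands in for the deep belief network. All parameters and the simulated stream are illustrative.

```python
# Sketch of the described pipeline: reduce high-dimensional sensor readings with
# an RBM-based feature extractor, then flag anomalies over a sliding window.
# Hedges: OneClassSVM substitutes for QSSVM, and one BernoulliRBM substitutes
# for a full deep belief network; both are approximations for illustration.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
stream = rng.normal(size=(500, 64))          # simulated 64-dimensional sensor stream
stream[450:] += 4.0                          # inject an anomalous burst at the end

scaler = MinMaxScaler().fit(stream[:300])    # RBMs expect inputs roughly in [0, 1]
rbm = BernoulliRBM(n_components=8, n_iter=20, random_state=0)
rbm.fit(scaler.transform(stream[:300]))      # learn low-dimensional features offline

window = 100                                 # sliding-window online detection
for start in range(300, len(stream) - window + 1, window):
    feats = rbm.transform(scaler.transform(stream[start:start + window]))
    detector = OneClassSVM(nu=0.1, gamma="scale").fit(feats)
    flags = detector.predict(feats)          # -1 marks suspected anomalies
    print(f"window {start}-{start + window}: {np.sum(flags == -1)} anomalies flagged")
```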

18.
Drawing on an extensive review of the relevant literature, this entry outlines the basic concepts of Big Data and the pervasive cloud, proposes a Big Data mining architecture based on the pervasive cloud, argues its feasibility in theory, describes its operating mode, analyzes its performance, and summarizes the key technologies involved in pervasive-cloud-based Big Data mining.

19.
The quality of data is directly related to the quality of the models drawn from that data. For that reason, much research is devoted to improving the quality of data and to amending the errors it may contain. One of the most common problems is the presence of noise in classification tasks, where noise refers to the incorrect labeling of training instances. This problem is very disruptive, as it changes the decision boundaries of the problem. Big Data problems pose a new challenge in terms of data quality due to the massive and unsupervised accumulation of data. The Big Data scenario also brings new problems to classic data preprocessing algorithms, as they are not prepared for working with such amounts of data, and these algorithms are key to moving from Big Data to Smart Data. In this paper, an iterative ensemble filter for removing noisy instances in Big Data scenarios is proposed. Experiments carried out on six Big Data datasets show that our noise filter outperforms the current state-of-the-art noise filter in Big Data domains. It has also proved to be an effective solution for transforming raw Big Data into Smart Data.
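A compact single-machine sketch of an iterative ensemble filter follows; the paper's distributed Big Data implementation is not reproduced, and the classifier choices, voting rule, and stopping criterion below are illustrative assumptions.

```python
# Single-machine sketch of an iterative ensemble noise filter: train an ensemble
# with cross-validation, drop instances whose labels a majority of the ensemble
# votes against, and repeat until nothing is removed. This is NOT the paper's
# distributed implementation; classifiers and vote rule are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

def ensemble_filter(X, y, n_splits=5, max_iter=10, seed=0):
    """Iteratively remove instances whose labels a majority of the ensemble rejects."""
    keep = np.arange(len(y))
    for _ in range(max_iter):
        Xk, yk = X[keep], y[keep]
        wrong = np.zeros(len(keep))
        skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
        for tr, te in skf.split(Xk, yk):
            for make_clf in (lambda: DecisionTreeClassifier(random_state=seed),
                             KNeighborsClassifier, GaussianNB):
                clf = make_clf().fit(Xk[tr], yk[tr])
                wrong[te] += (clf.predict(Xk[te]) != yk[te])
        noisy = wrong >= 2                 # majority of the 3 voters disagree
        if not noisy.any():
            break                          # converged: nothing left to remove
        keep = keep[~noisy]
    return keep

# Toy usage: corrupt 15% of the labels, then filter.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
rng = np.random.default_rng(0)
flip = rng.choice(len(y), size=150, replace=False)
y_noisy = y.copy(); y_noisy[flip] = 1 - y_noisy[flip]
kept = ensemble_filter(X, y_noisy)
print(f"kept {len(kept)} of {len(y)} instances after filtering")
```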

20.
Research associated with Big Data in the Cloud will be an important topic over the next few years. The topic includes work on demonstrating architectures, applications, services, experiments, and simulations in the Cloud that support cases related to the adoption of Big Data. A common approach to Big Data in the Cloud, allowing better access, performance, and efficiency when analysing and understanding the data, is to deliver Everything as a Service. Organisations adopting Big Data this way find that the boundaries between private clouds, public clouds, and the Internet of Things (IoT) can be very thin. Volume, variety, velocity, veracity, and value are the major factors in Big Data systems, but there are other challenges to be resolved. The papers of this special issue address a variety of issues and concerns in Big Data, including: searching and processing Big Data, implementing and modelling event and workflow systems, visualisation modelling and simulation, and aspects of social media.
