Similar Documents
 20 similar documents found (search time: 218 ms)
1.
李国良  周煊赫 《软件学报》2020,31(3):831-844
In the era of big data, database systems face three main challenges. First, traditional optimization techniques based on expert experience (such as cost estimation, join-order selection, and parameter tuning) can no longer meet the performance demands of heterogeneous data, massive applications, and large-scale users; learning-based database optimization techniques can be designed to make databases more intelligent. Second, in the AI era, many database applications need to use artificial intelligence algorithms, such as image search inside the database; AI algorithms can be embedded into the database, database techniques can be used to accelerate AI algorithms, and AI-based services can be provided within the database. Third, traditional databases focus on general-purpose hardware (such as CPUs) and cannot fully exploit the advantages of new hardware (such as ARM and AI chips). In addition, beyond the relational model, databases need to support tensor models to accelerate AI operations. To address these challenges, an AI-native database system is proposed, which integrates various AI techniques into the database to provide self-monitoring, self-configuring, self-optimizing, self-diagnosing, self-healing, self-securing, and self-assembling capabilities; on the other hand, through declarative languages, the database offers AI capabilities to lower the barrier to using AI. The paper introduces five stages for implementing an AI-native database and presents the challenges of designing one. Autonomous database tuning, query optimization based on deep reinforcement learning, cardinality estimation based on machine learning, and autonomous index/view recommendation are used as examples to demonstrate the advantages of AI-native databases.
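To make the learned-optimization idea concrete, here is a minimal, hypothetical sketch of machine-learning-based cardinality estimation, one of the examples the paper cites. The table, query features, and model below are invented stand-ins, not the authors' system: a regressor is trained on past range queries and their observed result sizes, then predicts the log-cardinality of a new query.

```python
# Hedged sketch: learned cardinality estimation (illustrative, not the paper's model).
# A range query on table T(a, b) is featurized as [lo_a, lo_b, hi_a, hi_b]; a
# regressor learns to map these features to log(cardinality) from past executions.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
table = rng.normal(size=(100_000, 2))          # toy table with columns a, b

def true_card(lo_a, hi_a, lo_b, hi_b):
    m = ((table[:, 0] >= lo_a) & (table[:, 0] <= hi_a) &
         (table[:, 1] >= lo_b) & (table[:, 1] <= hi_b))
    return int(m.sum())

# Training set: random range queries paired with their observed cardinalities.
X, y = [], []
for _ in range(2000):
    lo = rng.uniform(-3, 2, size=2)
    hi = lo + rng.uniform(0, 3, size=2)
    X.append([*lo, *hi])
    y.append(np.log1p(true_card(lo[0], hi[0], lo[1], hi[1])))

model = GradientBoostingRegressor().fit(np.array(X), np.array(y))

q = [-1.0, -1.0, 1.0, 1.0]                     # a new query: [lo_a, lo_b, hi_a, hi_b]
print("estimated:", int(np.expm1(model.predict([q])[0])))
print("actual   :", true_card(q[0], q[2], q[1], q[3]))
```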

2.
Visualization and artificial intelligence (AI) are widely applied approaches to data analysis. On one hand, visualization facilitates human understanding of data through intuitive visual representation and interactive exploration. On the other hand, AI can learn from data and carry out laborious tasks on behalf of humans. In complex data analysis scenarios, such as epidemic traceability and city planning, humans need to understand large-scale data and make decisions, which requires combining the strengths of both visualization and AI. Existing studies have introduced AI-assisted visualization as AI4VIS and visualization-assisted AI as VIS4AI. However, a systematic account of how AI and visualization can complement each other and be integrated into data analysis processes is still missing. In this paper, we define three integration levels of visualization and AI. The highest integration level is described as the framework of VIS+AI, which allows AI to learn human intelligence from interactions and communicate with humans through visual interfaces. We also summarize future directions of VIS+AI to inspire related studies.

3.
PrDB: managing and exploiting rich correlations in probabilistic databases   (total citations: 2; self-citations: 0; other citations: 2)
Due to numerous applications producing noisy data, e.g., sensor data, experimental data, data from uncurated sources, and information extraction, there has been a surge of interest in the development of probabilistic databases. Most probabilistic database models proposed to date, however, fail to meet the challenges of real-world applications on two counts: (1) they often restrict the kinds of uncertainty that the user can represent; and (2) the query processing algorithms often cannot scale up to the needs of the application. In this work, we define a probabilistic database model, PrDB, that uses graphical models, a state-of-the-art probabilistic modeling technique developed within the statistics and machine learning community, to model uncertain data. We show how this results in a rich, complex yet compact probabilistic database model, which can capture the commonly occurring uncertainty models (tuple uncertainty, attribute uncertainty) and more complex models (correlated tuples and attributes), and allows compact representation (shared and schema-level correlations). In addition, we show how query evaluation in PrDB translates into inference in an appropriately augmented graphical model. This allows us to easily use any of a myriad of exact and approximate inference algorithms developed within the graphical modeling community. While probabilistic inference provides a generic approach to solving queries, we show how the use of shared correlations, together with a novel inference algorithm that we developed based on bisimulation, can speed up query processing significantly. We present a comprehensive experimental evaluation of the proposed techniques and show that even with a few shared correlations, significant speedups are possible.
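As a toy illustration of how a query over uncertain tuples becomes probabilistic inference (PrDB itself runs inference on a graphical model; this brute-force possible-worlds sketch, with invented data, is only for intuition):

```python
# Hedged sketch (not PrDB's engine): the boolean query "does any R-S join
# result exist?" over tuples that each exist independently with probability p,
# answered exactly by enumerating possible worlds.
from itertools import product

R = {("alice", 1): 0.9, ("bob", 2): 0.5}   # (name, key) -> existence probability
S = {(1, "nyc"): 0.8, (2, "sfo"): 0.4}     # (key, city) -> existence probability

def p_join_nonempty():
    """P(at least one R-S join result), by possible-world enumeration."""
    r, s = list(R.items()), list(S.items())
    total = 0.0
    for rb in product([0, 1], repeat=len(r)):       # which R tuples exist
        for sb in product([0, 1], repeat=len(s)):   # which S tuples exist
            w = 1.0
            for b, (_, p) in zip(rb, r):
                w *= p if b else 1 - p
            for b, (_, p) in zip(sb, s):
                w *= p if b else 1 - p
            keys_r = {t[1] for b, (t, _) in zip(rb, r) if b}
            keys_s = {t[0] for b, (t, _) in zip(sb, s) if b}
            if keys_r & keys_s:                     # world has a join result
                total += w
    return total

print(round(p_join_nonempty(), 4))   # 0.776
```

For this two-key example the enumeration returns 0.776, matching the closed form 1 - (1 - 0.9*0.8)(1 - 0.5*0.4); PrDB's contribution is making such inference scale by exploiting shared correlations rather than enumerating worlds.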

4.
Sustainability policies to mitigate the energy impacts of transportation on the urban environment are urgently needed. Energy prediction models provide critical information to decision-makers who develop sustainability policies to reduce energy use and emissions. In this study, we present a transportation energy model (TEM) that uses Explainable Artificial Intelligence (XAI) methods to predict household transportation energy consumption. The TEM uses data-driven approaches for household transportation energy prediction. Machine learning techniques in artificial intelligence (AI) predictive modeling have become popular due to their ability to capture nonlinear and complex relationships. On the other hand, developing a comprehensive understanding of the inference mechanisms of AI models and ensuring trust in their predictions is challenging, because AI models are mostly of high complexity and low interpretability; in other words, they are black-box models. This study presents a case study of how model transparency and explanations can be generated using Local Interpretable Model-Agnostic Explanations (LIME) to support advanced machine learning techniques in the transportation energy field. The methodology is implemented on Household Travel Survey (HTS) data, which is used to train an artificial neural network with a relatively high degree of accuracy. The importance and effect (local explanation) of HTS inputs (such as household travel, demographics, and neighborhood data) on transportation energy consumption for specific traffic analysis zones (TAZs) are analyzed. The results are valuable for promoting intelligent and user-friendly transportation energy planning models in urban regions across the world.
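For readers unfamiliar with LIME, the hedged sketch below shows the general usage pattern of the open-source lime package on tabular data; the features, model, and data are invented stand-ins, not the paper's HTS inputs or TEM.

```python
# Hedged sketch of a LIME local explanation for a black-box regressor.
# Everything here (features, target, network size) is a made-up toy.
import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
features = ["trips_per_day", "household_size", "vehicles", "density"]
X = rng.uniform(0, 10, size=(1000, len(features)))
y = 3 * X[:, 0] + 2 * X[:, 2] - 0.5 * X[:, 3] + rng.normal(size=1000)  # toy "energy"

model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000).fit(X, y)

explainer = LimeTabularExplainer(X, feature_names=features, mode="regression")
exp = explainer.explain_instance(X[0], model.predict, num_features=4)
for rule, weight in exp.as_list():     # local feature importances for X[0]
    print(f"{rule:30s} {weight:+.3f}")
```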

5.
Artificial intelligence techniques, with their powerful learning and generalization capabilities, have been widely applied in a variety of real-world scenarios. However, existing AI techniques still face three major challenges. First, they have a high barrier to use: they rely on AI practitioners to select suitable models, design reasonable parameters, and write programs, and are therefore hard to apply widely in non-computing fields. Second, existing AI algorithms are inefficient to train, wasting large amounts of computing resources and even delaying decision-making. Third, existing A...

6.
This article describes recent jurisprudential accounts of analogical legal reasoning and compares them in detail to the computational model of case-based legal argument in CATO. The jurisprudential models provide a theory of relevance based on low-level legal principles generated in a process of case-comparing reflective adjustment. The jurisprudential critique focuses on the problems of assigning weights to competing principles and dealing with erroneously decided precedents. CATO, a computerized instructional environment, employs Artificial Intelligence techniques to teach law students how to make basic legal arguments with cases. The computational model helps students test legal hypotheses against a database of legal cases, draws analogies to problem scenarios from the database, and composes arguments by analogy with a set of argument moves. The CATO model accounts for a number of the important features of the jurisprudential accounts, including implementing a kind of reflective adjustment. It also avoids some of the problems identified in the critique; for instance, it deals with weights in a non-numeric, context-sensitive manner. The article concludes by describing the contributions AI research can make to jurisprudential investigations of complex cognitive phenomena of legal reasoning. For instance, unlike the jurisprudential models, CATO provides a detailed account of how to generate multiple interpretations of a cited case, downplaying or emphasizing the legal significance of distinctions in terms of the purposes of the law as the argument context demands.

7.
毛晓岚  陈松 《软件》2011,32(9):38-42,44
This paper studies how multi-agent technology can be applied to distributed data mining. By analyzing the logs of Web servers with a typical distributed structure, it designs the architecture of a Web log mining system and the specific design of each agent. Combining multi-agent technology with Web log mining makes the design of the data mining system clearer on one hand, and on the other hand makes full use of multi-agent technology to improve mining efficiency, thereby exploring the theoretical significance and practical value of agent-based distributed data mining systems.

8.
Database middleware technology and its implementation in a three-tier client/server model   (total citations: 15; self-citations: 0; other citations: 15)
This paper studies and analyzes the working principles and the respective advantages and disadvantages of common database middleware. On this basis, it presents and discusses a concrete application in a three-tier client/server environment and analyzes in detail the key technologies and design ideas it adopts.

9.
Our goal is to make the data in the database more accessible to end-users. We introduce a new data model, called MIX, which incorporates concepts from semantic data models, such as entities, ISA, aggregation, characteristic properties, and shared property hierarchies, into the universal relation data model (URM) to model a database in a top-down, modular way. We then show how a consistent URM-style language can be provided for both database retrievals and database updates. The database schema graph of this model further enables us to implement an integrated graphical user interface for the database system design life cycle.

10.
Modelling is an integral part of engineering processes. Consequently, database design for engineering applications should take into account the modelling concepts used by engineers. On the other hand, these applications exhibit a wide diversity of modelling concepts. Rather than consolidating these into one single semantic data model, one should aim for correspondingly specialized semantic models. This paper takes a constructive approach to developing such specialized models by proposing an Extensible Semantic Model (ESM) as the basis for declaring specialized semantic data models. The paper introduces a computerized environment for database design based on an ESM and discusses the consequences of the ESM for a number of design tools: the need for a formal definition of the notion of a modelling concept in order to have a reliable and precise foundation for the extensions, declarative techniques for quickly introducing graphical representations for new concepts and for using them during schema design, conceptual-level test data generation for a designer-oriented evaluation of designs, and optimization techniques to control the wide latitude in mapping a conceptual schema to a logical schema. First experiences point to considerable productivity gains during database design.

11.
Since the release of ChatGPT, generative artificial intelligence has continually broken through bottlenecks, attracting large-scale capital investment, driving revolutions across many fields, and drawing close government attention. This paper first analyzes the development trends, application status, and prospects of large models, and then briefly introduces large-model technology from three aspects: 1) an overview of large-model construction techniques, including the construction pipeline, research status, and optimization techniques; 2) a summary of three mainstream image-text multimodal techniques for large models; 3) an introduction to three categories of large-model evaluation benchmarks, classified by evaluation method. Parameter optimization and dataset construction are core issues for the adoption and iteration of large-model products; multimodal capability is one of the important directions for large-model development; and establishing evaluation benchmarks is a key method for comparing and constraining large models. In addition, the paper discusses the challenges facing existing technologies and possible future directions. Current large-model products already exhibit strong understanding and creative capabilities and show broad application prospects in fields such as education, healthcare, and finance. At the same time, they suffer from difficult training and deployment, insufficient domain expertise, and safety risks. Therefore, improving parameter optimization, high-quality dataset construction, and multimodal techniques, and establishing unified, comprehensive, and convenient evaluation benchmarks, will be key to overcoming the current limitations of large models.

12.
The jABC is a framework for process modelling and execution according to the XMDD (eXtreme model-driven design) paradigm, which advocates the rigorous use of user-level models in the software development process and software life cycle. We have used the jABC in the domain of scientific workflows for more than a decade now—an occasion to look back and take stock of our experiences in the field. On the one hand, we discuss results from the analysis of a sample of nearly 100 scientific workflow applications that have been implemented with the jABC. On the other hand, we reflect on our experiences and observations regarding the workflow development process with the framework. We then derive and discuss ongoing further developments and future perspectives for the framework, all with an emphasis on simplicity for end users through increased domain specificity. Concretely, we describe how the use of the PROPHETS synthesis plugin can enable a semantics-based simplification of the workflow design process, how with the jABC4 and DyWA frameworks more attention is paid to the ease of data management, and how the Cinco SCCE Meta-Tooling Suite can be used to generate tailored workflow management tools.

13.
Bidirectional mining of spatial association rules   (total citations: 9; self-citations: 0; other citations: 9)
Association rule mining in spatial databases must consider not only the relationships among the attributes of relational tuples (vertical relationships) but also the relationships among the tuples themselves (horizontal relationships), such as adjacency, intersection, and overlap. By analyzing the storage model of spatial databases and drawing on association rule mining methods for transactional databases, this paper gives a complete definition of spatial association rules and discusses interestingness measures for such rules. According to the direction of mining, spatial data mining is classified into vertical mining, horizontal mining, and bidirectional mining. For bidirectional mining, a new algorithm is proposed: it constrains the search according to the mining task to shrink the mining space, converts spatial relationships into non-spatial ones through spatial computation, and, after several iterations, obtains non-spatial itemsets from which spatial association rules are mined. A workflow for bidirectional spatial data mining is proposed and validated with an example.
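A hedged illustration of the core idea, converting spatial relationships into transaction items and then mining frequent itemsets; the objects, predicate, and thresholds below are invented, and the support counting is brute force rather than the paper's constrained algorithm.

```python
# Hedged sketch: materialize a spatial predicate ("close to") as transaction
# items, then count frequent itemsets. Geometry is simplified to labeled points.
from itertools import combinations

# Hypothetical spatial objects: (id, type, x, y)
objects = [
    ("s1", "school", 0, 0), ("p1", "park", 1, 0),
    ("s2", "school", 5, 5), ("p2", "park", 5, 6),
    ("s3", "school", 9, 9), ("m1", "mall", 9, 8),
]

def close_to(a, b, eps=1.5):
    # Toy stand-in for a real spatial relation such as adjacency.
    return abs(a[2] - b[2]) + abs(a[3] - b[3]) <= eps

# One transaction per school: items are the spatial predicates it satisfies.
transactions = []
for o in objects:
    if o[1] != "school":
        continue
    items = {f"close_to({n[1]})" for n in objects if n is not o and close_to(o, n)}
    transactions.append(items)

# Frequent itemsets by brute-force support counting (Apriori would prune).
all_items = sorted(set().union(*transactions))
for k in (1, 2):
    for cand in combinations(all_items, k):
        support = sum(set(cand) <= t for t in transactions) / len(transactions)
        if support >= 0.5:
            print(set(cand), "support =", round(support, 2))
```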

14.
朱涛  郭进伟  周欢  周烜  周傲英 《软件学报》2018,29(1):131-149
As applications grow in data volume and business load, standalone database systems increasingly struggle to meet real-world demands. Distributed databases can scale elastically with business needs and have therefore gained favor with applications. In recent years, distributed database products have emerged in large numbers and are widely deployed in Internet applications. However, the system complexity of distributed databases is unprecedented: to make a system usable, designers must choose and trade off among multiple properties, which is why existing products differ so much in form and contrast so sharply in their strengths and weaknesses. To date, no one has thoroughly analyzed and organized the design space and trade-offs of distributed databases. After studying several distributed database products in depth, the authors conclude that the design of a distributed database system can be characterized by three properties: operation consistency, transaction consistency, and system availability. Although these properties are not new, their meanings in the database context have not been adequately clarified in the literature. This paper clarifies the three properties, uses them to summarize the landscape of typical database products, and surveys existing distributed database technology. It also analyzes the interrelationships among the three properties in depth, in the hope of helping future developers make sound choices when designing distributed databases.

15.
With the actual penetration of expert systems into the business world, the question is how the expert-system idea can be used to enhance existing information systems with more intelligence in usage and operation. This interest is not surprising given the advancement of fifth-generation computer technology and the avid interest in the field of Artificial Intelligence. The design of an information system for an application therefore becomes more complex, and the difficulty for a human designer of dealing with it increases. For designing intelligent systems, we have to be able to forecast the behavior of the information system more precisely before implementing it; that is, we have to support the specification process. Clearly, technologies such as database systems lead on efficiency issues, such as those needed for the construction, retrieval, and manipulation of large shared databases. On the other hand, AI techniques have improved significantly in functions such as deductive reasoning and natural language processing. It is important to find ways to merge these technologies into one mainstream of computing. A meeting point for the two areas is the issue of conceptual knowledge modelling, so that models can be created that define the role of data and the ways to use it in AI systems. In the framework of this study, a possible expert-system design-aid environment is suggested to assist the designer in his work. In a conceptual modelling environment, a model is given for analysing complex real-world problems, known as the Conceptual Knowledge Model (CKM), represented by a graphical and a formal representation. The graphical representation consists of three graphs: a Conceptual Requirement Graph, a Conceptual Behavior Graph, and a Conceptual Structure Graph. These graphs are developed by involving the expert during the design process. The graphs are then transformed into first-order predicate logic to represent the logical axioms of a theory, which constitutes the knowledge base of the expert system. The model suggested here is a step towards closing the gap between conventional database theory and AI databases.

16.
Three-database separation in the database design of real-estate information systems   (total citations: 4; self-citations: 0; other citations: 4)
石伟伟  谭秀娟 《计算机工程》2006,32(5):58-59,130
Three-database separation is a practical database optimization technique developed during the database design of real-estate information systems. In physical database design, the different life-cycle stages of the same object are partitioned into a formal database, a working database, and a historical database. This paper analyzes the application background, definition, and corresponding database design method of the technique, and summarizes its advantages in efficiency and data security as well as its principles of application. Finally, its practical use in the Hangzhou real-estate management information system is described.

17.
DB2 is a large relational database system developed by IBM that offers high efficiency and good security for storing large volumes of data. This paper studies DB2-based database design and several database optimization techniques, and examines the rules for database design and optimization.

18.
《Knowledge》2007,20(4):382-387
Michael Polanyi’s idea of tacit knowing and Martin Heidegger’s concept of pre-theoretical shared practice are presented as providing a strong rationale for the notion of practice-based knowledge. Artificial Intelligence (AI) approaches such as Artificial Neural Networks (ANN), Case-Based Reasoning (CBR) and Grounded Theory (with Interval Probability Theory) are able to model these philosophical concepts related to practice-based knowledge. The AI techniques appropriate for modeling Polanyi’s and Heidegger’s ideas should be founded on a connectionist rather than a cognitivist paradigm. Examples from engineering practice are used to demonstrate how the above techniques can capture, structure and make available such knowledge to practitioners.

19.
Compression can sometimes improve performance by making more of the data available to the processors faster. We consider the compression of integer keys in a B+-tree index. For this purpose, systems such as IBM DB2 use variable-byte compression over differentially coded keys. We revisit this problem with various compression alternatives such as Google's VarIntGB, Binary Packing and Frame-of-Reference. In all cases, we describe algorithms that can operate directly on compressed data. Many of our alternatives exploit the single-instruction-multiple-data (SIMD) instructions supported by modern CPUs. We evaluate our techniques in a database environment provided by Upscaledb, a production-quality key-value database. Our best techniques are SIMD accelerated: they simultaneously reduce memory usage while improving single-threaded speeds. In particular, a differentially coded SIMD binary-packing technique (BP128) can offer superior query speed (e.g., 40% better than an uncompressed database) while providing the best compression (e.g., by a factor of ten). For analytic workloads, our fast compression techniques offer compelling benefits. Our software is available as open source.
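As background for the baseline this paper revisits, here is a minimal sketch of variable-byte coding over differentially coded (delta) keys. The terminator convention below (high bit set on the last byte of each value) is one common varbyte variant, not necessarily the exact format used by DB2 or Upscaledb, and it omits the SIMD acceleration that the paper's techniques rely on.

```python
# Hedged sketch: delta coding + variable-byte coding of sorted integer keys.
# Each delta is emitted as 7-bit groups, low bits first; the last byte of a
# value is marked by setting its high bit.
def encode(sorted_keys):
    out, prev = bytearray(), 0
    for k in sorted_keys:
        delta = k - prev          # gaps between sorted keys are small
        prev = k
        while delta >= 0x80:
            out.append(delta & 0x7F)
            delta >>= 7
        out.append(delta | 0x80)  # terminator byte: high bit set
    return bytes(out)

def decode(buf):
    keys, prev, cur, shift = [], 0, 0, 0
    for b in buf:
        cur |= (b & 0x7F) << shift
        if b & 0x80:              # last byte of this delta
            prev += cur           # undo the delta coding
            keys.append(prev)
            cur, shift = 0, 0
        else:
            shift += 7
    return keys

keys = [5, 1005, 1006, 123_456_789]
buf = encode(keys)
assert decode(buf) == keys
print(f"{len(keys) * 8} raw bytes -> {len(buf)} compressed")
```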

20.
New technologies are transforming medicine, and this revolution starts with data: health data, clinical images, genome sequences, data on prescribed therapies and the results obtained, data that each of us has helped to create. Although the first uses of artificial intelligence (AI) in medicine date back to the 1980s, it is only since the beginning of the new millennium that there has been an explosion of interest in this sector worldwide. We are witnessing exponential growth of health-related information, with the result that traditional analysis techniques are not suitable for satisfactory management of this vast amount of data. AI applications (especially Deep Learning), on the other hand, are naturally suited to cope with this explosion of data, as they work better as the amount of training data increases, and training is the phase necessary to build the optimal neural network for a given clinical problem. This paper proposes a comprehensive and in-depth study of Deep Learning methodologies and applications in medicine. An in-depth analysis of the literature is presented; how, where and why Deep Learning models are applied in medicine is discussed and reviewed. Finally, current challenges and future research directions are outlined and analysed.
