
1.

FunctionFlow: coordinating parallel tasks

Xuepeng FAN Xiaofei LIAO Hai JIN 《Frontiers of Computer Science》2019,13(1):73

Despite the growing popularity of task-based parallel programming, today's task-parallel programming libraries and languages still offer limited support for coordinating parallel tasks. This limitation forces programmers to use additional independent components to coordinate the tasks; the components can be third-party libraries or additional components in the same programming library or language. Moreover, mixing tasks and coordination components increases the difficulty of task-based programming and blinds schedulers to tasks' dependencies. In this paper, we propose a task-based parallel programming library, FunctionFlow, which coordinates tasks without requiring additional independent coordination components. First, we use dependency expressions to represent the ubiquitous patterns of task termination. The key idea behind dependency expressions is to use && for the termination of both tasks and || for the termination of any task, along with combinations of dependency expressions. Second, as runtime support, we use a lightweight representation for dependency expressions. We also use a suspended-task queue to schedule tasks that still have prerequisites to run. Finally, we demonstrate FunctionFlow's effectiveness in two ways: a case study implementing popular parallel patterns with FunctionFlow, and a performance comparison with the state-of-the-art practice, TBB. Our demonstration shows that FunctionFlow can generally coordinate parallel tasks without involving additional components, with performance comparable to TBB.
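The dependency-expression idea in this abstract can be illustrated with a small sketch. This is not FunctionFlow's API (FunctionFlow is a separate library whose interface is not shown here); the names `Dep`, `task`, and `run` are hypothetical, Python's `&` and `|` operators stand in for `&&` and `||`, and the scheduler is single-threaded for clarity. Tasks whose prerequisites are not yet met go back onto a suspended-task queue.

```python
from collections import deque


class Dep:
    """Dependency expression over task names (hypothetical, not FunctionFlow's API)."""

    def __init__(self, kind, parts):
        self.kind = kind      # "task", "all", or "any"
        self.parts = parts    # a task name for "task", else a list of Dep

    def __and__(self, other):
        # a & b: satisfied once both sub-expressions are satisfied
        return Dep("all", [self, other])

    def __or__(self, other):
        # a | b: satisfied once either sub-expression is satisfied
        return Dep("any", [self, other])

    def satisfied(self, finished):
        if self.kind == "task":
            return self.parts in finished
        check = all if self.kind == "all" else any
        return check(p.satisfied(finished) for p in self.parts)


def task(name):
    return Dep("task", name)


def run(tasks):
    """tasks maps name -> (dependency or None, callable). Returns the run order."""
    finished, order = set(), []
    suspended = deque(tasks.items())
    stalled = 0
    while suspended:
        name, (dep, fn) = suspended.popleft()
        if dep is None or dep.satisfied(finished):
            fn()
            finished.add(name)
            order.append(name)
            stalled = 0
        else:
            # Prerequisites not met yet: park the task and try again later.
            suspended.append((name, (dep, fn)))
            stalled += 1
            if stalled > len(suspended):
                raise RuntimeError("unsatisfiable dependencies")
    return order


noop = lambda: None
order = run({
    "join": (task("a") & task("b"), noop),   # waits for both a and b
    "first": (task("a") | task("c"), noop),  # waits for either a or c
    "a": (None, noop),
    "b": (None, noop),
    "c": (None, noop),
})
print(order)
```

Because the expressions are ordinary objects, they compose: `task("a") & task("b") | task("c")` builds a nested expression without any separate coordination component.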

2.

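The "Database Principal Component Extraction" Method and Its Applications   Total citations: 2, self-citations: 0, citations by others: 2

Xia Jiaoxiong (夏骄雄), Xu Jun (徐俊), Wu Gengfeng (吴耿锋) 《Computer Engineering and Applications》 (计算机工程与应用) 2006,42(20):134-137,202

The rich and useful information held in very large databases is being analyzed and mined ever more thoroughly as data mining technology develops. The data warehouse is an important platform for data mining, and its quality directly affects mining efficiency. Building the data warehouse is one of the main goals of data preprocessing. The "database principal component extraction" method applies dimensionality reduction with minimal information loss: it summarizes the original multivariate database with a small number of composite variables, so that the rebuilt data warehouse holds relatively less data while the probability distribution of the data classes stays as close as possible to the original distribution over all attributes. Data mining on the rebuilt warehouse therefore becomes easier to carry out and more efficient. Interpreting the extracted principal components further clarifies the main factors that shape the data warehouse as a whole and the main characteristics of the data warehouse system.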

3.

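An Extended Java Collections Framework Based on an Object Database

Lu Deng (陆登), Li Shanping (李善平), Zheng Chunzhao (郑春昭) 《Computer Applications and Software》 (计算机应用与软件) 2011,28(1)

The conventional approach to data persistence stores objects in a relational database through object-relational mapping, but ease of use and efficiency have long been problems. This paper proposes a more convenient and better-performing approach: storing massive data in an object database. Object databases are not yet widely used, however, and many programmers may be unfamiliar with them. The proposed extended Java Collections Framework, built on an object database, lets programmers work with the object database as conveniently as with the ordinary Java Collections Framework, and it also outperforms an ordinary relational database.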

4.

Effective timestamping in databases   Total citations: 3, self-citations: 0, citations by others: 3

Kristian Torp Christian S. Jensen Richard T. Snodgrass 《The VLDB Journal The International Journal on Very Large Data Bases》2000,8(3-4):267-288

Many existing database applications place various timestamps on their data, rendering temporal values such as dates and times prevalent in database tables. During the past two decades, several dozen temporal data models have appeared, all with timestamps being integral components. The models have used timestamps for encoding two specific temporal aspects of database facts, namely transaction time, when the facts are current in the database, and valid time, when the facts are true in the modeled reality. However, with few exceptions, the assignment of timestamp values has been considered only in the context of individual modification statements. This paper takes the next logical step: It considers the use of timestamping for capturing transaction and valid time in the context of transactions. The paper initially identifies and analyzes several problems with straightforward timestamping, then proceeds to propose a variety of techniques aimed at solving these problems. Timestamping the results of a transaction with the commit time of the transaction is a promising approach. The paper studies how this timestamping may be done using a spectrum of techniques. While many database facts are valid until now, the current time, this value is absent from the existing temporal types. Techniques that address this problem using different substitute values are presented. Using a stratum architecture, the performance of the different proposed techniques is studied. Although querying and modifying time-varying data is accompanied by a number of subtle problems, we present a comprehensive approach that provides application programmers with simple, consistent, and efficient support for modifying bitemporal databases in the context of user transactions. Received: March 11, 1998 / Accepted: July 27, 1999
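The commit-time approach mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not one of the paper's techniques: `Table`, `Txn`, and the field name `tt_start` are hypothetical, and an integer counter stands in for the database clock. Rows written inside a transaction carry a placeholder until commit, at which point every row receives the same transaction-time stamp.

```python
import itertools


class Table:
    """Toy table whose rows carry a transaction-time start stamp (hypothetical)."""

    def __init__(self):
        self.rows = []
        self._clock = itertools.count(1)  # stand-in for the database clock

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, table):
        self.table = table
        self.pending = []

    def insert(self, value):
        # The commit time is not known yet, so leave a placeholder.
        self.pending.append({"value": value, "tt_start": None})

    def commit(self):
        commit_time = next(self.table._clock)
        # Stamp every row written by this transaction with the same commit time.
        for row in self.pending:
            row["tt_start"] = commit_time
        self.table.rows.extend(self.pending)
        self.pending = []
        return commit_time


table = Table()
txn = table.begin()
txn.insert("a")
txn.insert("b")
commit_time = txn.commit()
print(table.rows)
```

Stamping at commit keeps all modifications of one transaction consistent with each other, at the cost of revisiting the rows once the commit time is known.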

5.

Identifying novel biomarkers through data mining—A realistic scenario?

Johannes Griss Yasset Perez-Riverol Henning Hermjakob Juan Antonio Vizcaíno 《Proteomics. Clinical applications》2015,9(3-4):437-443

In this article we discuss the requirements for using data mining of published proteomics datasets to assist proteomics-based biomarker discovery and the use of external data integration to address inadequately small sample sizes. Finally, we try to estimate the probability that new biomarkers will be identified through data mining alone.

6.

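Key Technologies of a CAD/CAM Platform for NC Electrochemical Machining of Integral Components

Wang Fuyuan (王福元), Xu Jiawen (徐家文), Zhao Jianshe (赵建社) 《Journal of Computer-Aided Design & Computer Graphics》 (计算机辅助设计与图形学学报) 2010,22(6)

To address problems in the NC electrochemical machining of integral components, a CAD/CAM platform integrated through an engineering database was built using database, finite element, and numerical control technologies. Key technologies of the platform were solved, including modeling, numerical simulation of the machining process, NC motion simulation, NC machining programming, and parameter optimization. The platform was applied to the electrochemical machining of complex parts such as integral impellers and compressor stators. The results show that the platform shortens the trial-production cycle of integral components and raises the success rate of electrochemical machining experiments.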

7.

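Research and Implementation of a Distributed Heterogeneous Database Integration System   Total citations: 1, self-citations: 0, citations by others: 1

Xu Aiping (徐爱萍), Song Xianming (宋先明), Xu Wuping (徐武平) 《Computer Engineering & Science》 (计算机工程与科学) 2015,37(10):1909-1916

For historical reasons and because database technology keeps evolving, many organizations have accumulated, and will keep accumulating, large volumes of heterogeneous data. The heterogeneity lies mainly in differing database types and data structures. This study addresses the problem using the distributed heterogeneous water environment and hydrology databases of the Three Gorges Reservoir area as an example. Based on an analysis of the water environment and hydrological data requirements, a data exchange architecture and a data sharing platform were built. A heterogeneous multi-source database engine middleware handles data exchange between different kinds of databases; a batch import mode handles the exchange of large volumes of historical data; and data catalog registration makes the integrated platform convenient and general to manage and use. The heterogeneous multi-source database engine connects easily to today's mainstream databases and, based on Web Services technology, also connects to Web data interfaces. The results can meet heterogeneous data integration needs in different application environments.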

8.

Mining a large database with a parallel database server

《Intelligent Data Analysis》1999,3(6):437-451

Data mining is a data-intensive computation activity. Parallel processing has often been used in data mining algorithms. However, when data do not fit in memory, some solutions do not apply and a database system may be required rather than flat files. Most of the implementations use the database system loosely coupled with the data mining techniques. Hence, the database system only issues queries to be processed on the client machine. In this work, we address the data-consuming activities through parallel processing on a database server providing a tight integration with data mining techniques. Experimental results showing the potential benefits of this integration were obtained. Despite the difficulties in processing a complex application, we extracted rules and obtained high performance on all the data-intensive activities such as the construction of the decision tree, pruning and rule extraction.

9.

Metric Analysis and Data Validation Across Fortran Projects

《IEEE transactions on pattern analysis and machine intelligence》1983,(6):652-663

The desire to predict the effort in developing or explain the quality of software has led to the proposal of several metrics in the literature. As a step toward validating these metrics, the Software Engineering Laboratory has analyzed the Software Science metrics, cyclomatic complexity, and various standard program measures for their relation to 1) effort (including design through acceptance testing), 2) development errors (both discrete and weighted according to the amount of time to locate and fix), and 3) one another. The data investigated are collected from a production Fortran environment and examined across several projects at once, within individual projects and by individual programmers across projects, with three effort reporting accuracy checks demonstrating the need to validate a database. When the data come from individual programmers or certain validated projects, the metrics' correlations with actual effort seem to be strongest. For modules developed entirely by individual programmers, the validity ratios induce a statistically significant ordering of several of the metrics' correlations. When comparing the strongest correlations, neither Software Science's E metric, cyclomatic complexity nor source lines of code appears to relate convincingly better with effort than the others.

10.

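Database Design for an Automobile Test Integration System Based on a Tree Structure

Hong Wei (洪伟), Wu Yun (吴云), Zhou Guoxiang (周国祥) 《Microcomputer Development》 (微机发展) 2008,18(2):177-179

Because automobile product test items keep growing and changing, the system database must be extended frequently, which makes the system hard for users to operate and maintain. This paper introduces a tree structure into the database design and discusses its definition and design method in detail. The method is simple and intuitive, makes data easy to organize, and simplifies the database design process. It has achieved good results in the Anhui Jianghuai Automobile test integration system.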

11.

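Database Design for an Automobile Test Integration System Based on a Tree Structure   Total citations: 1, self-citations: 0, citations by others: 1

Hong Wei (洪伟), Wu Yun (吴云), Zhou Guoxiang (周国祥) 《Computer Technology and Development》 (计算机技术与发展) 2008,18(2):177-179,183

Because automobile product test items keep growing and changing, the system database must be extended frequently, which makes the system hard for users to operate and maintain. This paper introduces a tree structure into the database design and discusses its definition and design method in detail. The method is simple and intuitive, makes data easy to organize, and simplifies the database design process. It has achieved good results in the Anhui Jianghuai Automobile test integration system.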

12.

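Research and Application of Data Integration and Data Mining Technology in Medical Insurance Information Systems

Jian Weiguang (简伟光) 《Computer and Microelectronics Technology》 (电脑与微电子技术) 2010,(10):47-51

Based on an analysis of the needs arising from the deepening informatization of medical insurance management, this paper proposes, from a technical perspective, an overall solution for data integration and data mining in medical insurance information systems. It outlines the data warehouse design, the data integration scheme, and the data mining techniques and applications, and uses an association rule mining algorithm to study empirically whether mining medical insurance information is feasible and necessary. Using encoding and decoding techniques and SQL aggregate functions, it implements an SQL-based FP-Growth algorithm, which overcomes the limit that machine memory places on mining efficiency and enables efficient mining of massive data.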

13.

Database technology for decision support systems

《Computer》2001,34(12):48-55

Decision support systems form the core of business IT infrastructures because they let companies translate business information into tangible and lucrative results. Collecting, maintaining, and analyzing large amounts of data, however, involves expensive technical challenges that require organizational commitment. Many commercial tools are available for each of the three major data warehousing tasks: populating the data warehouse from independent operational databases, storing and managing the data, and analyzing the data to make intelligent business decisions. Data cleaning relates to heterogeneous data integration, a problem studied for many years. More work must be done to develop domain-independent tools that solve the data cleaning problems associated with data warehouse development. Most data mining research has focused on developing algorithms for building more accurate models or building models faster. However, data preparation and mining model deployment present several engaging problems that relate specifically to achieving better synergy between database systems and data mining technology.

14.

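Research on Mechanical Equipment Fault Diagnosis Based on Data Mining   Total citations: 1, self-citations: 0, citations by others: 1

Chu Jianli (褚建立), Chen Buying (陈步英) 《Microcomputer Information》 (微计算机信息) 2007,23(19):208-209,171

As information technology develops, the means of collecting data grow richer and more sophisticated, so the accumulated fault data of mechanical equipment keeps expanding and high-dimensional data is becoming the norm. Selecting useful data from this mass of data and high-dimensional features for effective fault diagnosis has become difficult. With steadily improving computer performance and rapidly developing database technology, data mining emerged as a method that combines multiple analytical techniques to discover useful knowledge in large volumes of data, and it opens a path to solving this problem. This paper describes in detail the whole process of applying data mining to mechanical equipment fault diagnosis.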

15.

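Review and Prospects of Hydrological Data Mining Research in China   Total citations: 9, self-citations: 0, citations by others: 9

Ai Ping (艾萍), Ni Weixin (倪伟新) 《Computer Engineering and Applications》 (计算机工程与应用) 2003,39(28):13-17

Research in hydrological science faces uncertainty and incompletely known problems from many sources. Introducing the theory and techniques of data mining, in line with the needs of hydrological science and making full use of modern computer-based information technology, to study the theory, techniques, and methods of hydrological data mining offers a new approach to these problems. Hydrological data mining research is still at an early stage. Most work concentrates on simulating and processing individual items or local subsets of hydrological data; global, multi-factor mining based on hydrological databases has received little attention, and research on how well data mining techniques suit hydrological data remains insufficient. To realize the full potential of data mining for knowledge discovery, further research is needed on hydrological subject databases and multidimensional data cubes; on classification, clustering, and association rule mining techniques and optimization algorithms for hydrological series; and on similarity, periodicity, and other sequential pattern mining for hydrological series. The work should also move toward hydrological data mining software and data platforms.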

16.

Difference-list transformation for Prolog

Kim Marriott Harald Søndergaard 《New Generation Computing》1993,11(2):125-157

Difference-lists are terms that represent lists. The use of difference-lists can speed up most list-processing programs considerably. Prolog programmers routinely use “difference-list versions” of programs, but very little investigation has taken place into difference-list transformation. Thus, to most programmers it is either unknown that the use of difference-lists is far from safe in all contexts, or else this fact is known but attributed to Prolog’s infamous “occur check problem.” In this paper we study the transformation of list-processing programs into programs that use difference-lists. In particular we are concerned with finding circumstances under which the transformation is safe. We show that dataflow analysis can be used to determine whether the transformation is applicable to a given program, thereby allowing for automatic transformation. We prove that our transformation preserves strong operational equivalence. This paper is a revised and extended version of a paper¹⁰ that was presented to the International Computer Science Conference 88 in Hong Kong, December 1988.
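Why difference-lists speed up list processing can be seen in a functional analogue. The sketch below is not Prolog and not the paper's transformation: Prolog difference-lists rely on unbound logic variables, whereas here a list is represented, by assumption, as a function that prepends its elements to whatever tail it is given. The names `from_items`, `append`, and `to_list` are hypothetical. Appending becomes function composition, which takes constant time instead of copying the left operand.

```python
def from_items(items):
    """Represent a list as a function that prepends its items to a given tail."""
    items = tuple(items)

    def prepend(tail):
        # Build cons cells (head, tail) in front of the supplied tail.
        for x in reversed(items):
            tail = (x, tail)
        return tail

    return prepend


def append(xs, ys):
    """Constant-time append: compose the two prepend functions."""
    return lambda tail: xs(ys(tail))


def to_list(dl):
    """Close the open tail with None and walk the resulting cons cells."""
    out, cell = [], dl(None)
    while cell is not None:
        head, cell = cell
        out.append(head)
    return out


joined = append(append(from_items([1, 2]), from_items([3])), from_items([4, 5]))
print(to_list(joined))
```

The open tail plays the role of the unbound variable at the end of a Prolog difference-list; the elements are only linked together once, when the tail is finally closed.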

17.

Intelligent assistance for software development and maintenance   Total citations: 2, self-citations: 0, citations by others: 2

Kaiser G.E. Feiler P.H. Popovich S.S. 《Software, IEEE》1988,5(3):40-49

An environment is described, called Professor Marvel, that provides early error checking and answers questions about the program under development. The environment has a certain understanding of the systems being developed and how to use tools to produce software. It aids individual programmers and helps coordinate programming teams. The key components of Marvel are a database that stores data represented as objects, as in object-oriented languages, and a model of the development process that imposes a structure on programming activities. Marvel's support of insight and of opportunistic processing is discussed at length, as is the handling of side effects. A sample session is described.

18.

Anonymity preserving pattern discovery   Total citations: 5, self-citations: 0, citations by others: 5

Maurizio Atzori Francesco Bonchi Fosca Giannotti Dino Pedreschi 《The VLDB Journal The International Journal on Very Large Data Bases》2008,17(4):703-727

It is generally believed that data mining results do not violate the anonymity of the individuals recorded in the source database. In fact, data mining models and patterns, in order to ensure a required statistical significance, represent a large number of individuals and thus conceal individual identities: this is the case of the minimum support threshold in frequent pattern mining. In this paper we show that this belief is ill-founded. By shifting the concept of k-anonymity from the source data to the extracted patterns, we formally characterize the notion of a threat to anonymity in the context of pattern discovery, and provide a methodology to efficiently and effectively identify all such possible threats that arise from the disclosure of the set of extracted patterns. On this basis, we obtain a formal notion of privacy protection that allows the disclosure of the extracted knowledge while protecting the anonymity of the individuals in the source database. Moreover, in order to handle the cases where the threats to anonymity cannot be avoided, we study how to eliminate such threats by means of pattern (not data!) distortion performed in a controlled way.
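A simplified example shows how frequent patterns, each covering many individuals, can still expose a small group. The abstract does not spell out the detection method, so the sketch below is an assumption-laden illustration rather than the paper's algorithm: the function name `inference_channels` and the data are hypothetical, and only the simplest case is checked, where one released itemset extends another by a single item. Subtracting the two supports gives the number of individuals who match the smaller pattern but lack the extra item; if that number is positive and below k, the released patterns identify fewer than k people.

```python
def inference_channels(supports, k):
    """Find pairs of released itemsets whose support difference is in (0, k).

    supports maps frozenset(itemset) -> support count.
    Only pairs where the larger itemset adds exactly one item are checked.
    """
    threats = []
    for small, sup_small in supports.items():
        for big, sup_big in supports.items():
            if small < big and len(big) == len(small) + 1:
                # Individuals matching `small` but missing the extra item.
                hidden = sup_small - sup_big
                if 0 < hidden < k:
                    threats.append((sorted(small), sorted(big), hidden))
    return sorted(threats)


supports = {
    frozenset({"a"}): 100,
    frozenset({"b"}): 150,
    frozenset({"a", "b"}): 98,
}
threats = inference_channels(supports, k=5)
print(threats)
```

In this made-up data every released pattern has support of at least 98, yet the difference between `{a}` and `{a, b}` reveals that exactly 2 individuals have `a` without `b`.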

19.

ILX: Extending the .NET Common IL for Functional Language Interoperability

Don Syme 《Electronic Notes in Theoretical Computer Science》2001,59(1):53-72

This paper describes several extensions to the .NET Common Intermediary Language (CIL), each of which is designed to enable easier implementation of typed high-level programming languages on the .NET platform, and to promote closer integration and interoperability between these languages. In particular we aim for easier interoperability between components whose interfaces are expressed using function types, discriminated unions and parametric polymorphism, regardless of the languages in which these components are implemented. We show that it is possible to add these constructs to an existing, “real world” intermediary language and that this allows corresponding subsets of constructs to be compiled uniformly, which in turn will allow programmers to use these constructs seamlessly between different languages. In this paper we discuss the motivations for our extensions, which are together called Extended IL (ILX), and describe them via examples. In this setting, many of the traditional responsibilities of the backend of a compiler must be moved to ILX and the execution environment, in particular those related to representation choices and low-level optimizations. We have modified a Haskell compiler to generate this language, and have implemented an assembler that translates the extensions to regular or polymorphic CIL code. I am very grateful to Nick Benton, Cedric Fournet, Andrew Kennedy, Andy Gordon, Simon Peyton Jones, Claudio Russo, Reuben Thomas, Andrew Tolmach and the anonymous referees for their help and advice with this work.

20.

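An Information Integration Method for CIMS Based on Data Mining Technology

Qin Guofeng (秦国锋), Li Qiyan (李启炎) 《Computer Engineering》 (计算机工程) 2003,29(15):37-39,97

This paper analyzes the current state of CIMS projects and identifies the causes of information islands in CIMS systems. It proposes a method for CIMS information integration based on data mining: building on the existing databases and using network and data mining technology, a data mining system is established. Through drill-down processing, roll-up submission, and the extraction and fusion of data, the relevant fuzzy theory models and implementation algorithms are built, which solves the integration of capital flow, material flow, and information flow in CIMS systems fairly well.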