
1.

Fast computation of spatial selections and joins using graphics hardware

Nagender Bandi Chengyu Sun Divyakant Agrawal Amr El Abbadi 《Information Systems》2007

Spatial database operations are typically performed in two steps. In the filtering step, indexes and the minimum bounding rectangles (MBRs) of the objects are used to quickly determine a set of candidate objects. In the refinement step, the actual geometries of the objects are retrieved and compared to the query geometry or each other. Because of the complexity of the computational geometry algorithms involved, the CPU cost of the refinement step is usually the dominant cost of the operation for complex geometries such as polygons. Although many run-time and pre-processing-based heuristics have been proposed to alleviate this problem, the CPU cost still remains the bottleneck. In this paper, we propose a novel approach to address this problem using the efficient rendering and searching capabilities of modern graphics hardware. This approach does not require expensive pre-processing of the data or changes to existing storage and index structures, and is applicable to both intersection and distance predicates. We evaluate this approach by comparing the performance with leading software solutions. The results show that by combining hardware and software methods, the overall computational cost can be reduced substantially for both spatial selections and joins. We integrated this hardware/software co-processing technique into a popular database to evaluate its performance in the presence of indexes, pre-processing and other proprietary optimizations. Extensive experimentation with real-world data sets shows that the hardware-accelerated technique not only outperforms the run-time software solutions but also performs as well as, if not better than, pre-processing-assisted techniques.
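
The two-step processing this abstract describes (filter on MBRs, then refine on exact geometry) can be illustrated with a small software-only sketch. It is not the paper's graphics-hardware method; the function names (`mbr`, `point_in_polygon`, `select_containing`) and the choice of a point-in-polygon query are assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # vertices in order, first vertex not repeated


def mbr(poly: Polygon) -> Tuple[float, float, float, float]:
    """Minimum bounding rectangle as (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(p: Point, poly: Polygon) -> bool:
    """Exact test by ray casting (points lying on an edge get no special handling)."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def select_containing(p: Point, polygons: List[Polygon]) -> List[int]:
    """Indices of polygons containing p, found by filtering and then refining."""
    # Filtering step: a cheap rectangle test produces the candidate set.
    candidates = []
    for i, poly in enumerate(polygons):
        xmin, ymin, xmax, ymax = mbr(poly)
        if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax:
            candidates.append(i)
    # Refinement step: the expensive exact test runs on candidates only.
    return [i for i in candidates if point_in_polygon(p, polygons[i])]
```

For an L-shaped polygon, a point in the notch passes the MBR filter but is rejected by the refinement step, which is exactly the case where refinement cost is spent.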

2.

Tree-based partition querying: a methodology for computing medoids in large spatial datasets

Kyriakos Mouratidis Dimitris Papadias Spiros Papadimitriou 《The VLDB Journal The International Journal on Very Large Data Bases》2008,17(4):923-945

Besides traditional domains (e.g., resource allocation, data mining applications), algorithms for medoid computation and related problems will play an important role in numerous emerging fields, such as location-based services and sensor networks. Since the k-medoid problem is NP-hard, all existing work deals with approximate solutions on relatively small datasets. This paper aims at efficient methods for very large spatial databases, motivated by: (1) the high and ever-increasing availability of spatial data, and (2) the need for novel query types and improved services. The proposed solutions exploit the intrinsic grouping properties of a data partition index in order to read only a small part of the dataset. Compared to previous approaches, we achieve results of comparable or better quality at a small fraction of the CPU and I/O costs (seconds as opposed to hours, and tens of node accesses instead of thousands). In addition, we study medoid-aggregate queries, where k is not known in advance, but we are asked to compute a medoid set that leads to an average distance close to a user-specified value. Similarly, medoid-optimization queries aim at minimizing both the number of medoids k and the average distance. We also consider the max version of the aforementioned problems, where the goal is to minimize the maximum (instead of the average) distance between any object and its closest medoid. Finally, we investigate bichromatic and weighted medoid versions for all query types, as well as maximum-capacity and dynamic medoids.
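
The two objectives named in this abstract (average and maximum distance from each object to its closest medoid) can be written down directly. The sketch below assumes Euclidean distance and uses an exhaustive search that is only feasible for tiny inputs; it is not the paper's index-based method, and the function names are illustrative.

```python
import math
from itertools import combinations


def avg_cost(points, medoids):
    """Average distance from each point to its closest medoid."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points) / len(points)


def max_cost(points, medoids):
    """Maximum distance from any point to its closest medoid (the 'max version')."""
    return max(min(math.dist(p, m) for m in medoids) for p in points)


def brute_force_k_medoids(points, k, cost=avg_cost):
    """Try every k-subset of the points; exponential, so for illustration only."""
    return min(combinations(points, k), key=lambda ms: cost(points, ms))
```

Medoids are chosen among the data points themselves, which is what distinguishes the problem from k-means, where centers may lie anywhere.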

3.

Efficient preprocessing of XML queries using structured signatures

Yon Dohn Chung 《Information Processing Letters》2003,87(5):257-264

The paper proposes a preprocessing scheme for efficient processing of XML queries in XML-based information retrieval systems. For the preprocessing, we use a signature-based approach. In the conventional (flat document-based) information retrieval systems, user queries consist of keywords and boolean operators, and thus signatures are structured in a flat manner. However, in XML-based information retrieval systems, the user queries have the form of path queries. Therefore, the flat signature cannot be effective for XML documents. In the paper, we propose two structured signature methods for XML documents. Through experiments, we evaluate the performance of the proposed methods.
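
For readers unfamiliar with the flat signatures this abstract contrasts against, here is a minimal sketch of one common variant, superimposed coding. The signature width, the number of bits per word, and the use of SHA-256 are all assumptions for illustration; the paper's structured signatures for path queries are not reproduced here.

```python
import hashlib

SIG_BITS = 64        # assumed signature width
BITS_PER_WORD = 3    # assumed number of bit positions derived per keyword


def word_signature(word: str) -> int:
    """Set a few bit positions derived from a hash of the keyword."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    sig = 0
    for i in range(BITS_PER_WORD):
        sig |= 1 << (digest[i] % SIG_BITS)
    return sig


def doc_signature(words) -> int:
    """Superimpose (bitwise OR) the signatures of all keywords in a document."""
    sig = 0
    for w in words:
        sig |= word_signature(w)
    return sig


def may_contain(doc_sig: int, query_words) -> bool:
    """True if every query bit is set: no false negatives, possible false positives."""
    q = doc_signature(query_words)
    return (doc_sig & q) == q
```

A document whose signature fails the test can be skipped without being read; one that passes still has to be checked, since unrelated words may have set the same bits. A flat signature like this records which words occur but not where they sit in the element hierarchy, which is the limitation for path queries that the abstract points out.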

4.

Compressed hierarchical binary histograms for summarizing multi-dimensional data

Filippo Furfaro Giuseppe M. Mazzeo Domenico Saccà Cristina Sirangelo 《Knowledge and Information Systems》2008,15(3):335-380

Hierarchical binary partitions of multi-dimensional data are investigated as a basis for the construction of effective histograms. Specifically, the impact of adopting lossless compression techniques for representing the histogram on both the accuracy and the efficiency of query answering is investigated. Compression is obtained by exploiting the hierarchical partition scheme underlying the histogram, and then introducing further restrictions on the partitioning which enable a more compact representation of bucket boundaries. Basically, these restrictions consist of constraining the splits of the partition to lie on regular grids defined on the buckets. Several heuristics guiding the histogram construction are also proposed, and a thorough experimental analysis is provided that compares the accuracy of histograms resulting from combining different heuristics with different representation models (both the new compression-based and the traditional ones). The best accuracy is obtained by combining our grid-constrained partitioning scheme with one of the new heuristics. Histograms resulting from this combination are compared with state-of-the-art summarization techniques, showing that the proposed approach yields lower error rates and is much less sensitive to dimensionality, and that adopting our compression scheme improves the efficiency of query estimation.

5.

Some challenges of integrating spatial and non-spatial datasets using a geographical information system

Mohammad A. Rob 《Information Technology for Development》2013,19(3):171-178

Geographical Information Systems (GIS) are becoming useful tools in making strategic decisions in a variety of government and business activities in areas such as housing, healthcare, land use, natural resources, environmental monitoring, public health, transportation, retail, and routing. This usefulness emanates from the capability of GIS to present a large amount of data in a short period of time on a map, using a geographical coordinate system. In most cases, spatial datasets required for GIS mapping are already available free from many governmental agencies. GIS rely more on computing technology than on geographical concepts; however, the capabilities of GIS software have not reached the level of simplicity encountered in most software used on a daily basis. Most organizations perform GIS analysis on their data without getting involved with the mapping technology. A typical GIS analyst faces various challenges while incorporating a non-spatial dataset into a spatial dataset in order to present the resulting dataset on a geographical map. In this paper, we present some data manipulation complexities that are encountered while using GIS software to provide spatial twists to a large user dataset. We also provide ways to facilitate the data manipulation process through a practical example of asthma epidemiology. The solutions will be beneficial to many GIS users in a variety of industries.

6.

Algebraic manipulation of scientific datasets

Bill Howe David Maier 《The VLDB Journal The International Journal on Very Large Data Bases》2005,14(4):397-416

We investigate algebraic processing strategies for large numeric datasets equipped with a (possibly irregular) grid structure. Such datasets arise, for example, in computational simulations, observation networks, medical imaging, and 2-D and 3-D rendering. Existing approaches for manipulating these datasets are incomplete: The performance of SQL queries for manipulating large numeric datasets is not competitive with specialized tools. Database extensions for processing multidimensional discrete data can only model regular, rectilinear grids. Visualization software libraries are designed to process arbitrary gridded datasets efficiently, but no algebra has been developed to simplify their use and afford optimization. Further, these libraries are data dependent: physical changes to data representation or organization break user programs. In this paper, we present an algebra of gridfields for manipulating arbitrary gridded datasets, algebraic optimization techniques, and an implementation backed by experimental results. We compare our techniques to those of Geographic Information Systems (GIS) and visualization software libraries, using real examples from an Environmental Observation and Forecasting System. We find that our approach can express optimized plans inaccessible to other techniques, resulting in improved performance with reduced programming effort.

7.

Enhancing accuracy and expressive power of range query answers over incomplete spatial databases via a novel reasoning approach

Alfredo Cuzzocrea Andrea Nucita 《Data & Knowledge Engineering》2011,70(8):702-716

Modern spatial database applications built on top of distributed and heterogeneous spatial information sources, such as conventional spatial databases underlying Geographical Information Systems (GIS), spatial data files and spatial information acquired or inferred from the Web, suffer from data integration and topological consistency problems. This increasingly results in incomplete information, which makes answering range queries over incomplete spatial databases a leading research challenge in spatial database systems research. A significant instance of this setting is the application scenario in which the geometrical information on a subset of spatial database objects is incomplete whereas the spatial database still stores topological relations among these objects (e.g., containment relations). Focusing on this scenario, in this paper we propose and experimentally assess a novel technique for efficiently answering range queries over incomplete spatial databases by integrating geometrical information and topological reasoning. We also propose I-SQE (Spatial Query Engine for Incomplete Information), an innovative query engine implementing this technique. Our proposed technique proves not only effective but also efficient on both synthetic and real-life spatial data sets, and it allows us to enhance the quality and the expressive power of retrieved answers by taking advantage of the representation of spatial database objects at both the geometrical and the topological level.

8.

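Design and implementation of an integrated urban disaster prevention information system (cited by 1; self-citations 0; other citations 1)

Tian Weitao (田伟涛) Ren Aizhu (任爱珠) 《计算机工程与设计》 (Computer Engineering and Design) 1997,18(2):3-8

A city's capacity for disaster prevention and mitigation bears on the safety of people's lives and property, and has become an important indicator of a nation's level of progress. Addressing the characteristics of urban disaster prevention work and taking urban fire prevention and control as an example, this paper proposes a model for a comprehensive, spatial urban disaster prevention information system that is based on a CAD system, has the linkage between graphics and the database at its core, and covers management support, office support, and decision support. It discusses the integration mechanism of the system and proposes several solutions to some of the key technical problems involved.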

9.

Dynamic programming solution for multiple query optimization problem

Ismail H. Toroslu Ahmet Cosar 《Information Processing Letters》2004,92(3):149-155


10.

On learning multivalued dependencies with queries

Víctor Lavín Puente 《Theoretical computer science》2011,412(22):2331-2339

Data dependencies play an important role in the design of relational databases. There is a strong connection between dependencies and some fragments of the propositional logic. In particular, functional dependencies are closely related to Horn formulas. Also, multivalued dependencies are characterized in terms of multivalued formulas. It is known that both Horn formulas and sets of functional dependencies are learnable in the exact model of learning with queries. Here we present an algorithm that learns a non-trivial subclass of multivalued formulas using membership and equivalence queries. Furthermore, a slight modification of the algorithm allows us to learn the corresponding subclass of multivalued dependencies.
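
The link between functional dependencies and Horn formulas mentioned in this abstract can be made concrete: reading each attribute as a propositional variable, a dependency X → Y behaves like the Horn clauses "X implies y" for each y in Y, and the textbook attribute-closure computation is forward chaining over those clauses. The sketch below shows only that correspondence, not the paper's learning algorithm; attributes are assumed to be single characters for brevity.

```python
def closure(attrs, fds):
    """Attributes derivable from `attrs` under `fds`, by forward chaining.

    `fds` is a list of (lhs, rhs) pairs, each side an iterable of attributes.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the rule when its whole left side is known and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if the dependency lhs -> rhs follows from the set `fds`."""
    return set(rhs) <= closure(lhs, fds)
```

With dependencies A → B, B → C and CD → E, the closure of {A} is {A, B, C}, so A → C is implied while A → E is not.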

11.

Negative results on learning multivalued dependencies with queries

Víctor Lavín Puente Montserrat Hermo 《Information Processing Letters》2011,111(19):968-972

Data dependencies are useful for designing relational databases. There is a strong connection between dependencies and some fragments of the propositional logic. In particular, functional dependencies are closely related to Horn formulas. Also, multivalued dependencies are characterized in terms of multivalued formulas. It is known that both Horn formulas and sets of functional dependencies are learnable in the exact model of learning with queries. Here we prove that neither multivalued formulas nor multivalued dependencies can be learned using only membership queries or only equivalence queries.

12.

Reverse engineering database queries from examples: State-of-the-art, challenges, and research opportunities

《Information Systems》2019

With the popularization of data access and usage, an increasing number of users without expert knowledge of databases are required to perform data interactions. Often, these users face the challenges of writing and reformulating database queries, which consume a considerable amount of time and frequently yield unsatisfactory results. To facilitate this human–database interaction, researchers have investigated the Query By Example (QBE) paradigm, in which database queries are (semi-)automatically discovered from data examples given by users. This paradigm allows non-database experts to formulate queries without relying on complex query languages. In this context, this work aims to present a systematic review of the recent developments, open challenges, and research opportunities of QBE reported in the literature. This work also describes strategies employed to leverage efficient example acquisition and query reverse engineering. The obtained results show that recent research developments have focused on enhancing the expressiveness of produced queries, minimizing user interaction, and enabling efficient query learning in the context of data retrieval, exploration, integration, and analytics. Our findings indicate that future research should concentrate efforts on providing innovative solutions to the challenges of improving controllability and transparency, considering diverse user preferences in the processes of learning personalized queries, ensuring data quality, and improving the support of additional SQL features and operators.

13.

Indexing views to route queries in a PDMS

Lefteris Sidirourgos George Kokkinidis Theodore Dalamagas Vassilis Christophides Timos Sellis 《Distributed and Parallel Databases》2008,23(1):45-68

P2P computing has gained increasing attention lately, since it provides the means for realizing computing systems that scale to very large numbers of participating peers, while ensuring high autonomy and fault-tolerance. Peer Data Management Systems (PDMS) have been proposed to support sophisticated facilities in exchanging, querying and integrating (semi-)structured data hosted by peers. In this paper, we are interested in routing graph queries in a very large PDMS, where peers advertise their local bases using fragments of community RDF/S schemas (i.e., views). We introduce an original encoding for these fragments, in order to efficiently check whether a peer view is subsumed by a query. We rely on this encoding to design an RDF/S view lookup service featuring a stateful and a stateless execution over a DHT-based P2P infrastructure. We finally evaluate our system experimentally to demonstrate its scalability for very large P2P networks and arbitrary RDF/S schema fragments, and to estimate the number of routing hops required by the two versions of our lookup service. Work done when T. Dalamagas was a postdoc researcher in NTUA.

14.

Concept-based querying in mediator systems (cited by 1; self-citations 0; other citations 1)

Kai-Uwe Sattler Ingolf Geist Eike Schallehn 《The VLDB Journal The International Journal on Very Large Data Bases》2005,14(1):97-111

One approach to overcoming heterogeneity as a part of data integration in mediator systems is the use of metadata in the form of a vocabulary or ontology to represent domain knowledge explicitly. This requires including this meta level during query formulation and processing. In this paper, we address this problem in the context of a mediator that uses a concept-based integration model and an extension of the XQuery language called CQuery. This mediator has been developed as part of a project for integrating data about cultural assets. We describe the language extensions and their semantics as well as the rewriting and evaluation steps. Furthermore, we discuss aspects of caching and keyword-based search in support of an efficient query formulation and processing. Received: 23 December 2002, Accepted: 15 September 2003, Published online: 6 February 2004. Edited by: V. Atluri.

15.

Adaptive processing of historical spatial range queries in peer-to-peer sensor networks (cited by 1; self-citations 0; other citations 1)

Alexandru Coman Joerg Sander Mario A. Nascimento 《Distributed and Parallel Databases》2007,22(2-3):133-163

We investigate the problem of processing historical queries on a sensor network. Since data is considered to have been already collected at the sensor nodes, the main issue is exploring the spatial component of the query in order to minimize its cost represented by the energy consumption. We assume queries can be issued at any network node, i.e., there is no central base station and all nodes have only local knowledge of the network. On the one hand, a globally optimum query processing plan is desirable but its construction is not possible due to the lack of global knowledge of the network. On the other hand, while a simple network flooding is feasible, it is not a practical choice from a cost perspective. To address this problem we propose a two-phase query processing strategy, where in the first phase a path from the query originator to the query region is found and in the second phase the query is processed within the query region itself. This strategy is supported by analytical models that are used to dynamically select the best processing strategy depending on the query specifics. Our extensive analytical and experimental results show that our analytical models are accurate and that the two-phase strategy is better suited for small to medium sized queries, being up to 10 times more cost effective than a typical network flooding. In addition, the dynamic selection of a query processing technique proved itself capable of always delivering at least as good performance as the most energy efficient strategy for all query sizes. Research supported in part by NSERC Canada.

16.

Phenomena – A visual environment for querying heterogenous spatial data

Luca Paolino Monica Sebillo Genoveffa Tortora Giuliana Vitiello Robert Laurini 《Journal of Visual Languages and Computing》2009,20(6):420-436

The need to perform complex analysis and decision making tasks has motivated growing interest in Geographic Information Systems (GIS) as a means to compare different scenarios and simulate the evolution of a phenomenon. However, data and function complexity may critically affect human interaction and system performance during planning and prevention activities. This is especially true when the scenarios of interest involve continuous fields, besides discrete objects. In the present paper we describe the visual environment Phenomena, where continuous and discrete data may be handled through a uniform approach. We illustrate how users’ activity is supported by a visual framework where they can interact with, manipulate and query heterogeneous data, with a very small training effort. A preliminary experimental study suggests that when users perform complex tasks, a higher usability degree may be achieved compared to the adoption of a textual spatial SQL.

17.

Modeling mountain pine beetle infestation with an agent-based approach at two spatial scales

Liliana Perez Suzana Dragicevic 《Environmental Modelling & Software》2010,25(2):223-236

Extensive outbreaks of tree-killing insects have been occurring in many parts of North America, including the province of British Columbia, raising concerns about the health of pine forest ecosystems. The dynamic phenomenon of mountain pine beetle (MPB), Dendroctonus ponderosae Hopkins, infestation outbreaks is an inherently complex spatial and temporal process. Agent-based modeling (ABM) facilitates simulating spatial interactions that describe the ecological context in which insect populations spread. The main objective of this study was to develop a model of the MPB forest infestation dynamics. This spatially explicit model integrates geographic information systems (GISs) and ABM to simulate MPB outbreaks at the tree and landscape scales, providing spatiotemporal information on the annual distribution and patterns of MPB outbreaks. This prototype was implemented with geographic data generated from aerial overview surveys carried out by the B.C. Ministry of Forests and Range, for the study site in Kamloops, Canada. Results show the direct influence that vigorous forest stands and trees have on higher breeding rates, and therefore on the MPB population increment at a tree scale, over a period of 5 years. The simulation results at the landscape level help to determine the most probable locations of future MPB infestations in a time frame of 10 years.

18.

A fast and robust bulk-loading algorithm for indexing very large digital elevation datasets II. Experimental results

Félix R. Rodríguez Manuel Barrena 《Computers & Geosciences》2011,37(7):814-821

The spatial indexing of eventually all the available topographic information of Earth is a highly valuable tool for different geoscientific application domains. The Shuttle Radar Topography Mission (SRTM) collected and made available to the public one of the world's largest digital elevation models (DEMs). With the aim of providing easier and faster access to these data and improving their further analysis and processing, we have indexed the SRTM DEM by means of a spatial index based on the kd-tree data structure, called the Q-tree. This paper is the second in a two-part series and includes a thorough performance analysis to validate the efficiency of the Q-tree bulk-load algorithm. We investigate performance by measuring elapsed time in different contexts, analyzing disk space usage, testing response time with typical queries, and validating the balance of the final index structure. In addition, the paper includes performance comparisons with Oracle 11g that help to understand the real cost of our proposal. Our tests show that the proposed algorithm outperforms Oracle 11g, using around 9% of the elapsed time, taking six times less storage with more than 96% of page utilization, and getting faster response times to spatial queries issued on 4.5 million points. In addition to this, the behavior of the spatial index has been successfully tested on both an open GIS (VT Builder) and a visualizer tool derived from the previous one.
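
The abstract says the Q-tree is based on the kd-tree but does not describe its bulk-load algorithm, so the following is only a generic sketch of the underlying idea: building a balanced in-memory 2-D kd-tree in one pass by median splitting, then answering a rectangular range query. The names `Node`, `build` and `range_query` are illustrative, and paging, disk layout and elevation attributes are deliberately left out.

```python
class Node:
    """One kd-tree node holding a point and the axis it splits on."""

    def __init__(self, point, axis, left, right):
        self.point = point
        self.axis = axis
        self.left = left
        self.right = right


def build(points, depth=0):
    """Bulk-load a balanced 2-D kd-tree by splitting at the median."""
    if not points:
        return None
    axis = depth % 2  # alternate between x (0) and y (1)
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build(pts[:mid], depth + 1),
                build(pts[mid + 1:], depth + 1))


def range_query(node, lo, hi, out=None):
    """Collect points p with lo[i] <= p[i] <= hi[i] on both axes."""
    if out is None:
        out = []
    if node is None:
        return out
    p, a = node.point, node.axis
    if lo[0] <= p[0] <= hi[0] and lo[1] <= p[1] <= hi[1]:
        out.append(p)
    # Left subtree holds coordinates <= p[a]; right subtree holds >= p[a].
    if lo[a] <= p[a]:
        range_query(node.left, lo, hi, out)
    if p[a] <= hi[a]:
        range_query(node.right, lo, hi, out)
    return out
```

Loading all points at once lets every split use the true median, which is what keeps the tree balanced; inserting points one at a time gives no such guarantee.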

19.

Resource location in large scale heterogeneous and autonomous databases

Athman Bouguettaya Stephen Milliner Roger King 《Journal of Intelligent Information Systems》1995,5(2):145-173

In many large organizations there has been a proliferation of database systems to handle ever increasing volumes of information. In order to explore a potentially huge on-line information space, we must develop an architecture which allows for the dynamic data driven construction of inter-database node relationships in an incremental manner. In this paper we introduce the FINDIT architecture which uses information meta-types to provide a basis for such an organization and, consequently, provides a platform for interoperability. A distinction is made between the information and inter-node relationship spaces to achieve scalability. Tassili language primitives are used for the incremental building of dynamic inter-node relationships based upon usage considerations.

20.

Linking GIS with real-time visualisation for exploration of landscape changes in rural community workshops

Christian Stock Ian D. Bishop 《Virtual Reality》2006,9(4):260-270

To allow rural communities to evaluate possible future landscape scenarios, we have created a portable environment for landscape simulation (envisioning system). The goal of this system is to give communities the opportunity to plan their desired futures. Our system is designed for workshop environments and allows workshop attendees to explore and to interact with representations of virtual landscapes. We are using virtual reality technology to visualise the landscape representations, a geographic information system to allow participants to change the current landscape configuration, and mobile computing devices to allow the attendees to navigate in the virtual landscape and give feedback and opinions on the landscape changes. Here, we describe the technology that implements the interaction between geographical information systems and real-time rendering needed to achieve real-time visualisation of landscape changes. To achieve this functionality we have programmed two software clients (a renderer and an ESRI ArcMap extension) and a server that handles message flow. The landscape has been divided into management units that each support one land use type. Using the GIS interface, users can change the land uses associated with the units and the renderer will update the landscape correspondingly in real time.