20 similar documents found; search time: 593 ms
1.
Ada Wai-chee Fu Polly Mei-shuen Chan Yin-Ling Cheung Yiu Sang Moon 《The VLDB Journal The International Journal on Very Large Data Bases》2000,9(2):154-173
Abstract. For some multimedia applications, it has been found that domain objects cannot be represented as feature vectors in a multidimensional
space. Instead, pair-wise distances between data objects are the only input. To support content-based retrieval, one approach
maps each object to a k-dimensional (k-d) point and tries to preserve the distances among the points. Then, existing spatial access index methods such as the R-trees
and KD-trees can support fast searching on the resulting k-d points. However, information loss is inevitable with such an approach since the distances between data objects can only
be preserved to a certain extent. Here we investigate the use of a distance-based indexing method. In particular, we apply
the vantage point tree (vp-tree) method. There are two important problems for the vp-tree method that warrant further investigation,
the n-nearest neighbors search and the updating mechanisms. We study an n-nearest neighbors search algorithm for the vp-tree, which is shown by experiments to scale up well with the size of the dataset
and the desired number of nearest neighbors, n. Experiments also show that searching in the vp-tree is more efficient than in the R*-tree and the M-tree. Next, we propose solutions for the update problem for the vp-tree, and show by experiments that the algorithms are
efficient and effective. Finally, we investigate the problem of selecting vantage points, propose a few alternative methods,
and study their impact on the number of distance computations.
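As an illustration of the distance-based approach, the sketch below builds a vantage point tree from a pairwise distance function alone and answers n-nearest-neighbor queries with triangle-inequality pruning. The function names and the median-split rule are our own illustrative choices, not details taken from the paper.

```python
import heapq

def build_vp_tree(points, dist):
    """Recursively build a vantage point tree using only pairwise distances."""
    if not points:
        return None
    vp = points[0]                      # vantage point (a random choice also works)
    rest = points[1:]
    if not rest:
        return (vp, 0.0, None, None)
    ds = [dist(vp, p) for p in rest]
    mu = sorted(ds)[len(ds) // 2]       # median distance splits inside/outside
    inside  = [p for p, d in zip(rest, ds) if d <  mu]
    outside = [p for p, d in zip(rest, ds) if d >= mu]
    return (vp, mu, build_vp_tree(inside, dist), build_vp_tree(outside, dist))

def nn_search(node, q, dist, n, best=None):
    """n-nearest-neighbor search; prunes subtrees via the triangle inequality."""
    if best is None:
        best = []                       # max-heap of (-distance, point)
    if node is None:
        return best
    vp, mu, inside, outside = node
    d = dist(q, vp)
    if len(best) < n:
        heapq.heappush(best, (-d, vp))
    elif d < -best[0][0]:
        heapq.heapreplace(best, (-d, vp))
    # search the half containing q first; visit the other only if it may help
    first, second, boundary = (inside, outside, mu - d) if d < mu else (outside, inside, d - mu)
    best = nn_search(first, q, dist, n, best)
    tau = -best[0][0]                   # current n-th nearest distance
    if len(best) < n or boundary < tau:
        best = nn_search(second, q, dist, n, best)
    return best
```

For example, over the integers 0..99 with absolute difference as the metric, the three nearest neighbors of 50.2 are 49, 50, and 51.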
Received June 9, 1998 / Accepted January 31, 2000
2.
Bogdan S. Chlebus Leszek Gasieniec Alan Gibbons Andrzej Pelc Wojciech Rytter 《Distributed Computing》2002,15(1):27-38
We consider the problem of distributed deterministic broadcasting in radio networks of unknown topology and size. The network
is synchronous. If a node u can be reached from two nodes which send messages in the same round, none of the messages is received by u. Such messages block each other and node u either hears the noise of interference of messages, enabling it to detect a collision, or does not hear anything at all, depending on the model. We assume that nodes know neither the topology nor the size of
the network, nor even their immediate neighborhood. The initial knowledge of every node is limited to its own label. Such
networks are called ad hoc multi-hop networks. We study the time of deterministic broadcasting under this scenario.
For the model without collision detection, we develop a linear-time broadcasting algorithm for symmetric graphs, which is
optimal, and an algorithm for arbitrary n-node graphs, working in time O(n^{11/6}). Next we show that broadcasting with acknowledgement is not possible in this model at all.
For the model with collision detection, we develop efficient algorithms for broadcasting and for acknowledged broadcasting
in strongly connected graphs.
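The collision rule of the model without collision detection can be made concrete with a small round simulator; the data structures below (adjacency dict, transmitter set) are hypothetical, chosen only to illustrate the rule that a node hears a message exactly when a single neighbor transmits.

```python
def radio_round(graph, transmitters, messages, informed):
    """One synchronous round in a multi-hop radio network WITHOUT collision
    detection: node v receives a message only if exactly ONE of its
    neighbours transmits; two or more simultaneous senders block each
    other and v hears nothing at all."""
    newly = {}
    for v in graph:
        senders = [u for u in graph[v] if u in transmitters]
        if len(senders) == 1:                 # unique sender: message heard
            newly[v] = messages[senders[0]]
        # len(senders) >= 2: collision, v cannot even detect it
    for v, m in newly.items():
        informed.setdefault(v, m)
    return informed
```

In a network where node c neighbors both a and b, c learns nothing in a round where a and b both transmit, but hears the message once only a transmits.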
Received: January 2000 / Accepted: June 2001
3.
Distance-based outliers: algorithms and applications
Edwin M. Knorr Raymond T. Ng Vladimir Tucakov 《The VLDB Journal The International Journal on Very Large Data Bases》2000,8(3-4):237-253
This paper deals with finding outliers (exceptions) in large, multidimensional datasets. The identification of outliers can
lead to the discovery of truly unexpected knowledge in areas such as electronic commerce, credit card fraud, and even the
analysis of performance statistics of professional athletes. Existing methods that we have seen for finding outliers can only
deal efficiently with two dimensions/attributes of a dataset. In this paper, we study the notion of DB (distance-based) outliers. Specifically, we show that (i) outlier detection can be done efficiently for large datasets, and for k-dimensional datasets with large values of k (e.g., k ≥ 5); and (ii) outlier detection is a meaningful and important knowledge discovery task.
First, we present two simple algorithms, both having a complexity of O(k N^2), k being the dimensionality and N being the number of objects in the dataset. These algorithms readily support datasets with many more than two attributes.
Second, we present an optimized cell-based algorithm that has a complexity that is linear with respect to N, but exponential with respect to k. We provide experimental results indicating that this algorithm significantly outperforms the two simple algorithms for k ≤ 4. Third, for datasets that are mainly disk-resident, we present another version of the cell-based algorithm that guarantees
at most three passes over a dataset. Again, experimental results show that this algorithm is by far the best for k ≤ 4. Finally, we discuss our work on three real-life applications, including one on spatio-temporal data (e.g., a video surveillance
application), in order to confirm the relevance and broad applicability of DB outliers.
Received February 15, 1999 / Accepted August 1, 1999
4.
In this paper, we present a correlation scheme that incorporates a color ring-projection representation for the automatic
inspection of defects in textured surfaces. The proposed color ring projection transforms a 2-D color image into a 1-D color
pattern as a function of radius. For a search window of width W, data dimensionality is reduced from O(W^2) in the 2-D image to O(W) in the 1-D ring-projection space. The complexity of computing a correlation function is significantly reduced accordingly.
Since the color ring-projection representation is invariant to rotation, the proposed method can be applied for both isotropic
and oriented textures at arbitrary orientations. Experiments on regular textured surfaces have shown the efficacy of the proposed
method.
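A minimal grayscale sketch of the ring-projection idea (the paper's version operates on color images): pixels of a window are averaged per integer radius, reducing O(W^2) data to O(W) values, and the result is unchanged by rotations about the window center. All names are illustrative.

```python
import math

def ring_projection(image, cx, cy, W):
    """Collapse a 2-D window into a 1-D pattern indexed by radius: entry r
    is the mean pixel value over the ring at distance r from the window
    centre (cx, cy), so O(W^2) pixels reduce to O(W) values."""
    rmax = W // 2
    sums = [0.0] * (rmax + 1)
    counts = [0] * (rmax + 1)
    for y in range(cy - rmax, cy + rmax + 1):
        for x in range(cx - rmax, cx + rmax + 1):
            if 0 <= y < len(image) and 0 <= x < len(image[0]):
                r = int(round(math.hypot(x - cx, y - cy)))
                if r <= rmax:
                    sums[r] += image[y][x]
                    counts[r] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]
```

Rotation invariance can be checked directly: a 9x9 window and its 90-degree rotation produce the same 1-D pattern, since rotation about the center permutes pixels within each ring.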
Received: 30 March 2000 / Accepted: 24 July 2001
Correspondence to: D.-M. Tsai (e-mail: iedmtsai@saturn.yzu.edu.tw)
5.
Optimizing multiple dimensional queries simultaneously in multidimensional databases
Weifa Liang Maria E. Orlowska Jeffrey X. Yu 《The VLDB Journal The International Journal on Very Large Data Bases》2000,8(3-4):319-338
Some significant progress related to multidimensional data analysis has been achieved in the past few years, including the
design of fast algorithms for computing datacubes, selecting some precomputed group-bys to materialize, and designing efficient
storage structures for multidimensional data. However, little work has been carried out on multidimensional query optimization
issues. In particular, the response time (or evaluation cost) for answering several related dimensional queries simultaneously
is crucial to OLAP applications. Recently, Zhao et al. first explored this problem by presenting three heuristic algorithms.
In this paper we first consider in detail two cases of the problem in which all the queries are either hash-based star joins
or index-based star joins only. In the case of the hash-based star join, we devise a polynomial approximation algorithm which
delivers a plan whose evaluation cost is O(n^ε) times the optimal, where n is the number of queries and ε is a fixed constant. We also present an exponential algorithm which delivers a plan with the optimal evaluation cost. In the case of the index-based
star join, we present a heuristic algorithm which delivers a plan whose evaluation cost is n times the optimal, and an exponential algorithm which delivers a plan with the optimal evaluation cost. We then consider
a general case in which both hash-based star-join and index-based star-join queries are included. For this case, we give a
possible improvement on the work of Zhao et al., based on an analysis of their solutions. We also develop another heuristic
and an exact algorithm for the problem. We finally conduct a performance study by implementing our algorithms. The experimental
results demonstrate that the solutions delivered for the restricted cases are always within twice the optimal, which
confirms our theoretical upper bounds. Actually these experiments produce much better results than our theoretical estimates.
To the best of our knowledge, this is the only development of polynomial algorithms for the first two cases which are able
to deliver plans with deterministic performance guarantees in terms of the qualities of the plans generated. The previous
approaches including that of [ZDNS98] may generate a feasible plan for the problem in these two cases, but they do not provide
any performance guarantee, i.e., the plans generated by their algorithms can be arbitrarily far from the optimal one.
Received: July 21, 1998 / Accepted: August 26, 1999
6.
We consider the problem of scheduling a set of pages on a single broadcast channel using time-multiplexing. In a perfectly periodic schedule, time is divided into equal-size slots, and each page is transmitted in a time slot precisely every fixed interval of time (the period of the page). We study the case in which each page i has a given demand probability p_i, and the goal is to design a perfectly periodic schedule that minimizes the average time a random client waits until its
page is transmitted. We seek approximate polynomial solutions. Approximation bounds are obtained by comparing the costs of
a solution provided by an algorithm and a solution to a relaxed (non-integral) version of the problem. A key quantity in our
methodology is a fraction that depends on the maximum demand probability. The best known polynomial algorithm to date guarantees an approximation ratio expressed in terms of this fraction. In this paper, we develop a tree-based methodology for perfectly periodic scheduling, and using new techniques, we derive
algorithms with better bounds. For small values of this fraction, our best algorithm guarantees a better approximation ratio. On the other hand, we show that the integrality gap between the cost of any perfectly periodic schedule and the cost of
the fractional problem is bounded below in terms of the same fraction. We also provide algorithms with good performance guarantees for large values of this fraction.
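The cost being minimized can be stated concretely: a client requesting page i at a uniformly random time waits T_i/2 slots on average, where T_i is the page's period, so a schedule's expected wait is the demand-weighted sum of half-periods. A small sketch, in our own notation:

```python
def periods(schedule):
    """Return each page's period, checking that the given cycle is perfectly
    periodic: equal gaps between consecutive occurrences, wrapping around."""
    L = len(schedule)
    slots = {}
    for t, page in enumerate(schedule):
        slots.setdefault(page, []).append(t)
    T = {}
    for page, ts in slots.items():
        gaps = {((ts[(i + 1) % len(ts)] - ts[i]) % L) or L for i in range(len(ts))}
        assert len(gaps) == 1, f"page {page} is not perfectly periodic"
        T[page] = gaps.pop()
    return T

def average_wait(schedule, demand):
    """Expected wait of a random client: sum over pages of p_i * T_i / 2."""
    T = periods(schedule)
    return sum(demand[page] * T[page] / 2 for page in demand)
```

For the cycle A B A C with demands 0.5, 0.25, 0.25, page A has period 2 and B, C period 4, giving an expected wait of 1.5 slots.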
Received: December 2001 / Accepted: September 2002
7.
Yuh-Jzer Joung 《Distributed Computing》2002,15(3):155-175
Summary. Group mutual exclusion occurs naturally in situations where a resource can be shared by processes of the same group, but
not by processes of different groups. For example, suppose data is stored in a CD-jukebox. Then when a disc is loaded for
access, users that need data on the disc can concurrently access the disc, while users that need data on a different disc
have to wait until the current disc is unloaded.
The design issues for group mutual exclusion have been modeled as the Congenial Talking Philosophers problem, and solutions for shared-memory models have been proposed [12,14]. As in ordinary mutual exclusion and many other
problems in distributed systems, however, techniques developed for shared memory do not necessarily apply to message passing
(and vice versa). So in this paper we investigate solutions for Congenial Talking Philosophers in computer networks where
processes communicate by asynchronous message passing. We first present a solution that is a straightforward adaptation from
Ricart and Agrawala's algorithm for ordinary mutual exclusion. Then we show that the simple modification suffers a severe
performance degradation that could cause the system to behave as though only one process of a group can be in the critical
section at a time. We then present a more efficient and highly concurrent distributed algorithm for the problem, the first
such solution in computer networks.
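As a toy illustration of the group mutual exclusion property itself (in shared memory, not the message-passing setting the paper targets), the class below admits any number of processes of the currently loaded group and blocks other groups until the session empties; all names are our own.

```python
import threading

class GroupLock:
    """Toy group mutual exclusion: any number of processes of the SAME group
    (e.g. users of the currently loaded disc) may be inside concurrently;
    a process of a different group waits until the room empties."""
    def __init__(self):
        self._cond = threading.Condition()
        self._group = None          # group currently in session, or None
        self._inside = 0

    def enter(self, group):
        with self._cond:
            while self._group not in (None, group):
                self._cond.wait()
            self._group = group
            self._inside += 1

    def leave(self):
        with self._cond:
            self._inside -= 1
            if self._inside == 0:
                self._group = None      # "unload the disc"
                self._cond.notify_all()
```

Two processes of group disc1 enter concurrently; a disc2 process blocks until both leave.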
Received: August 2000 / Accepted: November 2001
8.
A bin picking system based on depth from defocus
It is generally accepted that to develop versatile bin-picking systems capable of grasping and manipulation operations, accurate
3-D information is required. To accomplish this goal, we have developed a fast and precise range sensor based on active depth from defocus (DFD). This sensor is used in conjunction with a three-component vision system, which is able to recognize and evaluate the
attitude of 3-D objects. The first component performs scene segmentation using an edge-based approach. Since edges are used
to detect the object boundaries, a key issue consists of improving the quality of edge detection. The second component attempts
to recognize the object placed on the top of the object pile using a model-driven approach in which the segmented surfaces
are compared with those stored in the model database. Finally, the attitude of the recognized object is evaluated using an
eigenimage approach augmented with range data analysis. The full bin-picking system will be outlined, and a number of experimental
results will be examined.
Received: 2 December 2000 / Accepted: 9 September 2001
Correspondence to: O. Ghita
9.
Motion picture films are susceptible to local degradations such as dust spots. Other deteriorations are global such as intensity
and spatial jitter. It is obvious that motion needs to be compensated for before the detection/correction of such local and
dynamic defects. Therefore, we propose a hierarchical motion estimation method ideally suited for high resolution film sequences.
This recursive block-based motion estimator relies on an adaptive search strategy and Radon projections to improve processing
speed. The localization of dust particles then becomes straightforward. Thus, it is achieved by simple inter-frame differences
between the current image and motion compensated successive and preceding frames. However, the detection of spatial and intensity
jitter requires a specific process taking advantage of the high temporal correlation in the image sequence. In this paper,
we present our motion compensation-based algorithms for removing dust spots, spatial and intensity jitter in degraded motion
pictures. Experimental results are presented showing the usefulness of our motion estimator for film restoration at reasonable
computational costs.
Received: 9 July 2000 / Accepted: 13 January 2002
Correspondence to: S. Boukir
10.
We present a shared memory algorithm that allows a set of f+1 processes to wait-free “simulate” a larger system of n processes that may also exhibit up to f stopping failures.
Applying this simulation algorithm to the k-set-agreement problem enables conversion of an arbitrary k-fault-tolerant n-process solution for the k-set-agreement problem into a wait-free (k+1)-process solution for the same problem. Since the (k+1)-process k-set-agreement problem has been shown to have no wait-free solution [5,18,26], this transformation implies that there is no
k-fault-tolerant solution to the n-process k-set-agreement problem, for any n.
More generally, the algorithm satisfies the requirements of a fault-tolerant distributed simulation. The distributed simulation implements a notion of fault-tolerant reducibility between decision problems. This paper defines these notions and gives examples of their application to fundamental distributed
computing problems.
The algorithm is presented and verified in terms of I/O automata. The presentation has a great deal of interesting modularity,
expressed by I/O automaton composition and both forward and backward simulation relations. Composition is used to include
a safe agreement module as a subroutine. Forward and backward simulation relations are used to view the algorithm as implementing a multi-try snapshot strategy.
The main algorithm works in snapshot shared memory systems; a simple modification of the algorithm that works in read/write
shared memory systems is also presented.
Received: February 2001 / Accepted: February 2001
11.
Summary. This work considers the problem of performing t tasks in a distributed system of p fault-prone processors. This problem, called do-all herein, was introduced by Dwork, Halpern and Waarts. The solutions presented here are for the model of computation that abstracts
a synchronous message-passing distributed system with processor stop-failures and restarts. We present two new algorithms
based on a new aggressive coordination paradigm by which multiple coordinators may be active as the result of failures. The
first algorithm is tolerant of stop-failures and does not allow restarts. Its available processor steps (work) complexity is and its message complexity is . Unlike prior solutions, our algorithm uses redundant broadcasts when encountering failures and, for p = t and large f, it achieves better work complexity. This algorithm is used as the basis for another algorithm that tolerates stop-failures
and restarts. This new algorithm is the first solution for the do-all problem that efficiently deals with processor restarts. Its available processor steps is , and its message complexity is , where f is the total number of failures.
Received: October 1998 / Accepted: September 2000
12.
Data overload is a generic and tremendously difficult problem that has only grown with each new wave of technological capabilities.
As a generic and persistent problem, three observations are in need of explanation: Why is data overload so difficult to address?
Why has each wave of technology exacerbated, rather than resolved, data overload? How are people, as adaptive responsible
agents in context, able to cope with the challenge of data overload? In this paper, first we examine three different characterisations
that have been offered to capture the nature of the data overload problem and how they lead to different proposed solutions.
As a result, we propose that (a) data overload is difficult because of the context sensitivity problem – meaning lies not
in data, but in relationships of data to interests and expectations – and (b) new waves of technology exacerbate data overload
when they ignore or try to finesse context sensitivity. The paper then summarises the mechanisms of human perception and cognition
that enable people to focus on the relevant subset of the available data despite the fact that what is interesting depends
on context. By focusing attention on the root issues that make data overload a difficult problem and on people’s fundamental
competence, we have identified a set of constraints that all potential solutions must meet. Notable among these constraints
is the idea that organisation precedes selectivity. These constraints point toward regions of the solution space that have
been little explored. In order to place data in context, designers need to display data in a conceptual space that depicts
the relationships, events and contrasts that are informative in a field of practice.
13.
K. Selçuk Candan Eric Lemar V.S. Subrahmanian 《The VLDB Journal The International Journal on Very Large Data Bases》2000,9(2):131-153
Abstract. Though there has been extensive work on multimedia databases in the last few years, there is no prevailing notion of a multimedia
view, nor are there techniques to create, manage, and maintain such views. Visualizing the results of a dynamic multimedia
query or materializing a dynamic multimedia view corresponds to assembling and delivering an interactive multimedia presentation
in accordance with the visualization specifications. In this paper, we suggest that a non-interactive multimedia presentation
is a set of virtual objects with associated spatial and temporal presentation constraints. A virtual object is either an object, or the result of a query.
As queries may have different answers at different points in time, scheduling the presentation of such objects is nontrivial.
We then develop a probabilistic model of interactive multimedia presentations, extending the non-interactive model described
earlier. We also develop a probabilistic model of interactive visualization where the probabilities reflect the user profiles,
or the likelihood of certain user interactions. Based on this probabilistic model, we develop three utility-theoretic based
types of prefetching algorithms that anticipate how users will interact with the presentation. These prefetching algorithms
allow efficient visualization of the query results in accordance with the underlying specification. We have built a prototype
system that incorporates these algorithms. We report on the results of experiments conducted on top of this implementation.
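A greedy stand-in for the utility-theoretic idea: rank candidate virtual objects by probability-of-access times on-demand fetch cost, and prefetch the top of the ranking into a limited buffer. The dictionary fields are hypothetical, not the paper's model.

```python
def choose_prefetch(candidates, buffer_slots):
    """Pick which virtual objects to prefetch: rank by expected utility,
    probability-of-access * cost-if-fetched-on-demand, a greedy stand-in
    for a utility-theoretic prefetching policy."""
    ranked = sorted(candidates, key=lambda c: c['prob'] * c['cost'], reverse=True)
    return [c['id'] for c in ranked[:buffer_slots]]
```

A rarely taken but expensive branch can outrank a likely but cheap one, which is exactly what the expected-utility product captures.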
Received June 10, 1998 / Accepted November 10, 1999
14.
Handling message semantics with Generic Broadcast protocols
Summary. Message ordering is a fundamental abstraction in distributed systems. However, ordering guarantees are usually purely “syntactic,”
that is, message “semantics” is not taken into consideration despite the fact that in several cases semantic information about
messages could be exploited to avoid ordering messages unnecessarily. In this paper we define the Generic Broadcast problem, which orders messages only if needed, based on the semantics of the messages. The semantic information about messages
is introduced by conflict relations. We show that Reliable Broadcast and Atomic Broadcast are special instances of Generic
Broadcast. The paper also presents two algorithms that solve Generic Broadcast.
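The role of the conflict relation can be sketched as a delivery rule: only conflicting pairs are forced into a common (here, timestamp) order, while other messages keep their local arrival order. With an empty relation this degenerates to Reliable Broadcast behavior and with the total relation to Atomic Broadcast, as the abstract notes. Timestamps are assumed unique; this is an illustration, not either of the paper's algorithms.

```python
def generic_deliver(received, conflicts):
    """Deliver messages so that every pair of CONFLICTING messages comes out
    in timestamp order, while non-conflicting messages keep arrival order.
    Assumes unique 'ts' values; conflicts is a symmetric predicate."""
    pending = list(received)
    delivered = []
    while pending:
        for m in pending:
            # m is deliverable once no conflicting message with a smaller
            # timestamp is still pending
            if not any(conflicts(m, o) and o['ts'] < m['ts']
                       for o in pending if o is not m):
                delivered.append(m)
                pending.remove(m)
                break
    return [m['id'] for m in delivered]
```

With a write-write conflict relation, the reads are delivered as they arrive while the writes are totally ordered among themselves.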
Received: August 2000 / Accepted: August 2001
15.
I/O scheduling for digital continuous media
A growing set of applications require access to digital video and audio. In order to provide playback of such continuous
media (CM), scheduling strategies for CM data servers (CMS) are necessary. In some domains, particularly defense and industrial process control, the timing requirements of these applications
are strict and essential to their correct operation. In this paper we develop a scheduling strategy for multiple access to
a CMS such that the timing guarantees are maintained at all times. First, we develop a scheduling strategy for the steady state,
i.e., when there are no changes in playback rate or operation. We derive an optimal Batched SCAN (BSCAN) algorithm that requires minimum buffer space to schedule concurrent accesses. The scheduling strategy incorporates two key
constraints: (1) data fetches from the storage system are assumed to be in integral multiples of the block size, and (2) playback
guarantees are ensured for frame-oriented streams when each frame can span multiple blocks. We discuss modifications to the
scheduling strategy to handle compressed data like motion-JPEG and MPEG.
Second, we develop techniques to handle dynamic changes brought about by VCR-like operations executed by applications. We define a suite of primitive VCR-like operations that can be executed. We show that an unregulated change in the BSCAN schedule, in response to VCR-like operations, will affect playback guarantees. We develop two general techniques to ensure playback guarantees while responding
to VCR-like operations: passive and active accumulation. Using user response time as a metric we show that active accumulation algorithms
outperform passive accumulation algorithms. An optimal response-time algorithm in a class of active accumulation strategies
is derived. The results presented here are validated by extensive simulation studies.
16.
Boris Chidlovskii Uwe M. Borghoff 《The VLDB Journal The International Journal on Very Large Data Bases》2000,9(1):2-17
Abstract. In meta-searchers accessing distributed Web-based information repositories, performance is a major issue. Efficient query
processing requires an appropriate caching mechanism. Unfortunately, standard page-based as well as tuple-based caching mechanisms
designed for conventional databases are not efficient on the Web, where keyword-based querying is often the only way to retrieve
data. In this work, we study the problem of semantic caching of Web queries and develop a caching mechanism for conjunctive
Web queries based on signature files. Our algorithms cope with both semantic containment and intersection relations between a query and the corresponding cache
items. We also develop a cache replacement strategy for situations where cached items differ in size and in their contribution
to partial query answers. We report results of experiments and show how the caching mechanism is realized in the
Knowledge Broker system.
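A minimal sketch of signature-file matching for conjunctive keyword queries: each query's keywords are superimposed into a bit signature, and a cached query can subsume a new one only if its signature bits are a subset of the new query's. Hash choices and signature sizes below are arbitrary; the filter admits false positives but no false negatives.

```python
def signature(words, bits=64, k=3):
    """Superimposed-coding signature: hash every keyword to k bit positions
    and OR them together (the hash seeds are an arbitrary choice here)."""
    sig = 0
    for w in words:
        for seed in range(k):
            sig |= 1 << (hash((w, seed)) % bits)
    return sig

def may_contain(query_sig, cached_sig):
    """A cached conjunctive query can only subsume the new query if every
    signature bit of the cached query also appears in the new query's
    signature (more conjuncts mean a narrower result); false positives
    are possible, false negatives are not."""
    return cached_sig & query_sig == cached_sig
```

A cached result for "web AND cache" is a candidate superset for the narrower query "web AND cache AND semantics"; the exact containment check then runs only on such candidates.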
Received June 15, 1999 / Accepted December 24, 1999
17.
Summary. Long-lived and adaptive implementations of mutual exclusion and renaming in the read/write shared memory model are presented.
An implementation of a task is adaptive if the step complexity of any operation in the implementation is a function of the number of processes that take steps concurrently
with the operation. The renaming algorithm assigns a new unique id in the range to any process whose initial unique name is taken from a set of size N, for an arbitrary N and where k is the number of processes that actually take steps or hold a name while the new name is being acquired. The step complexity
of acquiring a new name is , while the step complexity of releasing a name is 1. The space complexity of the algorithm is where n is an upper bound on the number of processes that may be active at the same time (acquiring or holding new names), which
could be N in the worst case.
Both the system response time and the worst case number of operations per process in the presented mutual-exclusion algorithm
are adaptive. Both algorithms rely on the basic building block of a long-lived and adaptive splitter. While the adaptive-splitter
satisfies a slightly different set of properties than the Moir-Anderson splitter [MA95], it is adaptive and long-lived. In
addition, the new splitter properties enable the construction of a non-blocking long-lived (2k-1)-renaming algorithm (which is optimal in the size of the new name space). We believe that the mechanisms introduced in
our splitter implementation are interesting on their own, and might be used in other adaptive and long-lived constructions.
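For reference, a splitter in the style of Moir and Anderson can be written in a few lines; here plain Python attributes stand in for the shared registers X and Y, and atomicity of the individual reads and writes is assumed. This sketches the classic construction, not the paper's modified long-lived splitter.

```python
class Splitter:
    """Splitter in the Moir-Anderson style: every arriving process leaves
    with 'stop', 'right', or 'down', and at most one process may stop.
    X is a multi-writer register, Y a boolean register."""
    def __init__(self):
        self.X = None
        self.Y = False

    def split(self, pid):
        self.X = pid          # announce myself
        if self.Y:
            return 'right'    # someone already passed the gate
        self.Y = True         # close the gate
        if self.X == pid:
            return 'stop'     # no one overwrote X: I win this splitter
        return 'down'         # a concurrent process interfered
```

A solo process stops; a later process goes right; and an interleaving where a second write to X lands between the first process's steps sends the first process down.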
Received: March 2000 / Accepted: July 2001
18.
On fast microscopic browsing of MPEG-compressed video
Boon-Lock Yeo 《Multimedia Systems》1999,7(4):269-281
MPEG has been established as a compression standard for efficient storage and transmission of digital video. However, users
are limited to VCR-like (and tedious) functionalities when viewing MPEG video. The usefulness of MPEG video is presently limited
by the lack of tools available for fast browsing, manipulation and processing of MPEG video.
In this paper, we first address the problem of rapid access to individual shots and frames in MPEG video. We build upon the
compressed-video-processing framework proposed in [1, 8], and propose new and fast algorithms based on an adaptive mixture
of approximation techniques for extracting spatially reduced image sequences of uniform quality from MPEG video across different frame types and also under different motion activities in the scenes. The algorithms
execute faster than real time on a Pentium personal computer. We demonstrate how the reduced images facilitate fast and convenient
shot- and frame-level video browsing and access, shot-level editing and annotation, without the need for frequent decompression
of MPEG video. We further propose methods for reducing the auxiliary data size associated with the reduced images through
exploitation of spatial and temporal redundancy. We also address how the reduced images lead to computationally efficient algorithms for video analysis based
on intra- and inter-shot processing for video database and browsing applications. The algorithms, tools for browsing and techniques
for video processing presented in this paper have been used by many in IBM Research on more than 30 h of MPEG-1 video for
video browsing and analysis.
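The spatially reduced images can be approximated, for illustration, by one value per 8x8 block: the block mean, which for an I-frame corresponds up to scaling to the block's DC coefficient and so is recoverable without full decoding. This sketch operates on a plain 2-D array rather than an actual MPEG stream.

```python
def reduced_image(frame, block=8):
    """Spatially reduced image: one value per block x block tile, namely the
    tile's mean pixel value (the DC term of the tile's DCT, up to scaling).
    The output is 1/64 the size of the input for the default block of 8."""
    h, w = len(frame), len(frame[0])
    return [[sum(frame[y + dy][x + dx]
                 for dy in range(block) for dx in range(block)) / block ** 2
             for x in range(0, w - block + 1, block)]
            for y in range(0, h - block + 1, block)]
```

A 16x16 frame whose left half is black (0) and right half bright (80) reduces to a 2x2 image that still shows the left/right contrast, which is what makes such thumbnails usable for shot browsing.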
19.
Concurrency control in hierarchical multidatabase systems
Sharad Mehrotra Henry F. Korth Avi Silberschatz 《The VLDB Journal The International Journal on Very Large Data Bases》1997,6(2):152-172
Over the past decade, significant research has been done towards developing transaction management algorithms for multidatabase
systems. Most of this work assumes a monolithic architecture of the multidatabase system with a single software module that
follows a single transaction management algorithm to ensure the consistency of data stored in the local databases. This monolithic
architecture is not appropriate in a multidatabase environment where the system spans multiple different organizations that
are distributed over various geographically distant locations. In this paper, we propose an alternative multidatabase transaction
management architecture, where the system is hierarchical in nature. Hierarchical architecture has consequences on the design
of transaction management algorithms. An implication of the architecture is that the transaction management algorithms followed
by a multidatabase system must be composable – that is, it must be possible to incorporate individual multidatabase systems as elements in a larger multidatabase system.
We present a hierarchical architecture for a multidatabase environment and develop techniques for concurrency control in such
systems.
Edited by R. Sacks-Davis. Received June 27, 1994 / Accepted September 26, 1995
20.
Summary. In a shared-memory distributed system, n independent asynchronous processes communicate by reading and writing to shared variables. An algorithm is adaptive (to total contention) if its step complexity depends only on the actual number, k, of active processes in the execution; this number is unknown in advance and may change in different executions of the algorithm.
Adaptive algorithms are inherently wait-free, providing fault-tolerance in the presence of an arbitrary number of crash failures and different processes' speed.
A wait-free adaptive collect algorithm with O(k) step complexity is presented, together with its applications in wait-free adaptive algorithms for atomic snapshots, immediate
snapshots and renaming.
Received: August 1999 / Accepted: August 2001