5 results found (search time: 15 ms)
1.
Data-Intensive Web Sites: Design and Maintenance (total citations: 1, self-citations: 0, citations by others: 1)
2.
3.
Grammars have exceptions (total citations: 9, self-citations: 0, citations by others: 9)
4.
We develop a new algorithm for clustering search results. Unlike many other clustering systems recently proposed as a post-processing step for Web search engines, our system is not based on phrase analysis inside snippets, but instead uses latent semantic indexing on the whole document content. A main contribution of the paper is a novel strategy, called dynamic SVD clustering, to discover the optimal number of singular values to be used for clustering purposes. Moreover, the SVD computation step performs well in practice, which makes it feasible to perform clustering when term vectors are available. We show that the algorithm has very good classification performance, and that it can be effectively used to cluster the results of a search engine to make them easier for users to browse. The algorithm has been integrated into the Noodles search engine, a tool for searching and clustering Web and desktop documents.
5.
This paper develops a database query language called Transducer Datalog motivated by the needs of a new and emerging class of database applications. In these applications, such as text databases and genome databases, the storage and manipulation of long character sequences is a crucial feature. The issues involved in managing this kind of data are not addressed by traditional database systems, either in theory or in practice. To address these issues, we recently introduced a new machine model called a generalized sequence transducer. These generalized transducers extend ordinary transducers by allowing them to invoke other transducers as “subroutines.” This paper establishes the computational properties of Transducer Datalog, a query language based on this new machine model. In the process, we develop a hierarchy of time-complexity classes based on the Ackermann function. The lower levels of this hierarchy correspond to well-known complexity classes, such as polynomial time and hyper-exponential time. We establish a tight relationship between levels in this hierarchy and the depth of subroutine calls within Transducer Datalog programs. Finally, we show that Transducer Datalog programs of arbitrary depth express exactly the sequence functions computable in primitive-recursive time.
Received: 12 March 1998 / 30 August 1999