Abstract. Information retrieval typically involves accessing textual information from a database in response to a user's vague information need. Hypertext or hypermedia, on the other hand, involves a user browsing through a database of textual or multimedia information in response to a variety of types of information need. Thus information retrieval can be said to have a searching metaphor while hypertext has a browsing analogy. Initially, these two technologies for information access appear to be very different, almost competitive in nature. In this paper information retrieval systems are briefly reviewed and hypertext systems are also examined. These two techniques for accessing information have been integrated into a prototype system which is described. The system dynamically generates guided tours in response to a user's query and the tour guides the user through the hypertext. Some experiments reporting on the effectiveness of this as an information access strategy are given.

A Fuzzy Approach to Classification of Text Documents (total citations: 1; self-citations: 0; citations by others: 1)
This paper discusses the classification problems of text documents. Based on the concept of the proximity degree, the set of words is partitioned into equivalence classes. In particular, the concepts of the semantic field and association degree are introduced. Based on these concepts, the paper presents a fuzzy classification approach for document categorization. Furthermore, by applying the concept of information entropy, it derives approaches for selecting key words from the set of words covering the classification of documents and for constructing the hierarchical structure of key words.
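The abstract does not say how entropy is used to pick key words, so the following is only a sketch of one common reading: a word whose occurrences are spread evenly over the classes has high entropy and says little about the class, while a word concentrated in one class has low entropy. The function names, the `max_entropy` threshold, and the per-class count layout are assumptions made for illustration, not the paper's method.

```python
import math


def class_entropy(counts):
    """Entropy (in bits) of one word's distribution over classes.

    counts: number of occurrences of the word in each class.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    # Classes with zero occurrences contribute nothing to the sum.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def select_keywords(word_class_counts, max_entropy):
    """Keep words whose class distribution is concentrated enough.

    word_class_counts: dict mapping word -> list of per-class counts.
    max_entropy: hypothetical cut-off chosen by the user.
    """
    return sorted(
        word
        for word, counts in word_class_counts.items()
        if class_entropy(counts) <= max_entropy
    )


counts = {"goal": [9, 0], "the": [5, 5]}
print(select_keywords(counts, max_entropy=0.5))
```

Here "the" is split evenly between two classes (entropy of 1 bit) and is dropped, while "goal" occurs in a single class (entropy of 0 bits) and is kept.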

Literature instructors are using hypertext to enhance their teaching in a broad variety of ways that includes putting course materials on the WWW; creating online tutorials; using annotated hypertexts in addition to or in lieu of print texts; having students write hypertexts; examining the medium of hypertext as a literary and cultural theme; and studying hypertext fiction in the context of traditional literature classes. The article describes examples of each of these uses of hypertext in teaching literature and provides sources of further examples of and information on using hypertext as a teaching tool in literature classes. Seth R. Katz is Assistant Professor of English at Bradley University in Peoria, IL. His research interests include computer applications in teaching literature and writing, and the grammatical analysis of poetic language. His recent publications include Graduate Programs and Job Training in Profession 95. I presented a version of this article as part of a session on Hypertexts for Teaching Imaginative Literature at the MLA Convention in Chicago, December 29, 1995.

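Two ways for ASP to access text files: as a database and as a text data stream (total citations: 3; self-citations: 3; citations by others: 0)
This paper introduces two ways in which ASP dynamic web pages can access a text file, as a database and as a text data stream, and discusses the results of both access methods.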

One purported advantage of hypertext systems is the ability to move between semantically related parts of a document (or family of documents). If the document is undergoing frequent modification (for example while an author is writing a book or while a software design stored in the hypertext system is evolving) the question arises as to how to incrementally maintain semantic interconnections in the face of the modifications.

The paper presents an optimal technique for the incremental maintenance of such interconnections as a document evolves. The technique, based on theories of information retrieval based on lexical affinities and theories of incremental computation, updates semantic interconnections as nodes are checked into the hypertext system (either new or as a result of an edit). Because we use the semantic weight of lexical affinities to determine which affinities are meaningful in the global context of the document, introducing a new affinity or changing the weight of an existing affinity can potentially have an effect on any node in the system. The challenge met by our algorithm is to guarantee that despite this potentially arbitrary impact, we still update link information optimally.

Once established, the semantic interconnections are used to allow the user to move from node to node based not on rigid connections but instead on dynamically determined semantic interrelationships among the nodes.

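A naive Bayes text classifier with latent category topic words (total citations: 1; self-citations: 1; citations by others: 0)
The naive Bayes classifier is an effective text classification method. This paper proposes an improved naive Bayes text classifier that uses a limited number of iterations to boost classification accuracy. During the iterations, a weight coefficient and a stagnation coefficient are introduced for each classification instance; after each iteration the two coefficients change accordingly, which reflects the idea of boosting. The final classification result combines the results of all iterations.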

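Hypertext is a kind of unstructured document. Although it does not support cross-page queries or full-text retrieval, it is an important way of organizing and storing information on the Internet. This paper proposes an algorithm for converting hypertext into a structured database. It analyzes the requirements of the structured conversion of hypertext, and uses graph theory to analyze and describe the conversion model and the implementation algorithm. The algorithm has been applied and verified in practice in the Lu Xun Digital Library system.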

The program browsing problem is discussed, with particular emphasis on a multiple-window user interface and its implications for recording acquired knowledge, navigation, and attention-tracking. Hypertext systems are considered as an implementation of browsing techniques for nonprogram text. A classification scheme for text-viewing systems is offered, and then browsing is discussed as a nonintrusive, static technique for program study.

Multiple techniques are synthesised into a coherent plan for a multiwindow program study tool, based on theories of program browsing and the use of hypertext. A test system, HYBROW, emerged from the plan for studying the application of several hypertext multiple-window techniques to program browsing, especially window replacement. HYBROW is a hypertext, multiple-window program browser. This generic tool is applicable to any source language, although certain aspects of the preprocessing and the hierarchical browser presentation are specific to the C language. The tool permits opening an arbitrary number of text windows into an arbitrary number of files, rapid window switching, multiple-window search, placemarking, automatic screen organisation, and services for the creation, maintenance and production of study notes. An informal usability study was conducted.

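Knowledge space theory provides a method for describing the structure of a given knowledge domain and is regarded as a foundation for efficiently assessing a student's level of knowledge, but it is a problem-based theory. Exploiting the similarity between hypertext structure and knowledge space structure, this paper converts the hypertext structure of knowledge points into a hypertext knowledge space, so that knowledge space theory is built on knowledge points, and applies automata principles on this hypertext knowledge space to carry out adaptive assessment of a student's knowledge structure. An automaton-based adaptive assessment algorithm is given, and its effectiveness and complexity are further analyzed.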

International Journal on Document Analysis and Recognition (IJDAR) - Discourse parsing of scholarly documents is the premise and basis for standardizing the writing of scholarly documents,...

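A KNN Chinese text classification algorithm based on center documents (total citations: 3; self-citations: 0; citations by others: 3)
Amid vast data resources, automatic text classification has become a research focus as a way to search for or extract content on specific topics. KNN is an important automatic text classification method: it can handle large-scale data and is fairly stable, but its classification speed is slow. Building on KNN, this work introduces the semantic relationships between feature terms and clusters documents according to those relationships to generate center documents, which reduces the number of documents KNN has to search and increases classification speed. Simulation experiments show that the algorithm significantly increases classification speed without loss of classification accuracy.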
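The idea of searching a handful of center documents instead of every training document can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes one center per class, formed as the mean term-frequency vector of that class, whereas the abstract describes clustering driven by semantic relationships between feature terms. Whitespace tokenization is also an assumption; Chinese text would need word segmentation first. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Term-frequency vector over whitespace tokens (an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_centers(docs):
    """Collapse labelled documents into one mean vector per label.

    docs: list of (text, label) pairs.
    Returns a list of (center_vector, label) pairs.
    """
    sums = defaultdict(Counter)
    sizes = Counter()
    for text, label in docs:
        sums[label].update(vectorize(text))
        sizes[label] += 1
    return [
        (Counter({t: v / sizes[label] for t, v in vec.items()}), label)
        for label, vec in sums.items()
    ]


def classify(text, centers, k=1):
    """KNN vote over the k most similar centers, not over all documents."""
    query = vectorize(text)
    ranked = sorted(centers, key=lambda c: cosine(query, c[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


training = [
    ("stock market shares", "finance"),
    ("bank market loan", "finance"),
    ("football match goal", "sports"),
    ("tennis match player", "sports"),
]
centers = build_centers(training)
print(classify("market shares fall", centers))
```

With four training documents and two centers, each query is compared against two vectors rather than four; the saving grows with the size of the training set.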

The conventional hypertext authoring framework compels authors to represent their material as an interconnected network of nodes and links. Apart from the difficulties that this alone entails, the situation with HTML is even more problematic since the author is also responsible for mapping the abstract network model onto the computer file system. This is likely to hinder the widespread adoption of HTML by information owners who are finding it difficult not only to create but also to maintain coherent documents with complex interconnection topologies. In this paper it is argued that familiar document forms such as books, manuals, articles, reports, etc., often contain sufficient structural and cross-referential cues with which to build a rich hypertextual structure. It is shown how this structure can be automatically extracted and then realised as a collection of HTML files which can be explored using generated navigation panels. The conversion process and the advantages of this approach are illustrated with interactive examples using the LaTeX2HTML converter. Other unique features of LaTeX2HTML (mathematical equations and “conditional text”) are also discussed. Allowing authors to work with familiar metaphors and tools without compromising the flexibility afforded to them by the target hypertext system and delivery mechanism is perhaps the main reason for the growing popularity of text to hypertext conversion tools.

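Existing methods for extracting text content from PDF documents (such as the one used by the PDFBox library) are slow: they cannot meet the demands of content analysis in high-speed networks, nor can they process partially arrived PDF data packets in a streaming fashion. To address this, a PDF text content extraction method based on automata theory is proposed. By building a hierarchical keyword automaton, the method can quickly extract text content from both complete and incomplete PDF documents. Experimental results on Chinese and English PDF document data sets show that the automata-based method takes only 17% to 37% of the time required by the PDFBox method.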
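To show why an automaton suits partially arrived data, here is a toy state machine that keeps its state between chunks, so a string split across two packets is still recovered. It is not the paper's hierarchical keyword automaton: it only recognizes the `BT` and `ET` text-object keywords and parenthesized literal strings in an already decompressed content stream, and it simplifies escape handling by keeping the character after a backslash as-is. The class name and its methods are hypothetical.

```python
class TextObjectScanner:
    """Toy streaming scanner for literal strings inside BT ... ET blocks."""

    def __init__(self):
        self.in_text = False  # currently between BT and ET
        self.depth = 0        # parenthesis nesting inside a literal string
        self.keep = False     # whether the current string is inside BT ... ET
        self.escape = False   # previous character was a backslash
        self.token = []       # characters of the bare token being read
        self.current = []     # characters of the string being read
        self.strings = []     # strings extracted so far

    def _end_token(self):
        word = "".join(self.token)
        self.token.clear()
        if word == "BT":
            self.in_text = True
        elif word == "ET":
            self.in_text = False

    def feed(self, chunk):
        """Consume one chunk; state carries over to the next call."""
        for ch in chunk:
            if self.depth > 0:  # inside a literal string
                if self.escape:
                    self.current.append(ch)  # simplified escape handling
                    self.escape = False
                elif ch == "\\":
                    self.escape = True
                elif ch == "(":
                    self.depth += 1
                    self.current.append(ch)
                elif ch == ")":
                    self.depth -= 1
                    if self.depth > 0:
                        self.current.append(ch)
                    else:
                        if self.keep:
                            self.strings.append("".join(self.current))
                        self.current.clear()
                else:
                    self.current.append(ch)
            elif ch == "(":
                self._end_token()
                self.depth = 1
                self.keep = self.in_text
            elif ch.isspace():
                self._end_token()
            else:
                self.token.append(ch)


scanner = TextObjectScanner()
scanner.feed("BT (Hel")  # first packet ends in the middle of a string
scanner.feed("lo) Tj ET (skip) BT (World) Tj ET")
print(scanner.strings)
```

Because every decision depends only on the current state and the next character, the scanner never needs the whole document in memory, which is the property the abstract relies on for incomplete PDF data.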

Hypertext may represent a new paradigm capable of exploring legal sources within which links are established according to pertinent relationships found between statute texts and case law. However, to discover relevant information in such a network, a browsing mechanism is not enough when faced with a large volume of texts. This paper describes a new retrieval model where documents are represented according to both their content and relationships with other sources of information.

A study is described which examines the effects of two hypertext topologies (hierarchy and non-linear) on navigation performance compared to a linear version of the same document. Subjects used the document to answer 10 questions. After a distraction period, subjects returned to the document to locate five specified nodes. Speed and accuracy measures were taken, and the subjects' own evaluation of their performance was assessed using a questionnaire. The results showed that subjects performed better with the linear text than with the non-linear text, while performance on the hierarchical document fell between these two extremes. Analysis of the questionnaire data confirmed these differences. The results are discussed in terms of their implications for computer-assisted learning systems.

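Research and implementation of a method for automatic extraction of text topics (total citations: 1; self-citations: 0; citations by others: 1)
Zhang Qiwen (张其文), Li Ming (李明). Computer Engineering and Design (计算机工程与设计), 2006, 27(15): 2744-2746, 2766
Based on an in-depth analysis of currently popular text topic extraction techniques and methods, this paper incorporates semantic methods into a statistical algorithm, proposes a statistics-based topic extraction method, and describes its implementation. The method uses the semantic relatedness between sentences within a document to generate the text topic automatically. First, the text is segmented into words and sentences to split the information; next, text clustering is applied to the sentences to merge the information; finally, a representative sentence is extracted from each cluster to form the text topic. Experimental results show that the method is effective and practical.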
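The three steps named in the abstract (split into sentences, cluster the sentences, pick one representative per cluster) can be sketched as below. The specifics are assumptions made for illustration: bag-of-words cosine similarity stands in for the paper's semantic relatedness measure, the clustering is a simple greedy single pass with a hypothetical `threshold`, and the regular-expression tokenizer does not perform the Chinese word segmentation the paper relies on.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Split on Western and Chinese sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"[.!?。！？]+", text) if s.strip()]


def bag(sentence):
    """Bag-of-words vector; no real word segmentation is done here."""
    return Counter(re.findall(r"\w+", sentence.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_sentences(sentences, threshold=0.3):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of sentence indices
    for i, sentence in enumerate(sentences):
        vec = bag(sentence)
        for cluster in clusters:
            if cosine(vec, bag(sentences[cluster[0]])) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def representative(sentences, cluster):
    """Index of the sentence closest to the cluster's summed word vector."""
    total = Counter()
    for i in cluster:
        total.update(bag(sentences[i]))
    return max(cluster, key=lambda i: cosine(bag(sentences[i]), total))


def extract_topic(text, threshold=0.3):
    """Return one representative sentence per cluster, in document order."""
    sentences = split_sentences(text)
    clusters = cluster_sentences(sentences, threshold)
    picks = sorted(representative(sentences, c) for c in clusters)
    return [sentences[i] for i in picks]


sample = "Cats chase mice. Cats chase small mice often. Stocks fell sharply today."
print(extract_topic(sample))
```

In the sample, the two sentences about cats fall into one cluster and the sentence about stocks forms its own, so the output contains two sentences, one per cluster.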

Similarity searching in text databases with multiple field types is still an open problem. We focus our attention on the ‘Community Research and Development Information Service’ (CORDIS) database of the European Union and we evaluate the effectiveness of many text retrieval methods in terms of precision, recall and ranking quality. Our experiments indicate that different field types should be handled by different retrieval methods. Copyright © 2000 John Wiley & Sons, Ltd.
