Similar Articles
20 similar articles found.
1.
ABSTRACT

A federated search service does not stand still. Software changes on a fairly predictable schedule, but content is constantly in flux as vendors change their products and platforms and as libraries and librarians change their selection of products and vendors. It is important to have a plan for distributing maintenance responsibilities and a workflow that integrates the maintenance of the federated search tool into existing routines. The extent to which these routines can be automated is a focus of this article. doi:10.1300/J136v12n03_05

2.
A website yellow-pages system automatically generates a directory of websites and, on that basis, offers users a range of services. By rapidly collecting educational resources from the Web and automatically classifying them and extracting information from them to a high standard, it builds a yellow pages of educational websites through which users can browse and search. Without secondary development, however, such yellow-pages systems generally retrieve with low accuracy and are unsuitable for use on campus networks. To address the inherent shortcomings of general-purpose search engines, this paper proposes a search engine for news retrieval: an open-source web crawler fetches Internet content to local storage, and Lucene's open API is then used to index and search the targeted information.
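To make the crawl-then-index pipeline concrete, here is a minimal Lucene sketch (assuming Lucene 9.x; the field names and the sample document are illustrative, not from the paper): documents fetched by the crawler would be added to the index and then searched with a keyword query.

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.store.ByteBuffersDirectory;
    import org.apache.lucene.store.Directory;

    public class NewsIndexSketch {
        public static void main(String[] args) throws Exception {
            Directory dir = new ByteBuffersDirectory();   // in-memory index for the demo
            StandardAnalyzer analyzer = new StandardAnalyzer();

            // Indexing: in the described pipeline, each document would come from the crawler.
            try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
                Document doc = new Document();
                doc.add(new TextField("title", "Campus news item", Field.Store.YES));
                doc.add(new TextField("body", "A crawled news page indexed by Lucene.", Field.Store.YES));
                writer.addDocument(doc);
            }

            // Searching: parse a keyword query against the body field and print the top hits.
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                Query q = new QueryParser("body", analyzer).parse("news");
                for (ScoreDoc hit : searcher.search(q, 10).scoreDocs) {
                    System.out.println(searcher.doc(hit.doc).get("title"));
                }
            }
        }
    }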

3.
Shneiderman, B. IEEE Software, 1997, 14(2): 18-20
Searching textual databases can be confusing for users. Popular search systems for the World Wide Web and stand-alone systems typically provide a simple interface: users type in keywords and receive a relevance-ranked list of 10 results. This is appealing in its simplicity, but users are often frustrated because search results are confusing or aspects of the search are out of their control. If we are to improve user performance, reduce mistaken assumptions, and increase successful searches, we need more predictable design. To coordinate design practice, we suggest a four-phase framework that would satisfy first-time, intermittent, and frequent users accessing a variety of textual and multimedia libraries.

4.
Interoperation among different component libraries can markedly improve the efficiency with which reusers retrieve components, and classification is the basis of retrieval. By establishing a transformation scheme for query conditions across multiple component libraries classified by keywords and by ontologies, the approach helps users retrieve components from libraries based on either classification scheme, spares them from repeatedly constructing different query conditions for the same need, reduces reusers' comprehension cost, and improves the recall of keyword queries against ontology-based component libraries. Experimental results demonstrate the effectiveness and feasibility of the method.

5.
Software reuse is important, especially product reuse. This paper describes a retrieval system for software components, the most popular form of product reuse. The system is distributed, embedded in the web, and based on structured retrieval using a classification schema. After defining the requirements for the system, we first discuss the advanced outward-facing functionality of the component retrieval system, such as its multi-paradigmatic classification approach, the ability to extend or change the schema, the navigational facility through different views, and the system's interface to search engines. Then the most interesting topics of the system's realization are discussed, such as dynamic web page generation and personalization, how the specific environments for different roles are built, how schema modification is handled, and how the system's own design was driven by software reuse. Some measurements of the system's external behavior and of its convenience for users are given.

6.
Browsing and searching are two prominent paradigms in information retrieval. In current digital library implementations, exploratory browsing is either not available as an option or is commonly presented as an alphabetical listing of chosen categories, depending on the scope of the digital collections. In addition, users have to switch between different information spaces for browsing and searching. This research proposes an information retrieval paradigm of integrated faceted-browser and direct-search interfaces for text-based digital libraries. Experimental results show that, compared to a conventional alphabetical browser, the faceted browser can significantly improve the effectiveness (by 30.8%, p = .015) and efficiency (by 11.3%, p = .001) of information retrieval. Also, compared to an unintegrated alphabetical browser with direct-search interfaces, the integrated faceted browser with direct-search interfaces can significantly improve the effectiveness of information retrieval (by 35.7%, p = .03) and bring users greater satisfaction with the process (by 34.8%, p < .03).

7.
Most current search engines cannot accurately recognize that users who issue different query terms may expect different query results. This paper proposes a search engine based on a user behavior model and describes the implementation and key techniques of its prototype system, SEB. Drawing on theories of human behavior, the behavior model classifies and represents users' access behavior, and the search results are post-processed accordingly to realize personalized search. Experiments show that search results processed by the SEB prototype better match users' needs.

8.
One of the key promises of the Semantic Web is its potential to enable and facilitate data interoperability. The ability of data providers and application developers to share and reuse ontologies is a critical component of this data interoperability: if different applications and data sources use the same set of well defined terms for describing their domain and data, it will be much easier for them to “talk” to one another. Ontology libraries are the systems that collect ontologies from different sources and facilitate the tasks of finding, exploring, and using these ontologies. Thus ontology libraries can serve as a link in enabling diverse users and applications to discover, evaluate, use, and publish ontologies. In this paper, we provide a survey of the growing—and surprisingly diverse—landscape of ontology libraries. We highlight how the varying scope and intended use of the libraries affects their features, content, and potential exploitation in applications. From reviewing 11 ontology libraries, we identify a core set of questions that ontology practitioners and users should consider in choosing an ontology library for finding ontologies or publishing their own. We also discuss the research challenges that emerge from this survey, for the developers of ontology libraries to address.

9.
SUMMARY

Federated searching software offers much promise to users as a convenient way of accessing the wealth of electronic information resources libraries provide. But metasearching is not the same as Google searching; care must be taken in organizing and presenting search options and results so that they are comprehensible to users. From software installation to usability testing and creating documentation, most of the work of implementation is behind-the-scenes and hidden from most library staff and users; however, the decisions made during implementation greatly affect staff and user experiences with the product as well as its overall utility and usability. Systematic testing of the product is necessary to make informed and defensible decisions. This article details three layers of testing (technical, functional, and usability) recommended during implementation of a federated search product, based on best practices in the literature, metasearch standards, and the authors' own experiences with implementing a locally developed broadcast search system and the federated search system WebFeat.

10.
11.
An automatic classification method for original documents based on image features and a layout analysis method based on a rule hypothesis tree are proposed. On this basis, an intelligent document-filling system that digitizes the original documents is designed, suitable for cellphones and tablets. When users fill in documents online, the information can be entered into the financial information system automatically, merely by taking photos of the original documents. This not only saves time but also ensures consistency between the data entered online and the data on the original documents. Experiments show that document classification reaches 88.38% accuracy, document filling reaches 87.22% accuracy, and processing takes 5.042 seconds per document. The system can be applied in finance, government, libraries, electric power, enterprises, and many other sectors, and has high economic and practical value.

12.
A personalized search algorithm based on content filtering
曾春, 邢春晓, 周立柱. Journal of Software (软件学报), 2003, 14(5): 999-1004
Traditional information retrieval techniques satisfy users' needs to some extent, but because of their general-purpose nature they still cannot serve queries issued from different backgrounds, with different intents, and at different times. This paper proposes a personalized search algorithm based on content filtering. A user's interest model is expressed as a probability distribution over a domain classification model, and methods for computing similarity and for updating the interest model are then given. Comparative experiments show that the probabilistic model expresses user interests and their evolution better than the vector space model.
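The abstract names the ingredients (a probability distribution over domain categories, a similarity computation, and an interest-model update) without giving formulas, so the sketch below fills them in with simple assumed choices: a convex-combination update with learning rate alpha and cosine similarity. It is an illustration, not the paper's algorithm.

    import java.util.HashMap;
    import java.util.Map;

    /** Sketch of a user interest model as a probability distribution over
     *  domain categories. The update rule and similarity are assumptions,
     *  not the paper's exact formulas. */
    public class InterestModel {
        private final Map<String, Double> p = new HashMap<>();
        private final double alpha;   // learning rate for interest drift (assumed)

        public InterestModel(double alpha, String... categories) {
            this.alpha = alpha;
            for (String c : categories) p.put(c, 1.0 / categories.length); // uniform prior
        }

        /** Blend the old distribution with the category distribution of a
         *  viewed document, then renormalize so the weights stay a distribution. */
        public void update(Map<String, Double> docCategoryDist) {
            double z = 0.0;
            for (String c : p.keySet()) {
                double v = (1 - alpha) * p.get(c) + alpha * docCategoryDist.getOrDefault(c, 0.0);
                p.put(c, v);
                z += v;
            }
            for (String c : p.keySet()) p.put(c, p.get(c) / z);
        }

        /** Cosine similarity between the user's interest distribution and a
         *  document's category distribution (one plausible choice). */
        public double similarity(Map<String, Double> docCategoryDist) {
            double dot = 0, nu = 0, nd = 0;
            for (String c : p.keySet()) {
                double u = p.get(c), d = docCategoryDist.getOrDefault(c, 0.0);
                dot += u * d; nu += u * u; nd += d * d;
            }
            return nd == 0 ? 0 : dot / (Math.sqrt(nu) * Math.sqrt(nd));
        }
    }

In use, a click on a document with a known category distribution would call update(), and result ranking would weight candidate documents by similarity().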

13.
Fuzzy keyword queries over non-numeric attributes in relational databases
Keyword search over relational databases lets users retrieve relevant data from a database the way they would with a search engine. However, this technique only implements exact queries and does not yet support fuzzy queries well. This paper adapts the Rocchio algorithm from classification learning, with minor modifications, to keyword queries over databases. Combined with quantitative computation of the dissimilarity and relevance between objects of different types, each returned result set is sorted in descending order of relevance, moving queries from exact to fuzzy. If users are not satisfied with the initial result set, a few rounds of interaction with the Rocchio algorithm progressively satisfy their needs. Experimental tests of the feedback process with the weight-optimized Rocchio algorithm show that users find the results fairly satisfactory, and that a small number of irrelevant sets in the returned results can improve query performance.
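The paper's modifications to Rocchio are not spelled out in the abstract; below is the classic Rocchio feedback step it builds on, over term-weight vectors, with the commonly used coefficients (alpha = 1, beta = 0.75, gamma = 0.15) as assumed defaults.

    import java.util.List;

    /** Classic Rocchio relevance feedback:
     *  q' = alpha*q + beta*centroid(relevant) - gamma*centroid(nonRelevant). */
    public class Rocchio {
        public static double[] refine(double[] query, List<double[]> relevant,
                                      List<double[]> nonRelevant,
                                      double alpha, double beta, double gamma) {
            double[] q = new double[query.length];
            for (int i = 0; i < q.length; i++) q[i] = alpha * query[i];
            for (double[] d : relevant)                       // pull toward judged-relevant results
                for (int i = 0; i < q.length; i++) q[i] += beta * d[i] / relevant.size();
            for (double[] d : nonRelevant)                    // push away from judged-irrelevant results
                for (int i = 0; i < q.length; i++) q[i] -= gamma * d[i] / nonRelevant.size();
            for (int i = 0; i < q.length; i++) q[i] = Math.max(0, q[i]); // clip negative weights, as is common
            return q;
        }
    }

Each user interaction supplies the relevant/nonRelevant lists, and the refined vector re-ranks the next result set, which is how a few feedback rounds move an exact keyword query toward the user's fuzzy intent.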

14.
15.
The information revolution is bringing people of different backgrounds from around the world into a global information superhighway. The Internet provides a global platform connecting thousands of networks around the world. There is a variety of information available on the Internet for the users. It has been considered as a forum for users to share worldwide information resources. The resources are so vast that many of us really cannot grasp or understand the Internet fully. It has become a ‘global information library’ which allows the users to participate in group discussions, search for any information, start any discussion with others and so on. It can be considered as a hybrid environment of postal services, citizen's band radio, libraries and neighborhood community centers where we (‘we’ is mainly used in this paper in its generic form) can spend time with our friends (‘our’ is also mainly used generically). Internet users (Internauts) share jokes, gossip in on-line conferences and join special groups to keep abreast of their specific interests. The main objective of this tutorial is to discuss various services on the Internet, their implementations, various Internet tools, and interconnection to the Internet. Other important issues like Internet addressing, the domain name system, IP addressing, etc. are discussed in detail in order to understand some design concepts. A brief list of the different types of browsers for different platforms is given. A discussion on the future of the Internet is given via different advances and tools defined to provide security, interconnectivity and other related issues.

16.
An authorization model for a distributed hypertext system
Digital libraries support quick and efficient access to a large number of information sources that are distributed but interlinked. As the amount of information to be shared grows, the need to restrict access only to specific users or for specific usage will surely arise. The protection of information in digital libraries, however, is difficult because of the peculiarity of the hypertext paradigm, which is generally used to represent information in digital libraries, together with the fact that related data in a hypertext are often distributed at different sites. We present an authorization model for distributed hypertext systems. Our model supports authorizations at different granularity levels, takes into consideration different types of data and the relationships among them, and allows administrative privileges to be delegated.
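As a toy illustration of just one idea the abstract names, authorization at different granularity levels, the sketch below lets a grant on a hypertext object implicitly cover the objects nested inside it; data types, inter-object relationships, and delegation of administrative privileges are omitted, so this is not the paper's model.

    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    /** Multi-granularity authorization sketch (Java 16+ for records):
     *  a privilege granted on an object applies to all objects nested in it. */
    public class AuthSketch {
        record Grant(String subject, String object, String privilege) {}

        private final Map<String, String> parent = new HashMap<>(); // object -> enclosing object (null = root)
        private final Set<Grant> grants = new HashSet<>();

        public void addObject(String obj, String parentObj) { parent.put(obj, parentObj); }
        public void grant(String subj, String obj, String priv) { grants.add(new Grant(subj, obj, priv)); }

        /** Walk from the object up through its ancestors looking for a matching grant. */
        public boolean check(String subj, String obj, String priv) {
            for (String o = obj; o != null; o = parent.get(o))
                if (grants.contains(new Grant(subj, o, priv))) return true;
            return false;
        }
    }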

17.
Service-oriented computing (SOC) suggests that the Internet will be an open repository of many modular capabilities realized as web services. Organizations may be able to leverage this SOC paradigm if their employees are able to ubiquitously incorporate such capabilities and their resulting information into their daily practices. It is impractical to assume that human users will be able to manually search vast distributed repositories in real time. This paper presents an architecture, Software Agent-Based Groupware using E-services (SAGE), that incorporates the use of intelligent agents to integrate human users with web services. SAGE provides background search and discovery approaches, thus enabling human users to exploit service-based capabilities that were previously too time-consuming to locate and integrate. We present a multi-agent system where each agent learns the rule-based preferences of a human user with regard to their current operational “context” and manages the incorporation of relevant web services.

18.
Choice of a classification algorithm is generally based upon a number of factors, among which are availability of software, ease of use, and performance, measured here by overall classification accuracy. The maximum likelihood (ML) procedure is, for many users, the algorithm of choice because of its ready availability and the fact that it does not require an extended training process. Artificial neural networks (ANNs) are now widely used by researchers, but their operational applications are hindered by the need for the user to specify the configuration of the network architecture and to provide values for a number of parameters, both of which affect performance. The ANN also requires an extended training phase. In the past few years, the use of decision trees (DTs) to classify remotely sensed data has increased. Proponents of the method claim that it has a number of advantages over the ML and ANN algorithms. The DT is computationally fast, makes no statistical assumptions, and can handle data that are represented on different measurement scales. Software to implement DTs is readily available over the Internet. Pruning of DTs can make them smaller and more easily interpretable, while the use of boosting techniques can improve performance. In this study, separate test and training data sets from two different geographical areas and two different sensors—multispectral Landsat ETM+ and hyperspectral DAIS—are used to evaluate the performance of univariate and multivariate DTs for land cover classification. Factors considered are: the effects of variations in training data set size and of the dimensionality of the feature space, together with the impact of boosting, attribute selection measures, and pruning. The level of classification accuracy achieved by the DT is compared to results from the back-propagating ANN and ML classifiers. Our results indicate that the performance of the univariate DT is acceptably good in comparison with that of other classifiers, except with high-dimensional data. Classification accuracy increases linearly with training data set size, to a limit of 300 pixels per class in this case. Multivariate DTs do not appear to perform better than univariate DTs. While boosting produces an increase in classification accuracy of between 3% and 6%, the use of attribute selection methods does not appear to be justified in terms of accuracy increases. However, neither the univariate DT nor the multivariate DT performed as well as the ANN or ML classifiers with high-dimensional data.

19.
Software defect prediction aims to find potential defects based on historical data and software features. Software features can reflect the characteristics of software modules. However, some of these features may be more relevant to the class (defective or non-defective), while others may be redundant or irrelevant. To fully measure the correlation between different features and the class, we present a feature selection approach based on a similarity measure (SM) for software defect prediction. First, the feature weights are updated according to the similarity of samples in different classes. Second, a feature ranking list is generated by sorting the feature weights in descending order, and all feature subsets are selected from the feature ranking list in sequence. Finally, all feature subsets are evaluated on a k-nearest neighbor (KNN) model and measured by an area under curve (AUC) metric for classification performance. The experiments are conducted on 11 National Aeronautics and Space Administration (NASA) datasets, and the results show that our approach performs better than or is comparable to the compared feature selection approaches in terms of classification performance.
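The exact SM weight-update formula is not given in the abstract, so this sketch substitutes the closely related Relief-style rule: a feature's weight grows when it separates a sample from its nearest other-class neighbor and shrinks when it disagrees with the sample's nearest same-class neighbor. Sorting the weights in descending order yields the feature ranking list that the KNN/AUC evaluation would then consume.

    /** Relief-style similarity-based feature weighting (an assumed stand-in
     *  for the paper's SM formula). Assumes each class has at least two samples. */
    public class FeatureWeighting {
        public static double[] weights(double[][] x, int[] y) {
            int n = x.length, m = x[0].length;
            double[] w = new double[m];
            for (int i = 0; i < n; i++) {
                int hit = nearest(x, y, i, true);    // nearest same-class sample
                int miss = nearest(x, y, i, false);  // nearest other-class sample
                for (int f = 0; f < m; f++)
                    w[f] += Math.abs(x[i][f] - x[miss][f]) - Math.abs(x[i][f] - x[hit][f]);
            }
            return w; // sort indices by descending weight to obtain the ranking list
        }

        private static int nearest(double[][] x, int[] y, int i, boolean sameClass) {
            int best = -1;
            double bestD = Double.MAX_VALUE;
            for (int j = 0; j < x.length; j++) {
                if (j == i || (y[j] == y[i]) != sameClass) continue;
                double d = 0;                        // Manhattan distance over all features
                for (int f = 0; f < x[0].length; f++) d += Math.abs(x[i][f] - x[j][f]);
                if (d < bestD) { bestD = d; best = j; }
            }
            return best;
        }
    }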

20.
Abstract

Librarians should think explicitly about Google users whenever they publish on the Web, and should update their policies and procedures accordingly. The article describes procedures that libraries can adopt to ensure that their publications are optimised for access by users of Google and other Web search engines. The aim of these procedures is to enhance resource discovery and information retrieval, and to enhance the reputation of libraries as valued custodians of published information, as well as exemplars of good practice in information management.
