Similar literature
 Found 20 similar documents; search took 15 ms
1.
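A content-based method for processing web page information   Total citations: 2, self-citations: 0, citations by others: 2
A content-based method for processing web page information is proposed: order filtering is applied directly to the rendered page content, and the filtered page is then converted to a black-and-white page according to a preset threshold, so that the image information in the page is extracted. Results on several web pages containing images show that the method extracts image information from web pages fairly effectively. Because this method and web page text extraction methods rest on different principles, the authors also attempted to build a content-based web page information processing system that combines the two.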
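The filter-then-threshold step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which order filter is used, so a 3x3 median filter (one common order-statistic filter) stands in for it, and the names `median_filter`, `binarize`, and the threshold value 128 are hypothetical.

```python
from statistics import median


def median_filter(img, radius=1):
    """Median-filter a grayscale image given as a list of rows.

    Assumption: the median is used as a stand-in order-statistic filter.
    Border pixels use only the part of the window that lies inside the image.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            out[y][x] = median(window)
    return out


def binarize(img, threshold=128):
    """Map each pixel to 255 (white) if it reaches the threshold, else 0 (black)."""
    return [[255 if v >= threshold else 0 for v in row] for row in img]


# Hypothetical 3x3 page fragment with one bright noise pixel in the middle.
page = [[10, 10, 10], [10, 250, 10], [10, 10, 10]]
black_white = binarize(median_filter(page))
```

In this toy input the isolated bright pixel is smoothed away before thresholding, so the whole fragment ends up black; larger bright regions such as embedded images would survive the filter and remain white.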

2.
Although many web pages consist of blocks of text surrounded by graphics, there is a lack of valid empirical research to aid the design of this type of page [D. Diaper, P. Waelend, Interact. Comput. 13 (2000) 163]. In particular, little is known about the influence of animations on interaction with web pages. Proportion, in particular the Golden Section, is known to be a key determinant of the aesthetic quality of objects, and aesthetics have recently been identified as a powerful factor in the quality of human–computer interaction [N. Tractinsky, A.S. Katz, D. Ikar, Interact. Comput. 13 (2000) 127]. The current study aimed to establish the relative strength of the effects of graphical display and screen ratio of content and navigation areas in web pages, using an information retrieval task and a split-plot experimental research design. Results demonstrated an effect of screen ratio, but no effect of graphical display, on task performance and two subjective outcome measures. However, there was an effect of graphical display on perceived distraction, with animated display leading to more distraction than static display, t(64) = 2.33. Results are discussed in terms of processes of perception and attention, and recommendations for web page design are given.

3.
A knowledge base containing incomplete information, in the form of disjunctions and negative information, poses difficulties for update operators. In this paper, simple and straightforward definitions are given for an ‘adding’ operator (‘+’) and a ‘removing’ operator (‘−’) using Herbrand models.

4.
Matching performance of vehicle icons in graphical and textual formats   Total citations: 1, self-citations: 0, citations by others: 1
The current research classified 82 vehicle icons into seven categories (image-related, concept-related, semi-abstract, arbitrary, abbreviation, word, and combined) and measured their matching accuracy, matching sequence, and matching time. These data can be compared and used as a framework for future icon development. Forty participants, all with a university degree, took part in this experiment. Half of the participants had intensive driving experience, while the other half had never driven a car. The results indicated that, on average, word icons had a significantly greater matching accuracy than the other icon formats, with differences ranging from 4.7% to 20.8%. Regarding the matching sequence, participants matched image-related icons before other icon formats. Arbitrary and combined icons took significantly longer to match than other icon formats, by 1.4–6.2 s. Based on the high matching accuracy (86.3%) and high ratings on subjective design features, the word format can be used for functions describable in simple English for users with English reading ability. Confusion matrices showed that 63.2% of the misunderstandings were caused by similarity in format or function.

5.
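To address topic drift in Web information extraction effectively, a calculation method that reflects the information entropy of Web pages more accurately, called mixed entropy, is proposed. The method considers the information blocks whose entropy is to be computed in the setting of a multi-page website: it takes into account the influence of information within a page on the entropy calculation, together with the influence of identical information distributions across pages generated from a template, thereby ensuring the accuracy of the entropy calculation. The method is applied to computing the information entropy of information blocks in information extraction, and the simulation results are compared with those of other algorithms. The results show that the accuracy of the entropy computed by this method, and the separation it gives between topic-relevant and topic-irrelevant information blocks, are better than those of the other methods.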
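For orientation, the quantity that such methods start from is the Shannon entropy of the term distribution inside one information block. The sketch below computes only that baseline; it is not the mixed entropy of the abstract, whose definition is not given here, and the function name `shannon_entropy` and the sample tokens are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy (in bits) of the empirical term distribution of a block.

    Baseline only: this ignores the cross-page, template-level effects that
    the abstract says its mixed entropy accounts for.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical blocks: a navigation bar repeats one term, body text varies.
nav_block = ["home", "home", "home", "home"]
text_block = ["entropy", "block", "topic", "page"]
nav_entropy = shannon_entropy(nav_block)
text_entropy = shannon_entropy(text_block)
```

A block that repeats a single term has zero entropy, while a block with four equally frequent terms has two bits.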

6.
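Information spreads over the network quickly and at low cost, so more and more enterprises and individuals choose to publish product information online, for example on cars and real estate. Most of this content is presented as information with some structure, such as tables, but the presentation differs greatly from one website to another. Guided by domain ontology knowledge, the paper proposes extracting product information that is expressed mainly in tables and, taking real estate as an example, automatically integrating information on the same kind of services or products from different websites, in order to support specialized retrieval.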

7.
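The amount of information in a single page far exceeds what a particular user needs from that page. To obtain the information of interest to a particular user from the current page quickly and accurately, an active page-information retrieval model is proposed. In this model, the current Web page is converted into an information tree according to the characteristics of the page's blocks, a user feature tree is constructed from the user's past browsing behavior, the user feature tree is mined to produce the set of information the user needs, and the needed information is then retrieved from the current page to obtain the user's interest information set. The basic principle of active retrieval is described in detail, the corresponding algorithm is given, and experiments show that the model is feasible.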

8.
Group awareness is critical to improving the collaboration efficiency of a group, especially when teammates are geographically separated while working on the web. Previous studies have focused mainly on enhancing the awareness of the current working status of teammates, such as web pages being viewed, or other web activities, and they seldom take into account past working/browsing information, such as web pages visited or past web activities. However, the awareness of this kind of historical information can be useful for group collaboration. In this paper, we propose a novel approach to sharing web page visitation information among teammates. We present the design and implementation of our prototype, named Shared Browsing History. We then describe two user studies in which three groups with eight participants each used the prototype. The results of these studies show that our approach was effective in enhancing participants’ group awareness and improved group collaborative efficiency in programming and software development tasks.

9.
Spatial information sharing services based on the Semantic Web   Total citations: 3, self-citations: 2, citations by others: 3
谢储晖  郭达志 《计算机工程与设计》2005,26(10):2674-2676,2680
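Accessing, exchanging, and integrating spatial information is one of the hot topics in Web application research. However, the Web was designed for human use: people must browse, understand, select, and navigate Web information. Moreover, because of semantic conflicts between data and the lack of tools for integrated access to shared spatial information, spatial information is hard to exploit. Using Semantic Web technology and common ontologies, a method for finding spatial information on the Internet is proposed. The paper first introduces the main concepts of the Semantic Web, then describes an ontology for resolving semantic conflicts, and finally discusses in detail how to implement spatial information sharing services on the Semantic Web.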

10.
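Design of an enterprise information integration platform based on Web Service   Total citations: 1, self-citations: 0, citations by others: 1
Integrating management information systems that are distributed, heterogeneous, autonomous, and subject to data change is an urgent task. An enterprise information integration platform solution based on Web Service is proposed. The system architecture of the platform is described; the platform enables information transfer and sharing between heterogeneous systems. Finally, the feasibility of the approach is verified with a practical case.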

11.
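The paper describes the main web page authoring techniques used to build a public service information consultation system and the concrete implementation process, including a brief introduction to the authoring tools, the overall planning work, creating and editing web pages with Dreamweaver, and the technical problems encountered.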

12.
The number of Internet users and the number of web pages being added to the WWW increase dramatically every day. It is therefore necessary to classify web pages into web directories automatically and efficiently. This helps search engines provide users with relevant and quick retrieval results. As web pages are represented by thousands of features, feature selection helps web page classifiers resolve this large-scale dimensionality problem. This paper proposes a new feature selection method using Ward's minimum variance measure. This measure is first used to identify clusters of redundant features in a web page. In each cluster, the best representative features are retained and the others are eliminated. Removing such redundant features helps minimize resource utilization during classification. The proposed method of feature selection is compared with other common feature selection methods. Experiments on a benchmark data set, namely WebKB, show that the proposed method performs better than most of the other feature selection methods in terms of reducing the number of features and the classifier modeling time.
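Ward's minimum variance measure scores a candidate merge of two clusters by how much the merge would increase the total within-cluster sum of squares, which equals n_a * n_b / (n_a + n_b) times the squared distance between the two centroids. The sketch below computes that standard quantity only; how the paper applies it to web page features is not stated in the abstract, and the names `centroid` and `ward_cost` and the sample vectors are hypothetical.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * squared_distance


# Hypothetical feature vectors: two near-duplicate features and one distinct one.
redundant_pair = ward_cost([(1.0, 0.0)], [(1.0, 0.1)])
distinct_pair = ward_cost([(1.0, 0.0)], [(0.0, 5.0)])
```

A small cost indicates that two groups of features are nearly interchangeable, which is the sense in which low-cost merges can be read as clusters of redundant features.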

13.
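A survey of automatic text information extraction techniques for web pages   Total citations: 2, self-citations: 0, citations by others: 2
A fairly comprehensive survey of techniques for automatically extracting text information from Web pages is presented. By analyzing the development of the three information extraction models and the four classes of machine learning algorithms commonly used in this field, the paper gives an account of the current mainstream techniques, compares the scope of application of the various methods, and finally looks ahead to current hot topics and development trends in the field.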

14.
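Key technologies for information sharing and integration based on Web Service   Total citations: 1, self-citations: 0, citations by others: 1
Because Web Service is based on standard protocols and specifications (including HTTP, SOAP, XML, WSDL, and UDDI) and is independent of platform and language, it has become a mainstream technology for information exchange, information sharing, information integration, and interoperation over the Internet, and is widely used in e-commerce and e-government. The key technologies for information sharing and integration based on Web Service, namely asynchronous invocation and dynamic invocation of Web Services, are described in detail. Asynchronous invocation enables distributed parallel execution of Web Services and improves computational efficiency. Dynamic invocation makes the whole software system extensible and easy to maintain, and able to meet the dynamic requirements of the Internet environment.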

15.
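An information extraction method that prunes nodes with a large information entropy increase is proposed. A DOM tree is constructed by parsing the HTML document. Content that does not need processing is filtered out according to the configuration and a semantic model tree is built; finally, nodes whose entropy increase exceeds a threshold are pruned and the page of extracted topic information is output. Preliminary experimental results verify the effectiveness of the method for Web page information extraction. The mathematical model of the method is simple and reliable, and topic information extraction can be completed with essentially no manual intervention. The method can be applied to Web data mining systems and to information acquisition on mobile devices such as PDAs.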

16.
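A genetic algorithm for Web information query optimization   Total citations: 1, self-citations: 0, citations by others: 1
To help users find the information they need quickly and accurately among abundant network resources, a query optimization algorithm based on an enhanced genetic algorithm is proposed. The basic idea is as follows: the query population is organized into several query subpopulations called niches, each niche being used to query one region of the document space; corresponding crossover operators based on term weights and similar terms, and an adaptive mutation operator, are defined; a local search mechanism is introduced to strengthen the algorithm's local search ability; finally, the query results are merged in order of relevance and returned to the user. Experimental results show that the algorithm outperforms commonly used query optimization techniques in both query precision and computation speed.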

17.
Online reviews are often accessed by users deciding to buy a product, see a movie, or go to a restaurant. However, most reviews are written in a free-text format, usually with very scant structured metadata, and are therefore difficult for computers to understand, analyze, and aggregate. Users then face the daunting task of accessing and reading a large quantity of reviews to discover potentially useful information. We identify topical and sentiment information from free-form text reviews, and use this knowledge to improve user experience in accessing reviews. Specifically, we focus on improving recommendation accuracy in a restaurant review scenario. We propose methods to derive a text-based rating from the body of the reviews. We then group similar users together using soft clustering techniques based on the topics and sentiments that appear in the reviews. Our results show that using textual information results in better review score predictions than those derived from the coarse numerical star ratings given by the users. In addition, we use our techniques to make fine-grained predictions of user sentiments towards the individual topics covered in reviews with good accuracy.

18.
The publication of different media types, such as images, audio, and video, on the World Wide Web is becoming more important each day. However, searching for and locating content in multimedia sites is challenging. In this paper, we propose a platform for the development of multimedia web information systems. Our approach is based on the combination of semantic web technologies and collaborative tagging. Producers can add meta-data to multimedia content, associating it with different domain-specific ontologies. At the same time, users can tag the content in a collaborative way. The proposed system uses a search engine that combines both kinds of meta-data to locate the desired content. It will also provide browsing capabilities through the ontology concepts and the developed tags.

19.
李净  郭洪禹 《计算机应用》2012,32(10):2899-2903
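To address the low retrieval precision of region-based image retrieval systems, a multi-instance prototype selection algorithm that incorporates text information and a feedback annotation mechanism are proposed. In instance prototype selection, text information is first used to expand the positive examples, an initial selection of instances is then made by estimating the distribution of negative instances, and finally the true instance prototypes are obtained through alternating optimization of instance updating and classifier learning. Relevance feedback uses an active learning mechanism that combines multiple strategies; an information value controls automatic switching between active learning strategies, so that the system can automatically select the strategy best suited to the current situation. Experimental results show that the method is effective and outperforms other methods.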

20.
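Research on a web page main-text extraction method based on visual features   Total citations: 1, self-citations: 0, citations by others: 1
Web pages are segmented into blocks using the visual features of the page and the structural properties of the DOM tree, and noise blocks unrelated to the main text are deleted by segmenting and pruning layer by layer, yielding the main-text blocks. The VIPS algorithm is applied to the resulting blocks to obtain complete semantic blocks, and the main text is finally extracted on the basis of the semantic blocks. Experiments show that the method is practical and feasible.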
