Similar documents
 Found 20 similar documents; search took 15 ms
1.
Document engineering is the computer science discipline that investigates systems for documents in any form and in all media. As with the relationship between software engineering and software, document engineering is concerned with principles, tools and processes that improve our ability to create, manage, and maintain documents (). The ACM Symposium on Document Engineering is an annual meeting of researchers active in document engineering; it is sponsored by ACM through the ACM SIGWEB Special Interest Group. In this editorial, we first point to work carried out in the context of document engineering that is directly related to multimedia tools and applications. We conclude with a summary of the papers presented in this special issue.
Luiz Fernando Gomes Soares

2.
The performance of a focused, or topic-specific Web robot can be improved by taking into consideration the structure of the documents downloaded by the robot. In the case of HTML, document structure is tree-like, defined by nested document elements (tags) and their attributes. By analysing this structure, a robot may use the text of certain HTML elements to prioritise documents for downloading and thus significantly improve the speed of convergence to a topic. Clear separation of the structure-aware document parser from the download scheduler provides flexibility but requires a standard interface and protocol between the two. The paper discusses such an interface in the context of an experimental Web robot, whose speed of convergence to a topic was observed to increase by a factor of 3 to 8, as measured by the number of documents downloaded to reach a given average relevance score.
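
The split between a structure-aware parser and a download scheduler described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's interface: the class names `LinkScorer` and `Scheduler`, the function `parse_links`, and the scoring rule (counting topic terms in anchor text) are all assumptions made for illustration. The parser hands `(url, score)` pairs to the scheduler, which keeps a priority queue.

```python
import heapq
from html.parser import HTMLParser


class LinkScorer(HTMLParser):
    """Hypothetical parser: scores each link by topic terms in its anchor text."""

    def __init__(self, topic_terms):
        super().__init__()
        self.topic_terms = {t.lower() for t in topic_terms}
        self._href = None        # href of the <a> element currently open, if any
        self._text = []          # text fragments seen inside that element
        self.scored_links = []   # (url, score) pairs, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            words = " ".join(self._text).lower().split()
            score = sum(1 for w in words if w in self.topic_terms)
            self.scored_links.append((self._href, score))
            self._href = None


class Scheduler:
    """Hypothetical download scheduler: highest-scoring unseen URL first."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._counter = 0  # tie-breaker that keeps insertion order stable

    def offer(self, url, score):
        if url in self._seen:
            return
        self._seen.add(url)
        # heapq is a min-heap, so negate the score to pop the best link first.
        heapq.heappush(self._heap, (-score, self._counter, url))
        self._counter += 1

    def next_url(self):
        return heapq.heappop(self._heap)[2] if self._heap else None


def parse_links(html, topic_terms):
    """The assumed parser-to-scheduler interface: HTML in, (url, score) pairs out."""
    parser = LinkScorer(topic_terms)
    parser.feed(html)
    parser.close()
    return parser.scored_links
```

Because the scheduler only sees `(url, score)` pairs, the scoring rule can be swapped (for example, to weight headings or titles) without touching the download queue.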

3.
4.
5.
In recent years we have witnessed Sentiment Analysis and Opinion Mining becoming increasingly popular topics in Information Retrieval and Web data analysis. With the rapid growth of the user-generated content represented in blogs, wikis and Web forums, such an analysis became a useful tool for mining the Web, since it allowed us to capture sentiments and opinions at a large scale. Opinion retrieval has established itself as an important part of search engines. Ratings, opinion trends and representative opinions enrich the search experience of users when combined with traditional document retrieval, by revealing more insights about a subject. Opinion aggregation over product reviews can be very useful for product marketing and positioning, exposing the customers’ attitude towards a product and its features along different dimensions, such as time, geographical location, and experience. Tracking how opinions or discussions evolve over time can help us identify interesting trends and patterns and better understand the ways that information is propagated on the Internet. In this study, we review the development of Sentiment Analysis and Opinion Mining in recent years, and also discuss the evolution of a relatively new research direction, namely, Contradiction Analysis. We give an overview of the proposed methods and recent advances in these areas, and we try to lay out the future research directions in the field.

6.
Nowadays, people frequently use different keyword-based web search engines to find the information they need on the web. However, many words are polysemous and, when these words are used to query a search engine, its output usually includes links to web pages referring to their different meanings. Besides, results with different meanings are mixed up, which makes the task of finding the relevant information difficult for the users, especially if the user-intended meanings behind the input keywords are not among the most popular on the web.

7.
This paper takes as its premise that the web is a place of action, not just information, and that the purpose of global data is to serve human needs. The paper presents several component technologies, which together work towards a vision where many small micro-applications can be threaded together using automated assistance to enable a unified and rich interaction. These technologies include data detector technology to enable any text to become a start point of semantic interaction; annotations for web-based services so that they can link data to potential actions; spreading activation over personal ontologies, to allow modelling of context; algorithms for automatically inferring ‘typing’ of web-form input data based on previous user inputs; and early work on inferring task structures from action traces. Some of these have already been integrated within an experimental web-based (extended) bookmarking tool, Snip!t, and a prototype desktop application On Time, and the paper discusses how the components could be more fully, yet more openly, linked in terms of both architecture and interaction. As well as contributing to the goal of an action and activity-focused web, the work also exposes a number of broader issues, theoretical, practical, social and economic, for the Semantic Web.

8.
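This paper describes encoding structured light with a De Bruijn sequence. Stripe boundaries are matched to their optimal neighborhoods under a global-optimization approach: dynamic programming with added constraints traverses the matching-path grid to find the optimal matching path. Color correction is applied to the distorted stripe images, which improves the accuracy of boundary detection. The coding strategy is simple to decode, the matching algorithm performs well, and the resulting point-cloud data are accurate enough for 3D surface reconstruction.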
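
To make the coding idea concrete, here is a small Python sketch of how a De Bruijn sequence can label stripes so that any window of `n` consecutive stripe colors identifies its position. The abstract does not state the number of colors or the window length, so the values `k = 3` and `n = 3` in the usage note are assumptions, and `window_index` is a hypothetical helper rather than the paper's decoder. The generator is the standard recursive construction from Lyndon words.

```python
def de_bruijn(k, n):
    """Return a De Bruijn sequence B(k, n) over symbols 0..k-1 as a list.

    Every length-n word over k symbols appears exactly once as a cyclic window.
    """
    a = [0] * (k * n)
    sequence = []

    def db(t, p):
        if t > n:
            # a[1..p] is a Lyndon word; keep it when its length divides n.
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence


def window_index(sequence, n):
    """Hypothetical decoder table: map each cyclic length-n window to its start index."""
    length = len(sequence)
    return {
        tuple(sequence[(i + j) % length] for j in range(n)): i
        for i in range(length)
    }
```

With three colors and windows of three stripes, `de_bruijn(3, 3)` yields 27 stripe labels, and looking up any three adjacent observed colors in `window_index(seq, 3)` gives the stripe position. This uniqueness is what keeps decoding simple.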

9.
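Querying Semi-Structured Web Data   Total citations: 1, self-citations: 0, citations by others: 1
The information and data on many large Web sites today are structured or semi-structured, so they can be abstracted and handled much like a relational or object-oriented database, which makes operations on them more efficient, queries in particular. Using a subset of the Araneus data model together with link constraints, inclusion constraints, and range constraints, the paper proposes a query-rewriting method for semi-structured data. While preserving the correctness and completeness of the algorithm, the method exploits the characteristics of semi-structured data and the relationships among query subgoals to greatly reduce the algorithm's cost.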

10.
11.
12.
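As informatization deepens and technology advances, database and network technologies have helped enterprises automate office work and digitize business decision management and production processes. The growing volume of information, however, makes it hard to collect and to preserve over the long term, and neither traditional information-processing techniques nor Hadoop can process massive structured data. To broaden the perspective of enterprise decision-making and improve the completeness of the information obtained, the article studies and analyzes a "data service cloud platform". Exploring a new technical architecture from the standpoint of big-data applications in this way offers a more reasonable solution to the key technical problems of enterprise big-data applications.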

13.
14.
In this paper we address the problem of integrating independent and possibly heterogeneous data warehouses, a problem that has received little attention so far, but that arises very often in practice. We start by tackling the basic issue of matching heterogeneous dimensions and provide a number of general properties that a dimension matching should fulfill. We then propose two different approaches to the problem of integration that try to enforce matchings satisfying these properties. The first approach refers to a scenario of loosely coupled integration, in which we just need to identify the common information between data sources and perform join operations over the original sources. The goal of the second approach is the derivation of a materialized view built by merging the sources, and refers to a scenario of tightly coupled integration in which queries are performed against the view. We also illustrate the architecture and functionality of a practical system that we have developed to demonstrate the effectiveness of our integration strategies. A preliminary version of this paper appeared, under the title “Integrating Heterogeneous Multidimensional Databases” [9], in the 17th Int. Conference on Scientific and Statistical Database Management, 2005.

15.
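As demand for network applications and enterprise decision-support systems keeps growing, converting Web data formats has become a direction for further research. XML has recently become the standard for representing and exchanging data on the Web, and researchers in China and abroad have proposed several XML-based query languages; XSQL is one such simple, easy-to-use query language that extends XML technology. The paper studies XSQL in practice and argues that its strengths offer new ideas for the development of Web technology. Using XSQL to improve Web services solves the data-format conversion problem and lets data be displayed on the Web in multiple styles according to different needs; an Oracle database serves as the example application.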

16.
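With the rapid development of network communication technology and its falling cost, more and more information is published on the Web. Because mining Web data is far more complex than mining a single data warehouse, Web-oriented data mining has become a new research topic. The paper surveys the categories of Web data mining and its current state, applies XML technology to Web data mining, and presents an automatic mining model. Applied to an automatic stock-information collection system, the model demonstrates the feasibility and advantages of automatic Web data mining. The paper also points out the remaining shortcomings of automatic Web data mining and its prospects.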

17.
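Research on Web Presentation of Statistical Data   Total citations: 1, self-citations: 0, citations by others: 1
The paper analyzes conventional Web-based methods for visualizing statistical data and their implementation on the Web, proposes a GIS-based solution for visualizing statistical data, examines data preparation and the visualization-generation process in detail, and gives an example of GIS-based statistical data visualization.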

18.
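The paper discusses several ways, on the LAMP (Linux + Apache + MySQL + PHP) platform, to use PHP (Hypertext Preprocessor) to encrypt files before download and decrypt them afterwards, and describes how PHP and the LAMP platform are used in practice. It presents methods for encrypting and decrypting data on LAMP by calling PHP built-in functions, extensions and class libraries, and the GnuPG (GNU Privacy Guard) software; compares the strengths and weaknesses of the methods; and gives sample code along with comparative analysis and experimental results for each. It also discusses how to choose an encryption method for a particular environment and points out the limitations of encryption in PHP.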

19.
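Research on Data Integration in Web Data Mining   Total citations: 3, self-citations: 0, citations by others: 3
Building on an analysis of the characteristics of data sources in the Web environment, the paper studies the data-integration problem in Web data mining in depth and proposes an XML-based integration scheme. The scheme integrates different data sources through Web data access, provides Web data mining with a unified and effective data set, and addresses the difficulty of integrating heterogeneous Web data sources. A concrete example illustrates the Web data integration process.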

20.