Similar literature
 Found 10 similar articles (search time: 15 ms)
1.
A site-based proxy cache   Total citations: 4 (self-citations: 0, citations by others: 4)
In traditional proxy caches, any visited page from any Web server is cached independently, ignoring connections between pages. Users still have to visit indexing pages frequently just to reach useful, informative ones, which wastes significant caching space and generates unnecessary Web traffic. To solve this problem, this paper introduces a site graph model to describe the WWW and builds a site-based replacement strategy on it. The concept of "access frequency" is developed to evaluate whether a Web page is worth keeping in the cache. On the basis of the user's access history, auxiliary navigation information is provided to help the user reach target pages more quickly. Performance test results show that the proposed proxy cache system achieves a higher hit ratio than traditional ones and effectively reduces users' access latency.
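As a rough illustration of frequency-driven replacement, the sketch below keeps a per-URL access counter and evicts the least frequently accessed page when the cache is full. It is not the paper's site-graph strategy, which the abstract does not specify; the class name `FrequencyCache`, the `fetch` callback, and the eviction rule are all assumptions made for illustration.

```python
from collections import Counter


class FrequencyCache:
    """Toy proxy cache that evicts the least frequently accessed page.

    Hypothetical design: the abstract names "access frequency" as the
    criterion but does not define how it is computed.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.pages = {}        # url -> cached content
        self.freq = Counter()  # url -> number of requests seen so far
        self.hits = 0
        self.requests = 0

    def get(self, url, fetch):
        """Return the page for url, calling fetch(url) on a cache miss."""
        self.requests += 1
        self.freq[url] += 1
        if url in self.pages:
            self.hits += 1
            return self.pages[url]
        content = fetch(url)
        if len(self.pages) >= self.capacity:
            # Evict the cached page with the lowest access count.
            victim = min(self.pages, key=lambda u: self.freq[u])
            del self.pages[victim]
        self.pages[url] = content
        return content

    def hit_ratio(self):
        """Fraction of requests served from the cache."""
        return self.hits / self.requests if self.requests else 0.0
```

A counter that survives eviction, as here, lets a page that was popular earlier regain its place quickly; whether the paper does the same is not stated in the abstract.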

2.
We propose a new Web information extraction system. This paper outlines the system and explains its extraction algorithm. A typical Web page consists of multiple elements with different functions, such as main content, navigation panels, copyright and privacy notices, and advertisements. Visitors usually need only a small part of a page, so a system that extracts a piece of a Web page is needed. Our system enables users to extract Web blocks simply by setting clipping areas with the mouse. Web blocks are clickable image maps, generated by rendering the page as an image and detecting hyperlink areas on the client side. The distinguishing feature of our system is that Web blocks preserve the layout and hyperlinks of the original Web pages. Users can access and manage their Web blocks via Evernote, a cloud storage system, and HTML snippets for Web blocks let users easily reuse Web content on their own Web sites.

3.
Data extraction from the web based on pre-defined schema   Total citations: 8 (self-citations: 1, citations by others: 7)
With the development of the Internet, the World Wide Web has become an invaluable information source for most organizations. However, most documents available on the Web are in HTML, which was originally designed for document formatting with little consideration of content. Effectively extracting data from such documents remains a non-trivial task. In this paper, we present a schema-guided approach to extracting data from HTML pages. Under this approach, the user defines a schema specifying what is to be extracted and provides sample mappings between the schema and the HTML page. The system induces the mapping rules and generates a wrapper that takes the HTML page as input and produces the required data as XML conforming to the user-defined schema. A prototype system implementing the approach has been developed. Preliminary experiments indicate that the proposed semi-automatic approach is not only easy to use but also produces wrappers that extract the required data from input pages with high accuracy.

4.
In this paper, we develop an intelligent environment called the personified home service system, which we have implemented using standard Semantic Web (RDF, OWL, DAML), Web Services (SOAP, WSDL) and pervasive computing (UPnP) technologies. To extend human-machine interaction, home devices such as sensors, TVs and refrigerators can be used as interactive devices, in addition to the mouse and CRT display. The system gives device manufacturers an incentive to incorporate Semantic Web technologies into their devices, so that users benefit from greater ease of use and flexibility. For broader intelligence in the system, the Semantic Web can assist the evolution of human knowledge as a whole. We analyze the user's daily records to predict the user's interests, and discover the user's potential interests through feedback. The Semantic Web will bring structure to the meaningful content of Web pages and create an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users.

5.
A rapidly increasing number of Web databases are now accessible via their HTML form-based query interfaces. Query result pages are dynamically generated in response to user queries; they encode structured data and are displayed for human use. Query result pages usually contain other types of information in addition to query results, e.g., advertisements and navigation bars. The problem of extracting structured data from query result pages is critical for Web data integration applications, such as comparison shopping and meta-search engines, and has been intensively studied. A number of approaches have been proposed. As the structures of Web pages become more and more complex, the existing approaches start to fail, and most of them do not remove irrelevant content, which may affect the accuracy of data record extraction. We propose an automated approach for Web data extraction. First, it makes use of visual features and query terms to identify data sections and extracts data records in these sections. We also represent several content and visual features of visual blocks in a data section, and use them to filter out noisy blocks. Second, it measures similarity between data items in different data records based on their visual and content features, and aligns them into different groups so that the data in the same group have the same semantics. The results of our experiments with a large set of Web query result pages in different domains show that our proposed approaches are highly effective.

6.
The number of Internet users and the number of web pages added to the WWW increase dramatically every day. It is therefore necessary to classify web pages into web directories automatically and efficiently. This helps search engines provide users with relevant and quick retrieval results. As web pages are represented by thousands of features, feature selection helps web page classifiers resolve this large-scale dimensionality problem. This paper proposes a new feature selection method using Ward's minimum variance measure. This measure is first used to identify clusters of redundant features in a web page. In each cluster, the best representative features are retained and the others are eliminated. Removing such redundant features helps minimize resource utilization during classification. The proposed feature selection method is compared with other common feature selection methods. Experiments on a benchmark data set, namely WebKB, show that the proposed method performs better than most of the other feature selection methods in terms of reducing the number of features and the classifier modeling time.
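The idea of clustering redundant features with Ward's criterion can be sketched as follows. Features are merged bottom-up, always choosing the pair of clusters whose merge increases within-cluster variance the least, and one feature per cluster is kept. The function names, the fixed cluster count, and the rule "keep the feature closest to the cluster centroid" are assumptions; the abstract does not say how the paper picks the best representatives.

```python
def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    dist2 = sum((x - y) ** 2 for x, y in zip(a["centroid"], b["centroid"]))
    return a["size"] * b["size"] / (a["size"] + b["size"]) * dist2


def cluster_features(columns, n_clusters):
    """Agglomerative clustering of feature columns using Ward's criterion.

    columns[i] holds the values of feature i across all documents.
    """
    clusters = [{"members": [i], "size": 1, "centroid": list(col)}
                for i, col in enumerate(columns)]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest merge cost.
        _, i, j = min((ward_cost(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        a, b = clusters[i], clusters[j]
        size = a["size"] + b["size"]
        centroid = [(a["size"] * x + b["size"] * y) / size
                    for x, y in zip(a["centroid"], b["centroid"])]
        merged = {"members": a["members"] + b["members"],
                  "size": size, "centroid": centroid}
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


def select_representatives(columns, n_clusters):
    """Keep one feature index per cluster (assumed rule: nearest to centroid)."""
    kept = []
    for c in cluster_features(columns, n_clusters):
        rep = min(c["members"],
                  key=lambda i: sum((x - y) ** 2
                                    for x, y in zip(columns[i], c["centroid"])))
        kept.append(rep)
    return sorted(kept)
```

This pure-Python version compares every pair of clusters at each step, so it only suits small feature sets; for thousands of features a library implementation of Ward linkage would be the practical choice.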

7.
A system and method of saving a Web page from a Website on the Internet to a computer-readable medium is disclosed. A Web page is downloaded from the Internet to the computer-readable medium. The Internet address for the Web page is stored on the computer-readable medium. When the Web page is opened from the computer-readable medium, the Internet address is used to identify a security context for the Web page. By using the Internet address to identify the security context for the Web page, the system and method of the present invention allows users to securely view and execute Web pages downloaded from the Internet.

8.
To develop logistics customers' potential demand for logistics services and raise the service level of logistics enterprises, customer segmentation has become a primary task for logistics enterprises that want to run differentiated customer marketing. Using a clustering algorithm, this paper presents a segmentation model for differentiating customers in the logistics industry. First, based on attribute reduction, redundant attributes are removed from the complex data mining task under variable parameters in order to improve the quality and efficiency of the modeling. Then the customer segmentation model is constructed with the unsupervised K-Means clustering algorithm. The cluster model verifies that logistics users show clearly differentiated characteristics, and a logistics enterprise achieved significant benefits by applying the model in differentiated data service marketing.
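The clustering step can be illustrated with a minimal K-Means (Lloyd's algorithm) sketch. It covers only the clustering stage, not the attribute reduction the abstract mentions, and the function name, seeded initialization, and parameters are assumptions chosen for illustration rather than details from the paper.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster numeric attribute vectors with Lloyd's K-Means algorithm.

    Returns (labels, centroids). Initial centroids are k distinct input
    points drawn with a seeded generator so that runs are repeatable.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2
                                  for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members)
                                for dim in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing, so the result is stable
        labels = new_labels
    return labels, centroids
```

In a segmentation setting each point would be one customer's attribute vector after reduction, and attributes on different scales would normally be normalized first so that no single attribute dominates the distance.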

9.
There exists a gap between control theory and control practice: not all control methods suggested by researchers are implemented in real systems and, on the other hand, many important industrial problems are not studied in academic research. Benchmark problems can help close this gap and provide many opportunities for members of both the control theory and application communities. The goal is to survey and give pointers to different general control and modeling related benchmark problems that can serve as inspiration for future benchmarks, and then to focus the benchmark coverage specifically on automotive control engineering applications. The paper reflects on how different categories of benchmark designers, benchmark solvers and third-party users can benefit from providing, solving, and studying benchmark problems. The paper also collects information about several benchmark problems and gives pointers to papers that give more detailed information about the different problems that have been presented.

10.
With the explosion of video data, it is difficult for people to navigate freely and make efficient use of large amounts of video data in order to obtain useful information or knowledge for special tasks such as visual analysis. In this paper, we propose a novel sketch-based interface called StoryMap for video content representation and navigation, designed under the guidance of distributed cognition. Instead of emphasizing details, the sketches visualize the essential content of the video effectively. To meet the different requirements of video exploration tasks, StoryMap provides users with a variety of navigation methods, such as path redirection, map tagging, and zooming in and out. Throughout the interaction process, the system collects the user's operation list and constructs a user model as a guide to understanding the interaction between people and the computer. Experimental evaluation results show that StoryMap is more effective at conveying and navigating video content than existing methods.

