Similar literature
 Found 10 similar articles (search time: 142 ms)
1.
Research on Blog Web Page Classification and Recognition Techniques   Total citations: 2, self-citations: 0, citations by others: 2
郑德权  张迪  赵铁军  于浩 《通信学报》2007,28(12):156-160
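To find a way to automatically distinguish Blog pages from other Web pages, so that content can be extracted from Blog corpora and regularities in Blog communities can be studied and discovered, this paper proposes a method that identifies Blog pages by computing a similarity score from page structure and keywords, based on the characteristics and regularities of Blog pages. Preliminary experimental results show that the method achieves high recognition accuracy.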

2.
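Research on Advertisement Recommendation Algorithms for the Mobile Internet   Total citations: 1, self-citations: 1, citations by others: 0
The rapid development of the mobile Internet has provided a new application model for advertising, and mobile advertising has become one of the main profit models of the mobile Internet. Delivering advertisements on demand based on user segmentation is the inevitable trend in mobile advertising. This paper proposes an architecture for a mobile Internet advertisement recommendation system that, according to the category and keywords of a WAP page, selects matching advertisements from the advertisement library for delivery.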

3.
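Using web pages to transmit information covertly has important research significance and practical value. A double-bit hiding method is proposed that doubles the hiding capacity of web page tables compared with existing methods. In addition, because web pages often contain multiple table structures, the entry and exit characteristics of parallel tables and nested tables are analyzed, a comprehensive multi-table hiding algorithm and an entry/exit management method are designed, and their transition paths are analyzed in detail. The algorithm can be applied to regular multi-table and single-table structures of all kinds; the multi-table structure is suited to arbitrary parallel tables...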

4.
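Research on Techniques for Extracting Structured Information from Web Pages   Total citations: 2, self-citations: 0, citations by others: 2
Two current mainstream techniques for extracting structured information from web pages are analyzed: the template-based wrapper method and the template-independent, vision-based web information extraction method. On this basis, a new algorithm for extracting structured information from web pages is implemented, which improves extraction efficiency and precision to some extent.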

5.
连晗 《电子设计工程》2012,20(18):104-106
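The key technologies involved in designing a network-based remote monitoring system are presented, and an Internet-based remote monitoring experiment system is adopted. By design, the system can accept requests from remote clients that have been authenticated by the network server, execute operation proxy requests from remote users, and allow remote users to query historical data through dynamic web pages. The remote user module is embedded in web pages as an ActiveX control or a Java Applet, which enables interaction between remote users and the local experiment bench; the experiment process is carried out through communication with the device server, and user registration and login are implemented through interaction between the network server and the dynamic web pages.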

6.
Focused crawlers (also known as subject-oriented crawlers), the core component of vertical search engines, collect as many topic-specific web pages as they can to form a subject-oriented corpus for later data analysis or user querying. This paper shows that the popular algorithms used in focused web crawling fall essentially into two groups: webpage analysis algorithms and crawling strategies (which prioritize the uniform resource locators (URLs) in the queue). The first experiment shows the advantages and disadvantages of three crawling strategies and indicates that best-first search with an appropriate heuristic is a smart choice for topic-oriented crawling, whereas depth-first search is of little use in focused crawling. A second experiment compares improved versions (with a webpage analysis algorithm added) and verifies that crawling strategies alone are not very efficient for focused crawling; in most cases the two are used in combination. In light of the experimental results and recent research, some points on research trends in focused crawler algorithms are suggested.
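
To make the best-first strategy mentioned in this abstract concrete, the following is a minimal sketch and not the paper's implementation. It assumes a hypothetical in-memory link graph (`graph`), a hypothetical map of hint text known for each URL before it is fetched, such as anchor text (`hints`), and a simple keyword-count heuristic; none of these names come from the paper.

```python
import heapq


def best_first_crawl(graph, hints, seed, topic_terms, limit):
    """Visit up to `limit` URLs, always expanding the highest-scoring one first.

    graph:       dict mapping a URL to the list of URLs it links to (assumed input)
    hints:       dict mapping a URL to text known before fetching it (assumed input)
    topic_terms: set of lowercase topic keywords used by the heuristic
    """
    def score(url):
        # Hypothetical heuristic: count hint words that are topic terms.
        words = hints.get(url, "").lower().split()
        return sum(1 for w in words if w in topic_terms)

    # heapq is a min-heap, so scores are negated to pop the best URL first.
    frontier = [(-score(seed), seed)]
    seen = {seed}
    order = []
    while frontier and len(order) < limit:
        _, url = heapq.heappop(frontier)
        order.append(url)
        for link in graph.get(url, []):
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score(link), link))
    return order
```

Replacing the priority queue with a stack would give depth-first search, which ignores the topic score entirely when choosing the next URL.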

7.
Discovery of Web communities, groups of Web pages sharing common interests, is important for assisting users' information retrieval from the Web. This paper describes a method for visualizing Web communities and their internal structures. Visualizing Web communities as graphs enables users to access related pages easily, and the graphs often reflect the characteristics of the communities. Since related Web pages are often co-referenced from the same Web page, the number of co-occurrences of references in a search engine is used to measure the relation among pages. Two URLs are given to a search engine as keywords, and the number of pages matching both URLs divided by the number of pages matching either URL, which is called the Jaccard coefficient, is calculated as the criterion for evaluating the relation between the two URLs. The value is used to determine the length of an edge in a graph so that vertices of related pages are located close to each other. Our visualization system based on the method succeeds in clarifying various genres of Web communities, although the system does not interpret the contents of the pages. The Jaccard coefficient is easy to compute, and it is suitable for visualization using data acquired from a search engine.
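
The Jaccard coefficient described in this abstract can be sketched as follows. This is an illustration under stated assumptions: the three hit counts are taken as given (no search engine is queried), and the `edge_length` mapping is a hypothetical choice, since the abstract says only that the value determines edge length so that related pages sit close together.

```python
def jaccard_coefficient(hits_a: int, hits_b: int, hits_both: int) -> float:
    """Jaccard coefficient of two URLs from search-engine hit counts.

    hits_a, hits_b: pages matching each URL on its own
    hits_both:      pages matching both URLs
    """
    # Pages matching either URL (union), by inclusion-exclusion.
    hits_either = hits_a + hits_b - hits_both
    if hits_either == 0:
        return 0.0
    return hits_both / hits_either


def edge_length(jaccard: float, max_length: float = 100.0) -> float:
    # Hypothetical mapping: a stronger relation gives a shorter edge.
    return max_length * (1.0 - jaccard)
```

For example, with 40 and 110 hits for the individual URLs and 30 hits for both, 120 pages match either URL and the coefficient is 30 / 120.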

8.
Social media usage among organizations is growing tremendously. Organizations are now building and maintaining public social media pages to improve their social network salience, enhance interest in their organizations, and build relationships with the online public. Most studies on social media usage take the individual perspective, while some take the organizational perspective. However, few studies have investigated the actual impact of social media usage on organizational performance. Therefore, using a qualitative approach, this study investigates the various purposes of social media usage and its impact on organizational performance. This study, however, focuses only on the views of social media managers. The senior managers of six organizations that use social media were interviewed, and we find that social media is used for various purposes in organizations, such as advertising and promotion, branding, information search, building customer relations, and many more. The results also show that social media has a greater impact on organizational performance in terms of enhanced customer relations and customer service activities, improved information accessibility, and reduced marketing and customer service costs.

9.
We applied the decision tree algorithm to learn association rules between a webpage's category (pornographic or normal) and its critical features. Based on these rules, we proposed an efficient method of filtering pornographic webpages with the following major advantages: 1) a weighted window-based technique is proposed to estimate concept drift for the keywords found recently in pornographic webpages; 2) only the contexts of webpages are checked, without scanning pictures; 3) an incremental learning mechanism is designed to incrementally update the pornographic keyword database.

10.
To address the problem that some information harmful to the network environment is transmitted through mirror websites in order to bypass detection, a method for identifying malicious mirror websites in high-speed network traffic was proposed. First, fragmented data was extracted from the traffic, and the source code of the webpage was restored. Next, a standardized processing module was used to improve accuracy. The source code of the webpage was then divided into blocks, and the hash value of each block was calculated by the simhash algorithm. From these, the simhash value of the webpage source code was obtained, and the similarity between webpage source codes was calculated using the Hamming distance. A page snapshot was then taken and SIFT feature points were extracted. The perceptual hash value was obtained by clustering analysis and mapping processing. Finally, the similarity of webpages was calculated from the perceptual hash values. Experiments under real traffic show that the accuracy of the method is 93.42%, the recall rate is 90.20%, the F value is 0.92, and the processing delay is 20 μs. With the proposed method, malicious mirror websites can be effectively detected in a high-speed network traffic environment.
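
The source-code comparison step in this abstract (block hashes combined into a simhash fingerprint, then compared by Hamming distance) might look roughly like the sketch below. It is not the paper's implementation: the use of MD5 as the per-block hash, the unweighted vote, the 64-bit width, and the `similarity` normalization are all assumptions made for illustration, and how the page is split into blocks is left to the caller.

```python
import hashlib


def simhash(blocks, bits=64):
    """Combine per-block hashes into one fingerprint (unweighted simhash).

    blocks: list of strings, e.g. pieces of webpage source code
    bits:   fingerprint width; assumed to be a multiple of 8 and at most 128
    """
    counts = [0] * bits
    for block in blocks:
        digest = hashlib.md5(block.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            # Each block votes +1 or -1 on every bit position.
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, c in enumerate(counts):
        if c > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    # Number of bit positions in which the two fingerprints differ.
    return bin(a ^ b).count("1")


def similarity(a: int, b: int, bits: int = 64) -> float:
    # Hypothetical normalization: 1.0 means identical fingerprints.
    return 1.0 - hamming_distance(a, b) / bits
```

Two pages whose blocks are mostly identical tend to produce fingerprints that differ in only a few bits, so a small Hamming distance suggests a mirrored page.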
