Similar literature
20 similar documents found (search time: 31 ms)
1.
Online news has become one of the major channels through which Internet users get news. News websites are overwhelmed daily with news articles: huge numbers of online articles are generated and updated every day, and processing and analyzing this large corpus is an important challenge. It needs to be tackled with big data techniques that process large volumes of data within limited run times. Also, since we are heading into a social-media data explosion, techniques such as text mining and social network analysis need to be taken seriously into consideration. In this work we focus on one of the most common daily activities: web news reading. News websites produce thousands of articles covering a wide spectrum of topics or categories, which can be considered a big data problem. In order to extract useful information, these news articles need to be processed using big data techniques. In this context, we present an approach for classifying huge amounts of news articles into various categories (topic areas) based on the text content of the articles. Since these categories are constantly updated with new articles, our approach is based on Evolving Fuzzy Systems (EFS). An EFS can update in real time the model that describes a category according to changes in the content of the corresponding articles. The novelty of the proposed system lies in the treatment of the web news articles to be used by these systems and in the implementation and adjustment of the systems for this task. Our proposal not only classifies news articles but also creates human-interpretable models of the different categories. This approach has been successfully tested using real online news.

2.
User profiling is an important step in solving the problem of personalized news recommendation. Traditional user profiling techniques often construct user profiles from static historical data accessed by users. However, because the news repository is updated frequently, a user's fine-grained reading preference may evolve over time while his/her long-term interest remains stable. Therefore, it is imperative to reason about such preference evolution when profiling users in news recommenders. Besides, in content-based news recommenders, a user's preference tends to be stable because the mechanism selects news articles whose content is similar to the user's profile. To stimulate users' reading motivation, a successful recommender needs to introduce "somewhat novel" articles to users. In this paper, we first provide an experimental study of the evolution of user interests in real-world news recommender systems, and then propose a novel recommendation approach in which the long-term and short-term reading preferences of users are seamlessly integrated when recommending news items. Given a hierarchy of newly published news articles, the news groups that a user might prefer are identified using the long-term profile, and then, in each selected news group, a list of news items is chosen as recommendation candidates based on the short-term user profile. We further propose to select news items from the user–item affinity graph using an absorbing random walk model to increase the diversity of the recommended news list. Extensive empirical experiments on a collection of news data obtained from various popular news websites demonstrate the effectiveness of our method.

3.
We propose a new way of browsing bilingual web sites through concurrent browsing with automatic similar-content synchronization and viewpoint retrieval facilities. Our prototype browser system, called the Bilingual Comparative Web Browser (B-CWB), concurrently presents bilingual web pages in a way that enables their contents to be automatically synchronized. The B-CWB allows users to browse multiple web news sites concurrently and compare the viewpoints of news articles written in different languages (English and Japanese). Our viewpoint retrieval is based on the detection of similarities and differences. We describe how pages are categorized in terms of viewpoint: entire similarity, content difference, and subject difference. Content synchronization means that, in order to preserve similarity of content between the web pages, a user operation (scrolling or clicking) on one web page does not necessarily invoke the same operation on the other web page. For example, scrolling a web page may invoke passage-level viewpoint retrieval on the other web page. Clicking a web page (and obtaining a new web page) invokes page-level viewpoint retrieval within the other site's pages through the use of an English-Japanese dictionary.

4.
This study examines the relationships among perceived usability before actual use, task completion time, and preference, and the effects of design attributes on user preference for e-commerce web sites. Nine online bookstore web sites were used by ten participants. Results indicate that: (1) pre-use usability and task completion time were correlated; (2) the relationship between pre-use usability and preference was stronger than that between task completion time and preference; (3) design attribute assessments after actual use were highly intercorrelated; and (4) organizational structure and layout had a greater effect on user preference than aesthetic aspects, such as color and typography. These findings can be used to construct a conceptual framework for understanding user preferences and to develop design guidelines that yield more highly preferred e-commerce web sites. The methodology of this study can also be applied to other computerized applications.

5.
6.
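Using a keyword-matching method, Web documents written in HTML are analyzed and useful feature information is extracted from the pages, yielding the content of two classes of tags: one class carries global descriptive information about the page, and the other plays a local decorative role, emphasizing parts of the page content. On this basis, a user model based on hierarchical concepts is proposed, and the vector space model is used to build a user interest model based on news about emergency events. Experiments show that the method is feasible to some extent.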
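
As an illustration of the vector space model mentioned in item 6, the following minimal Python sketch builds a user interest profile from the terms of articles a user has read and scores new articles by cosine similarity. It is a sketch under stated assumptions, not the paper's implementation: the function names (`tf_vector`, `build_profile`, `cosine`) are hypothetical, plain term frequencies stand in for whatever term weighting the authors used, and the tag-based weighting and concept hierarchy are not modeled.

```python
import math
from collections import Counter


def tf_vector(tokens):
    """Represent a document as a term-frequency vector (term -> count)."""
    return Counter(tokens)


def build_profile(read_articles):
    """Sum the term-frequency vectors of all articles the user has read.

    `read_articles` is a list of token lists (an assumption of this sketch).
    """
    profile = Counter()
    for tokens in read_articles:
        profile.update(tokens)
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical usage: a user who has read two earthquake-related articles.
profile = build_profile([["earthquake", "rescue"], ["earthquake", "relief"]])
on_topic = cosine(profile, tf_vector(["earthquake", "rescue"]))
off_topic = cosine(profile, tf_vector(["football", "match"]))
```

In this toy example the on-topic article scores higher than the off-topic one, which shares no terms with the profile and therefore scores zero.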

7.
The problem of expert finding aims to identify experts with special skills or knowledge in particular knowledge categories, i.e., knowledge domains, by ranking user authority. In recent years, this problem has become increasingly important with the popularity of knowledge-sharing social networks. While many previous studies have examined authority ranking for expert finding, they focus on leveraging only the information in the target category. It is not clear how to exploit the information in the categories relevant to a target category to improve the quality of authority ranking. To that end, in this paper, we propose an expert finding framework based on the authority information in the target category as well as in the relevant categories. Along this line, we develop a scalable method for measuring the relevance between categories through topic models, which takes into account category similarities based on both content and user interaction. We also provide a topical link analysis approach, which is multiple-category-sensitive, for ranking user authority by considering the information in both the target category and the relevant categories. Finally, in terms of validation, we evaluate the proposed expert finding framework on two large-scale real-world data sets collected from two major commercial Question Answering (Q&A) web sites. The results show that the proposed method outperforms the baseline methods by a significant margin.

8.
Recommending online news articles has become a promising research direction as the Internet provides fast access to real-time information from multiple sources around the world. Many online readers have their own reading preferences for news articles; however, a group of users might be interested in similar topics. It would be helpful to take individual and group reading behavior into consideration simultaneously when recommending news items to online users. In this paper, we propose PENETRATE, a novel PErsonalized NEws recommendaTion framework using ensemble hieRArchical clusTEring to provide attractive recommendation results. Specifically, given a set of online readers, our approach first separates readers into different groups based on their reading histories, where each user might be assigned to several groups. Once a collection of newly published news items is provided, we can easily construct a news hierarchy for each user group. When recommending news articles to a given user, the hierarchies of the multiple user groups that the user belongs to are merged into an optimal one. Finally, a list of news articles is selected from this optimal hierarchy based on the user's personalized information as the recommendation result. Extensive empirical experiments on a set of news articles collected from various popular news websites demonstrate the efficacy of our proposed approach.

9.
Personalized news recommendation has long been a popular research topic in recommender systems. Previous methods strive to satisfy users by constructing user preference profiles. Most recent studies use users' reading history (content-based) or access patterns (collaborative-filtering-based) to recommend newly published news. In this way, they consider only the relationship between news articles and users and ignore the background context of the news reports. In other words, they fail to provide more useful information because they do not consider the progression of the news story chain. In this paper, we propose a definition of the quality of a news story chain. We also propose a method for constructing a news story chain on a news corpus with date information. We then use a greedy selection method to filter the final recommended news articles, taking both accuracy and diversity into account. In this way, we can provide users with news articles that meet their requirement: after reading the recommended news, the user gains a better understanding of the progression of the news stories they read before. Finally, we compare our method with state-of-the-art approaches in several experiments, and the experimental results show that the proposed method significantly improves the accuracy, diversity and NDCG metrics.
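
The greedy selection step in item 9, which balances accuracy against diversity, can be sketched as follows. The abstract does not give the scoring function, so this sketch swaps in a generic relevance-minus-redundancy criterion (in the style of maximal marginal relevance); the function `greedy_select`, the `trade_off` parameter, and the toy data are all assumptions made for illustration, not the authors' method.

```python
def greedy_select(candidates, relevance, similarity, k, trade_off=0.7):
    """Greedily pick up to k items, trading relevance against redundancy.

    relevance:  dict mapping item -> relevance score (a proxy for accuracy)
    similarity: function (item, item) -> similarity in [0, 1]
    trade_off:  weight on relevance; 1.0 ignores diversity entirely
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:

        def score(item):
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max(
                (similarity(item, chosen) for chosen in selected), default=0.0
            )
            return trade_off * relevance[item] - (1 - trade_off) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical usage: "a" and "b" are near-duplicates, "c" is less relevant
# but different, so a diversity-aware selection prefers "c" over "b".
relevance = {"a": 0.9, "b": 0.85, "c": 0.5}


def toy_similarity(x, y):
    return 1.0 if {x, y} == {"a", "b"} else 0.0


diverse = greedy_select(["a", "b", "c"], relevance, toy_similarity, k=2, trade_off=0.5)
```

With `trade_off=1.0` the same call reduces to a plain top-k by relevance, which makes the effect of the diversity term easy to compare.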

10.
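Web Services are application entities for building applications that form an API under specific conditions; they are also an interoperable distributed application platform that can run on any operating system supporting the HTTP protocol. On the network, the service provider offers a Web Services platform that not only provides the relevant network services but also provides a standard for describing them; a client can invoke a service from any other point on the network and obtain enough information to learn how to invoke it. This paper designs and develops a college entrance examination (gaokao) service system based on Web Services technology: a client system is developed for mobile terminals and obtains the various kinds of information provided by the server through Web Services. The system uses several algorithms to implement modules such as simulated application-preference filing, university lookup, and study planning, and also provides tool modules such as gaokao news, utilities and entertainment, and psychological counseling, helping candidates perform better during the examination.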

11.
Online news articles, as a new format of press releases, have sprung up on the Internet. Because of their convenience and recency, more and more people prefer to read news online instead of reading paper-format press releases. However, a gigantic number of news events might be released at a rate of hundreds, even thousands, per hour. A challenging problem is how to efficiently select specific news articles from a large corpus of newly published press releases to recommend to individual readers, where the selected news items should match the reader's reading preference as closely as possible. This issue is referred to as personalized news recommendation. Recently, personalized news recommendation has become a promising research direction as the Internet provides fast access to real-time information from multiple sources around the world. Existing personalized news recommendation systems strive to adapt their services to individual users by virtue of both user and news content information. A variety of techniques have been proposed to tackle personalized news recommendation, including content-based systems, collaborative filtering systems, and hybrid versions of the two. In this paper, we provide a comprehensive investigation of existing personalized news recommenders. We discuss several essential issues underlying the problem of personalized news recommendation and explore possible solutions for performance improvement. Further, we provide an empirical study on a collection of news articles obtained from various news websites and evaluate the effect of different factors on personalized news recommendation. We hope our discussion and exploration will provide insights for researchers who are interested in personalized news recommendation.

12.
The problem of automatically extracting multiple news attributes from news pages is studied in this paper. Most previous work on web news article extraction focuses only on content. To meet a growing demand for web data integration applications, more useful news attributes, such as title, publication date, author, etc., need to be extracted from news pages and stored in a structured way for further processing. An automatic unified approach to extract such attributes based on their visual features, including independent and dependent visual features, is proposed. Unlike conventional methods, such as extracting attributes separately or generating template-dependent wrappers, the basic idea of this approach is twofold. First, candidates for each news attribute are extracted from the page based on their independent visual features. Second, the true value of each attribute is identified from the candidates based on dependent visual features such as the layout relationships among the attributes. Extensive experiments with a large number of news pages show that the proposed approach is highly effective and efficient.

13.
14.
Following a study that demonstrated that user comments have a strong impact on public opinion, Popular Science magazine decided to disable its user comments option. Prompted by this dramatic decision, this study used an eye-tracking experiment (N = 197) to study the popularity of user comments, and the effects of pre-existing opinions, readership patterns and the tone of user comments on the evaluation of news articles. Despite extensive research using eye tracking to study online behavior, very few previous studies have engaged with online news consumption. This is the first eye-tracking study to test for a correlation between reading user comments and evaluating a news article. Although more than 40% read the user comments, the most significant and persistent predictor of readers' evaluations of the articles was their pre-existing opinions about the articles' theme, while readership had no effect on the articles' evaluation. Follow-up interviews demonstrate that readers commonly view user comments as a realm characterized by biases and commercialization.

15.
With the development of mobile technology, users' browsing habits are gradually shifting from pure information retrieval to active recommendation. Mapping users' interests to web content through classification has become more and more difficult as the volume and variety of web pages grow. Some big news portal sites and social media companies hire more editors to label new concepts and words, and use computing servers with larger memory to handle massive document classification based on traditional supervised or semi-supervised machine learning methods. This paper provides an optimized classification algorithm for massive web page classification using semantic networks such as Wikipedia and WordNet. We use the Wikipedia data set and initialize a few category entity words as class words. A weight estimation algorithm based on the depth and breadth of the Wikipedia network is used to calculate the class weight of all Wikipedia Entity Words. A kinship-relation association based on the content similarity of entities is then proposed to mitigate the imbalance problem that arises when a category node inherits probability from multiple parents. The keywords in a web page are extracted from the title and the main text using N-grams matched against Wikipedia Entity Words, and a Bayesian classifier is used to estimate the page class probability. Experimental results show that the proposed method achieves good scalability, robustness and reliability for massive web pages.

16.
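In web search results, users often get redundant pages with identical content. A news web page deduplication algorithm is proposed that learns news content through the topical elements of news. The basic idea is as follows: first, extract the phrases in the news elements that describe the time and place of the event; then, use the extracted time and place phrases to extract the news content; finally, judge the degree of duplication between news pages by computing the similarity of the learned news content. Experimental results show that the method can deduplicate news pages on the basis of news content and achieves high recall and precision.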

17.
Social media has become an important source of information and a medium for following and spreading trends, news, and ideas all over the world. Although determining the subjects of individual posts is important to extract users' interests from social media, this task is nontrivial because posts are highly contextualized and informal and have limited length. To address this problem, we propose a user modeling framework that maps the content of texts in social media to relevant categories in news media. In our framework, the semantic gaps between social media and news media are reduced by using Wikipedia as an external knowledge base. We map term-based features from a short text and a news category into Wikipedia-based features such as Wikipedia categories and article entities. A user's microposts are thus represented in a rich feature space of words. Experimental results show that our proposed method using Wikipedia-based features outperforms other existing methods of identifying users' interests from social media.

18.
19.
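Applying personalized recommendation technology to news-reading applications, where its speed and precision help users quickly find news of interest, is a research direction worth exploring. A news recommendation system is designed and implemented. Based on user-based collaborative filtering, the system collects user data, computes a reading-time factor to correct user preference values, takes news popularity into account, and uses popularity to penalize user similarity values; it then performs Top-N ranking of the news a user has not yet read, based on the set of similar neighbors, to obtain a recommendation list and push news of interest to the user. Tests show that the prototype system can update the user interest model in real time, recommends news that is both fresh and accurate, and meets the design goals for all functions.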
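
The pipeline in item 19 (user-based collaborative filtering, popularity-penalized user similarity, Top-N ranking over similar neighbors) can be sketched roughly as below. The abstract gives no formulas, so everything concrete here is an assumption: the penalty `1 / log(1 + popularity)` is a commonly used way to down-weight popular items, the preference values in `prefs` are taken to be already corrected by the reading-time factor, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def user_similarity(prefs):
    """Pairwise user similarity in which popular news counts for less.

    prefs: dict user -> dict news_id -> preference value (assumed to be
    already adjusted by a reading-time factor).
    """
    # Popularity of a news item = number of users who read it.
    popularity = defaultdict(int)
    for items in prefs.values():
        for item in items:
            popularity[item] += 1

    sims = defaultdict(dict)
    users = list(prefs)
    for i, u in enumerate(users):
        for v in users[i + 1:]:
            common = prefs[u].keys() & prefs[v].keys()
            if not common:
                continue
            # Co-reading a very popular item says little about shared taste.
            overlap = sum(1.0 / math.log(1 + popularity[item]) for item in common)
            value = overlap / math.sqrt(len(prefs[u]) * len(prefs[v]))
            sims[u][v] = sims[v][u] = value
    return sims


def recommend(user, prefs, sims, k_neighbors=2, top_n=3):
    """Rank unread news by the similarity-weighted preferences of neighbors."""
    neighbors = sorted(sims[user].items(), key=lambda kv: kv[1], reverse=True)
    scores = defaultdict(float)
    for neighbor, sim in neighbors[:k_neighbors]:
        for item, value in prefs[neighbor].items():
            if item not in prefs[user]:
                scores[item] += sim * value
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, _ in ranked[:top_n]]


# Hypothetical usage with three readers and four news items.
prefs = {
    "alice": {"n1": 1.0, "n2": 0.5},
    "bob": {"n1": 1.0, "n3": 0.8},
    "carol": {"n4": 1.0},
}
sims = user_similarity(prefs)
suggestions = recommend("alice", prefs, sims)
```

In this toy data only bob overlaps with alice, so alice's recommendations come from the news bob read that she has not.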

20.
In this paper, we propose a personalized recommendation system for mobile application software (apps) that uses semantic relations among the apps consumed by users. To do so, we define semantic relations between the apps consumed by a specific member and those consumed by his/her social members using an ontology. Based on these relations, we identify the most similar social members through a reasoning process. The reasoning measures the attributes that the apps consumed by the target member share with those consumed by his/her social members. The more attributes they share, the more similar their preferences for consuming apps. We also develop a prototype of our system using OWL (Web Ontology Language) by defining ontology-based semantic relations among 50 mobile apps. Using the prototype, we show that our recommendation algorithm is feasible in practice and useful for analyzing the preferences of mobile users.
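
The idea in item 20 that members are more similar the more attributes their consumed apps share can be sketched with plain sets. This is only an illustration: the paper reasons over an OWL ontology, whereas this sketch replaces the ontology with a hypothetical dictionary of app attributes, and the function names and sample data are invented for the example.

```python
def shared_attribute_count(apps_a, apps_b, attributes):
    """Count distinct attributes shared by two members' consumed apps.

    attributes: dict app -> set of attribute labels (a stand-in for the
    ontology-based relations described in the abstract).
    """
    attrs_a = set().union(*(attributes[app] for app in apps_a))
    attrs_b = set().union(*(attributes[app] for app in apps_b))
    return len(attrs_a & attrs_b)


def most_similar_member(target, consumed, attributes):
    """Return the social member whose apps share the most attributes."""
    others = [member for member in consumed if member != target]
    if not others:
        return None
    return max(
        others,
        key=lambda m: shared_attribute_count(consumed[target], consumed[m], attributes),
    )


# Hypothetical data: app attributes and the apps each member has consumed.
attributes = {
    "chess": {"game", "board"},
    "go": {"game", "board"},
    "maps": {"navigation"},
}
consumed = {"target": ["chess"], "m1": ["go"], "m2": ["maps"]}
closest = most_similar_member("target", consumed, attributes)
```

Apps consumed by the closest member that the target has not yet used would then be natural recommendation candidates.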
