Similar literature
1.
More and more content on the Web is generated by users. To organize this information and make it accessible via current search technology, tagging systems have gained tremendous popularity. Especially for multimedia content, they allow users to annotate resources with keywords (tags), which opens the door to classic text-based information retrieval. To support the user in choosing the right keywords, tag recommendation algorithms have emerged. In this setting, not only the content is decisive for recommending relevant tags but also the user's preferences. In this paper we introduce an approach to personalized tag recommendation that combines a probabilistic model of tags from the resource with tags from the user. As models we investigate simple language models as well as Latent Dirichlet Allocation. Extensive experiments on a real-world dataset crawled from a large tagging system show that personalization improves tag recommendation, and that our approach significantly outperforms state-of-the-art approaches.
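
The combination of a resource tag model with a user tag model described above can be sketched as a linear mixture of two unigram language models. This is only an illustration under stated assumptions: the function names, the `weight` parameter, and the plain maximum-likelihood estimates are hypothetical and are not taken from the paper, which also investigates Latent Dirichlet Allocation.

```python
from collections import Counter


def unigram_model(tags):
    """Maximum-likelihood unigram distribution over a list of tags."""
    counts = Counter(tags)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}


def recommend_tags(resource_tags, user_tags, weight=0.5, top_n=3):
    """Rank tags by a linear mixture of the resource and user models.

    `weight` (an assumed parameter) is the share given to the resource model;
    the remainder goes to the user's own tagging history.
    """
    p_resource = unigram_model(resource_tags)
    p_user = unigram_model(user_tags)
    vocabulary = set(p_resource) | set(p_user)
    scores = {
        tag: weight * p_resource.get(tag, 0.0)
        + (1 - weight) * p_user.get(tag, 0.0)
        for tag in vocabulary
    }
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda tag: (-scores[tag], tag))[:top_n]
```

Setting `weight` to 1.0 ignores the user entirely, so comparing rankings across weights is one simple way to see what personalization contributes.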

2.
An image retrieval clustering method for collaborative tagging systems   Cited by: 2 (self-citations: 0, citations by others: 2)
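To retrieve images more effectively, an image retrieval clustering method for Web 2.0 collaborative tagging systems is proposed. The algorithm first addresses the inconsistency in the tag space caused by the diversity of tag expressions: it mines lexical relations among tags to perform semantic-level query expansion, which yields an expanded image result set that is likely to be semantically relevant. Then, using a measure of relatedness between tags, it selects the set of tags in the image result set that are highly related to the query tag, and applies a top-down heuristic graph partitioning algorithm to automatically classify the set of secondarily related tags. Finally, the image result set is clustered according to the tag classification. To validate the method, a large number of real images, together with their tag metadata, were randomly downloaded from the tagged photo sharing site Flickr, and extensive experiments were run on the implemented image retrieval prototype system PivotBrowser. The results show that the clustering algorithm effectively resolves both inconsistent tag expression in the tag space and the ambiguity of tag queries, and provides users with more satisfactory retrieval.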

3.
The advent of the Internet has led to a significant growth in the amount of information available, resulting in information overload, i.e. individuals have too much information to make a decision. To resolve this problem, collaborative tagging systems form a categorization called folksonomy in order to organize web resources. A folksonomy aggregates the results of personal free tagging of information and objects to form a categorization structure that utilizes the collective intelligence of crowds. Folksonomy is more appropriate for organizing huge amounts of information on the Web than traditional taxonomies established by expert cataloguers. However, the attributes of collaborative tagging systems and their folksonomy make them impractical for organizing resources in personal environments. This work designs a desktop collaborative tagging (DCT) system that enables collaborative workers to tag their documents, and proposes an application in patent analysis based on the DCT system. Folksonomy in DCT is built by aggregating personal tagging results, and is represented by a concept space. Concept spaces provide synonym control, tag recommendation and relevant search. Additionally, to protect the privacy of authors and to decrease the transmission cost, relations between tagged and untagged documents are constructed by extracting document features rather than adopting the full text. Experimental results reveal that the adoption rate of recommended tags for new documents increases by 10% after users have tagged five or six documents. Furthermore, DCT can recommend tags with higher adoption rates when given new documents with similar topics to previously tagged ones. The relevant search in DCT is observed to be superior to keyword search when adopting frequently used tags as queries. The average precision, recall, and F-measure of DCT are 12.12%, 23.08%, and 26.92% higher than those of keyword searching. DCT allows a multi-faceted categorization of resources for collaborative workers and recommends tags for categorizing resources to make categorization easier. Additionally, the DCT system provides relevance searching, which is more effective than traditional keyword searching for searching personal resources.

4.
In recent years, as the amount of data grows, personal information management has become essential as well as challenging in everyday life. Tagging, an alternative or complement to classifying into tree-structured directories, allows users to classify a single information item in multiple categories. Due to their flexibility, tagging systems have become popular and a number of studies have been conducted. Most of the previous research investigated the quality of tags with various tools such as questionnaires. However, the actual usage behavior of tag-based browsing and retrieval of stored information has rarely been studied. In this study, we examined the effects of tag attributes on user behavior in browsing self-tagged documents under personal information management settings.

Three attributes, tag commonness, tag frequency and tag position, were identified. A controlled experiment with tagging and retrieval tasks that traced users’ behavior revealed that tags with higher tag commonness, higher tag frequency, and lower tag position were more likely to be used. Tags with lower tag commonness and lower tag frequency helped users recognize a desired document among a list of candidates. Among the three attributes, tag position was found to be the most influential. The findings of this study are expected to enhance the understanding of tag quality and help information designers build an effective tagging environment.


5.
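With the spread and popularization of the Web, ever more online data is being produced. Tagging systems have been widely adopted so that people can organize and use this information through search technology. They allow users to annotate resources with keywords (tags), which provides a route to traditional text-based information retrieval. To help users choose the right keywords, tag recommendation algorithms have emerged. A personalized tag recommendation method is proposed that combines the user's resource tags with a probabilistic tag model. The model uses simple language models and Latent Dirichlet Allocation, and extensive experiments were carried out on a large real-world dataset. The experiments show that the personalized method improves tag recommendation and that its recommendations outperform those of traditional methods.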

6.
7.
In recent years, social Web users have been overwhelmed by the huge amount of social media available. Consequently, users have trouble finding social media suited to their needs. To help such users retrieve useful social media content, we propose a new model of tag-based personalized search that enhances not only retrieval accuracy but also retrieval coverage. By leveraging social tagging as a preference indicator, we build two models: (i) a latent tag preference model that reflects how a certain user has assigned tags similar to a given tag and (ii) a latent tag annotation model that captures how users have assigned a certain tag to resources similar to a given resource. We then seamlessly map the tags onto items, depending on an individual user's query, to find the most desirable content relevant to the user's needs. Experimental results demonstrate that the proposed method significantly outperforms state-of-the-art algorithms and show our method's feasibility for personalized search in social media services.

8.
王洁  于颜硕  周宽久  侯刚 《计算机科学》2014,41(12):197-201
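Web tags help users classify, organize, and retrieve information resources according to their own specific interests. However, precisely because collaborative tagging systems are open and unconstrained, using them to describe, organize, classify, and retrieve information resources suffers from imprecise descriptions, disorganized tags, and semantically ambiguous tags. Against this background, three SOINN clustering algorithms for Web tags based on feature vector representation (FVR) are proposed: a resource-based feature vector representation, a feature vector representation based on other co-occurring tags, and a feature vector representation based on co-occurring tags over the full set. The MapReduce framework is also used to parallelize the SOINN algorithm. Experiments show that when the number of cluster centers exceeds 2000, the three distributed FVR clustering algorithms achieve better recall and accuracy than the original algorithm and obtain a good speedup. This demonstrates that the distributed clustering algorithm scales well and can be applied to clustering analysis systems for even larger volumes of Web logs.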

9.
Social tagging systems are widely applied in Web 2.0. Many users use these systems to create, organize, manage, and share Internet resources freely. However, many ambiguous and uncontrolled tags produced by social tagging systems not only worsen users’ experience, but also restrict resources’ retrieval efficiency. Tag clustering can aggregate tags with similar semantics together, and help mitigate the above problems. In this paper, we first present a common co-occurrence group similarity based approach, which employs the ternary relation among users, resources, and tags to measure the semantic relevance between tags. Then we propose a spectral clustering method to address the high dimensionality and sparsity of the annotating data. Finally, experimental results show that the proposed method is useful and efficient.

10.
In many social tagging systems, users can see the tags already created by others. Prior research has shown that this exposure leads users to create tags that are semantically related to the previous ones. We investigate two possible mechanisms through which this occurs, semantic priming and strategic choice. Semantic priming occurs when an existing tag subconsciously primes the user’s mind to suggest semantically related tags. In an experiment, no such effect is found, in contrast to prior research. A follow-up study shows that whether semantic priming occurs depends on whether the person uses others’ previously created tags or is just passively exposed to them. The second type of influence, strategic choice, occurs in ESP-type settings. It refers to behavior in which a user chooses words that are semantically related to an existing tag in order to increase the chances of matching one’s partner. Experimental results provide clear evidence of this strategic influence. In a follow-up study, we demonstrate that there is a meaningful difference in the tag sets that are created under the influence of strategic choice. Our work sheds light on the conditions and mechanisms through which existing tags influence subsequent tagging behavior.

11.
Social online communities and platforms play a significant role in the activities of software developers, either as an integral part of the main activities or through complementary knowledge and information sharing. As such techniques become more prevalent, resulting in a wealth of shared information, the need to effectively organize and sift through the information becomes more important. Top-down approaches such as formal hierarchical directories have been shown to lack the scalability needed in these circumstances. Light-weight bottom-up techniques such as community tagging have shown promise for better organizing the available content. However, in more focused communities of practice, such as software engineering and development, community tagging can face challenges such as tag explosion, locality of tags and interpretation differences, to name a few. To address these challenges, we propose a semantic tagging approach that benefits from the information available in Wikipedia to semantically ground the tagging process and provide a methodical approach for tagging social software engineering content. We have shown that our approach is able to provide high quality tags for social software engineering content that can be used not only for organizing such content but also for making meaningful and relevant content recommendations to users, both within a local community and across multiple social online communities. We have empirically validated our approach through four main research questions. The results of our observations show that the proposed approach is quite effective in organizing social software engineering content and making relevant, helpful and novel content recommendations to software developers and users of social software engineering communities.

12.
Social tagging systems leverage social interoperability by facilitating the searching, sharing, and exchanging of tagging resources. A major drawback of existing social tagging systems is that social tags are used as keywords in keyword-based search. They focus on keywords and human interpretability rather than on computer-interpretable semantic knowledge. Therefore, social tags are useful for information sharing and organizing, but they lack the computer-interpretability needed to facilitate personalized social tag recommendation. An interesting issue is how to automatically generate a personalized social tag recommendation list for users when they access a resource. The novel solution proposed in this study is a hybrid approach based on a semantic tag-based resource profile and user preference to provide personalized social tag recommendation. Experiments measuring precision and recall show that the proposed hybrid approach effectively improves the accuracy of social tag recommendation.

13.
Automatic image tagging assigns semantic keywords, called tags, to images, which significantly facilitates image search and organization. Most present image tagging approaches are constrained by the model learned from the training dataset, and moreover they do not exploit other types of web resources (e.g., web text documents). In this paper, we propose a search-based image tagging algorithm (CTSTag), in which the result tags are derived from web search results. Specifically, it assigns the query image a more comprehensive tag set derived from both web images and web text documents. First, a content-based image search technology is used to retrieve a set of visually similar images, which are ranked by their semantic consistency values. Then, a set of relevant tags is derived from these top-ranked images as the initial tag set. Second, a text-based search is used to retrieve other relevant web resources by using the initial tag set as the query. After a denoising process, the initial tag set is expanded with other tags mined from the text-based search results. Then, a probability flow measure method is proposed to estimate the probabilities of the expanded tags. Finally, all the tags are refined using the Random Walk with Restart (RWR) method and the top ones are assigned to the query image. Experiments on the NUS-WIDE dataset show not only the performance of the proposed algorithm but also the advantage of image retrieval and organization based on the result tags.
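
The final refinement step named above, Random Walk with Restart, can be illustrated with a small power iteration over a tag graph. This is a generic sketch of RWR, not the CTSTag implementation: the graph, the edge weights, the `restart_prob` value, and the function name are all assumptions made for the example.

```python
def random_walk_with_restart(adjacency, restart, restart_prob=0.15, iterations=100):
    """Iterate scores = c * restart + (1 - c) * (scores walked one step).

    adjacency[i][j] is the non-negative weight of the edge from tag i to tag j
    (assumed to come from some tag similarity measure); `restart` is a
    probability vector, for example the initial tag probabilities.
    """
    n = len(adjacency)
    # Row-normalize so each tag spreads its score over its outgoing edges.
    transition = []
    for row in adjacency:
        total = sum(row)
        transition.append([w / total if total else 0.0 for w in row])
    scores = list(restart)
    for _ in range(iterations):
        scores = [
            restart_prob * restart[j]
            + (1 - restart_prob)
            * sum(scores[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores
```

Tags would then be ranked by their final score and the top ones kept. A fixed iteration count is used here for brevity; a convergence check on the change in `scores` is the more usual stopping rule.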

14.
Self-organizing maps (SOM) have been applied to numerous data clustering and visualization tasks and have received much attention for their success. One major shortcoming of the classical SOM learning algorithm is the need for a predefined map topology. Furthermore, hierarchical relationships among data are difficult to find. Several approaches have been devised to overcome these deficiencies. In this work, we propose a novel SOM learning algorithm which incorporates several text mining techniques in expanding the map both laterally and hierarchically. When training on a set of text documents, the proposed algorithm first clusters them using the classical SOM algorithm. We then identify the topics of each cluster. These topics are then used to evaluate the criteria for expanding the map. The major characteristic of the proposed approach is that it combines the learning process with the text mining process, which makes it suitable for automatic organization of text documents. We applied the algorithm to the Reuters-21578 dataset in text clustering and categorization tasks. Our method outperforms two comparison models in hierarchy quality according to users’ evaluation. It also achieves better F1-scores than two other models in the text categorization task.

15.
Learning Social Tag Relevance by Neighbor Voting   Cited by: 2 (self-citations: 0, citations by others: 2)
Social image analysis and retrieval is important for helping people organize and access the increasing amount of user tagged multimedia. Since user tagging is known to be uncontrolled, ambiguous, and overly personalized, a fundamental problem is how to interpret the relevance of a user-contributed tag with respect to the visual content the tag is describing. Intuitively, if different persons label visually similar images using the same tags, these tags are likely to reflect objective aspects of the visual content. Starting from this intuition, we propose in this paper a neighbor voting algorithm which accurately and efficiently learns tag relevance by accumulating votes from visual neighbors. Under a set of well-defined and realistic assumptions, we prove that our algorithm is a good tag relevance measurement for both image ranking and tag ranking. Three experiments on 3.5 million Flickr photos demonstrate the general applicability of our algorithm in both social image retrieval and image tag suggestion. Our tag relevance learning algorithm substantially improves upon baselines for all the experiments. The results suggest that the proposed algorithm is promising for real-world applications.
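
The voting intuition above can be shown in a few lines: each visual neighbor casts one vote for every tag it shares with the image. This sketch assumes the visual neighbors have already been found by some image similarity search, which is out of scope here. The function name and data layout are hypothetical, and any correction for tag frequency that the paper may apply is omitted.

```python
from collections import Counter


def tag_relevance(image_tags, neighbor_tag_sets):
    """Count, for each tag of an image, how many visual neighbors also carry it.

    `neighbor_tag_sets` holds one collection of tags per visual neighbor
    (assumed to be precomputed by a content-based image search).
    """
    own_tags = set(image_tags)
    votes = Counter()
    for tags in neighbor_tag_sets:
        # Each neighbor votes at most once per tag.
        for tag in set(tags) & own_tags:
            votes[tag] += 1
    return {tag: votes[tag] for tag in image_tags}
```

Sorting an image's tags by these counts gives a tag ranking, and sorting images by the count for a query tag gives an image ranking, matching the two uses the abstract mentions.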

16.
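In social tagging systems, data mining techniques such as clustering are often used to address tag redundancy and semantic ambiguity. Most existing tag clustering algorithms compute the similarity between tags from the number of times different tags co-occur on objects, but the precision and recall of clustering with this approach are not high. To address this problem, a new tag clustering algorithm is proposed that takes full account of the tagging information: each tag is represented precisely by an object-based feature vector, a more accurate tag similarity is obtained with the cosine similarity formula, and the user tags are then clustered with the K-Means algorithm. Experimental results show that the algorithm yields more precise clustering results.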
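
A minimal sketch of the pipeline described above: represent each tag by a vector over the objects it labels, compare tags by cosine similarity, and group them with a K-Means style loop. The data layout, the function names, and the choice to assign tags to the centroid with the highest cosine similarity are assumptions for illustration; the paper's exact feature weighting and K-Means setup are not given in the abstract.

```python
import math
from collections import defaultdict


def tag_vectors(annotations, objects):
    """Map each tag to a vector counting how often it labels each object."""
    index = {obj: i for i, obj in enumerate(objects)}
    vectors = defaultdict(lambda: [0.0] * len(objects))
    for obj, tag in annotations:
        vectors[tag][index[obj]] += 1.0
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def kmeans_by_cosine(vectors, initial_tags, iterations=10):
    """Cluster tags; each tag joins the centroid it is most similar to.

    `initial_tags` (an assumed, caller-chosen seed list) fixes the starting
    centroids so that the result is deterministic.
    """
    centroids = [list(vectors[t]) for t in initial_tags]
    assignment = {}
    for _ in range(iterations):
        assignment = {
            tag: max(range(len(centroids)), key=lambda c: cosine(vec, centroids[c]))
            for tag, vec in vectors.items()
        }
        # Move each centroid to the mean of the tag vectors assigned to it.
        for c in range(len(centroids)):
            members = [vectors[t] for t, a in assignment.items() if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment
```

Because two tags that never co-occur with each other can still share objects with a third tag, the object-based vectors can group them where raw pairwise co-occurrence counts would not.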

17.
18.
Email is one of the most popular forms of communication nowadays, mainly due to its efficiency, low cost, and compatibility with diverse types of information. In order to facilitate better usage of emails and explore the business potential of emailing, various data mining techniques have been applied to email data. In this paper, we present a brief survey of the major research efforts on email mining. To emphasize the differences between email mining and general text mining, we organize our survey around five major email mining tasks, namely spam detection, email categorization, contact analysis, email network property analysis and email visualization. Those tasks are inherently incorporated into various usages of emails. We systematically review the commonly used techniques and also discuss the related software tools available.

19.
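This paper summarizes the main information needs of users at the Chinese Academy of Sciences and surveys related research abroad in four areas: unified discovery of digital resources, linked and object-oriented organization of digital resources, relevance retrieval techniques based on knowledge organization systems, and visual retrieval and presentation techniques. It then proposes an architectural framework for an integrated, visual knowledge retrieval service platform and, finally, describes the implementation of the platform's functions.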

20.
In a social tagging system, a user annotates an item with a tag. The tagging information is utilized in the recommendation process. In this paper, we propose a hybrid item recommendation method to mitigate limitations of existing approaches and propose a recommendation framework for social tagging systems. The proposed framework consists of tag and item recommendations. Tag recommendation helps users annotate tags and enriches the dataset of a social tagging system. Item recommendation utilizes tags to recommend relevant items to users. We investigate association rules, bigrams, tag expansion, and implicit trust relationships for providing tag and item recommendations within the framework. The experimental results show that the proposed hybrid item recommendation method generates more appropriate items than existing research studies on a real-world social tagging dataset.
