Similar articles
1.
Online communities can be an attractive source of ideas for product and process innovations. However, innovative user-contributed ideas may be few. From a perspective of harnessing "big data" for inbound open innovation, the detection of good ideas in online communities is a problem of detecting rare events. Recent advances in text analytics and machine learning have made it possible to screen vast amounts of online information and automatically detect user-contributed ideas. However, it is still uncertain whether the ideas identified by such systems will also be regarded as sufficiently novel, feasible and valuable by firms that might decide to develop them further. A validation study is reported in which 200 posts from an online home brewing community were extracted by an automatic idea detection system. Two professionals from a brewing company evaluated the posts in terms of idea content, idea novelty, idea feasibility and idea value. The results suggest that the automatic idea detection system is sufficiently valid to be deployed for the harvesting and initial screening of ideas, and that the profile of the identified ideas (in terms of novelty, feasibility and value) follows the same pattern identified in studies of user ideation in general.

2.
Research indicates that creative ideas provide the seed for successful service innovations. However, little attention has been paid to understanding idea creation, especially for service innovations. Lead user analysis has been shown to provide the highest potential to create attractive innovation ideas, but which lead user characteristics matter in this regard is still under-researched. In the context of an idea contest, we examine the impact of specific lead user characteristics in driving the quality of service innovation ideas. Our study broadens the understanding of which customers are suitable and should be activated for service innovation idea contests. Using the data of 120 ideas resulting from an idea contest for new online services of soccer clubs, our findings demonstrate that specific lead user characteristics affect the quality of service ideas generated. We find that dissatisfaction with existing services has the highest impact on idea quality. Thus, companies should make use of their complaint management database to invite dissatisfied users to participate in idea contests. The results also show that highly experienced users provide ideas of higher quality. Our findings imply that companies should design closed-membership idea contests so that only people who show specific characteristics can be admitted.

3.
Social networking sites such as Facebook or Twitter attract millions of users, who every day post an enormous amount of content in the form of tweets, comments and posts. Since social network texts are usually short, learning tasks have to deal with a very high-dimensional and sparse feature space, in which most features have low frequencies. As a result, extracting useful knowledge from such noisy data is a challenging task, which makes large-scale short-text learning in social environments one of the most relevant problems in machine learning and data mining. Feature selection is one of the best-known and most commonly used techniques for reducing the impact of the high-dimensional feature space in text learning. A wide variety of feature selection techniques can be found in the literature applied to traditional long texts and document collections. However, short texts coming from the social Web pose new challenges to this well-studied problem: their shortness offers limited context from which to extract enough statistical evidence about word relations (e.g. correlation), and instances usually arrive in continuous streams (e.g. a Twitter timeline), so that the number of features and instances is unknown, among other problems. This paper surveys feature selection techniques for dealing with short texts in both offline and online settings. Then, open issues and research opportunities for performing online feature selection over social media data are discussed.
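
As a rough illustration of what feature selection over short texts can look like, the sketch below keeps only terms that occur in at least `min_df` documents. Document-frequency thresholding is chosen here purely for simplicity and is an assumption; the abstract does not say which techniques the survey covers, and the function name `select_by_document_frequency` and the sample posts are hypothetical.

```python
from collections import Counter


def select_by_document_frequency(docs, min_df=2):
    """Keep terms that appear in at least `min_df` documents.

    Hypothetical helper: a simple offline filter, not a method
    taken from the surveyed paper.
    """
    document_frequency = Counter()
    for doc in docs:
        # Count each term once per document, ignoring case.
        document_frequency.update(set(doc.lower().split()))
    return sorted(t for t, n in document_frequency.items() if n >= min_df)


# Made-up short posts, for illustration only.
posts = ["good beer", "bad beer", "good day"]
selected = select_by_document_frequency(posts, min_df=2)
```

In a streaming setting the counter would have to be updated incrementally as posts arrive, since the abstract notes that the number of features and instances is not known in advance.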

4.
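Addressing the characteristics that distinguish trade texts from ordinary texts, this paper proposes a topic mining model for trade policy texts. It studies the Trade Policy Review reports of the World Trade Organization, summarizes the main content of the texts and the trends in topic change, and provides valuable information support to the Ministry of Commerce and China's mission to the WTO, making it possible to process large volumes of text quickly and effectively. Extensive experiments demonstrate the effectiveness of the topic mining model.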

5.
This study analyses the online questions and chat messages automatically recorded by a live video streaming (LVS) system using data mining and text mining techniques. We applied data mining and text mining techniques to analyze two different datasets and then conducted an in-depth correlation analysis for the two educational courses with the most online questions and chat messages respectively. The study found discrepancies as well as similarities in the students' patterns and themes of participation between online questions (student–instructor interaction) and online chat messages (student–student or peer interaction). The results also identify disciplinary differences in students' online participation. A correlation is found between the number of online questions students asked and students' final grades. The data suggest that combining data mining and text mining techniques on a large amount of online learning data can yield considerable insights and reveal valuable patterns in students' learning behaviors. Limitations of data and text mining were also revealed and are discussed in the paper.

6.
Ye Junmin (叶俊民), Luo Daxiong (罗达雄), Chen Shu (陈曙). Acta Automatica Sinica (《自动化学报》), 2020, 46(9): 1927-1940
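Current research on predicting online learning performance from short-text sentiment information has the following problems: (1) existing sentiment classification models cannot adapt effectively to the short-text characteristics of online learning communities, so classification performance is poor; (2) research on quantitatively predicting online learning performance from short-text sentiment still has considerable room for improvement in accuracy. To address these problems, this paper proposes a short-text sentiment-enhanced performance prediction method. First, short-text semantics are modeled at the word and sentence levels, and an attention mechanism based on learner features is proposed to identify the language expression characteristics of different learners, yielding a sentiment probability distribution vector. Second, sentiment information is fused with statistical and learning-behavior information, and the learner's learning state is modeled with a long short-term memory network. Finally, the learner's grade is predicted from the learning state. Experiments on a real dataset composed of three different categories of courses show that the method can effectively classify the sentiment of short texts in learning communities and can improve the accuracy of grade prediction for online learners. In addition, a case analysis illustrates the associations among sentiment information, learning state, and grades.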

7.
Research related to online discussions frequently faces the problem of analyzing huge corpora. Natural Language Processing (NLP) technologies may allow automating this analysis. However, state-of-the-art machine learning and text mining approaches yield models that do not transfer well between corpora related to different topics. Also, segmenting is a necessary step, but trained models are frequently very sensitive to the particulars of the segmentation that was used when the model was trained. Therefore, in prior published research on text classification in a CSCL context, the data was segmented by hand. We discuss work towards overcoming these challenges. We present a framework for developing coding schemes optimized for automatic segmentation and context-independent coding that builds on this segmentation. The key idea is to extract the semantic and syntactic features of each single word by using the techniques of part-of-speech tagging and named-entity recognition before the raw data can be segmented and classified. Our results show that the coding on the micro-argumentation dimension can be fully automated. Finally, we discuss how fully automated analysis can enable context-sensitive support for collaborative learning.

8.
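Text classification, a core problem in text mining, has become an important topic in natural language processing. Short-text classification, because of its sparsity, real-time nature, and irregularity, has become one of the pressing problems in text classification. In certain scenarios short texts carry a large amount of implicit semantics, which makes it challenging to mine implicit semantic features within limited text. Existing methods for short-text classification mainly use traditional machine learning or deep learning algorithms, but model construction for these algorithms is complex, labor-intensive, and inefficient. In addition, short texts contain little useful information and are highly colloquial, which places high demands on a model's feature-learning ability. To address these problems, the KAe RCNN model is proposed, which builds on the TextRCNN model and integrates knowledge awareness with a dual attention mechanism. Knowledge awareness comprises knowledge graph entity linking and knowledge graph embedding, which introduce external knowledge to obtain semantic features, while the dual attention mechanism improves the efficiency with which the model extracts useful information from short texts. Experimental results show that the KAe RCNN model significantly outperforms traditional machine learning algorithms in classification accuracy, F1 score, and practical application. The performance and adaptability of the algorithm were verified: accuracy reached 95.54% and the F1 score reached 0.901; compared with four traditional machine learning algorithms, accuracy improved by about 14% on average and the F1 score by about 13%. Compared with TextRCNN, the KAe RCNN model improved accuracy by about 3%...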

9.
In collaborative crowdsourcing communities for open innovation, users generate and submit ideas as idea co-creators. Firms then select and implement valuable ideas for new product development. Despite the popularity and success of these open innovation communities, relatively little is known about the factors that determine the implementation of the user-generated ideas. Based on research on individual creativity, we propose a conceptual model integrating users' previous experience, idea presentation characteristics and feedback valence to explain the likelihood of idea implementation. We validate our research model with a panel data analysis of 43 550 ideas submitted by 16 360 users in the MIUI new product development community hosted by Xiaomi, a large electronics manufacturing company in China. We find an inverted U-shaped relationship between users' past successful experience and idea implementation. Furthermore, the length of ideas is positively associated with the likelihood of idea implementation. There is also an inverted U-shaped relationship between supporting evidence and idea implementation. Finally, we demonstrate the negative effect of positive feedback and the positive effect of negative feedback on idea implementation. These findings offer rich insights to understand the phenomenon of open innovation better. Theoretical and practical implications are discussed.

10.
Detecting similarity between texts is a frequently encountered text mining task. Because the measurement of similarity is typically composed of a number of metrics, and some measures are sensitive to subjective interpretation, a generic detector obtained using machine learning often has difficulties balancing the roles of different metrics according to the semantic context exhibited in a specific collection of texts. In order to facilitate human interaction in a visual analytics process for text similarity detection, we first map the problem of pairwise sequence comparison to that of image processing, allowing patterns of similarity to be visualized as a 2D pixelmap. We then devise a visual interface to enable users to construct and experiment with different detectors using primitive metrics, in a way similar to constructing an image processing pipeline. We deployed this new approach for the identification of commonplaces in 18th-century literary and print culture. Domain experts were then able to make use of the prototype system to derive new scholarly discoveries and generate new hypotheses.
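
The idea of turning pairwise sequence comparison into a 2D pixelmap can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's system: the function names `similarity_pixelmap` and `render`, the exact-match default metric, and the 0.5 display threshold are all hypothetical.

```python
def similarity_pixelmap(seq_a, seq_b, metric=None):
    """Return a grid where cell [i][j] scores seq_a[i] against seq_b[j].

    `metric` is any function of two tokens returning a number; the
    default (assumed here) is exact token match.
    """
    if metric is None:
        metric = lambda x, y: 1.0 if x == y else 0.0
    return [[metric(x, y) for y in seq_b] for x in seq_a]


def render(pixelmap, threshold=0.5):
    """Draw the grid as text: '#' for similar cells, '.' otherwise."""
    return "\n".join(
        "".join("#" if value >= threshold else "." for value in row)
        for row in pixelmap
    )


first = "to be or not".split()
second = "not to be".split()
picture = render(similarity_pixelmap(first, second))
```

Shared passages show up as diagonal runs of `#`, which is the kind of pattern a reader can spot visually. Swapping in a different `metric` corresponds loosely to experimenting with different primitive metrics.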

11.
Online active multi-field learning for efficient email spam filtering (total citations: 1; self-citations: 0; citations by others: 1)
Email spam causes a serious waste of time and resources. This paper addresses the email spam filtering problem and proposes an online active multi-field learning approach, which is based on the following ideas: (1) email spam filtering is an online application, which suggests an online learning idea; (2) an email document has a multi-field text structure, which suggests a multi-field learning idea; and (3) it is costly to obtain a label for a real-world email spam filter, which suggests an active learning idea. The online learner regards email spam filtering as an incremental supervised binary streaming text classification task. The multi-field learner combines multiple results predicted by field classifiers in a novel compound weight schema, and each field classifier calculates the arithmetical average of multiple conditional probabilities calculated from feature strings according to a data structure of string-frequency index. Comparing the current variance of field classifying results with the historical variance, the active learner evaluates the classifying confidence and takes the more uncertain email as the more informative sample for which to request a label. The experimental results show that the proposed approach can achieve state-of-the-art performance with greatly reduced label requirements and very low space-time costs. The performance of our online active multi-field learning, on the standard (1-ROCA)% measurement, even exceeds the full feedback performance of some advanced individual text classification algorithms.
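
The variance comparison described above might look roughly like the sketch below. It rests on several assumptions that the abstract does not confirm: that "historical variance" means the running mean of past variances, that a higher current variance signals lower confidence, and that each field classifier emits a single spam probability. The class name `VarianceActiveLearner` is hypothetical.

```python
from statistics import pvariance


class VarianceActiveLearner:
    """Ask for a label when field classifiers disagree more than usual."""

    def __init__(self):
        self.variance_sum = 0.0  # sum of variances seen so far
        self.seen = 0            # number of emails seen so far

    def should_query(self, field_scores):
        """Return True if this email looks uncertain enough to label.

        `field_scores` holds one spam probability per email field
        (for example header, subject, body); assumed input format.
        """
        current = pvariance(field_scores)
        historical = self.variance_sum / self.seen if self.seen else 0.0
        # Update the history after computing the comparison baseline.
        self.variance_sum += current
        self.seen += 1
        return current > historical


learner = VarianceActiveLearner()
decisions = [
    learner.should_query([0.9, 0.9, 0.9]),   # fields agree
    learner.should_query([0.1, 0.9, 0.5]),   # fields disagree
    learner.should_query([0.8, 0.85, 0.9]),  # fields mostly agree
]
```

Only the email on which the field classifiers disagree triggers a label request here, which is the label-saving behaviour the abstract attributes to the active learner.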

12.
Label Propagation through Linear Neighborhoods (total citations: 8; self-citations: 0; citations by others: 8)
In many practical data mining applications such as text classification, unlabeled training examples are readily available, but labeled ones are fairly expensive to obtain. Therefore, semi-supervised learning algorithms have attracted considerable interest from the data mining and machine learning fields. In recent years, graph-based semi-supervised learning has become one of the most active research areas in the semi-supervised learning community. In this paper, a novel graph-based semi-supervised learning approach is proposed based on a linear neighborhood model, which assumes that each data point can be linearly reconstructed from its neighborhood. Our algorithm, named linear neighborhood propagation (LNP), can propagate the labels from the labeled points to the whole data set using these linear neighborhoods with sufficient smoothness. A theoretical analysis of the properties of LNP is presented in this paper. Furthermore, we also derive an easy way to extend LNP to out-of-sample data. Promising experimental results are presented for synthetic data, digit, and text classification tasks.
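
To make the propagation idea concrete, here is a simplified sketch on one-dimensional points. It is not the LNP algorithm itself: the abstract says each point is linearly reconstructed from its neighborhood, whereas this sketch substitutes normalized inverse-distance weights, and the update rule (a weighted blend of neighbor scores and the initial labels, controlled by `alpha`) is a generic label-propagation iteration assumed for illustration. All names and parameter values are hypothetical.

```python
def neighbor_weights(points, k=2):
    """Weight each point's k nearest neighbors; weights sum to 1.

    Stand-in for reconstruction weights: inverse distance, assuming
    all points are distinct.
    """
    weights = []
    for i, p in enumerate(points):
        nearest = sorted(
            (abs(p - q), j) for j, q in enumerate(points) if j != i
        )[:k]
        inverse = [(j, 1.0 / distance) for distance, j in nearest]
        total = sum(w for _, w in inverse)
        weights.append({j: w / total for j, w in inverse})
    return weights


def propagate(points, labels, k=2, alpha=0.9, iterations=200):
    """Spread labels (+1, -1, or 0 for unlabeled) along neighbor links."""
    weights = neighbor_weights(points, k)
    scores = list(labels)
    for _ in range(iterations):
        # Synchronous update: every new score uses the previous scores.
        scores = [
            alpha * sum(w * scores[j] for j, w in weights[i].items())
            + (1 - alpha) * labels[i]
            for i in range(len(points))
        ]
    return scores


# Two well-separated groups with one labeled point each.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
labels = [1.0, 0.0, 0.0, 0.0, 0.0, -1.0]
scores = propagate(points, labels)
predicted = [1 if s > 0 else -1 for s in scores]
```

The unlabeled points take the sign of the labeled point in their own group, because with `k=2` no neighbor link crosses the gap between the groups.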

13.
This study investigates adult English language learners' reading-strategy use when they read online texts in hypermedia learning environments. The learners joined the online Independent English Study Group (IESG) and worked both individually and collaboratively. This qualitative case study aims (a) to assess college-level ESL learners' use of reading strategies for online second language (L2) texts and (b) to examine their use of hypertext and hypermedia resources while they read online L2 text. The seven strategies identified were (a) using hypermedia, (b) using computer applications and accessories, (c) dialoguing, (d) setting up reading purposes and planning, (e) previewing and determining what to read, (f) connecting prior knowledge and experiences with texts and tasks, and (g) inferring. The first two strategies were unique to online reading; the five remaining strategies apply to both online and paper-based reading. The findings also revealed that "hybrid" online reading emphasized participants' various reaction patterns and preferences in their hypermedia learning environments.

14.
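The rise of live streaming has made live bullet comments (danmaku) a new form of communication, and with them have come various kinds of illegal bullet comments. For identifying illegal bullet comments, manual screening is too inefficient, while traditional keyword filtering and statistical machine learning methods have low recognition rates and cannot cope with mutated short texts. How to make machines identify illegal bullet comments more efficiently and accurately, so as to foster a better online environment, is a meaningful problem. A method for recognizing noisy illegal short texts based on a text convolutional neural network (TextCNN) is proposed. By preprocessing the noisy short texts and using the text convolutional neural network to mine relevant features between characters, the recognition rate for illegal short texts in live bullet comments is greatly improved.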

15.
Firms striving to maintain high rates of innovation need a continuous flow of new ideas. As a result, large firms are establishing IT platforms to generate ideas for innovation and to encourage employees and customers to participate in innovation contests. However, there has been little published research on the use of IT platforms for idea generation by employees, and it is unclear whether they facilitate in-house innovation. The purpose of this paper is to investigate how firms use IT platforms internally to generate ideas, and how their use contributes to the innovation process in large firms. We rely on data from two collaborative research projects in the automotive industry: Volvo Cars and Renault. We found that both firms used IT platforms as campaigns to promote innovation and to involve employees in the innovation process. The findings suggest that these virtual idea campaigns support innovation in large firms mainly by (1) encouraging employee creativity in idea generation and (2) involving employees and top managers simultaneously in the innovation process. This paper contributes to idea management systems theory by highlighting the importance of virtual idea campaigns for the firm's innovation process, and their dual role.

16.
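Artificial intelligence and deep learning technologies lay the foundation for accurately identifying patients with depression in online health communities. First, a depression sentiment classification model based on TCNN-GRU deep learning is constructed; after the experimental dataset from an online health community is annotated for depression sentiment classification, the TCNN-GRU model is used to determine users' depressive tendencies. On this basis, the concept of a depression index is further proposed, and by deeply mining the relationship between the depression index and the severity of patients' depression, a deep-learning-based user profiling model for depression in online health communities is established. Experimental results show that, compared with traditional convolutional neural network models, recurrent neural network models, and hybrid models, the TCNN-GRU model achieves better results in depression sentiment classification, and the deep-learning-based user profiling model can also accurately identify users' depressive sentiment and depressive state from the perspective of text analysis.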

17.
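Monitoring public opinion about corporate brands and identifying sensitive and key online communities are currently central tasks in corporate public-opinion monitoring. As subsets of the online society, different online communities (closely connected groups in social media) differ in network structure and in the sentiment tendencies of their members, so negative corporate news spreads through them with different characteristics. From the perspective of online communities, this work studies the impact of negative corporate news under different community sentiment tendencies and community network structures, and then proposes a prediction model for the spread of negative public opinion in communities based on text mining and sentiment analysis. Community members' sentiment tendency is measured following the Profile of Mood States (POMS), a psychometric instrument, and time windows are delimited by events. Text mining algorithms are applied to web data crawled over six consecutive [...] to analyze the distribution of community members' six moods (anger, tension, disappointment, etc.) within each event window. Clustering on mood distribution and network structure identifies online communities with different categories of sentiment tendency, and on this basis a model for predicting community sentiment tendency and public-opinion spread is built. Test results show that the model achieves high accuracy in identifying the sentiment tendencies of online communities and in predicting the tendency of public opinion to spread, and offers some guidance for monitoring the spread of public opinion and identifying sensitive and key communities.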

18.
We report two studies investigating readers' ability to allocate limited time adaptively across online texts of varying difficulty. In both studies participants were asked to learn about the human heart and were free to allocate time to 4 separate online texts about the heart but did not have enough time to read them all thoroughly. Of particular interest was whether readers attempted to select the best text for them (by sampling the texts before reading) or to monitor texts while reading them and continue reading any text judged good enough (a satisficing strategy). We argue that both strategies can be considered adaptive, depending on properties of readers, texts, and tasks. Experiment 1 tested readers with a range of background knowledge and allowed them either 7 or 15 min study time. It showed that participants were adaptive in how they allocated their time in that more knowledgeable readers spent more time reading more difficult texts. Satisficing was a much more common strategy than sampling. Experiment 2 showed that providing outline overviews of each text dramatically increased the number of participants using a sampling strategy so that it became the modal strategy. However, this change in strategy had no effect on learning. Outline overviews presumably changed readers' perception of the ease with which relevant dimensions of text quality can be judged.

19.
Online learning has grown exponentially in recent years; however, the dropout problem remains challenging for some online programmes. The dropout problem can be attributed to a number of reasons, with a lack of interaction between learners and the instructor constituting one of the main ones. The lack of interaction also leads to learners' feeling of isolation. Learning communities can provide learners with an environment conducive to increased interactions and alleviate their feeling of isolation. Unfortunately, there are no clear rules that instructors can follow to help learners create learning communities. In this paper, we propose guidelines for online instructors to facilitate the development of learning communities in online courses. We first review the definition of a learning community, the importance of a learning community and the factors affecting the development of a learning community. Afterwards, based on a review of the existing guidelines and other relevant literature, we propose guidelines for facilitating the development of learning communities in online courses.

20.
A concept-based visual representation mechanism for Chinese text (total citations: 1; self-citations: 0; citations by others: 1)
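To support browsing the growing volume of online Chinese text on the Internet, this paper presents a concept-based visual representation mechanism for Chinese text that organizes and presents texts and text collections in an intuitive way. The basic idea is as follows: first, texts are classified on the basis of concept expansion; then, the text feature extraction and summarization methods proposed in this paper are used to obtain label information for the category, the text, and the text body, so that users can browse texts selectively by category and by text.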
