Similar Documents
20 similar documents found (search time: 15 ms)
1.
In this paper we address extractive summarization of long threads in online discussion fora. We present an elaborate user evaluation study to determine human preferences in forum summarization and to create a reference data set. We showed long threads to ten different raters and asked them to create a summary by selecting the posts that they considered to be the most important for the thread. We study the agreement between human raters on the summarization task, and we show how multiple reference summaries can be combined to develop a successful model for automatic summarization. We found that although the inter-rater agreement for the summarization task was slight to fair, the automatic summarizer obtained reasonable results in terms of precision, recall, and ROUGE. Moreover, when human raters were asked to choose between the summary created by another human and the summary created by our model in a blind side-by-side comparison, they judged the model’s summary equal to or better than the human summary in over half of the cases. This shows that even for a summarization task with low inter-rater agreement, a model can be trained that generates sensible summaries. In addition, we investigated the potential for personalized summarization. However, the results for the three raters involved in this experiment were inconclusive. We release the reference summaries as a publicly available dataset.
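The paper's model itself is not reproduced in this abstract. As a hedged sketch of the underlying idea of combining multiple reference summaries — score each post by how many raters selected it and keep the top-k — the following Python fragment uses illustrative names throughout:

```python
from collections import Counter

def combine_reference_summaries(rater_selections, k):
    """Combine several raters' post selections into one extractive summary.

    rater_selections: list of sets, each holding the post IDs one rater
    considered most important for the thread.
    k: number of posts to keep in the combined summary.
    """
    votes = Counter()
    for selection in rater_selections:
        votes.update(selection)
    # Posts chosen by more raters are treated as more important.
    ranked = sorted(votes, key=lambda post: votes[post], reverse=True)
    return ranked[:k]

# Three hypothetical raters; posts are identified by integers.
raters = [{1, 3, 5}, {1, 2, 5}, {1, 5, 8}]
print(combine_reference_summaries(raters, 3))  # posts 1 and 5 rank first
```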

2.
As course management systems (CMS) gain popularity in facilitating teaching, forums have become a key component for supporting interactions among students and teachers. Content analysis is the most popular way to study a discussion forum, but it is a labor-intensive process: the coding relies heavily on manual interpretation and is time- and energy-consuming. In an asynchronous virtual learning environment, an instructor needs to monitor the discussion forum continually in order to maintain its quality. However, fulfilling this need is time-consuming and difficult for instructors, especially K-12 teachers. This research proposes a genre classification system, called GCS, to facilitate the automatic coding process. We treat the coding process as a document classification task via modern data mining techniques. The genre of a posting can be perceived as an announcement, a question, a clarification, an interpretation, a conflict, an assertion, etc. This research examines the coding coherence between GCS and experts' judgment in terms of recall and precision, and discusses how we adjust the parameters of GCS to improve the coherence. Based on the empirical results, GCS adopts a cascade classification model to achieve the automatic coding process. An empirical evaluation of the classified genres, drawn from a repository of postings in an online earth science course at a senior high school, shows that GCS can effectively facilitate the coding process and that the proposed cascade model can deal with the imbalanced distribution of discussion postings. These results imply that GCS, based on the cascade model, can serve as an automatic posting coding system.
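The abstract does not spell out the cascade model. One common cascade design for imbalanced multi-class problems — a chain of binary classifiers, each committing to one genre and passing rejected postings onward — could be sketched as below; the class and the scikit-learn-style predict interface are our assumptions, not the paper's API:

```python
class CascadeClassifier:
    """Minimal cascade sketch for posting-genre coding.

    stages: ordered list of (genre, binary_classifier) pairs, e.g. from
    the most to the least frequent genre, so majority genres are
    filtered out early -- one way to cope with imbalanced postings.
    Each binary_classifier is assumed to expose a scikit-learn-style
    predict() returning 1 when the posting belongs to its genre.
    """

    def __init__(self, stages, default_genre):
        self.stages = stages
        self.default_genre = default_genre

    def predict(self, posting_features):
        # A posting is labeled by the first stage that accepts it;
        # postings rejected everywhere fall through to a default genre.
        for genre, classifier in self.stages:
            if classifier.predict([posting_features])[0] == 1:
                return genre
        return self.default_genre
```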

3.
Likelihood-based marginal regression modelling for repeated, or otherwise clustered, categorical responses is computationally demanding. This is because the number of measures needed to describe the associations within a cluster increases geometrically with the cluster size. The proposed estimation methods typically describe the associations using odds ratios, which result in computationally infeasible solutions for large cluster sizes. An alternative method for joint modelling of the regression, association, and dropout mechanism for clustered categorical responses is presented. The joint distribution of a multivariate categorical response is described using the mean parameterization, which facilitates maximum likelihood estimation in two important respects. The models are illustrated by analyses of the presence and absence of schizophrenia symptoms in 86 patients at 12 repeated time-points, and of a survey of the opinions of 607 adults regarding government spending on nine different targets, measured on a common 3-level ordinal scale. Free software is available.
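The mean parameterization is not defined in the abstract. For orientation only, in the simplest bivariate binary case one common form parameterizes the joint distribution by marginal and joint success probabilities rather than by odds ratios (our notation, not the paper's):

```latex
\mu_1 = P(Y_1 = 1), \qquad
\mu_2 = P(Y_2 = 1), \qquad
\mu_{12} = P(Y_1 = 1,\, Y_2 = 1)
```

The four cell probabilities are then recovered linearly, e.g. $P(Y_1=1, Y_2=0) = \mu_1 - \mu_{12}$, which is one reason a mean parameterization can ease maximum likelihood computation compared with odds-ratio parameterizations.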

4.
A framework for virtual disassembly analysis
Product reuse or recyclability is enhanced by designing the product for inexpensive and efficient disassembly. Accomplishing such a design, however, requires design for disassembly (DFD) tools. This paper presents a disassembly framework consisting of design modules, embodied in a geometric DFD tool. The modules cover different tasks, including selecting the appropriate disassembly method, producing an optimized disassembly sequence, evaluating a disassembly sequence for cost, and producing design-change recommendations. These considerations make a product easier to disassemble and therefore have potential benefit to the environment.

5.
This paper addresses in an integrated and systematic fashion the relatively overlooked but increasingly important issue of measuring and characterizing the geometrical properties of nerve cells and structures, an area often called neuromorphology. After discussing the main motivation for such an endeavour, a comprehensive mathematical framework for characterizing neural shapes, capable of expressing variations over time, is presented and used to underline the main issues in neuromorphology. Three particularly powerful and versatile families of neuromorphological approaches, including differential measures, symmetry axes/skeletons, and complexity, are presented and their respective potentials for applications in neuroscience are identified. Examples of applications of such measures are provided based on experimental investigations related to automated dendrogram extraction, mental retardation characterization, and axon growth analysis.

6.
The open-source Java software framework JStatCom is presented, which supports the development of rich desktop clients for data analysis in a rather general way. The concept is to solve all recurring tasks with the help of reusable components and to enable rapid application development by adopting a standards-based approach that is readily supported by existing programming tools. Furthermore, JStatCom allows external procedures written in other languages, for example Gauss, Ox, or Matlab, to be called from within Java. This makes it possible to reuse an existing code base of numerical routines written in domain-specific programming languages and to link it with the Java world. A reference application for JStatCom is the econometric software package JMulTi, which is briefly introduced.

7.
Location analysis decisions are interrelated and should be made within a single decision-making framework. A framework within which a number of location strategies can be placed is presented. Location-allocation models are improved in two ways: 1) the allocation rule is developed to more accurately reflect customer choice processes; and 2) the objective function is developed to incorporate future changes. Computational support for this framework is described.

8.
9.
A framework for fast text analysis, which is developed as a part of the Texterra project, is described. Texterra provides a scalable solution for the fast text processing on the basis of novel methods that exploit knowledge extracted from the Web and text documents. For the developed tools, details of the project, use cases, and evaluation results are presented.

10.
It is shown how to express data flow analysis in a denotational framework by means of abstract interpretation. A continuation-style formulation naturally leads to the MOP (Meet Over all Paths) solution, whereas a direct-style formulation leads to the MFP (Maximal Fixed Point) solution.
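The denotational formulations themselves are not given in the abstract. As a reminder of what the MFP solution computes, here is a minimal sketch of the classic iterative fixed-point algorithm for a forward data flow problem; the dictionary-based CFG encoding and the "entry" node name are our illustrative choices:

```python
from functools import reduce

def mfp(preds, transfer, meet, top, entry_fact):
    """Maximal Fixed Point solution of a forward data flow problem.

    preds: dict mapping each CFG node to the list of its predecessors;
    the entry node is the one with no predecessors.
    transfer: dict of monotone transfer functions, one per node.
    meet: binary meet of the lattice (e.g. set intersection).
    top: top element used to initialise every node's fact.
    entry_fact: fact assumed to hold at the entry node.
    """
    facts = {node: top for node in preds}
    facts["entry"] = entry_fact
    changed = True
    while changed:  # iterate until a fixed point is reached
        changed = False
        for node, ps in preds.items():
            if not ps:
                continue  # the entry node keeps its given fact
            incoming = reduce(meet, (transfer[p](facts[p]) for p in ps))
            if incoming != facts[node]:
                facts[node] = incoming
                changed = True
    return facts
```

The MOP solution instead meets the effects of all paths from the entry to each node; the two coincide for distributive frameworks, and MFP safely approximates MOP otherwise.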

11.
A structurally motivated framework for discriminant analysis
Over the last few years, many algorithms for discriminant analysis (DA) have been developed. Although they have different motivations, they all inject structure information from the data into their within- and between-class scatters. However, to the best of our knowledge, there has not yet been a systematic examination of (1) which structure granularities lurk in the data; (2) which structure granularities a given DA algorithm's scatters utilize; and (3) whether new DA algorithms can be developed based on existing structure granularities. In this paper, the proposed structurally motivated (SM) framework for DA, together with its unified mathematical formulation of the ratio trace, answers exactly these questions. It categorizes DA algorithms from the viewpoint of constructing scatters based on structures of different granularity in the data, identifies their applicable scenarios for different structure types, and provides insights into developing new DA algorithms. Inspired by these insights, we find that the cluster granularity, lying in the middle of the granularity spectrum in the SM framework, can be exploited further. As a result, three DA algorithms based on the cluster granularity are derived from the SM framework by injecting cluster structure information into the within-class, the between-class, and both scatter matrices of classical MDA; the corresponding algorithms are called SWDA, SBDA, and SWBDA, respectively. The injected cluster structure information enables the three proposed algorithms not only to fit relatively complicated data more effectively, but also, with a regularization technique, to obtain more projections than classical MDA, which is very helpful for effective DA. Moreover, MDA becomes a special case of each of them when the number of clusters in every class is set to 1. Our experiments on benchmarks (face and UCI databases) show that the proposed algorithms yield encouraging results.
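The scatter definitions are not reproduced in the abstract. As an orientation only (our notation, not the paper's), injecting cluster granularity into a within-class scatter can take the following form, where class $c$ is partitioned into $k_c$ clusters $\mathcal{X}_{c1},\dots,\mathcal{X}_{ck_c}$ with cluster means $\boldsymbol{\mu}_{cj}$:

```latex
S_w^{\mathrm{cluster}}
  = \sum_{c=1}^{C} \sum_{j=1}^{k_c} \sum_{\mathbf{x} \in \mathcal{X}_{cj}}
    (\mathbf{x}-\boldsymbol{\mu}_{cj})(\mathbf{x}-\boldsymbol{\mu}_{cj})^{\mathsf{T}}
```

Setting $k_c = 1$ for every class collapses each cluster mean to the class mean and recovers the classical within-class scatter, consistent with MDA appearing as the special case mentioned above.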

12.
A temporal analysis framework for event reconstruction
This paper proposes a new temporal analysis framework for Windows system forensics. The framework improves the traditional method of extracting time information in computer forensics, introduces a two-step analysis at coarse and fine granularity, and adds clustering algorithms and heuristic rules to the traditional manual analysis, ultimately making event reconstruction analysis possible. The paper first introduces the overall structure of the framework, then describes the improved time extraction method, then presents the coarse- and fine-grained clustering analysis modules and the rule analysis module, and finally summarizes the strengths and weaknesses of the framework.
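The abstract gives no algorithmic detail. As an illustration of what a coarse-granularity clustering pass over forensic timestamps might look like, here is a sketch that groups events into bursts of activity separated by idle gaps; the gap threshold and the single-pass scheme are our assumptions, not the paper's method:

```python
def cluster_timestamps(events, gap_seconds=300):
    """Group forensic events into coarse activity bursts.

    events: iterable of (timestamp, description) pairs, timestamps in
    POSIX seconds. Whenever two consecutive events are more than
    gap_seconds apart, a new cluster is started. A finer-grained pass
    or heuristic rules could then be applied inside each cluster.
    """
    clusters, current = [], []
    for ts, desc in sorted(events):
        if current and ts - current[-1][0] > gap_seconds:
            clusters.append(current)
            current = []
        current.append((ts, desc))
    if current:
        clusters.append(current)
    return clusters
```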

13.
Static analysis tools, such as resource analyzers, give useful information on software systems, especially in real-time and safety-critical applications. The question of the reliability of the obtained results is therefore highly important. State-of-the-art static analyzers typically combine a range of complex techniques, make use of external tools, and evolve quickly, so formally verifying such systems is not a realistic option. In this work, we propose a different approach whereby, instead of the tools, we formally verify the results of the tools. The central idea of such a formal verification framework for static analysis is the method-wise translation of the information gathered about a program during its static analysis into specification contracts that contain enough information to be verified automatically. We instantiate this framework with costa, a state-of-the-art static analysis system for sequential Java programs that produces resource guarantees, and KeY, a state-of-the-art verification tool, for formally verifying the correctness of such resource guarantees. Resource guarantees allow one to be certain that programs will run within the indicated amount of resources, which may refer to memory consumption, the number of instructions executed, etc. Our results show that the proposed tool cooperation can be used to automatically produce verified resource guarantees.

14.
15.
Various machine learning techniques have been applied to problems in survival analysis over the last decade. They were usually adapted to learning from censored survival data by using the information on observation time; this included learning from only parts of the data or intervening in the learning algorithms. Efficient models have been established in various fields of clinical medicine and bioinformatics. In this paper, we propose a pre-processing method for adapting censored survival data to be used with ordinary machine learning algorithms. This is done by pre-assigning censored instances a positive or negative outcome according to their features and observation time. The proposed procedure calculates the goodness of fit of each censored instance to both the distribution of positives and the spoiled distribution of negatives in the entire dataset and relabels that instance accordingly. We performed thorough empirical testing of our method in a simulation study and on two real-world medical datasets, using the naive Bayes classifier and decision trees. When compared to one of the popular machine learning methods dealing with survival data, our method provided good results, especially when applied to heavily censored data.
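The relabeling procedure is only outlined above. A hedged sketch of the core step — scoring each censored instance against a model of the positives and a model of the negatives, then assigning the better-fitting label — might look as follows; the density models with a scikit-learn-style score_samples() and the hard argmax decision are our stand-ins for the paper's goodness-of-fit calculation:

```python
import numpy as np

def relabel_censored(X, y, censored_mask, pos_model, neg_model):
    """Pre-assign outcomes to censored survival instances.

    X: feature matrix (n_samples, n_features).
    y: 0/1 outcome labels; entries under censored_mask are placeholders.
    censored_mask: boolean array marking censored instances.
    pos_model, neg_model: density models fitted on uncensored positives
    and negatives respectively, each exposing score_samples() that
    returns per-row log-likelihoods (as in sklearn's KernelDensity).
    """
    y = y.copy()
    idx = np.where(censored_mask)[0]
    log_lik_pos = pos_model.score_samples(X[idx])
    log_lik_neg = neg_model.score_samples(X[idx])
    # Relabel each censored instance with the better-fitting outcome.
    y[idx] = (log_lik_pos > log_lik_neg).astype(int)
    return y
```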

16.
Detection and recognition of textual information in an image or video sequence is important for many applications. The increased resolution and capabilities of digital cameras and faster mobile processing allow for the development of interesting systems. We present an application based on the capture of information presented at a slide-show presentation or at a poster session. We describe the development of a system to process the textual and graphical information in such presentations. The application integrates video and image processing, document layout understanding, optical character recognition (OCR), and pattern recognition. The digital imaging device captures slides/poster images, and the computing module preprocesses and annotates the content. Various problems related to metric rectification, key-frame extraction, text detection, enhancement, and system integration are addressed. The results are promising for applications such as a mobile text reader for the visually impaired. By using powerful text-processing algorithms, we can extend this framework to other applications, e.g., document and conference archiving, camera-based semantics extraction, and ontology creation.

17.
A framework for analysis of data quality research
Organizational databases are pervaded with data of poor quality. However, there has been no analysis of the data quality literature that provides an overall understanding of state-of-the-art research in this area. Using an analogy between product manufacturing and data manufacturing, this paper develops a framework for analyzing data quality research and uses it as the basis for organizing the data quality literature. The framework consists of seven elements: management responsibilities, operation and assurance costs, research and development, production, distribution, personnel management, and legal function. The analysis reveals that most research efforts focus on operation and assurance costs, research and development, and production of data products. Unexplored research topics and unresolved issues are identified, and directions for future research are provided.

18.
19.
In passive RFID Dense Reader Environments, a large number of passive RFID readers coexist in a single facility. Dense environments are particularly susceptible to reader-to-tag and reader-to-reader collisions, both of which may degrade system performance by decreasing the number of tags identified per time unit. Some proposals have been suggested to avoid or handle these collisions, but they require extra hardware or make inefficient use of network resources. This paper proposes MALICO, a distributed protocol that exploits a maximum-likelihood estimator to improve the performance of the well-known Colorwave protocol. Using a derivation of the joint occupancy distribution of urns and balls via a bivariate inclusion-exclusion formula, MALICO permits every reader to estimate the number of neighboring readers (potential colliding readers). This information helps readers schedule their identification time with the aim of decreasing the collision probability among neighboring readers. MALICO provides higher throughput than the distributed state-of-the-art proposals for dense reader environments and can be implemented in real RFID systems without extra hardware.
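MALICO's bivariate formulation is not reproduced in the abstract. To illustrate the general urns-and-balls idea — estimating how many neighbors ("balls") are active from how many time slots ("urns") are observed occupied — here is a univariate sketch built on the classical occupancy distribution written via inclusion-exclusion; the uniform slot-choice model and the search bound are our simplifications, not the paper's estimator:

```python
from math import comb

def occupancy_pmf(k, m, n):
    """P(exactly k of n slots occupied) when m readers each pick one of
    n slots uniformly at random -- the classical occupancy distribution,
    expressed via inclusion-exclusion."""
    return comb(n, k) * sum(
        (-1) ** j * comb(k, j) * ((k - j) / n) ** m for j in range(k + 1)
    )

def ml_neighbors(k_observed, n, m_max=64):
    """Maximum-likelihood estimate of the number of neighboring readers,
    given that k_observed of n slots were seen occupied. At least
    k_observed readers are needed to occupy k_observed slots."""
    return max(range(max(k_observed, 1), m_max + 1),
               key=lambda m: occupancy_pmf(k_observed, m, n))
```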

20.