Similar Literature
20 similar documents found.
1.
Question-answering systems make good use of knowledge bases (KBs, e.g., Wikipedia) for responding to definition queries. Typically, systems extract relevant facts about the question topic from articles across KBs, and these facts are then projected onto the candidate answers. However, studies have shown that the performance of this kind of method drops sharply whenever KB coverage is narrow. This work describes a new approach to this problem: constructing context models for scoring candidate answers, specifically, statistical n-gram language models inferred from lexicalized dependency paths extracted from Wikipedia abstracts. Unlike state-of-the-art approaches, context models are created by capturing the semantics of candidate answers (e.g., “novel,” “singer,” “coach,” and “city”). This work is extended by investigating the impact on context models of extra linguistic knowledge such as part-of-speech tagging and named-entity recognition. Results showed that context models built from n-gram lexicalized dependency paths are effective, and that such paths are promising indicators of the presence of definitions in natural-language text.
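As an illustration of the scoring step, the following minimal sketch trains an n-gram context model and uses it to score a candidate answer's context. All names and training data are hypothetical, and the sketch uses raw token bigrams with add-k smoothing, whereas the paper's models are inferred from lexicalized dependency paths.

import math
from collections import Counter

def train_bigram_lm(sentences, k=0.5):
    # Train an add-k smoothed bigram language model from tokenized sentences.
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        unigrams.update(tokens[:-1])            # histories only
        bigrams.update(zip(tokens, tokens[1:]))
    vocab_size = len(unigrams) + 1              # +1 for </s> / unseen words

    def log_prob(sent):
        tokens = ["<s>"] + sent + ["</s>"]
        lp = 0.0
        for u, w in zip(tokens, tokens[1:]):
            # Add-k smoothing keeps unseen bigrams from zeroing the score.
            lp += math.log((bigrams[(u, w)] + k) / (unigrams[u] + k * vocab_size))
        return lp
    return log_prob

# Hypothetical context model for the semantic class "city", trained on
# definitional contexts mined from KB abstracts.
city_model = train_bigram_lm([
    "X is a city in country Y".split(),
    "X is the capital city of Y".split(),
])
# Higher scores suggest the candidate's context looks definition-like.
print(city_model("X is a city in Y".split()))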

2.
Text is composed of words and phrases. In the bag-of-words model, phrases in text are split into words. This may discard the semantics of phrases, which, in turn, may give an inconsistent relatedness score between two texts. Our objective is to apply phrase relatedness in conjunction with word relatedness on the text relatedness task, to improve text relatedness performance. We adopt two existing word relatedness measures, based on Google n-gram and Global Vectors for Word Representation (GloVe), respectively, and incorporate each differently with an existing Google n-gram-based phrase relatedness method to compute text relatedness. The combination of Google n-gram-based word and phrase relatedness performs better than Google n-gram-based word relatedness alone, achieving a higher weighted mean of Pearson's r (0.639 vs. 0.619) on the 14 data sets from the series of Semantic Evaluation workshops SemEval-2012, SemEval-2013, and SemEval-2015. Similarly, the combination of GloVe-based word relatedness and Google n-gram-based phrase relatedness performs better than GloVe-based word relatedness alone (0.619 vs. 0.605) on the same 14 data sets. On the SemEval-2012, SemEval-2013, and SemEval-2015 data sets, the text relatedness results obtained from the combination of Google n-gram-based word and phrase relatedness ranked 24th, 3rd, and 31st out of 89, 90, and 73 text relatedness systems, respectively.
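The aggregate metric quoted above is just a size-weighted average of per-dataset Pearson correlations. The sketch below shows that computation together with the simplest possible linear blend of word-level and phrase-level scores; the mixing weight alpha is invented here, and the paper combines the two signals differently.

import math

def pearson(xs, ys):
    # Pearson correlation between system scores and gold scores.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def text_relatedness(word_sim, phrase_sim, alpha=0.5):
    # Hypothetical linear blend of a word-level and a phrase-level score.
    return alpha * word_sim + (1 - alpha) * phrase_sim

def weighted_mean_pearson(per_dataset):
    # per_dataset: (n_pairs, pearson_r) per evaluation set; this is the
    # aggregate figure quoted in the abstract.
    total = sum(n for n, _ in per_dataset)
    return sum(n * r for n, r in per_dataset) / total

# Toy numbers only, to show the aggregation:
print(weighted_mean_pearson([(750, 0.62), (500, 0.70), (250, 0.55)]))
print(text_relatedness(0.8, 0.6), pearson([1, 2, 3], [2, 4, 7]))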

3.
Information ordering is a nontrivial task in multi-document summarization (MDS), which typically relies on the traditional vector space model (VSM), notorious for its semantic deficiency. In this article, we propose a novel event-enriched VSM to alleviate the problem by building event semantics into sentence representations. The mediation of event information between sentence and term, especially in the news domain, has an intuitive appeal as well as a technical advantage in common sentence-level operations such as sentence similarity computation. Inspired by the block-style writing of humans, we base the sentence ordering algorithm on sentence clustering. To accommodate the complexity introduced by event information, we adopt a soft-to-hard clustering strategy on the event and sentence levels, using expectation–maximization clustering and K-means, respectively. For the purpose of cluster-based sentence ordering, the event-enriched VSM enables us to design an ordering algorithm that enhances event coherence computed between sentence and sentence–context pairs. Drawing on the findings of earlier research, we also incorporate topic continuity measures and time information into the scheme. We evaluate the performance of the model and its variants automatically and manually; experimental results show a clear advantage of the event-based model over baseline and non-event-based models in information ordering for multi-document news summarization. We are confident that the event-enriched VSM has even greater potential in summarization and beyond, which awaits further research. © 2014 Wiley Periodicals, Inc.
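A rough sketch of the soft-to-hard clustering strategy, using scikit-learn's EM-based Gaussian mixture on the event level and K-means on the sentence level. The vectors here are random stand-ins for the event-enriched VSM representations, and the cluster counts are arbitrary.

import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical event and sentence vectors; in the paper these come from
# the event-enriched VSM built over news documents.
event_vecs = rng.normal(size=(40, 8))
sent_vecs = rng.normal(size=(100, 8))

# Soft clustering on the event level: EM over a Gaussian mixture gives
# each event a distribution over clusters rather than a hard label.
gmm = GaussianMixture(n_components=4, random_state=0).fit(event_vecs)
event_posteriors = gmm.predict_proba(event_vecs)   # shape (40, 4)

# Hard clustering on the sentence level: K-means assigns each sentence
# to exactly one block, mimicking block-style human writing.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(sent_vecs)
sentence_blocks = kmeans.labels_                   # shape (100,)

print(event_posteriors[0].round(2), sentence_blocks[:10])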

4.
An empirical study of smoothing techniques for language modeling
We survey the most widely used algorithms for smoothing n-gram language models. We then present an extensive empirical comparison of several of these smoothing techniques, including those described by Jelinek and Mercer (1980); Katz (1987); Bell, Cleary and Witten (1990); Ney, Essen and Kneser (1994); and Kneser and Ney (1995). We investigate how factors such as training data size, training corpus (e.g., Brown vs. Wall Street Journal), count cutoffs, and n-gram order (bigram vs. trigram) affect the relative performance of these methods, which is measured through the cross-entropy of test data. We find that these factors can significantly affect the relative performance of models, with the most significant factor being training data size. Since no previous comparisons have examined these factors systematically, this is the first thorough characterization of the relative performance of various algorithms. In addition, we introduce methodologies for analyzing smoothing algorithm efficacy in detail, and using these techniques we motivate a novel variation of Kneser–Ney smoothing that consistently outperforms all other algorithms evaluated. Finally, results showing that improved language model smoothing leads to improved speech recognition performance are presented.
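For reference, here is a compact sketch of interpolated Kneser–Ney for bigrams, the family of estimators from which the paper's winning variant is derived; the paper's modification to the discounting is not reproduced here.

from collections import Counter, defaultdict

def kneser_ney_bigram(sentences, d=0.75):
    # Interpolated Kneser-Ney bigram model (textbook formulation).
    bigram = Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        bigram.update(zip(tokens, tokens[1:]))

    hist_count = Counter()            # c(u): how often u occurs as a history
    followers = defaultdict(set)      # distinct words seen after u
    predecessors = defaultdict(set)   # distinct histories seen before w
    for (u, w), c in bigram.items():
        hist_count[u] += c
        followers[u].add(w)
        predecessors[w].add(u)
    n_bigram_types = len(bigram)

    def prob(u, w):
        # Continuation probability: how many contexts w completes,
        # normalized by the total number of bigram types.
        p_cont = len(predecessors[w]) / n_bigram_types
        if hist_count[u] == 0:
            return p_cont             # unseen history: back off entirely
        discounted = max(bigram[(u, w)] - d, 0) / hist_count[u]
        lam = d * len(followers[u]) / hist_count[u]   # mass reserved for backoff
        return discounted + lam * p_cont
    return prob

p = kneser_ney_bigram([s.split() for s in
                       ["the cat sat", "the dog sat", "a cat ran"]])
print(p("the", "cat"), p("the", "ran"))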

5.
In this paper we consider the problem of finding a closed partition in a directed graph. This problem has applications in concurrent probabilistic program verification. The best sequential algorithm known for this problem runs in O(mn) time, where m is the number of directed edges and n is the number of vertices in the given digraph. In this paper we present a linear-time sequential algorithm to solve the closed partition problem for planar digraphs that are compact. We then build on this algorithm to obtain an O(n^1.5)-time sequential algorithm to solve the closed partition problem for a general planar digraph. This work was supported in part by NSF Grant CCR 89-10707.

6.
7.
Language modeling is the problem of predicting words based on histories containing words already hypothesized. Two key aspects of language modeling are effective history equivalence classification and robust probability estimation. The solution of both aspects is hindered by the data sparseness problem. Application of random forests (RFs) to language modeling deals with the two aspects simultaneously. We develop a new smoothing technique based on randomly grown decision trees (DTs) and apply the resulting RF language models to automatic speech recognition. This new method is complementary to many existing ones dealing with the data sparseness problem. We study our RF approach in the context of n-gram type language modeling, in which n − 1 words are present in a history. Unlike regular n-gram language models, RF language models have the potential to generalize well to unseen data, even when histories are longer than four words. We show that our RF language models are superior to the best known smoothing technique, interpolated Kneser–Ney smoothing, in reducing both the perplexity (PPL) and word error rate (WER) in large-vocabulary state-of-the-art speech recognition systems. In particular, we show statistically significant improvements in a contemporary conversational telephony speech recognition system by applying the RF approach to only one of its many language models.

8.
We introduce a probabilistic formalism handling both Markov random fields of bounded tree width and probabilistic context-free grammars. Our models are based on case-factor diagrams (CFDs) which are similar to binary decision diagrams (BDDs) but are more concise for circuits of bounded tree width. A probabilistic model consists of a CFD defining a feasible set of Boolean assignments and a weight (or cost) for each individual Boolean variable. We give versions of the inside–outside algorithm and the Viterbi algorithm for these models.

9.
A new reliability model, the consecutive 2-out-of-(r, r)-from-(n, n):F model, is proposed. The consecutive 2-out-of-(r, r)-from-(n, n):F system consists of a square grid of side n (containing n^2 components) such that the system fails if and only if there is at least one square of side r that contains at least two failed components. For the i.i.d. case, an algorithm is given for computing the reliability of the system. The reliability function can be expressed through the number of 0–1 matrices having no square of side r that contains two or more 0s.
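The failure condition is easy to check mechanically, which makes a Monte Carlo estimate a convenient sanity check on any exact algorithm. The sketch below uses 2-D prefix sums to test every r-by-r window; the paper's exact algorithm is not reproduced here.

import numpy as np

def system_fails(grid, r):
    # True iff some r-by-r square of the n-by-n grid contains
    # at least two failed (=1) components.
    n = grid.shape[0]
    # 2-D prefix sums give every r x r window sum in O(n^2).
    ps = np.zeros((n + 1, n + 1), dtype=int)
    ps[1:, 1:] = np.cumsum(np.cumsum(grid, axis=0), axis=1)
    windows = ps[r:, r:] - ps[:-r, r:] - ps[r:, :-r] + ps[:-r, :-r]
    return bool((windows >= 2).any())

def reliability_mc(n, r, q, trials=20000, seed=0):
    # Monte Carlo estimate of system reliability when each component
    # fails independently with probability q (the i.i.d. case).
    rng = np.random.default_rng(seed)
    fails = sum(system_fails(rng.random((n, n)) < q, r)
                for _ in range(trials))
    return 1 - fails / trials

print(reliability_mc(n=6, r=2, q=0.05))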

10.
We present a probabilistic analysis of the Floyd–Rivest expected-time selection algorithm. In particular, we show that a refinement of the bootstrapped version of the Floyd–Rivest algorithm that determines the Cth order statistic by performing an expected n + C + O(n^1/2) comparisons can be made into a randomized algorithm that performs n + C + O(n^1/2 log^3/2 n) comparisons with probability at least 1 − 1/n^ρ, for any constant ρ > 0.
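A sketch of the sampling idea behind Floyd–Rivest-style selection: pick two pivots from a small random sample so that the target order statistic almost surely falls between them, then recurse on the narrow middle band. The constants and safety margin below are ad hoc, not the paper's tuned parameters.

import math, random

def select(a, k):
    # k-th smallest element of list a (1-indexed).
    n = len(a)
    if n <= 600:                        # small inputs: just sort
        return sorted(a)[k - 1]
    s = int(n ** (2 / 3))               # sample size
    sample = sorted(random.sample(a, s))
    g = int(math.sqrt(s) * math.log(n)) # ad hoc safety margin
    pos = k * s // n                    # k's expected position in the sample
    lo = sample[max(pos - g, 0)]
    hi = sample[min(pos + g, s - 1)]
    left = [x for x in a if x < lo]
    mid = [x for x in a if lo <= x <= hi]
    if len(mid) == n:                   # degenerate case (heavy duplicates)
        return sorted(a)[k - 1]
    if k <= len(left):                  # pivots missed low (rare)
        return select(left, k)
    if k > len(left) + len(mid):        # pivots missed high (rare)
        return select([x for x in a if x > hi], k - len(left) - len(mid))
    return select(mid, k - len(left))   # usual case: small middle band

random.seed(0)
data = random.sample(range(10 ** 6), 50000)
assert select(data, 1234) == sorted(data)[1233]
print("ok")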

11.
A bit of progress in language modeling
In the past several years, a number of different language modeling improvements over simple trigram models have been found, including caching, higher-order n-grams, skipping, interpolated Kneser–Ney smoothing, and clustering. We present explorations of variations on, or of the limits of, each of these techniques, including showing that sentence mixture models may have more potential. While all of these techniques have been studied separately, they have rarely been studied in combination. We compare a combination of all techniques together to a Katz-smoothed trigram model with no count cutoffs. We achieve perplexity reductions between 38% and 50% (1 bit of entropy), depending on training data size, as well as a word error rate reduction of 8.9%. Our perplexity reductions are perhaps the highest reported compared to a fair baseline.
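Of the techniques listed, caching is the easiest to illustrate: a unigram cache over recently seen words is interpolated with a static base model. The base probabilities and the weight lam below are invented for the example.

from collections import Counter, deque

class CacheLM:
    # Unigram cache interpolated with a fixed base model: words seen
    # recently get boosted. The base model here is a hypothetical
    # word -> probability dict.
    def __init__(self, base_probs, cache_size=200, lam=0.9):
        self.base = base_probs
        self.cache = deque(maxlen=cache_size)   # last N words seen
        self.lam = lam                          # weight on the base model

    def prob(self, word):
        counts = Counter(self.cache)
        cache_p = counts[word] / len(self.cache) if self.cache else 0.0
        return self.lam * self.base.get(word, 1e-7) + (1 - self.lam) * cache_p

    def observe(self, word):
        self.cache.append(word)

lm = CacheLM({"the": 0.05, "stock": 0.0002, "market": 0.0004})
print(lm.prob("stock"))
for w in "the stock market fell as stock prices".split():
    lm.observe(w)
print(lm.prob("stock"))   # boosted after recent mentions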

12.
During the early stages of language acquisition, young infants face the task of learning a basic vocabulary without the aid of prior linguistic knowledge. Attempts have been made to model this complex behaviour computationally, using a variety of machine learning algorithms, among others non-negative matrix factorization (NMF). In this paper, we replace NMF in a vocabulary learning setting with a conceptually similar algorithm, probabilistic latent semantic analysis (PLSA), which can learn word representations incrementally by Bayesian updating. We further show that this learning framework is capable of modelling certain cognitive behaviours, e.g., forgetting, in a simple way.
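A toy sketch of incremental PLSA-style updating: each new utterance is folded in with a few EM steps, and the shared word-given-topic table is nudged toward the new evidence while a decay factor slowly erodes old mass, a crude stand-in for forgetting. The learning rate, decay, and dimensions are all made up; this is not the paper's exact Bayesian-updating scheme.

import numpy as np

rng = np.random.default_rng(1)
V, K = 50, 5                     # vocabulary size, number of latent topics
word_given_topic = rng.dirichlet(np.ones(V), size=K)   # P(w|z), shape (K, V)

def update(doc_counts, eta=0.05, decay=0.999, em_iters=20):
    # Fold one new utterance (a length-V count vector) into the model:
    # EM estimates P(z|d), then P(w|z) is nudged toward the new expected
    # counts while `decay` erodes old mass (crude forgetting).
    global word_given_topic
    topic_given_doc = np.full(K, 1.0 / K)                 # P(z|d)
    for _ in range(em_iters):
        joint = word_given_topic * topic_given_doc[:, None]   # (K, V)
        resp = joint / joint.sum(axis=0, keepdims=True)       # P(z|w,d)
        topic_given_doc = (resp * doc_counts).sum(axis=1)
        topic_given_doc /= topic_given_doc.sum()
    word_given_topic = decay * word_given_topic + eta * (resp * doc_counts)
    word_given_topic /= word_given_topic.sum(axis=1, keepdims=True)

doc = np.zeros(V)
doc[3], doc[7], doc[19] = 1, 2, 1    # toy utterance: word counts
update(doc)
print(word_given_topic.sum(axis=1))  # rows remain proper distributions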

13.
Two novel word clustering techniques are proposed which employ long-distance bigram language models. The first technique is built on a hierarchical clustering algorithm and minimizes the sum of Mahalanobis distances of all words, after a cluster merger, from the centroid of the class created by the merger. The second technique resorts to probabilistic latent semantic analysis (PLSA). Next, interpolated long-distance bigrams are considered in the context of the aforementioned clustering techniques. Experiments conducted on the English Gigaword corpus (second edition) demonstrate that: (1) the long-distance bigrams, when employed in the two clustering techniques under study, yield word clusters of better quality than the baseline bigrams; (2) the interpolated long-distance bigrams outperform the long-distance bigrams in the same respect; (3) the long-distance bigrams perform better than bigrams that incorporate trigger pairs selected at various distances; and (4) the best word clustering is achieved by the PLSA that employs interpolated long-distance bigrams. Both proposed techniques outperform spectral clustering based on k-means. To assess the quality of the created clusters objectively, relative cluster validity indices are estimated, and the average cluster sense precision, average cluster sense recall, and F-measure are computed against ground truth extracted from WordNet.
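For concreteness, a distance-d bigram simply pairs each word with the word d positions ahead, and an interpolated model mixes several distances. The sketch below uses invented mixing weights.

from collections import Counter

def distance_bigrams(sentences, d):
    # Count distance-d bigrams: pairs (w_i, w_{i+d}); d=1 is the
    # ordinary bigram, larger d captures longer-range co-occurrence.
    counts = Counter()
    for sent in sentences:
        counts.update(zip(sent, sent[d:]))
    return counts

def interpolated_ld_prob(sentences, distances=(1, 2, 3),
                         weights=(0.6, 0.25, 0.15)):
    # P(w|u) as a weighted mix over distances; the weights are invented.
    per_d = [distance_bigrams(sentences, d) for d in distances]
    hists = []
    for counts in per_d:
        h = Counter()
        for (u, _), c in counts.items():
            h[u] += c
        hists.append(h)
    def prob(u, w):
        return sum(wt * counts[(u, w)] / h[u]
                   for wt, counts, h in zip(weights, per_d, hists) if h[u])
    return prob

sents = [s.split() for s in ["the cat sat on the mat",
                             "the dog sat on the rug"]]
p = interpolated_ld_prob(sents)
print(p("the", "sat"))   # nonzero only via the distance-2 model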

14.
Atze van der Ploeg, Software, 2014, 44(12): 1467–1484
The well-known Reingold–Tilford algorithm produces tidy layered drawings of trees: drawings where all nodes at the same depth are vertically aligned. However, when nodes have varying heights, a layered drawing may use more vertical space than necessary. A non-layered drawing of a tree places children at a fixed distance from the parent, thereby giving a more vertically compact drawing. Moreover, non-layered drawings can also be used to draw trees where the vertical position of each node is given, by adding dummy nodes. In this paper, we present the first linear-time algorithm for producing non-layered drawings. Our algorithm is a modification of the Reingold–Tilford algorithm, but the original complexity proof of the Reingold–Tilford algorithm uses an invariant that does not hold in the non-layered case. We give an alternative proof of the algorithm and of its extension to non-layered drawings. To improve drawings of trees of unbounded degree, extensions to the Reingold–Tilford algorithm have been proposed. These extensions also work in the non-layered case, but we show that they then cause an O(n^2) run-time. We then propose a modification to these extensions that restores the O(n) run-time. Copyright © 2013 John Wiley & Sons, Ltd.
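The placement rule that distinguishes non-layered drawings is simple to state in code: each child's y-coordinate depends only on its own parent's bottom edge. The naive sketch below demonstrates that rule with simple leaf-slot x-placement; it is nowhere near the paper's linear-time contour-based algorithm and produces much wider drawings.

class Node:
    def __init__(self, height, children=()):
        self.height = height            # nodes may have differing heights
        self.children = list(children)
        self.x = self.y = 0.0

def layout(node, y=0.0, next_x=None, gap=1.0):
    # Non-layered rule: each child sits a fixed gap below its OWN parent,
    # not below the deepest node of the level as in layered drawings.
    # Leaves take the next free x slot; parents center over children.
    if next_x is None:
        next_x = [0.0]
    node.y = y
    if not node.children:
        node.x = next_x[0]
        next_x[0] += 1.0
    else:
        for c in node.children:
            layout(c, y + node.height + gap, next_x, gap)
        node.x = (node.children[0].x + node.children[-1].x) / 2

tree = Node(2, [Node(1, [Node(1), Node(3)]), Node(1)])
layout(tree)

def dump(n, depth=0):
    print("  " * depth + f"x={n.x:.1f} y={n.y:.1f} h={n.height}")
    for c in n.children:
        dump(c, depth + 1)

dump(tree)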

15.
This paper addresses parameter drift in stochastic models. We define a notion of context that represents invariant, stable-over-time behavior, and we then propose an algorithm for detecting context changes while processing a stream of data. A context change is seen as model failure: a probabilistic model representing current behavior is no longer able to “fit” newly encountered data. We specify our stochastic models using a first-order logic-based probabilistic modeling language called Generalized Loopy Logic (GLL). An important component of GLL is its learning mechanism, which can identify context drift. We demonstrate how our algorithm can be incorporated into a failure-driven context-switching probabilistic modeling framework and offer several examples of its application.
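The failure-as-drift idea can be demonstrated with a much simpler model than GLL: track the sliding-window average log-likelihood of incoming data under the current model and declare a context change when it drops below a threshold. Everything below (the Gaussian model, window size, and threshold) is an invented stand-in for the paper's mechanism.

import math, random
from collections import deque

class DriftDetector:
    # Flags a context change when the current model can no longer "fit"
    # incoming data: the windowed average log-likelihood drops too low.
    def __init__(self, mu, sigma, window=30, threshold=-4.0):
        self.mu, self.sigma = mu, sigma
        self.recent = deque(maxlen=window)
        self.threshold = threshold

    def log_lik(self, x):
        z = (x - self.mu) / self.sigma
        return -0.5 * z * z - math.log(self.sigma * math.sqrt(2 * math.pi))

    def observe(self, x):
        self.recent.append(self.log_lik(x))
        avg = sum(self.recent) / len(self.recent)
        return avg < self.threshold          # True = model failure / drift

random.seed(0)
det = DriftDetector(mu=0.0, sigma=1.0)
stream = [random.gauss(0, 1) for _ in range(200)] + \
         [random.gauss(6, 1) for _ in range(50)]   # context shifts here
for t, x in enumerate(stream):
    if det.observe(x):
        print("drift detected at t =", t)
        break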

16.
We describe a novel approach that allows humanoid robots to incrementally integrate motion primitives and language expressions, given underlying natural language and motion language modules. The natural language module represents sentence structure using word bigrams. The motion language module extracts the relations between motion primitives and the relevant words. Both modules are expressed as probabilistic models and can therefore be integrated, so that the robots can both interpret observed motion in the form of sentences and generate the motion corresponding to a sentence command. Incremental learning is needed for a robot that develops these linguistic skills autonomously. The algorithm is derived from optimization of the natural language and motion language modules under constraints on their probabilistic variables, such that the association between motion primitive and sentence in incrementally added training pairs is strengthened. A test based on interpreting observed motion in the form of sentences demonstrates the validity of the incremental statistical learning algorithm.

17.
The process of microplanning in natural language generation (NLG) encompasses a range of problems in which a generator must bridge underlying domain-specific representations and general linguistic representations. These problems include constructing linguistic referring expressions to identify domain objects, selecting lexical items to express domain concepts, and using complex linguistic constructions to concisely convey related domain facts. In this paper, we argue that such problems are best solved through a uniform, comprehensive, declarative process. In our approach, the generator directly explores a search space for utterances described by a linguistic grammar. At each stage of search, the generator uses a model of interpretation, which characterizes the potential links between the utterance and the domain and context, to assess its progress in conveying domain-specific representations. We further address the challenges for implementation and knowledge representation in this approach. We show how to implement this approach effectively by using the lexicalized tree-adjoining grammar (LTAG) formalism to connect structure to meaning and using modal logic programming to connect meaning to context. We articulate a detailed methodology for designing grammatical and conceptual resources which the generator can use to achieve desired microplanning behavior in a specified domain. In describing our approach to microplanning, we emphasize that we are in fact realizing a deliberative process of goal-directed activity. As we formulate it, interpretation offers a declarative representation of a generator's communicative intent. It associates the concrete linguistic structure planned by the generator with inferences that show how the meaning of that structure communicates needed information about some application domain in the current discourse context. Thus, interpretations are plans that the microplanner constructs and outputs. At the same time, communicative intent representations provide a rich and uniform resource for the process of NLG. Using representations of communicative intent, a generator can augment the syntax, semantics, and pragmatics of an incomplete sentence simultaneously, and can work incrementally toward solutions for the various problems of microplanning.

18.
A FORTRAN program is described to compute the vertical magnetic field anywhere inside or outside a rectangular loop which carries a sinusoidally varying current and is placed horizontally on the surface of an n-layered earth. The program utilizes the concept of reciprocity and the known solution for a horizontal magnetic dipole source placed over an n-layered half-space. The computing algorithm is executable on a minicomputer like the PDP 11/40, and as such is useful to many geoscientists who do not have access to a mainframe computer, for computing model data to fit and interpret field observations from electromagnetic depth sounding experiments.

19.
Harry M. Chang, Computing, 2011, 91(3): 241–264
The Zipf–Mandelbrot law is widely used to model a power-law distribution on ranked data. One of the best-known applications of the Zipf–Mandelbrot law is in linguistic analysis of the distribution of words ranked by their frequency in a text corpus. By exploring known limitations of the Zipf–Mandelbrot law in modeling actual linguistic data from different domains, in both printed media and online content, a new algorithm is developed to effectively construct n-gram rules for building the natural language (NL) models required for a human-to-computer interface. The construction of statistically oriented n-gram rules is based on a new computing algorithm that identifies the area of divergence between the Zipf–Mandelbrot curve and the actual frequency distribution of the ranked n-gram text tokens extracted from a large text corpus derived from the online electronic programming guide (EPG) for television shows and movies. Two empirical experiments were carried out to evaluate the EPG-specific language models created using the new algorithm in the context of NL-based information retrieval systems. The experimental results show the effectiveness of the algorithm for developing low-complexity concept models with high coverage of users' language models for both typed and spoken queries when interacting with an NL-based EPG search interface.
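The law itself is f(r) = C / (r + b)^a for rank r. A minimal sketch of fitting it and locating where observed frequencies diverge from the curve follows; the fitting method (grid search plus log-log regression), the tolerance, and the synthetic data are choices made for this example, not the paper's algorithm.

import numpy as np

def fit_zipf_mandelbrot(freqs):
    # Fit f(r) = C / (r + b)^a by grid-searching the shift b and
    # regressing log f on log(r + b).  Returns (C, b, a).
    ranks = np.arange(1, len(freqs) + 1)
    log_f = np.log(freqs)
    best = None
    for b in np.linspace(0.0, 10.0, 101):
        x = np.log(ranks + b)
        slope, intercept = np.polyfit(x, log_f, 1)
        resid = ((log_f - (slope * x + intercept)) ** 2).sum()
        if best is None or resid < best[0]:
            best = (resid, np.exp(intercept), b, -slope)
    _, C, b, a = best
    return C, b, a

def divergence_region(freqs, C, b, a, tol=0.5):
    # Ranks where observed log-frequency strays from the fitted curve
    # by more than tol: the "area of divergence" idea (tol is made up).
    ranks = np.arange(1, len(freqs) + 1)
    model = C / (ranks + b) ** a
    gap = np.abs(np.log(freqs) - np.log(model))
    return ranks[gap > tol]

# Synthetic ranked n-gram counts whose tail deliberately breaks the law:
r = np.arange(1, 1001)
freqs = 5000.0 / (r + 2.7) ** 1.1
freqs[500:] *= 3.0
C, b, a = fit_zipf_mandelbrot(freqs)
print(round(a, 2), divergence_region(freqs, C, b, a)[:5])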

20.
To date, research on continuous sign language sentence recognition has been relatively scarce, largely because it is difficult to segment out individual sign words effectively. This paper uses a convolutional neural network to extract the hand-shape features of sign words and a trajectory-normalization algorithm to extract their trajectory features, and on this basis constructs a long short-term memory (LSTM) network, yielding a sign-word classifier for sentence recognition. For a sentence to be recognized, a segmentation algorithm based on the right-palm trajectory is used to detect transition movements. The transition movements split the sentence into multiple fragments; since some transition movements may actually occur inside a sign word, several fragments are concatenated into composite segments, and the sign-word classifier is applied to all composite segments in level-traversal order. Finally, a dynamic programming algorithm with cross-segment search finds the word sequence with the maximum posterior probability, completing recognition of the sentence. Experimental results show that the algorithm can recognize sentences composed of 47 common sign words with high accuracy and good real-time performance.
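The final search step is a standard segmentation DP. The sketch below assumes a hypothetical score(i, j) callback standing in for the CNN+LSTM classifier's posterior over the composite segment spanning fragments i..j-1; cut points are chosen to maximize the total log-posterior.

import math

def best_word_sequence(n_fragments, score, max_span=3):
    # Dynamic programming over the fragment lattice: choose cut points
    # so the product of classifier posteriors over the chosen composite
    # segments is maximal.  score(i, j) -> (best_word, log_posterior).
    NEG = float("-inf")
    best = [NEG] * (n_fragments + 1)   # best[j]: best log-prob of fragments 0..j-1
    back = [None] * (n_fragments + 1)  # back[j]: (i, word) of the last segment
    best[0] = 0.0
    for j in range(1, n_fragments + 1):
        for i in range(max(0, j - max_span), j):
            word, lp = score(i, j)
            if best[i] + lp > best[j]:
                best[j] = best[i] + lp
                back[j] = (i, word)
    words, j = [], n_fragments
    while j > 0:                       # recover the argmax sequence
        i, word = back[j]
        words.append(word)
        j = i
    return list(reversed(words)), best[n_fragments]

# Toy score table: 4 fragments; the true words span [0,2) and [2,4).
table = {(0, 2): ("HELLO", math.log(0.8)), (2, 4): ("WORLD", math.log(0.7)),
         (0, 1): ("A", math.log(0.3)), (1, 2): ("B", math.log(0.3)),
         (2, 3): ("C", math.log(0.3)), (3, 4): ("D", math.log(0.3)),
         (1, 3): ("E", math.log(0.1)), (0, 3): ("F", math.log(0.05)),
         (1, 4): ("G", math.log(0.05))}
print(best_word_sequence(4, lambda i, j: table.get((i, j), ("?", math.log(1e-6)))))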
