首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Many model-based investigation techniques, such as sensitivity analysis, optimization, and statistical inference, require a large number of model evaluations to be performed at different input and/or parameter values. This limits the application of these techniques to models that can be implemented in computationally efficient computer codes. Emulators, by providing efficient interpolation between outputs of deterministic simulation models, can considerably extend the field of applicability of such computationally demanding techniques. So far, the dominant techniques for developing emulators have been priors in the form of Gaussian stochastic processes (GASP) that were conditioned with a design data set of inputs and corresponding model outputs. In the context of dynamic models, this approach has two essential disadvantages: (i) these emulators do not consider our knowledge of the structure of the model, and (ii) they run into numerical difficulties if there are a large number of closely spaced input points as is often the case in the time dimension of dynamic models. To address both of these problems, a new concept of developing emulators for dynamic models is proposed. This concept is based on a prior that combines a simplified linear state space model of the temporal evolution of the dynamic model with Gaussian stochastic processes for the innovation terms as functions of model parameters and/or inputs. These innovation terms are intended to correct the error of the linear model at each output step. Conditioning this prior to the design data set is done by Kalman smoothing. This leads to an efficient emulator that, due to the consideration of our knowledge about dominant mechanisms built into the simulation model, can be expected to outperform purely statistical emulators at least in cases in which the design data set is small. The feasibility and potential difficulties of the proposed approach are demonstrated by the application to a simple hydrological model.  相似文献   

2.

Visually based point-and-click user interfaces have become very common. This increases the need to understand the mechanics in learning and using pointing devices in order to design appropriate human-computer interaction and thereby to help alleviate musculosketetal symptoms. The paper reports a study of preference, strategies and learning in using keyboard and mouse in a tracking task under time pressure. The keyboard was preferred by 11 out of 12 subjects due primarily to comfort, frustration, and visual strain. One of the most distinguishing features in favour of the keyboard was the opportunity to develop a working strategy facilitating learning.  相似文献   

3.
Visually based point-and-click user interfaces have become very common. This increases the need to understand the mechanics in learning and using pointing devices in order to design appropriate human-computer interaction and thereby to help alleviate musculosketetal symptoms. The paper reports a study of preference, strategies and learning in using keyboard and mouse in a tracking task under time pressure. The keyboard was preferred by 11 out of 12 subjects due primarily to comfort, frustration, and visual strain. One of the most distinguishing features in favour of the keyboard was the opportunity to develop a working strategy facilitating learning.  相似文献   

4.
Neural Computing and Applications - Text summarization resolves the issue of capturing essential information from a large volume of text data. Existing methods either depend on the end-to-end...  相似文献   

5.
We introduce a new type of lexical structure called lexical system, an interoperable model that can feed both monolingual and multilingual language resources. We begin with a formal characterization of lexical systems as simple directed graphs, solely made up of nodes corresponding to lexical entities and links. To illustrate our approach, we present data borrowed from a lexical system that has been generated from the French DiCo database. We later explain how the compilation of the original dictionary-like database into a net-like one has been made possible. Finally, we discuss the potential of the proposed lexical structure for designing multilingual lexical resources.
Alain PolguèreEmail:
  相似文献   

6.
7.
We introduce an exponential language model which models a whole sentence or utterance as a single unit. By avoiding the chain rule, the model treats each sentence as a “bag of features", where features are arbitrary computable properties of the sentence. The new model is computationally more efficient, and more naturally suited to modeling global sentential phenomena, than the conditional exponential (e.g. maximum entropy) models proposed to date. Using the model is straightforward. Training the model requires sampling from an exponential distribution. We describe the challenge of applying Monte Carlo Markov Chain and other sampling techniques to natural language, and discuss smoothing and step-size selection. We then present a novel procedure for feature selection, which exploits discrepancies between the existing model and the training corpus. We demonstrate our ideas by constructing and analysing competitive models in the Switchboard and Broadcast News domains, incorporating lexical and syntactic information.  相似文献   

8.
9.
Finite-state Markov chains have proven useful as a model to characterize a population of uses of a software system. This paper presents a language (TML) for describing these models in a manner that supports development, reuse, and automated testing. TML provides a simple representation of usage models, while formalizing modeling techniques already in use informally.  相似文献   

10.
11.
12.

The exponential growth of the internet community has resulted in the production of a vast amount of unstructured data, including web pages, blogs and social media. Such a volume consisting of hundreds of billions of words is unlikely to be analyzed by humans. In this work we introduce the tool LanguageCrawl, which allows Natural Language Processing (NLP) researchers to easily build web-scale corpora using the Common Crawl Archive—an open repository of web crawl information, which contains petabytes of data. We present three use cases in the course of this work: filtering of Polish websites, the construction of n-gram corpora and the training of a continuous skipgram language model with hierarchical softmax. Each of them has been implemented within the LanguageCrawl toolkit, with the possibility to adjust specified language and n-gram ranks. This paper focuses particularly on high computing efficiency by applying highly concurrent multitasking. Our tool utilizes effective libraries and design. LanguageCrawl has been made publicly available to enrich the current set of NLP resources. We strongly believe that our work will facilitate further NLP research, especially in under-resourced languages, in which the lack of appropriately-sized corpora is a serious hindrance to applying data-intensive methods, such as deep neural networks.

  相似文献   

13.
《Knowledge》2007,20(2):177-185
There is an extensive body of work on Intelligent Tutoring Systems: computer environments for education, teaching and training that adapt to the needs of the individual learner. Work on personalisation and adaptivity has included research into allowing the student user to enhance the system’s adaptivity by improving the accuracy of the underlying learner model. Open Learner Modelling, where the system’s model of the user’s knowledge is revealed to the user, has been proposed to support student reflection on their learning. Increased accuracy of the learner model can be obtained by the student and system jointly negotiating the learner model. We present the initial investigations into a system to allow people to negotiate the model of their understanding of a topic in natural language. This paper discusses the development and capabilities of both conversational agents (or chatbots) and Intelligent Tutoring Systems, in particular Open Learner Modelling. We describe a Wizard-of-Oz experiment to investigate the feasibility of using a chatbot to support negotiation, and conclude that a fusion of the two fields can lead to developing negotiation techniques for chatbots and the enhancement of the Open Learner Model. This technology, if successful, could have widespread application in schools, universities and other training scenarios.  相似文献   

14.
Recent theoretical developments in queuing theory have made multiclass queuing network models a viable alternative to established simulation methods for the analysis of dynamic systems. Computer software is required to describe and solve such network models. This paper describes a language which provides the human interface to a particular multiclass queuing network modelling package. The discussion is illustrated with an example from the language and concludes with suggestions for improvements.  相似文献   

15.

Sense representations have gone beyond word representations like Word2Vec, GloVe and FastText and achieved innovative performance on a wide range of natural language processing tasks. Although very useful in many applications, the traditional approaches for generating word embeddings have a strict drawback: they produce a single vector representation for a given word ignoring the fact that ambiguous words can assume different meanings. In this paper, we explore unsupervised sense representations which, different from traditional word embeddings, are able to induce different senses of a word by analyzing its contextual semantics in a text. The unsupervised sense representations investigated in this paper are: sense embeddings and deep neural language models. We present the first experiments carried out for generating sense embeddings for Portuguese. Our experiments show that the sense embedding model (Sense2vec) outperformed traditional word embeddings in syntactic and semantic analogies task, proving that the language resource generated here can improve the performance of NLP tasks in Portuguese. We also evaluated the performance of pre-trained deep neural language models (ELMo and BERT) in two transfer learning approaches: feature based and fine-tuning, in the semantic textual similarity task. Our experiments indicate that the fine tuned Multilingual and Portuguese BERT language models were able to achieve better accuracy than the ELMo model and baselines.

  相似文献   

16.
The result quality of queries incorporating impreciseness can be improved by the specification of user-defined weights. Existing approaches evaluate weighted queries by applying arithmetic evaluations on top of the query’s intrinsic logic. This complicates the usage of logic-based optimization. Therefore, we suggest a weighting approach that is completely embedded in a logic. In order to facilitate the user interaction with the system, we exploit the intuitively comprehensible concept of preferences. In addition, we use a machine-based learning algorithm to learn weighting values in correspondence to the user’s intended semantics of a posed query. Experiments show the utility of our approach.  相似文献   

17.
Nowadays, publish–subscribe (pub-sub) and event-based architectures are frequently used for developing loosely coupled distributed systems. Hence, it is desirable to find a proper solution to specify different systems through these architectures. Abstract state machine (ASM) is a useful means to visually and formally model pub–sub and event-based architectures. However, modeling per se is not enough since the designers want to be able to verify the designed models. As the model checking is a proper approach to verify software and hardware systems, in this paper, we present an approach to verify ASM models specified in terms of Asmeta language using Bogor—a well known model checker. In our approach, the AsmetaL specification is automatically encoded to BIR, the input language of the Bogor. Our experimental results show that in the most cases our approach generates more efficient results in comparison with the existing approach.  相似文献   

18.
We are interested in the problem of robust understanding from noisy spontaneous speech input. With the advances in automated speech recognition (ASR), there has been increasing interest in spoken language understanding (SLU). A challenge in large vocabulary spoken language understanding is robustness to ASR errors. State of the art spoken language understanding relies on the best ASR hypotheses (ASR 1-best). In this paper, we propose methods for a tighter integration of ASR and SLU using word confusion networks (WCNs). WCNs obtained from ASR word graphs (lattices) provide a compact representation of multiple aligned ASR hypotheses along with word confidence scores, without compromising recognition accuracy. We present our work on exploiting WCNs instead of simply using ASR one-best hypotheses. In this work, we focus on the tasks of named entity detection and extraction and call classification in a spoken dialog system, although the idea is more general and applicable to other spoken language processing tasks. For named entity detection, we have improved the F-measure by using both word lattices and WCNs, 6–10% absolute. The processing of WCNs was 25 times faster than lattices, which is very important for real-life applications. For call classification, we have shown between 5% and 10% relative reduction in error rate using WCNs compared to ASR 1-best output.  相似文献   

19.
In this paper, we introduce the backoff hierarchical class n-gram language models to better estimate the likelihood of unseen n-gram events. This multi-level class hierarchy language modeling approach generalizes the well-known backoff n-gram language modeling technique. It uses a class hierarchy to define word contexts. Each node in the hierarchy is a class that contains all the words of its descendant nodes. The closer a node to the root, the more general the class (and context) is. We investigate the effectiveness of the approach to model unseen events in speech recognition. Our results illustrate that the proposed technique outperforms backoff n-gram language models. We also study the effect of the vocabulary size and the depth of the class hierarchy on the performance of the approach. Results are presented on Wall Street Journal (WSJ) corpus using two vocabulary set: 5000 words and 20,000 words. Experiments with 5000 word vocabulary, which contain a small numbers of unseen events in the test set, show up to 10% improvement of the unseen event perplexity when using the hierarchical class n-gram language models. With a vocabulary of 20,000 words, characterized by a larger number of unseen events, the perplexity of unseen events decreases by 26%, while the word error rate (WER) decreases by 12% when using the hierarchical approach. Our results suggest that the largest gains in performance are obtained when the test set contains a large number of unseen events.  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号