Similar Documents
20 similar documents found (search time: 12 ms)
1.
Classification of time series has been attracting great interest over the past decade. While dozens of techniques have been introduced, recent empirical evidence has strongly suggested that the simple nearest neighbor algorithm is very difficult to beat for most time series problems, especially for large-scale datasets. While this may be considered good news, given the simplicity of implementing the nearest neighbor algorithm, there are some negative consequences of this. First, the nearest neighbor algorithm requires storing and searching the entire dataset, resulting in a high time and space complexity that limits its applicability, especially on resource-limited sensors. Second, beyond mere classification accuracy, we often wish to gain some insight into the data and to make the classification result more explainable, which the global characteristics of the nearest neighbor cannot provide. In this work we introduce a new time series primitive, time series shapelets, which addresses these limitations. Informally, shapelets are time series subsequences which are in some sense maximally representative of a class. We can use the distance to the shapelet, rather than the distance to the nearest neighbor, to classify objects. As we shall show with extensive empirical evaluations in diverse domains, classification algorithms based on the time series shapelet primitives can be interpretable, more accurate, and significantly faster than state-of-the-art classifiers.
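The core computation behind shapelet-based classification, the distance from a time series to a candidate shapelet, is simple enough to sketch. The snippet below is a minimal, hedged illustration (z-normalized Euclidean subsequence distance plus a one-shapelet decision stump); the function names, threshold and toy data are assumptions, not the authors' code.

```python
import numpy as np

def znorm(x):
    """Z-normalize a sequence; constant sequences map to zeros."""
    s = x.std()
    return (x - x.mean()) / s if s > 0 else np.zeros_like(x)

def dist_to_shapelet(series, shapelet):
    """Minimum z-normalized Euclidean distance between the shapelet and
    any subsequence of the series (the 'subsequence distance')."""
    m = len(shapelet)
    s = znorm(np.asarray(shapelet, dtype=float))
    best = np.inf
    for i in range(len(series) - m + 1):
        w = znorm(np.asarray(series[i:i + m], dtype=float))
        best = min(best, float(np.sqrt(np.sum((w - s) ** 2))))
    return best

def classify(series, shapelet, threshold):
    """Decision stump: series close to the shapelet are assigned class 1."""
    return 1 if dist_to_shapelet(series, shapelet) <= threshold else 0

# Toy usage: a 'bump' shapelet separates bumpy series from flat noise.
rng = np.random.default_rng(0)
bumpy = np.concatenate([np.zeros(20), np.hanning(20), np.zeros(20)])
flat = rng.normal(0.0, 0.05, 60)
shapelet = np.hanning(20)
print(classify(bumpy, shapelet, 1.0), classify(flat, shapelet, 1.0))  # 1 0
```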

2.
Clustering based on matrix approximation: a unifying view
Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. Recently, a number of methods based on matrix approximation have been proposed and have demonstrated good performance. Despite significant research on these methods, few attempts have been made to establish the connections between them while highlighting their differences. In this paper, we present a unified view of these methods within a general clustering framework in which clustering is formulated as a matrix approximation problem and the clustering objective is to minimize the approximation error between the original data matrix and the matrix reconstructed from the cluster structures. The general framework provides an elegant basis for comparing and understanding various clustering methods. We characterize different clustering methods within the general framework, including traditional one-side clustering, subspace clustering and two-side clustering. We also establish the connections between our general clustering framework and existing frameworks.
Tao Li
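As a hedged illustration of the matrix-approximation view described above (not the paper's own code), K-means clustering can be written as minimizing ||X - ZC||_F^2, where Z is a binary cluster-indicator matrix and C holds the centroids. The sketch below simply evaluates that reconstruction error for a given partition; names and toy data are illustrative.

```python
import numpy as np

def kmeans_approximation_error(X, labels, k):
    """Frobenius reconstruction error ||X - Z C||_F^2 of a partition, where Z is
    the n x k cluster-indicator matrix and C the k x d centroid matrix."""
    n, d = X.shape
    Z = np.zeros((n, k))
    Z[np.arange(n), labels] = 1.0
    sizes = Z.sum(axis=0, keepdims=True).T             # k x 1 cluster sizes
    C = (Z.T @ X) / np.maximum(sizes, 1.0)             # centroids minimize the error
    return float(np.linalg.norm(X - Z @ C, 'fro') ** 2)

# Toy usage: two well-separated blobs; the correct partition has a lower error.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (10, 2)), rng.normal(5.0, 0.1, (10, 2))])
good = np.array([0] * 10 + [1] * 10)
bad = np.array([0, 1] * 10)
print(kmeans_approximation_error(X, good, 2) < kmeans_approximation_error(X, bad, 2))  # True
```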

3.
4.
Time series motifs are approximately repeated subsequences found within a longer time series. They have been in the literature since 2002, but recently they have begun to receive significant attention in research and industrial communities. This is perhaps due to the growing realization that they implicitly offer solutions to a host of time series problems, including rule discovery, anomaly detection, density estimation, semantic segmentation, summarization, etc. Recent work has improved scalability to the point that exact motifs can be computed on datasets with up to a million data points in reasonable time. However, in some domains, for example seismology or climatology, there is an immediate need to address even larger datasets. In this work, we demonstrate that a combination of a novel algorithm and a high-performance GPU allows us to significantly improve the scalability of motif discovery. We demonstrate the scalability of our ideas by finding the full set of exact motifs on a dataset with one hundred and forty-three million subsequences, by far the largest dataset ever mined for time series motifs/joins; it requires ten quadrillion pairwise comparisons. Furthermore, we demonstrate that our algorithm can produce actionable insights in seismology and ethology.
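For contrast with the GPU-scale approach described above, the exact top-1 motif can be found with a brute-force quadratic scan. This is a minimal, hedged sketch (illustrative names and toy data, z-normalized Euclidean distance, non-overlapping pairs only), not the authors' algorithm.

```python
import numpy as np

def znorm(x):
    s = x.std()
    return (x - x.mean()) / s if s > 0 else np.zeros_like(x)

def top1_motif(ts, m):
    """Return (distance, i, j): the pair of non-overlapping length-m subsequences
    with the smallest z-normalized Euclidean distance.  O(n^2 m) time."""
    n = len(ts) - m + 1
    subs = [znorm(np.asarray(ts[i:i + m], dtype=float)) for i in range(n)]
    best = (np.inf, -1, -1)
    for i in range(n):
        for j in range(i + m, n):          # skip trivial (overlapping) matches
            d = float(np.linalg.norm(subs[i] - subs[j]))
            if d < best[0]:
                best = (d, i, j)
    return best

# Toy usage: a pattern planted twice in noise is recovered as the motif.
rng = np.random.default_rng(2)
ts = rng.normal(0.0, 1.0, 300)
pattern = 5.0 * np.sin(np.linspace(0.0, 2.0 * np.pi, 30))
ts[40:70] += pattern
ts[200:230] += pattern
print(top1_motif(ts, 30)[1:])              # indices close to (40, 200)
```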

5.
This paper presents new, straightforward methods for directly obtaining the shift-transformation matrix, direct-product matrix and summation matrix of discrete Walsh series. The proposed methods can be conveniently implemented on a digital computer. They will be very useful in the study of control systems via discrete Walsh series.

6.
7.
Various models for time series of counts that can account for discreteness, overdispersion and serial correlation are compared. In addition to observation-driven and parameter-driven models based on conditional Poisson distributions, a dynamic ordered probit model is considered as a flexible specification capable of capturing the salient features of time series of counts. Appropriate efficient estimation procedures are presented for all models. For the parameter-driven specification this requires Monte Carlo procedures such as simulated maximum likelihood or Markov chain Monte Carlo. The methods, including corresponding diagnostic tests, are illustrated using data on daily admissions for asthma to a single hospital. The estimation results turn out to be remarkably similar across the different models.
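One simple observation-driven specification of the kind compared here is a conditional Poisson model with intensity feedback (an INGARCH(1,1)-type recursion). The sketch below only evaluates its log-likelihood; the parameter names, initialization and toy counts are assumptions, and the estimation procedures in the paper are considerably more elaborate.

```python
import math

def ingarch_loglik(y, omega, alpha, beta):
    """Log-likelihood of y_t | past ~ Poisson(lam_t) with the recursion
    lam_t = omega + alpha * y_{t-1} + beta * lam_{t-1}  (INGARCH(1,1))."""
    lam = sum(y) / len(y)            # initialize the intensity at the sample mean
    ll = 0.0
    for t in range(1, len(y)):
        lam = omega + alpha * y[t - 1] + beta * lam
        ll += y[t] * math.log(lam) - lam - math.lgamma(y[t] + 1)
    return ll

# Toy usage: compare a feedback model against an i.i.d. Poisson baseline.
counts = [3, 5, 2, 4, 6, 3, 1, 4, 5, 2]
print(ingarch_loglik(counts, omega=1.0, alpha=0.4, beta=0.3))
print(ingarch_loglik(counts, omega=3.5, alpha=0.0, beta=0.0))   # constant-mean baseline
```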

8.
9.
10.
The paper considers the problem of illuminant estimation: how, given an image of a scene recorded under an unknown light, we can recover an estimate of that light. Obtaining such an estimate is a central part of solving the color constancy problem. Thus, the work presented will have applications in fields such as color-based object recognition and digital photography. Rather than attempting to recover a single estimate of the illuminant, we instead set out to recover a measure of the likelihood that each of a set of possible illuminants was the scene illuminant. We begin by determining which image colors can occur (and how these colors are distributed) under each of a set of possible lights. We discuss how, for a given camera, we can obtain this knowledge. We then correlate this information with the colors in a particular image to obtain a measure of the likelihood that each of the possible lights was the scene illuminant. Finally, we use this likelihood information to choose a single light as an estimate of the scene illuminant. Computation is expressed and performed in a generic correlation framework which we develop. We propose a new probabilistic instantiation of this correlation framework and show that it delivers very good color constancy on both synthetic and real images. We further show that the proposed framework is rich enough to allow many existing algorithms to be expressed within it: the gray-world and gamut-mapping algorithms are presented in this framework, and we also explore the relationship of these algorithms to other probabilistic and neural network approaches to color constancy.
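The gray-world algorithm, one of the existing methods the authors show can be expressed within their correlation framework, is simple enough to sketch directly. This is a hedged, minimal version with illustrative names and toy data, not the paper's implementation.

```python
import numpy as np

def gray_world_illuminant(image):
    """Estimate the scene illuminant as the mean RGB of the image, normalized
    to unit length (gray-world assumption: the average reflectance is achromatic)."""
    rgb = image.reshape(-1, 3).mean(axis=0)
    return rgb / np.linalg.norm(rgb)

def correct(image, illuminant):
    """Von Kries-style diagonal correction: scale each channel so the
    estimated illuminant maps to neutral gray."""
    gains = illuminant.mean() / illuminant
    return np.clip(image * gains, 0.0, 1.0)

# Toy usage: a reddish cast over a roughly achromatic scene is estimated and removed.
rng = np.random.default_rng(3)
scene = rng.uniform(0.2, 0.8, (64, 64, 3))
cast = scene * np.array([1.0, 0.7, 0.5])               # simulated reddish illuminant
est = gray_world_illuminant(cast)
print(est)                                              # biased towards the red channel
print(correct(cast, est).reshape(-1, 3).mean(axis=0))   # channel means roughly equal again
```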

11.
Scrutineer is an interactive, user-friendly program designed to search for motifs, patterns and profiles in the Swissprot, Protein Identification Resource (PIR) or SeqDb protein sequence databases. Basic capabilities include (i) searches for strings of amino acids with multiple choices at a given position; (ii) searches for strings including variable-length segments and delocalized constraints; (iii) searches over subsets of a database or particular regions within each sequence (e.g. N-terminal one-third); (iv) searches involving secondary structure predictions, physicochemical characteristics, and the like; and (v) searches using aligned sequences as targets with various optional weighting schemes. The various search criteria and hits can be combined and complex targets located. Once the data are loaded into virtual memory, all occurrences in PIR release 22.0 (3.7 × 10^6 amino acids) of a given short string of amino acids (e.g. a hexamer) are found in approximately 36 s. Scrutineer can also describe the entire database, user-specified hits, user-defined regions of sequence and all hits. The source code and accompanying manual are being freely distributed.
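Capability (i), searching for strings of amino acids with multiple choices at a given position, corresponds to what would today be written as a regular-expression scan. The sketch below is a modern, hedged illustration of that idea (the translation rules, sequence and motif are made up; this is not Scrutineer's syntax or code).

```python
import re

def find_motif(sequence, pattern):
    """Scan a protein sequence for a PROSITE-like motif:
    [ST] = Ser or Thr, x = any residue, x(2,4) = 2 to 4 arbitrary residues."""
    regex = pattern.replace('x', '.').replace('(', '{').replace(')', '}')
    return [(m.start(), m.group()) for m in re.finditer(regex, sequence)]

# Toy usage: an N-glycosylation-like motif N-x-[ST] on a made-up sequence.
seq = "MKTAYIAKQRNISATGLWDECNNSSQVMLK"
print(find_motif(seq, "Nx[ST]"))    # e.g. [(10, 'NIS'), (21, 'NNS')]
```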

12.
Forecasting the behavior of variables (e.g., economic, financial, physical) is of strategic value for organizations, which helps to sustain practical interest in the development of alternative models and resolution procedures. This paper presents a non-linear model that combines radial basis functions with the ARMA(p, q) structure. The optimal set of parameters for such a model is difficult to find; in this paper, a scatter search meta-heuristic is used to find it. Five time series are analyzed to assess and illustrate the pertinence of the proposed meta-heuristic method.
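A minimal sketch of the hybrid idea, a linear autoregressive part plus radial-basis-function terms fitted by least squares, is given below. It is an assumption-laden simplification: the moving-average component and the scatter-search tuning of the centres and width used in the paper are omitted, and all names, centres and data are illustrative.

```python
import numpy as np

def fit_rbf_ar(y, p, centers, width):
    """Least-squares fit of the one-step predictor
        y_t ~ sum_i a_i * y_{t-i} + sum_j w_j * exp(-||x_t - c_j||^2 / (2*width^2)) + b,
    where x_t = (y_{t-1}, ..., y_{t-p}) is the lag vector."""
    X = np.array([y[t - p:t][::-1] for t in range(p, len(y))])        # lag vectors
    target = np.asarray(y[p:])
    rbf = np.exp(-np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
                 / (2.0 * width ** 2))
    design = np.hstack([X, rbf, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

def predict_next(y, p, centers, width, coef):
    x = np.asarray(y[-p:][::-1])
    rbf = np.exp(-np.sum((x - centers) ** 2, axis=1) / (2.0 * width ** 2))
    return float(np.concatenate([x, rbf, [1.0]]) @ coef)

# Toy usage on a noisy sine series, with arbitrarily chosen centres and width.
rng = np.random.default_rng(4)
series = np.sin(0.2 * np.arange(200)) + rng.normal(0.0, 0.05, 200)
centers = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, -1.0]])            # in lag space (p = 2)
coef = fit_rbf_ar(series, 2, centers, 1.0)
print(predict_next(series, 2, centers, 1.0, coef), np.sin(0.2 * 200)) # forecast vs truth
```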

13.
Knowledge and information related to occupational safety, health, and well-being in construction are scattered and fragmented. Despite technological advancements in information and knowledge management, a link between safety management and information models is still missing. In this paper we present first steps towards a unifying formal (logic-based) domain model of construction safety, called SafeConDM, that consists of: (1) a semantically rich ontology of hazards, safety concepts, and concept relationships that builds on, and integrates with, existing construction safety ontologies and building information models; and (2) a set of first-order if-then rules, defined in a novel way using spatial artefacts, that link construction site states with the potential for specific hazards to occur. We present a prototype software tool, based on our ASP4BIM tool, that implements SafeConDM for construction hazard analysis and safe construction planning decision support, and we empirically evaluate the tool on three real-world construction building models.

14.
The transition matrix $\varphi$ corresponding to the $n$-dimensional matrix $A$ can be represented by $\varphi(t) = g_1(t)I + g_2(t)A + \cdots + g_n(t)A^{n-1}$, where the vector $g^{T} = (g_1, \ldots, g_n)$ is generated from $\dot{g}^{T} = g^{T}A_c$, $g^{T}(0) = (1, 0, \ldots, 0)$, and $A_c$ is the companion matrix of $A$. The result is applied to the covariance differential equation $\dot{C} = AC + CA^{T} + Q$, and its solution is written as a finite series. The equations are presented in a form amenable to implementation on a digital computer.
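A small numerical sketch of this representation is given below (hedged: it relies on NumPy/SciPy and a 2x2 check against the matrix exponential, which are not part of the original presentation; the companion-matrix convention used is one assumption among several possible ones).

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

def phi_via_companion(A, t):
    """Compute phi(t) = g_1(t) I + g_2(t) A + ... + g_n(t) A^(n-1), where g solves
    dg^T/dt = g^T A_c with g^T(0) = (1, 0, ..., 0) and A_c is the companion matrix
    of A's characteristic polynomial."""
    n = A.shape[0]
    # np.poly(A) = [1, c_{n-1}, ..., c_0] for s^n + c_{n-1} s^{n-1} + ... + c_0
    a = np.poly(A)[1:][::-1]                    # (c_0, c_1, ..., c_{n-1})
    Ac = np.zeros((n, n))
    Ac[:-1, 1:] = np.eye(n - 1)                 # ones on the super-diagonal
    Ac[-1, :] = -a                              # last row: -c_0, ..., -c_{n-1}
    g0 = np.zeros(n)
    g0[0] = 1.0
    sol = solve_ivp(lambda _, g: Ac.T @ g, (0.0, t), g0, rtol=1e-10, atol=1e-12)
    g = sol.y[:, -1]
    return sum(g[i] * np.linalg.matrix_power(A, i) for i in range(n))

# Check against the matrix exponential for a small example.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
print(np.allclose(phi_via_companion(A, 0.5), expm(0.5 * A), atol=1e-6))   # True
```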

15.
Woodcock A. Ergonomics, 2007, 50(10): 1547-1560.
Educational ergonomics - the teaching of ergonomics and the design of environments where ergonomics teaching and learning might occur - has received little attention from ergonomists. This paper first describes the roots of the author's interest and research in educational ergonomics; second, it provides a personal view of the opportunities and challenges posed by the two streams of educational ergonomics; and lastly, it considers the implications of teaching ergonomics to children in terms of their personal development, the design of schools and the impact such initiatives might have on wider societal problems.

16.
Generalization properties of support vector machines, orthogonal least squares and zero-order regularized orthogonal least squares algorithms are studied using simulation. For a high signal-to-noise ratio (40 dB), mixed results are obtained, but for a low signal-to-noise ratio, the prediction performance of support vector machines is better than that of the orthogonal least squares algorithm in the examples considered. However, the latter can usually give a parsimonious model with very fast training and testing times. Two new algorithms are therefore proposed that combine the orthogonal least squares algorithm with support vector machines to give a parsimonious model with good prediction accuracy in the low signal-to-noise-ratio case.

17.
The notion of an off-line/on-line digital signature scheme was introduced by Even, Goldreich and Micali. Informally, such signature schemes are used to reduce the time required to compute a signature using some kind of preprocessing. Even, Goldreich and Micali show how to realize off-line/on-line digital signature schemes by combining regular digital signatures with efficient one-time signatures. Later, Shamir and Tauman presented an alternative construction (which produces shorter signatures) obtained by combining regular signatures with chameleon hash functions. In this paper, we study off-line/on-line digital signature schemes from both a theoretical and a practical perspective. More precisely, our contribution is threefold. First, we unify the Shamir–Tauman and Even et al. approaches by showing that they can be seen as different instantiations of the same paradigm. We do this by showing that the one-time signatures needed in the Even et al. approach only need to satisfy a weak notion of security. We then show that chameleon hashing is basically a one-time signature which satisfies such a weaker security notion. As a by-product of this result, we study the relationship between one-time signatures and chameleon hashing, and we prove that a special type of chameleon hashing (which we call double-trapdoor) is actually a fully secure one-time signature. Next, we consider the task of building, in a generic fashion, threshold variants of known schemes: Crutchfield et al. proposed a generic way to construct a threshold off-line/on-line signature scheme given a threshold regular one. They applied known threshold techniques to the Shamir–Tauman construction using a specific chameleon hash function. Their solution introduces additional computational assumptions which turn out to be implied by the so-called one-more discrete logarithm assumption. Here, we propose two generic constructions that can be based on any threshold signature scheme, combined with a specific (double-trapdoor) chameleon hash function. Our constructions are efficient and can be proven secure in the standard model using only the traditional discrete logarithm assumption. Finally, we ran experimental tests to compare the practical efficiency of the two known constructions for non-threshold off-line/on-line signatures. Interestingly, we show that, with some optimizations, the two approaches are comparable in efficiency and signature length.
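As a hedged aside, the discrete-log chameleon hash underlying such constructions fits in a few lines. The parameters below are deliberately tiny and insecure, the names are illustrative, and the schemes in the paper use full-size groups and the double-trapdoor variant discussed above.

```python
# Toy discrete-log chameleon hash: H(m, r) = g^m * h^r mod p, with h = g^x.
# The trapdoor holder (who knows x) can find collisions: for any new message m',
# setting r' = r + (m - m') / x mod q gives m + x*r = m' + x*r' mod q.
# Tiny, INSECURE parameters for illustration only (p = 2q + 1, both prime).
p, q, g = 467, 233, 4
x = 57                              # trapdoor
h = pow(g, x, p)

def chameleon_hash(m, r):
    return (pow(g, m % q, p) * pow(h, r % q, p)) % p

def collide(m, r, m_new):
    """Use the trapdoor x to find r_new with H(m_new, r_new) == H(m, r)."""
    x_inv = pow(x, -1, q)           # modular inverse (Python 3.8+)
    return (r + (m - m_new) * x_inv) % q

m, r, m_new = 42, 17, 99
r_new = collide(m, r, m_new)
print(chameleon_hash(m, r) == chameleon_hash(m_new, r_new))   # True
```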

18.
The motion of an object (such as a wheel rotating) is seen as consistent, independent of its position and size on the retina. Neurons in higher cortical visual areas respond invariantly to these global motion stimuli, but neurons in early cortical areas with small receptive fields cannot represent this motion, not only because of the aperture problem but also because they do not have invariant representations. In a hypothesis that unifies the dorsal stream with the design of the ventral cortical visual system, we propose that the dorsal visual system uses a hierarchical feedforward network architecture (V1, V2, MT, MSTd, parietal cortex) in which the connections are trained with a short-term memory trace associative synaptic modification rule to capture what is invariant at each stage. Simulations show that the proposal is computationally feasible, in that invariant representations of the motion flow fields produced by objects self-organize in the later layers of the architecture. The model produces invariant representations of the motion flow fields produced by global in-plane motion of an object, in-plane rotational motion, looming versus receding of the object, and object-based rotation about a principal axis. Thus, the dorsal and ventral visual systems may share some similar computational principles.
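The short-term memory trace rule at the heart of the proposal can be written down generically; the sketch below is a hedged, simplified single-neuron version (parameter values, normalization and toy stimuli are assumptions, not the published model).

```python
import numpy as np

def trace_rule_train(w, stimulus_sequence, eta=0.6, lr=0.05):
    """Associative learning with a temporal trace of the post-synaptic rate:
        y_t    = w . x_t                               (firing rate)
        ybar_t = (1 - eta) * y_t + eta * ybar_{t-1}    (memory trace)
        w     += lr * ybar_t * x_t                     (trace-modulated Hebbian update)
    Because the trace spans successive transformed views of the same object,
    the weights tend to respond similarly across the whole sequence."""
    ybar = 0.0
    for x in stimulus_sequence:
        y = float(w @ x)
        ybar = (1.0 - eta) * y + eta * ybar
        w = w + lr * ybar * x
        w = w / np.linalg.norm(w)                      # keep the weight norm bounded
    return w

# Toy usage: repeated sweeps over four slightly shifted views of one pattern.
rng = np.random.default_rng(5)
base = np.zeros(10)
base[3:6] = 1.0
views = [np.roll(base, s) + rng.normal(0.0, 0.01, 10) for s in range(4)]
w = rng.normal(0.0, 0.1, 10)
w = trace_rule_train(w, views * 20)
print([round(float(w @ v), 2) for v in views])         # responses to each of the four views
```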

19.
Most scientific publication information, which may reflect scientists' research interests, is publicly available on the Web. Understanding the characteristics of research interests from previous publications may help to provide better services for scientists in the Web age. In this paper, we introduce some parameters to track the evolution of research interests and analyze their structural and dynamic characteristics. Based on the observed characteristics of research interests, and under the framework of unifying search and reasoning (ReaSearch), we propose interests-based unification of search and reasoning (I-ReaSearch). Using the proposed I-ReaSearch method, we illustrate how research interests can be used to improve literature search on the Web. Based on the relationship between an author's own interests and his/her co-authors' interests, social group interests are also used to refine the literature search process. Evaluation from both the user-satisfaction and the scalability points of view shows that the proposed I-ReaSearch method provides a user-centered and practical approach to problem solving on the Web. These efforts provide hints and methods to support personalized search, and can be considered a step toward user-centric knowledge retrieval on the Web. From the standpoint of Active Media Technology (AMT) on the Wisdom Web, the study of the characteristics of research interests in this paper is based on complex networks and human dynamics, which can be considered an effort towards utilizing information physics to discover and explain phenomena related to the research interests of scientists. The application of research interests aims at providing scientific researchers with the best means and ends, in an active way, for literature search on the Web.

20.
Keil MS. Neural Computation, 2006, 18(4): 871-903.
Recent evidence suggests that the primate visual system generates representations for object surfaces (where we consider representations for the surface attribute brightness). Object recognition can be expected to perform robustly if those representations are invariant despite environmental changes (e.g., in illumination). In real-world scenes, however, surfaces are often overlaid by luminance gradients, which we define as smooth variations in intensity. Luminance gradients encode highly variable information, which may represent surface properties (curvature), nonsurface properties (e.g., specular highlights, cast shadows, illumination inhomogeneities), or information about depth relationships (cast shadows, blur). We argue, on the grounds of the unpredictable nature of luminance gradients, that the visual system should establish corresponding representations in addition to surface representations. We accordingly present a neuronal architecture, the so-called gradient system, which clarifies how spatially accurate gradient representations can be obtained by relying only on high-resolution retinal responses. Although the gradient system was designed and optimized for segregating, and generating representations of, luminance gradients in real-world luminance images, it is capable of quantitatively predicting psychophysical data on both Mach bands and Chevreul's illusion. It furthermore accounts qualitatively for a modified Ehrenstein disk.
