Similar Literature
20 similar records retrieved (search time: 15 ms).
1.
A one-sample statistic is derived for the analysis of repeated measures designs when the data are multivariate normal and the dimension, d, can be large compared to the sample size, n, i.e. d > n. Quadratic and bilinear forms are used to define the statistic based on Box's approximation [Box, G.E.P., 1954. Some theorems on quadratic forms applied in the study of analysis of variance problems I: Effect of inequality of variance in the one-way classification. Annals of Mathematical Statistics 25 (2), 290-302]. The statistic closely follows its approximating target distribution, even for moderately large n. One of the main advantages of the statistic is that it can be used for both unstructured and factorially structured repeated measures designs. In the asymptotic derivations, it is assumed that n → ∞ while d remains finite and fixed. However, it is demonstrated through simulations that for n as small as 10, the new statistic very closely approximates the target distribution, unaffected even by large values of the dimension d. The application is illustrated using a sleep lab example.

2.
The several-sample case of the so-called nonparametric Behrens-Fisher problem in repeated measures designs is considered. That is, even under the null hypothesis, the marginal distribution functions in the different groups may have different shapes and are not assumed to be equal. Moreover, the continuity of the marginal distribution functions is not required, so that data with ties and, in particular, ordered categorical data are covered by this model. A multiple relative treatment effect is defined, which can be estimated using the mid-ranks of the observations within pairwise samples. The asymptotic distribution of this estimator is derived, along with a consistent estimator of its asymptotic covariance matrix. In addition, a multiple contrast test and related simultaneous confidence intervals for the relative marginal effects are derived and compared to rank-based Wald-type and ANOVA-type statistics. Simulations show that the ANOVA-type statistic and the multiple contrast test maintain the pre-assigned level of the test quite accurately (even for rather small sample sizes), while the Wald-type statistic leads, as expected, to somewhat liberal decisions. Regarding power, none of the statistics is uniformly superior. A real data set illustrates the application.
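To make the mid-rank idea concrete, here is a minimal Python sketch of the unpaired two-sample special case of a relative treatment effect, p = P(X < Y) + 0.5·P(X = Y); it is a simplified illustration only, not the paper's multi-sample repeated-measures estimator, and the function names are ours.

```python
import numpy as np
from scipy.stats import rankdata

def relative_effect(x, y):
    """Estimate p = P(X < Y) + 0.5 * P(X = Y) from mid-ranks.

    x, y: 1-D arrays from the two samples; ties are handled by mid-ranks.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    n1, n2 = len(x), len(y)
    # Mid-ranks of all observations in the pooled sample.
    ranks = rankdata(np.concatenate([x, y]), method="average")
    # Mean mid-rank of the second sample within the pooled sample.
    mean_rank_y = ranks[n1:].mean()
    # Standard estimator of the relative effect, lies in [0, 1].
    return (mean_rank_y - (n2 + 1) / 2.0) / n1

# Example: y tends to be larger than x, so the estimate exceeds 0.5.
rng = np.random.default_rng(0)
print(relative_effect(rng.normal(0, 1, 15), rng.normal(0.8, 2, 12)))
```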

3.
In this paper, we present a computer program written in version 9.1 of SAS's Interactive Matrix Language (SAS/IML) that implements a new approach for analyzing repeated measures data. Previous studies reported that the new procedure is as powerful as conventional solutions and generally more robust (i.e., insensitive) to violations of the assumptions that underlie those solutions. The program also includes a step-wise procedure based on the Bonferroni inequality for testing comparisons among the repeated measurements. Both univariate and multivariate repeated measures data can be analyzed. Finally, the application of the SAS/IML program is illustrated with a numeric example.
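As a rough illustration of what such step-wise Bonferroni comparisons among repeated measurements can look like, the following Python sketch applies a Holm-type step-down Bonferroni procedure to all pairwise paired t-tests; it is our own generic example, not the SAS/IML program described above.

```python
import itertools
import numpy as np
from scipy.stats import ttest_rel

def holm_bonferroni_pairwise(data, alpha=0.05):
    """Step-down Bonferroni (Holm) tests of all pairwise comparisons
    among repeated measurements.

    data: (n_subjects, n_timepoints) array of repeated measures.
    Returns a list of (pair, p_value, rejected) tuples.
    """
    n, t = data.shape
    pairs = list(itertools.combinations(range(t), 2))
    pvals = [ttest_rel(data[:, i], data[:, j]).pvalue for i, j in pairs]
    order = np.argsort(pvals)
    results, reject = [None] * len(pairs), True
    for step, idx in enumerate(order):
        # Step-down: the k-th smallest p-value is compared with alpha/(m - k).
        threshold = alpha / (len(pairs) - step)
        reject = reject and (pvals[idx] <= threshold)
        results[idx] = (pairs[idx], pvals[idx], reject)
    return results

# Example with simulated data for 12 subjects and 4 repeated measurements.
rng = np.random.default_rng(1)
y = rng.normal(0, 1, (12, 4)) + np.array([0.0, 0.2, 0.8, 1.0])
for pair, p, rej in holm_bonferroni_pairwise(y):
    print(pair, round(p, 4), "reject" if rej else "retain")
```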

4.
5.
In this paper, we study the problem of anomaly detection in wireless network streams. We have developed a new technique, called Stream Projected Outlier deTector (SPOT), to deal with anomaly detection in multi-dimensional or high-dimensional data streams. We conduct a detailed case study of SPOT by deploying it for anomaly detection on a real-life wireless network data stream. Since this wireless network data stream is unlabeled, a validation method is proposed to generate ground-truth results for performance evaluation in this case study. Extensive experiments are conducted, and the results demonstrate that SPOT is effective in detecting anomalies in wireless network data streams and outperforms existing anomaly detection methods.

6.
We prove that if R is a one-dimensional ring, then each basic submodule of a free R-module E contains a rank-one projective summand of E. As it is known that this property implies the pole assignability property for rings, it follows that any one-dimensional ring has the pole assignability property. Applications to systems over rings are briefly explained. The authors were partially supported by NSF grants.

7.
This paper focuses on the simulation of stochastic systems, the main tool being the Monte Carlo simulation method. Taking the stochastic logistic equation as an example, we simulate sample trajectories with the Euler scheme and approximate the invariant probability distribution of the stochastic differential equation with the Monte Carlo method. We also compare the simulation results with the analytical results for the autonomous stochastic logistic model. Moreover, the stochastic logistic equation with Markovian switching, described by a Markov chain taking values in a finite state space, is considered.
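A minimal Python sketch of the Euler (Euler-Maruyama) scheme for a stochastic logistic equation is shown below; the particular parameterization dX_t = X_t(r − aX_t)dt + σX_t dW_t and all parameter values are our own assumptions for illustration, not necessarily the paper's exact model.

```python
import numpy as np

def euler_maruyama_logistic(x0, r, a, sigma, T, n_steps, seed=0):
    """Simulate one sample path of the stochastic logistic equation
        dX_t = X_t (r - a X_t) dt + sigma X_t dW_t
    with the Euler-Maruyama scheme (assumed parameterization)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))           # Brownian increment
        drift = x[k] * (r - a * x[k]) * dt          # deterministic part
        diffusion = sigma * x[k] * dw               # stochastic part
        x[k + 1] = max(x[k] + drift + diffusion, 0) # keep the path nonnegative
    return x

# Monte Carlo estimate of the long-run (invariant) distribution:
# simulate many paths and summarize the terminal values.
terminal = np.array([euler_maruyama_logistic(0.5, 1.0, 1.0, 0.3, 50.0, 5000, s)[-1]
                     for s in range(200)])
print("sample mean of X_T:", terminal.mean())
```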

8.
People process information at different levels of abstraction (e.g., talking about a topic in general terms and then going into the details). They move from one level to another but focus on a particular level at any specific moment. We see this behavior in the most common of tasks, such as solving problems, communicating and designing. This paper explores the implications of levels of abstraction for designing interactive systems. It demonstrates the idea by showing the feasibility and desirability of building a simple e-mail system based on the idea of levels of abstraction and testing its usability.

We believe the implications of levels of abstraction for design are profound as regards the design of interactive systems that support dynamic behavior. Having shown the feasibility of some basic design implications, we call for empirical studies to test their usability and explore more advanced design implications.

9.
In applications such as post-production and archiving of audiovisual material, users are confronted with large amounts of redundant, unedited raw material, called rushes. Viewing and organizing this material are crucial but time-consuming tasks. Typically, multiple but slightly different takes of the same scene can be found in the rushes video. We propose a method for detecting and clustering takes of one scene shot from the same or very similar camera positions. An important subproblem is to determine the similarity of video segments. We propose a distance measure based on the Longest Common Subsequence (LCSS) model. Two variants of the proposed approach, one with a threshold parameter and one with an automatically determined threshold, are compared against the Dynamic Time Warping (DTW) distance measure on six videos from the TRECVID 2007 BBC rushes summarization data set. We also evaluate how the temporal segmentation method applied to the input influences the results. Applications of the proposed method to automatic skimming and interactive browsing of rushes video are described.
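As a generic illustration of an LCSS-based distance between two sequences of segment feature vectors (not the authors' exact formulation: the Euclidean matching test, the threshold eps, and the normalization are our assumptions):

```python
import numpy as np

def lcss_distance(a, b, eps):
    """LCSS-based distance between two sequences of feature vectors.

    a, b: arrays of shape (length, d); two elements "match" if their
    Euclidean distance is below eps (the threshold parameter).
    Returns 1 - LCSS / min(len(a), len(b)), so 0 means highly similar.
    """
    n, m = len(a), len(b)
    # dp[i, j] = length of the LCSS of a[:i] and b[:j].
    dp = np.zeros((n + 1, m + 1), dtype=int)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if np.linalg.norm(a[i - 1] - b[j - 1]) < eps:
                dp[i, j] = dp[i - 1, j - 1] + 1
            else:
                dp[i, j] = max(dp[i - 1, j], dp[i, j - 1])
    return 1.0 - dp[n, m] / min(n, m)

# Example: two takes of the same scene with small frame-level differences.
rng = np.random.default_rng(2)
take1 = rng.normal(size=(40, 8))
take2 = take1 + rng.normal(scale=0.05, size=take1.shape)
print(lcss_distance(take1, take2, eps=0.5))                      # close to 0
print(lcss_distance(take1, rng.normal(size=(40, 8)), eps=0.5))   # close to 1
```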

10.
The TV-tree: An index structure for high-dimensional data
We propose a file structure to index high-dimensionality data, which are typically points in some feature space. The idea is to use only a few of the features, using additional features only when the additional discriminatory power is absolutely necessary. We present in detail the design of our tree structure and the associated algorithms that handle such varying-length feature vectors. Finally, we report simulation results comparing the proposed structure with the R*-tree, which is one of the most successful methods for low-dimensionality spaces. The results illustrate the superiority of our method, which saves up to 80% in disk accesses.

11.
A framework is presented for evaluating methods of testing programmable logic arrays (PLAs), and the attributes of 25 test design methodologies are tabulated. PLA testing problems are first examined, and several test-generation algorithms are briefly described. Techniques for achieving testable designs are then examined, namely special coding, parity checking, signature analysis, divide and conquer, and fully testable PLAs. The attributes that make a good testable design are then discussed. They fall into four categories: (1) testability characteristics; (2) effect on the original design; (3) requirements of the application environment; and (4) design costs, i.e. how difficult it is to implement the technique.

12.
The self-organizing map (SOM) is a very popular unsupervised neural-network model for the analysis of high-dimensional input data as in data mining applications. However, at least two limitations have to be noted, which are related to the static architecture of this model as well as to the limited capabilities for the representation of hierarchical relations of the data. With our novel growing hierarchical SOM (GHSOM) we address both limitations. The GHSOM is an artificial neural-network model with hierarchical architecture composed of independent growing SOMs. The motivation was to provide a model that adapts its architecture during its unsupervised training process according to the particular requirements of the input data. Furthermore, by providing a global orientation of the independently growing maps in the individual layers of the hierarchy, navigation across branches is facilitated. The benefits of this novel neural network are a problem-dependent architecture and the intuitive representation of hierarchical relations in the data. This is especially appealing in explorative data mining applications, allowing the inherent structure of the data to unfold in a highly intuitive fashion.

13.
HD-Eye: visual mining of high-dimensional data
Clustering in high-dimensional databases is an important problem, and a number of different clustering algorithms can be applied to high-dimensional data. The authors consider how an advanced clustering algorithm, combined with new visualization methods, can cluster data more effectively and interactively. Experiments show these techniques improve the data mining process.

14.
In this paper, we introduce a method for the identification of fuzzy measures from sample data. It is implemented using genetic algorithms and is flexible enough to allow the use of different subfamilies of fuzzy measures for the learning, such as k-additive or p-symmetric measures. The experiments performed to test the algorithm suggest that it is robust in situations where there is noise in the considered data. We also explore some possibilities for the choice of the initial population; this leads to the study of the extremes of some subfamilies of fuzzy measures, as well as to the proposal of a method for the random generation of fuzzy measures.

15.
Ensemble classification is a well-established approach that involves fusing the decisions of multiple predictive models. A similar “ensemble logic” has recently been applied to challenging feature selection tasks aimed at identifying the most informative variables (or features) for a given domain of interest. In this work, we discuss the rationale of ensemble feature selection and evaluate the effects and implications of a specific ensemble approach, namely the data perturbation strategy. Basically, it consists in combining multiple selectors that exploit the same core algorithm but are trained on different perturbed versions of the original data. The real potential of this approach, still an object of debate in the feature selection literature, is investigated here in conjunction with different kinds of core selection algorithms (both univariate and multivariate). In particular, we evaluate the extent to which the ensemble implementation improves the overall performance of the selection process, in terms of predictive accuracy and stability (i.e., robustness with respect to changes in the training data). Furthermore, we measure the impact of the ensemble approach on the final selection outcome, i.e. on the composition of the selected feature subsets. The results obtained on ten public genomic benchmarks provide useful insight into both the benefits and the limitations of such an ensemble approach, paving the way to the exploration of new and wider ensemble schemes.
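A minimal sketch of the data perturbation strategy, under our own assumptions (bootstrap resampling as the perturbation, a univariate F-score as the core selector, rank-sum aggregation), might look as follows in Python:

```python
import numpy as np
from sklearn.feature_selection import f_classif

def ensemble_rank_features(X, y, n_perturbations=20, seed=0):
    """Rank features by aggregating univariate F-scores computed on
    bootstrap-perturbed versions of the training data."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    rank_sum = np.zeros(d)
    for _ in range(n_perturbations):
        idx = rng.integers(0, n, size=n)           # bootstrap resample
        scores, _ = f_classif(X[idx], y[idx])      # core univariate selector
        # Higher score -> better rank (0 is best); accumulate ranks.
        rank_sum += np.argsort(np.argsort(-scores))
    return np.argsort(rank_sum)                    # features, best first

# Example on synthetic data where only the first 5 features are informative.
rng = np.random.default_rng(3)
y = rng.integers(0, 2, 200)
X = rng.normal(size=(200, 50))
X[:, :5] += y[:, None] * 1.5
print(ensemble_rank_features(X, y)[:10])
```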

16.
Knowledge discovery in high-dimensional data is a challenging enterprise, but new visual analytic tools appear to offer users remarkable powers if they are ready to learn new concepts and interfaces. Our three-year effort to develop versions of the hierarchical clustering explorer (HCE) began with building an interactive tool for exploring clustering results. It expanded, based on user needs, to include other potent analytic and visualization tools for multivariate data, especially the rank-by-feature framework. Our own successes using HCE provided some testimonial evidence of its utility, but we felt it necessary to get beyond our subjective impressions. This paper presents an evaluation of the hierarchical clustering explorer (HCE) using three case studies and an e-mail user survey (n=57) to focus on skill acquisition with the novel concepts and interface for the rank-by-feature framework. Knowledgeable and motivated users in diverse fields provided multiple perspectives that refined our understanding of strengths and weaknesses. A user survey confirmed the benefits of HCE, but gave less guidance about improvements. Both evaluations suggested improved training methods.

17.
Over the last decade, there has been rapid growth in the generation and analysis of genomics data. Although existing data analysis methods are capable of handling particular problems, no single method can be guaranteed to solve problems of every nature. Therefore, there is always scope for a new algorithm to solve a problem that cannot be efficiently solved by existing algorithms. In the present work, a novel hybrid approach is proposed based on an improved version of a recently developed bio-inspired optimization technique, namely the salp swarm algorithm (SSA), for microarray classification. Initially, the Fisher score filter is employed to pre-select a subset of relevant genes from the original high-dimensional microarray dataset. Later, a weighted-chaotic SSA (WCSSA) is proposed for the simultaneous optimal gene selection and parameter optimization of the kernel extreme learning machine (KELM) classifier. The proposed scheme is evaluated on both binary-class and multi-class microarray datasets. An extensive comparison is performed against the original SSA-KELM, particle swarm optimized KELM (PSO-KELM), and genetic algorithm KELM (GA-KELM). Lastly, the proposed method is also compared against the results of sixteen existing techniques to emphasize its capacity and competitiveness, successfully reducing the number of original genes by more than 98%. The experimental results show that the genes selected by the proposed method yield higher classification accuracy than the alternative techniques. The performance of the proposed scheme demonstrates its effectiveness in terms of the number of selected genes (NSG), accuracy, sensitivity, specificity, Matthews correlation coefficient (MCC), and F-measure. The proposed WCSSA-KELM method is validated using ten-fold cross-validation.

18.
In this paper we present a simulation-based analysis of a service robot operating in a textile mill. The robot's function is to locate and service the operating units which require servicing. The main purpose of the analysis is to determine the “best” movement decision for the robot in each instance. We present a simple simulation experiment which clearly illustrates that an increase in production output can be realized with the help of modestly sophisticated decision rules for the robot's movement.

19.
20.
The problem of finding nearest neighbors has emerged as an important foundation of feature-based similarity search in multimedia databases. Most spatial index structures based on the R-tree have failed to efficiently support nearest neighbor search in arbitrarily distributed high-dimensional data sets. In contrast, the so-called filtering principle as represented by the popular VA-file has turned out to be a more promising approach. Query processing is based on a flat file of compact vector approximations. In a first stage, those approximations are sequentially scanned and filtered so that in a second stage the nearest neighbors can be determined from a relatively small fraction of the data set.

In this paper, we propose the Active Vertice method as a novel filtering approach. As opposed to the VA-file, approximation regions are arranged in a quad-tree-like structure. High-dimensional feature vectors are assigned to ellipsoidal approximation regions on different levels of the tree. A compact approximation of a vector corresponds to the path within the index from the root to the respective tree node. When compared to the VA-file, our method enhances the discriminatory power of the approximations while maintaining their compactness in terms of storage consumption. To demonstrate its effectiveness, we conduct extensive experiments with synthetic as well as real-life data and show the superiority of our method over existing filtering approaches.
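The two-stage filtering principle described above can be illustrated with a minimal VA-file-style sketch in Python; the equi-width scalar quantization and the Euclidean lower bounds are our own simplifications and do not reproduce the Active Vertice index itself.

```python
import numpy as np

def build_approximations(data, bits=4):
    """Quantize each dimension into 2**bits cells (a flat VA-file-style
    approximation; grid boundaries are equi-width for simplicity)."""
    lo, hi = data.min(0), data.max(0)
    width = (hi - lo + 1e-12) / (2 ** bits)
    cells = np.clip(((data - lo) / width).astype(int), 0, 2 ** bits - 1)
    return cells, lo, width

def nearest_neighbor(query, data, cells, lo, width):
    """Two-stage search: filter with per-cell lower bounds, then refine."""
    # Stage 1: lower bound of the distance from the query to each cell box.
    cell_lo = lo + cells * width                   # lower corner of each cell
    cell_hi = cell_lo + width                      # upper corner of each cell
    gap = np.maximum(np.maximum(cell_lo - query, query - cell_hi), 0.0)
    lower_bounds = np.linalg.norm(gap, axis=1)
    # Stage 2: visit candidates in order of their lower bounds and refine.
    order = np.argsort(lower_bounds)
    best, best_dist = -1, np.inf
    for i in order:
        if lower_bounds[i] >= best_dist:
            break                                  # remaining candidates pruned
        d = np.linalg.norm(data[i] - query)        # exact distance refinement
        if d < best_dist:
            best, best_dist = i, d
    return best, best_dist

rng = np.random.default_rng(4)
X = rng.normal(size=(10000, 16))
cells, lo, width = build_approximations(X)
q = rng.normal(size=16)
print(nearest_neighbor(q, X, cells, lo, width))
```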

