Similar literature
20 similar documents found.
1.
Spectral Curvature Clustering (SCC)   Total citations: 1 (self-citations: 0, citations by others: 1)
This paper presents novel techniques for improving the performance of a multi-way spectral clustering framework (Govindu in Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), vol. 1, pp. 1150–1157, 2005; Chen and Lerman, 2007, preprint in the supplementary webpage) for segmenting affine subspaces. Specifically, it suggests an iterative sampling procedure to improve the uniform sampling strategy, an automatic scheme for inferring the tuning parameter from the data, a precise initialization procedure for K-means, and a simple strategy for isolating outliers. The resulting algorithm, Spectral Curvature Clustering (SCC), requires only linear storage and takes linear running time in the size of the data. It is supported by theory which both justifies its successful performance and guides our practical choices. We compare it with other existing methods on a few artificial instances of affine subspaces. Application of the algorithm to several real-world problems is also discussed. This work was supported by NSF grant #0612608. Supplementary webpage: .

2.
This paper details a new approach for learning a discriminative model of object classes, incorporating texture, layout, and context information efficiently. The learned model is used for automatic visual understanding and semantic segmentation of photographs. Our discriminative model exploits texture-layout filters, novel features based on textons, which jointly model patterns of texture and their spatial layout. Unary classification and feature selection are achieved using shared boosting to give an efficient classifier which can be applied to a large number of classes. Accurate image segmentation is achieved by incorporating the unary classifier in a conditional random field, which (i) captures the spatial interactions between class labels of neighboring pixels, and (ii) improves the segmentation of specific object instances. Efficient training of the model on large datasets is achieved by exploiting both random feature selection and piecewise training methods. High classification and segmentation accuracy is demonstrated on four varied databases: (i) the MSRC 21-class database containing photographs of real objects viewed under general lighting conditions, poses and viewpoints, (ii) the 7-class Corel subset and (iii) the 7-class Sowerby database used in He et al. (Proceeding of IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 695–702, June 2004), and (iv) a set of video sequences of television shows. The proposed algorithm gives competitive and visually pleasing results for objects that are highly textured (grass, trees, etc.), highly structured (cars, faces, bicycles, airplanes, etc.), and even articulated (body, cow, etc.). J. Shotton is now working at Toshiba Corporate Research & Development Center, Kawasaki, Japan.

3.
In this paper we present a hierarchical and contextual model for aerial image understanding. Our model organizes objects (cars, roofs, roads, trees, parking lots) in aerial scenes into hierarchical groups whose appearances and configurations are determined by statistical constraints (e.g. relative position, relative scale, etc.). Our hierarchy is a non-recursive grammar for objects in aerial images comprised of layers of nodes that can each decompose into a number of different configurations. This allows us to generate and recognize a vast number of scenes with relatively few rules. We present a minimax entropy framework for learning the statistical constraints between objects and show that this learned context allows us to rule out unlikely scene configurations and hallucinate undetected objects during inference. A similar algorithm was proposed for texture synthesis (Zhu et al. in Int. J. Comput. Vis. 2:107–126, 1998) but didn’t incorporate hierarchical information. We use a range of different bottom-up detectors (AdaBoost, TextonBoost, Compositional Boosting (Freund and Schapire in J. Comput. Syst. Sci. 55, 1997; Shotton et al. in Proceedings of the European Conference on Computer Vision, pp. 1–15, 2006; Wu et al. in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8, 2007)) to propose locations of objects in new aerial images and employ a cluster sampling algorithm (C4 (Porway and Zhu, 2009)) to choose the subset of detections that best explains the image according to our learned prior model. The C4 algorithm can quickly and efficiently switch between alternate competing sub-solutions, for example whether an image patch is better explained by a parking lot with cars or by a building with vents. We also show that our model can predict the locations of objects our detectors missed. 
We conclude by presenting parsed aerial images and experimental results showing that our cluster sampling and top-down prediction algorithms use the learned contextual cues from our model to improve detection results over traditional bottom-up detectors alone.

4.
This paper describes a novel structural approach to recognizing human facial features for emotion recognition. Conventionally, features extracted from facial images are represented by relatively poor representations, such as arrays or sequences, with a static data structure. In this study, we propose to extract facial expression feature vectors as Localized Gabor Features (LGF) and then transform these feature vectors into a FacE Emotion Tree Structures (FEETS) representation. It is an extension of the Human Face Tree Structures (HFTS) representation presented in (Cho and Wong in Lecture notes in computer science, pp 1245–1254, 2005). This facial representation is able to simulate how humans perceive a real human face, and both the entities and their relationships can contribute to the facial expression features. Moreover, a new structural connectionist architecture based on a probabilistic approach to adaptive processing of data structures is presented. The so-called probabilistic based recursive neural network (PRNN) model, extended from Frasconi et al. (IEEE Trans Neural Netw 9:768–785, 1998), is developed to train and recognize human emotions by generalizing the FEETS representation. For empirical studies, we benchmarked our emotion recognition approach against other well-known classifiers. Using public domain databases, such as the Japanese Female Facial Expression (JAFFE) database (Lyons et al. in IEEE Trans Pattern Anal Mach Intell 21(12):1357–1362, 1999; Lyons et al. in third IEEE international conference on automatic face and gesture recognition, 1998) and the Cohn–Kanade AU-Coded Facial Expression (CMU) Database (Cohn et al. in 7th European conference on facial expression measurement and meaning, 1997), our proposed system can obtain an accuracy of about 85–95% for subject-dependent and subject-independent conditions.
Moreover, in tests on images with artifacts, the proposed model shows a robust capability to perform facial emotion recognition.

5.
Many cells in the primary visual cortex respond differently when a stimulus is placed outside their classical receptive field (CRF) compared to the stimulus within the CRF alone, permitting integration of information at early levels in the visual processing stream that may play a key role in intermediate-level visual tasks, such as perceptual pop-out [Knierim JJ, van Essen DC (1992) J Neurophysiol 67(5):961–980; Nothdurft HC, Gallant JL, Essen DCV (1999) Visual Neurosci 16:15–34], contextual modulation [Levitt JB, Lund JS (1997) Nature 387:73–76; Das A, Gilbert CD (1999) Nature 399:655–661; Dragoi V, Sur M (2000) J Neurophysiol 83:1019–1030], and junction detection [Sillito AM, Grieve KL, Jones HE, Cudiero J, Davis J (1995) Nature 378:492–496; Das A, Gilbert CD (1999) Nature 399:655–661; Jones HE, Wang W, Sillito AM (2002) J Neurophysiol 88:2797–2808]. In this article, we construct a computational model of orientation contrast type cells in the programming environment TiViPE [Lourens T (2004) TiViPE—Tino’s visual programming environment. In: The 28th Annual International Computer Software & Applications Conference, IEEE COMPSAC 2004, pp 10–15] and demonstrate that the model closely resembles the functional behavior of the neuronal responses of non-orientation (within the CRF) sensitive 4Cβ cells [Jones HE, Wang W, Sillito AM (2002) J Neurophysiol 88:2797–2808]. We also give an explanation of the indirect information flow in V1 that accounts for the behavior of orientation contrast sensitivity. The computational model of orientation contrast cells demonstrates excitatory responses at edges near junctions that might facilitate junction detection, but the model does not reveal perceptual pop-out.

6.
Current technology allows the acquisition, transmission, storing, and manipulation of large collections of images. Content-based information retrieval is now a widely investigated issue that aims at allowing users of multimedia information systems to retrieve images coherent with a sample image. A way to achieve this goal is the automatic computation of features such as color, texture, and shape and the use of these features as query terms. Feature extraction is a crucial part of any such system. Current methods for feature extraction suffer from two main problems: firstly, many methods do not retain any spatial information, and secondly, the problem of invariance with respect to standard transformation is still unsolved. In this paper, we describe some results of a study on similarity evaluation in image retrieval using shape, texture, and color as content features. Images are retrieved based on similarity of features, where features of the query specification are compared with features of the image database to determine which images match similarly with given features. In this paper, we propose an effective method for image representation which utilizes fuzzy features. The text was submitted by the author in English. Ryszard S. Choraś is Professor of Computer Science in the Department of Telecommunications and EE of University of Technology and Agriculture, Bydgoszcz, Poland. He also holds a courtesy appointment with the Faculty of Mathematics, Technology, and Natural Sciences of Kazimierz Wielki University, Bydgoszcz and the College of Computer Science, Lódz, Poland. His research interests include image signal compression and coding, computer vision, and multimedia data transmission. He received his M.S. degree in Electrical Engineering from Electronics from the Technical University of Wroclaw, Poland in 1973, and his Ph.D. degree in Electronics from Technical University of Wroclaw, Poland, in 1980, and D.Sc. 
(Habilitation degree) in Computer Science from Warsaw Technical University, Poland, in 1993. From 1973 to 1976 he was a member of the research staff at the Institute of Mathematical Machines Silesian Division, Gliwice, working on graphics hardware and human visual perception. In 1976, he joined University of Technology and Agriculture, Bydgoszcz, Poland, first as an Assistant, then as a Professor of Computer Science at the Department of Telecommunications and EE. From 1994 to 1996, he was also Professor of Computer Sciences of the Zielona Góra University, Poland. He has served as the Chairman of the Communication Switching Division and as Chief of the Image Processing and Recognition Group. From 1996 to 2002 he was the Vice Rector of University of Technology and Agriculture, Bydgoszcz. Prof. Choraś has expertise in EU Programs and National Programs, e.g., he was coordinator of EU Program CME-02060, EU Program on Continuous Education and Technology Transfer, and coordinator of national programs in IST and multimedia in e-learning. Prof. Choraś has authored two monographs, and over 130 book chapters, journal articles, and conference papers in the area of image processing. Professor Choraś is a member of the editorial boards of “Machine Vision and Graphics.” He is the editor-in-chief of “Image Processing and Communications Journal.” He has served on numerous conference committees, e.g., Visualization, Imaging, and Image Processing (VIIP), IASTED International Conference on Signal Processing, Pattern Recognition and Applications, International Conference on Computer Vision and Graphics, ICINCO International Conference on Informatics in Control, Automation and Robotics, ICETE International Conference on E-business and Telecommunication Networks, and CORES International Conference on Computer Recognition Systems, and many others. Prof. Choraś is a member of the IASTED, WSEAS, various Committees of the Polish Academy of Sciences, TPO.
When not working on academic ventures, Professor Choraś likes to relax with activities such as walking, tennis, and swimming.

7.
We propose a new model of restricted branching programs specific to solving GEN problems, which we call incremental branching programs. We show that syntactic incremental branching programs capture previously studied models of computation for the problem GEN, namely marking machines (Cook, S.A. in J. Comput. Syst. Sci. 9(3):308–316, 1974) and Poon’s extension (Proc. of the 34th IEEE Symp. on the Foundations of Computer Science, pp. 218–227, 1993) of jumping automata on graphs (Cook, S.A., Rackoff, C.W. in SIAM J. Comput. 9:636–652, 1980). We then prove exponential size lower bounds for our syntactic incremental model, and for some other variants of branching program computation for GEN. We further show that nondeterministic syntactic incremental branching programs are provably stronger than their deterministic counterpart when solving a natural NL-complete GEN sub-problem. It remains open if syntactic incremental branching programs are as powerful as unrestricted branching programs for GEN problems. A preliminary version of this paper appears as (Gál, A., Koucky, M., McKenzie, P., Incremental branching programs, in Proc. of the 2006 Computer Science in Russia Conference CSR06. Lecture Notes in Computer Science, vol. 3967, pp. 178–190, 2006). A. Gal supported in part by NSF Grant CCF-0430695 and an Alfred P. Sloan Research Fellowship. M. Koucky did part of this work while being a postdoctoral fellow at McGill University, Canada and at CWI, Amsterdam, Netherlands. Supported in part by NWO vici project 2004–2009, project No. 1M0021620808 of MŠMT ČR, grants 201/07/P276, 201/05/0124 of GA ČR, and Institutional Research Plan No. AV0Z10190503. P. McKenzie supported by the NSERC of Canada and the (Québec) FQRNT.

8.
In the article a certain class of feature extractors for face recognition is presented. The extraction is based on simple approaches: image scaling with pixel concatenation into a feature vector, selection of a small number of points from the face area, face image’s spectrum, and finally pixel intensities histogram. The experiments performed on several facial image databases (BioID [4], ORL face database [27], FERET [30]) show that face recognition using this class of extractors is particularly efficient and fast, and can have straightforward implementations in software and hardware systems. They can also be used in fast face recognition system involving feature-integration, as well as a tool for similar faces retrieval in 2-tier systems (as initial processing, before exact face recognition).
Paweł Forczmański
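Two of the extractors named in the abstract above (image scaling with pixel concatenation, and the pixel intensity histogram) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the function names `downscale_to_vector` and `intensity_histogram`, the block-averaging scheme, and the bin count are all hypothetical choices.

```python
from typing import List

Image = List[List[int]]  # grayscale image: rows of intensities in 0..255


def downscale_to_vector(img: Image, out_h: int, out_w: int) -> List[float]:
    """Block-average the image to out_h x out_w and concatenate the pixels.

    Assumes out_h <= image height and out_w <= image width, so that
    every averaging block contains at least one pixel.
    """
    h, w = len(img), len(img[0])
    features = []
    for i in range(out_h):
        r0, r1 = i * h // out_h, (i + 1) * h // out_h
        for j in range(out_w):
            c0, c1 = j * w // out_w, (j + 1) * w // out_w
            block = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            features.append(sum(block) / len(block))
    return features


def intensity_histogram(img: Image, bins: int = 8) -> List[float]:
    """Normalized histogram of pixel intensities (entries sum to 1)."""
    counts = [0] * bins
    total = 0
    for row in img:
        for p in row:
            counts[min(p * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


if __name__ == "__main__":
    face = [
        [0, 0, 255, 255],
        [0, 0, 255, 255],
        [10, 10, 20, 20],
        [10, 10, 20, 20],
    ]
    print(downscale_to_vector(face, 2, 2))
    print(intensity_histogram(face, bins=2))
```

Either vector can then be compared between faces with any distance measure; which measure the article uses is not stated in the abstract.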

9.
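Li Ming (李鸣), Guo Chenhao (郭晨皓), Chen Xing (陈星). Journal of Computer Applications (《计算机应用》), 2020, 40(6): 1593-1600
To address the difficulty developers face in quickly finding the model they need among a large number of models, an automatic annotation method for vision-oriented deep neural networks based on natural language processing is proposed. First, the domain categories of vision neural networks are defined, and keywords and their corresponding weights are computed from information such as term frequency. Second, a keyword extractor is built to extract keywords from paper abstracts. Finally, the similarity between the extracted keywords and the known weights is computed to obtain the application domain of the model. Experimental data were selected from papers published at three major international computer vision conferences: the International Conference on Computer Vision (ICCV), the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), and the European Conference on Computer Vision (ECCV). The experimental results show that the proposed method provides high-precision classification results with a macro-average of 0.89, verifying its effectiveness.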
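The three steps in this abstract (per-domain keyword weights, keyword extraction from an abstract, similarity scoring) can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the `DOMAIN_KEYWORDS` table and its weights are invented, relative term frequency stands in for the paper's keyword extractor, and cosine similarity stands in for its unspecified similarity computation.

```python
import math
import re
from collections import Counter
from typing import Dict

# Hypothetical per-domain keyword weights (invented for this sketch).
DOMAIN_KEYWORDS: Dict[str, Dict[str, float]] = {
    "object detection": {"detection": 0.6, "bounding": 0.3, "anchor": 0.1},
    "semantic segmentation": {"segmentation": 0.7, "pixel": 0.2, "mask": 0.1},
}


def extract_keywords(abstract: str) -> Dict[str, float]:
    """Relative term frequency of each lowercase word token."""
    tokens = re.findall(r"[a-z]+", abstract.lower())
    if not tokens:
        return {}
    counts = Counter(tokens)
    return {word: n / len(tokens) for word, n in counts.items()}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse weight vectors."""
    dot = sum(weight * b[word] for word, weight in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(abstract: str) -> str:
    """Return the domain whose keyword weights best match the abstract."""
    keywords = extract_keywords(abstract)
    return max(DOMAIN_KEYWORDS, key=lambda d: cosine(keywords, DOMAIN_KEYWORDS[d]))


if __name__ == "__main__":
    print(classify("We predict a pixel mask for segmentation of each image"))
```

A real system would need stop-word removal and a much larger keyword table; the sketch only shows how extracted keywords and known weights meet in a similarity score.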

10.
We investigate the role of sparsity and localized features in a biologically-inspired model of visual object classification. As in the model of Serre, Wolf, and Poggio, we first apply Gabor filters at all positions and scales; feature complexity and position/scale invariance are then built up by alternating template matching and max pooling operations. We refine the approach in several biologically plausible ways. Sparsity is increased by constraining the number of feature inputs, lateral inhibition, and feature selection. We also demonstrate the value of retaining some position and scale information above the intermediate feature level. Our final model is competitive with current computer vision algorithms on several standard datasets, including the Caltech 101 object categories and the UIUC car localization task. The results further the case for biologically-motivated approaches to object classification. This paper updates and extends an earlier presentation (Mutch and Lowe 2006) of this research in CVPR 2006. J. Mutch’s research described in this paper was carried out at the University of British Columbia.
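The alternation of template matching and max pooling mentioned above can be illustrated in one dimension. This is a simplified sketch, not the model from the paper: the negative squared distance used as the matching response and the non-overlapping pooling windows are assumptions, and the function names are hypothetical.

```python
from typing import List, Sequence


def template_responses(signal: Sequence[float], template: Sequence[float]) -> List[float]:
    """Slide the template over the signal.

    The response at each position is the negative squared distance
    between the template and the local patch (0 means a perfect match).
    """
    n, m = len(signal), len(template)
    return [
        -sum((signal[i + k] - template[k]) ** 2 for k in range(m))
        for i in range(n - m + 1)
    ]


def max_pool(responses: Sequence[float], window: int) -> List[float]:
    """Non-overlapping max pooling: keep the strongest response per window."""
    return [max(responses[i:i + window]) for i in range(0, len(responses), window)]


if __name__ == "__main__":
    template = [1, 2, 1]
    original = [0, 1, 2, 1, 0, 0]
    shifted = [0, 0, 1, 2, 1, 0]  # same pattern, moved one step right
    print(max_pool(template_responses(original, template), 4))
    print(max_pool(template_responses(shifted, template), 4))
```

Pooling over a window of positions yields the same output for the original and the shifted input, which is the position invariance that the pooling stage is meant to build up.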

11.
Models that capture the common structure of an object class appeared a few years ago in the literature (Jojic and Caspi in Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pp. 212–219, 2004; Winn and Jojic in Proceedings of International Conference on Computer Vision (ICCV), pp. 756–763, 2005); they are often referred to as “stel models.” Their main characteristic is to segment objects into clear, often semantic, parts as a consequence of the modeling constraint which forces the regions belonging to a single segment to have a tight distribution over local measurements, such as color or texture. This self-similarity within a region in a single image is typical of many meaningful image parts, even when across different images of similar objects, the corresponding parts may not have similar local measurements. Moreover, the segmentation itself is expected to be consistent within a class, although still flexible. These models have been applied mostly to segmentation scenarios. In this paper, we extend those ideas by (1) proposing to capture correlations that exist in structural elements of an image class due to global effects, (2) exploiting the segmentations to capture feature co-occurrences and (3) allowing the use of multiple, possibly sparse, observations of different natures. In this way we obtain richer models more suitable to recognition tasks. We accomplish these requirements using a novel approach we dubbed stel component analysis. Experimental results show the flexibility of the model as it can deal successfully with image/video segmentation and object recognition where, in particular, it can be used as an alternative to, or in conjunction with, bag-of-features and related classifiers, where stel inference provides a meaningful spatial partition of features.

12.
Computer vision and recognition are playing an increasingly important role in modern intelligent control. Object detection is the first and most important step in object recognition. Traditionally, a specific object can be recognized by the template matching method, but the recognition speed has always been a problem. In this article, an improved general genetic algorithm-based face recognition system is proposed. The genetic algorithm (GA) has been considered to be a robust and global searching method. Here, the chromosomes generated by GA contain the information needed to recognize the object. The purpose of this article is to propose a practical method of face detection and recognition. Finally, the experimental results, a comparison with the traditional template matching method, and some other considerations are also given. This work was presented in part at the 11th International Symposium on Artificial Life and Robotics, Oita, Japan, January 23–25, 2006.

13.
This paper describes a novel method for tracking complex non-rigid motions by learning the intrinsic object structure. The approach builds on and extends the studies on non-linear dimensionality reduction for object representation, object dynamics modeling and particle filter style tracking. First, the dimensionality reduction and density estimation algorithm is derived for unsupervised learning of the object's intrinsic representation, and the obtained non-rigid part of the object state reduces to as few as 2–3 dimensions. Secondly, the dynamical model is derived and trained based on this intrinsic representation. Thirdly, the learned intrinsic object structure is integrated into a particle filter style tracker. It is shown that this intrinsic object representation has some interesting properties, and that the dynamical model derived from it makes the particle filter style tracker more robust and reliable. Extensive experiments are done on the tracking of challenging non-rigid motions such as fish twisting with self-occlusion, large inter-frame lip motion and facial expressions with global head rotation. Quantitative results are given to make comparisons between the newly proposed tracker and the existing tracker. The proposed method also has the potential to solve other types of tracking problems.

14.
With the development of IT and the growing demands of users, a ubiquitous environment is emerging. Because individual identification is important in a ubiquitous environment, RFID technology is likely to be used frequently. RFID is a radio frequency identification technology intended to replace the bar code. The reader transmits a query (a request for user information) and the tag provides the user information. RFID has various advantages, such as high-speed identification and mass memory storage. However, eavesdropping is possible, and user information may be exposed (Juels et al. in Conference on Computer and Communications Security—ACM CCS, pp. 103–111, 2003; Ohkubo et al. in RFID Privacy Workshop 2003; Weis et al. in International Conference on Security in Pervasive Computing, pp. 201–212, 2003; Weis et al. in Cryptographic Hardware and Embedded Systems—CHES, pp. 454–469, 2002). Therefore, when an off-line customer visits a bank for banking services, the RNTS (RFID number ticket service) system provides both anonymity in customer identification and efficiency of banking service. In addition, the RNTS system protects the privacy of an off-line user visiting the bank and is an efficient method for offering service in order of arrival at the bank.

15.
In this paper, we present a new method for dealing with feature subset selection based on fuzzy entropy measures for handling classification problems. First, we discretize numeric features to construct the membership function of each fuzzy set of a feature. Then, we select the feature subset based on the proposed fuzzy entropy measure focusing on boundary samples. The proposed method can select relevant features to get higher average classification accuracy rates than the ones selected by the MIFS method (Battiti, R. in IEEE Trans. Neural Netw. 5(4):537–550, 1994), the FQI method (De, R.K., et al. in Neural Netw. 12(10):1429–1455, 1999), the OFEI method, Dong-and-Kothari’s method (Dong, M., Kothari, R. in Pattern Recognit. Lett. 24(9):1215–1225, 2003) and the OFFSS method (Tsang, E.C.C., et al. in IEEE Trans. Fuzzy Syst. 11(2):202–213, 2003).
Shyi-Ming Chen

16.
17.
Consonants in written Hindi often carry annotations indicating the nature of the following vowel, which is not written separately. When there is no explicit marking, schwa is the default vowel, but this vowel does not always emerge in a word’s pronunciation. In addition, morphological boundaries can block the deletion of inherent schwas. Previous implementations of schwa deletion in the domain of text-to-speech synthesis (Narasimhan et al., International Journal of Speech Technology, 7(4):319–333, 2004; Choudhury and Basu, Proceedings of the International Conference on Knowledge-Based Computer Systems, 343–353, 2002) delete schwa in phonetic environments that obey the phonotactic constraints of Hindi within word boundaries. Instead of using segmental contexts, in conjunction with a morphological analysis, to predict schwa deletion, we used an account of syllable structure and stress assignment for two- and three-syllable words (Beckman and Pierrehumbert, forthcoming) to predict the presence and absence of schwa in a corpus of phonetically transcribed Hindi. Our algorithm scored as high as 95% accuracy on the deletion of schwa from a small corpus of Hindi words.

18.
We study the on-line minimum weighted bipartite matching problem in arbitrary metric spaces. Here, n not necessarily disjoint points of a metric space M are given, and are to be matched on-line with n points of M revealed one by one. The cost of a matching is the sum of the distances of the matched points, and the goal is to find or approximate its minimum. The competitive ratio of the deterministic problem is known to be Θ(n), see (Kalyanasundaram, B., Pruhs, K. in J. Algorithms 14(3):478–488, 1993) and (Khuller, S., et al. in Theor. Comput. Sci. 127(2):255–267, 1994). It was conjectured in (Kalyanasundaram, B., Pruhs, K. in Lecture Notes in Computer Science, vol. 1442, pp. 268–280, 1998) that a randomized algorithm may perform better against an oblivious adversary, namely with an expected competitive ratio Θ(log n). We prove a slightly weaker result by showing an o(log³ n) upper bound on the expected competitive ratio. As an application the same upper bound holds for the notoriously hard fire station problem, where M is the real line, see (Fuchs, B., et al. in Electronic Notes in Discrete Mathematics, vol. 13, 2003) and (Koutsoupias, E., Nanavati, A. in Lecture Notes in Computer Science, vol. 2909, pp. 179–191, 2004). The authors were partially supported by OTKA grants T034475 and T049398.
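To make the problem setting and the notion of competitive ratio concrete, here is a small sketch on the real line. It is not the randomized algorithm of the paper: it uses the simple greedy rule (match each arriving point to the nearest free given point) as a deterministic baseline, and compares it with the offline optimum, which on the line is obtained by matching the two point sets in sorted order. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def greedy_online_matching(
    servers: Sequence[float], requests: Sequence[float]
) -> Tuple[List[Tuple[float, float]], float]:
    """Match each request, in arrival order, to the nearest unmatched server."""
    free = list(servers)
    pairs = []
    total = 0.0
    for r in requests:
        s = min(free, key=lambda x: abs(x - r))  # nearest free server
        free.remove(s)
        pairs.append((r, s))
        total += abs(r - s)
    return pairs, total


def offline_optimum_on_line(servers: Sequence[float], requests: Sequence[float]) -> float:
    """Cost of the optimal matching when all requests are known in advance.

    On the real line, pairing the i-th smallest server with the
    i-th smallest request minimizes the total distance.
    """
    return sum(abs(s - r) for s, r in zip(sorted(servers), sorted(requests)))


if __name__ == "__main__":
    servers = [0.0, 3.0]
    requests = [1.0, 0.0]  # revealed one by one, in this order
    _, online_cost = greedy_online_matching(servers, requests)
    optimal_cost = offline_optimum_on_line(servers, requests)
    print(online_cost, optimal_cost, online_cost / optimal_cost)
```

The ratio between the two costs on a worst-case input is what the competitive ratio measures; the abstract's bounds concern how that ratio grows with n.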

19.
A classic result known as the speed-up theorem in machine-independent complexity theory shows that there exist some computable functions that do not have best programs for them (Blum in J. ACM 14(2):322–336, 1967 and J. ACM 18(2):290–305, 1971). In this paper we lift this result into type-2 computations. Although the speed-up phenomenon is essentially inherited from type-1 computations, we observe that a direct application of the original proof to our type-2 speed-up theorem is problematic because the oracle queries can interfere with the speed of the programs and hence the cancellation strategy used in the original proof is no longer correct at type-2. We also argue that a type-2 analog of the operator speed-up theorem (Meyer and Fischer in J. Symb. Log. 37:55–68, 1972) does not hold, which suggests that this curious speed-up phenomenon disappears in higher-typed computations beyond type-2. The result of this paper adds one more piece of evidence to support the general type-2 complexity theory under the framework proposed in Li (Proceedings of the Third International Conference on Theoretical Computer Science, pp. 471–484, 2004 and Proceedings of Computability in Europe: Logical Approach to Computational Barriers, pp. 182–192, 2006) and Li and Royer (On type-2 complexity classes: Preliminary report, pp. 123–138, 2001) as a reasonable setup.

20.