Identifying discriminative features can effectively improve the performance of aerial scene classification. Deep convolutional neural networks (DCNNs) have been widely used in aerial scene classification for their ability to learn discriminative features. DCNN features can be made more discriminative by optimizing the training loss function and using transfer learning. To enhance the discriminative power of DCNN features, the loss functions of the pretrained models are improved by combining a softmax loss with a centre loss. To further improve performance, in this article we propose hybrid DCNN features for aerial scene classification. First, we train DCNN models with joint loss functions, transferring from pretrained deep DCNN models. Second, dense DCNN features are extracted, and discriminative hybrid features are created by linear concatenation. Finally, an ensemble extreme learning machine (EELM) classifier is adopted for classification because of its general superiority and low computational cost. Experimental results on three public benchmark datasets demonstrate that the hybrid features obtained with the proposed approach and classified by the EELM classifier yield remarkable performance.
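The joint objective sketched in the abstract above, a softmax loss combined with a centre loss, can be written in a few lines of plain Python. This is a minimal single-sample sketch; the function names and the weighting factor `lam` are illustrative assumptions, not taken from the paper.

```python
import math

def softmax_cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (numerically stabilised)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    return -math.log(exps[label] / sum(exps))

def center_loss(feature, center):
    """Half the squared distance between a feature and its class centre."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))

def joint_loss(logits, label, feature, centers, lam=0.1):
    """Softmax loss plus a lambda-weighted centre loss (the joint objective)."""
    return softmax_cross_entropy(logits, label) + lam * center_loss(feature, centers[label])
```

In training, the class centres themselves are updated alongside the network weights so that features of the same class cluster tightly.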
In some image classification tasks, the similarities among categories differ, and samples are often misclassified into highly similar categories. To distinguish highly similar categories, more specific features are required so that the classifier can improve its performance. In this paper, we propose a novel, simple, and effective two-level hierarchical feature learning framework based on the deep convolutional neural network (CNN). First, deep feature extractors at different levels are trained using transfer learning, fine-tuning a pre-trained deep CNN model on the new target dataset. Second, the general feature extracted from all categories and the specific feature extracted from highly similar categories are fused into a single feature vector, and this final feature representation is fed into a linear classifier. Finally, experiments on the Caltech-256, Oxford Flower-102, and Tasmania Coral Point Count (CPC) datasets demonstrate the expressive power of the deep features resulting from two-level hierarchical feature learning. Our proposed method effectively increases classification accuracy in comparison with flat multi-class classification methods.
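The fusion step described above, concatenating a general and a specific deep feature and feeding the result to a linear classifier, can be sketched as follows; the helper names and toy dimensions are our own, not from the paper.

```python
def fuse_features(general_feat, specific_feat):
    """Concatenate the general and the specific deep feature vectors."""
    return list(general_feat) + list(specific_feat)

def predict(weight_rows, biases, feat):
    """Linear classifier on the fused feature: argmax of the class scores."""
    scores = [sum(w * f for w, f in zip(row, feat)) + b
              for row, b in zip(weight_rows, biases)]
    return scores.index(max(scores))
```

In practice the weight rows and biases would come from training the linear classifier on fused feature vectors.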
Model classification is essential to the management and reuse of 3D CAD models. Manual classification is laborious and error-prone, while automatic classification methods remain scarce owing to the intrinsic complexity of 3D CAD models. In this paper, we propose an automatic 3D CAD model classification approach based on deep neural networks. Guided by prior knowledge of the CAD domain, features are first selected and extracted from 3D CAD models and then pre-processed into high-dimensional input vectors for category recognition. By analogy with the thinking process of engineers, a deep neural network classifier for 3D CAD models is constructed with the aid of deep learning techniques. To obtain an optimal solution, multiple strategies are appropriately chosen and applied in the training phase, enabling our classifier to achieve better performance. We demonstrate the efficiency and effectiveness of our approach through experiments on 3D CAD model datasets.
Conventional object recognition techniques rely heavily on manually annotated image datasets to achieve good performance. However, collecting high-quality datasets is laborious. Image search engines such as Google Images seem to provide large quantities of object images; unfortunately, a large portion of the retrieved images are irrelevant. In this paper, we propose a semi-supervised framework for learning visual categories from Google Images. We exploit a co-training algorithm, the CoBoost algorithm, and integrate it with two kinds of features, the 1st- and 2nd-order features, which define a bag-of-words representation and the spatial relationships between local features, respectively. We create two boosting classifiers based on the 1st- and 2nd-order features; during training, each classifier provides labels for the other. The 2nd-order features are generated dynamically rather than extracted exhaustively, avoiding heavy computation. An active learning technique is also introduced to further improve performance. Experimental results show that the object models learned from Google Images by our method are competitive with state-of-the-art unsupervised approaches and some supervised techniques on standard benchmark datasets.
A new deep neural network architecture, directed acyclic graph convolutional neural networks (DAG-CNNs), is used to classify heartbeats from electrocardiogram (ECG) signals into different subject-based classes. DAG-CNNs not only fuse the feature extraction and classification stages of ECG classification into a single automated learning procedure, but also exploit multi-scale features and perform score-level fusion of multiple classifiers automatically. DAG-CNNs therefore eliminate the need for hand-crafted features. Most current approaches use only the high-level features extracted by the last layer of a CNN. Instead of performing feature-level fusion manually and feeding the results into a classifier, the proposed multi-scale system automatically learns features at different levels, combines them, and predicts the output label. Results on the MIT-BIH arrhythmia benchmark database demonstrate that the proposed system achieves superior classification performance compared with most state-of-the-art methods.
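Score-level fusion of multiple classifiers, as performed automatically by DAG-CNNs, amounts to averaging the per-scale softmax outputs and taking the argmax; a plain-Python sketch, where the per-scale logits are invented toy values rather than outputs of an actual network:

```python
import math

def softmax(logits):
    """Convert one logit vector to class probabilities."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def fuse_scores(per_scale_logits):
    """Average the softmax outputs of every scale, then take the argmax."""
    probs = [softmax(l) for l in per_scale_logits]
    n_classes = len(probs[0])
    avg = [sum(p[i] for p in probs) / len(probs) for i in range(n_classes)]
    return avg.index(max(avg))
```

Averaging probabilities rather than raw logits keeps each scale's vote on a comparable footing.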
Ensemble methods aim to combine multiple learning machines to improve efficacy in a learning task in terms of prediction accuracy, scalability, and other measures. These methods have been applied to evolutionary machine learning techniques, including learning classifier systems (LCSs). In this article, we first propose a conceptual framework that allows us to appropriately categorize ensemble-based methods for fair comparison and that highlights gaps in the corresponding literature. The framework is generic and consists of three sequential stages: a pre-gate stage concerned with data preparation; a member stage accounting for the types of learning machines used to build the ensemble; and a post-gate stage concerned with methods for combining ensemble output. A taxonomy of LCS-based ensembles is then presented using this framework. The article then focuses on comparing LCS ensembles that use feature selection in the pre-gate stage. An evaluation methodology is proposed to systematically analyze the performance of these methods. Specifically, random feature sampling and rough set feature selection-based LCS ensemble methods are compared. Experimental results show that the rough set-based approach performs significantly better than the random subspace method in terms of classification accuracy on problems with many irrelevant features. The performance of the two approaches is comparable on problems with many redundant features.
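The pre-gate/member/post-gate pipeline can be illustrated with the random subspace baseline mentioned above: each member is built on a random feature subset (pre-gate), and the ensemble output is combined by majority vote (post-gate). A minimal stdlib-only sketch; the rough-set variant would replace the random sampling with reduct-based subsets.

```python
import random
from collections import Counter

def random_subspaces(n_features, subset_size, n_members, seed=0):
    """Pre-gate stage: draw one random feature subset per ensemble member."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subset_size))
            for _ in range(n_members)]

def majority_vote(member_predictions):
    """Post-gate stage: combine member outputs by majority vote."""
    return Counter(member_predictions).most_common(1)[0][0]
```

The member stage, training one LCS per subset, sits between these two calls and is omitted here.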
Sentiment analysis focuses on identifying and classifying the sentiments expressed in text messages and reviews. Social networks such as Twitter, Facebook, and Instagram generate heaps of data filled with sentiments, and analysing such data is very fruitful for improving the quality of both products and services. Classic machine learning techniques have a limited capability to efficiently analyze such large amounts of data and produce precise results; they are thus supported by deep learning models to achieve higher accuracy. This study proposes a combined convolutional neural network and long short-term memory (CNN-LSTM) deep network for performing sentiment analysis on Twitter datasets. The performance of the proposed model is compared with machine learning classifiers, including the support vector classifier, random forest (RF), stochastic gradient descent (SGD), logistic regression, a voting classifier (VC) of RF and SGD, and state-of-the-art classifier models. Furthermore, two feature extraction methods (term frequency-inverse document frequency and word2vec) are investigated to determine their impact on prediction accuracy. Three datasets (US airline sentiments, women's e-commerce clothing reviews, and hate speech) are used to evaluate the performance of the proposed model. Experimental results demonstrate that the CNN-LSTM achieves higher accuracy than the other classifiers.
The Chinese pronunciation system has two characteristics that distinguish it from other languages: deep phonemic orthography and intonation variation. In this paper, we hypothesize that these two important properties can play a major role in Chinese sentiment analysis. In particular, we propose two effective features to encode phonetic information and hence fuse it with textual information. With this hypothesis, we propose Disambiguate Intonation for Sentiment Analysis (DISA), a network developed on the principles of reinforcement learning. DISA disambiguates the intonation of each Chinese character (pinyin) and hence learns precise phonetic representations. We also fuse phonetic features with textual and visual features to further improve performance. Experimental results on five Chinese sentiment analysis datasets show that including phonetic features significantly and consistently improves the performance of textual and visual representations and surpasses state-of-the-art Chinese character-level representations.
In today's digital world, millions of individuals are linked to one another via the Internet and social media, opening up new avenues for information exchange. Sentiment analysis (SA) has received considerable attention over the last decade. In this study, we analyse the challenges of SA in Marathi, an Asian regional language, by providing a benchmark setup: we first produce an annotated dataset of Marathi text acquired from microblogging websites such as Twitter, with domain experts manually annotating the posts with positive, negative, or neutral polarity. In addition, to demonstrate effective use of the annotated dataset, we build an ensemble-based model for sentiment analysis. Compared with individual machine learning classifiers, the ensemble classifier achieved better performance under 10-fold cross-validation (CV), with an accuracy of 97.77% and an F-score of 97.89%.
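The 10-fold cross-validation protocol reported above partitions the annotated samples into ten disjoint folds, each serving once as the held-out test set. A small stdlib-only sketch of the index split (shuffling and class stratification, which a real evaluation would add, are omitted for brevity):

```python
def kfold_indices(n_samples, k=10):
    """Split sample indices 0..n_samples-1 into k contiguous, disjoint folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds
```

Each fold in turn becomes the test set while the remaining k-1 folds train the ensemble; the reported accuracy is the average over the ten runs.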
Accurate retinal vessel segmentation is very challenging. Recently, deep learning-based methods have greatly improved performance. However, non-vascular structures usually harm performance, and some small, low-contrast vessels are hard to detect after several down-sampling operations. To solve these problems, we design a deep fusion network (DF-Net) comprising multiscale fusion, feature fusion, and classifier fusion modules for multi-source vessel image segmentation. The multiscale fusion module allows the network to detect blood vessels at different scales. The feature fusion module fuses deep features with vessel responses extracted by a Frangi filter to obtain a compact yet domain-invariant feature representation. The classifier fusion module provides the network with additional supervision. DF-Net also predicts the parameters of the Frangi filter, avoiding manual parameter tuning. The learned Frangi filter enhances the feature maps of the multiscale network and restores the edge information lost to down-sampling. The proposed end-to-end network is easy to train, and inference takes 41 ms per image on a GPU. The model outperforms state-of-the-art methods, achieving accuracies of 96.14%, 97.04%, and 98.02% on the three publicly available fundus image datasets DRIVE, STARE, and CHASEDB1, respectively. The code is available at https://github.com/y406539259/DF-Net.
Feature rankings are often used for supervised dimension reduction, especially when the discriminating power of each feature is of interest, the dimensionality of the dataset is extremely high, or computational power is limited. In practice, it is recommended to start dimension reduction with simple methods such as feature rankings before applying more complex approaches. Single variable classifier (SVC) ranking ranks each feature by the predictive performance of a classifier built using that feature alone. While benefiting from the capabilities of classifiers, this ranking method is not as computationally intensive as wrappers. In this paper, we report the results of an extensive study of the bias and stability of this feature ranking method. We study whether the classifiers influence the SVC rankings or whether the discriminative power of the features themselves has the dominant impact on the final rankings. We show that the common intuition of using the same classifier for feature ranking and final classification does not always yield the best prediction performance. We then study whether heterogeneous classifier ensemble approaches provide less biased rankings and whether they improve final classification performance. Furthermore, we quantify the empirical prediction performance lost by using the same classifier for SVC feature ranking and final classification rather than the optimal choices.
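SVC ranking as described above can be sketched with the simplest possible single-variable classifier, a threshold rule; this simplification is ours (the paper studies full classifiers), but it shows the mechanics: score each feature by the best accuracy it achieves alone, then sort.

```python
def single_feature_accuracy(values, labels, threshold):
    """Accuracy of a one-feature rule: predict class 1 when value > threshold."""
    correct = sum((v > threshold) == bool(y) for v, y in zip(values, labels))
    return correct / len(labels)

def svc_ranking(X, labels):
    """Rank feature indices by the best accuracy a single-feature classifier
    reaches, trying each observed value as a candidate threshold."""
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        scores.append(max(single_feature_accuracy(column, labels, t)
                          for t in column))
    return sorted(range(len(scores)), key=lambda j: -scores[j])
```

Swapping in a different single-variable classifier changes the scores, which is exactly the classifier-induced bias the paper investigates.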
The rapidly increasing popularity of mobile devices has changed how people access network services and has markedly increased network traffic. Over the past few decades, network traffic identification has been a research hotspot in network management and security monitoring. However, as more network services adopt encryption, network traffic identification faces many challenges. Although classic machine learning methods can solve many problems that port- and payload-based methods cannot, manually extracting features that must be frequently updated is time-consuming and labor-intensive. Deep learning offers good automatic feature learning and is an ideal method for network traffic identification, particularly encrypted traffic identification. Existing deep learning-based recognition methods primarily use supervised learning and rely on many labeled samples, which are often difficult to obtain in real scenarios. This paper adjusts the structure of the auxiliary classifier generative adversarial network (ACGAN) so that it can use unlabeled samples for training, and replaces the original cross-entropy loss with the Wasserstein distance to achieve semi-supervised learning. Experimental results on the ISCX and USTC datasets show that the proposed method yields markedly better identification accuracy than a convolutional neural network (CNN)-based classifier when the number of labeled samples is small.
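Replacing cross-entropy with the Wasserstein distance turns the discriminator (critic) objective into a difference of expected scores. A minimal sketch of those two losses; the sampled scores below are toy values, and the Lipschitz constraint (weight clipping or gradient penalty) that Wasserstein training also requires is omitted:

```python
def critic_wasserstein_loss(real_scores, fake_scores):
    """Critic loss: the negation of E[score(real)] - E[score(fake)]."""
    mean = lambda xs: sum(xs) / len(xs)
    return -(mean(real_scores) - mean(fake_scores))

def generator_wasserstein_loss(fake_scores):
    """Generator loss: drive the critic's scores on generated samples upward."""
    return -sum(fake_scores) / len(fake_scores)
```

Minimising the critic loss widens the score gap between real and generated traffic samples, approximating the Wasserstein distance between the two distributions.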
Social networking platforms have witnessed tremendous growth of textual, visual, audio, and mixed-mode content expressing views and opinions. Sentiment Analysis (SA) and Emotion Detection (ED) of social networking posts, blogs, and conversations are therefore very useful for mining opinions on different issues, entities, or aspects. Various statistical and probabilistic models based on lexical and machine learning approaches have been employed for these tasks, and the majority of the literature emphasizes improvements to contemporary tools, techniques, models, and approaches. With recent developments in deep neural networks, various deep learning models are being heavily experimented with for accuracy gains in these tasks. Recurrent Neural Networks (RNNs) and their architectural variants, such as Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU), comprise an important category of deep neural networks adapted for feature extraction from temporal and sequential inputs. Since the input to SA and related tasks may be visual, textual, audio, or any combination of these, and carries an inherent sequentiality, we critically investigate the role of sequential deep neural networks in sentiment analysis of multimodal data. Specifically, we present an extensive review of the applicability, challenges, issues, and approaches for textual, visual, and multimodal SA using RNNs and their architectural variants.
Boosting algorithms such as AdaBoost have been widely used in a variety of multimedia and computer vision applications. Relevance feedback-based image retrieval has been formulated as a classification problem with a small number of training samples, and several machine learning techniques have recently been applied to it. In this paper, we propose a novel paired-feature AdaBoost learning system for relevance feedback-based image retrieval. To facilitate density estimation in our feature learning method, we propose an ID3-like balanced tree quantization method that preserves the most discriminative information. Using paired feature combination, we map all training samples obtained in the relevance feedback process onto paired feature spaces and employ the AdaBoost algorithm to select the few feature pairs with the best discrimination capabilities in their corresponding paired feature spaces. In the AdaBoost algorithm, we replace the traditional binary weak classifiers with Bayesian classifiers to enhance their classification power, producing a stronger classifier. Experimental results on content-based image retrieval (CBIR) show the superior performance of the proposed system compared with previous methods.
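One AdaBoost round, the weighted-error / alpha / reweighting step that such a system runs for each candidate feature pair, can be sketched as follows (plain discrete AdaBoost with ±1 labels; the paper's Bayesian weak learners are not reproduced here):

```python
import math

def adaboost_round(weights, predictions, labels):
    """One boosting step: weighted error of a weak classifier, its vote
    weight alpha, and the re-normalised sample weights."""
    err = sum(w for w, p, y in zip(weights, predictions, labels) if p != y)
    alpha = 0.5 * math.log((1.0 - err) / err)
    new_w = [w * math.exp(-alpha if p == y else alpha)
             for w, p, y in zip(weights, predictions, labels)]
    z = sum(new_w)
    return alpha, [w / z for w in new_w]
```

Misclassified samples gain weight after each round, so the next weak classifier, here the next feature pair, concentrates on the hard cases.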
Machinery fault diagnosis is of great significance for improving the reliability of smart manufacturing, and deep learning-based fault diagnosis methods have achieved great success. However, the features extracted by different models may vary, resulting in ambiguous representations of the data, and time is often wasted manually selecting optimal hyperparameters. To solve these problems, this paper proposes a new framework named the Ensemble Sparse Supervised Model (ESSM), in which a typical deep learning model is treated as two phases: feature learning and model learning. In the feature learning phase, sparse filtering represents the original data as a feature matrix with as little redundancy as possible. The feature matrix is then fed into the model learning phase, where regularization, dropout, and the rectified linear unit (ReLU) are used in the model's neurons and layers to build a sparse deep neural network. Finally, the output of the sparse deep neural network provides feedback to the first phase to obtain better sparse features. Hyperparameters must be pre-specified, and the Python library Talos is employed to automate this process. The proposed method is verified on the bearing data provided by Case Western Reserve University. The results demonstrate that the proposed method captures effective patterns in the data with the help of sparse constraints while providing convenient, assured performance for operators.