Similar articles
Found 20 similar articles; search time 46 ms
1.
Defense techniques for machine learning are critical yet challenging because the number and variety of attacks on widely applied machine learning algorithms are increasing significantly. Among these attacks, the poisoning attack, which disrupts machine learning algorithms by injecting poisoning samples, poses the greatest threat. In this paper, we analyze the characteristics of poisoning samples and propose a novel sample evaluation method, tailored to those characteristics, to defend against the poisoning attack. To capture the intrinsic data characteristics from heterogeneous aspects, we first evaluate the training data by multiple criteria, each of which is reformulated from a spectral clustering. Then, we integrate the evaluation scores generated by these criteria through the proposed multiple spectral clustering aggregation (MSCA) method. Finally, we use the unified score as the indicator of poisoning attack samples. Experimental results on intrusion detection data sets show that MSCA significantly outperforms K-means outlier detection in terms of data legality evaluation and poisoning attack detection.

2.
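Rapid identification of edible oil types by cluster analysis of near-infrared transmission spectra   Cited by: 12 (self-citations: 1, citations by others: 11)
Using 43 samples of eight kinds of pure edible oil, this study examined the feasibility of rapidly identifying edible oil types by near-infrared transmission spectroscopy combined with cluster analysis. Fourier-transform near-infrared transmission spectra of the samples were collected over the range 12,500 to 4,000 cm⁻¹, and cluster analysis, one of the spectral pattern-recognition methods, was used to classify the spectra qualitatively. The experiments showed that, after second-derivative preprocessing of the spectra, the shortest-distance (single-linkage) method, the longest-distance (complete-linkage) method, and the sum-of-squared-deviations method all sorted the edible oil samples into eight classes without error, and the discriminant model reached 100% accuracy on the prediction-set samples. The results indicate that near-infrared transmission spectroscopy combined with cluster analysis can provide an accurate and reliable method for rapid, non-destructive identification of edible oil types.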
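
The preprocessing and clustering steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the second derivative is approximated by a plain second finite difference, only the shortest-distance and longest-distance linkages are shown, and the toy spectra, function names, and baseline offsets are all hypothetical.

```python
from itertools import combinations


def second_derivative(spectrum):
    """Second finite difference; removes constant and linear baselines."""
    return [spectrum[i - 1] - 2 * spectrum[i] + spectrum[i + 1]
            for i in range(1, len(spectrum) - 1)]


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def linkage_distance(points, cluster_a, cluster_b, pick):
    """Distance between two clusters: pick=min is single, pick=max is complete."""
    return pick(euclidean(points[i], points[j])
                for i in cluster_a for j in cluster_b)


def agglomerate(points, n_clusters, linkage="single"):
    """Merge the two closest clusters until n_clusters remain."""
    pick = min if linkage == "single" else max
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage_distance(
                points, clusters[pair[0]], clusters[pair[1]], pick),
        )
        clusters[a] += clusters[b]  # a < b, so deleting b keeps a valid
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical toy spectra: two "types" with peaks at different positions,
# two of them shifted by a sloping baseline.
peak_a = [0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0]
peak_b = [0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0]
spectra = [
    peak_a,
    peak_b,
    [v + 0.5 * i + 2.0 for i, v in enumerate(peak_a)],
    [v + 0.25 * i + 1.0 for i, v in enumerate(peak_b)],
]
features = [second_derivative(s) for s in spectra]
print(agglomerate(features, 2, "single"))    # [[0, 2], [1, 3]]
print(agglomerate(features, 2, "complete"))  # [[0, 2], [1, 3]]
```

The second difference cancels the sloping baselines, so spectra of the same toy type become identical before clustering, which is the role derivative preprocessing usually plays ahead of distance-based grouping.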

3.
Semi-supervised clustering improves learning performance by using a small number of labeled samples to guide the learning of unlabeled samples. This paper implements and compares unsupervised and semi-supervised clustering analysis of BOA-Argo ocean text data. Unsupervised K-Means and Affinity Propagation (AP) are two classical clustering algorithms. The Election-AP algorithm is proposed to handle the final cluster number in AP clustering, which has proved difficult to keep within a suitable range. For the semi-supervised case, thermocline data are sampled from the BOA-Argo dataset according to the standard definition of the thermocline and used for semi-supervised cluster analysis. Several semi-supervised clustering algorithms were chosen for comparison of learning performance: Constrained-K-Means, Seeded-K-Means, SAP (Semi-supervised Affinity Propagation), LSAP (Loose Seed AP) and CSAP (Compact Seed AP). To accommodate the single label, this paper adapts the above algorithms into SCKM (improved Constrained-K-Means), SSKM (improved Seeded-K-Means), and SSAP (improved Semi-supervised Affinity Propagation) to perform semi-supervised clustering analysis on the data. A DSAP (Double Seed AP) semi-supervised clustering algorithm based on compact seeds is also proposed, and the experimental data show that DSAP has a better clustering effect. The unsupervised and semi-supervised clustering results are used to analyze the potential patterns of marine data.

4.
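The k-medoids clustering algorithm (KCA) is an improved machine learning clustering algorithm. Through initial selection and subsequent updating of the cluster centers, it learns from unlabeled training samples to reveal the intrinsic properties and regularities of the data, and thereby distinguishes the operating states of a machine. A diagnosis method based on an orthogonal wavelet transform k-medoids clustering algorithm (OWTKCA) is proposed: the orthogonal wavelet transform (OWT) is used to extract the detail signals as training samples, which are then classified with KCA. Classification results on rolling-bearing test data show that, compared with KCA without feature extraction, the method handles complex mechanical vibration signals effectively, markedly improves the clustering of fault data, shortens clustering time, and raises the efficiency of intelligent diagnosis.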
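
The two k-medoids steps mentioned above, choosing initial centers and then updating them, can be sketched as follows. This is a generic illustration under stated assumptions, not the paper's OWTKCA: the initial medoids are simply the first k points, the wavelet feature extraction is omitted, and the one-dimensional toy values stand in for hypothetical features.

```python
def assign(points, medoids, distance):
    """Label each point with the index of its nearest medoid."""
    return [min(range(len(medoids)),
                key=lambda c: distance(p, points[medoids[c]]))
            for p in points]


def k_medoids(points, k, distance, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = list(range(k))  # assumption: naive start from the first k points
    for _ in range(max_iter):
        labels = assign(points, medoids, distance)
        updated = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster empties
                updated.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance
            # to all other members of its cluster.
            updated.append(min(
                members,
                key=lambda i: sum(distance(points[i], points[j])
                                  for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(points, medoids, distance)


# Hypothetical 1-D features forming two well-separated groups.
values = [1, 2, 3, 10, 11, 12]
medoids, labels = k_medoids(values, 2, lambda a, b: abs(a - b))
print(medoids)  # [1, 4]  (indices of the values 2 and 11)
print(labels)   # [0, 0, 0, 1, 1, 1]
```

Because a medoid is always an actual sample, the update step only needs pairwise distances, which is why the distance function is passed in rather than fixed.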

5.
As an important branch of machine learning, clustering analysis is widely used in fields such as image pattern recognition, social network analysis, and information security. In this paper, we consider the design of clustering algorithms in the quantum setting and propose a quantum hierarchical agglomerative clustering algorithm based on the one-dimensional discrete quantum walk with single-point phase defects. The proposed algorithm exploits two nonclassical features of this kind of quantum walk, localization and ballistic effects. First, each data point is viewed as a particle that performs this kind of quantum walk with a parameter determined by its neighbors. The particles are then measured in the computational basis, and every attribute value of the corresponding data point is modified according to the measurement result. In this way, each data point interacts with its neighbors and moves toward a certain center point. This process is repeated several times until similar data points cluster together and form distinct classes. Simulation experiments on synthetic and real-world data demonstrate the effectiveness of the presented algorithm. Compared with some classical algorithms, the proposed algorithm achieves better clustering results. Moreover, combined with a quantum cluster assignment method, the presented algorithm can speed up the computation.

6.
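Development of near-infrared spectral analysis software based on ICA   Cited by: 1 (self-citations: 0, citations by others: 1)
Near-infrared spectral analysis software based on independent component analysis (ICA) was developed. The software comprises three modules (spectral resolution, spectral modeling, and determination of the content of unknown components) and uses data-processing methods including wavelet analysis, ICA, and back-propagation (BP) neural networks. Applied to measured near-infrared spectra of corn, the software gave satisfactory results. Mixed programming with LabVIEW and MATLAB makes full use of the strengths of each package, so the program is simple and the interface is user-friendly.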

7.
Atomic force microscopy (AFM) has evolved into one of the most powerful tools for characterizing material surfaces, especially at the nanoscale. Recent developments have combined AFM with a suite of analytical techniques, including surface‐enhanced Raman scattering (SERS) and infrared (IR) spectroscopy, to reveal chemical composition and map chemical distribution. This combination not only extends the functionality of AFM but also improves on the resolution limit of conventional IR and Raman spectroscopy. Despite the rapid development of such hybrid AFM techniques, many of their unique features, principles, applications, and potential pitfalls or artifacts are not well known to the community. This review systematically summarizes the recent literature on hybrid AFM principles and applications, focusing especially on AFM‐IR and AFM‐Raman techniques. Various applications in different research fields are critically reviewed and discussed, highlighting the potential of these hybrid AFM techniques. The major drawbacks and limitations of the two techniques are also presented. The article aims to shed new light on future research and to help improve the stability and reliability of the measurements.

8.
9.
Near-infrared (NIR) spectroscopy, one of the most rapidly growing methodologies in pharmaceutical analysis, has been used to analyze pharmaceutical solid dosage forms. The objective of this study was to examine the information that can be gathered from NIR spectroscopy and to demonstrate the potential utility of the technique as an alternative to current methods of tablet performance testing. The tablet formulation included active drug (acetaminophen or theophylline), binder (hydroxyethylcellulose), filler (lactose, calcium sulfate, dibasic calcium phosphate dihydrate, or microcrystalline cellulose), and lubricant (magnesium stearate). The compression forces were varied from 5 to 25 kN. A Foss/NIRSystems scanning near-infrared spectrometer was used to measure the diffuse reflectance from the tablet surface. Each tablet was scanned on opposite sides to reduce the effects of positioning. First-derivative and multiplicative scatter correction data treatments were explored. A calibration for compression force, independent of the filler, was developed. In addition, the spectra were able to distinguish among the fillers used. A comparison of these spectra with data collected earlier suggests that the technique could differentiate among drugs as well. Near-infrared diffuse reflection spectroscopy, when properly calibrated, can determine the compression force used to prepare a tablet. This measurement may be independent of the different active drugs or fillers used in the tablet formulations.
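
The two data treatments named in this abstract, first derivative and multiplicative scatter correction (MSC), are standard and can be sketched in a few lines. The version below is a textbook form under stated assumptions, not the study's processing chain: the reference is taken to be the mean spectrum, the derivative is a simple first difference, and the toy spectra are hypothetical.

```python
def mean_spectrum(spectra):
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


def fit_line(x, y):
    """Least-squares fit y ~ intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def msc(spectra):
    """Regress each spectrum on the mean spectrum, then undo offset and scale."""
    reference = mean_spectrum(spectra)
    corrected = []
    for spectrum in spectra:
        intercept, slope = fit_line(reference, spectrum)
        corrected.append([(v - intercept) / slope for v in spectrum])
    return corrected


def first_derivative(spectrum):
    """First finite difference between neighbouring points."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


# Hypothetical spectra: the second is the first with a scale and offset change.
base = [1.0, 2.0, 4.0, 3.0]
scattered = [2.0 * v + 1.0 for v in base]
corrected = msc([base, scattered])
print(first_derivative(base))  # [1.0, 2.0, -1.0]
```

After correction both toy spectra coincide with the mean spectrum up to rounding, because MSC removes exactly the additive and multiplicative differences that were introduced.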

10.
In the current era of the internet, people use online media for conversation, discussion, chatting, and similar purposes. Analyzing such material, in which more than one person is involved, poses a separate challenge compared with other text analysis tasks. Several approaches exist for identifying users' emotions from conversational text in English; however, regional or low-resource languages have been neglected. Urdu is one of them: despite being used by millions of people across the globe, to the best of our knowledge there exists no work on dialogue analysis in the Urdu language. Therefore, in this paper, we propose a model that uses deep learning and machine learning approaches to classify users' emotions from text. To accomplish this task, we first created a dataset for the Urdu language with the help of existing English-language datasets for dialogue analysis. We then preprocessed the data and selected dialogues with common emotions. Once the dataset was prepared, we applied different deep learning and machine learning techniques to classify emotion, tuning the algorithms to the Urdu datasets. The experimental evaluation shows encouraging results, with 67% accuracy on the Urdu dialogue datasets; more than 10,000 dialogues are classified into five emotions, i.e., joy, fear, anger, sadness, and neutral. We believe that this is the first effort at emotion detection from conversational text in the Urdu language domain.

11.
Colour quantisation (CQ) is an important operation with many applications in graphics and image processing. Most CQ methods are essentially based on data clustering algorithms, one of which is the popular k-means algorithm. Unfortunately, like many batch clustering algorithms, k-means is highly sensitive to the selection of the initial cluster centres. In this paper, we adapt Uchiyama and Arbib’s competitive learning algorithm to the CQ problem. In contrast to the batch k-means algorithm, this online clustering algorithm does not require cluster centre initialisation. Experiments on a diverse set of publicly available images demonstrate that the presented method outperforms some of the most popular quantisers in the literature.
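
To make the contrast with batch k-means concrete, here is a simplified online competitive-learning sketch. It is only in the spirit of split-based competitive learning and should not be read as Uchiyama and Arbib's procedure or the paper's adaptation of it: starting from a single centre at the mean colour, the fixed learning rate, the split threshold, and all names are assumptions made for illustration.

```python
def nearest(centres, pixel):
    """Index of the centre closest to the pixel (squared Euclidean distance)."""
    return min(range(len(centres)),
               key=lambda i: sum((c - p) ** 2
                                 for c, p in zip(centres[i], pixel)))


def quantise_online(pixels, n_colours, rate=0.5, split_after=4):
    """Process pixels one at a time; split a frequent winner until n_colours exist."""
    n = len(pixels)
    # Single starting centre at the mean colour, so no palette must be guessed.
    centres = [[sum(channel) / n for channel in zip(*pixels)]]
    wins = [0]
    for pixel in pixels:
        w = nearest(centres, pixel)
        # Move only the winning centre a fraction of the way to the pixel.
        centres[w] = [c + rate * (p - c) for c, p in zip(centres[w], pixel)]
        wins[w] += 1
        if wins[w] >= split_after and len(centres) < n_colours:
            centres.append(list(centres[w]))  # duplicate the frequent winner
            wins[w] = 0
            wins.append(0)
    return centres


# Hypothetical image stream: alternating pure red and pure blue pixels.
pixels = [(250, 0, 0), (0, 0, 250)] * 20
palette = quantise_online(pixels, 2)
print(sorted(tuple(round(c) for c in centre) for centre in palette))
# [(0, 0, 250), (250, 0, 0)]
```

Each pixel updates one centre immediately, which is what makes the scheme online; a batch method such as k-means would instead revisit the whole image on every iteration.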

13.
Many approaches have been tried for the classification of arrhythmia. Because of the dynamic nature of electrocardiogram (ECG) signals, it is challenging to use traditional handcrafted techniques, which makes a machine learning (ML) implementation attractive. Competent monitoring of cardiac arrhythmia patients can save lives, and cardiac arrhythmia prediction and classification have improved significantly during the last few years. Arrhythmias are a group of conditions in which the electrical activity of the heart is abnormal, either faster or slower than normal. It is the most frequent cause of death for both men and women every year in the world. This paper presents a deep learning (DL) technique for the classification of arrhythmias. The proposed technique makes use of the University of California, Irvine (UCI) repository, which provides a high-dimensional cardiac arrhythmia dataset of 279 attributes. Our goal was to classify cardiac arrhythmia patients into 16 classes depending on the characteristics of the electrocardiography dataset by developing an efficient intelligent system using the long short-term memory (LSTM) DL algorithm. LSTM is an efficient technique for dealing with the reduced accuracy caused by vanishing and exploding gradients in traditional DL frameworks for big data analysis. This approach to arrhythmia classification combines classification algorithms with noise removal techniques: we used principal components analysis (PCA) for noise removal and LSTM for classification. This hybrid, comprehensive approach performs better than previous approaches to arrhythmia classification. We attained a highest classification accuracy of 93.5% with the DL-based disease classification system, outperforming the earlier approaches used for cardiac arrhythmia classification.

14.
Although predictive machine learning for supply chain data analytics has recently been reported as a significant area of investigation owing to the rising popularity of the AI paradigm in industry, there is a distinct lack of case studies that showcase its application from a practical point of view. In this paper, we discuss the application of data analytics to predicting first-tier supply chain disruptions using historical data available to an Original Equipment Manufacturer (OEM). Our methodology includes three phases. First, an exploratory phase is conducted to select and engineer potential features that can act as useful predictors of disruptions. Second, a performance metric is developed, in alignment with the specific goals of the case study, to rate successful methods. Third, an experimental design is created to systematically analyse the success rate of different algorithms and algorithmic parameters on the selected feature space. Our results indicate that adding engineered features to the data, namely agility, outperforms the other experiments, leading to a final algorithm that can predict late orders with 80% accuracy. An additional contribution is the novel application of machine learning to predicting supply disruptions. Through the discussion and development of the case study, we hope to shed light on the development and application of data analytics techniques in the analysis of supply chain data. We conclude by highlighting the importance of domain knowledge for successfully engineering features.

15.
Due to weak interactions between micrometer‐wavelength infrared (IR) light and nanosized samples, a high signal-to-noise ratio is a prerequisite for precisely characterizing nanosized samples using IR spectroscopy. Traditional micrometer‐thick window substrates, however, have considerable IR absorption, which may introduce unavoidable deformations and interruptions into the IR spectra of nanoscale samples. A promising alternative is a suspended graphene substrate, which has ultrahigh IR transmittance (>97.5%) as well as unique mechanical properties. Here, an effective method is presented for fabricating suspended graphene over circular holes up to 150 µm in diameter, to be used as a transparent substrate for IR spectroscopy. It is demonstrated that the suspended graphene has little impact on the measured IR spectra, an advantage that has led to the discovery of several vibrational modes that are missing when a 20 nm thick PEO film is measured on a traditional CaF2 substrate. This can provide a better understanding of molecules' fine structures and status of hanging bands. The unique optical properties of suspended graphene are determined to be superior to those of conventional IR window materials, giving this new substrate great potential as part of a new generation of IR transparent substrates, especially for examining nanoscale samples.

16.
Pedestrian detection and tracking are vital elements of today’s surveillance systems, which make daily life safer. Human detection and visualization have therefore become essential topics in computer vision. However, developing a surveillance system with multiple object recognition and tracking, especially in low light and at night, is still challenging. We therefore propose a novel system based on machine learning and image processing to provide efficient surveillance for pedestrian detection and tracking at night. In particular, the system tackles a two-fold problem: detecting multiple pedestrians in infrared (IR) images using machine learning, and tracking them using particle filters. A random forest classifier is adopted for image segmentation to identify pedestrians in an image, and the detection result is passed to a particle filter to solve pedestrian tracking. In extensive experiments, our system shows 93% segmentation accuracy using the random forest algorithm, with high accuracy for the background and roof classes. The system also achieved a detection accuracy of 90% using multiple template matching techniques and 81% accuracy for pedestrian tracking. Furthermore, our system can verify that a detected object is a human. Our system provided the best results compared with state-of-the-art systems, which demonstrates the effectiveness of the techniques used for image segmentation, classification, and tracking. The presented method is applicable to human detection and tracking, crowd analysis, and monitoring pedestrians in IR video surveillance.

17.
Measuring the physical and chemical ("physicochemical") properties of nanomaterials used in industry and science, including chemistry, pharmacy, medicine, and toxicology, is time-consuming and expensive and requires well-trained, experienced laboratory staff. Near-infrared spectroscopy (NIR; 4,000 to 12,000 cm⁻¹), which works in the spectral region with the highest IR energy, yields multifactorial information about the material under investigation because of the large number of combination and overtone vibrations that occur. Coupling an optimized, well-designed measurement technique with multivariate data analysis (MVA) leads to a novel NIR technique for fast, non-invasive physicochemical characterization that is non-destructive, reliable, and robust, and that suits high-throughput quality control because analyses take only a few seconds. In the following chapters, the patented basic NIR techniques fulfilling these aims are introduced, described, summarized, and critically discussed.

18.
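Near-infrared spectroscopy is one of the fastest-developing and most notable spectral analysis techniques of the past decade, and its high-precision measurements depend on chemometric methods to extract spectral information accurately. To address the information loss that readily occurs in traditional single-scale modeling of near-infrared spectra, a new multi-scale modeling method was developed. Multi-scale modeling makes effective, coordinated use of the multi-scale time/frequency characteristics of the signal and maps them, in weighted form, into a unified multivariate calibration space, which effectively avoids information loss. The algorithm was successfully applied to determining the content of borax, a toxic illegal additive, adulterated in flour. After validation, the correlation coefficient between predicted and true values and the root-mean-square error of prediction were 0.974 and 0.0019, respectively, and the relative prediction error was 1.9%. The results show that the multi-scale modeling method fully meets the requirements of high-precision near-infrared spectral measurement.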

19.
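Detection of abnormal network transmission data based on the link-state database suffers from incomplete detection data, high sensitivity, and poor detection efficiency. To address these problems, an intelligent machine-learning-based method for detecting abnormal data in distributed network transmission is proposed. A K-nearest-neighbor clustering algorithm groups the distributed network nodes into clusters, and a Bayesian classification algorithm detects whether a cluster head is abnormal. Once an abnormal cluster has been identified, wavelet threshold denoising is applied to the data within it; on this basis, a genetic algorithm is used to detect ...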

20.