Similar literature
Found 20 similar documents (search time: 15 ms)
1.
With the ubiquitous collection of data and creation of large distributed repositories, enabling search over this data while respecting access control is critical. A related problem is that of ensuring privacy of the content owners while still maintaining an efficient index of distributed content. We address the problem of providing privacy-preserving search over distributed access-controlled content. Indexed documents can be easily reconstructed from conventional (inverted) indexes used in search. Currently, the need to avoid breaches of access control through the index requires the index hosting site to be fully secured and trusted by all participating content providers. This level of trust is impractical in the increasingly common case where multiple competing organizations or individuals wish to selectively share content. We propose a solution that eliminates the need for such a trusted authority. The solution builds a centralized privacy-preserving index in conjunction with a distributed access-control enforcing search protocol. Two alternative methods to build the centralized index are proposed, allowing trade-offs between efficiency and security. The new index provides strong and quantifiable privacy guarantees that hold even if the entire index is made public. Experiments on a real-life dataset validate the performance of the scheme. The appeal of our solution is twofold: (a) content providers maintain complete control in defining access groups and ensuring compliance with them, and (b) system implementors retain tunable knobs to balance privacy and efficiency concerns for their particular domains. Dr. Vaidya’s work was supported by the National Science Foundation under grant CNS-0746943 and by a research resources grant from Rutgers Business School, Newark and New Brunswick.

2.
The task of classifying observations into known groups is a common problem in decision making. A wealth of statistical approaches, commencing with Fisher's linear discriminant function and including variations to accommodate a variety of modeling assumptions, have been proposed. In addition, nonparametric approaches based on various mathematical programming models have also been proposed as solutions. All of these proposed solutions have performed well when conditions favorable to the specific model are present. The modeler, therefore, can usually be assured of a good solution to his problem if he chooses a model which fits his situation. In this paper, the performance of a neural network as a classifier is evaluated. It is found that the performance of the neural network is comparable to the best of the other methods under a wide variety of modeling assumptions. The use of neural networks as classifiers thus relieves the modeler of testing assumptions which would otherwise be critical to the performance of the usual classification techniques.

3.

The use of artificial neural networks has provided many benefits in various fields of research and engineering. Yet, depending on the problem, different architectures need to be developed, and most of the time the design decision relies on trial and error as well as on the experience of the developer. Many approaches have been investigated concerning topology modelling, training algorithms, and data processing. This paper proposes a novel automatic method for searching for a neural network architecture given a specific task. When selecting the best topology, our method allows the exploration of a multidimensional space of possible structures, including the choice of the number of neurons, the number of hidden layers, the types of synaptic connections, and the use of transfer functions. Although the backpropagation algorithm is conventionally used in the field of neural networks, a known disadvantage of the technique is its tendency to reach saddle points or local minima, hence overfitting the output data. In this work, we introduce a novel strategy which is capable of generating a network topology that avoids overfitting in the majority of cases at affordable computational cost. To validate our method, we provide several numerical experiments and discuss the outcomes.


4.
Semiconductor wafer defect inspection is an important process before die packaging. The defective regions are usually identified through visual judgment with the aid of a scanning electron microscope. Dozens of people visually check wafers and hand-mark their defective regions. Consequently, potential misjudgment may be introduced due to human fatigue. In addition, the process can incur significant personnel costs. Prior work has proposed automated visual wafer defect inspection that is based on supervised neural networks. Since it requires learned patterns specific to each application, its disadvantage is the lack of product flexibility. Self-organizing neural networks (SONNs) have been proven to have the capabilities of unsupervised auto-clustering. In this paper, an automatic wafer inspection system based on a self-organizing neural network is proposed. Experimental results on real-world data show that the proposed method successfully identifies the defective regions on wafers with good performance.

5.
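ECG classification based on wavelet transform and probabilistic neural network   Cited by: 1 (self-citations: 1, citations by others: 1)
A real-time, efficient theory and method for electrocardiogram (ECG) classification is proposed. First, the ECG signal undergoes a six-scale wavelet decomposition; the coefficients at the scales that contain most of the noise are set to zero, and the signal is reconstructed from the remaining levels, thereby removing the noise. Mathematical morphology is then used to locate the P, Q, R, S, and T waves, and 12 feature values, such as the distances and slopes between the waves, are extracted and computed as the input vector of a probabilistic neural network, which classifies the ECG into six classes.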
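The final classification step described above can be illustrated with a minimal sketch of a probabilistic neural network, assuming Gaussian kernels with a single smoothing parameter. The function name `pnn_classify`, the value of `sigma`, and the two-feature, two-class toy data are hypothetical; the abstract itself uses 12 features and six classes and does not give these details.

```python
import math


def pnn_classify(x, training_set, sigma=0.5):
    """Classify x with a Gaussian-kernel probabilistic neural network.

    training_set maps a class label to a list of feature vectors.
    sigma is an assumed smoothing parameter, not a value from the abstract.
    """
    scores = {}
    for label, patterns in training_set.items():
        # Pattern layer: one Gaussian kernel per stored training vector.
        total = 0.0
        for p in patterns:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
        # Summation layer: average kernel response for this class.
        scores[label] = total / len(patterns)
    # Decision layer: pick the class with the largest estimated density.
    return max(scores, key=scores.get)


# Hypothetical toy data standing in for the extracted ECG features.
training = {
    "normal": [[0.0, 0.0], [0.1, 0.2]],
    "abnormal": [[1.0, 1.0], [0.9, 1.1]],
}
```

Because a network of this kind only stores training vectors, "training" amounts to choosing `sigma`, which is one plausible reason such a classifier suits a real-time setting.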

6.
Automatic text classification methods based on the vector space model (VSM), artificial neural networks (ANN), K-nearest neighbor (KNN), Naive Bayes (NB) and support vector machines (SVM) have been applied to English-language documents and have gained popularity among text mining and information retrieval (IR) researchers. This paper proposes the application of VSM and ANN to the classification of Tamil-language documents. Tamil is a morphologically rich, classical Dravidian language. The development of the internet has led to an exponential increase in the number of electronic documents, not only in English but also in other regional languages. The automatic classification of Tamil documents has not been explored in detail so far. In this paper, a corpus is used to construct and test the VSM and ANN models. Methods of document representation and of assigning weights that reflect the importance of each term are discussed. In a traditional word-matching based categorization system, the most popular document representation is VSM. This method needs a high-dimensional space to represent the documents. The ANN classifier requires a smaller number of features. The experimental results show that the ANN model achieves 93.33%, which is better than the 90.33% yielded by VSM on Tamil document classification.
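A vector space model classifier of the general kind mentioned above can be sketched as follows, assuming TF-IDF term weights and nearest-neighbour matching by cosine similarity. The weighting formula, the helper names, and the English placeholder tokens are assumptions for illustration; the abstract does not state the paper's exact weighting scheme or corpus.

```python
import math
from collections import Counter


def build_idf(docs):
    """Inverse document frequency for every term in the tokenized docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # The +1.0 keeps terms that occur in every document from vanishing.
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def vectorize(doc, idf):
    """Sparse TF-IDF vector; terms unseen during training are ignored."""
    tf = Counter(term for term in doc if term in idf)
    return {term: count * idf[term] for term, count in tf.items()}


def cosine(u, v):
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify(doc, labeled_docs, idf):
    """Return the label of the most similar training document."""
    query = vectorize(doc, idf)
    best = max(labeled_docs, key=lambda pair: cosine(query, vectorize(pair[0], idf)))
    return best[1]


# Hypothetical placeholder corpus (not Tamil data from the paper).
labeled = [
    (["cricket", "match", "score"], "sports"),
    (["election", "vote", "party"], "politics"),
]
idf = build_idf([doc for doc, _ in labeled])
```

Each distinct term adds one dimension to the vectors, which is the high-dimensionality cost the abstract attributes to VSM.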

7.
A Fuzzy Approach to Classification of Text Documents   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper discusses the classification problems of text documents. Based on the concept of the proximity degree, the set of words is partitioned into some equivalence classes. In particular, the concepts of the semantic field and association degree are given in this paper. Based on the above concepts, this paper presents a fuzzy classification approach for document categorization. Furthermore, applying the concept of the entropy of information, approaches are obtained for selecting key words from the set of words covering the classification of documents and for constructing the hierarchical structure of key words.

8.
Improving the training speed of a classifier without degrading its predictive capability is an important concern in text classification. Feature selection plays a key role in this context. It selects a subset of the most informative words (terms) from the set of all words. The correlative association of words with the classes increases the uncertainty about which words represent a class. The representative words of a class are of either a positive or a negative nature. The standard feature selection methods, viz. Mutual Information (MI), Information Gain (IG), Discriminating Feature Selection (DFS) and Chi Square (CHI), do not consider the positive and negative nature of the words, which affects the performance of the classifiers. To address this issue, this paper presents a novel feature selection method named Correlative Association Score (CAS). It combines the strength, mutual information, and strong association of the words to determine their positive and negative nature for a class. CAS selects a few (k) informative words from the set of all words (m). These informative words generate a set of N-grams of length 1-3. Finally, the standard Apriori algorithm combines the power of CAS and CHI to select the top b most informative N-grams, where b is a number set by empirical evaluation. Multinomial Naive Bayes (MNB) and Linear Support Vector Machine (LSVM) classifiers evaluate the performance of the selected N-grams. Four standard text data sets, viz. Webkb, 20Newsgroup, Ohsumed10, and Ohsumed23, are used for experimental analysis. Two standard performance measures, Macro_F1 and Micro_F1, show a significant improvement in the results using the proposed CAS method.
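The point about standard methods ignoring the positive or negative nature of a word can be illustrated with the Chi Square (CHI) baseline. The sketch below implements only the textbook chi-square statistic on a 2x2 term/class contingency table; it is not the CAS method, whose formula the abstract does not give. The function names and the toy documents are hypothetical.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/class contingency table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def chi_scores(docs, labels, target):
    """Score every vocabulary term against the target class."""
    vocab = set(term for doc in docs for term in doc)
    scores = {}
    for term in vocab:
        a = sum(1 for doc, y in zip(docs, labels) if term in doc and y == target)
        b = sum(1 for doc, y in zip(docs, labels) if term in doc and y != target)
        c = sum(1 for doc, y in zip(docs, labels) if term not in doc and y == target)
        d = sum(1 for doc, y in zip(docs, labels) if term not in doc and y != target)
        scores[term] = chi_square(a, b, c, d)
    return scores


# Hypothetical toy corpus: each document is a set of terms.
docs = [{"goal", "team"}, {"goal", "win"}, {"vote", "team"}, {"vote", "law"}]
labels = ["sports", "sports", "politics", "politics"]
scores = chi_scores(docs, labels, "sports")
```

In this toy corpus "goal" (present in every sports document) and "vote" (absent from every sports document) receive the same score for the class "sports", because the statistic squares the association and discards its sign.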

9.
In this paper, an S-transform-based neural network structure is presented for automatic classification of power quality disturbances. The S-transform (ST) technique is integrated with a multi-layer perceptron neural network (NN) model to construct the classifier. First, the performance of ST in detecting and localizing the disturbances is shown by visual inspection. Then, the ST technique is used to extract the significant features of the distorted signal. In addition, an optimum combination of the most useful features is identified to increase the accuracy of classification. Features extracted using the S-transform are applied as input to the NN for automatic classification of the power quality (PQ) disturbances, which solves a relatively complex problem. Six single disturbances and two complex disturbances, as well as a pure sine (normal) signal selected as reference, are considered for the classification. The sensitivity of the proposed expert system under different noise conditions is investigated. The analysis and results show that the classifier can effectively classify different PQ disturbances.

10.
A multilayer perceptron (MLP) trained with the back-propagation learning algorithm takes a large amount of computational time. The complexity of the network increases as the number of layers and the number of nodes in each layer increase. Further, it is also very difficult to decide a priori the number of nodes in a layer and the number of layers in the network required for solving a problem. In this paper an improved particle swarm optimization (IPSO) is used to train the functional link artificial neural network (FLANN) for classification, and we name it ISO-FLANN. In contrast to MLP, FLANN has less architectural complexity, is easier to train, and may give more insight into the classification problem. Further, we rely on the global classification capabilities of IPSO to explore the entire weight space, which is plagued by a host of local optima. Using the functionally expanded features, FLANN overcomes the non-linear nature of problems. We believe that the combined efforts of FLANN and IPSO (IPSO + FLANN = ISO-FLANN), by harnessing their best attributes, can give rise to a robust classifier. An extensive simulation study is presented to show the effectiveness of the proposed classifier. Results are compared with MLP, support vector machine (SVM) with radial basis function (RBF) kernel, FLANN with gradient descent learning, and fuzzy swarm net (FSN).
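The idea of functionally expanded features can be sketched as below, assuming a trigonometric expansion followed by a single weighted sum. The abstract does not say which basis functions the paper uses, so the choice of sines and cosines, the `order` parameter, the tanh output, and the function names are all assumptions. The IPSO training of the weights is not shown.

```python
import math


def expand(features, order=2):
    """Trigonometric functional expansion (an assumed basis).

    Each input x becomes x, sin(k*pi*x) and cos(k*pi*x) for k = 1..order.
    """
    expanded = []
    for x in features:
        expanded.append(x)
        for k in range(1, order + 1):
            expanded.append(math.sin(k * math.pi * x))
            expanded.append(math.cos(k * math.pi * x))
    return expanded


def flann_output(features, weights, bias=0.0, order=2):
    """Single-layer forward pass over the expanded features.

    There is no hidden layer; the non-linearity comes from the expansion.
    """
    z = expand(features, order)
    s = bias + sum(w * v for w, v in zip(weights, z))
    return math.tanh(s)
```

With `order=2`, each input feature yields five expanded values, so the only trainable parameters are one weight per expanded value plus a bias. This flat weight vector is the kind of search space a swarm optimizer would explore.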

11.
Content-based music genre classification is a key component for next-generation multimedia search agents. This paper introduces an audio classification technique based on audio content analysis. Artificial Neural Networks (ANNs), specifically multi-layered perceptrons (MLPs), are implemented to perform the classification task. Windowed audio files of finite length are analyzed to generate multiple feature sets which are used as input vectors to a parallel neural architecture that performs the classification. This paper examines a combination of linear predictive coding (LPC), mel frequency cepstrum coefficients (MFCCs), Haar Wavelet, Daubechies Wavelet and Symlet coefficients as feature sets for the proposed audio classifier. In parallel to the MLP, a Gaussian radial basis function (GRBF) based ANN is also implemented and analyzed. The obtained prediction accuracy of 87.3% in determining the audio genres attests to the efficiency of the proposed architecture. The ANN prediction values are processed by a rule-based inference engine (IE) that presents the final decision.

12.
Self-care problems classification is one of the important challenges for occupational therapists. The extent and variety of disorders make the self-care problems classification process complex and time-consuming. To overcome this challenge, this research proposes a novel expert model. The proposed model is based on a Probabilistic Neural Network (PNN) and a Genetic Algorithm (GA) for classifying self-care problems of children with physical and motor disability. In this model, the PNN is employed as a classifier and the GA is applied for feature selection. The PNN is trained using a standard ICF-CY dataset. Based on ICF-CY, occupational therapists must evaluate many features to diagnose self-care problems. According to the experiences of occupational therapists, these features have different effects on classification. Hence, the GA is employed to select relevant and important features in self-care problems classification. Since the classification rules are important for occupational therapists, the self-care problems classification rules are additionally extracted using the CART algorithm. The experimental results show that by using the feature selection algorithm, the accuracy and time complexity of classification are improved in comparison to other models. The proposed model can classify self-care problems of children with 94.28% accuracy using only 16.5% of all features.

13.
This paper describes MetaIndex, an automatic indexing program that creates symbolic representations of documents for the purpose of document retrieval. MetaIndex uses a simple transition network parser to recognize a language that is derived from the set of main concepts in the Unified Medical Language System Metathesaurus (Meta-1). MetaIndex uses a hierarchy of medical concepts, also derived from Meta-1, to represent the content of documents. The goal of this approach is to improve document retrieval performance by better representation of documents. An evaluation method is described, and the performance of MetaIndex on the task of indexing the Slice of Life medical image collection is reported.

14.
15.
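Relation classification is an important semantic processing task in natural language processing. Traditional relation classification methods judge the relation between two entities within a sentence using manually designed features and kernel functions. In recent years, work on relation classification has focused on obtaining semantic feature representations of sentences through various neural networks, reducing the need for hand-crafted features. Within a sentence, different keywords contribute to the relation classification task to different degrees, yet semantically important words may appear at any position in the sentence. To address this, an attention-based hybrid neural network model for relation classification is proposed to capture the important semantic information; the method is end-to-end. Experimental results demonstrate the effectiveness of the method.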

16.
This work describes a framework that combines techniques from Adaptive Hypermedia and Natural Language Processing in order to create, in a fully automated way, on-line information systems from linear texts in electronic format, such as textbooks. The process is divided into two steps: an off-line processing step, which analyses the source text, and an on-line step, which executes when a user connects to the system with a web browser, at which point the contents and hyperlinks are generated. The framework has been implemented as the Welkin system, which has been used to build three adaptive on-line information sites in a quick and easy way. Some controlled experiments have been performed with real users aimed to provide positive feedback on the implementation of the system.

17.
Multimedia Tools and Applications - Diagnosis of Glaucoma eye disease is a challenging task for CADx (computer-aided diagnostics) systems. An automatic CADx framework is developed for diagnosing...

18.
One of the pressing problems in the realm of wireless sensor networks is the localization of wireless sensors. Although much research has been conducted in this area, many of the proposed approaches produce unsatisfactory results when exposed to the harsh, uncertain, noisy conditions of a manufacturing environment. In this study, we develop an artificial neural network approach to moderate the effect of miscellaneous noise sources and harsh factory conditions on the localization of wireless sensors. Special attention is given to investigating the effect of blockage and ambient conditions on the accuracy of mobile node localization. A simulator that reproduces the noisy and dynamic shop conditions of manufacturing environments is employed to examine the proposed neural network. The neural network's performance is also validated through actual experiments in a real-world environment prone to different sources of noise and signal attenuation. The simulation and experimental results demonstrate the effectiveness and accuracy of the proposed methodology.

19.
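To address the low recall of the minority class, the high total cost of classification results, and the subjectivity and labor cost of manual feature extraction in imbalanced image classification, an imbalanced image classification method named Triplet-CSSVM is proposed, based on a convolutional neural network with triplet sampling (Triplet-sampling CNN) and a cost-sensitive support vector machine (CSSVM). The method divides classification into two parts: feature learning and cost-sensitive classification. First, a convolutional neural network whose error function is the triplet loss learns, end to end, an encoding that maps images into Euclidean space. Next, the dataset is reconstructed with a sampling method to balance its distribution. Finally, the CSSVM classification algorithm assigns different cost factors to different classes to obtain the classification result with the minimum cost. Experiments were conducted on the FaceScrub portrait dataset using the deep learning framework Caffe. The results show that, at an imbalance ratio of 1:3, the proposed method improves minority-class precision by 31 percentage points and recall by 71 percentage points compared with the VGGNet-SVM method.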
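The triplet loss used as the network's error function can be written in its commonly used form, sketched below under the assumption of squared Euclidean distances and a fixed margin. The margin value and the function names are hypothetical, and the abstract does not confirm that the paper uses exactly this variant.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on embeddings.

    The loss is zero once the negative is farther from the anchor than
    the positive by at least the margin (an assumed value here).
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)
```

Minimizing this quantity over sampled triplets pulls same-class images together and pushes different-class images apart in the embedding space, which is what makes the learned encoding usable as input to a downstream classifier such as the CSSVM.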

20.
Decision tree learning algorithms, e.g., C5, are good at dataset classification. However, those algorithms usually work with only one attribute at a time and adopt a greedy method to build the decision tree. The dependencies among attributes are not considered in those algorithms. Unfortunately, in the real world, most datasets contain attributes that are dependent. Thus, the results generated by those algorithms are not the optimal learning results. However, considering multiple attributes at a time leads to a combinatorial explosion. It is therefore very important to construct a model that efficiently discovers the dependencies among attributes and improves the accuracy and effectiveness of decision tree learning algorithms. Generally, these dependencies are classified into two types: categorical-type and numerical-type dependencies. This paper proposes a Neural Decision Tree (NDT) model to deal with these two kinds of dependencies. The NDT model combines neural network technologies with traditional decision-tree learning capabilities to handle complicated, real-world cases. According to experiments on ten datasets from the UCI database repository, the NDT model can significantly improve the accuracy and effectiveness of C5.
