Similar literature
 20 similar articles found (search time: 171 ms)
1.
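This paper presents the application of an improved multi-agent system to the analysis of gene expression data. It shows that ordinary classifiers can be used to find effective discriminative genes, and that agents can employ these classifiers when exploring gene expression databases. The result is a small subset of genes that captures many of the gene features and can help identify patients' clinical conditions and characteristics. Experiments show that agents improve their performance through refinement and mutual cooperation, and the results are demonstrated on two well-known gene expression problems.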

2.
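Building a classification model for gastric cancer from gene expression profiles with information-science methods hinges on accurately identifying a set of feature genes that determine the sample class. To address this, the paper studies feature gene selection for gastric cancer based on an analysis of its gene expression profiles. It proposes a new feature gene selection method, CLUSTER_S2N, and uses a support vector machine as the classifier in classification-prediction experiments on gastric cancer, evaluated by classification error rate. The experimental results demonstrate the feasibility and effectiveness of the method.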
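The abstract above does not spell out CLUSTER_S2N. As a rough illustration of the "S2N" part only, the sketch below ranks genes by the classic signal-to-noise score, (mean_a - mean_b) / (sd_a + sd_b), for a two-class problem. That this is the score the authors use is an assumption, the clustering step is not shown, and the function names `s2n_score` and `rank_genes` are hypothetical.

```python
from statistics import mean, stdev


def s2n_score(class_a, class_b):
    """Signal-to-noise score of one gene: (mean_a - mean_b) / (sd_a + sd_b).

    Assumption: the classic two-class definition; the paper may differ.
    """
    spread = stdev(class_a) + stdev(class_b)
    if spread == 0:
        return 0.0  # constant gene in both classes: no usable signal
    return (mean(class_a) - mean(class_b)) / spread


def rank_genes(expression, labels):
    """Rank genes by absolute S2N score, most discriminative first.

    expression: dict mapping gene name -> list of values, one per sample
    labels:     list of 0/1 class labels, one per sample
    """
    scores = {}
    for gene, values in expression.items():
        class_a = [v for v, y in zip(values, labels) if y == 0]
        class_b = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = s2n_score(class_a, class_b)
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
```

The top-ranked genes would then be passed to a classifier such as an SVM, and the subset size chosen by the resulting error rate.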

3.
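Feature gene selection for tumor classification based on fuzzy rough sets   Total citations: 2, self-citations: 0, citations by others: 2
The key to building an effective tumor classification model from gene expression profiles is accurately identifying a set of feature genes that determine the sample class. Rough set theory, a relatively new soft-computing method, can greatly reduce the attributes while preserving the classification ability of the original data set, and so can find the genes useful for classification among a large number of genes. Because gene expression data are continuous, and to avoid the information loss caused by the discretization that rough set methods require, the paper applies fuzzy rough sets to feature gene selection and proposes a mutual-information-based fuzzy rough set attribute reduction algorithm for gene selection on gene expression data sets. KNN and C5.0 classifiers are then used to test the classification performance of the selected feature genes. Experiments on feature gene selection for acute leukemia subtypes (leukemia Microarray) and colorectal cancer (colon Microarray) show the feasibility and effectiveness of the method.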

4.
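《计算机工程》 (Computer Engineering), 2017, (1): 115-119
Web proxy server caching can, to some extent, reduce network congestion and user access latency and lighten server load. However, the hit ratio and byte hit ratio of Web proxy caches are low, so caching does little to speed up responses to network requests. To address this, the paper studies supervised learning: a tree-augmented naive Bayes classifier classifies Web log data to predict which Web objects are likely to be accessed again, and this prediction is combined with the least recently used (LRU) algorithm to form a new caching policy. Experimental results show that the tree-augmented Bayes classifier outperforms classifiers such as naive Bayes and BP neural networks on precision and recall, and that, compared with plain LRU, the caching policy optimized with it not only improves cache efficiency but also effectively raises the request hit ratio and byte hit ratio of the Web proxy cache.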
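One way a revisit prediction could be combined with LRU is sketched below. The abstract does not say how the paper combines the two, so the eviction rule here is an assumption: evict the least recently used object that is predicted not to be revisited, and fall back to plain LRU when every cached object is predicted to return. The `will_revisit` callable is a hypothetical stand-in for the trained classifier.

```python
from collections import OrderedDict


class PredictiveLRUCache:
    """LRU cache whose eviction prefers objects predicted not to be revisited."""

    def __init__(self, capacity, will_revisit):
        self.capacity = capacity
        self.will_revisit = will_revisit  # hypothetical classifier stand-in
        self.items = OrderedDict()        # least recently used entry first

    def get(self, key):
        if key not in self.items:
            return None                   # cache miss
        self.items.move_to_end(key)       # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self._evict(exclude=key)

    def _evict(self, exclude):
        # First choice: the oldest entry the classifier expects not to return.
        for candidate in self.items:
            if candidate != exclude and not self.will_revisit(candidate):
                del self.items[candidate]
                return
        # Fallback: plain LRU eviction of the oldest entry.
        self.items.popitem(last=False)
```

For example, with `PredictiveLRUCache(2, lambda key: key.endswith(".html"))`, inserting `a.html`, `b.png`, and `c.html` evicts `b.png` even though `a.html` is older.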

5.
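Features extracted from medical hand-bone images by general-purpose deep learning algorithms do not distinguish well between images of similar ages, which leads to low prediction accuracy for bone age classifiers. Based on MobileNet, a lightweight deep neural network, an improved bone age classifier, RIL-MobileNetV3 Large, is designed. An improved LBP processing layer yields a hand-bone data set with fine texture features, and an attention mechanism is introduced for automatic localization. Recognition and bone age classification are performed by learning deep regional features from hand-bone X-ray images that have passed through the processing layer. Experiments on a public data set, with repeated training and tuning of the classifier, show that the improved classifier reaches an accuracy of 94.204% and a mean error of 0.350 years in bone age prediction, and that the improved lightweight network lays a foundation for mobile, portable bone age prediction.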

6.
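A sample-type-independent multi-class feature gene selection method   Total citations: 1, self-citations: 0, citations by others: 1
Discriminative feature genes are a focus of gene expression profile analysis, yet current feature gene selection methods do not consider how the imbalanced distribution of genes across classes affects the selection algorithm. A sample-independent feature gene selection method is proposed that uses an improved between-class difference function and a within-class fluctuation function, and selects discriminating genes for each class according to the consistency of the two functions. The method applies to multi-class samples and also remains effective when class sizes are unbalanced or genes are unevenly distributed across classes. Experimental results show that the method keeps the feature vector balanced and improves classifier performance.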

7.
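Because tumor gene expression profiles have few samples and high dimensionality, an ensemble classifier algorithm is proposed for extracting informative tumor genes and identifying subtypes. The algorithm builds a candidate subset from the genes' Fisher ratio values and then uses two measures, the correlation coefficient and mutual information, to construct feature subsets that reflect gene co-expression behavior and regulatory relationships, respectively. Particle swarm optimization is combined with SVM and with KNN to form two base classifiers, which extract informative genes from the candidate subsets and classify tumor subtypes. Finally, absolute majority voting integrates the base classifiers' results. Experimental results on G. Gordon's lung cancer subtype identification data show the feasibility and effectiveness of the algorithm.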
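Two building blocks named in the abstract above, the Fisher ratio and absolute majority voting, can be sketched as follows. The abstract gives no formulas, so the common two-class Fisher ratio, (mean_a - mean_b)^2 / (var_a + var_b), is assumed, and the use of population variance is also an assumption. The function names are hypothetical.

```python
from collections import Counter
from statistics import mean, pvariance


def fisher_ratio(class_a, class_b):
    """Two-class Fisher ratio of one gene (assumed form, population variance)."""
    denominator = pvariance(class_a) + pvariance(class_b)
    if denominator == 0:
        return 0.0  # no within-class spread to normalize by
    return (mean(class_a) - mean(class_b)) ** 2 / denominator


def absolute_majority(votes):
    """Return the label with more than half of the votes, or None if none has."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None
```

Returning `None` when no label exceeds half of the votes leaves the tie-breaking policy to the caller, since the abstract does not describe one.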

8.
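Given the high dimensionality and small sample size of Chinese hamster gene expression profile data, a support vector machine (SVM)-based feature gene selection method for classification is proposed. The method uses an improved Fisher discriminant ratio (FDR) gene scoring criterion to remove genes irrelevant to classification, proposes a new distance composed of spatial distance and functional distance as the similarity measure for removing redundant genes, and uses an SVM as the classifier to test the classification performance of the feature genes. Experimental results show that the method effectively removes irrelevant and redundant genes, and that the selected feature genes form the minimum gene set needed to classify the Chinese hamster samples correctly.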

9.
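顾清华, 张晓玥, 陈露. 《控制与决策》 (Control and Decision), 2022, 37(10): 2456-2466
When surrogate-assisted evolutionary algorithms are used to solve expensive many-objective optimization problems, surrogate models are usually used to approximate the expensive fitness functions. As the number of objectives grows, however, approximation errors accumulate and the computational cost rises sharply. To address this, a surrogate-assisted evolutionary algorithm based on improved ensemble-learning classification is proposed, which uses an improved bagging ensemble classifier as the surrogate model. First, a set of classification boundaries is selected from the individuals that have undergone expensive fitness evaluation, dividing all individuals into two classes. Second, these labeled individuals are used to train the classifier, which then predicts the class of candidate individuals. Finally, promising individuals are selected for expensive fitness evaluation. Experimental results show that the proposed surrogate model effectively improves the ability of classification-based surrogate-assisted evolutionary algorithms to solve expensive many-objective optimization problems, and that the proposed algorithm is more competitive than currently popular surrogate-assisted evolutionary algorithms.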

10.
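曹娟, 张颖淳, 赵玲. 《计算机科学》 (Computer Science), 2013, 40(7): 226-228, 265
The key to building an effective tumor classification model from gene expression profiles is accurately identifying a set of feature genes that determine the sample class. Rough set theory has been applied successfully to feature gene selection for tumor classification. However, the discretization that rough set methods require for continuous-valued gene expression data loses part of the information and affects the classification accuracy of the selected feature genes. For this reason, a mutual-information-based fuzzy rough set algorithm for feature gene selection on gene expression data sets was proposed previously. That algorithm is computationally expensive, though, and becomes impractical when many genes are to be selected. The algorithm is therefore improved: mutual information is approximated from two aspects, maximum relevance and maximum significance (minimum redundancy), which greatly reduces the algorithm's complexity and improves its efficiency. Experiments on feature gene selection for acute leukemia subtypes (leukemia), colorectal cancer (colon), and breast cancer (Breast), followed by classification accuracy tests with 1NN and SVM classifiers, confirm the feasibility and effectiveness of the new method.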

11.
In a DNA microarray dataset, gene expression data often has a huge number of features (which are referred to as genes) versus a small number of samples. With the development of DNA microarray technology, the number of dimensions increases even faster than before, which could lead to the curse of dimensionality. To get good classification performance, it is necessary to preprocess the gene expression data. Support vector machine recursive feature elimination (SVM-RFE) is a classical method for gene selection. However, SVM-RFE suffers from high computational complexity. To remedy this, the paper enhances SVM-RFE for gene selection by incorporating feature clustering, yielding feature clustering SVM-RFE (FCSVM-RFE). The proposed method first performs gene selection roughly and then ranks the selected genes. First, a clustering algorithm is used to cluster genes into gene groups, in each of which the genes have similar expression profiles. Then, a representative gene is found to represent each gene group, which yields a representative gene set. Finally, SVM-RFE is applied to rank these representative genes. FCSVM-RFE can reduce the computational complexity and the redundancy among genes. Experiments on seven public gene expression datasets show that FCSVM-RFE can achieve better classification performance and lower computational complexity when compared with state-of-the-art methods, such as SVM-RFE.
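The first stage described above, grouping genes with similar expression profiles and keeping one representative per group, might look like the sketch below. The abstract names neither the clustering algorithm nor the rule for choosing a representative, so both are assumptions here: genes are grouped greedily by absolute Pearson correlation against a hypothetical `threshold`, and the first gene of each group serves as its representative. The SVM-RFE ranking of the representatives is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if one is constant)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return covariance / (norm_x * norm_y)


def representative_genes(expression, threshold=0.9):
    """Greedy grouping: a gene joins the first group whose representative it
    correlates with (|r| >= threshold); otherwise it starts a new group.

    expression: dict mapping gene name -> list of values, one per sample
    Returns (representatives, groups), with groups keyed by representative.
    """
    representatives = []
    groups = {}
    for gene, profile in expression.items():
        for rep in representatives:
            if abs(pearson(profile, expression[rep])) >= threshold:
                groups[rep].append(gene)
                break
        else:
            # No existing group is similar enough: this gene represents a new one.
            representatives.append(gene)
            groups[gene] = [gene]
    return representatives, groups
```

Only the representatives would be handed to the ranking stage, which is where the reduction in computational cost relative to running SVM-RFE on all genes would come from.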

12.
A multi-agent architecture for control of AGV systems   Total citations: 2, self-citations: 0, citations by others: 2
An agent is an autonomous, computational entity that can be viewed as perceiving its environment and acting upon it. Agents are event-driven objects that can be integrated in automated manufacturing environments to control certain tasks. In this paper a set of agents (a multi-agent system) is introduced to control an automated manufacturing environment. The architecture includes functions at the manufacturing cell level, materials handling and transport level, and factory scheduling level. Communication between these agents is accomplished by using a relational database (blackboard system). The relational database also integrates the requirements of a manufacturing execution system within the multi-agent task structure, which is unique to this architecture. Manufacturing cell and scheduling agents have been previously described in the literature. Here we focus our attention on the functions of the agents of the transport system, which is composed of a set of AGVs.

13.
Microarray technologies are employed to simultaneously measure expression levels of thousands of genes. Data obtained from such experiments allow inference of individual gene functions, help to identify genes from specific tissues, to analyze the behavior of gene expression levels under various environmental conditions and under different cell cycle stages, and to identify inappropriately transcribed genes and several genetic diseases, among many other applications. As thousands of genes may be involved in a microarray experiment, computational tools for organizing and providing possible visualizations of the genes and their relationships are crucial to the understanding and analysis of the data. This work proposes an algorithm based on artificial immune systems for organizing gene expression data in order to simultaneously reveal multiple features in large amounts of data. A distinctive property of the proposed algorithm is the ability to provide a diversified set of high-quality rearrangements of the genes, opening up the possibility of identifying various co-regulated genes from representative graphical configurations of the expression levels. This is a very useful approach for biologists, because several co-regulated genes may exist under different conditions.

14.
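For relatively large-scale colon cancer gene expression profile data, the role of noise handling in gene signature extraction is studied. First, ignoring noise, the ReCorre algorithm determines the classification genes and a plus-l-minus-r search algorithm determines candidate gene signature groups; leave-one-out cross-validation based on a support vector machine is then applied to each group to determine the optimal gene signature. Next, the effect of noise is analyzed: data noise is filtered out by wavelet threshold denoising, and useless genes are handled by an alternating selection algorithm, after which the gene signature is determined again. Experiments show that handling noise in tumor gene expression profiles helps obtain gene signatures with better classification ability.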

15.
Molecular-level diagnostics based on microarray technologies can offer a methodology for precise, objective, and systematic cancer classification. Genome-wide expression patterns generally consist of thousands of genes. It is desirable to extract some significant genes for accurate diagnosis of cancer because not all genes are associated with a cancer. In this paper, we have used representative gene vectors that are highly discriminatory for cancer classes and extracted multiple significant gene subsets based on those representative vectors respectively. Also, an ensemble of neural networks trained on the multiple significant gene subsets is proposed to classify a sample into one of several cancer classes. The performance of the proposed method is systematically evaluated using three different cancer types: leukemia, colon cancer, and B-cell lymphoma.

16.
To infer the structure of a gene regulatory network (GRN), it is important to identify, for each gene in the GRN, which other genes can affect its expression and how they can affect it. For this purpose, many algorithms have been developed to generate hypotheses about the presence or absence of interactions between genes. These algorithms, however, cannot be used to determine if a gene activates or inhibits another. To obtain such information to better infer GRN structures, we propose a fuzzy data mining technique here. By transforming quantitative expression values into linguistic terms, it defines a measure of fuzzy dependency among genes. Using such a measure, the technique is able to discover interesting fuzzy dependency relationships in noisy, high-dimensional time series expression data so that it can not only determine if a gene is dependent on another but also if a gene is supposed to be activated or inhibited. In addition, the technique can also predict how a gene in an unseen sample (i.e., expression data that are not in the original database) would be affected by other genes in it, and this makes statistical verification of the reliability of the discovered gene interactions easier. For evaluation, the proposed technique has been tested using real expression data, and experimental results show that the use of a fuzzy-logic-based technique in gene expression data analysis can be quite effective.
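The step of turning quantitative expression values into linguistic terms can be illustrated with simple membership functions. The abstract does not give the terms or their shapes, so the three terms ("low", "medium", "high"), the shoulder and triangular shapes, and the `[low, high]` range below are all assumptions. The fuzzy dependency measure itself is not shown.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 at or outside [left, right], 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzify(value, low=0.0, high=1.0):
    """Map one expression value to membership degrees in three assumed terms."""
    mid = (low + high) / 2
    return {
        # Left shoulder: fully "low" at the lower bound, fading out by mid.
        "low": max(0.0, min(1.0, (mid - value) / (mid - low))),
        "medium": triangular(value, low, mid, high),
        # Right shoulder: starts at mid, fully "high" at the upper bound.
        "high": max(0.0, min(1.0, (value - mid) / (high - mid))),
    }
```

With these shapes a value can belong partly to two adjacent terms, which is what lets a dependency measure tolerate noisy measurements near a term boundary.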

17.
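曲英伟, 郑广海. 《微机发展》 (Microcomputer Development), 2003, 13(10): 120-121
A template-based analysis method is proposed. The paper explains how agents use the method to construct and parse messages when communicating, and shows that it reduces the assumptions made about the ACL and about message formats. Using templates lowers the requirements for interoperation between agents and allows agents to communicate in an open multi-agent system without agreeing in advance on an ACL protocol or message format.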

18.
Recently, microarray technology has been widely used to study gene expression in cancer diagnosis. The main distinguishing feature of microarray technology is that it can measure thousands of genes at the same time. In the past, researchers typically used parametric statistical methods to find the significant genes. However, microarray data often violate some of the assumptions of parametric statistical methods, or the type I error may be inflated. Therefore, our aim is to establish a gene selection method without such assumptions to reduce the dimension of the data set. In our study, adaptive genetic algorithm/k-nearest neighbor (AGA/KNN) was used to evolve gene subsets. We find that AGA/KNN can reduce the dimension of the data set, and all test samples can be classified correctly. In addition, the accuracy of AGA/KNN is higher than that of GA/KNN, and it takes only half the CPU time of GA/KNN. After using the proposed method, biologists can identify the relevant genes efficiently from the gene subset and classify the test samples correctly.

19.
Selecting highly discriminative genes from gene expression data has become an important research topic. Not only can this improve the performance of cancer classification, but it can also cut down the cost of medical diagnoses when a large number of noisy, redundant genes are filtered out. In this paper, a hybrid Particle Swarm Optimization (PSO) and Genetic Algorithm (GA) method is used for gene selection, and a Support Vector Machine (SVM) is adopted as the classifier. The proposed approach is tested on three benchmark gene expression datasets: leukemia, colon, and breast cancer data. Experimental results show that the proposed method can reduce the dimensionality of the dataset, confirm the most informative gene subset, and improve classification accuracy.

20.
Recently, biology has been confronted with large multidimensional gene expression data sets where the expression of thousands of genes is measured over dozens of conditions. The patterns in gene expression are frequently explained retrospectively by underlying biological principles. Here we present a method that uses text analysis to help find meaningful gene expression patterns that correlate with the underlying biology described in scientific literature. The main challenge is that the literature about an individual gene is not homogeneous and may address many unrelated aspects of the gene. In the first part of the paper we present and evaluate the neighbor divergence per gene (NDPG) method that assigns a score to a given subgroup of genes indicating the likelihood that the genes share a biological property or function. To do this, it uses only a reference index that connects genes to documents, and a corpus including those documents. In the second part of the paper we present an approach, optimizing separating projections (OSP), to search for linear projections in gene expression data that separate functionally related groups of genes from the rest of the genes; the objective function in our search is the NDPG score of the positively projected genes. A successful search, therefore, should identify patterns in gene expression data that correlate with meaningful biology. We apply OSP to a published gene expression data set; it discovers many biologically relevant projections. Since the method requires only numerical measurements (in this case expression) about entities (genes) with textual documentation (literature), we conjecture that this method could be transferred easily to other domains. The method should be able to identify relevant patterns even if the documentation for each entity pertains to many disparate subjects that are unrelated to each other.
