Found 18 similar documents.
1.
2.
3.
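With tumor gene expression profile data now widely used across medicine and related fields, processing such data with feature selection algorithms has become a major research focus. This paper therefore proposes a combined FCBF-Lasso algorithm. First, the FCBF algorithm performs feature selection on each gene dataset, removing redundant and irrelevant features to obtain a feature subset. Then the Lasso method is applied to that subset for a second round of feature selection, further removing...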
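As a rough illustration of the first stage: FCBF scores features by symmetric uncertainty (SU), a normalized form of mutual information. The sketch below computes only SU for discrete-valued sequences. It is not the FCBF-Lasso pipeline from the entry above, whose abstract is truncated. The function names are hypothetical, and the inputs are assumed to be already discretized expression levels.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), in the range [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        # Assumption: two constant sequences carry no information, so score 0.
        return 0.0
    # Joint entropy from the paired values, then I(X; Y) = H(X) + H(Y) - H(X, Y).
    hxy = entropy(list(zip(x, y)))
    return 2 * (hx + hy - hxy) / (hx + hy)
```

In an FCBF-style filter, features with low SU against the class label would be discarded as irrelevant, and a feature would be dropped as redundant when its SU with an already selected feature is at least as large as its SU with the class.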
4.
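Using tumor gene expression profiles to build effective "predictive" classification models that accurately discriminate tumor subtypes and identify a set of feature genes that determine sample class is an important topic in current bioinformatics research. Building on an analysis of the characteristics of tumor gene expression profiles, and taking the gene expression profiles of acute leukemia as an example, this paper studies tumor subtype recognition and the selection of feature genes for classification. For the class separability criterion, the existing "signal-to-noise ratio" measure is revised and used to eliminate irrelevant genes, and a support vector machine serves as the classifier for identifying tumor subtypes. For feature gene selection, the paper starts from biological analysis: it first removes irrelevant genes and strongly correlated redundant genes, then applies a sequential floating search algorithm to select the feature genes for classification. Experimental results demonstrate the feasibility and effectiveness of the approach.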
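For orientation, the widely used unmodified signal-to-noise score for a gene in a two-class problem is the difference of class means divided by the sum of class standard deviations. The sketch below implements that classic form only. The abstract does not describe the paper's revised measure, so it is not reproduced here. The function names and the dictionary-based data layout are assumptions made for illustration.

```python
from statistics import mean, stdev


def signal_to_noise(class_a, class_b):
    """Classic score for one gene: (mean_a - mean_b) / (sd_a + sd_b)."""
    spread = stdev(class_a) + stdev(class_b)
    if spread == 0:
        # Simplifying assumption: a gene with zero spread in both classes scores 0.
        return 0.0
    return (mean(class_a) - mean(class_b)) / spread


def rank_genes(expression, labels):
    """Rank gene names by |score|, most discriminative first.

    expression: dict mapping gene name -> list of values, one per sample.
    labels: list of 0/1 class labels, one per sample.
    """
    scores = {}
    for gene, values in expression.items():
        a = [v for v, y in zip(values, labels) if y == 0]
        b = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = signal_to_noise(a, b)
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
```

Genes near the bottom of such a ranking are candidates for removal as irrelevant before a classifier such as a support vector machine is trained.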
5.
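There has been much research on gene mining and gene function analysis. In recent years, researchers have studied feature gene extraction in gene expression data analysis, rough-set-based classification of gene expression data, and the application of granular computing to feature selection for gene microarray data. Analyzing gene expression data with granular computing reduction theory helps to discover genes with different utilities. Mining feature genes on the basis of granular computing is a current focus and hot topic at the intersection of biology and information technology.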
6.
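cDNA microarray (biochip) expression data are widely used in biomedical research, yet processing them computationally still poses many challenging problems. This paper proposes a new normalization method, based on invariant genes, for supervised sets of multi-class cDNA microarray expression data. Besides achieving normalization, the method can also be used directly for feature selection on gene expression data. Experiments show good results.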
7.
8.
9.
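This paper proposes an offline handwritten digit recognition method based on principal curves. The method applies principal curves and a knowledge reduction algorithm in the recognition model. Principal curves are a nonlinear generalization of principal component analysis: smooth curves that pass through the "middle" of the data distribution and satisfy "self-consistency". They reflect the structural characteristics of the data distribution well. Knowledge reduction in rough set theory is an effective tool for extracting decision (classification) rules from decision tables. The paper uses principal curves for feature extraction from the training data and generates a decision table from the principal-curve features. The knowledge reduction algorithm proposed by the authors then processes the decision table to obtain classification rules automatically. This approach matches how people recognize digits and overcomes the shortcomings of recognition based on statistical features. Experimental results show that the method effectively improves the recognition rate for handwritten digits and offers a new route for research on offline handwritten digit recognition.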
10.
An MST-based community mining algorithm for gene data   Cited by: 1 (self-citations: 0; citations by others: 1)
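Using machine learning methods to analyze complex gene expression data is an important research area in bioinformatics. Here, community mining is used to classify gene expression data: each community consists of similar gene data, and studying the structure and function of each community and the relationships among communities is important for a deeper understanding of the nature of many biological processes. The concept of the minimum spanning tree (MST) is introduced into community mining for gene expression data. An MST is designed to represent the gene expression data, a community mining algorithm is built on it, and several objective functions are proposed to assess the algorithm's performance. Finally, experiments verify that the algorithm produces optimal community partitions with respect to some of the objective functions and that it performs well.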
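The general idea of MST-based grouping can be sketched as follows. This is a generic textbook variant, not the algorithm or objective functions of the paper above, which the abstract does not specify. It runs Kruskal's algorithm over pairwise Euclidean distances and stops once k components remain, which is equivalent to building the full MST and deleting its k-1 longest edges. The function name, the choice of Euclidean distance, and the parameter k are all assumptions.

```python
from itertools import combinations
from math import dist


def mst_communities(profiles, k):
    """Split expression profiles into k groups via a truncated Kruskal run.

    profiles: dict mapping gene name -> tuple of expression values.
    k: desired number of communities (assumed 1 <= k <= number of genes).
    """
    names = list(profiles)
    parent = {name: name for name in names}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # All candidate edges, shortest first.
    edges = sorted(
        combinations(names, 2),
        key=lambda e: dist(profiles[e[0]], profiles[e[1]]),
    )
    components = len(names)
    for a, b in edges:
        if components <= k:
            break  # the remaining (longer) MST edges are the ones "cut"
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            components -= 1

    groups = {}
    for name in names:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())
```

For example, with four genes whose profiles form two tight pairs far apart, calling `mst_communities(profiles, 2)` returns the two pairs as separate communities.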
11.
12.
An evolutionary approach for gene expression patterns   Cited by: 1 (self-citations: 0; citations by others: 1)
Huai-Kuang Tsai, Jinn-Moon Yang, Yuan-Fang Tsai, Cheng-Yan Kao. IEEE Transactions on Information Technology in Biomedicine, 2004, 8(2): 69-78
This study presents an evolutionary algorithm, called a heterogeneous selection genetic algorithm (HeSGA), for analyzing the patterns of gene expression on microarray data. Microarray technologies have provided the means to monitor the expression levels of a large number of genes simultaneously. Gene clustering and gene ordering are important in analyzing a large body of microarray expression data. The proposed method simultaneously solves gene clustering and gene-ordering problems by integrating global and local search mechanisms. Clustering and ordering information is used to identify functionally related genes and to infer genetic networks from immense microarray expression data. HeSGA was tested on eight test microarray datasets, ranging in size from 147 to 6221 genes. The experimental clustering and visual results indicate that HeSGA not only ordered genes smoothly but also grouped genes with similar gene expressions. Visualized results and a new scoring function that references predefined functional categories were employed to confirm the biological interpretations of results yielded using HeSGA and other methods. These results indicate that HeSGA has potential in analyzing gene expression patterns.
13.
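To address the curse of dimensionality caused by the high-dimensional, small-sample nature of gene expression data, this paper combines regression with class preserving projection and proposes a new dimensionality reduction method for gene expression data, called sparse class preserving projection. Compared with class preserving projection, it effectively avoids the matrix singularity and overfitting problems that class preserving projection suffers from when reducing gene expression data. Data visualization and classification experiments on real gene expression data verify the effectiveness of the method.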
14.
Cluster analysis of gene expression data based on self-splitting and merging competitive learning   Cited by: 8 (self-citations: 0; citations by others: 8)
Shuanhu Wu, Liew A.W.-C., Hong Yan, Mengsu Yang. IEEE Transactions on Information Technology in Biomedicine, 2004, 8(1): 5-15
Cluster analysis of gene expression data from a cDNA microarray is useful for identifying biologically relevant groups of genes. However, finding the natural clusters in the data and estimating the correct number of clusters are still two largely unsolved problems. In this paper, we propose a new clustering framework that is able to address both these problems. By using the one-prototype-take-one-cluster (OPTOC) competitive learning paradigm, the proposed algorithm can find natural clusters in the input data, and the clustering solution is not sensitive to initialization. In order to estimate the number of distinct clusters in the data, we propose a cluster splitting and merging strategy. We have applied the new algorithm to simulated gene expression data for which the correct distribution of genes over clusters is known a priori. The results show that the proposed algorithm can find natural clusters and give the correct number of clusters. The algorithm has also been tested on real gene expression changes during yeast cell cycle, for which the fundamental patterns of gene expression and assignment of genes to clusters are well understood from numerous previous studies. Comparative studies with several clustering algorithms illustrate the effectiveness of our method.
15.
Gridding microarray images remains a major bottleneck: it requires human intervention, which introduces variation into the gene expression results. In this paper, an original and fully automatic approach for accurately locating a distorted grid structure in a microarray image is presented. The gridding process is expressed as an optimization problem, which is solved using a genetic algorithm (GA). The GA determines the line segments constituting the grid structure. The proposed method has been compared with existing software tools as well as with a recently published technique. For this purpose, several real and artificial microarray images containing more than one million spots have been used. The results show that the proposed method reaches an accuracy of 94% and outperforms the existing approaches. It is also noise-resistant and yields excellent results even under adverse conditions such as arbitrary grid rotations and varying spot sizes.
16.
Debashis Ghosh. The Journal of VLSI Signal Processing, 2004, 38(3): 277-286
High-throughput gene expression technologies such as microarrays have been utilized in a variety of scientific applications. In this article, we develop multivariate techniques for visualizing gene regulatory networks using independent components analysis (ICA) techniques. A desirable feature of the ICA method is that it approximates a biological model for the gene expression. The methods are outlined and illustrated with application to yeast gene expression data.
17.
Jing L., Ng M. K., Liu Y. IEEE Transactions on Information Technology in Biomedicine, 2010, 14(1): 107-118
18.
Mohamad M. S., Omatu S., Deris S., Yoshioka M. IEEE Transactions on Information Technology in Biomedicine, 2011, 15(6): 813-822
Gene expression data are expected to be of significant help in the development of efficient cancer diagnosis and classification platforms. To select a small subset of informative genes from the data for cancer classification, many researchers have recently been analyzing gene expression data using various computational intelligence methods. However, because of the small number of samples compared to the huge number of genes (high dimension), irrelevant genes, and noisy genes, many of the computational methods have difficulty selecting the small subset. Thus, we propose an improved (modified) binary particle swarm optimization to select the small subset of informative genes that is relevant for cancer classification. In this proposed method, we introduce a particle's speed, which gives the rate at which a particle changes its position, and we propose a rule for updating particles' positions. By performing experiments on ten different gene expression datasets, we have found that the performance of the proposed method is superior to previous related works, including the conventional version of binary particle swarm optimization (BPSO), in terms of classification accuracy and the number of selected genes. The proposed method also produces lower running times compared to BPSO.
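For context, the conventional BPSO baseline mentioned in the abstract can be sketched as a single particle update: the velocity is pulled toward the personal and global best positions, squashed by a sigmoid, and used as the probability that each bit (gene) is selected. This is the commonly used form with an inertia weight, not the modified update rule proposed in the paper, which the abstract does not spell out. The function name and the default values of `w`, `c1`, `c2`, and `v_max` are assumptions.

```python
import math
import random


def bpso_step(position, velocity, personal_best, global_best, rng,
              w=0.7, c1=2.0, c2=2.0, v_max=4.0):
    """One conventional BPSO update for a single particle.

    position, personal_best, global_best: lists of 0/1 (1 = gene selected).
    velocity: list of floats, same length.
    rng: a random.Random instance, passed in so runs can be made reproducible.
    """
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        # Velocity update: inertia plus attraction to personal and global bests.
        v = w * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        # Clamp so the sigmoid does not saturate completely.
        v = max(-v_max, min(v_max, v))
        # The sigmoid of the velocity is the probability that this bit is 1.
        prob_one = 1.0 / (1.0 + math.exp(-v))
        new_velocity.append(v)
        new_position.append(1 if rng.random() < prob_one else 0)
    return new_position, new_velocity
```

A full gene selection loop would evaluate each new position with a classifier-based fitness function and update the personal and global bests before the next step. That part depends on the dataset and classifier and is omitted here.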