
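Constrained Query of Order-Preserving Submatrix Based on Signature and Trie
Cite this article: JIANG Tao, LI Zhan-Huai, SHANG Xue-Qun, CHEN Bo-Lin, LI Wei-Bang, YIN Zhi-Lei. Constrained Query of Order-Preserving Submatrix Based on Signature and Trie[J]. Journal of Software, 2017, 28(8): 2175-2195.
Authors: JIANG Tao  LI Zhan-Huai  SHANG Xue-Qun  CHEN Bo-Lin  LI Wei-Bang  YIN Zhi-Lei
Affiliation: School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China (all six authors)
Funding: National Basic Research Program of China (973 Program) No. 2012CB316203; National Natural Science Foundation of China Nos. 61033007, 61272121, 61332014, 61572367, 61472321, 61502390; National High Technology Research and Development Program of China (863 Program) No. 2015AA015307; Fundamental Research Funds for the Central Universities No. 3102015JSJ0011; Northwestern Polytechnical University Graduate Student Entrepreneurship Seed Fund No. Z2012128
Abstract: Gene microarray technology is advancing rapidly, and biologists have accumulated large amounts of gene expression data under different experimental conditions. Microarray data analysis has proven to play an important role in understanding gene function, gene regulation, and molecular life processes. The order-preserving submatrix (OPSM) is an effective model in microarray data analysis: it discovers clusters that share the same expression trend across a subset of genes and different experimental conditions. When analyzing gene expression mechanisms, OPSM retrieval undoubtedly saves biologists time and effort. At present, OPSM queries rely mainly on keyword-based retrieval, which gives analysts only weak control over the results. The ad hoc parameter settings that analysts can choose often deviate from their domain knowledge, so the retrieved results are far from what they actually want. To address these problems, this paper proposes two classes of OPSM indexing and constrained query methods based on signature and Trie. Extensive experiments on real data show that the proposed methods are effective and scalable.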

Keywords: gene expression data  OPSM  constrained query  signature  Trie  enumerated sequence
Received: 2016/1/20
Revised: 2016/5/20

Constrained Query of Order-Preserving Submatrix Based on Signature and Trie
JIANG Tao, LI Zhan-Huai, SHANG Xue-Qun, CHEN Bo-Lin, LI Wei-Bang and YIN Zhi-Lei. Constrained Query of Order-Preserving Submatrix Based on Signature and Trie[J]. Journal of Software, 2017, 28(8): 2175-2195.
Authors:JIANG Tao  LI Zhan-Huai  SHANG Xue-Qun  CHEN Bo-Lin  LI Wei-Bang and YIN Zhi-Lei
Affiliation:School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China (all six authors)
Abstract:Advances in microarray technology have made large amounts of gene expression data available from a variety of experimental conditions. Analyzing microarray data plays a key role in understanding gene functions, gene regulation, and cellular processes. The Order-Preserving SubMatrix (OPSM) is an important model in microarray data analysis that captures the identical tendency of gene expressions across a subset of conditions. When analyzing the mechanisms of gene expression, OPSM search undoubtedly saves biologists time and effort. However, OPSM retrieval depends mainly on keyword search, which gives the analyst only weak control over the obtained clusters. Typically, the analyst can only set ad hoc parameters, which are far from a declarative specification of the desired properties at the operational and conceptual levels. To provide more accurate query relevance, this paper proposes two kinds of OPSM indexing and constrained query methods based on signature and Trie. Extensive experiments on real datasets demonstrate that the proposed methods outperform state-of-the-art methods in efficiency and effectiveness.
Keywords:gene expression data  OPSM  constrained query  signature  trie  enumerated sequence