您的位置:山东大学 -> 科技期刊社 -> 《山东大学学报(工学版)》

山东大学学报(工学版) ›› 2016, Vol. 46 ›› Issue (4): 9-14.doi: 10.6040/j.issn.1672-3961.1.2016.229

• • 上一篇    下一篇

重建二倍体个体单体型的改进环基算法

邬慧敏1,吴璟莉1,2,3   

  1. 1. 广西师范大学计算机科学与信息工程学院, 广西 桂林 541004;2. 广西师范大学广西多源信息挖掘与安全重点实验室, 广西 桂林 541004;3. 广西区域多源信息集成与智能处理协同创新中心, 广西 桂林 541004
  • 收稿日期:2016-03-01 出版日期:2016-08-20 发布日期:2016-03-01
  • 通讯作者: 吴璟莉(1978— ),女,广西桂林人,教授,博士,主要研究方向为生物信息学.E-mail: wjlhappy@mailbox.gxnu.edu.cn E-mail:782245542@qq.com
  • 作者简介:邬慧敏(1993— ),女,江西赣州人,硕士研究生,主要研究方向为生物信息学.E-mail:782245542@qq.com
  • 基金资助:
    国家自然科学基金资助项目(61363035,61502111);广西自然科学基金资助项目(2015GXNSFAA139288,2013GXNSFBA019263,2012GXNSFAA053219);“八桂学者”工程专项;广西多源信息挖掘与安全重点实验室系统性研究资助基金(14-A-03-02,15-A-03-02)

An improved cycle basis algorithm for haplotyping a diploid individual

WU Huimin1, WU Jingli1,2,3   

  1. 1. College of Computer Science and Information Technology, Guangxi Normal University, Guilin 541004, Guangxi, China;
    2. Guangxi Key Lab of Multi-source Information Mining &
    Security, Guangxi Normal University, Guilin 541004, Guangxi, China;
    3. Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing, Guilin 541004, Guangxi, China
  • Received:2016-03-01 Online:2016-08-20 Published:2016-03-01

摘要: HapCompass算法是求解最少带权边删除模型(the minimum weighted edge removal, MWER)的有效启发式方法,该算法采用删除权值绝对值最小的边的方式消除冲突环基,当同时存在多条权值绝对值最小的边时,HapCompass随机选择删除边,导致求解方案的不确定性,降低重建效果。针对该问题,提出IHapCompass算法,改进去边规则,利用(00)/(11)和(01)/(10)分型的片段支持差异数与总片段数之间的比值来确定删除边,对随机取值问题做出有效限定。此外,IHapCompass以单体型中0/1取值的概率为图中孤立点赋值,明确孤立点取值。采用真实单体型数据进行测试,结果表明,IHapCompass算法在各种参数设置下,均能获得较算法HapCompass、DGS和Fast Hare更高的单体型重建率,具有较高的执行效率。该算法为求解二倍体个体单体型重建问题提供一定的参考。

关键词: 重建, 图, 最少带权边删除, 序列分析, 单体型

Abstract: HapCompass is an effective heuristic algorithm for solving the minimum weighted edge removal(MWER)model. The HapCompass algorithm eliminated conflicting cycle basis by deleting the edge with the minimum absolute value of weight. When there were several edges whose absolute values of weights were equal to the minimum one, HapCompass chose the deleted one randomly, which led to produce uncertain solutions and decrease the reconstruction effect. Aiming at this problem, algorithm IHapCompass was proposed. It improved the rules of deleting edges to limit the random value effectively. IHapCompass took advantage of the relativity between the difference of fragment number of(00)/(11)and that of(01)/(10)and the total fragment number to ascertain the deleted edge. It made use of the probability of 0/1 in haplotypes to determine the SNP value for an isolated vertex, which ascertained the value of an isolated vertex. Experiments were implemented by using the real haplotypes. The results showed that under different parameter settings, the IHapCompass algorithm can obtain higher reconstruction rate than the HapCompass, the DGS and the Fast Hare algorithm, and has high efficiency. The IHapCompass algorithm could provide a reference for the study on reconstructing single diploid individual haplotypes.

Key words: reconstruction, graph, the minimum weighted edge removal, sequence analysis, haplotype

中图分类号: 

  • TP301.6
[1] LANCIA G, BAFNA V, ISTRAIL S, et al. SNPs problems, complexity, and algorithms[C] // Proceedings of ESA-01. Aarhusv Denmark: Springer Press, 2001:182-193.
[2] LIPPERT R, SCHWARTZ R, LANCIA G, et al. Algorithmic strategies for the single nucleotide polymorphism haplotype assembly problem[J]. Briefings in Bioinformatics, 2002, 3(1):23-31.
[3] SCHWARTZ R. Theory and algorithms for the haplotype assembly problem[J]. Communications in Information & Systems, 2010, 10(1):23-38.
[4] GERACI F. A comparison of several algorithms for the single individual SNP haplotyping reconstruction problem[J]. Bioinformatics, 2010, 26(18):2217-2225.
[5] AGUIAR D, ISTRAIL S. HapCompass: a fast cycle basis algorithm for accurate haplotype assembly of sequence data[J]. Journal of Computational Biology, 2012, 19(6):577-590.
[6] XIE Minzhu, WANG Jianxin, CHEN Xin. LGH: a fast and accurate algorithm for single individual haplotyping based on a two-locus linkage graph[J]. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2015, 12(6):1255-1266.
[7] RIZZI R, BAFNA V, ISTRAIL S, et al. Practical algorithms and fixed-parameter tractability for the single individual SNP haplotyping problem[C] //Proceedings of WABI-02. Rome, Italy: Springer Press, 2002:29-43.
[8] BAFNA V, ISTRAIL S, LANCIA G, et al. Polynomial and APX-hard cases of the individual haplotyping problem[J]. Theoretical Computer Science, 2005, 335(1):109-125.
[9] XIE Minzhu, CHEN Jian'er, WANG Jianxin. Research on parameterized algorithms of the individual haplotyping problem[J]. Journal of Bioinformatics and Computational Biology, 2007, 5(3):795-816.
[10] CILIBRASI R, IERSEL L V, KELK S, et al. The complexity of the single individual SNP haplotyping problem[J]. Algorithmica, 2007, 49(1):13-36.
[11] WANG Ruisheng, WU Lingyun, LI Zhenping, et al. Haplotype reconstruction from SNP fragments by minimum error correction[J]. Bioinformatics, 2005, 21(10):2456-2462.
[12] PANCONESI A, SOZIO M. Fast hare: a fast heuristic for single individual SNP haplotype reconstruction[C] //Proceedings of WABI-04. Bergen, Norway: Springer Press, 2004:266-277.
[13] GENOVESE L M, GERACI F, PELLEGRINI M. SpeedHap: an accurate heuristic for the single individual SNP haplotyping problem with many gaps, high reading error rate and low coverage[J]. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2008, 5(4):492-502.
[14] LEVY S, SUTTON G, NG P C, et al. The diploid genome sequence of an individual human[J]. PLOS Biology, 2007, 5(10):2113-2144.
[15] BANSAL V, BAFNA V. HapCUT: an efficient and accurate algorithm for the haplotype assembly problem[J]. Bioinformatics, 2008, 24(16):i153-i159.
[16] MAZROUEE S, WANG W. FastHap: fast and accurate single individual haplotype reconstruction using fuzzy conflict graphs[J]. Bioinformatics, 2014, 30(17):i371-i378.
[17] 吴璟莉. 遗传多态性检测中组合优化问题的研究[D]. 长沙: 中南大学, 2008. WU Jingli. Research on the combinatorial optimization problem in detection of genetic diversities[D]. Changsha: Central South University, 2008.
[18] WU Jingli, LIANG Binbin. A fast and accurate algorithm for diploid individual haplotype reconstruction[J]. Journal of Bioinformatics and Computational Biology, 2013, 11(4):1350010.
[19] The International HapMap Consortium. The international HapMap project[J]. Nature, 2003, 426(6968):789-796.
[20] MYERS G. A dataset generator for whole genome shotgun sequencing[C] //Proceedings of ISMB-99. Heidelberg, Germany: AAAI Press, 1999:202-210.
[21] RICHTER D C, OTT F, AUCH A F, et al. MetaSim: a sequencing simulator for genomics and metagenomics[J]. Plos One, 2008, 3(10):1-12.
[1] 陈海永,余力,刘辉,杨佳博,胡启迪. 基于经验小波的太阳能电池缺陷图像融合[J]. 山东大学学报 (工学版), 2018, 48(5): 24-31.
[2] 王国新,陈凤东,刘国栋. 基于彩色伪随机编码结构光特征提取方法[J]. 山东大学学报 (工学版), 2018, 48(5): 55-60.
[3] 胡建平,李鑫,谢琪,李玲,张道畅. 基于Delaunay三角化的二维无约束优化EMD方法[J]. 山东大学学报 (工学版), 2018, 48(5): 9-15, 37.
[4] 梁蒙蒙,周涛,夏勇,张飞飞,杨健. 基于PSO-ConvK卷积神经网络的肺部肿瘤图像识别[J]. 山东大学学报 (工学版), 2018, 48(5): 77-84.
[5] 黄劲潮. 基于快速区域建议网络的图像多目标分割算法[J]. 山东大学学报(工学版), 2018, 48(4): 20-26.
[6] 张宪红,张春蕊. 基于六维前馈神经网络模型的图像增强算法[J]. 山东大学学报(工学版), 2018, 48(4): 10-19.
[7] 江珊珊,杨静,范丽亚. 基于PDEs的图像特征提取方法[J]. 山东大学学报(工学版), 2018, 48(4): 27-36.
[8] 何文杰 ,何伟超,孙权森. 压缩感知重构算法的并行化及GPU加速[J]. 山东大学学报(工学版), 2018, 48(3): 110-114.
[9] 叶子云,杨金锋. 一种基于加权图模型的手指静脉识别方法[J]. 山东大学学报(工学版), 2018, 48(3): 103-109.
[10] 李雨鑫,普园媛,徐丹,钱文华,刘和娟. 深度卷积神经网络嵌套fine-tune的图像美感品质评价[J]. 山东大学学报(工学版), 2018, 48(3): 60-66.
[11] 张佩瑞,杨燕,邢焕来,喻琇瑛. 基于核K-means的增量多视图聚类算法[J]. 山东大学学报(工学版), 2018, 48(3): 48-53.
[12] 王雪琴,李树荣,于妤,王家岩. 带几何约束的彩色图像选择性分割[J]. 山东大学学报(工学版), 2018, 48(2): 22-29.
[13] 张振月,李斐,江铭炎. 基于低秩表示投影的无监督人脸特征提取[J]. 山东大学学报(工学版), 2018, 48(1): 15-20.
[14] 王梦园,张雄,马亮,彭开香. 基于因果拓扑图的工业过程故障诊断[J]. 山东大学学报(工学版), 2017, 47(5): 187-194.
[15] 李明虎,李钢,钟麦英. 动态核主元分析在无人机故障诊断中的应用[J]. 山东大学学报(工学版), 2017, 47(5): 215-222.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] 程代展,李志强. 非线性系统线性化综述(英文)[J]. 山东大学学报(工学版), 2009, 39(2): 26 -36 .
[2] 王勇, 谢玉东.

大流量管道煤气的控制技术研究

[J]. 山东大学学报(工学版), 2009, 39(2): 70 -74 .
[3] 刘新1 ,宋思利1 ,王新洪2 . 石墨配比对钨极氩弧熔敷层TiC增强相含量及分布形态的影响[J]. 山东大学学报(工学版), 2009, 39(2): 98 -100 .
[4] 田芳1,张颖欣2,张礼3,侯秀萍3,裘南畹3. 新型金属氧化物薄膜气敏元件基材料的开发[J]. 山东大学学报(工学版), 2009, 39(2): 104 -107 .
[5] 陈华鑫, 陈拴发, 王秉纲. 基质沥青老化行为与老化机理[J]. 山东大学学报(工学版), 2009, 39(2): 125 -130 .
[6] 赵延风1,2, 王正中1,2 ,芦琴1,祝晗英3 . 梯形明渠水跃共轭水深的直接计算方法[J]. 山东大学学报(工学版), 2009, 39(2): 131 -136 .
[7] 李士进,王声特,黄乐平. 基于正反向异质性的遥感图像变化检测[J]. 山东大学学报(工学版), 2018, 48(3): 1 -9 .
[8] 赵科军 王新军 刘洋 仇一泓. 基于结构化覆盖网的连续 top-k 联接查询算法[J]. 山东大学学报(工学版), 2009, 39(5): 32 -37 .
[9] 赵治广,王登杰,田云飞 . 基于灰色理论的路基沉降研究[J]. 山东大学学报(工学版), 2007, 37(3): 86 -88 .
[10] 姚占勇,商庆森,赵之仲,贾朝霞 . 界面条件对半刚性沥青路面结构应力分布的影响[J]. 山东大学学报(工学版), 2007, 37(3): 93 -99 .