您的位置:山东大学 -> 科技期刊社 -> 《山东大学学报(工学版)》

山东大学学报(工学版) ›› 2016, Vol. 46 ›› Issue (4): 9-14.doi: 10.6040/j.issn.1672-3961.1.2016.229

• • 上一篇    下一篇

重建二倍体个体单体型的改进环基算法

邬慧敏1,吴璟莉1,2,3   

  1. 1. 广西师范大学计算机科学与信息工程学院, 广西 桂林 541004;2. 广西师范大学广西多源信息挖掘与安全重点实验室, 广西 桂林 541004;3. 广西区域多源信息集成与智能处理协同创新中心, 广西 桂林 541004
  • 收稿日期:2016-03-01 出版日期:2016-08-20 发布日期:2016-03-01
  • 通讯作者: 吴璟莉(1978— ),女,广西桂林人,教授,博士,主要研究方向为生物信息学.E-mail: wjlhappy@mailbox.gxnu.edu.cn E-mail:782245542@qq.com
  • 作者简介:邬慧敏(1993— ),女,江西赣州人,硕士研究生,主要研究方向为生物信息学.E-mail:782245542@qq.com
  • 基金资助:
    国家自然科学基金资助项目(61363035,61502111);广西自然科学基金资助项目(2015GXNSFAA139288,2013GXNSFBA019263,2012GXNSFAA053219);“八桂学者”工程专项;广西多源信息挖掘与安全重点实验室系统性研究资助基金(14-A-03-02,15-A-03-02)

An improved cycle basis algorithm for haplotyping a diploid individual

WU Huimin1, WU Jingli1,2,3   

  1. 1. College of Computer Science and Information Technology, Guangxi Normal University, Guilin 541004, Guangxi, China;
    2. Guangxi Key Lab of Multi-source Information Mining &
    Security, Guangxi Normal University, Guilin 541004, Guangxi, China;
    3. Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing, Guilin 541004, Guangxi, China
  • Received:2016-03-01 Online:2016-08-20 Published:2016-03-01

摘要: HapCompass算法是求解最少带权边删除模型(the minimum weighted edge removal, MWER)的有效启发式方法,该算法采用删除权值绝对值最小的边的方式消除冲突环基,当同时存在多条权值绝对值最小的边时,HapCompass随机选择删除边,导致求解方案的不确定性,降低重建效果。针对该问题,提出IHapCompass算法,改进去边规则,利用(00)/(11)和(01)/(10)分型的片段支持差异数与总片段数之间的比值来确定删除边,对随机取值问题做出有效限定。此外,IHapCompass以单体型中0/1取值的概率为图中孤立点赋值,明确孤立点取值。采用真实单体型数据进行测试,结果表明,IHapCompass算法在各种参数设置下,均能获得较算法HapCompass、DGS和Fast Hare更高的单体型重建率,具有较高的执行效率。该算法为求解二倍体个体单体型重建问题提供一定的参考。

关键词: 重建, 图, 最少带权边删除, 序列分析, 单体型

Abstract: HapCompass is an effective heuristic algorithm for solving the minimum weighted edge removal(MWER)model. The HapCompass algorithm eliminated conflicting cycle basis by deleting the edge with the minimum absolute value of weight. When there were several edges whose absolute values of weights were equal to the minimum one, HapCompass chose the deleted one randomly, which led to produce uncertain solutions and decrease the reconstruction effect. Aiming at this problem, algorithm IHapCompass was proposed. It improved the rules of deleting edges to limit the random value effectively. IHapCompass took advantage of the relativity between the difference of fragment number of(00)/(11)and that of(01)/(10)and the total fragment number to ascertain the deleted edge. It made use of the probability of 0/1 in haplotypes to determine the SNP value for an isolated vertex, which ascertained the value of an isolated vertex. Experiments were implemented by using the real haplotypes. The results showed that under different parameter settings, the IHapCompass algorithm can obtain higher reconstruction rate than the HapCompass, the DGS and the Fast Hare algorithm, and has high efficiency. The IHapCompass algorithm could provide a reference for the study on reconstructing single diploid individual haplotypes.

Key words: reconstruction, graph, the minimum weighted edge removal, sequence analysis, haplotype

中图分类号: 

  • TP301.6
[1] LANCIA G, BAFNA V, ISTRAIL S, et al. SNPs problems, complexity, and algorithms[C] // Proceedings of ESA-01. Aarhusv Denmark: Springer Press, 2001:182-193.
[2] LIPPERT R, SCHWARTZ R, LANCIA G, et al. Algorithmic strategies for the single nucleotide polymorphism haplotype assembly problem[J]. Briefings in Bioinformatics, 2002, 3(1):23-31.
[3] SCHWARTZ R. Theory and algorithms for the haplotype assembly problem[J]. Communications in Information & Systems, 2010, 10(1):23-38.
[4] GERACI F. A comparison of several algorithms for the single individual SNP haplotyping reconstruction problem[J]. Bioinformatics, 2010, 26(18):2217-2225.
[5] AGUIAR D, ISTRAIL S. HapCompass: a fast cycle basis algorithm for accurate haplotype assembly of sequence data[J]. Journal of Computational Biology, 2012, 19(6):577-590.
[6] XIE Minzhu, WANG Jianxin, CHEN Xin. LGH: a fast and accurate algorithm for single individual haplotyping based on a two-locus linkage graph[J]. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2015, 12(6):1255-1266.
[7] RIZZI R, BAFNA V, ISTRAIL S, et al. Practical algorithms and fixed-parameter tractability for the single individual SNP haplotyping problem[C] //Proceedings of WABI-02. Rome, Italy: Springer Press, 2002:29-43.
[8] BAFNA V, ISTRAIL S, LANCIA G, et al. Polynomial and APX-hard cases of the individual haplotyping problem[J]. Theoretical Computer Science, 2005, 335(1):109-125.
[9] XIE Minzhu, CHEN Jian'er, WANG Jianxin. Research on parameterized algorithms of the individual haplotyping problem[J]. Journal of Bioinformatics and Computational Biology, 2007, 5(3):795-816.
[10] CILIBRASI R, IERSEL L V, KELK S, et al. The complexity of the single individual SNP haplotyping problem[J]. Algorithmica, 2007, 49(1):13-36.
[11] WANG Ruisheng, WU Lingyun, LI Zhenping, et al. Haplotype reconstruction from SNP fragments by minimum error correction[J]. Bioinformatics, 2005, 21(10):2456-2462.
[12] PANCONESI A, SOZIO M. Fast hare: a fast heuristic for single individual SNP haplotype reconstruction[C] //Proceedings of WABI-04. Bergen, Norway: Springer Press, 2004:266-277.
[13] GENOVESE L M, GERACI F, PELLEGRINI M. SpeedHap: an accurate heuristic for the single individual SNP haplotyping problem with many gaps, high reading error rate and low coverage[J]. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2008, 5(4):492-502.
[14] LEVY S, SUTTON G, NG P C, et al. The diploid genome sequence of an individual human[J]. PLOS Biology, 2007, 5(10):2113-2144.
[15] BANSAL V, BAFNA V. HapCUT: an efficient and accurate algorithm for the haplotype assembly problem[J]. Bioinformatics, 2008, 24(16):i153-i159.
[16] MAZROUEE S, WANG W. FastHap: fast and accurate single individual haplotype reconstruction using fuzzy conflict graphs[J]. Bioinformatics, 2014, 30(17):i371-i378.
[17] 吴璟莉. 遗传多态性检测中组合优化问题的研究[D]. 长沙: 中南大学, 2008. WU Jingli. Research on the combinatorial optimization problem in detection of genetic diversities[D]. Changsha: Central South University, 2008.
[18] WU Jingli, LIANG Binbin. A fast and accurate algorithm for diploid individual haplotype reconstruction[J]. Journal of Bioinformatics and Computational Biology, 2013, 11(4):1350010.
[19] The International HapMap Consortium. The international HapMap project[J]. Nature, 2003, 426(6968):789-796.
[20] MYERS G. A dataset generator for whole genome shotgun sequencing[C] //Proceedings of ISMB-99. Heidelberg, Germany: AAAI Press, 1999:202-210.
[21] RICHTER D C, OTT F, AUCH A F, et al. MetaSim: a sequencing simulator for genomics and metagenomics[J]. Plos One, 2008, 3(10):1-12.
[1] 邓彬, 张宗包, 赵文猛, 罗新航, 吴秋伟. 基于云边协同和图神经网络的电动汽车充电站负荷预测方法[J]. 山东大学学报 (工学版), 2025, 55(5): 62-69.
[2] 周遵富,张乾,石计亮,岳诗琴. 基于纹理和结构交互的人脸图像修复[J]. 山东大学学报 (工学版), 2025, 55(4): 18-28.
[3] 韩小凡,刁振宇,张承宇,聂慧佳,赵秀阳,牛冬梅. 基于注意力和视图信息的单图三维模型检索[J]. 山东大学学报 (工学版), 2025, 55(4): 48-55.
[4] 王旭峰, 周迪,张风雷,宋雪萌,刘萌. 基于多粒度对齐网络的图像-文本匹配方法[J]. 山东大学学报 (工学版), 2025, 55(4): 29-39.
[5] 吕斌,刘淼,吴建清,张子毅,陈启香. 数字地图拼接技术综述[J]. 山东大学学报 (工学版), 2025, 55(3): 1-15.
[6] 高君健,廖祝华,刘毅志,赵肄江. 基于分层多智能体强化学习的个性化与信号控制联合路径引导方法[J]. 山东大学学报 (工学版), 2025, 55(3): 34-45.
[7] 李丰,文益民. 融合多尺度视觉和文本语义特征的图像描述生成算法[J]. 山东大学学报 (工学版), 2025, 55(3): 80-87.
[8] 周彦冰,马士伦,文益民. 基于图结构的概念漂移检测[J]. 山东大学学报 (工学版), 2025, 55(2): 88-96.
[9] 李伟豪,王苹苹,许万博,魏本征. 结构先验引导的多模态腰椎MRI图像分割算法[J]. 山东大学学报 (工学版), 2025, 55(1): 66-76.
[10] 刘全金,嵇文,胡浪涛,黄汇磊,杨瑞,李翔,高泽文,魏本征. 基于双解码器的医学图像分割模型[J]. 山东大学学报 (工学版), 2024, 54(6): 8-18.
[11] 马军,车进,贺愉婷,马鹏森. 基于空间注意力及条件增强的文本生成图像方法[J]. 山东大学学报 (工学版), 2024, 54(6): 49-56.
[12] 鲁志恒,霍延强,韩汶,杜聪,刘轶鹏,张宏博. 基于图像数据和碎石集料级配与用量的碎石集料空隙率快速检测方法[J]. 山东大学学报 (工学版), 2024, 54(6): 89-99.
[13] 薛健,赵琳,张浩,杨璐,郝凡昌. 改进Faster R-CNN的交通标志检测算法[J]. 山东大学学报 (工学版), 2024, 54(5): 34-41.
[14] 邹正标,刘毅志,廖祝华,赵肄江. 动态交通流量预测的时空注意力图卷积网络[J]. 山东大学学报 (工学版), 2024, 54(5): 50-61.
[15] 林振宇,邵蓥侠. 基于盖根堡多项式最佳平方近似的谱图网络[J]. 山东大学学报 (工学版), 2024, 54(5): 93-100.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] 施来顺,万忠义 . 新型甜菜碱型沥青乳化剂的合成与性能测试[J]. 山东大学学报(工学版), 2008, 38(4): 112 -115 .
[2] 岳远征. 远离平衡态玻璃的弛豫[J]. 山东大学学报(工学版), 2009, 39(5): 1 -20 .
[3] 李辉平, 赵国群, 张雷, 贺连芳. 超高强度钢板热冲压及模内淬火工艺的发展现状[J]. 山东大学学报(工学版), 2010, 40(3): 69 -74 .
[4] 孙从征,管从胜,秦敬玉,程川 . 铝合金化学镀镍磷合金结构和性能[J]. 山东大学学报(工学版), 2007, 37(5): 108 -112 .
[5] 薛翊国,李术才,赵岩,苏茂鑫,李为腾,丁志海. 青岛胶州湾海底隧道F44含水断层注浆前后TSP探测分析[J]. 山东大学学报(工学版), 2009, 39(2): 108 -112 .
[6] 胡天亮,李鹏,张承瑞,左毅 . 基于VHDL的正交编码脉冲电路解码计数器设计[J]. 山东大学学报(工学版), 2008, 38(3): 10 -13 .
[7] 田芳1,张颖欣2,张礼3,侯秀萍3,裘南畹3. 新型金属氧化物薄膜气敏元件基材料的开发[J]. 山东大学学报(工学版), 2009, 39(2): 104 -107 .
[8] 王杉,李田泽 . 一种绕线转子感应电机控制的新方法[J]. 山东大学学报(工学版), 2008, 38(3): 86 -89 .
[9] 罗运虎,邢丽冬,王勤,刘海春,翁晓光 . 需求侧2种可中断负荷备用市场报价策略的协调[J]. 山东大学学报(工学版), 2008, 38(3): 77 -80 .
[10] 孙宗耀,刘允刚 . 一类2维不确定非线性系统自适应输出反馈镇定[J]. 山东大学学报(工学版), 2007, 37(5): 34 -39 .