山东大学学报 (工学版) ›› 2026, Vol. 56 ›› Issue (1): 63-71.doi: 10.6040/j.issn.1672-3961.0.2024.177
• 机器学习与数据挖掘 • 上一篇
朱东晴1,张骏温1,何莲英1,王睿1,刘吉强1,2,张大林2,3*
ZHU Dongqing1, ZHANG Junwen1, HE Lianying1, WANG Rui1, LIU Jiqiang1,2, ZHANG Dalin2,3*
摘要: 针对符号执行路径爆炸问题,设计一种基于主动学习引导的符号执行路径探索策略(path exploration strategy for symbolic execution based on active learning, ALS)。预测待测状态池中所有状态奖励值,自动标注预测准确的高奖励值状态,反馈给模型以更新预测模型;根据模型性能和预测精度判断是否进入下一轮测试;最后一轮模型更新结束后,部分预测模型将自动进行迭代训练,生成多个子模型,共同用于符号执行的路径探索。试验结果表明,在相同试验条件下,ALS相较其他基线方法往往能够实现更高代码覆盖和分支覆盖率,发现更多未定义行为违规数和真实世界程序漏洞,且其生成测试用例质量较高,用作模糊测试初始种子能发现更多程序路径。
中图分类号:
| [1] 吴皓, 周世龙, 史东辉, 等. 符号执行技术及应用研究综述[J]. 计算机工程与应用, 2023, 59(8): 56-72. WU Hao, ZHOU Shilong, SHI Donghui, et al. Review of symbolic execution technology and applications[J]. Computer Engineering and Applications, 2023, 59(8): 56-72. [2] CADAR C, SEN K. Symbolic execution for software testing: three decades later[J]. Communications of the ACM, 2013, 56(2): 82-90. [3] 陈德蕾, 王成, 陈建伟, 等. 基于门控循环单元与主动学习的协同过滤推荐算法[J]. 山东大学学报(工学版), 2020, 50(1): 21-27. CHEN Delei, WANG Cheng, CHEN Jianwei, et al. GRU-based collaborative filtering recommendation algorithm with active learning[J]. Journal of Shandong University(Engineering Science), 2020, 50(1): 21-27. [4] 龚楷伦, 翟婷婷, 唐鸿成. 一种面向多标签分类的在线主动学习算法[J].山东大学学报(工学版), 2022, 52(2): 80-88. GONG Kailun, ZHAI Tingting, TANG Hongcheng. An online active learning algorithm for multi-label classification[J]. Journal of Shandong University(Engineering Science), 2022, 52(2): 80-88. [5] BALDONI R, COPPA E, D'ELIA D C, et al. A survey of symbolic execution techniques[J]. ACM Computing Surveys(CSUR), 2019, 51(3): 1-39. [6] AVGERINOS T, CHA S K, REBERT A, et al. Automatic exploit generation[J]. Communications of the ACM, 2014, 57(2): 74-84. [7] MA K K, YIT PHANG K, FOSTER J S, et al. Directed symbolic execution[C] //Static Analysis: 18th Interna-tional Symposium. Venice, Italy: Springer-Verlag, 2011: 95-111. [8] ZHANG Y F, CHEN Z B, WANG J, et al. Regular property guided dynamic symbolic execution[C] //Proceedings of the 37thInternational Conference on Software Engineering. Florence, Italy: IEEE, 2015: 643-653. [9] CADAR C, DUNBAR D, ENGLER D R. Klee: unassisted and automatic generation of high-coverage tests for complex systems programs[C] //8th USENIX Symposium on Operating Systems Design and Implementation. San Diego, USA: USENIX, 2008: 209-224. [10] GODEFROID P. Compositional dynamic test generation[C] //Proceedings of the 34th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming languages. Nice, USA: ACM, 2007: 47-54. [11] ANAND S, GODEFROID P, TILLMANN N. Demand-driven compositional symbolic execution[C] //Tools and Algorithms for the Construction and Analysis of Systems: 14th International Conference, TACAS 2008. Berlin, Germany: Springer, 2008: 367-381. [12] GODEFROID P, LUCHAUP D. Automatic partial loop summarization in dynamic test generation[C] //Proceedings of the 2011 International Symposium on Software Testing and Analysis. New York, USA: ACM, 2011: 23-33. [13] XIE X F, CHEN B H, LIU Y, et al. Proteus: computing disjunctive loop summary via path dependency analysis[C] //Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering. New York, USA: ACM, 2016: 61-72. [14] KINDER J. Efficient symbolic execution for software testing[C] //FMCAD. Lausanne, Switzerland: IEEE, 2014: 5. [15] 邓维, 李兆鹏. 形状分析符号执行引擎中的状态合并[J]. 计算机科学, 2017, 44(2): 209-215. DENG Wei, LI Zhaopeng. State merging in shape analysis symbolic execution engine[J]. Computer Science, 2017, 44(2): 209-215. [16] KHOO Y P, CHANG B Y E, FOSTER J S. Mixing type checking and symbolic execution[C] //Proceedings of the 31st ALM SIGPLAN Conference on Programming Language Design and Implementation. New York, USA: ACM, 2010: 436-447. [17] SHOSHITAISHVILI Y, WANG R Y, HAUSER C, et al. Firmalice-automatic detection of authentication bypass vulnerabilities in binary firmware[C] //NDSS 2015.San Diego,USA:Internet Society Press, 2015: 8-11. [18] CHA S K, AVGERINOS T, REBERT A, et al. Unleashing mayhem on binary code[C] //2012 IEEE Symposium on Security and Privacy. San Francisco, USA:IEEE, 2012: 380-394. [19] GAO F J, WANG L Z, LI X D. Bovinspector: automatic inspection and repair of buffer overflow vulnerabilities[C] //Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering. New York, USA: ACM, 2016: 786-791. [20] HE J X, SIVANRUPAN G, TSANKOV P, et al. Learning to explore paths for symbolic execution[C] //Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security. New York, USA: ACM, 2021: 2526-2540. [21] CHA S, HONG S, LEE J, et al. Automatically generating search heuristics for concolic testing[C] //Proceedings of the 40th International Conference on Software Engineering. New York, USA: ACM, 2018: 1244-1254. [22] CHA S, OH H. Concolic testing with adaptively changing search heuristics[C] //Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering.New York, USA: ACM, 2019: 235-245. [23] LI X, LIANG Y J, QIAN H, et al. Symbolic execution of complex program driven by machine learning based constraint solving[C] //Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering.New York, USA:ACM, 2016: 554-559. [24] CHA S, LEE S, OH H. Template-guided concolic testing via online learning[C] //Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. New York, USA: ACM, 2018: 408-418. [25] Settles B. Active learning literature survey[R]. University of Wisconsin-Madison, USA: Computer SciencesDepartment, 2009. [26] WU D R. Pool-based sequential active learning for regression[J]. IEEE Transactions on Neural Networks and Learning Systems, 2018, 30(5): 1348-1359. [27] BELUCH W H, GENEWEIN T, NÜRNBERGER A, et al. The power of ensembles for active learning in image classification[C] //2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.Salt Lake City, USA: IEEE, 2018: 9368-9377. [28] LI Y, SU Z D, WANG L Z, et al. Steering symbolic execution to less traveled paths[C] //Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications. Indianapolis, USA: ACM, 2013: 19-32. [29] SVOZIL D, KVASNICKA V, POSPICHAL J. Introduction to multi-layer feed-forward neural networks[J]. Chemometrics and Intelligent Laboratory Systems, 1997, 39(1): 43-62. [30] LIPTON Z C, BERKOWITZ J, ELKAN C. A critical review of recurrent neural networks for sequence learning[EB/OL].(2015-05-29)[2024-07-22]. https://arxiv.org/abs/1506.00019 [31] SAHA S, RAGHAVA G P S. Prediction of continuous B-cell epitopes in an antigen using recurrent neural network[J]. Proteins: Structure, Function, and Bioinformatics, 2006, 65(1): 40-48. [32] BUSSE F, NOWACK M, CADAR C. Running symbolic execution forever[C] //Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis.New York, USA: ACM, 2020: 63-74. [33] BURNIM J, SEN K. Heuristics for scalable dynamic test generation[C] //2008 23rd IEEE/ACM International Conference on Automated Software Engineering. L'Aquila, Italy: IEEE, 2008: 443-446. [34] OGNAWALA S, HUTZELMANN T, PSALLIDA E, et al. Improving function coverage with munch: a hybrid fuzzing and directed symbolic execution approach[C] //Proceedings of the 33rd Annual ACM Symposium on Applied Computing.Pau, France:ACM, 2018: 1475-1482. [35] HE L Y, ZHANG D L, ZHU D Q, et al. Path exploration strategy for symbolic execution based on multi-strategy active learning[C] // Proceedings of the 15th Asia-Pacific Symposium on Internetware. New York, USA: ACM, 2024: 165-168. |
| [1] | 龚楷伦,翟婷婷,唐鸿成. 一种面向多标签分类的在线主动学习算法[J]. 山东大学学报 (工学版), 2022, 52(2): 80-88. |
| [2] | 陈德蕾, 王成, 陈建伟, 吴以茵. 基于门控循环单元与主动学习的协同过滤推荐算法[J]. 山东大学学报 (工学版), 2020, 50(1): 21-27. |
| [3] | 谢伙生,刘敏. 一种基于主动学习的集成协同训练算法[J]. 山东大学学报(工学版), 2012, 42(3): 1-5. |
|
||