Journal of Shandong University (Engineering Science) ›› 2017, Vol. 47 ›› Issue (1): 42-47. doi: 10.6040/j.issn.1672-3961.1.2016.150

Android malware detection based on SVM

ZHANG Yuling, YIN Chuanhuan*

  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
  • Received: 2016-03-31 Online: 2017-02-20 Published: 2016-03-31
  • Corresponding author: YIN Chuanhuan (born 1976), male, from Beijing, associate professor, Ph.D.; main research interests: machine learning and intrusion detection. E-mail: chyin@bjtu.edu.cn
  • About the author: ZHANG Yuling (born 1990), female, from Anyang, Henan, master's student; main research interest: machine learning. E-mail: 14120451@bjtu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61105056)

Abstract: To detect malware effectively and reduce the threat that malicious software poses to the security of the Android platform, two strategies, probability statistics and feature extraction, were proposed based on an analysis of existing data sets. Each strategy was used to reduce the dimensionality of the extracted features and the amount of uncertain data, and a linear support vector machine (SVM) was then used for classification. With these strategies, model training time was reduced to 16.7% of the original, and the accuracy of detecting unknown malware families improved markedly. The dimensionality reduction strategies were also tested with other commonly used classification algorithms, and the experimental results showed that the reduced data helped improve the classification accuracy of those algorithms.

Key words: SVM, probability statistics, feature extraction, dimensionality reduction, Android malware
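
The abstract does not spell out either reduction strategy, so the following is only an illustrative sketch of the general pipeline it describes (reduce the feature dimension, then train a linear SVM). It assumes binary feature vectors and labels of +1 (malware) and -1 (benign). The frequency-gap filter `select_features` is a hypothetical stand-in for a probability-statistics style reduction, not the authors' method, and the SVM is trained by plain subgradient descent on the hinge loss rather than with any particular library. All function names, the threshold `min_gap`, and the toy data are assumptions.

```python
def select_features(X, y, min_gap=0.3):
    """Hypothetical filter: keep the columns whose occurrence frequency
    differs between malware (+1) and benign (-1) samples by >= min_gap."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == -1]
    keep = []
    for j in range(len(X[0])):
        p_pos = sum(row[j] for row in pos) / len(pos)
        p_neg = sum(row[j] for row in neg) / len(neg)
        if abs(p_pos - p_neg) >= min_gap:
            keep.append(j)
    return keep


def project(X, keep):
    """Drop every column that is not listed in keep."""
    return [[row[j] for j in keep] for row in X]


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Subgradient descent on the L2-regularized hinge loss
    (the bias term b is left unregularized)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for row, label in zip(X, y):
            margin = label * (sum(wj * xj for wj, xj in zip(w, row)) + b)
            # Shrink the weights (gradient of the regularization term).
            w = [(1.0 - lr * lam) * wj for wj in w]
            # Hinge term contributes only when the margin is violated.
            if margin < 1.0:
                w = [wj + lr * label * xj for wj, xj in zip(w, row)]
                b += lr * label
    return w, b


def predict(w, b, row):
    score = sum(wj * xj for wj, xj in zip(w, row)) + b
    return 1 if score >= 0 else -1


# Toy data (made up): columns 0-2 carry signal, columns 3-5 are noise.
X = [
    [1, 1, 0, 1, 0, 1],
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 1],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 1, 0],
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

keep = select_features(X, y)      # indices of the retained columns
X_small = project(X, keep)        # lower-dimensional training set
w, b = train_linear_svm(X_small, y)
predictions = [predict(w, b, row) for row in X_small]
```

On this toy set the filter discards the three noise columns, so the classifier is trained on half as many features. How much training time or accuracy changes on real data depends on the data set and on the actual reduction strategy used.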

CLC number: TP391