Journal of Shandong University(Engineering Science) ›› 2015, Vol. 45 ›› Issue (5): 22-28.doi: 10.6040/j.issn.1672-3961.0.2014.367

Previous Articles     Next Articles

Parallel implementing probabilistic spreading algorithm using MapReduce programming mode

HE Dongzhi, ZHANG Jifeng, ZHAO Pengfei   

  1. School of Software Engineering, Beijing University of Technology, Beijing 100124, China
  • Published:2020-05-26

Abstract: In order to overcome the limitations of the serial probabilistic spreading algorithm in dealing with large-scale dataset, a parallelization of the algorithm was put forth by using MapReduce. The complex computing tasks were decomposed into a series of MapReduce job flow for distributed parallel processing on Hadoop. The input and output data of every step were stored in the Hadoop distributed file system. Hit ratio was used to compare the parallelizable probabilistic spreading algorithm versus the global ranking method performance. Speedups of the parallelizable algorithm were compared while the amount of data and the number of nodes was different. Experiment results showed that the probabilistic spreading algorithm based on MapReduce had good parallelism and had higher hit ratio than the global ranking method. Data scale that can be handled by the serial algorithm was expanded, and the operation speed of the algorithm was raised.

Key words: MapReduce, cloud computing paltform, bipartite network, probabilistic spreading algorithm, distributed

CLC Number: 

  • TP311
[1] 王国霞,刘贺平.个性化推荐系统综述[J].计算机工程与应用, 2012, 48(7):66-76. WANG Guoxia, LIU Heping. Survey of personalized recommendation system[J]. Computer Engineering and Applications, 2012, 48(7):66-76.
[2] L(¨overU)Linyuan, MEDO Matús, CHI Hoyeung, et al. Recommender systems[J]. Physics Reports, 2012, 519(1):1-49.
[3] 朱郁筱, 吕琳媛. 推荐系统评价指标综述[J]. 电子科技大学学报, 2012, 41(2):163-175. ZHU Yuxiao, L(¨overU)Linyuan. Evaluation metrics for recommender systems[J]. Journal of University of Electronic Science and Technology of China, 2012, 41(2):163-175.
[4] ZHOU Tao, REN Jie, MEDO Matús, et al. Bipartite network projection and personal recommendation[J]. Physical Review E, 2007, 76(4):046115.
[5] LATAPY Matthieu, MAGNIEN Clémence, DELVECCHIO Nathalie. Basic notions for the analysis of large two-mode networks[J]. Social Networks, 2008, 30(1):31-48.
[6] 吴亚晶, 张鹏, 狄增如, 等. 二分网络研究[J]. 复杂系统与复杂性科学, 2010, 7(1):1-12. WU Yajing, ZHANG Peng, DI Zengru, et al. Study on bipartite networks[J]. Complex Systems and Complexity Science, 2010, 7(1):1-12.
[7] 刘亮亮, 曹菡, 韩亚楠. 基于群体动力学的协同过滤算法及应用[J]. 计算机应用研究, 2014, 31(12):3603-3605, 3612. LIU Liangliang, CAO Han, HAN Yanan. Algorithm and application of collaborative filtering based on group dynamics[J]. Application Research of Computers, 2014, 31(12):3603-3605, 3612.
[8] KOREN Yehuda, BELL Robert. Recommender systems handbook[M]. New York: Springer, 2011:145-186.
[9] ADOMAVICIUS G, TUZHILIN A. Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions[J]. Knowledge and Data Engineering, 2005, 17(6):734-749.
[10] 陈吉荣, 乐嘉锦. 基于 Hadoop 生态系统的大数据解决方案综述[J]. 计算机工程与科学, 2013, 35(10):25-35. CHEN Jirong, LE Jiajin. Reviewing the big data solution based on hadoop ecosystem[J]. Computer Engineering & Science, 2013, 35(10):25-35.
[11] 冯汉超, 周凯东. 分布式系统下大数据存储结构优化研究[J]. 河北工程大学学报:自然科学版, 2014, 31(4):69-73. FENG Hanchao, ZHOU Kaidong. Research on optimizing big data storage structure in distributed system[J]. Journal of Hebei University of Engineering:Natural Science Edition, 2014, 31(4):69-73.
[12] 刘飞. 基于云计算的分布式存储系统的研究和应用[D]. 西安:西安工业大学, 2012. LIU Fei. Research and application of distributed storage system based on cloud computing[D]. Xi'an:Xi'an Technological University, 2012.
[13] BORTHAKUR D. HDFS architecture guide[J]. Hadoop Apache Project, 2008:53.
[14] 黄文依, 王劲松, 林胜. HDFS可视化操作研究与实现[J]. 天津理工大学学报, 2012, 28(1):31-34. HANG Wenyi, WANG Jinsong, LIN Sheng. Research and implementation of HDFS visualized operation[J]. Journal of Tianjin University of Technology, 2012, 28(1):31-34.
[15] KAVULYA S, TAN J, GANDHI R, et al. An analysis of traces from a production mapreduce cluster[C] //Processdings of Cluster, Cloud and Grid Computing(CCGrid). Melbourne:IEEE, 2010:94-103.
[16] LIN Wenhui, LEI Zhenming, LIU Jun, et al. MapReduce optimization algorithm based on machine learning in heterogeneous cloud environment[J]. The Journal of China Universities of Posts and Telecommunications, 2013, 20(6):77-121.
[17] 应毅,刘亚军. MapReduce并行计算技术发展综述[J]. 计算机系统应用, 2014,23(4):1-6,11. YING Yi, LIU Yajun. Survey of developments of MapReduce parallel computing technology[J]. Computer Systems & Applications, 2014, 23(4):1-6,11.
[18] DITTRICH J, QUIANÉRUIZ J A. Efficient big data processing in hadoop MapReduce[J]. Proceedings of the VLDB Endowment, 2012, 5(12):2014-2015.
[19] 赵卫中,马慧芳,傅燕翔,等.基于云计算平台Hadoop的并行k-means聚类算法设计研究[J].计算机科学, 2011(10):166-168. ZHAO Weizhong, MA Huifang, FU Yanxiang, et al. Research on parallel k-means algorithm design based on hadoop platform[J]. Computer Science, 2011(10):166-168.
[20] HASSANZADEH O, KEMENTSIETSIDIS A, KIMEL-FELD B, et al. Next generation data analytics at IBM research[J]. Proceedings of the VLDB Endowment, 2013, 6(11):1174-1175.
[21] BORTHAKUR D, GRAY J, SARMA J S, et al. Apache hadoop goes realtime at facebook[C] //Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data. New York:ACM, 2011:1071-1080.
[22] LEE H, KWON J. Similar user clustering based on movielens data set[J]. Advanced Science and Technology Letters, 2014, 51:32-35.
[1] SUN Donglei, SUN Yi, LIU Rui, SUN Pengkai, ZHANG Yumin. Ridge regression-based method for predicting distributed photovoltaic consumption capacity in distribution networks [J]. Journal of Shandong University(Engineering Science), 2025, 55(3): 149-157.
[2] LIU Xinfeng, ZHANG YiNi, XU Huisan, SONG Ling, CHEN Mengya. Shadow occlusion diagnosis of distributed photovoltaic power station based on random forest and expert system [J]. Journal of Shandong University(Engineering Science), 2021, 51(2): 98-104.
[3] Shizhan SONG,Haoyu CHEN,Jian ZHANG,Kun WANG,Qingshui HAO. Voltage control method of urban distribution network considering street light charging pile access [J]. Journal of Shandong University(Engineering Science), 2020, 50(3): 104-110.
[4] Fangyun HAN, Liang QIAO, Bincheng ZHAO, Li ZHANG. Weighted value of solar tariff based on time-of-use electricity price [J]. Journal of Shandong University(Engineering Science), 2019, 49(6): 93-97.
[5] Fei WANG, Chunyi WANG, Chuanyong WANG, Guangfeng ZHAO, Mu LI, Xiaohan SHI. Restrictions and countermeasures of distributed PV integration based on secondary battery energy storage system [J]. Journal of Shandong University(Engineering Science), 2018, 48(6): 109-115.
[6] Qingtao QU,Qicheng LIU,Chunxiao MU. A parallel adaptive news topic tracking algorithm based on N-Gram language model [J]. Journal of Shandong University(Engineering Science), 2018, 48(6): 37-43.
[7] WANG Qi, SUN Zhumei, LIU Shaohong, BAI Jianyun. Integration transform of dust removal system based on fieldbus compatible technology [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2018, 48(4): 37-41.
[8] QIN Liguo, HE Xiao, ZHOU Donghua. A new distributed formation for multi-agent systems with constant time delays [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2017, 47(5): 79-88.
[9] LIU Yang, LIU Bo, WANG Feng. Optimization algorithm for big data mining based on parameter server framework [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2017, 47(4): 1-6.
[10] LI Qingdong, LUO Weichao, YE Yingxin, ZHANG Chengrui, HU Tianliang. Sharing mechanism of machine tool resource based on ontology [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2017, 47(3): 130-138.
[11] SUN Yibing, FU Minyue, WANG Bingchang, ZHANG Huanshui. Distributed state estimation algorithm for large-scale dynamic systems [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2016, 46(6): 62-68.
[12] FAN Debin, DENG Changshou, YUAN Sihao, TAN Xujie, DONG Xiaogang. Distributed particle swarm optimization algorithm based on mapreduce [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2016, 46(6): 23-30.
[13] CHEN Jiming, SUN Mingyu, YOU Jujuan, KANG Zhongjian. Research of reactive power optimization based on subspace bacterial colony chemotaxis algorithm to in distribution networks with distributed generation#br# [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2014, 44(2): 49-54.
[14] CHEN Wen-qiang1, LIN Chen1,2, CHEN Ke3, CHEN Jin-xiu1, ZOU Quan1,2*. Distributed affinity propagation clustering algorithm based on GraphLab [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2013, 43(5): 13-18.
[15] XU Ming-xia1,2, JIANG Xin-liang1, SUN Ming-ting3. Simplified calculation method under uniform distributed load for composite wall panels [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2012, 42(6): 80-85.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] ZHANG Yong-hua,WANG An-ling,LIU Fu-ping . The reflected phase angle of low frequent inhomogeneous[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2006, 36(2): 22 -25 .
[2] KONG Xiang-zhen,LIU Yan-jun,WANG Yong,ZHAO Xiu-hua . Compensation and simulation for the deadband of the pneumatic proportional valve[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2006, 36(1): 99 -102 .
[3] LAI Xiang . The global domain of attraction for a kind of MKdV equations[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2006, 36(1): 87 -92 .
[4] YU Jia yuan1, TIAN Jin ting1, ZHU Qiang zhong2. Computational intelligence and its application in psychology[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2009, 39(1): 1 -5 .
[5] LI Ke,LIU Chang-chun,LI Tong-lei . Medical registration approach using improved maximization of mutual information[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2006, 36(2): 107 -110 .
[6] JI Tao,GAO Xu/sup>,SUN Tong-jing,XUE Yong-duan/sup>,XU Bing-yin/sup> . Characteristic analysis of fault generated traveling waves in 10 Kv automatic blocking and continuous power transmission lines[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2006, 36(2): 111 -116 .
[7] QIN Tong, SUN Fengrong*, WANG Limei, WANG Qinghao, LI Xincai. 3D surface reconstruction using the shape based interpolation guided by maximal discs[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2010, 40(3): 1 -5 .
[8] SUN Dianzhu, ZHU Changzhi, LI Yanrui. [J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2009, 39(1): 84 -86 .
[9] YUE Yuan-Zheng. Relaxation in glasses far from equilibrium[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2009, 39(5): 1 -20 .
[10] ZHANG Ai-juan. Synthesis of bone-like hydroxyapatite in simulated body fluid[J]. JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), 2010, 40(3): 86 -90 .