JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE) ›› 2013, Vol. 43 ›› Issue (2): 29-34.

• Articles •

Independent component analysis and co-training based Web spam detection

GAO Shuang1,2, ZHANG Hua-xiang1,2*, FANG Xiao-nan1,2   

  1. Department of Information Science and Engineering, Shandong Normal University, Jinan 250014, China;
    2. Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan 250014, China
  • Received: 2012-12-05 Online: 2013-04-20 Published: 2012-12-05

Abstract:

Web spam detection is important, but only a small number of labeled pages are available. Semi-supervised co-training was therefore used to detect Web spam pages. The page features were divided into two views: a content view and a link view. First, the independent components of each view were extracted by independent component analysis; co-training was then applied to predict the label of each Web page. Experimental results showed that this method effectively improved the recognition accuracy of Web spam. The results also verified that applying independent component analysis to each view separately was more effective than the other methods compared.
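The co-training step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid base learner, the confidence margin, the `rounds` and `per_round` parameters, and the synthetic two-view data are all assumptions made for illustration. The per-view independent component analysis is assumed to have been applied to the features beforehand and is not shown.

```python
import random


def centroid(rows):
    """Mean vector of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Tiny stand-in base learner (an assumption, not the paper's classifier)."""

    def fit(self, X, y):
        self.centroids = {
            c: centroid([x for x, t in zip(X, y) if t == c]) for c in set(y)
        }
        return self

    def predict_with_margin(self, x):
        # Rank classes by distance; the margin is the gap between the two closest.
        ranked = sorted((sq_dist(x, m), c) for c, m in self.centroids.items())
        margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else 0.0
        return ranked[0][1], margin


def co_train(content, link, labels, rounds=5, per_round=2):
    """Grow the labeled set: each view pseudo-labels its most confident pages.

    content, link: per-page feature vectors for the two views.
    labels: dict mapping page index -> class for the initially labeled pages.
    """
    labels = dict(labels)
    for _ in range(rounds):
        unlabeled = [i for i in range(len(content)) if i not in labels]
        if not unlabeled:
            break
        idx = sorted(labels)
        y = [labels[i] for i in idx]
        new = {}
        for view in (content, link):
            clf = NearestCentroid().fit([view[i] for i in idx], y)
            scored = []
            for i in unlabeled:
                label, margin = clf.predict_with_margin(view[i])
                scored.append((margin, i, label))
            scored.sort(reverse=True)  # most confident first
            for _margin, i, label in scored[:per_round]:
                new.setdefault(i, label)  # first view to claim a page wins
        labels.update(new)  # both views train on the enlarged set next round
    return labels


# Synthetic demo data (hypothetical): 1 = spam, 0 = normal.
random.seed(0)


def make_page(spam):
    c = 2.0 if spam else -2.0
    content_vec = [random.gauss(c, 0.5), random.gauss(c, 0.5)]
    link_vec = [random.gauss(-c, 0.5), random.gauss(0.0, 0.5)]
    return content_vec, link_vec


truth = [i % 2 for i in range(40)]
pages = [make_page(t) for t in truth]
content = [p[0] for p in pages]
link = [p[1] for p in pages]
seed_labels = {0: 0, 1: 1, 2: 0, 3: 1}  # only four labeled pages

result = co_train(content, link, seed_labels, rounds=20, per_round=2)
accuracy = sum(result[i] == truth[i] for i in result) / len(result)
print(f"labeled {len(result)} pages, agreement with ground truth: {accuracy:.2f}")
```

In this sketch each view contributes only its highest-margin predictions per round, so the labeled pool grows gradually and a mistake made by one view is less likely to be reinforced by the other.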

Key words: independent component analysis, co-training, multi-view classification, Web spam detection

CLC Number: 

  • TP391