JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE) ›› 2013, Vol. 43 ›› Issue (2): 29-34.

• Articles •

Independent component analysis and co-training based Web spam detection

GAO Shuang1,2, ZHANG Hua-xiang1,2*, FANG Xiao-nan1,2   

  1. Department of Information Science and Engineering, Shandong Normal University, Jinan 250014, China;
  2. Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan 250014, China
  • Received: 2012-12-05  Online: 2013-04-20  Published: 2012-12-05

Abstract:

Web spam detection is of great practical importance, but only a small number of labeled pages are available. Semi-supervised co-training was therefore used to detect Web spam pages. The page features were divided into two views, a content view and a link view. First, the independent components of each view were extracted by independent component analysis (ICA); co-training was then used to predict the label of each Web page. Experimental results showed that this method effectively improved the accuracy of Web spam recognition. The results also showed that applying ICA to each view separately was more effective than the other methods compared.
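
The co-training step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a base classifier, so a nearest-centroid rule stands in for it here, and the function names (`fit_centroids`, `predict_with_margin`, `co_train`), the `rounds` and `per_round` parameters, and the toy feature values are all hypothetical. The ICA step is assumed to have been applied to each view beforehand, so the vectors below play the role of already-extracted independent components.

```python
import math


def fit_centroids(X, y):
    """Stand-in base learner: one mean vector per class label."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        model[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return model


def predict_with_margin(model, x):
    """Return (label, margin); margin is the distance gap between the
    two nearest centroids and serves as a crude confidence score."""
    dists = sorted((math.dist(x, c), label) for label, c in model.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin


def co_train(content, link, labels, rounds=5, per_round=1):
    """content, link: per-page feature vectors for the two views
    (assumed to be ICA-transformed already).
    labels: dict mapping page index -> label for the labeled pages.
    Returns a dict mapping every page index to a label."""
    labels = dict(labels)
    views = (content, link)
    for _ in range(rounds):
        unlabeled = [i for i in range(len(content)) if i not in labels]
        if not unlabeled:
            break
        idx = sorted(labels)
        y = [labels[i] for i in idx]
        newly_labeled = {}
        for view in views:
            # Train one classifier per view on the current labeled set.
            model = fit_centroids([view[i] for i in idx], y)
            scored = [(predict_with_margin(model, view[i]), i) for i in unlabeled]
            scored.sort(key=lambda s: -s[0][1])
            # Each view hands its most confident pages to the shared labeled set.
            for (label, _), i in scored[:per_round]:
                newly_labeled.setdefault(i, label)
        labels.update(newly_labeled)
    # Label any remaining pages with whichever view is more confident.
    idx = sorted(labels)
    y = [labels[i] for i in idx]
    models = [fit_centroids([v[i] for i in idx], y) for v in views]
    for i in range(len(content)):
        if i not in labels:
            guesses = [predict_with_margin(m, v[i]) for m, v in zip(models, views)]
            labels[i] = max(guesses, key=lambda g: g[1])[0]
    return labels


# Toy data: pages 0-2 resemble normal pages, pages 3-5 resemble spam.
content_view = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
                [5.0, 5.1], [4.8, 5.2], [5.3, 4.9]]
link_view = [[1.0], [1.2], [0.8], [9.0], [9.3], [8.7]]
result = co_train(content_view, link_view, {0: "normal", 3: "spam"})
```

In this sketch both classifiers add their confident predictions to one shared labeled set, which is the usual formulation of co-training. In practice the per-view ICA transform would be fitted with a numerical library before calling `co_train`, and the confidence measure would come from the chosen base learner.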

Key words: independent component analysis, co-training, multi-view classification, Web spam detection

CLC Number: TP391