JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE) ›› 2011, Vol. 41 ›› Issue (6): 12-17.

• Articles •

Feature engineering for Chinese part-of-speech tagging

YU Jiang-de1, ZHOU Hong-yu1, YU Zheng-tao2

  1. School of Computer and Information Engineering, Anyang Normal University, Anyang 455002, China;
  2. School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650051, China
  • Received:2011-04-15 Online:2011-12-16 Published:2011-04-15

Abstract:

Context features have a major impact on the performance of Chinese part-of-speech tagging. To improve that performance, feature engineering for Chinese part-of-speech tagging was explored using a maximum entropy model. Two key issues were studied: the size of the feature window and the choice of feature templates. Closed evaluations were performed on the PKU, NCC and CTB corpora from Bakeoff-2007. Comparative experiments on the training process and tagging accuracy were then carried out with two feature windows (“5 words” and “3 words”) and three kinds of feature templates (single-word, double-word and mixed). The experimental results showed that the 3-word feature window performed better than the 5-word window, and that performance was 10% higher with single-word feature templates than with double-word feature templates. Overall, the results indicate that a 3-word feature window combined with single-word feature templates is appropriate for Chinese part-of-speech tagging.

Key words: Chinese part-of-speech tagging, maximum entropy model, context feature, feature window, feature template

CLC Number: TP391
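
The feature window and feature template choices described in the abstract can be illustrated with a minimal Python sketch. The abstract does not list the exact templates used in the paper, so everything here is an assumption made for illustration: the function name `extract_features`, the `<PAD>` boundary symbol, the feature string format, and the reading of "double-word" templates as pairs of adjacent words inside the window.

```python
from typing import List

PAD = "<PAD>"  # hypothetical symbol for positions outside the sentence


def extract_features(words: List[str], i: int,
                     window: int = 3, templates: str = "single") -> List[str]:
    """Build context features for the word at position i.

    window:    3 or 5, the number of words in the feature window
               (the current word plus 1 or 2 neighbours on each side).
    templates: "single" (one word per feature), "double" (adjacent
               word pairs) or "mixed" (both kinds together).
    """
    if window not in (3, 5):
        raise ValueError("window must be 3 or 5")
    if templates not in ("single", "double", "mixed"):
        raise ValueError("templates must be 'single', 'double' or 'mixed'")

    half = window // 2

    def word_at(offset: int) -> str:
        # Return the word at the given offset from i, or PAD at the edges.
        j = i + offset
        return words[j] if 0 <= j < len(words) else PAD

    features: List[str] = []
    if templates in ("single", "mixed"):
        # Single-word templates: one feature per position in the window.
        for o in range(-half, half + 1):
            features.append(f"W[{o}]={word_at(o)}")
    if templates in ("double", "mixed"):
        # Double-word templates: one feature per adjacent pair in the window.
        for o in range(-half, half):
            features.append(f"W[{o}]W[{o + 1}]={word_at(o)}|{word_at(o + 1)}")
    return features


if __name__ == "__main__":
    sentence = ["我", "爱", "北京"]
    print(extract_features(sentence, 1, window=3, templates="single"))
    print(extract_features(sentence, 1, window=5, templates="mixed"))
```

Feature strings of this kind would typically be fed, together with the gold tag of each word, to a maximum entropy (multinomial logistic regression) trainer. Under these assumptions a 3-word window with single-word templates produces 3 features per token, while a 5-word window with mixed templates produces 9, which gives a rough sense of how the settings compared in the abstract change the size of the feature space.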