JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE) ›› 2011, Vol. 41 ›› Issue (6): 12-17.

• Articles •

Feature engineering for Chinese part-of-speech tagging

YU Jiang-de1, ZHOU Hong-yu1, YU Zheng-tao2

  1. School of Computer and Information Engineering, Anyang Normal University, Anyang 455002, China;
  2. School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650051, China
  • Received: 2011-04-15 Online: 2011-12-16 Published: 2011-04-15

Abstract:

Context features have a major impact on the performance of Chinese part-of-speech tagging. To improve performance, feature engineering for Chinese part-of-speech tagging was explored using a maximum entropy model. Two key issues of feature engineering were studied: the size of the feature window and the feature templates. Closed evaluations were performed on the PKU, NCC and CTB corpora from Bakeoff-2007. Comparative experiments on the training process and tagging accuracy were then carried out with different feature windows (the “5 words” and “3 words” windows) and different feature templates (single-word, double-word and mixed). Experimental results showed that the 3-word feature window performed better than the 5-word window, and that performance was 10% higher with single-word feature templates than with double-word feature templates. Overall, the results showed that a 3-word feature window combined with single-word feature templates is appropriate for Chinese part-of-speech tagging.
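
The two design choices compared above, window size and template type, can be illustrated with a minimal sketch. The abstract does not list the actual templates, so everything here is an assumption made for illustration: the function name `window_features`, the `<PAD>` boundary symbol, the feature-string format, and the reading of "double-word" as adjacent word pairs inside the window. The resulting feature strings are the kind of input a maximum entropy classifier would be trained on.

```python
from typing import List

PAD = "<PAD>"  # hypothetical boundary symbol for positions outside the sentence


def window_features(words: List[str], i: int, half_width: int = 1,
                    single: bool = True, double: bool = False) -> List[str]:
    """Build context features for the word at position i.

    half_width=1 gives a 3-word window (previous, current, next);
    half_width=2 gives a 5-word window.
    single: emit one feature per word in the window.
    double: emit one feature per adjacent word pair in the window
            (an assumed reading of "double-word" templates).
    """
    def w(k: int) -> str:
        # Word at relative offset k, or PAD when outside the sentence.
        j = i + k
        return words[j] if 0 <= j < len(words) else PAD

    offsets = range(-half_width, half_width + 1)
    feats: List[str] = []
    if single:
        feats += [f"W[{k}]={w(k)}" for k in offsets]
    if double:
        feats += [f"W[{k},{k + 1}]={w(k)}|{w(k + 1)}"
                  for k in offsets if k + 1 <= half_width]
    return feats


# Illustrative segmented sentence ("I love Beijing Tiananmen").
sentence = ["我", "爱", "北京", "天安门"]

# 3-word window, single-word templates only.
f3 = window_features(sentence, 1, half_width=1)

# 5-word window, mixed (single-word plus double-word) templates.
f5_mixed = window_features(sentence, 1, half_width=2, single=True, double=True)
```

In this sketch the 3-word single-word setting yields three features per token, while the 5-word mixed setting yields nine, which shows how quickly the feature space grows as the window and template set expand.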

Key words: Chinese part-of-speech tagging, maximum entropy model, context feature, feature window, feature template

CLC Number: TP391