Journal of Shandong University(Engineering Science) ›› 2022, Vol. 52 ›› Issue (6): 105-114.doi: 10.6040/j.issn.1672-3961.0.2021.304

• Machine Learning & Data Mining •

Named entity recognition model based on dilated convolutional block architecture

Yue YUAN1, Yanli WANG2, Kan LIU2,*

  1. Department of Information Management, Peking University, Beijing 100871, China
    2. School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan 430073, Hubei, China
  • Received: 2021-06-09 Online: 2022-12-20 Published: 2022-12-23
  • Contact: Kan LIU E-mail: yuangyue@qq.com; liukan@zuel.edu.cn

Abstract:

Inspired by dilated convolution, a column-wise dilated convolution over two-dimensional text embeddings was proposed and a dilated convolutional block architecture was designed. A named entity recognition model based on this architecture was built for the experiments. In the named entity recognition experiment, the model surpassed the baseline models in precision, recall, and F1 score, reaching 0.918 7, 0.879 4, and 0.898 6, respectively, indicating that the dilated convolutional block architecture extracted features from contextual information and thereby supported the capture of long-term dependencies. The receptive field experiment showed that the dilation rate and the convolution kernel size must be adjusted jointly to reduce the "gridding effect". The proposed dilated convolutional block architecture could effectively perform the task of named entity recognition.
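The column-wise dilated convolution described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the input is assumed to be a two-dimensional text embedding (sentence length × embedding dimension), and the dilation rate is applied only along the token (row) axis while kernel columns stay dense.

```python
def column_wise_dilated_conv(X, K, r):
    """Column-wise dilated convolution over a 2-D text embedding.

    X: list of rows (sentence_length x embed_dim)
    K: 2-D kernel (kh x kw)
    r: dilation rate, applied only along the token (row) axis;
       kernel columns remain dense.
    Illustrative sketch -- shapes and naming are assumptions.
    """
    kh, kw = len(K), len(K[0])
    eff_h = r * (kh - 1) + 1              # effective kernel height after dilation
    out_h = len(X) - eff_h + 1
    out_w = len(X[0]) - kw + 1
    Y = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            s = 0.0
            for a in range(kh):           # kernel rows sampled every r tokens
                for b in range(kw):       # kernel columns sampled densely
                    s += X[i + a * r][j + b] * K[a][b]
            Y[i][j] = s
    return Y
```

With r = 1 this reduces to a standard convolution; larger r lets the same kernel span tokens farther apart without adding parameters, which is the mechanism behind the long-term dependency claim above.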

Key words: named entity recognition, dilated convolutional block architecture, receptive field, neural network, deep learning

CLC Number: TP391

Fig.1

Examples of receptive field expansion when increasing kernel size S or dilation rate r

Fig.2

Comparison of receptive fields between stacked standard convolutional layers and stacked dilated convolutional layers
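The contrast in Fig.2 can be checked numerically: for stride-1 layers, each layer l adds r_l(k − 1) to the receptive field, so stacked standard convolutions grow it linearly while exponentially increasing dilation rates grow it geometrically. A minimal sketch, with an assumed kernel size of 3 (the kernel size and dilation schedule are illustrative, not taken from the paper):

```python
def stacked_receptive_field(kernel_size, dilation_rates):
    """Receptive field of a stack of stride-1 convolutional layers:
    layer l adds r_l * (k - 1) on top of the previous field."""
    rf = 1
    for r in dilation_rates:
        rf += r * (kernel_size - 1)
    return rf

# Three standard layers (all r = 1) grow the field linearly,
# while doubling dilation rates grow it much faster per layer.
standard = stacked_receptive_field(3, [1, 1, 1])   # 7
dilated = stacked_receptive_field(3, [1, 2, 4])    # 15
```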

Fig.3

Structure of dilated convolutional block

Fig.4

Two-dimensional text embedding and dilated convolution on an image

Fig.5

Column-wise dilated convolution

Fig.6

Dilated convolutional block architecture

Fig.7

Named entity recognition model based on dilated convolutional block architecture

Fig.8

Named entity recognition model based on dilated convolutional layer

Table 1

Statistics of dataset (counts)

Number of news sentences                       50 729
Average sentence length (Chinese characters)   47
Longest sentence length (Chinese characters)   1 476
Shortest sentence length (Chinese characters)  5
Number of person tags                          19 588
Number of location tags                        39 394
Number of organization tags                    21 902

Table 2

Experiment results of named entity recognition (Location)

CRF layer    Model                 P        R        F1
Without CRF  Bi-LSTM-Softmax       0.750 2  0.715 0  0.732 2
             IDCNN-Softmax         0.769 9  0.770 9  0.770 4
             DCL-Bi-LSTM-Softmax   0.792 9  0.758 4  0.775 3
             DCBA-Bi-LSTM-Softmax  0.797 6  0.824 8  0.811 0
With CRF     Bi-LSTM-CRF           0.869 9  0.808 5  0.838 0
             IDCNN-CRF             0.900 9  0.824 8  0.861 2
             DCL-Bi-LSTM-CRF       0.906 1  0.861 7  0.883 3
             DCBA-Bi-LSTM-CRF      0.918 7  0.879 4  0.898 6

Table 3

Experiment results of named entity recognition (Person)

CRF layer    Model                 P        R        F1
Without CRF  Bi-LSTM-Softmax       0.764 5  0.770 7  0.767 6
             IDCNN-Softmax         0.797 8  0.783 8  0.790 7
             DCL-Bi-LSTM-Softmax   0.816 9  0.816 5  0.816 7
             DCBA-Bi-LSTM-Softmax  0.823 0  0.796 9  0.809 7
With CRF     Bi-LSTM-CRF           0.846 0  0.817 0  0.831 3
             IDCNN-CRF             0.849 7  0.815 0  0.832 0
             DCL-Bi-LSTM-CRF       0.897 4  0.837 7  0.866 5
             DCBA-Bi-LSTM-CRF      0.889 1  0.824 1  0.855 3

Table 4

Experiment results of named entity recognition (Organization)

CRF layer    Model                 P        R        F1
Without CRF  Bi-LSTM-Softmax       0.543 3  0.622 1  0.580 0
             IDCNN-Softmax         0.631 8  0.676 9  0.653 6
             DCL-Bi-LSTM-Softmax   0.633 8  0.652 9  0.643 2
             DCBA-Bi-LSTM-Softmax  0.681 1  0.733 3  0.706 2
With CRF     Bi-LSTM-CRF           0.755 6  0.731 8  0.743 5
             IDCNN-CRF             0.774 9  0.770 8  0.772 9
             DCL-Bi-LSTM-CRF       0.835 4  0.777 6  0.805 4
             DCBA-Bi-LSTM-CRF      0.835 3  0.815 2  0.825 1

Table 5

Experiment results of named entity recognition (Total)

CRF layer    Model                 P        R        F1
Without CRF  Bi-LSTM-Softmax       0.704 4  0.712 9  0.708 6
             IDCNN-Softmax         0.747 1  0.754 8  0.751 0
             DCL-Bi-LSTM-Softmax   0.765 0  0.754 4  0.759 6
             DCBA-Bi-LSTM-Softmax  0.779 0  0.796 2  0.787 5
With CRF     Bi-LSTM-CRF           0.837 0  0.794 7  0.815 3
             IDCNN-CRF             0.855 8  0.810 1  0.832 3
             DCL-Bi-LSTM-CRF       0.888 3  0.835 9  0.861 3
             DCBA-Bi-LSTM-CRF      0.891 0  0.847 9  0.868 9

Fig.9

Learning curves for models using CRF

Fig.10

Learning curves for models using Softmax

Table 6

Settings and results for receptive field experiments (Setting 1)

Model                 r  S (height×width×number)  Receptive field  P        R        F1
DCL-Bi-LSTM-Softmax   1  10×50×1                  10               0.642 8  0.646 2  0.644 5
DCBA-Bi-LSTM-Softmax  1  10×50×1                  10               0.689 0  0.721 7  0.705 0

Table 7

Settings and results for receptive field experiments (Setting 2)

Model                 r  S (height×width×number)  Receptive field  P        R        F1
DCL-Bi-LSTM-Softmax   2  10×50×1                  19               0.658 0  0.646 5  0.652 2
DCBA-Bi-LSTM-Softmax  2  10×50×1                  19               0.723 4  0.744 2  0.733 6

Table 8

Settings and results for receptive field experiments (Setting 3)

Model                 r  S (height×width×number)  Receptive field  P        R        F1
DCL-Bi-LSTM-Softmax   4  10×50×1                  37               0.673 2  0.657 5  0.665 3
DCBA-Bi-LSTM-Softmax  4  10×50×1                  37               0.754 1  0.756 1  0.755 1

Table 9

Settings and results for receptive field experiments (Setting 4)

Model                 r  S (height×width×number)  Receptive field  P        R        F1
DCL-Bi-LSTM-Softmax   8  10×50×1                  73               0.650 2  0.643 7  0.647 0
DCBA-Bi-LSTM-Softmax  8  10×50×1                  73               0.753 2  0.751 6  0.752 4
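The receptive-field column in the four settings above is consistent with the standard single-layer relation RF = r(S − 1) + 1, where S is the kernel height along the dilated axis (here 10) and r is the dilation rate. A short check (an illustrative sketch, not code from the paper):

```python
def receptive_field(kernel_height, dilation_rate):
    """Receptive field of one dilated convolutional layer along the
    dilated axis: RF = r * (S - 1) + 1."""
    return dilation_rate * (kernel_height - 1) + 1

# The four settings use kernel height 10 and r in {1, 2, 4, 8};
# the results match the tabulated receptive fields.
for r, rf in [(1, 10), (2, 19), (4, 37), (8, 73)]:
    assert receptive_field(10, r) == rf
```

Because a dilated kernel samples only every r-th token, a large r leaves gaps inside the receptive field; this is the "gridding effect" the abstract notes must be controlled by adjusting r and S jointly.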