Journal of Shandong University (Engineering Science), 2021, Vol. 51, Issue (2): 115-121. doi: 10.6040/j.issn.1672-3961.0.2020.348

Script identification of Central Asian document images based on LTP and HOG texture feature fusion

WU Zhengjian1, MUTALLIP Mamut2, HORNISA Mamat1, ALIM Aysa1,3, KURBAN Ubul1,3*   

  1. School of Information Science and Engineering, Xinjiang University, Urumqi 830046, Xinjiang, China;
  2. The Library, Xinjiang University, Urumqi 830046, Xinjiang, China;
  3. The Key Lab. of Xinjiang Multilingual Information Technology, Urumqi 830046, Xinjiang, China
  • Published: 2021-04-16

Abstract: Because several highly similar scripts are in use in Central Asia, a document image script identification method was proposed that is based on the cross-fusion of rotation-invariant uniform local ternary pattern (riu2-LTP) features and histogram of oriented gradients (HOG) features. An SVM classifier was used to perform experiments on a database containing a total of 10 000 images in 10 scripts. To improve multi-script identification, the SVM hyperparameters were tuned by Bayesian optimization. The method first extracted riu2-LTP features from the document images with a radius of [...] and 8 sampling points; HOG features were then extracted from the same database; finally, the cross-fusion step incorporated the 20-dimensional riu2-LTP features and the 36-dimensional HOG features sequentially into a new feature set. The experiments showed that the average recognition rate of this method reached 99%, which was better than that of the LTP, riu2-LTP, or HOG method used alone.
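The feature dimensions quoted in the abstract can be illustrated with a minimal sketch. With 8 sampling points, the riu2 mapping yields 10 labels per binary pattern, and LTP splits each ternary code into an upper and a lower binary pattern, giving 2 × 10 = 20 bins. Everything below is an assumption made for illustration, not the authors' implementation: the radius of 1 (the abstract does not state it), the threshold of 5, the function names, and the reading of "cross-fusion" as plain concatenation. The HOG vector is a placeholder rather than a computed descriptor.

```python
# Offsets of the 8 neighbours at an ASSUMED radius of 1, in circular order.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def riu2_label(bits):
    """Map a circular binary pattern to its rotation-invariant uniform label."""
    p = len(bits)
    # Count 0/1 transitions around the circle; uniform patterns have at most 2.
    transitions = sum(bits[i] != bits[(i + 1) % p] for i in range(p))
    return sum(bits) if transitions <= 2 else p + 1


def riu2_ltp_histogram(image, threshold=5):
    """Return a 20-bin riu2-LTP histogram for a 2-D grey-level image.

    `image` is a list of rows of integers; `threshold` is a hypothetical value.
    The first 10 bins describe the upper pattern, the last 10 the lower one.
    """
    bins = len(NEIGHBOURS) + 2  # labels 0..8 plus one non-uniform label
    upper = [0] * bins
    lower = [0] * bins
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            centre = image[r][c]
            values = [image[r + dr][c + dc] for dr, dc in NEIGHBOURS]
            # Ternary code +1 feeds the upper pattern, -1 the lower pattern.
            up_bits = [int(v >= centre + threshold) for v in values]
            low_bits = [int(v <= centre - threshold) for v in values]
            upper[riu2_label(up_bits)] += 1
            lower[riu2_label(low_bits)] += 1
    total = sum(upper) or 1  # number of interior pixels (same for both halves)
    return [n / total for n in upper + lower]


def fuse(ltp_features, hog_features):
    """Serial fusion (assumed): append the HOG vector to the riu2-LTP vector."""
    return list(ltp_features) + list(hog_features)


# Synthetic 16 x 16 image and a placeholder 36-dimensional HOG vector.
image = [[(r * 7 + c * 13) % 256 for c in range(16)] for r in range(16)]
ltp = riu2_ltp_histogram(image)
hog = [0.0] * 36
fused = fuse(ltp, hog)
print(len(ltp), len(fused))  # 20 56
```

The fused 56-dimensional vector is what would be passed to the classifier under this reading; a different fusion order or weighting would change only the `fuse` step.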

Key words: LTP, HOG, feature fusion, Bayesian optimization, script identification

CLC Number: TP391