
Journal of Shandong University (Engineering Science), 2021, Vol. 51, Issue (2): 115-121. doi: 10.6040/j.issn.1672-3961.0.2020.348


Script identification of Central Asian document images based on LTP and HOG texture feature fusion

WU Zhengjian1, MUTALLIP Mamut2, HORNISA Mamat1, ALIM Aysa1,3, KURBAN Ubul1,3*

  1. School of Information Science and Engineering, Xinjiang University, Urumqi 830046, Xinjiang, China;
  2. The Library, Xinjiang University, Urumqi 830046, Xinjiang, China;
  3. The Key Lab. of Xinjiang Multilingual Information Technology, Urumqi 830046, Xinjiang, China
  • Published: 2021-04-16
  • About the authors: WU Zhengjian (吴正健, born 1995), male, from Wuhu, Anhui, master's student; research interests: pattern recognition and computer vision. E-mail: wzj199538@gmail.com. *Corresponding author: KURBAN Ubul (库尔班·吾布力, born 1974), male, Uyghur, from Bachu, Xinjiang, Ph.D., professor; research interests: digital image processing and pattern recognition. E-mail: Kurban@xju.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61862061, 6161563052, 61363064); Xinjiang University Doctoral Research Start-up Fund (BS180268); Xinjiang Uygur Autonomous Region University Scientific Research Program, Innovation Team Fund (XJEDU2017T002)

Abstract: Several scripts used in Central Asia are highly similar to one another. To distinguish them, a document image script identification method is proposed based on the cross-fusion of rotation invariant uniform local ternary pattern (riu2-LTP) features and histogram of oriented gradients (HOG) features. An SVM classifier was evaluated on a database of 10 000 images covering 10 scripts, and its hyperparameters were tuned with Bayesian optimization to improve multi-script identification. The method first extracts riu2-LTP features with a radius of 1 and 8 sampling points from the document images, then extracts HOG features from the same database, and finally cross-fuses the 20-dimensional riu2-LTP features and the 36-dimensional HOG features by adding them in turn to a new feature set. Experiments show that the method reaches an average precision of 99%, outperforming LTP, riu2-LTP, and HOG used alone.

Key words: LTP, HOG, feature fusion, Bayesian optimization, script identification
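
The feature dimensions quoted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the 8 neighbours are read from the 3×3 square around each pixel rather than interpolated on a circle of radius 1; the ternary threshold `t` is a hypothetical value; and "cross-fusion" is read as element-wise interleaving, which is only one possible interpretation because the abstract does not give the exact ordering. With P = 8 sampling points, each of the two binary halves of an LTP code has P + 2 = 10 riu2 labels, which matches the stated 20 dimensions. The HOG vector here is a zero-filled placeholder of length 36.

```python
# Offsets of the 8 neighbours at radius 1, in circular order
# (assumption: 3x3 square neighbourhood, no circular interpolation).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
P = len(NEIGHBOURS)


def riu2_label(bits):
    """Map a circular binary pattern to its riu2 label in 0..P+1."""
    transitions = sum(bits[i] != bits[(i + 1) % P] for i in range(P))
    # Uniform patterns (at most 2 transitions) are labelled by their number
    # of ones; every non-uniform pattern shares the single label P + 1.
    return sum(bits) if transitions <= 2 else P + 1


def riu2_ltp_histogram(image, t=5):
    """Return a 20-bin riu2-LTP histogram of a 2-D grey image.

    `image` is a list of equal-length rows of integers. `t` is the ternary
    threshold (hypothetical default, not taken from the paper).
    """
    upper = [0] * (P + 2)  # histogram of the "+1" half of the ternary code
    lower = [0] * (P + 2)  # histogram of the "-1" half of the ternary code
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            centre = image[r][c]
            neigh = [image[r + dr][c + dc] for dr, dc in NEIGHBOURS]
            upper[riu2_label([int(v >= centre + t) for v in neigh])] += 1
            lower[riu2_label([int(v <= centre - t) for v in neigh])] += 1
    total = max(1, (rows - 2) * (cols - 2))
    # Each half is normalised by the number of interior pixels.
    return [count / total for count in upper + lower]


def cross_fuse(first, second):
    """Interleave two feature vectors; append the tail of the longer one."""
    fused = []
    for a, b in zip(first, second):
        fused.extend((a, b))
    shorter = min(len(first), len(second))
    fused.extend(first[shorter:])
    fused.extend(second[shorter:])
    return fused


# Synthetic 16x16 test image, used only to exercise the functions.
image = [[(r * 7 + c * 13) % 256 for c in range(16)] for r in range(16)]
ltp = riu2_ltp_histogram(image)
hog = [0.0] * 36  # placeholder for a 36-dimensional HOG descriptor
fused = cross_fuse(ltp, hog)
print(len(ltp), len(fused))  # prints: 20 56
```

In a full pipeline, the 56-dimensional fused vector would be passed to the SVM classifier. Simple concatenation would give the same dimensionality, so the interleaving above changes only the order of the components.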

CLC number: TP391