JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE) ›› 2018, Vol. 48 ›› Issue (3): 17-24. doi: 10.6040/j.issn.1672-3961.0.2017.411

Item embedding classification method for E-commerce

LONG Bai1, ZENG Xianyu1, LI Zhi1,2, LIU Qi1*   

  1. Anhui Province Key Laboratory of Big Data Analysis and Application, School of Computer Science and Technology, University of Science and Technology of China, Hefei 230000, Anhui, China;
    2. School of Software Engineering, University of Science and Technology of China, Hefei 230000, Anhui, China
  • Received: 2017-05-17 Online: 2018-06-20 Published: 2017-05-17

Abstract: Inspired by the word embedding model word2vec, which has proved highly successful in natural language processing in recent years, two item embedding models, item2vec and w-item2vec, were proposed. By modeling users' behavior sequences, both models projected items to distributed representations in a vector space. The item vectors captured properties of the items and could be used to measure the relations between items. Exploiting this property, products could be categorized effectively and efficiently. Experiments were conducted on a real-world dataset, and w-item2vec achieved an accuracy of nearly 50% for item categorization using only 10% of the items for training. The two proposed models clearly outperformed the other methods compared.

Key words: item categorization, item embedding, behavior modeling, word embedding, E-commerce

CLC Number: 

  • TP391
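
The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It only shows how user behavior sequences might be turned into skip-gram style (center, context) training pairs, as word2vec does for sentences, and how item vectors, once learned, could be used to categorize an unlabelled item by cosine similarity to a few labelled items. The function names, the window size, the sample sessions, the two-dimensional vectors, and the nearest-labelled-item rule are all assumptions made for illustration; the abstract does not specify the classifier or the weighting used by w-item2vec.

```python
import math


def skipgram_pairs(sequences, window=2):
    """Collect (center, context) item pairs from each behavior sequence."""
    pairs = []
    for seq in sequences:
        for i, center in enumerate(seq):
            lo = max(0, i - window)
            hi = min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((center, seq[j]))
    return pairs


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def categorize(item_vec, labelled):
    """Return the category of the most similar labelled item (assumed rule)."""
    best = max(labelled, key=lambda name: cosine(item_vec, labelled[name][0]))
    return labelled[best][1]


# Hypothetical user sessions: each list is one behavior sequence of item IDs.
sessions = [["phone", "case", "charger"], ["shirt", "jeans"]]
pairs = skipgram_pairs(sessions, window=1)

# Hypothetical item vectors standing in for embeddings learned from the pairs.
vectors = {
    "phone": [0.9, 0.1],
    "case": [0.8, 0.2],
    "shirt": [0.1, 0.9],
    "jeans": [0.2, 0.8],
}

# Only a small labelled subset is used, mirroring the low-label setting.
labelled = {
    "phone": (vectors["phone"], "electronics"),
    "shirt": (vectors["shirt"], "clothing"),
}

print(categorize(vectors["case"], labelled))
print(categorize(vectors["jeans"], labelled))
```

In practice the pairs would be fed to a skip-gram trainer to obtain the vectors, and a stronger classifier could replace the nearest-labelled-item rule used here.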