Table of Contents

    20 April 2020
    Volume 50 Issue 2
    Machine Learning & Data Mining
    Fake comment detection based on heterogeneous ensemble learning
    Dapeng ZHANG,Yajun LIU,Wei ZHANG,Fen SHEN,Jiansheng YANG
    Journal of Shandong University(Engineering Science). 2020, 50(2):  1-9.  doi:10.6040/j.issn.1672-3961.0.2019.402

    To address the small data sets and inaccurate labeling that hamper fake comment detection, and thereby to curb vicious competition among sellers, ensure fair trading on e-commerce platforms, and protect consumers' rights, this study used the latest fake comment data set released by Amazon and improved the related algorithms. Because the Word2vec model could not recognize word pairs in English, a Bigram-Word2vec model was proposed. "Two-class weighted hard voting" was proposed to handle the case in heterogeneous ensemble learning where the classifiers' votes were tied. "Weighted soft voting" was studied to determine how to set the classifier weights in heterogeneous ensemble learning. The experimental results showed that the improved algorithms achieved better results.
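
The two voting rules named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the example weights are assumptions made here for illustration, since the abstract does not state how the classifier weights were set.

```python
def weighted_hard_vote(labels, weights):
    """Return the label whose supporting classifiers carry the most total weight.

    With unweighted voting, an even number of two-class classifiers can tie;
    giving each classifier its own weight makes a tie far less likely.
    """
    totals = {}
    for label, weight in zip(labels, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)


def weighted_soft_vote(probas, weights):
    """Average per-class probabilities using classifier weights.

    probas[i][c] is classifier i's predicted probability for class c.
    Returns (predicted_class_index, weighted_average_probabilities).
    """
    total_weight = sum(weights)
    n_classes = len(probas[0])
    averaged = [
        sum(w * p[c] for p, w in zip(probas, weights)) / total_weight
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=lambda c: averaged[c])
    return best, averaged


# Hypothetical example: four classifiers split 2-2 between label 1 and label 0.
hard = weighted_hard_vote([1, 0, 1, 0], [0.90, 0.80, 0.70, 0.85])
# Hypothetical example: two classifiers, the second trusted three times as much.
soft, avg = weighted_soft_vote([[0.6, 0.4], [0.2, 0.8]], [1.0, 3.0])
```

In the hard-voting example the unweighted count is tied, and the weights decide the outcome; in the soft-voting example the more heavily weighted classifier dominates the averaged probabilities.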

    Method for super-resolution using parallel interlaced sampling
    An ZHU,Chu XU
    Journal of Shandong University(Engineering Science). 2020, 50(2):  10-16,26.  doi:10.6040/j.issn.1672-3961.0.2019.318

    Internet-based image applications and artificial intelligence applications were increasingly sensitive to the quality of image data, yet image quality had been seriously degraded by the limitations of earlier acquisition equipment and transmission methods. To compensate for this loss of quality and enhance the images, a parallel interlaced up and down sampling network (PSUDN) was proposed. The network interleaved the sampling of parallel high resolution features (HR Feature) and low resolution features (LR Feature) to generate advanced feature maps, and improved the quality of the output high-resolution pictures by building parallel high resolution feature modules and low resolution feature modules. The model constructed by parallel upsampling and downsampling could reconstruct high resolution pictures at a scale factor of 8 and achieved better results.

    Visual tracking algorithm based on verifying networks
    Ningning CHEN,Jianwei ZHAO,Zhenghua ZHOU
    Journal of Shandong University(Engineering Science). 2020, 50(2):  17-26.  doi:10.6040/j.issn.1672-3961.0.2019.418

    Existing deep learning based visual tracking algorithms paid attention to deep features but neglected shallow features, and their tracking networks did not evaluate the tracking results. To solve these problems, a visual tracking algorithm based on a verifying network was proposed. The proposed algorithm consisted of a tracking network and a verifying network. In the tracking network, deep features and shallow edge features were fused, and a multi-input residual network was designed to learn the relationship between the target and its corresponding Gaussian response map to obtain the position of the target. In the verifying network, a shallow chain discriminative network was designed; the tracking results of the tracking network and the verifying network were compared, and the tracking network was updated according to the comparison. The proposed algorithm therefore took the deep features into account while avoiding the loss of detail information. Furthermore, evaluating the tracking results prevented erroneous information from propagating through the updates. The experimental results illustrated that the proposed tracking algorithm achieved better tracking results than some other existing tracking methods.

    Vehicle classification and tracking for complex scenes based on improved YOLOv3
    Shiqi SONG,Yan PIAO,Zexin JIANG
    Journal of Shandong University(Engineering Science). 2020, 50(2):  27-33.  doi:10.6040/j.issn.1672-3961.0.2019.412

    To address the effect of weather conditions and mutual vehicle occlusion on the accuracy and stability of vehicle classification and tracking, a hybrid model based on improved YOLOv3 and matching tracking was proposed. The improved YOLOv3 network drew on the design of DenseNet, replacing the residual layers in the network with dense convolution blocks and changing the network structure. The fused features of the dense convolution blocks and convolution layers were classified with a Softmax classifier. Based on the detection results for single frames, a target matching function was designed to solve the vehicle tracking problem in video sequences. On the KITTI dataset, the improved algorithm achieved an average precision of 93.01% at 48.98 frames per second, and the average recognition rate on the self-built dataset was 95.79%. The experimental results showed that the proposed method could effectively distinguish vehicle types in complex scenes with higher accuracy, and that it also had higher accuracy and robustness in vehicle tracking.

    Improved bird swarm algorithms based on mixed decision making
    Wei YAN,Damin ZHANG,Huijuan ZHANG,Ziyun XI,Zhongyun CHEN
    Journal of Shandong University(Engineering Science). 2020, 50(2):  34-43.  doi:10.6040/j.issn.1672-3961.0.2019.294

    To address the low precision of the traditional bird swarm algorithm (BSA) and its tendency to fall into local optima when solving complex function problems, an improved bird swarm algorithm based on mixed decision-making was proposed that retained the simplicity of BSA. Centroid opposition-based learning was used to initialize the bird population and maintain a better spatial distribution of solutions across the flock. To balance the global search ability and local search ability of the algorithm during optimization, the period at which birds flew to another area was dynamically adjusted. An adaptive cosine weighting strategy and a weighted averaging idea were introduced to improve the producer's foraging formula, increasing the algorithm's ability to escape after falling into a local optimum. The improved algorithm, the original bird swarm algorithm and particle swarm optimization were compared on nine test functions. The results showed that the accuracy and speed of the improved algorithm were greatly improved on both single-peak and multi-peak functions.

    A syntactic element recognition method based on deep neural network
    Yanping CHEN,Li FENG,Yongbin QIN,Ruizhang HUANG
    Journal of Shandong University(Engineering Science). 2020, 50(2):  44-49.  doi:10.6040/j.issn.1672-3961.0.2019.313

    Traditional feature-based methods had difficulty obtaining structural information from Chinese sentences. To solve this problem, a Bi-LSTM-Attention-CRF model based on deep neural networks was proposed according to the characteristics of Chinese sentences. A Bi-LSTM network was used to automatically extract structural and semantic information from raw input sentences. An attention mechanism was adopted to weight abstract semantic features for classification. An optimized label sequence was output through the CRF layer. Compared with other methods, the model could effectively identify syntactic elements in sentences, reaching an F1 score of 84.85% on the evaluation data sets.

    Identification of the same product feature based on multi-dimension similarity and sentiment word expansion
    Longmao HU,Xuegang HU
    Journal of Shandong University(Engineering Science). 2020, 50(2):  50-59.  doi:10.6040/j.issn.1672-3961.0.2019.403

    Because existing methods for identifying the same product features were limited by insufficient dictionary coverage or corpus size, an identification method based on multidimensional similarity and sentiment word expansion was proposed. Sentiment words associated with product features were extracted with a bi-directional long short-term memory network and conditional random field (Bi-LSTM-CRF). The morpheme similarity, Cilin similarity and term frequency-inverse document frequency (TF-IDF) cosine similarity of product feature words were then combined, and the same product features were identified by the K-medoids clustering algorithm. The experimental results showed that, on the mobile and notebook datasets, the maximum adjusted Rand index (ARI) reached 0.579 and 0.5959 respectively, while the minimum entropy reached 0.7826 and 0.7457. The proposed method outperformed baselines using adjusted Jaccard similarity combined with morphemes, Word2Vec similarity, and Word2Vec similarity with bisecting K-means.

    LDA-based topic feature representation method for symbolic sequences
    Chao FENG,Kunpeng XU,Lifei CHEN
    Journal of Shandong University(Engineering Science). 2020, 50(2):  60-65.  doi:10.6040/j.issn.1672-3961.0.2019.760

    To address the high feature dimensionality and high time complexity of existing methods, a topic feature representation method was proposed to transform symbolic sequences into a set of topic probability vectors, based on latent Dirichlet allocation (LDA), a topic model commonly used in text mining. In the new method, each short sequence gram was considered a shallow feature (word) of the sequence, and the topics with their probability distributions were extracted as the deep features of the sequences using the LDA model learning algorithm. Experiments were carried out on six real-world sequence sets, and the method was compared with existing gram-based and Markov model-based methods. The results showed that the new method improved the learning efficiency of the representation model while reducing the feature dimensionality, and achieved better accuracy in symbolic sequence classification.

    Entity recommendation based on normalized similarity measure of meta graph in heterogeneous information network
    Wenkai ZHANG,Ke YU,Xiaofei WU
    Journal of Shandong University(Engineering Science). 2020, 50(2):  66-75.  doi:10.6040/j.issn.1672-3961.0.2019.304

    Building on the promising results of meta graphs in heterogeneous information networks (HIN), a normalized similarity measure of meta graph (NSMG) was proposed, which combined an implicit feedback matrix with PathSim (meta path-based similarity) to solve the problem of preference for large-degree entities. Yelp-HIN (a heterogeneous information network built from Yelp) and Amazon-HIN (a heterogeneous information network built from Amazon) were constructed from the Yelp and Amazon datasets. Different types of meta graphs and normalized similarity measures were defined. Matrix decomposition and a factorization machine were used to combine the similarities on different meta graphs. The experimental results showed that the proposed method based on the normalized similarity measure of meta graphs performed better than the commonly used entity recommendation methods in HIN on very sparse data sets.
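
The normalization that PathSim applies, which is what counters the preference for large-degree entities, can be sketched in a few lines of Python. This shows only the standard PathSim measure over a symmetric meta path, not the NSMG measure itself, whose definition the abstract does not give; the small adjacency matrix and the function names are hypothetical.

```python
def commuting_matrix(adj):
    """Count path instances between entities along a symmetric meta path.

    adj[i][k] is 1 if entity i is linked to intermediate node k
    (for example, a user who reviewed a business). The result M = adj * adj^T
    gives M[i][j] = number of intermediate nodes shared by entities i and j.
    """
    n = len(adj)
    return [
        [sum(a * b for a, b in zip(adj[i], adj[j])) for j in range(n)]
        for i in range(n)
    ]


def pathsim(m, x, y):
    """PathSim: 2 * M[x][y] / (M[x][x] + M[y][y]).

    Dividing by the self-path counts keeps highly connected entities
    from dominating the similarity ranking.
    """
    denom = m[x][x] + m[y][y]
    return 2.0 * m[x][y] / denom if denom else 0.0


# Hypothetical toy data: 3 entities linked to 3 intermediate nodes.
adjacency = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
]
M = commuting_matrix(adjacency)
sim_01 = pathsim(M, 0, 1)
sim_12 = pathsim(M, 1, 2)
```

An entity always has similarity 1 with itself, and a pair that shares no intermediate node has similarity 0, regardless of how many links either entity has.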

    Prediction of microRNA-binding residues based on Laplacian support vector machine and sequence information
    Xin MA,Xue WANG
    Journal of Shandong University(Engineering Science). 2020, 50(2):  76-82.  doi:10.6040/j.issn.1672-3961.0.2019.292

    A new semi-supervised learning method was proposed to predict miRNA-binding residues in protein sequences. The Laplacian support vector machine (LapSVM) algorithm was combined with newly proposed hybrid features to build a prediction model. The hybrid features were obtained from a combination of secondary structure information, HKM features, and a newly proposed feature combining amino acid physicochemical properties and evolutionary information. Performance comparison of the various features indicated that the novel feature contributed the most to prediction improvement. With feature selection, the LapSVM model achieved an accuracy of 88.72%, a sensitivity of 54.18% and a specificity of 91.15%. The LapSVM model significantly outperformed other approaches at miRNA-binding site prediction.

    Image denoising based on 3D shearlet transform and BM4D
    Shengnan ZHANG,Lei WANG,Chunhong CHANG,Benli HAO
    Journal of Shandong University(Engineering Science). 2020, 50(2):  83-90.  doi:10.6040/j.issn.1672-3961.0.2019.262

    To overcome the limitation that the traditional block matching denoising method could only deal with two-dimensional images, an image denoising method based on the 3D shearlet transform and BM4D (block-matching and 4D filtering) was proposed. This method used the 3D shearlet transform to obtain transform domain coefficients, and realized joint filtering in the transform domain through a hard threshold stage and a Wiener filtering stage. The 3D shearlet transform was localized through two filtering stages: multi-scale decomposition and directional decomposition. The hard threshold and Wiener filtering stages each included grouping, collaborative filtering and aggregation. The 4D transformation of the cubes was based on the local correlation and non-local correlation of the cubes. The estimated values of each grouped cube were obtained by the inverse 3D shearlet transform, and self-adaptive aggregation was performed at their original positions. PSNR (peak signal to noise ratio) and SSIM (structural similarity) were used as evaluation criteria. The results showed that this method could effectively remove image noise in high noise environments and effectively improved the visual quality of the image with high accuracy.
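
As an illustration of the PSNR criterion mentioned in this abstract, the following Python sketch computes PSNR from the mean squared error between a reference and a denoised signal. It assumes the volume has been flattened to a list of intensities and that the peak value is 255 (8-bit data); both are assumptions made here, and SSIM is not shown.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical toy data: one of four voxels is maximally wrong.
value = psnr([0, 0, 0, 0], [0, 0, 0, 255])
```

A higher PSNR indicates that the denoised result is closer to the reference.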

    Air quality prediction approach based on integrating forecasting dataset
    Minghe GAO,Ying ZHANG,Rongrong ZHANG,Zihao HUANG,Linyan HUANG,Fanyu LI,Xin ZHANG,Yanhao WANG
    Journal of Shandong University(Engineering Science). 2020, 50(2):  91-99.  doi:10.6040/j.issn.1672-3961.0.2019.404

    To address the air quality prediction problem, a predictive feature-based air quality prediction approach was designed using LightGBM, which could effectively predict the PM2.5 concentration, the key indicator of air quality, over the upcoming 24 hours in Beijing. In constructing the prediction solution, the features of the training data set were analyzed to carry out data cleansing, and random forest and linear interpolation methods were used to handle the high rate of missing data and noise interference. The forecast data features were integrated into the dataset, and corresponding statistical features were designed to improve the prediction accuracy. A sliding window mechanism was used to mine high-dimensional time features and increase the number of data features. The performance and results of the proposed approach were analyzed in detail through comparison with baseline models. The experimental results showed that, compared with other models, the proposed LightGBM-based prediction approach with integrated forecasting data had higher prediction accuracy.
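
The sliding window mechanism described in this abstract can be sketched roughly as follows in Python. The window length, the chosen statistics and the next-step target are assumptions for illustration only; the abstract does not specify the features the authors actually built, and their target was a 24-hour forecast.

```python
def sliding_window_features(series, window):
    """Turn a time series into (features, target) pairs.

    For each position, the previous `window` readings become lag features,
    summary statistics over them are added, and the next reading is the target.
    """
    samples = []
    for end in range(window, len(series)):
        past = series[end - window:end]
        features = {
            "lags": list(past),
            "mean": sum(past) / window,
            "min": min(past),
            "max": max(past),
        }
        samples.append((features, series[end]))
    return samples


# Hypothetical hourly PM2.5 readings.
readings = [10, 20, 30, 40, 50]
samples = sliding_window_features(readings, window=3)
```

Each extra statistic or lag adds one more column to the training table, which is how a single concentration series can yield many features for a gradient boosting model.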

    Fire detection based on lightweight convolutional neural network
    Yunyang YAN,Chenxi DU,Yian LIU,Shangbing GAO
    Journal of Shandong University(Engineering Science). 2020, 50(2):  100-107.  doi:10.6040/j.issn.1672-3961.0.2019.424

    A novel lightweight flame detection method based on MobileNet was proposed. The video flame detection rate was improved by a dilated convolution block (DCB) module, built on depthwise separable convolution and dilated convolution, which expanded the feature receptive field to strengthen the semantic information of the features. The SSD (single shot multibox detector) detection framework was also optimized, yielding the lightweight detection model DMSSD (Dilated MobileNet-SSD). Experiments showed that the mean average precision increased by 1.7% and 3.8% on the PASCAL VOC dataset and the VisiFire dataset of Bilkent University, respectively. Furthermore, the detection speed reached 80 frames per second. DMSSD showed strong robustness and real-time performance.
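
Two pieces of arithmetic explain why combining dilated and depthwise separable convolution suits a lightweight detector: dilation widens the receptive field without adding weights, and the depthwise separable factorization cuts the parameter count. The Python sketch below works through both with hypothetical layer sizes; the actual DCB configuration is not given in the abstract, and bias terms are ignored.

```python
def effective_kernel_size(kernel, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise convolution plus a 1x1 pointwise convolution."""
    return kernel * kernel * c_in + c_in * c_out


# Hypothetical layer: 3x3 kernel, dilation 2, 32 input and 64 output channels.
span = effective_kernel_size(3, 2)
full = standard_conv_params(3, 32, 64)
light = depthwise_separable_params(3, 32, 64)
```

For this hypothetical layer the separable form needs roughly one eighth of the weights of the standard convolution, while the dilated 3x3 kernel covers the same span as a 5x5 kernel.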

    Abnormal sound detection of washing machines based on deep learning
    Chunyang LI,Nan LI,Tao FENG,Zhuhe WANG,Jingkai MA
    Journal of Shandong University(Engineering Science). 2020, 50(2):  108-117.  doi:10.6040/j.issn.1672-3961.0.2019.419

    Based on the convolutional neural network (CNN) framework, a model for recognizing abnormal sounds of washing machines was proposed. Exploiting the strong feature extraction ability and translation invariance of convolutional neural networks, the abnormal sound features of washing machines were learned so as to achieve automatic intelligent recognition of abnormal washing machine sounds on the production line. This method provided a complete process to solve the problems of training dataset construction and data imbalance. A data augmentation network model called advanced deep convolution generated adversarial network (ADCGAN) was proposed to solve the problem of training data scarcity. The traditional deep convolution generated adversarial network (DCGAN) model was improved to better suit the generation of industrial sounds. This model could be used to extend the original data and generate augmented datasets of abnormal washing machine sounds. The augmented datasets were used to train the convolutional neural network, and the test accuracy reached 0.999. The generalization ability of the abnormal sound recognition network was tested on a data set with added background noise. The correct recognition rate reached 0.902, which indicated that the network had good robustness in recognizing abnormal noises of washing machines.

    Semantic analysis and vectorization for intelligent detection of big data cross-site scripting attacks
    Haijun ZHANG,Yinghui CHEN
    Journal of Shandong University(Engineering Science). 2020, 50(2):  118-128.  doi:10.6040/j.issn.1672-3961.0.2019.043

    Big data from an access traffic corpus were processed with word vectorization based on semantic scenario analysis and vectorization methods, and intelligent detection of cross-site scripting attacks on big data was realized. Natural language processing methods were used for data acquisition, data cleaning, data sampling, feature extraction and other data preprocessing. A neural network based word vectorization algorithm was used to obtain the word-vectorized big data. Through theoretical analysis and derivation, intelligent detection algorithms based on several long short-term memory networks with different numbers of layers were implemented. With different hyperparameters and repeated tests, a range of results was obtained: a highest recognition rate of 0.9995, a lowest recognition rate of 0.2643, an average recognition rate of 99.88%, a variance of 0 and a standard deviation of 0.0004, together with curves of the recognition rate, the loss error, the cosine proximity of word vector samples and the mean absolute error. The results of the study showed that the algorithm had a high recognition rate, strong stability and excellent overall performance.