Table of Contents

    20 June 2018
    Volume 48 Issue 3
    Change detection with remote sensing images based on forward-backward heterogenicity
    LI Shijin, WANG Shengte, HUANG Leping
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  1-9.  doi:10.6040/j.issn.1672-3961.0.2017.406
    To improve the accuracy of change detection in the environment surrounding water bodies, an improved change detection method was proposed. The method combined spectral and textural features and fused index features to construct a hybrid feature space. The simple linear iterative clustering (SLIC) algorithm was applied to the superimposed images to obtain ground objects. The method then synthesized several kinds of forward-backward heterogeneity information to construct the forward-backward heterogeneity of the ground objects. The EM algorithm and minimum-error Bayes decision theory were used to obtain the change information between the images of the two phases. By eliminating the pseudo-change information of vegetation, relatively robust and accurate detection results could be achieved. Experimental results showed that the proposed method could effectively distinguish useful change information from irrelevant disturbance information and pseudo-change information, with a low false detection ratio and a low missed detection ratio. The accuracy of the proposed method could exceed 96%. Moreover, the method could intelligently recognize abnormal land-use changes around lakes and reservoirs.
    Gene expression data classification based on artificial bee colony and SVM
    YE Mingquan, GAO Lingyun, WAN Chunyuan
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  10-16.  doi:10.6040/j.issn.1672-3961.0.2017.405
    The high dimensionality, small sample size and high noise of gene expression data raise many challenges in tumor diagnosis. To classify tumor gene expression data more accurately, the kernel function parameters and penalty factor of an SVM (support vector machine) were optimized by the ABC (artificial bee colony) algorithm, with classification accuracy used as the fitness function. On this basis, a new gene expression data classification method named ABC-SVM was proposed. Experiments were conducted on six public tumor gene expression datasets, and the method was compared with other classification methods. The results showed that ABC-SVM could obtain higher classification accuracy with fewer informative genes and could predict the class of tumor samples more effectively.
    Item embedding classification method for E-commerce
    LONG Bai, ZENG Xianyu, LI Zhi, LIU Qi
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  17-24.  doi:10.6040/j.issn.1672-3961.0.2017.411
    Inspired by the word embedding model word2vec, which has proved highly successful in natural language processing in recent years, two item embedding models, item2vec and w-item2vec, were proposed. By modeling users' behaviour sequences, both item2vec and w-item2vec projected items to distributed representations in a vector space. The item vectors represented the properties of items and could be used to measure the relations between items, which made it possible to categorize products effectively and efficiently. Experiments on a real-world dataset showed that w-item2vec achieved an accuracy of nearly 50% for item categorization while using only 10% of the items for training. Both proposed models clearly outperformed the other methods.
    Multipurpose zero watermarking algorithm for color image based on SVD and DCNN
    ZHAO Yanxia, WANG Xizhao
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  25-33.  doi:10.6040/j.issn.1672-3961.0.2017.408
    A multipurpose zero-watermarking algorithm for color images based on SVD (singular value decomposition) and a DCNN (deep convolutional neural network) was proposed for copyright protection and tamper localization of color images. The original RGB color image was transformed into a YCbCr color image. The Y, Cb and Cr channels were transformed by the DWT (discrete wavelet transform), and the resulting coefficient matrices were decomposed by SVD to obtain the input matrices of the DCNN. The information matrix of the original image was obtained from the input matrix of the output layer of the DCNN and was used to generate the robust zero-watermark image. A second information matrix was obtained from the low-frequency subband coefficient matrix of the DWT of the Y channel and was used to generate the semi-fragile zero-watermark image. The experimental results showed that the algorithm was not only efficient but also had good resistance to strong common attacks.
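    The first step above, the RGB to YCbCr transform, can be sketched as follows. The abstract does not say which YCbCr variant was used, so this sketch assumes the full-range BT.601 coefficients used by JPEG/JFIF; the function name is illustrative only.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr.

    Assumption: full-range BT.601 (JPEG/JFIF) coefficients; the paper
    may use a different variant.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# A neutral pixel carries no chroma, so Cb and Cr sit at the midpoint 128.
white = rgb_to_ycbcr(255, 255, 255)
black = rgb_to_ycbcr(0, 0, 0)
```

    Separating luminance (Y) from chroma (Cb, Cr) is what lets the later steps treat the Y channel differently from the two colour channels.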
    Chinese financial news classification method based on convolutional neural network
    XIE Zhifeng, WU Jiaping, MA Lizhuang
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  34-39.  doi:10.6040/j.issn.1672-3961.0.2017.433
    To address the task of financial news classification, a new method based on a convolutional neural network (CNN) for classifying Chinese financial news was presented. A simple CNN with one convolutional layer was trained on top of word vectors obtained from an unsupervised neural language model. These vectors were trained on a large corpus of financial news. Compared with traditional methods, the CNN-based model was simple in structure and could perform well with a small sample set. The method not only solved the Chinese financial news classification problem effectively, but also demonstrated the effectiveness of convolutional neural networks for text classification problems.
    Building of domain sentiment lexicon based on word2vec
    LIN Jianghao, ZHOU Yongmei, YANG Aimin, CHEN Jin
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  40-47.  doi:10.6040/j.issn.1672-3961.0.2017.403
    To fill the gap in sentimental and semantic representation in domain sentiment lexicons, a method for constructing a domain sentiment lexicon via word vectors was proposed. A word2vec model was trained on 250 thousand news texts and 100 thousand hotel review texts. Eighty sentimental words, which possessed obvious sentiment, rich content and diverse parts of speech (POS), were chosen as the set of seed words. Meanwhile, 9 860 candidate sentimental words were acquired from the hotel review texts according to their TF-IDF values. The semantic similarity between the candidate sentimental words and the seed words was calculated from their word vectors; the sentimental words were mapped to a high-dimensional vector space, and a feature vector representation (Senti2vec) was extracted. Senti2vec was applied to the polarity classification of sentimental words and to sentiment analysis of texts. The experimental results showed that Senti2vec could represent both the meaning and the sentiment of sentimental words. Because Senti2vec was based on semantic similarity computed from domain-specific data, the method could be adapted more easily to different domains.
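    The similarity step described above can be illustrated with a minimal sketch. It assumes cosine similarity as the measure and assumes that a candidate word's feature vector is simply its list of similarities to the seed words; the abstract does not confirm either detail, and the toy vectors and function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors
    return dot / (norm_u * norm_v)


def seed_similarity_features(candidate, seeds):
    """One similarity score per seed word (assumed form of the feature vector)."""
    return [cosine_similarity(candidate, seed) for seed in seeds]


# Toy 3-dimensional vectors; real word2vec vectors are much longer.
seeds = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
candidate = [1.0, 1.0, 0.0]
features = seed_similarity_features(candidate, seeds)
```

    With eighty seed words, a representation built this way would have eighty dimensions, one per seed.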
    Incremental multi-view clustering algorithm based on kernel K-means
    ZHANG Peirui, YANG Yan, XING Huanlai, YU Xiuying
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  48-53.  doi:10.6040/j.issn.1672-3961.0.2017.434
    Because the kernel-based multi-view clustering algorithm (MVKKM) has a long running time on large-scale datasets, the concept of an incremental clustering model was introduced. The incremental multi-view clustering algorithm based on kernel K-means (IMVKKM) was proposed by combining the MVKKM algorithm with an incremental clustering framework. The dataset was divided into chunks, and MVKKM was applied to each chunk to obtain a set of cluster centers, which served as the initial cluster centers for the next chunk. The cluster centers of all chunks were then combined, and the final clustering result was obtained by applying MVKKM to them. The experimental results showed that IMVKKM achieved better clustering results and shorter running time than MVKKM on three large-scale datasets. The proposed approach could reduce the running time while maintaining clustering performance.
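    The chunk-by-chunk framework above can be sketched as follows. This is not MVKKM: plain single-view Lloyd k-means is swapped in as a stand-in so that the example stays short, and the initialisation from the first k points is an assumption.

```python
def kmeans(points, centers, iters=20):
    """Plain Lloyd k-means from given initial centers (stand-in for MVKKM)."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep empty groups' centers.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def incremental_kmeans(points, k, chunk_size):
    centers = points[:k]  # naive initialisation (assumption)
    collected = []
    for start in range(0, len(points), chunk_size):
        chunk = points[start:start + chunk_size]
        centers = kmeans(chunk, centers)  # previous centers seed this chunk
        collected.extend(centers)
    # Final pass: cluster the combined chunk centers.
    return kmeans(collected, centers)


data = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10), (1, 1), (11, 11)]
final_centers = incremental_kmeans(data, k=2, chunk_size=4)
```

    Each pass touches only one chunk (or the small set of collected centers), which is where the running-time saving on large datasets would come from.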
    K-NN algorithm for big data based on HBase and SimHash
    WANG Tingting, ZHAI Junhai, ZHANG Mingyang, HAO Pu
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  54-59.  doi:10.6040/j.issn.1672-3961.0.2017.414
    To address the high computational complexity of the K-NN (K-nearest neighbors) algorithm in big data scenarios, a K-NN algorithm for big data classification based on HBase and SimHash was proposed. The big data sets were mapped from the original space into the Hamming space to obtain sets of hash codes. Pairs of rowkeys and values were stored in an HBase database: the rowkeys were the hash codes of the instances, and the values were the classes of the instances. For each testing instance, the values of the instances with the same rowkey were retrieved from HBase, and the label of the testing instance was obtained by majority voting over those values. The proposed algorithm was compared experimentally with MapReduce-based K-NN and Spark-based K-NN in terms of running time and testing accuracy. The results showed that the running time of the proposed algorithm was much lower than that of MapReduce-based K-NN and Spark-based K-NN, while classification performance was preserved.
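    The hash-then-vote idea above can be sketched in a few lines. The sketch makes two assumptions: SimHash is taken in its random-hyperplane form (one bit per hyperplane), and an in-memory dictionary stands in for the HBase table, so no HBase API appears here. All function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def make_hyperplanes(dim, bits, seed=0):
    """Random Gaussian hyperplanes; a fixed seed keeps the codes reproducible."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def simhash(x, hyperplanes):
    """Binary code: one bit per hyperplane, set by the sign of the projection."""
    bits = []
    for h in hyperplanes:
        projection = sum(w * xi for w, xi in zip(h, x))
        bits.append("1" if projection >= 0 else "0")
    return "".join(bits)


def build_table(instances, labels, hyperplanes):
    """rowkey = hash code, value = list of class labels (dict stands in for HBase)."""
    table = defaultdict(list)
    for x, y in zip(instances, labels):
        table[simhash(x, hyperplanes)].append(y)
    return table


def classify(x, table, hyperplanes):
    """Majority vote among stored instances that share the query's hash code."""
    bucket = table.get(simhash(x, hyperplanes))
    if not bucket:
        return None  # no stored instance has this rowkey
    return Counter(bucket).most_common(1)[0][0]


planes = make_hyperplanes(dim=2, bits=8)
train = [(1.0, 1.0), (2.0, 2.0), (-1.0, -1.0), (-2.0, -2.0)]
labels = ["a", "a", "b", "b"]
table = build_table(train, labels, planes)
prediction = classify((3.0, 3.0), table, planes)
```

    A lookup by rowkey replaces the distance computation against every training instance, which is the source of the speed-up the abstract reports.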
    Image aesthetic quality evaluation based on embedded fine-tune deep CNN
    LI Yuxin, PU Yuanyuan, XU Dan, QIAN Wenhua, LIU Hejuan
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  60-66.  doi:10.6040/j.issn.1672-3961.0.2017.421
    The available image databases were not large enough for studying image aesthetic quality with convolutional neural networks. To address this problem, a fine-tuning transfer learning method was used to analyze the effect of convolutional neural network architecture and image content on image aesthetic quality evaluation. When image aesthetic quality was evaluated by image content, the problem of insufficient image data arose again. An embedded fine-tuning method, which applied fine-tuning twice in succession, was proposed to solve it. Experiments on Photo Quality, a small image database, gave good results. The results indicated that the accuracy of image aesthetic quality evaluation with embedded fine-tuning was on average 5.36% higher than that of the traditional hand-crafted feature extraction method, and 3.35% and 2.33% higher than that of the other two deep learning methods, respectively. The embedded fine-tuned deep convolutional neural network solved the small-database problem in image aesthetic quality evaluation research effectively and accurately.
    A document understanding method for short texts by auxiliary long documents
    YAN Yingying, HUANG Ruizhang, WANG Rui, MA Can, LIU Bowei, HUANG Ting
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  67-74.  doi:10.6040/j.issn.1672-3961.0.2017.402
    Based on the Dirichlet-multinomial regression (DMR) model, a dual Dirichlet-multinomial regression (DDMR) model was proposed, in which short texts were understood with the help of auxiliary long documents. A single topic set was shared by long documents and short texts that came from different data sources, and two Dirichlet priors were used to generate the topic allocations of the long documents and the short texts. This enabled the topic knowledge of long documents to be transferred to short texts and improved the understanding of the short texts. The experiments showed that the DDMR model was effective for topic discovery in short texts.
    The one-dimensional cutting stock problem with sequence-dependent cut losses
    LIANG Zehua, CUI Yaodong, ZHANG Yu
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  75-80.  doi:10.6040/j.issn.1672-3961.0.2017.425
    For a particular one-dimensional cutting stock problem abstracted from specific industrial applications, an algorithm based on sequential value correction was proposed, which aimed to minimize stock material waste and took the special properties of the problem into account. The cutting patterns were generated sequentially after the cost between each pair of items had been defined and computed, and a cutting plan was then composed from these patterns. Many different cutting plans were produced by repeatedly correcting the values of the items, and the best one was chosen as an approximation to the optimal solution. Comparison with other algorithms showed that the proposed approach consumed less raw material and required little computation time.
    Method for solving Choquet integral model based on ant colony algorithm
    CHEN Jiajie, WANG Jinfeng
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  81-87.  doi:10.6040/j.issn.1672-3961.0.2017.412
    An improved ant colony algorithm for the Choquet integral was investigated to enhance the efficiency of searching for fuzzy measures. The Choquet integral model was built according to the characteristic quantities and solved by searching globally or locally according to the state transition probability. Classification was then performed with the Fisher discriminant. The experiments used three cancer gene datasets preprocessed with the Bioconductor toolkit for the R language, and the classification results of the new model were compared with those of the mainstream algorithm. The results showed that on the DLBCL and colon datasets the ant colony algorithm performed better; on the prostate dataset, although the classification results were about the same, the ant colony algorithm still converged faster than the genetic algorithm. The improved ant colony algorithm provided a feasible and effective way to solve for fuzzy measures in the Choquet integral model.
    An ensemble method with convolutional neural network and deep belief network for gait recognition and simulation
    HE Zhengyi, ZENG Xianhua, GUO Jiang
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  88-95.  doi:10.6040/j.issn.1672-3961.0.2017.427
    The Gaussian-based conditional restricted Boltzmann machine (GCRBM) time series model could efficiently predict a single type of gait time series data, but it could not accurately recognize and predict multi-category gait time series data. To solve this problem, an ensemble method with convolutional neural networks (CNNs) and deep belief networks (DBNs) for gait recognition and simulation was proposed. Multiple CNN models with different structures were trained on all the gait data. Multiple DBN models, corresponding to the categories of the data, were trained to learn low-dimensional features, and the corresponding GCRBM models were trained on those low-dimensional features. In the recognition and simulation step, the class of the gait data was identified by all the CNN classifiers with a majority voting strategy (the minority obeys the majority); the low-dimensional features from the DBN model of the identified class were then used as the input of the corresponding GCRBM model to predict the later low-dimensional features of the target data. The gait images could be reconstructed by the corresponding DBN model. Compared with the support vector machine (SVM), integrated DBN and CNN methods, the proposed method improved the gait recognition rate on the CASIA gait datasets. Moreover, the predicted results could be rendered as realistic gait sequences by the proposed method, which demonstrated the validity of the model.
    An additive co-clustering for recommendation of integrating social network
    DU Xixi, LIU Huafeng, JING Liping
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  96-102.  doi:10.6040/j.issn.1672-3961.0.2017.404
    To alleviate the user cold-start problem and improve the prediction accuracy of recommendation algorithms, an additive co-clustering recommendation model combining social networks (SN-ACCRec) was proposed, which integrated user social relations into the user clustering of the rating matrix. Based on a theoretical analysis of users' social relations, users were divided into blocks following the idea of fuzzy C-means clustering, and a co-clustering result was obtained by clustering the items of the rating matrix with the k-means algorithm. The general and specific categories were obtained by generating the user and item additive co-clustering results iteratively, and the missing values were then predicted. The model was evaluated by ten-fold cross-validation, and the experimental results showed that it could reduce the mean absolute error (MAE) and the root mean square error (RMSE) and gave better recommendation performance for cold-start users.
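    For reference, the two error measures named above have standard definitions, sketched here on a toy list of ratings. The sample numbers are made up for illustration and are not results from the paper.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical ratings on a 1-5 scale.
actual = [4, 3, 5]
predicted = [3, 3, 3]
toy_mae = mae(actual, predicted)
toy_rmse = rmse(actual, predicted)
```

    RMSE squares each error before averaging, so it penalises a few large misses more heavily than MAE does.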
    A finger-vein recognition method based on weighted graph model
    YE Ziyun, YANG Jinfeng
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  103-109.  doi:10.6040/j.issn.1672-3961.0.2017.467
    A new weighted graph construction method was proposed for finger-vein network representation. The nodes of the weighted graph were generated by dividing the image into blocks, its edges were generated by a triangulation algorithm, and the edge weights were set to the feature similarities between adjacent blocks. In this way, a finger-vein image could be represented by a weighted graph, and the adjacency matrix of this weighted graph was used for finger-vein recognition. The experimental results demonstrated the effectiveness of the method, and some important factors that affected graph recognition results were discussed in detail.
    Parallelization and GPU acceleration of compressive sensing reconstruction algorithm
    HE Wenjie, HE Weichao, SUN Quansen
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  110-114.  doi:10.6040/j.issn.1672-3961.0.2017.413
    To address the poor real-time performance of compressive sensing reconstruction algorithms, a parallel acceleration of the compressive sampling matching pursuit (CoSaMP) algorithm was proposed. Coarse-grained parallelization of the reconstruction algorithm was realized with multithreading. The hotspots of the CoSaMP algorithm were analyzed, and the time-consuming matrix operations were ported to the graphics processing unit (GPU) to achieve fine-grained parallelization of the algorithm. Experiments on the test image showed that a 50-fold speedup was achieved and that the computing time of the reconstruction algorithm was reduced effectively.
    A hybrid intrusion detection system based on BFOA and K-means algorithm
    XIAO Miaomiao, WEI Benzheng, YIN Yilong
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  115-119.  doi:10.6040/j.issn.1672-3961.0.2017.428
    The K-means algorithm is sensitive to the selection of the initial cluster centers and the number of clusters K, which leads to unstable clustering results and has a significant impact on the detection results of an IDS (intrusion detection system). To solve this problem, a hybrid intrusion detection algorithm (HIDS) based on BFOA (bacterial foraging optimization algorithm) and K-means was proposed. The value of K was determined dynamically by a distance threshold method. BFOA was used to optimize the initial cluster centers so that they were globally optimal, which resolved the instability of the K-means clustering results. In an experimental test on the KDD99 dataset, the detection rate was 98.33%. The experimental results showed that the method could effectively improve the detection rate and reduce the false detection rate.
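    The abstract names a distance threshold method for choosing K but gives no details. One common interpretation, shown here purely as an assumption, is a single pass that opens a new cluster whenever a point lies farther than the threshold from every existing center. The BFOA refinement of the centers is not shown.

```python
import math


def threshold_centers(points, threshold):
    """Single-pass seeding: a point farther than `threshold` from all
    current centers becomes a new center. K is the number of centers found."""
    centers = []
    for p in points:
        if not centers or min(math.dist(p, c) for c in centers) > threshold:
            centers.append(p)
    return centers


# Hypothetical 2-D records forming three well-separated groups.
records = [(0, 0), (0.5, 0), (5, 5), (5.2, 5), (10, 0)]
initial_centers = threshold_centers(records, threshold=2.0)
k = len(initial_centers)
```

    The centers found this way depend on the order of the points, which is one reason a subsequent optimisation step is useful.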
    A word extend LDA model for short text sentiment
    SHEN Ji, MA Zhiqiang, LI Tuya, ZHANG Li
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  120-126.  doi:10.6040/j.issn.1672-3961.0.2017.407
    To address the low accuracy of sentiment polarity analysis for short texts, a sentiment analysis model for short texts based on latent Dirichlet allocation (LDA) was presented. The model identified emotional words in the short texts by part of speech and expanded them, under constraints, into an extended set, which increased the co-occurrence frequency between emotional words. The model added the extended set to the emotional words found in the short texts, which lengthened the short texts, extracted emotional information and turned topic clustering into emotion-topic clustering. Experiments were conducted on 4 000 positive and negative short texts. The results showed that the model improved sentiment classification by 11.8% over the joint sentiment topic model and by 9.5% over the latent sentiment model, and that it also found more emotional words. This showed that the model extracted richer emotional features from short texts and achieved higher classification accuracy in sentiment analysis.
    A radial basis function neural network model based on monotonic constraints
    CAO Ya, DENG Zhaohong, WANG Shitong
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  127-133.  doi:10.6040/j.issn.1672-3961.0.2017.423
    The radial basis function (RBF) neural network is an efficient feedforward neural network with a simple structure and good generalization ability, and it has been widely used in data classification. However, in some special classification scenarios, such as those involving monotonic data, the RBF neural network cannot fully realize its potential. To meet this challenge, a monotonic radial basis function neural network (MC-RBF) was proposed. The model added prior knowledge about monotonicity, expressed as inequalities based on the partial order of the training data. Tikhonov regularization was introduced into MC-RBF to ensure the uniqueness and boundedness of the solution of the optimization problem. The experimental results showed that MC-RBF had better classification performance than the classical RBF neural network on monotonic datasets.
    An over sampling algorithm based on clustering
    WANG Huan, ZHOU Zhongmei
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  134-139.  doi:10.6040/j.issn.1672-3961.0.2017.416
    In research on oversampling, the ClusteredSMOTE-Boost algorithm, based on clustering, was proposed in order to generate meaningful new samples. The algorithm filtered out the noisy minority class samples and took the remaining minority class samples as target samples for synthesizing new samples. All target samples were clustered, and the weight and the number of the target samples for the whole training set were determined from the characteristics of the clusters to which the target samples belonged. The K nearest neighbors of a target sample within its cluster were selected, and one of them was chosen at random to synthesize a new sample together with the target sample. New samples were therefore similar to the samples in the target cluster, which reduced the boundary complexity caused by the additional samples. The experimental results showed that ClusteredSMOTE-Boost was superior to the three classical algorithms SMOTE-Boost, ADASYN-Boost and BorderlineSMOTE-Boost on a variety of measures.
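    The synthesis step above follows the usual SMOTE pattern: pick one of the K nearest neighbors and interpolate toward it. The sketch below restricts the neighbor search to the target's own cluster, as the abstract describes. It does not implement the noise filtering, the weighting or the boosting, and the function name and toy data are hypothetical.

```python
import random


def synthesize(target, cluster, k, rng):
    """Create one synthetic sample on the segment between `target` and one of
    its k nearest cluster-mates. Assumes the cluster holds at least two samples."""
    others = [p for p in cluster if p is not target]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, target)))
    neighbor = rng.choice(others[:k])
    gap = rng.random()  # interpolation factor in [0, 1)
    return tuple(t + gap * (n - t) for t, n in zip(target, neighbor))


# Hypothetical minority-class cluster; (9, 9) is too far to be among the 2 nearest.
cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (9.0, 9.0)]
new_sample = synthesize(cluster[0], cluster, k=2, rng=random.Random(0))
```

    Because both endpoints come from the same cluster, the synthetic point stays inside that cluster's region instead of bridging two distant groups.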
    Coefficient of variation clustering algorithm for non-uniform data
    YANG Tianpeng, XU Kunpeng, CHEN Lifei
    JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE). 2018, 48(3):  140-145.  doi:10.6040/j.issn.1672-3961.0.2017.410
    Because of the “uniform effect” found in partition-based algorithms, clustering non-uniform data remained an open and challenging task. To solve this problem, a clustering algorithm based on the coefficient of variation was proposed. The “uniform effect” caused by K-means-type partitioning clustering algorithms was analyzed from the viewpoint of clustering optimization. Instead of the squared error, a new measure of dispersion for non-uniform data, based on the coefficient of variation, was proposed. The clustering objective function was defined using a new dissimilarity formula for non-uniform data, also based on the coefficient of variation. The clustering algorithm was then derived with a local optimization method. The experimental results on real and synthetic non-uniform datasets showed that the clustering accuracy of CVCN was better than that of K-means, Verify2 and ESSC.
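    The abstract does not give the paper's dispersion measure, so the sketch below shows only the textbook coefficient of variation (standard deviation divided by the mean) on which it is said to be based. The sample values are made up.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (mean must be non-zero)."""
    return statistics.pstdev(values) / statistics.mean(values)


sample = [2, 4, 4, 4, 5, 5, 7, 9]
scaled = [10 * v for v in sample]  # same shape, ten times the spread
cv_sample = coefficient_of_variation(sample)
cv_scaled = coefficient_of_variation(scaled)
```

    The two values coincide because the coefficient of variation is a relative measure, whereas the squared error of the scaled data would be a hundred times larger.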