
Journal of Shandong University (Engineering Science), 2012, Vol. 42, Issue 1: 19-24.

• Machine Learning and Data Mining •

A learning to rank method based on position optimization

LIN Yuan, LIN Hongfei*, ZHANG Ping

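  1. School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China
  • Received: 2011-10-12 Online: 2012-02-20 Published: 2011-10-12
  • Corresponding author: LIN Hongfei (b. 1962), male, from Tongliao, Inner Mongolia; professor, Ph.D., doctoral supervisor. His main research interests are search engines, text mining, and natural language understanding. Email: hflin@dlut.edu.cn
  • About the author: LIN Yuan (b. 1983), male, from Changchun, Jilin; Ph.D. candidate. His main research interests are information retrieval, machine learning, and learning to rank. Email: yuanlin@mail.dlut.edu.cn
  • Funding:

    National Natural Science Foundation of China (60673039, 60973068); National High Technology Research and Development Program of China (863 Program) (2006AA01Z151); Doctoral Program Foundation of the Ministry of Education (20090041110002); Scientific Research Foundation for Returned Overseas Scholars, Ministry of Education (20090041110002)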

A learning to rank approach based on ranking positions

LIN Yuan, LIN Hong-fei*, ZHANG Ping   

  1. School of Computer Science and Engineering, Dalian University of Technology, Dalian 116024, China
  • Received:2011-10-12 Online:2012-02-20 Published:2011-10-12

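Abstract (translated from the Chinese version):

Designing an effective relevance ranking function is a core problem in information retrieval research, because the ranking function directly affects the quality of search results. Ranking functions are generally assessed with information retrieval evaluation measures, and the main difficulty in optimizing them is that these measures depend on the ranked positions of the result documents. Studying the positions of relevant documents in the result list returned for a query is therefore important. New input samples are constructed by exploring the partial-order relations between relevant and irrelevant documents. Each sample consists of one relevant document and a group of irrelevant documents, and it distinguishes document relevance more effectively. Based on these samples, a position loss function is defined to optimize the ranking results. Experiments on the public Letor30 data set show that the method improves accuracy under several ranking evaluation measures by 2% on average, which demonstrates the effectiveness of the proposed method.

Key words: learning to rank, information retrieval, ranking positions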

Abstract:

Designing effective ranking functions is a core problem in information retrieval, since the ranking function directly affects the relevance of the search results. Learning ranking functions from preference data in particular has recently attracted much interest. Ranking algorithms are usually evaluated with information retrieval measures, and the main difficulty in optimizing these measures directly is that they depend on the ranks of documents. It is therefore important to optimize the ranking positions of relevant documents in the result list. Specifically, the role of preferences between relevant and irrelevant documents in the learning process was investigated. A new input sample, named the one-group sample, was constructed from one relevant document and a group of irrelevant documents for a given query. The new sample could distinguish the relevance of documents more effectively. Based on the new samples, a position-based loss function was developed to improve the performance of the learned ranking functions. Experiments on the Letor30 data set showed that the method improved ranking accuracy by 2% on average across several evaluation measures, demonstrating the effectiveness of the proposed method.

Key words: learning to rank, information retrieval, ranking positions
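
The one-group sample described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact grouping rule or the form of the position loss, so everything below is an assumption made for illustration: the names `OneGroupSample`, `build_one_group_samples`, `position_in_group`, and `position_loss` are hypothetical, irrelevant documents are split into fixed-size groups, and the loss is taken to be the base-2 logarithm of the relevant document's position inside its group. It is not the authors' formulation.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class OneGroupSample:
    """One relevant document paired with a group of irrelevant documents."""
    relevant: int            # index of the relevant document
    irrelevant: List[int]    # indices of the irrelevant documents in the group


def build_one_group_samples(labels: Sequence[int], group_size: int) -> List[OneGroupSample]:
    """Build samples for a single query from its relevance labels.

    Assumption: label > 0 means relevant, label == 0 means irrelevant, and
    irrelevant documents are chunked into groups of at most `group_size`.
    """
    relevant = [i for i, y in enumerate(labels) if y > 0]
    irrelevant = [i for i, y in enumerate(labels) if y == 0]
    samples = []
    for r in relevant:
        for start in range(0, len(irrelevant), group_size):
            samples.append(OneGroupSample(r, irrelevant[start:start + group_size]))
    return samples


def position_in_group(scores: Sequence[float], sample: OneGroupSample) -> int:
    """1-based position of the relevant document when the group is sorted by score.

    Ties are counted against the relevant document.
    """
    beaten_by = sum(1 for j in sample.irrelevant if scores[j] >= scores[sample.relevant])
    return 1 + beaten_by


def position_loss(scores: Sequence[float], sample: OneGroupSample) -> float:
    """Illustrative position loss: 0 when the relevant document is ranked first,
    growing logarithmically as it falls lower in its group (an assumed form)."""
    return math.log2(position_in_group(scores, sample))


# Usage: one query with five documents, two of them relevant.
labels = [1, 0, 0, 2, 0]
scores = [0.9, 0.5, 1.2, 0.3, 0.1]   # hypothetical model scores
samples = build_one_group_samples(labels, group_size=2)
total_loss = sum(position_loss(scores, s) for s in samples)
```

Grouping several irrelevant documents with one relevant document lets the loss depend on where the relevant document lands within the group, rather than on each document pair in isolation. Since a count-based position is not differentiable in the scores, a trainable version would need a smooth surrogate.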
