Journal of Shandong University (Engineering Science) ›› 2019, Vol. 49 ›› Issue (6): 11-24. doi: 10.6040/j.issn.1672-3961.0.2019.229
Wei WANG1, Feng WU2, Fengyu ZHOU3,*
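Abstract:
Autonomous learning of manipulation skills is one of the key capabilities that robots will need in order to serve human society, and it is also one of the most active topics in robotics research. This paper gives a comprehensive survey of the mainstream modes, approaches, and algorithms for robot manipulation skill learning, together with the strengths and weaknesses of the different methods. It summarizes the challenges and the key open problems that an individual robot faces in autonomously learning manipulation skills under a future knowledge-sharing paradigm. It also introduces a potential solution that combines the individual-learning mode with the shared-learning mode to improve robots' autonomous cognition and learning of manipulation skills.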