The Microsoft Research Asia ACE Talk series invites outstanding rising stars in academia to share their research, providing a platform for students and researchers to exchange ideas and stay abreast of frontier developments.
For the 37th ACE Talk, we are delighted to welcome Zongqing Lu, Associate Professor at Peking University, to give a talk titled "Scaling Humanoid Robot Learning with Internet Videos." Professor Lu will share how Internet-scale video data can be used to teach robots human-like skills, and how vision-language models, motion models, and reinforcement learning can be combined to achieve efficient knowledge transfer.
Talk Information
Time: June 11 (Wednesday), 10:00 - 11:50
Venue: Microsoft Research Asia (No. 5 Danling Street, Haidian District, Beijing), or online via Teams
• Guest talk and Q&A (10:00 - 11:30)
• Office tour (11:30 - 11:50)
Registration
Please scan the QR code below to fill out the registration form. Once your registration is confirmed, you will receive an email notification containing the Teams meeting link for the talk.
Registration deadline: June 10 (Tuesday), 14:00
Registration QR code:
Speaker
Zongqing Lu
Peking University
Associate Professor
Dr. Zongqing Lu is an Associate Professor in the School of Computer Science at Peking University (PKU). His research focuses on reinforcement learning, multimodal models, and general-purpose agents. He has published extensively in, and serves as an area chair for, top-tier machine learning conferences, including ICLR, ICML, and NeurIPS. From 2022 to 2024, he led the Multimodal Interaction Research Center at the Beijing Academy of Artificial Intelligence. He obtained his PhD from Nanyang Technological University. In addition to his academic roles, Dr. Lu is the founder of BeingBeyond, a startup focused on developing foundation models for embodied AI.
Talk Abstract
Scaling Humanoid Robot Learning with Internet Videos
Humanoid robots hold immense promise as general-purpose physical agents, yet their learning remains heavily constrained by the scarcity and cost of real-world data. In this talk, I will present our recent progress on scaling up humanoid robot learning by leveraging vast and diverse Internet videos. We explore how vision-language models, motion models, and reinforcement learning can be harnessed to extract actionable knowledge from video data and enable humanoid robots to learn diverse, human-like skills without large-scale robot trials. Our approach integrates large-scale pretraining with Internet data, followed by efficient finetuning in physical/simulated environments, enabling humanoid robots to perform complex manipulation, locomotion, and interaction tasks. Our results show that Internet-scale data can serve not only as a catalyst for visual and semantic understanding but also as a rich resource for embodiment, bringing us closer to general-purpose, instruction-following humanoid robots.
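For readers unfamiliar with this kind of pipeline, the sketch below illustrates, in very schematic Python (PyTorch), the two-stage recipe the abstract describes: supervised pretraining on action labels derived from video, followed by reward-driven finetuning in simulation. All names (HumanoidPolicy, pretrain_on_videos, finetune_in_sim) and the random stand-in data are hypothetical placeholders for illustration only, not the speaker's actual models, datasets, or code.

import torch
import torch.nn as nn

class HumanoidPolicy(nn.Module):
    # Toy policy network: maps a flattened observation to joint-action targets.
    def __init__(self, obs_dim=64, act_dim=23):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, act_dim),
        )

    def forward(self, obs):
        return self.net(obs)

def pretrain_on_videos(policy, steps=100, obs_dim=64, act_dim=23):
    # Stage 1 (assumed): supervised pretraining on pseudo-actions extracted
    # from Internet videos; random tensors stand in for real video-derived data.
    opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
    for _ in range(steps):
        obs = torch.randn(32, obs_dim)       # stand-in for video observations
        target = torch.randn(32, act_dim)    # stand-in for motion-model pseudo-labels
        loss = nn.functional.mse_loss(policy(obs), target)
        opt.zero_grad()
        loss.backward()
        opt.step()

def finetune_in_sim(policy, episodes=50, obs_dim=64):
    # Stage 2 (assumed): reward-driven finetuning in simulation; a real setup
    # would use a physics simulator and a proper RL algorithm.
    opt = torch.optim.Adam(policy.parameters(), lr=1e-5)
    for _ in range(episodes):
        obs = torch.randn(1, obs_dim)
        act = policy(obs)
        reward = -act.pow(2).mean()          # dummy reward from a placeholder objective
        opt.zero_grad()
        (-reward).backward()
        opt.step()

if __name__ == "__main__":
    policy = HumanoidPolicy()
    pretrain_on_videos(policy)
    finetune_in_sim(policy)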
Host
Jiaolong Yang
Microsoft Research Asia
Principal Researcher
Dr. Jiaolong Yang is a Principal Researcher and Research Manager at the Microsoft Research Asia (MSRA) lab in Beijing, China. His research interests lie in 3D computer vision and AI, including 3D reconstruction and generation, human face and body modeling, immersive AI experiences, and physical AI embodiments. Part of his research has been transferred to Microsoft products such as Microsoft Cognitive Services, Windows Hello, and Microsoft XiaoIce. He serves as a program committee member and reviewer for major computer vision conferences and journals, including CVPR, ICCV, ECCV, TPAMI, and IJCV, as an Area Chair for CVPR, ICCV, ECCV, WACV, and MM, and as an Associate Editor of the International Journal of Computer Vision (IJCV). He received the Excellent PhD Thesis Award from the China Society of Image and Graphics (CSIG) in 2017, the Best Paper Award at IEEE VR 2022, and the Best Paper Honorable Mention at IEEE VR 2025.
About ACE Talk
The ACE (Accelerate, Create, Empower) Talk series embodies our commitment across three dimensions: to accelerate the adoption of cutting-edge research by giving researchers and students a venue to share the latest breakthroughs and advances; to create an environment that nurtures novel ideas and fosters innovative solutions to complex problems; and, at its core, to empower individuals, both speakers and audiences, to drive positive change on a broader scale. Through this endeavor, we aspire to enhance global communication and cultivate a diverse academic atmosphere, connecting talented individuals worldwide and ultimately contributing to meaningful change within the academic research community.
Source: opendotnet