Abstract
I will talk about some of our work on vision-and-language understanding for video understanding, embodied AI, and continual learning. For video understanding, we explore methods that localize moments in videos, generate image sequences for a story, and answer questions about a given video. Our work addresses several challenges, including spatiotemporal attention mechanisms that encode long-horizon information. Transitioning to embodied AI, I will discuss advancements in developing agents capable of performing goal-directed tasks in simulated environments. By integrating video understanding with action sequence prediction and imitation learning, we enhance the agent's ability to adapt and plan effectively in dynamic scenarios. If time permits, this talk will also highlight the role of continual learning and environmental adaptability, showcasing their importance for robust and interactive AI systems.
Online meeting link:
Webex meeting link
Meeting number: 2514 838 0092
Password: wxFqanV894p
Bio
Jonghyun Choi received the B.S. and M.S. degrees in electrical engineering and computer science from Seoul National University, Seoul, South Korea, in 2003 and 2008, respectively. He received a Ph.D. degree from the University of Maryland, College Park, in 2015, under the supervision of Prof. Larry S. Davis. He is currently an associate professor at Seoul National University, Seoul, South Korea. During his Ph.D., he worked as a research intern in a number of research labs, including the US Army Research Lab (2012), Adobe Research (2013), Disney Research Pittsburgh (2014), and Microsoft Research Redmond (2014). He was an associate professor at Yonsei University (2022-2024), an assistant professor at GIST (2018-2022), a research scientist at the Allen Institute for Artificial Intelligence (AI2), Seattle, WA (2016-2018), and a senior researcher at Comcast Applied AI Research, Washington, DC (2015-2016). He serves as an area chair at NeurIPS, CVPR, WACV, AAAI (senior PC), and ICRA (associate editor), and as an associate editor of IEEE Transactions on PAMI and IJCV. His research interests include visual understanding for edge devices and household robots, and visual recognition of images and videos under computational and data efficiency constraints.