Maggie Xinyue Hao
Video understanding is a fundamental ability required for artificially intelligent agents such as robots, self-driving cars, and interactive systems. Because of video's scale and temporal nature, existing techniques usually suffer from high complexity, expensive computation, and low accuracy. In this PhD, we will tackle efficiency and accuracy together by using attention. Learning to attend to the right parts of a video will allow the least important parts to be discarded, which will make processing faster. Learning to attend will likely also yield more accurate results, as it will discard information that is potentially noisy. One of the main applications where we will test our findings is action recognition.
PhD, Centre for Doctoral Training in Robotics & Autonomous Systems, University of Edinburgh, 2022-2026
MSc (with Distinction), Artificial Intelligence, University of Edinburgh, 2020-2021
BEng, Telecommunications Engineering with Management, Beijing University of Posts and Telecommunications (BUPT), 2016-2020
Before starting my PhD studies, I worked as an algorithm engineer at ByteDance and Alibaba Group in China. During my undergraduate years, I also worked as a research intern in lifelong learning at Intel Labs China and Tsinghua University. My research interests lie in deep learning and computer vision, especially video understanding and spatio-temporal modeling.