Greetings from Bellevue.

I am an Applied Scientist at Amazon Alexa AI, working on multimodal perception, video understanding, and visual reasoning.

My broader research lies at the intersection of computer vision, robotics, and embodied AI, with a focus on building perception systems that can understand 3D structure, reason across viewpoints, and act under uncertainty in open-world environments. Previously, I received my Ph.D. from Northwestern University, where I worked with Prof. Ying Wu on active vision, embodied recognition, uncertainty-aware perception, and robotics-related visual understanding.

My detailed resume/CV is here (last updated in 2026).

🔥 News

  • 2026.02: 🎉 One co-authored paper on safeguarding MLLMs was accepted to CVPR 2026! Congratulations to Jinqi and the other authors.
  • 2025.02: 🎉 One co-authored paper on visual localization under extreme viewpoint changes was accepted to CVPR 2025! Congratulations to Yunxuan and the other authors.
  • 2024.10: 🎉 One co-authored paper on visual question answering was accepted to EMNLP 2024! Congratulations to Xiaoying and the other authors.
  • 2024.05: The proposed dataset to evaluate active recognition has been made publicly available! Please refer to the page for details.
  • 2024.04: 🎉 I have successfully defended my Ph.D.! I would like to extend my gratitude to my committee: Prof. Ying Wu, Prof. Qi Zhu, and Prof. Thrasos N. Pappas. I will join Amazon as an Applied Scientist this summer!
  • 2024.02: 🎉 Two papers on active recognition for embodied agents have been accepted to CVPR 2024! Thanks to all my collaborators!
  • 2023.07: 🎉 Our paper on uncertainty estimation has been accepted to ICCV 2023! Many thanks to my advisors: Dr. Bo Liu, Dr. Haoxiang Li, Prof. Ying Wu, and Prof. Gang Hua!

📖 Education

  • 2019 - 2024, M.S., Ph.D. in Electrical Engineering, advised by Prof. Ying Wu, Northwestern University.
  • 2013 - 2019, B.E., M.S. in Computer Science, advised by Prof. Long Chen, Sun Yat-sen University.

๐Ÿ“ Publications

CVPR 2024

Active Open-Vocabulary Recognition: Let Intelligent Moving Mitigate CLIP Limitations

Lei Fan, Jianxiong Zhou, Xiaoying Xing, Ying Wu

Poster | Video

  • Investigate CLIP's limitations in embodied perception scenarios, emphasizing diverse viewpoints and occlusion degrees.
  • Propose an active agent to mitigate CLIP's limitations, aiming for active open-vocabulary recognition.

CVPR 2024

Evidential Active Recognition: Intelligent and Prudent Open-World Embodied Perception

Lei Fan, Mingfu Liang, Yunxuan Li, Gang Hua, Ying Wu

Supplementary | Poster | Dataset | Project (coming soon) | Video

  • Handle unexpected visual inputs during an embodied agent's training and testing in open environments.
  • Collect a dataset for evaluating active recognition agents. Each testing sample is accompanied by a recognition difficulty level.
  • Apply evidential deep learning and evidence combination for frame-wise information fusion, mitigating interference from unexpected images.

ICCV 2023

Flexible Visual Recognition by Evidential Modeling of Confusion and Ignorance

Lei Fan, Bo Liu, Haoxiang Li, Ying Wu, Gang Hua

Supplementary | Poster | Project | Code

  • Modeling both confusion and ignorance with hyper-opinions.
  • Proposing a hierarchical structure with binary plausible functions to handle the challenge of 2^K predictions.
  • Experiments with synthetic data, flexible visual recognition, and open-set detection validate our approach.

WACV 2023

Avoiding Lingering in Learning Active Recognition by Adversarial Disturbance

Lei Fan, Ying Wu

Supplementary | Poster

  • Lingering: The joint learning process could lead to unintended solutions, such as a collapsed policy that only visits views on which the recognizer is already trained well enough to obtain rewards.
  • Our approach integrates another adversarial policy to disturb the recognition agent during training, forming a competing game that promotes active exploration and avoids lingering.

ICCV 2021

FLAR: A Unified Prototype Framework for Few-sample Lifelong Active Recognition

Lei Fan, Peixi Xiong, Wei Wei, Ying Wu

Supplementary | Poster

  • The active recognition agent needs to incrementally learn new classes with limited data during exploration.
  • Our approach integrates prototypes, a robust representation for limited training samples, into a reinforcement learning solution, which motivates the agent to move towards views resulting in more discriminative features.

💻 Internships

  • 2023.06 - 2023.09, Applied Scientist Intern, Amazon Robotics, Seattle, US.
    - Topic: Surface normal estimation and stability analysis.
    - Advisors: Dr. Shantanu Thaker, Dr. Sisir Karumanchi.
  • 2022.06 - 2022.09, Research Intern, Wormpex AI Research, Bellevue, US.
    - Topic: Uncertainty quantification for deep visual recognition.
    - Advisors: Dr. Bo Liu, Dr. Haoxiang Li, and Dr. Gang Hua.
  • 2020.06 - 2020.09, Research Intern, Yosion Analytics, Chicago, US.
    - Topic: Autonomous forklift in a human-machine co-working environment.
  • 2016.06 - 2016.09, Visual Engineer Intern, DJI, Shenzhen, China.
    - Topic: Stereo matching using the fish-eye cameras on drones.

🎖 Honors and Awards

  • 2019.09 Northwestern University Murphy Fellowship.
  • 2018.06 Best Student Paper, IEEE Intelligent Vehicle Symposium.
  • 2019.09 National Merit Scholarship, China.