Zun Wang
I'm a first-year CS Ph.D. student at UNC Chapel Hill, advised by Prof. Mohit Bansal. Previously, I was a Master of Machine Learning and Computer Vision student at the Australian National University, advised by Prof. Stephen Gould. During my master's, I also interned at OpenGVLab, Shanghai AI Laboratory, led by Prof. Yu Qiao.
Before that, I received my bachelor's degree in applied mathematics from the University of Science and Technology of China.
Email /
CV /
twitter /
Google Scholar /
Github
Research
My research goal is to build multimodal, generative, and embodied agents, with current interests in:
- Multimodal Understanding and Generation
- Scalable Learning for Embodied Agents
- Multimodal Data Generation and Curation
News
- (2025-01) Self-Refining Data Flywheel for high-quality VLN data generation is accepted to ICLR 2025! It surpasses human performance on the R2R VLN benchmark for the first time! 🤖
- (2024-11) New preprint DreamRunner✨ for storytelling video generation! My first Ph.D. project at the UNC MURGe Lab🥳!
- (2024-11) Our VLN survey paper is accepted to TMLR!
- (2024-08) Started my Ph.D. in the MURGe Lab at UNC Chapel Hill. Hello UNC😆!
- (2024-07) Two papers accepted to ECCV 2024! Congrats Gengze and the InternVideo Team!
- (2024-04) One paper accepted to TPAMI! Congrats Dong!
- (2024-02) One paper accepted to CVPR 2024 as Highlight! Congrats Kunchang!
- (2023-10) Attending ICCV 2023 in Paris in person😆! A great pleasure to learn from so many researchers and scholars🥹!
- (2023-07) One paper accepted to ICCV 2023 as an Oral presentation!
- (2023-07) I was awarded a Postgraduate Medal for Academic Excellence from ANU! Photos here!
- (2023-07) I graduated from the Master of Machine Learning and Computer Vision program with Commendation from ANU.
- (2022-11) I was awarded the Chancellor's Letter of Commendation from ANU.
- (2022-09) We won 1st place in the REVERIE VLN Challenge at CSIG 2022!
- (2022-06) We won 1st place in the RxR-Habitat VLN Competition at the Embodied AI Workshop, CVPR 2022!
- (2022-03) One paper accepted to CVPR 2022!
DreamRunner: Fine-grained Storytelling Video Generation with Retrieval-augmented Motion Adaptation
Zun Wang, Jialu Li, Han Lin, Jaehong Yoon, Mohit Bansal
Preprint
paper /
code /
project page
Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel
Zun Wang, Jialu Li, Yicong Hong, Songze Li, Kunchang Li, Shoubin Yu, Yi Wang, Yu Qiao, Yali Wang, Mohit Bansal, Limin Wang
ICLR, 2025
paper /
code
SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts
Gengze Zhou, Yicong Hong, Zun Wang, Chongyang Zhao, Mohit Bansal, Qi Wu
Preprint
paper / code
MVBench: A Comprehensive Multi-Modal Video Understanding Benchmark
Kunchang Li, Yali Wang, Yinan He, Yizhuo Li, Yi Wang, Yi Liu, Zun Wang, Jilan Xu, Guo Chen, Ping Luo, Limin Wang, Yu Qiao
CVPR, 2024, Highlight (3%)
paper / code
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding
Yi Wang*, Kunchang Li*, Xinhao Li*, Jiashuo Yu*, Yinan He*, Guo Chen, Baoqi Pei, Rongkun Zheng, Jilan Xu, Zun Wang, Yansong Shi, Tianxiang Jiang, Songze Li, Hongjie Zhang, Yifei Huang, Yu Qiao, Yali Wang, Limin Wang
ECCV, 2024
paper / code
NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models
Gengze Zhou, Yicong Hong, Zun Wang, Xin Eric Wang, Qi Wu
ECCV, 2024
paper / code
ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments
Dong An, Hanqing Wang, Wenguan Wang, Zun Wang, Yan Huang, Keji He, Liang Wang
TPAMI, 2024
paper / code
Scaling Data Generation in Vision-and-Language Navigation
Zun Wang*, Jialu Li*, Yicong Hong*, Yi Wang, Qi Wu, Mohit Bansal, Stephen Gould, Hao Tan, Yu Qiao
ICCV, 2023, Oral presentation (1.9%)
paper /
code /
project page
InternVideo: General Video Foundation Models via Generative and Discriminative Learning
Yi Wang*, Kunchang Li*, Yizhuo Li*, Yinan He*, Bingkun Huang*, Zhiyu Zhao*, Hongjie Zhang*, Jilan Xu, Yi Liu, Zun Wang, Sen Xing, Guo Chen, Junting Pan, Yali Wang, Limin Wang, Yu Qiao
Technical Report, 2022
paper / code
1st Place Solutions for RxR-Habitat Vision-and-Language Navigation Competition (CVPR 2022)
Dong An*, Zun Wang*, Yangguang Li, Yi Wang, Yicong Hong, Yan Huang, Liang Wang, Jing Shao
Technical Report, 2022
paper
REVERIE Challenge @ CSIG 2022
Our team BPT (Zun Wang, Yi Wang, Yinan He, Yu Qiao) won both channels of the challenge (out of 50+ teams).
report /
certificate (channel 1) /
certificate (channel 2) /
leaderboard
RxR-Habitat Competition @ CVPR 2022
Our team Joyboy (Dong An*, Zun Wang*, Yangguang Li, Yi Wang, Yicong Hong, Yan Huang, Liang Wang, Jing Shao) won the competition. Our solution improved SoTA performance from 37% to 55%.
report /
certificate /
leaderboard
Collaborators
I work closely with my friend Dr. Yicong Hong and deeply value our discussions. I also exchange many ideas about VLN and collaborate with my friend Dong An.