Yixiao Ge
- geyixiao831@gmail.com
- Google Scholar
- Github
- Beijing, China
I am currently a senior researcher at Tencent ARC Lab and Tencent AI Lab, where I lead an effort on vision and multimodal foundation models. Previously, I received my Ph.D. from the Multimedia Lab (MMLab) at the Chinese University of Hong Kong, advised by Prof. Hongsheng Li and Prof. Xiaogang Wang. We are actively looking for self-motivated interns to work on related research topics. Please feel free to reach out if you are interested.
News
- [Aug 2023] Glad to release ViT-Lens. Stay tuned for more updates!
- [Aug 2023] Glad to release SEED-Bench, the most comprehensive MLLM benchmark to date.
- [Jul 2023] Glad to release our SEED. Stay tuned for more updates!
- [Jul 2023] Four papers are accepted to ICCV 2023.
- [May 2023] One paper is accepted to KDD 2023.
- [Apr 2023] One paper is accepted to ICML 2023.
- [Feb 2023] Four papers are accepted to CVPR 2023.
- [Jan 2023] One paper is accepted to ICLR 2023.
- [Nov 2022] Two papers are accepted to AAAI 2023.
- [Jul 2022] Three papers are accepted to ECCV 2022.
- [Apr 2022] One paper is accepted to IJCAI 2022 as a long oral presentation.
- [Mar 2022] Two papers are accepted to CVPR 2022 with one Oral presentation.
- [Jan 2022] Three papers are accepted to ICLR 2022.
Publications [Full List]
(* equal contribution, # corresponding author)
Selected Preprints:
- ViT-Lens: Towards Omni-modal Representations
  Advancing omni-modal representation learning with modality lens.
  Weixian Lei, Yixiao Ge#, Jianfeng Zhang, Dylan Sun, Kun Yi, Ying Shan, Mike Zheng Shou#
- SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension
  Consists of 19K multiple-choice questions with accurate human annotations, spans 12 evaluation dimensions in terms of both spatial and temporal comprehension.
  Bohao Li*, Rui Wang*, Guangzhi Wang*, Yuying Ge#, Yixiao Ge#, Ying Shan
- Planting a SEED of Vision in Large Language Model
  Empowers Large Language Models (LLMs) with the emergent ability to see and draw.
  Yuying Ge*, Yixiao Ge*#, Ziyun Zeng, Xintao Wang, Ying Shan
- TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter
  Enabling new ViTs plugged into the framework (e.g., BLIP-2) with other modules untouched and a performance boost.
  Binjie Zhang, Yixiao Ge#, Xuyuan Xu, Ying Shan, Mike Zheng Shou#
- What Makes for Good Visual Tokenizers for Large Language Models?
  Rather than simply applying CLIP models, we systematically investigate proper pre-training methods to build good visual tokenizers, making LLMs powerful multimodal LLMs.
  Guangzhi Wang, Yixiao Ge#, Xiaohan Ding, Mohan Kankanhalli, Ying Shan
- TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale
  Producing general-purpose video features that work out of the box. We surpass InternVideo and ImageBind on zero-shot and linear tasks.
  Ziyun Zeng, Yixiao Ge#, Zhan Tong, Xihui Liu, Shu-Tao Xia, Ying Shan
- GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction
  For the first time, we enable Vicuna-13B to use visual models via self-instruct tuning. The system can be deployed on local machines without APIs.
  Lin Song, Yanwei Li, Rui Yang, Sijie Zhao, Yixiao Ge, Ying Shan
- Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
  Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Weixian Lei, Yuchao Gu, Yufei Shi, Wynne Hsu, Ying Shan, Xiaohu Qie, Mike Zheng Shou
- Exploring Model Transferability through the Lens of Potential Energy
  Xiaotong Li, Zixuan Hu, Yixiao Ge, Ying Shan, Lingyu Duan
- BoxSnake: Polygonal Instance Segmentation with Box Supervision
  Rui Yang, Lin Song, Yixiao Ge, Xiu Li
- Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
  Yuxin Fang*, Shusheng Yang*, Shijie Wang*, Yixiao Ge, Ying Shan, Xinggang Wang
- Binary Embedding-based Retrieval at Tencent
  Yukang Gan*, Yixiao Ge*, Chang Zhou*, Shupeng Su, Zhouchuan Xu, Xuyuan Xu, Quanchao Hui, Xiang Chen, Yexin Wang, Ying Shan
- π-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation
  Chengyue Wu, Teng Wang, Yixiao Ge#, Zeyu Lu, Ruisong Zhou, Ying Shan, Ping Luo
- Accelerating Vision-Language Pretraining with Free Language Modeling
  Teng Wang, Yixiao Ge, Feng Zheng, Ran Cheng, Ying Shan, Xiaohu Qie, Ping Luo
- Masked Visual Reconstruction in Language Semantic Space
  Shusheng Yang, Yixiao Ge#, Kun Yi, Dian Li, Ying Shan, Xiaohu Qie, Xinggang Wang#
- Learning Transferable Spatiotemporal Representations from Natural Script Knowledge
  Ziyun Zeng*, Yuying Ge*, Xihui Liu, Bin Chen#, Ping Luo, Shu-Tao Xia, Yixiao Ge#
- All in One: Exploring Unified Video-Language Pre-training
  Alex Jinpeng Wang, Yixiao Ge, Rui Yan, Yuying Ge, Xudong Lin, Guanyu Cai, Jianping Wu, Ying Shan, Xiaohu Qie, Mike Zheng Shou
- Masked Image Modeling with Denoising Contrast
  Kun Yi*, Yixiao Ge*#, Xiaotong Li, Shusheng Yang, Dian Li, Jianping Wu, Ying Shan, Xiaohu Qie
- Darwinian Model Upgrades: Model Evolving with Selective Compatibility
  Binjie Zhang*, Shupeng Su*, Yixiao Ge#, Xuyuan Xu, Yexin Wang, Chun Yuan, Mike Zheng Shou, Ying Shan
  AAAI, 2023 [Paper]
- Video-Text Pre-training with Learned Regions
  Rui Yan, Mike Zheng Shou, Yixiao Ge, Alex Jinpeng Wang, Xudong Lin, Guanyu Cai, Jinhui Tang
- MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval
  Yuying Ge, Yixiao Ge, Xihui Liu, Jinpeng Wang, Jianping Wu, Ying Shan, Xiaohu Qie, Ping Luo
- Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space
  Wenqi Shao#, Xun Zhao, Yixiao Ge#, Zhaoyang Zhang, Lei Yang, Xiaogang Wang, Ying Shan, Ping Luo
- mc-BEiT: Multi-choice Discretization for Image BERT Pre-training
  Xiaotong Li, Yixiao Ge, Kun Yi, Zixuan Hu, Ying Shan, Lingyu Duan
- Towards Universal Backward-Compatible Representation Learning
  Binjie Zhang, Yixiao Ge#, Yantao Shen, Shupeng Su, Fanzi Wu, Chun Yuan#, Xuyuan Xu, Yexin Wang, Ying Shan
- Bridging Video-text Retrieval with Multiple Choice Questions
  Yuying Ge, Yixiao Ge, Xihui Liu, Dian Li, Ying Shan, Xiaohu Qie, Ping Luo
- Object-aware Video-language Pre-training for Retrieval
  Alex Jinpeng Wang, Yixiao Ge, Guanyu Cai, Rui Yan, Xudong Lin, Ying Shan, Xiaohu Qie, Mike Zheng Shou
- Hot-Refresh Model Upgrades with Regression-Alleviating Compatible Training in Image Retrieval
  Binjie Zhang, Yixiao Ge#, Yantao Shen, Yu Li, Chun Yuan#, Xuyuan Xu, Yexin Wang, Ying Shan
- Dynamic Token Normalization Improves Vision Transformer
  Wenqi Shao, Yixiao Ge, Zhaoyang Zhang, Xuyuan Xu, Xiaogang Wang, Ying Shan, Ping Luo
- Uncertainty Modeling for Out-of-Distribution Generalization
  Xiaotong Li, Yongxing Dai, Yixiao Ge, Jun Liu, Ying Shan, Lingyu Duan
- Structured Domain Adaptation with Online Relation Regularization for Unsupervised Person Re-ID
  Yixiao Ge, Feng Zhu, Dapeng Chen, Rui Zhao, Xiaogang Wang, Hongsheng Li
- Progressive Correspondence Pruning by Consensus Learning
  Chen Zhao*, Yixiao Ge*, Feng Zhu, Rui Zhao, Hongsheng Li, Mathieu Salzmann
- Online Pseudo Label Generation by Hierarchical Cluster Dynamics for Adaptive Person Re-identification
  Yi Zheng, Shixiang Tang, Guolong Teng, Yixiao Ge, Kaijian Liu, Donglian Qi, Jing Qin, Dapeng Chen
  ICCV, 2021 [Paper]
- Refining Pseudo Labels with Clustering Consensus over Generations for Unsupervised Object Re-identification
  Xiao Zhang*, Yixiao Ge*, Yu Qiao, Hongsheng Li
  CVPR, 2021 [Paper]
- DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network
  Rui Liu, Yixiao Ge, Ching Lam Choi, Xiaogang Wang, Hongsheng Li
- Mutual CRF-GNN Network for Few-shot Learning
  Shixiang Tang, Dapeng Chen, Lei Bai, Kaijian Liu, Yixiao Ge, Wanli Ouyang
  CVPR, 2021 [Paper]
- Self-paced Contrastive Learning with Hybrid Memory for Domain Adaptive Object Re-ID
  Yixiao Ge, Feng Zhu, Dapeng Chen, Rui Zhao, Hongsheng Li
- Self-supervising Fine-grained Region Similarities for Large-scale Image Localization
  Yixiao Ge, Haibo Wang, Feng Zhu, Rui Zhao, Hongsheng Li
- Mutual Mean-Teaching: Pseudo Label Refinery for Unsupervised Domain Adaptation on Person Re-identification
  Yixiao Ge, Dapeng Chen, Hongsheng Li
- FD-GAN: Pose-guided Feature Distilling GAN for Robust Person Re-identification
  Yixiao Ge*, Zhuowan Li*, Haiyu Zhao, Guojun Yin, Shuai Yi, Xiaogang Wang, Hongsheng Li