Yousong Zhu

I am now an Associate Professor at the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA).

I am interested in computer vision and machine learning, especially object detection & recognition, visual self-supervised learning, visual-language models, network architecture design, etc.

I am looking for interns to work on vision foundation models, object detection, and visual-language learning. If you are interested, please send me an email with your CV.

Email  /  Google Scholar  /  DBLP  /  Chinese homepage

News!
  • [06/2024] Our paper EMAE on self-supervised learning was accepted by TPAMI2024.
  • [03/2024] We are excited to announce the arrival of Griffon v2.
  • [11/2023] We released Griffon, a unified localization foundation model built on an LVLM. Check our code.
  • [11/2022] One paper on Autoregressive Image Modeling was accepted by AAAI2023.
  • [09/2022] The paper Obj2Seq was accepted by NeurIPS2022 as Spotlight! Check our code.
  • [07/2022] The paper PASS on Person Re-identification was accepted by ECCV2022. The code has been released!
  • [03/2022] Two papers (UniVIP and C2AM loss for long-tail detection) were accepted by CVPR2022.
  • [09/2021] We explored Masked Image Modeling for visual representation learning. The paper MST was accepted by NeurIPS2021.
  • [07/2021] Our paper DPT on Deformable Transformer was accepted by ACM MM2021 as Oral. Check our code.
  • [03/2021] The paper ACSL was accepted by CVPR2021. The code has been released!
  • [07/2020] The paper on Large Batch Training for object detection was accepted by ECCV2020. It trains a Faster R-CNN on COCO within 12 minutes.
  • [03/2020] The DSRL paper was accepted by CVPR2020 as Oral. Congrats to all collaborators!

Education
  • Ph.D., Institute of Automation, Chinese Academy of Sciences, 2019.
  • B.S., School of Information Science and Engineering, Central South University, 2014.

Work Experience
  • 2022~Now: Associate Professor
    • National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences.
  • 2019~2021: Assistant Professor
    • National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences.

Publications

Efficient Masked Autoencoders with Self-Consistency
Zhaowen Li, Yousong Zhu, Zhiyang Chen, Wei Li, Rui Zhao, Chaoyang Zhao, Ming Tang, Jinqiao Wang
TPAMI2024

Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring
Yufei Zhan, Yousong Zhu, Hongyin Zhao, Fan Yang, Ming Tang, Jinqiao Wang
arXiv, 2024
arXiv / code

Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models
Yufei Zhan, Yousong Zhu, Zhiyang Chen, Fan Yang, Ming Tang, Jinqiao Wang
European Conference on Computer Vision (ECCV), 2024
arXiv / code

Self-Supervised Representation Learning from Arbitrary Scenarios
Zhaowen Li, Yousong Zhu, Zhiyang Chen, Zongxin Gao, Rui Zhao, Chaoyang Zhao, Ming Tang, Jinqiao Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Paper

Exploring Stochastic Autoregressive Image Modeling for Visual Representation
Yu Qi, Fan Yang, Yousong Zhu, Yufei Liu, Liwei Wu, Rui Zhao, Wei Li
Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023
Paper

Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks
Zhiyang Chen, Yousong Zhu, Zhaowen Li, Fan Yang, Wei Li, Haixin Wang, Chaoyang Zhao, Liwei Wu, Rui Zhao, Jinqiao Wang, Ming Tang
Neural Information Processing Systems (NeurIPS), 2022 (Spotlight)
Paper / code

PASS: Part-Aware Self-Supervised Pre-Training for Person Re-Identification
Kuan Zhu, Haiyun Guo, Tianyi Yan, Yousong Zhu, Jinqiao Wang, Ming Tang
European Conference on Computer Vision (ECCV), 2022
arXiv / code

UniVIP: A Unified Framework for Self-Supervised Visual Pre-training
Zhaowen Li, Yousong Zhu, Fan Yang, Wei Li, Chaoyang Zhao, Yingying Chen, Zhiyang Chen, Jiahao Xie, Liwei Wu, Rui Zhao, Ming Tang, Jinqiao Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
Paper

C2AM Loss: Chasing a Better Decision Boundary for Long-Tail Object Detection
Tong Wang, Yousong Zhu, Yingying Chen, Chaoyang Zhao, Bin Yu, Jinqiao Wang, Ming Tang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
Paper

MST: Masked Self-Supervised Transformer for Visual Representation
Zhaowen Li, Zhiyang Chen, Fan Yang, Wei Li, Yousong Zhu, Chaoyang Zhao, Rui Deng, Liwei Wu, Rui Zhao, Ming Tang, Jinqiao Wang
Neural Information Processing Systems (NeurIPS), 2021
Paper

DPT: Deformable Patch-based Transformer for Visual Recognition
Zhiyang Chen, Yousong Zhu, Chaoyang Zhao, Guosheng Hu, Wei Zeng, Jinqiao Wang, Ming Tang
ACM International Conference on Multimedia (ACM MM), 2021 (Oral)
arXiv / code

Adaptive Class Suppression Loss for Long-Tail Object Detection
Tong Wang, Yousong Zhu, Chaoyang Zhao, Wei Zeng, Jinqiao Wang, Ming Tang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
Paper / code

Large Batch Optimization for Object Detection: Training COCO in 12 Minutes
Tong Wang, Yousong Zhu, Chaoyang Zhao, Wei Zeng, Yaowei Wang, Jinqiao Wang, Ming Tang
European Conference on Computer Vision (ECCV), 2020
Paper

Dual Super-Resolution Learning for Semantic Segmentation
Li Wang, Dong Li, Yousong Zhu, Lu Tian, Yi Shan
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020 (Oral)
Paper

Food det: Detecting Foods in Refrigerator with Supervised Transformer Network
Yousong Zhu*, Xu Zhao*, Chaoyang Zhao, Jinqiao Wang, Hanqing Lu (*equal contribution)
Neurocomputing, 2020
Paper

Mask Guided Knowledge Distillation for Single Shot Detector
Yousong Zhu, Chaoyang Zhao, Chenxia Han, Jinqiao Wang, Hanqing Lu
IEEE International Conference on Multimedia & Expo (ICME), 2019
Paper

Attention CoupleNet: Fully Convolutional Attention Coupling Network for Object Detection
Yousong Zhu, Chaoyang Zhao, Haiyun Guo, Jinqiao Wang, Xu Zhao, Hanqing Lu
IEEE Transactions on Image Processing (TIP), 2019
Paper

CoupleNet: Coupling Global Structure with Local Parts for Object Detection
Yousong Zhu, Chaoyang Zhao, Jinqiao Wang, Xu Zhao, Yi Wu, Hanqing Lu
International Conference on Computer Vision (ICCV), 2017
Paper / code

Scale-Adaptive Deconvolutional Regression Network for Pedestrian Detection
Yousong Zhu, Jinqiao Wang, Chaoyang Zhao, Haiyun Guo, Hanqing Lu
Asian Conference on Computer Vision (ACCV), 2016
Paper


Honors & Service
  • 2022 Super AI Leader (SAIL) Award, World AI Conference (WAIC), the conference's highest award.
  • 2022 Gold medal of the 8th China College Students' 'Internet+' Innovation and Entrepreneurship Competition.
  • 2019 Outstanding Graduate of the University of Chinese Academy of Sciences and of Beijing.
  • 2018 AI Challenger Global Autonomous Driving Visual Perception, First Place Prize.
  • Journal Reviewer: T-IP, T-MM, PR, T-CSVT, T-NNLS, Neurocomputing.
  • Conference Reviewer: CVPR23, NeurIPS23, ICCV23, ICLR23, AAAI23, ECCV22, ACM MM22, NeurIPS22, AAAI22, CVPR22, ICCV21, ACM MM21, CVPR21, AAAI21, CVPR20, ECCV20, AAAI20, ICCV19, ACM MM19.

Students
  • Ph.D.:
      Tong Wang (2017-2022; Baidu), worked on object detection. (together with Prof. Ming Tang)
      Zhiyang Chen (2019-2024; Xihu University), worked on vision foundation models. (together with Prof. Ming Tang)
      Zhaowen Li (2019-2024; Huawei Topminds), worked on visual self-supervised learning. (together with Prof. Jinqiao Wang)
      Yufei Zhan (2021- ), works on open-world object detection and recognition. (together with Prof. Jinqiao Wang)
      Fan Yang (2022- ), works on visual prompt learning and language-guided object detection. (together with Prof. Jinqiao Wang)
      Shurong Zheng (2023- ), works on Large Visual-Language Models. (together with Prof. Jinqiao Wang)
  • Master:
      Hongyin Zhao (2020-2023), worked on long-tail object detection. (together with Prof. Jinqiao Wang)


template credit to Jon Barron