Haoyu Cao 0001

Person information

affiliation: Tencent YouTu Lab, Hefei, China

2020 – today

2025
[i17]
persistent URL: https://dblp.org/rec/journals/corr/abs-2501-01957
Chaoyou Fu, Haojia Lin, Xiong Wang, Yifan Zhang, Yunhang Shen, Xiaoyu Liu, Haoyu Cao, Zuwei Long, Heting Gao, Ke Li, Long Ma, Xiawu Zheng, Rongrong Ji, Xing Sun, Caifeng Shan, Ran He:
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction. CoRR abs/2501.01957 (2025)
[i16]
persistent URL: https://dblp.org/rec/journals/corr/abs-2502-05177
Yunhang Shen, Chaoyou Fu, Shaoqi Dong, Xiong Wang, Yifan Zhang, Peixian Chen, Mengdan Zhang, Haoyu Cao, Ke Li, Xiawu Zheng, Yan Zhang, Yiyi Zhou, Ran He, Caifeng Shan, Rongrong Ji, Xing Sun:
Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy. CoRR abs/2502.05177 (2025)
[i15]
persistent URL: https://dblp.org/rec/journals/corr/abs-2505-03739
Zuwei Long, Yunhang Shen, Chaoyou Fu, Heting Gao, Lijiang Li, Peixian Chen, Mengdan Zhang, Hang Shao, Jian Li, Jinlong Peng, Haoyu Cao, Ke Li, Rongrong Ji, Xing Sun:
VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model. CoRR abs/2505.03739 (2025)
[i14]
persistent URL: https://dblp.org/rec/journals/corr/abs-2505-20777
Zhehan Kan, Yanlin Liu, Kun Yin, Xinghua Jiang, Xin Li, Haoyu Cao, Yinsong Liu, Deqiang Jiang, Xing Sun, Qingmin Liao, Wenming Yang:
TACO: Think-Answer Consistency for Optimized Long-Chain Reasoning and Efficient Data Learning via Reinforcement Learning in LVLMs. CoRR abs/2505.20777 (2025)
[i13]
persistent URL: https://dblp.org/rec/journals/corr/abs-2507-05805
Xin Li, Mingming Gong, Yunfei Wu, Jianxin Dai, Antai Guo, Xinghua Jiang, Haoyu Cao, Yinsong Liu, Deqiang Jiang, Xing Sun:
DREAM: Document Reconstruction via End-to-end Autoregressive Model. CoRR abs/2507.05805 (2025)
[i12]
persistent URL: https://dblp.org/rec/journals/corr/abs-2510-09607
Shaoqi Dong, Chaoyou Fu, Haihan Gao, Yifan Zhang, Chi Yan, Chu Wu, Xiaoyu Liu, Yunhang Shen, Jing Huo, Deqiang Jiang, Haoyu Cao, Yang Gao, Xing Sun, Ran He, Caifeng Shan:
VITA-VLA: Efficiently Teaching Vision-Language Models to Act via Action Expert Distillation. CoRR abs/2510.09607 (2025)
[i11]
persistent URL: https://dblp.org/rec/journals/corr/abs-2510-16448
YongXiang Hua, Haoyu Cao, Zhou Tao, Bocheng Li, Zihao Wu, Chaohu Liu, Linli Xu:
Input Domain Aware MoE: Decoupling Routing Decisions from Task Optimization in Mixture of Experts. CoRR abs/2510.16448 (2025)
[i10]
persistent URL: https://dblp.org/rec/journals/corr/abs-2510-21817
Xiaoyu Liu, Chaoyou Fu, Chi Yan, Chu Wu, Haihan Gao, Yifan Zhang, Shaoqi Dong, Chen Qian, Bin Luo, Xiuyong Yang, Guanwu Li, Yusheng Cai, Yunhang Shen, Deqiang Jiang, Haoyu Cao, Xing Sun, Caifeng Shan, Ran He:
VITA-E: Natural Embodied Interaction with Concurrent Seeing, Hearing, Speaking, and Acting. CoRR abs/2510.21817 (2025)
2024
[j2]
persistent URL: https://dblp.org/rec/journals/ml/ZhangZCBCJX24
Mao Zhang, Tie Zhang, Yifei Cheng, Changcun Bao, Haoyu Cao, Deqiang Jiang, Linli Xu:
Communication-efficient clustered federated learning via model distance. Mach. Learn. 113(6): 3869-3888 (2024)
[j1]
persistent URL: https://dblp.org/rec/journals/pami/YuLZCSB24
Wenwen Yu, Yuliang Liu, Xingkui Zhu, Haoyu Cao, Xing Sun, Xiang Bai:
Turning a CLIP Model Into a Scene Text Spotter. IEEE Trans. Pattern Anal. Mach. Intell. 46(9): 6040-6054 (2024)
[c10]
persistent URL: https://dblp.org/rec/conf/acl/YanZZLCJX24
Haoqiu Yan, Yongxin Zhu, Kai Zheng, Bing Liu, Haoyu Cao, Deqiang Jiang, Linli Xu:
Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction. ACL (1) 2024: 15009-15022
[c9]
persistent URL: https://dblp.org/rec/conf/coling/LiGZYCJX24
Bocheng Li, Zhujin Gao, Yongxin Zhu, Kun Yin, Haoyu Cao, Deqiang Jiang, Linli Xu:
Few-shot Temporal Pruning Accelerates Diffusion Models for Text Generation. LREC/COLING 2024: 7259-7269
[c8]
persistent URL: https://dblp.org/rec/conf/cvpr/LiuYCJ0LJSX24
Chaohu Liu, Kun Yin, Haoyu Cao, Xinghua Jiang, Xin Li, Yinsong Liu, Deqiang Jiang, Xing Sun, Linli Xu:
HRVDA: High-Resolution Visual Document Assistant. CVPR 2024: 15534-15545
[c7]
persistent URL: https://dblp.org/rec/conf/cvpr/LiWJGGCLJS24
Xin Li, Yunfei Wu, Xinghua Jiang, Zhihao Guo, Mingming Gong, Haoyu Cao, Yinsong Liu, Deqiang Jiang, Xing Sun:
Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models. CVPR 2024: 15546-15555
[c6]
persistent URL: https://dblp.org/rec/conf/mm/WangLQCJX24
Yubo Wang, Chaohu Liu, Yanqiu Qu, Haoyu Cao, Deqiang Jiang, Linli Xu:
Break the Visual Perception: Adversarial Attacks Targeting Encoded Visual Tokens of Large Vision-Language Models. ACM Multimedia 2024: 1072-1081
[i9]
persistent URL: https://dblp.org/rec/journals/corr/abs-2402-19014
Xin Li, Yunfei Wu, Xinghua Jiang, Zhihao Guo, Mingming Gong, Haoyu Cao, Yinsong Liu, Deqiang Jiang, Xing Sun:
Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models. CoRR abs/2402.19014 (2024)
[i8]
persistent URL: https://dblp.org/rec/journals/corr/abs-2404-06918
Chaohu Liu, Kun Yin, Haoyu Cao, Xinghua Jiang, Xin Li, Yinsong Liu, Deqiang Jiang, Xing Sun, Linli Xu:
HRVDA: High-Resolution Visual Document Assistant. CoRR abs/2404.06918 (2024)
[i7]
persistent URL: https://dblp.org/rec/journals/corr/abs-2406-12707
Haoqiu Yan, Yongxin Zhu, Kai Zheng, Bing Liu, Haoyu Cao, Deqiang Jiang, Linli Xu:
Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction. CoRR abs/2406.12707 (2024)
[i6]
persistent URL: https://dblp.org/rec/journals/corr/abs-2410-06699
Yubo Wang, Chaohu Liu, Yanqiu Qu, Haoyu Cao, Deqiang Jiang, Linli Xu:
Break the Visual Perception: Adversarial Attacks Targeting Encoded Visual Tokens of Large Vision-Language Models. CoRR abs/2410.06699 (2024)
2023
[c5]
persistent URL: https://dblp.org/rec/conf/iccv/CaoBLCYLLJS23
Haoyu Cao, Changcun Bao, Chaohu Liu, Huang Chen, Kun Yin, Hao Liu, Yinsong Liu, Deqiang Jiang, Xing Sun:
Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration. ICCV 2023: 19460-19470
[c4]
persistent URL: https://dblp.org/rec/conf/icdar/YuZCHLCLCKCDFHLYYLCDLLYZKSWB23
Wenwen Yu, Chengquan Zhang, Haoyu Cao, Wei Hua, Bohan Li, Huang Chen, Mingyu Liu, Mingrui Chen, Jianfeng Kuang, Mengjun Cheng, Yuning Du, Shikun Feng, Xiaoguang Hu, Pengyuan Lyu, Kun Yao, Yuechen Yu, Yuliang Liu, Wanxiang Che, Errui Ding, Cheng-Lin Liu, Jiebo Luo, Shuicheng Yan, Min Zhang, Dimosthenis Karatzas, Xing Sun, Jingdong Wang, Xiang Bai:
ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images. ICDAR (2) 2023: 536-552
[i5]
persistent URL: https://dblp.org/rec/journals/corr/abs-2306-03287
Wenwen Yu, Chengquan Zhang, Haoyu Cao, Wei Hua, Bohan Li, Huang Chen, Mingyu Liu, Mingrui Chen, Jianfeng Kuang, Mengjun Cheng, Yuning Du, Shikun Feng, Xiaoguang Hu, Pengyuan Lyu, Kun Yao, Yuechen Yu, Yuliang Liu, Wanxiang Che, Errui Ding, Cheng-Lin Liu, Jiebo Luo, Shuicheng Yan, Min Zhang, Dimosthenis Karatzas, Xing Sun, Jingdong Wang, Xiang Bai:
ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images. CoRR abs/2306.03287 (2023)
[i4]
persistent URL: https://dblp.org/rec/journals/corr/abs-2308-10408
Wenwen Yu, Yuliang Liu, Xingkui Zhu, Haoyu Cao, Xing Sun, Xiang Bai:
Turning a CLIP Model into a Scene Text Spotter. CoRR abs/2308.10408 (2023)
[i3]
persistent URL: https://dblp.org/rec/journals/corr/abs-2309-01131
Haoyu Cao, Changcun Bao, Chaohu Liu, Huang Chen, Kun Yin, Hao Liu, Yinsong Liu, Deqiang Jiang, Xing Sun:
Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration. CoRR abs/2309.01131 (2023)
2022
[c3]
persistent URL: https://dblp.org/rec/conf/mm/CaoLMJGH0L022
Haoyu Cao, Xin Li, Jiefeng Ma, Deqiang Jiang, Antai Guo, Yiqing Hu, Hao Liu, Yinsong Liu, Bo Ren:
Query-driven Generative Network for Document Information Extraction in the Wild. ACM Multimedia 2022: 4261-4271
[c2]
persistent URL: https://dblp.org/rec/conf/mm/LiZHCWJLR22
Xin Li, Yan Zheng, Yiqing Hu, Haoyu Cao, Yunfei Wu, Deqiang Jiang, Yinsong Liu, Bo Ren:
Relational Representation Learning in Visually-Rich Documents. ACM Multimedia 2022: 4614-4624
[c1]
persistent URL: https://dblp.org/rec/conf/naacl/CaoMGHLJLR22
Haoyu Cao, Jiefeng Ma, Antai Guo, Yiqing Hu, Hao Liu, Deqiang Jiang, Yinsong Liu, Bo Ren:
GMN: Generative Multi-modal Network for Practical Document Information Extraction. NAACL-HLT 2022: 3768-3778
[i2]
persistent URL: https://dblp.org/rec/journals/corr/abs-2205-02411
Xin Li, Yan Zheng, Yiqing Hu, Haoyu Cao, Yunfei Wu, Deqiang Jiang, Yinsong Liu, Bo Ren:
Relational Representation Learning in Visually-Rich Documents. CoRR abs/2205.02411 (2022)
[i1]
persistent URL: https://dblp.org/rec/journals/corr/abs-2207-04713
Haoyu Cao, Jiefeng Ma, Antai Guo, Yiqing Hu, Hao Liu, Deqiang Jiang, Yinsong Liu, Bo Ren:
GMN: Generative Multi-modal Network for Practical Document Information Extraction. CoRR abs/2207.04713 (2022)
