


Wei Xue 0002
Person information
- affiliation: Hong Kong Baptist University, Division of Emerging Interdisciplinary Areas, Hong Kong
- affiliation: Imperial College London, Department of Electrical and Electronic Engineering, UK
- affiliation (PhD 2015): Chinese Academy of Sciences (CAS), Pattern Recognition and Intelligent Systems from the Institute of Automation, Beijing, China
Other persons with the same name
- Wei Xue (disambiguation page)
- Wei Xue 0001: Jiangnan University, Institute of Automation, Key Laboratory of Advanced Process Control for Light Industry, Wuxi, China
- Wei Xue 0003: Tsinghua University, Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Beijing, China
- (and 2 more)
2020 – today
- 2026
[j7]Xingqun Qi, Hengyuan Zhang, Yatian Wang, Jiahao Pan, Chen Liu, Muyi Sun, Wei Xue, Shanghang Zhang, Sirui Han, Qifeng Liu, Yike Guo:
CoCoGesture: Towards coherent co-speech 3D gesture generation in the wild. Inf. Fusion 126: 103613 (2026)
- 2025
[j6]Junkun Jiang, Jie Chen, Ho Yin Au, Mingyuan Chen, Wei Xue, Yike Guo:
Every Angle is Worth a Second Glance: Mining Kinematic Skeletal Structures From Multi-View Joint Cloud. IEEE Trans. Vis. Comput. Graph. 31(10): 7337-7349 (2025)
[c49]Chunyang Jiang, Chi-Min Chan, Wei Xue, Qifeng Liu, Yike Guo:
Importance Weighting Can Help Large Language Models Self-Improve. AAAI 2025: 24257-24265
[c48]Zhen Ye, Peiwen Sun, Jiahe Lei, Hongzhan Lin, Xu Tan, Zheqi Dai, Qiuqiang Kong, Jianyi Chen, Jiahao Pan, Qifeng Liu, Yike Guo, Wei Xue:
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model. AAAI 2025: 25697-25705
[c47]Wei Li, Lujun Li, Mark G. Lee, Shengjie Sun, Lei Zhang, Wei Xue, Yike Guo:
BayesKD: Bayesian Knowledge Distillation for Compact LLMs in Constrained Fine-tuning Scenarios. ACL (Findings) 2025: 138-152
[c46]Chi-Min Chan, Chunpu Xu, Junqi Zhu, Jiaming Ji, Donghai Hong, Pengcheng Wen, Chunyang Jiang, Zhen Ye, Yaodong Yang, Wei Xue, Sirui Han, Yike Guo:
Boosting Policy and Process Reward Models with Monte Carlo Tree Search in Open-Domain QA. ACL (Findings) 2025: 7433-7451
[c45]Huadai Liu, Jialei Wang, Rongjie Huang, Yang Liu, Heng Lu, Zhou Zhao, Wei Xue:
FlashAudio: Rectified Flow for Fast and High-Fidelity Text-to-Audio Generation. ACL (1) 2025: 13694-13710
[c44]Peng Li, Wangguandong Zheng, Yuan Liu, Tao Yu, Yangguang Li, Xingqun Qi, Xiaowei Chi, Siyu Xia, Yan-Pei Cao, Wei Xue, Wenhan Luo, Yike Guo:
PSHuman: Photorealistic Single-image 3D Human Reconstruction using Cross-Scale Multiview Diffusion and Explicit Remeshing. CVPR 2025: 16008-16018
[c43]Zeyue Tian, Zhaoyang Liu, Ruibin Yuan, Jiahao Pan, Qifeng Liu, Xu Tan, Qifeng Chen, Wei Xue, Yike Guo:
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling. CVPR 2025: 18782-18793
[c42]Peijie Dong, Lujun Li, Yuedong Zhong, Dayou Du, Ruibo Fan, Yuhan Chen, Zhenheng Tang, Qiang Wang, Wei Xue, Yike Guo, Xiaowen Chu:
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs. ICLR 2025
[c41]Xingqun Qi, Yatian Wang, Hengyuan Zhang, Jiahao Pan, Wei Xue, Shanghang Zhang, Wenhan Luo, Qifeng Liu, Yike Guo:
Co3Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion. ICLR 2025
[c40]Peiwen Sun, Sitong Cheng, Xiangtai Li, Zhen Ye, Huadai Liu, Honggang Zhang, Wei Xue, Yike Guo:
Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation. ICLR 2025
[c39]Xiaowei Chi, Chun-Kai Fan, Hengyuan Zhang, Xingqun Qi, Rongyu Zhang, Anthony Chen, Chi-Min Chan, Wei Xue, Qifeng Liu, Shanghang Zhang, Yike Guo:
Empowering World Models with Reflection for Embodied Video Prediction. ICML 2025
[c38]Hao Gu, Wei Li, Lujun Li, Qiyuan Zhu, Mark G. Lee, Shengjie Sun, Wei Xue, Yike Guo:
Delta Decompression for MoE-based LLMs Compression. ICML 2025
[c37]Wei Li, Lujun Li, Hao Gu, You-Liang Huang, Mark G. Lee, Shengjie Sun, Wei Xue, Yike Guo:
MoE-SVD: Structured Mixture-of-Experts LLMs Compression via Singular Value Decomposition. ICML 2025
[c36]Huadai Liu, Tianyi Luo, Kaicheng Luo, Qikai Jiang, Peiwen Sun, Jialei Wang, Rongjie Huang, Qian Chen, Wen Wang, Xiangtai Li, Shiliang Zhang, Zhijie Yan, Zhou Zhao, Wei Xue:
OmniAudio: Generating Spatial Audio from 360-Degree Video. ICML 2025
[i62]Junkun Jiang, Jie Chen, Ho Yin Au, Mingyuan Chen, Wei Xue, Yike Guo:
Every Angle Is Worth A Second Glance: Mining Kinematic Skeletal Structures from Multi-view Joint Cloud. CoRR abs/2502.02936 (2025)
[i61]Zhen Ye, Xinfa Zhu, Chi-Min Chan, Xinsheng Wang, Xu Tan, Jiahe Lei, Yi Peng, Haohe Liu, Yizhu Jin, Zheqi Dai, Hongzhan Lin, Jianyi Chen, Xingjian Du, Liumeng Xue, Yunlin Chen, Zhifei Li, Lei Xie, Qiuqiang Kong, Yike Guo, Wei Xue:
Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis. CoRR abs/2502.04128 (2025)
[i60]Xinyu Liu, Ailing Zeng, Wei Xue, Harry Yang, Wenhan Luo, Qifeng Liu, Yike Guo:
VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer. CoRR abs/2502.05979 (2025)
[i59]Liumeng Xue, Ziya Zhou, Jiahao Pan, Zixuan Li, Shuai Fan, Yinghao Ma, Sitong Cheng, Dongchao Yang, Haohan Guo, Yujia Xiao, Xinsheng Wang, Zixuan Shen, Chuanbo Zhu, Xinshen Zhang, Tianchi Liu, Ruibin Yuan, Zeyue Tian, Haohe Liu, Emmanouil Benetos, Ge Zhang, Yike Guo, Wei Xue:
Audio-FLAN: A Preliminary Release. CoRR abs/2502.16584 (2025)
[i58]Hao Gu, Wei Li, Lujun Li, Qiyuan Zhu, Mark G. Lee, Shengjie Sun, Wei Xue, Yike Guo:
Delta Decompression for MoE-based LLMs Compression. CoRR abs/2502.17298 (2025)
[i57]Boyi Kang, Xinfa Zhu, Zihan Zhang, Zhen Ye, Mingshuai Liu, Ziqian Wang, Yike Zhu, Guobin Ma, Jun Chen, Longshuai Xiao, Chao Weng, Wei Xue, Lei Xie:
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement. CoRR abs/2503.00493 (2025)
[i56]Xinsheng Wang, Mingqi Jiang, Ziyang Ma, Ziyu Zhang, Songxiang Liu, Linqin Li, Zheng Liang, Qixi Zheng, Rui Wang, Xiaoqin Feng, Weizhen Bian, Zhen Ye, Sitong Cheng, Ruibin Yuan, Zhixian Zhao, Xinfa Zhu, Jiahao Pan, Liumeng Xue, Pengcheng Zhu, Yunlin Chen, Zhifei Li, Xie Chen, Lei Xie, Yike Guo, Wei Xue:
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens. CoRR abs/2503.01710 (2025)
[i55]Ruibin Yuan, Hanfeng Lin, Shuyue Guo, Ge Zhang, Jiahao Pan, Yongyi Zang, Haohe Liu, Yiming Liang, Wenye Ma, Xingjian Du, Xinrun Du, Zhen Ye, Tianyu Zheng, Yinghao Ma, Minghao Liu, Zeyue Tian, Ziya Zhou, Liumeng Xue, Xingwei Qu, Yizhi Li, Shangda Wu, Tianhao Shen, Ziyang Ma, Jun Zhan, Chunhui Wang, Yatian Wang, Xiaowei Chi, Xinyue Zhang, Zhenzhu Yang, Xiangzhou Wang, Shansong Liu, Lingrui Mei, Peng Li, Junjie Wang, Jianwei Yu, Guojian Pang, Xu Li, Zihao Wang, Xiaohuan Zhou, Lijun Yu, Emmanouil Benetos, Yong Chen, Chenghua Lin, Xie Chen, Gus Xia, Zhaoxiang Zhang, Chao Zhang, Wenhu Chen, Xinyu Zhou, Xipeng Qiu, Roger B. Dannenberg, Zheng-Jia Liu, Jian Yang, Wenhao Huang, Wei Xue, Xu Tan, Yike Guo:
YuE: Scaling Open Foundation Models for Long-Form Music Generation. CoRR abs/2503.08638 (2025)
[i54]Zeyue Tian, Yizhu Jin, Zhaoyang Liu, Ruibin Yuan, Xu Tan, Qifeng Chen, Wei Xue, Yike Guo:
AudioX: Diffusion Transformer for Anything-to-Audio Generation. CoRR abs/2503.10522 (2025)
[i53]Huadai Liu, Tianyi Luo, Qikai Jiang, Kaicheng Luo, Peiwen Sun, Jialei Wang, Rongjie Huang, Qian Chen, Wen Wang, Xiangtai Li, Shiliang Zhang, Zhijie Yan, Zhou Zhao, Wei Xue:
OmniAudio: Generating Spatial Audio from 360-Degree Video. CoRR abs/2504.14906 (2025)
[i52]Xingqun Qi, Yatian Wang, Hengyuan Zhang, Jiahao Pan, Wei Xue, Shanghang Zhang, Wenhan Luo, Qifeng Liu, Yike Guo:
Co3Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion. CoRR abs/2505.01746 (2025)
[i51]Peng Li, Suizhi Ma, Jialiang Chen, Yuan Liu, Chongyi Zhang, Wei Xue, Wenhan Luo, Alla Sheffer, Wenping Wang, Yike Guo:
CMD: Controllable Multiview Diffusion for 3D Editing and Progressive Generation. CoRR abs/2505.07003 (2025)
[i50]Chi-Min Chan, Chunpu Xu, Jiaming Ji, Zhen Ye, Pengcheng Wen, Chunyang Jiang, Yaodong Yang, Wei Xue, Sirui Han, Yike Guo:
J1: Exploring Simple Test-Time Scaling for LLM-as-a-Judge. CoRR abs/2505.11875 (2025)
[i49]Ziyang Ma, Yinghao Ma, Yanqiao Zhu, Chen Yang, Yi-Wen Chao, Ruiyang Xu, Wenxi Chen, Yuanzhe Chen, Zhuo Chen, Jian Cong, Kai Li, Keliang Li, Siyou Li, Xinfeng Li, Xiquan Li, Zheng Lian, Yuzhe Liang, Minghao Liu, Zhikang Niu, Tianrui Wang, Yuping Wang, Yuxuan Wang, Yihao Wu, Guanrou Yang, Jianwei Yu, Ruibin Yuan, Zhisheng Zheng, Ziya Zhou, Haina Zhu, Wei Xue, Emmanouil Benetos, Kai Yu, Chng Eng Siong, Xie Chen:
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix. CoRR abs/2505.13032 (2025)
[i48]Chunyang Jiang, Chi-Min Chan, Yiyang Cai, Yulong Liu, Wei Xue, Yike Guo:
Graceful Forgetting in Generative Language Models. CoRR abs/2505.19715 (2025)
[i47]Huadai Liu, Jialei Wang, Kaicheng Luo, Wen Wang, Qian Chen, Zhou Zhao, Wei Xue:
ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing. CoRR abs/2506.21448 (2025)
[i46]Yizhu Jin, Zhen Ye, Zeyue Tian, Haohe Liu, Qiuqiang Kong, Yike Guo, Wei Xue:
Inference-time Scaling for Diffusion-based Audio Super-resolution. CoRR abs/2508.02391 (2025)
[i45]Wenjie Tian, Xinfa Zhu, Hanke Xie, Zhen Ye, Wei Xue, Lei Xie:
Llasa+: Free Lunch for Accelerated and Streaming Llama-Based Speech Synthesis. CoRR abs/2508.06262 (2025)
[i44]Longhao Li, Zhao Guo, Hongjie Chen, Yuhang Dai, Ziyu Zhang, Hongfei Xue, Tianlun Zuo, Chengyou Wang, Shuiyuan Wang, Jie Li, Jian Kang, Xin Xu, Hui Bu, Binbin Zhang, Ruibin Yuan, Ziya Zhou, Wei Xue, Lei Xie:
WenetSpeech-Yue: A Large-scale Cantonese Speech Corpus with Multi-dimensional Annotation. CoRR abs/2509.03959 (2025)
[i43]Sitong Cheng, Weizhen Bian, Xinsheng Wang, Ruibin Yuan, Jianyi Chen, Shunshun Yin, Yike Guo, Wei Xue:
UniSS: Unified Expressive Speech-to-Speech Translation with Your Voice. CoRR abs/2509.21144 (2025)
[i42]Xiaowei Chi, Peidong Jia, Chun-Kai Fan, Xiaozhu Ju, Weishi Mi, Kevin Zhang, Zhiyuan Qin, Wanxin Tian, Kuangzhi Ge, Hao Li, Zezhong Qian, Anthony Chen, Qiang Zhou, Yueru Jia, Jiaming Liu, Yong Dai, Qingpo Wuwu, Chengyu Bai, Yu-Kai Wang, Ying Li, Lizhang Chen, Yong Bao, Zhiyuan Jiang, Jiacheng Zhu, Kai Tang, Ruichuan An, Yulin Luo, Qiuxuan Feng, Siyuan Zhou, Chi-Min Chan, Chengkai Hou, Wei Xue, Sirui Han, Yike Guo, Shanghang Zhang, Jian Tang:
WoW: Towards a World omniscient World model Through Embodied Interaction. CoRR abs/2509.22642 (2025)
[i41]Chunyang Jiang, Yonggang Zhang, Yiyang Cai, Chi-Min Chan, Yulong Liu, Mingming Chen, Wei Xue, Yike Guo:
Semantic Voting: A Self-Evaluation-Free Approach for Efficient LLM Self-Improvement on Unverifiable Open-ended Tasks. CoRR abs/2509.23067 (2025)
[i40]Yike Zhu, Boyi Kang, Ziqian Wang, Xingchen Li, Zihan Zhang, Wenjie Li, Longshuai Xiao, Wei Xue, Lei Xie:
MeanFlowSE: One-Step Generative Speech Enhancement via MeanFlow. CoRR abs/2509.23299 (2025)
[i39]Yujia Xiao, Liumeng Xue, Lei He, Xinyi Chen, Aemon Yat Fei Chiu, Wenjie Tian, Shaofei Zhang, Qiuqiang Kong, Xinfa Zhu, Wei Xue, Tan Lee:
PodEval: A Multimodal Evaluation Framework for Podcast Audio Generation. CoRR abs/2510.00485 (2025)
- 2024
[j5]Xinyuan Qian, Wei Xue, Qiquan Zhang, Ruijie Tao, Haizhou Li:
Deep Cross-Modal Retrieval Between Spatial Image and Acoustic Speech. IEEE Trans. Multim. 26: 4480-4489 (2024)
[c35]Dongmei Zhang, Chang Li, Renrui Zhang, Shenghao Xie, Wei Xue, Xiaodong Xie, Shanghang Zhang:
FM-OV3D: Foundation Model-Based Cross-Modal Knowledge Blending for Open-Vocabulary 3D Detection. AAAI 2024: 16723-16731
[c34]Ruibin Yuan, Hanfeng Lin, Yi Wang, Zeyue Tian, Shangda Wu, Tianhao Shen, Ge Zhang, Yuhang Wu, Cong Liu, Ziya Zhou, Liumeng Xue, Ziyang Ma, Qin Liu, Tianyu Zheng, Yizhi Li, Yinghao Ma, Yiming Liang, Xiaowei Chi, Ruibo Liu, Zili Wang, Chenghua Lin, Qifeng Liu, Tao Jiang, Wenhao Huang, Wenhu Chen, Jie Fu, Emmanouil Benetos, Gus Xia, Roger B. Dannenberg, Wei Xue, Shiyin Kang, Yike Guo:
ChatMusician: Understanding and Generating Music Intrinsically with LLM. ACL (Findings) 2024: 6252-6271
[c33]Xingqun Qi, Jiahao Pan, Peng Li, Ruibin Yuan, Xiaowei Chi, Mengfei Li, Wenhan Luo, Wei Xue, Shanghang Zhang, Qifeng Liu, Yike Guo:
Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-Speech Gesture Generation. CVPR 2024: 10424-10434
[c32]Lujun Li, Zimian Wei, Peijie Dong, Wenhan Luo, Wei Xue, Qifeng Liu, Yike Guo:
AttnZero: Efficient Attention Discovery for Vision Transformers. ECCV (5) 2024: 20-37
[c31]Lujun Li, Haosen Sun, Shiwen Li, Peijie Dong, Wenhan Luo, Wei Xue, Qifeng Liu, Yike Guo:
Auto-GAS: Automated Proxy Discovery for Training-Free Generative Architecture Search. ECCV (5) 2024: 38-55
[c30]Jianyi Chen, Zheqi Dai, Zhen Ye, Xu Tan, Qifeng Liu, Yike Guo, Wei Xue:
PyramidCodec: Hierarchical Codec for Long-form Music Generation in Audio Domain. EMNLP (Findings) 2024: 4253-4263
[c29]Chi-Min Chan, Weize Chen, Yusheng Su, Jianxuan Yu, Wei Xue, Shanghang Zhang, Jie Fu, Zhiyuan Liu:
ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate. ICLR 2024
[c28]Jiaming Liu, Senqiao Yang, Peidong Jia, Renrui Zhang, Ming Lu, Yandong Guo, Wei Xue, Shanghang Zhang:
ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation. ICLR 2024
[c27]Lujun Li, Yufan Bao, Peijie Dong, Chuanguang Yang, Anggeng Li, Wenhan Luo, Qifeng Liu, Wei Xue, Yike Guo:
DetKDS: Knowledge Distillation Search for Object Detectors. ICML 2024
[c26]Jianyi Chen, Wei Xue, Xu Tan, Zhen Ye, Qifeng Liu, Yike Guo:
FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation. IJCAI 2024: 7618-7626
[c25]Yiwen Lu, Zhen Ye, Wei Xue, Xu Tan, Qifeng Liu, Yike Guo:
COMOSVC: Consistency Model-Based Singing Voice Conversion. ISCSLP 2024: 184-188
[c24]Ziya Zhou, Yuhang Wu, Zhiyue Wu, Xinyue Zhang, Ruibin Yuan, Yinghao Ma, Lu Wang, Emmanouil Benetos, Wei Xue, Yike Guo:
Can LLMs "Reason" in Music? an Evaluation of LLMs' Capability of Music Understanding and Generation. ISMIR 2024: 103-110
[c23]Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo:
ComposerX: Multi-Agent Symbolic Music Composition With LLMs. ISMIR 2024: 669-679
[c22]Zhen Ye, Zeqian Ju, Haohe Liu, Xu Tan, Jianyi Chen, Yiwen Lu, Peiwen Sun, Jiahao Pan, Weizhen Bian, Shulin He, Wei Xue, Qifeng Liu, Yike Guo:
FlashSpeech: Efficient Zero-Shot Speech Synthesis. ACM Multimedia 2024: 6998-7007
[c21]Lujun Li, Peijie Dong, Zhenheng Tang, Xiang Liu, Qiang Wang, Wenhan Luo, Wei Xue, Qifeng Liu, Xiaowen Chu, Yike Guo:
Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models. NeurIPS 2024
[c20]Peng Li, Yuan Liu, Xiaoxiao Long, Feihu Zhang, Cheng Lin, Mengfei Li, Xingqun Qi, Shanghang Zhang, Wei Xue, Wenhan Luo, Ping Tan, Wenping Wang, Qifeng Liu, Yike Guo:
Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention. NeurIPS 2024
[c19]Min Zeng, Haiqin Yang, Wei Xue, Qifeng Liu, Yike Guo:
Dirichlet Continual Learning: Tackling Catastrophic Forgetting in NLP. UAI 2024: 4096-4108
[i38]Yiwen Lu, Zhen Ye, Wei Xue, Xu Tan, Qifeng Liu, Yike Guo:
CoMoSVC: Consistency Model-based Singing Voice Conversion. CoRR abs/2401.01792 (2024)
[i37]Ruibin Yuan, Hanfeng Lin, Yi Wang, Zeyue Tian, Shangda Wu, Tianhao Shen, Ge Zhang, Yuhang Wu, Cong Liu, Ziya Zhou, Ziyang Ma, Liumeng Xue, Ziyu Wang, Qin Liu, Tianyu Zheng, Yizhi Li, Yinghao Ma, Yiming Liang, Xiaowei Chi, Ruibo Liu, Zili Wang, Pengfei Li, Jingcheng Wu, Chenghua Lin, Qifeng Liu, Tao Jiang, Wenhao Huang, Wenhu Chen, Emmanouil Benetos, Jie Fu, Gus Xia, Roger B. Dannenberg, Wei Xue, Shiyin Kang, Yike Guo:
ChatMusician: Understanding and Generating Music Intrinsically with LLM. CoRR abs/2402.16153 (2024)
[i36]Chi-Min Chan, Chunpu Xu, Ruibin Yuan, Hongyin Luo, Wei Xue, Yike Guo, Jie Fu:
RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation. CoRR abs/2404.00610 (2024)
[i35]Xingwei Qu, Yuelin Bai, Yinghao Ma, Ziya Zhou, Ka Man Lo, Jiaheng Liu, Ruibin Yuan, Lejun Min, Xueling Liu, Tianyu Zhang, Xinrun Du, Shuyue Guo, Yiming Liang, Yizhi Li, Shangda Wu, Junting Zhou, Tianyu Zheng, Ziyang Ma, Fengze Han, Wei Xue, Gus Xia, Emmanouil Benetos, Xiang Yue, Chenghua Lin, Xu Tan, Stephen W. Huang, Wenhu Chen, Jie Fu, Ge Zhang:
MuPT: A Generative Symbolic Music Pretrained Transformer. CoRR abs/2404.06393 (2024)
[i34]Zhen Ye, Zeqian Ju, Haohe Liu, Xu Tan, Jianyi Chen, Yiwen Lu, Peiwen Sun, Jiahao Pan, Weizhen Bian, Shulin He, Qifeng Liu, Yike Guo, Wei Xue:
FlashSpeech: Efficient Zero-Shot Speech Synthesis. CoRR abs/2404.14700 (2024)
[i33]Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo:
ComposerX: Multi-Agent Symbolic Music Composition with LLMs. CoRR abs/2404.18081 (2024)
[i32]Jianyi Chen, Wei Xue, Xu Tan, Zhen Ye, Qifeng Liu, Yike Guo:
FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation. CoRR abs/2405.07682 (2024)
[i31]Yingqing He, Zhaoyang Liu, Jingye Chen, Zeyue Tian, Hongyu Liu, Xiaowei Chi, Runtao Liu, Ruibin Yuan, Yazhou Xing, Wenhai Wang, Jifeng Dai, Yong Zhang, Wei Xue, Qifeng Liu, Yike Guo, Qifeng Chen:
LLMs Meet Multimodal Generation and Editing: A Survey. CoRR abs/2405.19334 (2024)
[i30]Zeyue Tian, Zhaoyang Liu, Ruibin Yuan, Jiahao Pan, Xiaoqiang Huang, Qifeng Liu, Xu Tan, Qifeng Chen, Wei Xue, Yike Guo:
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling. CoRR abs/2406.04321 (2024)
[i29]Mengfei Li, Xiaoxiao Long, Yixun Liang, Weiyu Li, Yuan Liu, Peng Li, Xiaowei Chi, Xingqun Qi, Wei Xue, Wenhan Luo, Qifeng Liu, Yike Guo:
M-LRM: Multi-view Large Reconstruction Model. CoRR abs/2406.07648 (2024)
[i28]Xiaowei Chi, Yatian Wang, Aosong Cheng, Pengjun Fang, Zeyue Tian, Yingqing He, Zhaoyang Liu, Xingqun Qi, Jiahao Pan, Rongyu Zhang, Mengfei Li, Ruibin Yuan, Yanbing Jiang, Wei Xue, Wenhan Luo, Qifeng Chen, Shanghang Zhang, Qifeng Liu, Yike Guo:
MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions. CoRR abs/2407.20962 (2024)
[i27]Ziya Zhou, Yuhang Wu, Zhiyue Wu, Xinyue Zhang, Ruibin Yuan, Yinghao Ma, Lu Wang, Emmanouil Benetos, Wei Xue, Yike Guo:
Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation. CoRR abs/2407.21531 (2024)
[i26]Peijie Dong, Lujun Li, Dayou Du, Yuhan Chen, Zhenheng Tang, Qiang Wang, Wei Xue, Wenhan Luo, Qifeng Liu, Yike Guo, Xiaowen Chu:
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs. CoRR abs/2408.01803 (2024)
[i25]Chunyang Jiang, Chi-Min Chan, Wei Xue, Qifeng Liu, Yike Guo:
Importance Weighting Can Help Large Language Models Self-Improve. CoRR abs/2408.09849 (2024)
[i24]Cheng Lin, Lujun Li, Dezhi Li, Jie Zou, Wei Xue, Yike Guo:
NoRA: Nested Low-Rank Adaptation for Efficient Fine-Tuning Large Models. CoRR abs/2408.10280 (2024)
[i23]Chi-Min Chan, Jianxuan Yu, Weize Chen, Chunyang Jiang, Xinyu Liu, Weijie Shi, Zhiyuan Liu, Wei Xue, Yike Guo:
AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems. CoRR abs/2408.14972 (2024)
[i22]Zhen Ye, Peiwen Sun, Jiahe Lei, Hongzhan Lin, Xu Tan, Zheqi Dai, Qiuqiang Kong, Jianyi Chen, Jiahao Pan, Qifeng Liu, Yike Guo, Wei Xue:
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model. CoRR abs/2408.17175 (2024)
[i21]Xinyu Liu, Yingqing He, Lanqing Guo, Xiang Li, Bu Jin, Peng Li, Yan Li, Chi-Min Chan, Qifeng Chen, Wei Xue, Wenhan Luo, Qifeng Liu, Yike Guo:
HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts. CoRR abs/2409.02919 (2024)
[i20]Peng Li, Wangguandong Zheng, Yuan Liu, Tao Yu, Yangguang Li, Xingqun Qi, Mengfei Li, Xiaowei Chi, Siyu Xia, Wei Xue, Wenhan Luo, Qifeng Liu, Yike Guo:
PSHuman: Photorealistic Single-view Human Reconstruction using Cross-Scale Diffusion. CoRR abs/2409.10141 (2024)
[i19]Tianyu Wu, Lingrui Mei, Ruibin Yuan, Lujun Li, Wei Xue, Yike Guo:
You Know What I'm Saying: Jailbreak Attack via Implicit Reference. CoRR abs/2410.03857 (2024)
[i18]Peiwen Sun, Sitong Cheng, Xiangtai Li, Zhen Ye, Huadai Liu, Honggang Zhang, Wei Xue, Yike Guo:
Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation. CoRR abs/2410.10676 (2024)
[i17]Huadai Liu, Jialei Wang, Rongjie Huang, Yang Liu, Heng Lu, Wei Xue, Zhou Zhao:
FlashAudio: Rectified Flows for Fast and High-Fidelity Text-to-Audio Generation. CoRR abs/2410.12266 (2024)
[i16]Xiaowei Chi, Hengyuan Zhang, Chun-Kai Fan, Xingqun Qi, Rongyu Zhang, Anthony Chen, Chi-Min Chan, Wei Xue, Wenhan Luo, Shanghang Zhang, Yike Guo:
EVA: An Embodied World Model for Future Video Anticipation. CoRR abs/2410.15461 (2024)
[i15]Ziyang Jiang, Xinyuan Qian, Jiahe Lei, Zexu Pan, Wei Xue, Xu-Cheng Yin:
pTSE-T: Presentation Target Speaker Extraction using Unaligned Text Cues. CoRR abs/2411.03109 (2024)
[i14]Yiyang Cai, Zhengkai Jiang, Yulong Liu, Chunyang Jiang, Wei Xue, Wenhan Luo, Yike Guo:
Foundation Cures Personalization: Recovering Facial Personalized Models' Prompt Consistency. CoRR abs/2411.15277 (2024)
[i13]Yan Li, Ziya Zhou, Zhiqiang Wang, Wei Xue, Wenhan Luo, Yike Guo:
SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model. CoRR abs/2412.03430 (2024)
- 2023
[c18]Weizhen Bian, Yijin Song, Nianzhen Gu, Tin Yan Chan, Tsz To Lo, Tsun Sun Li, King Chak Wong, Wei Xue, Roberto Alonso Trillo:
MoMusic: A Motion-Driven Human-AI Collaborative Music Composition and Performing System. AAAI 2023: 16057-16062
[c17]Guanjun Li, Wei Xue, Wenju Liu, Jiangyan Yi, Jianhua Tao:
GCC-Speaker: Target Speaker Localization with Optimal Speaker-Dependent Weighting in Multi-Speaker Scenarios. ICASSP 2023: 1-5
[c16]Zhen Ye, Wei Xue, Xu Tan, Qifeng Liu, Yike Guo:
NAS-FM: Neural Architecture Search for Tunable and Interpretable Sound Synthesis Based on Frequency Modulation. IJCAI 2023: 5869-5877
[c15]Le Zhuo, Ruibin Yuan, Jiahao Pan, Yinghao Ma, Yizhi Li, Ge Zhang, Si Liu, Roger B. Dannenberg, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenhu Chen, Wei Xue, Yike Guo:
LyricWhiz: Robust Multilingual Zero-Shot Lyrics Transcription by Whispering to ChatGPT. ISMIR 2023: 343-351
[c14]Zhen Ye, Wei Xue, Xu Tan, Jie Chen, Qifeng Liu, Yike Guo:
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model. ACM Multimedia 2023: 1831-1839
[c13]Ruibin Yuan, Yinghao Ma, Yizhi Li, Ge Zhang, Xingran Chen, Hanzhi Yin, Le Zhuo, Yiqi Liu, Jiawen Huang, Zeyue Tian, Binyue Deng, Ningzhi Wang, Chenghua Lin, Emmanouil Benetos, Anton Ragni, Norbert Gyenge, Roger B. Dannenberg, Wenhu Chen, Gus Xia, Wei Xue, Si Liu, Shi Wang, Ruibo Liu, Yike Guo, Jie Fu:
MARBLE: Music Audio Representation Benchmark for Universal Evaluation. NeurIPS 2023
[i12]Wei Xue, Yiwen Wang, Qifeng Liu, Yike Guo:
Learn to Sing by Listening: Building Controllable Virtual Singer by Unsupervised Learning from Voice Recordings. CoRR abs/2305.05401 (2023)
[i11]Zhen Ye, Wei Xue, Xu Tan, Jie Chen, Qifeng Liu, Yike Guo:
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model. CoRR abs/2305.06908 (2023)
[i10]Zhen Ye, Wei Xue, Xu Tan, Qifeng Liu, Yike Guo:
NAS-FM: Neural Architecture Search for Tunable and Interpretable Sound Synthesis based on Frequency Modulation. CoRR abs/2305.12868 (2023)
[i9]Jiaming Liu, Senqiao Yang, Peidong Jia, Ming Lu, Yandong Guo, Wei Xue, Shanghang Zhang:
ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation. CoRR abs/2306.04344 (2023)
[i8]Ruibin Yuan, Yinghao Ma, Yizhi Li, Ge Zhang, Xingran Chen, Hanzhi Yin, Le Zhuo, Yiqi Liu, Jiawen Huang, Zeyue Tian, Binyue Deng, Ningzhi Wang, Chenghua Lin, Emmanouil Benetos, Anton Ragni, Norbert Gyenge, Roger B. Dannenberg, Wenhu Chen, Gus Xia, Wei Xue, Si Liu, Shi Wang, Ruibo Liu, Yike Guo, Jie Fu:
MARBLE: Music Audio Representation Benchmark for Universal Evaluation. CoRR abs/2306.10548 (2023)
[i7]Le Zhuo, Ruibin Yuan, Jiahao Pan, Yinghao Ma, Yizhi Li, Ge Zhang, Si Liu, Roger B. Dannenberg, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenhu Chen, Wei Xue, Yike Guo:
LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT. CoRR abs/2306.17103 (2023)
[i6]Chi-Min Chan, Weize Chen, Yusheng Su, Jianxuan Yu, Wei Xue, Shanghang Zhang, Jie Fu, Zhiyuan Liu:
ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate. CoRR abs/2308.07201 (2023)
[i5]Min Zeng, Wei Xue, Qifeng Liu, Yike Guo:
Continual Learning with Dirichlet Generative-based Rehearsal. CoRR abs/2309.06917 (2023)
[i4]Xingqun Qi, Jiahao Pan, Peng Li, Ruibin Yuan, Xiaowei Chi, Mengfei Li, Wenhan Luo, Wei Xue, Shanghang Zhang, Qifeng Liu, Yike Guo:
Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation. CoRR abs/2311.17532 (2023)
[i3]Dongmei Zhang, Chang Li, Ray Zhang, Shenghao Xie, Wei Xue, Xiaodong Xie, Shanghang Zhang:
FM-OV3D: Foundation Model-based Cross-modal Knowledge Blending for Open-Vocabulary 3D Detection. CoRR abs/2312.14465 (2023)
- 2022
[j4]Xinyuan Qian, Qiquan Zhang, Guohui Guan, Wei Xue:
Deep Audio-Visual Beamforming for Speaker Localization. IEEE Signal Process. Lett. 29: 1132-1136 (2022)
[i2]Yike Guo, Qifeng Liu, Jie Chen, Wei Xue, Henrik J. Jensen, Fernando Rosas, Jeffrey Shaw, Xing Wu, Jiji Zhang, Jianliang Xu:
Pathway to Future Symbiotic Creativity. CoRR abs/2209.02388 (2022)
- 2021
[j3]Wei Xue, Alastair H. Moore, Mike Brookes, Patrick A. Naylor:
Speech Enhancement Based on Modulation-Domain Parametric Multichannel Kalman Filtering. IEEE ACM Trans. Audio Speech Lang. Process. 29: 393-405 (2021)
[c12]Li He, Wei Xue:
Causal System Identification based Compensation for Reverberation-Robust DOA Estimation. EUSIPCO 2021: 1885-1889
[c11]Wei Xue, Gang Quan, Chao Zhang, Guohong Ding, Xiaodong He, Bowen Zhou:
Neural Kalman Filtering for Speech Enhancement. ICASSP 2021: 7108-7112
- 2020
[c10]Ying Tong, Wei Xue, Shanluo Huang, Fan Lu, Chao Zhang, Guohong Ding, Xiaodong He:
The JD AI Speaker Verification System for the FFSVC 2020 Challenge. INTERSPEECH 2020: 3476-3480
[c9]Wei Xue, Ying Tong, Chao Zhang, Guohong Ding, Xiaodong He, Bowen Zhou:
Sound Event Localization and Detection Based on Multiple DOA Beamforming and Multi-Task Learning. INTERSPEECH 2020: 5091-5095
[i1]Wei Xue, Gang Quan, Chao Zhang, Guohong Ding, Xiaodong He, Bowen Zhou:
Neural Kalman Filtering for Speech Enhancement. CoRR abs/2007.13962 (2020)
2010 – 2019
- 2019
[j2]Alastair H. Moore, Wei Xue, Patrick A. Naylor, Mike Brookes:
Noise Covariance Matrix Estimation for Rotating Microphone Arrays. IEEE ACM Trans. Audio Speech Lang. Process. 27(3): 519-530 (2019)
[c8]Wei Xue, Ying Tong, Guohong Ding, Chao Zhang, Tao Ma, Xiaodong He, Bowen Zhou:
Direct-Path Signal Cross-Correlation Estimation for Sound Source Localization in Reverberation. INTERSPEECH 2019: 2693-2697
- 2018
[j1]Wei Xue, Alastair H. Moore, Mike Brookes, Patrick A. Naylor:
Modulation-Domain Multichannel Kalman Filtering for Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 26(10): 1833-1847 (2018)
[c7]Alastair H. Moore, Wei Xue, Patrick A. Naylor, Mike Brookes:
Estimation of the Noise Covariance Matrix for Rotating Sensor Arrays. ACSSC 2018: 1936-1941
[c6]Wei Xue, Alastair H. Moore, Mike Brookes, Patrick A. Naylor:
Modulation-Domain Parametric Multichannel Kalman Filtering for Speech Enhancement. EUSIPCO 2018: 2509-2513
[c5]Wei Xue, Alastair H. Moore, Mike Brookes, Patrick A. Naylor:
Multichannel Kalman Filtering for Speech Ehnancement. ICASSP 2018: 41-45
[c4]Alastair H. Moore, Leo Lightburn, Wei Xue, Patrick A. Naylor, Mike Brookes:
Binaural Mask-Informed Speech Enhancement for Hearing AIDS with Head Tracking. IWAENC 2018: 461-465
- 2017
[c3]Wei Xue, Mike Brookes, Patrick A. Naylor:
Frequency-domain under-modelled blind system identification based on cross power spectrum and sparsity regularization. ICASSP 2017: 591-595
- 2016
[c2]Wei Xue, Mike Brookes, Patrick A. Naylor:
Cross-correlation based under-modelled multichannel blind acoustic system identification with sparsity regularization. EUSIPCO 2016: 718-722
[c1]Wei Xue, Mike Brookes, Patrick A. Naylor:
Under-modelled blind system identification for time delay estimation in reverberant environments. IWAENC 2016: 1-5
last updated on 2025-12-19 02:41 CET by the dblp team
all metadata released as open data under CC0 1.0 license