Shiliang Zhang
2020 – today
- 2024
- [j65]Zhen Yang, Jun Yue, Pedram Ghamisi, Shiliang Zhang, Jiayi Ma, Leyuan Fang:
Open Set Recognition in Real World. Int. J. Comput. Vis. 132(8): 3208-3231 (2024) - [j64]Rufei Ren, Yushuai Li, Qiuye Sun, Shiliang Zhang, David Wenzhong Gao, Sabita Maharjan:
Switched Surplus-Based Distributed Security Dispatch for Smart Grid With Persistent Packet Loss. IEEE Internet Things J. 11(4): 6185-6198 (2024) - [j63]Xiaotian Yu, Hanling Yi, Qie Tang, Kun Huang, Wenze Hu, Shiliang Zhang, Xiaoyu Wang:
Graph-based social relation inference with multi-level conditional attention. Neural Networks 173: 106216 (2024) - [j62]Shiyu Xuan, Shiliang Zhang:
Intra-Inter Domain Similarity for Unsupervised Person Re-Identification. IEEE Trans. Pattern Anal. Mach. Intell. 46(3): 1711-1726 (2024) - [j61]Xuehui Ma, Shiliang Zhang, Yushuai Li, Fucai Qian, Zhiyong Sun, Tingwen Huang:
Adaptive Robust Tracking Control With Active Learning for Linear Systems With Ellipsoidal Bounded Uncertainties. IEEE Trans. Autom. Control. 69(11): 8096-8103 (2024) - [j60]Shunan Mao, Shiliang Zhang:
Robust Fine-Grained Visual Recognition With Neighbor-Attention Label Correction. IEEE Trans. Image Process. 33: 2614-2626 (2024) - [j59]Shiyu Xuan, Ming Yang, Shiliang Zhang:
Adapting Vision-Language Models via Learning to Inject Knowledge. IEEE Trans. Image Process. 33: 5798-5809 (2024) - [j58]Zhanzhou Feng, Jiaming Xu, Lei Ma, Shiliang Zhang:
Efficient Video Transformers via Spatial-temporal Token Merging for Action Recognition. ACM Trans. Multim. Comput. Commun. Appl. 20(4): 120:1-120:21 (2024) - [c128]Cong Cong, Shiyu Xuan, Sidong Liu, Shiliang Zhang, Maurice Pagnucco, Yang Song:
Decoupled Optimisation for Long-Tailed Visual Recognition. AAAI 2024: 1380-1388 - [c127]Shiyu Xuan, Shiliang Zhang:
Decoupled Contrastive Learning for Long-Tailed Recognition. AAAI 2024: 6396-6403 - [c126]Junwei Zhao, Shiliang Zhang, Zhaofei Yu, Tiejun Huang:
Recognizing Ultra-High-Speed Moving Objects with Bio-Inspired Spike Camera. AAAI 2024: 7478-7486 - [c125]Ziyang Ma, Zhisheng Zheng, Jiaxin Ye, Jinchao Li, Zhifu Gao, Shiliang Zhang, Xie Chen:
emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation. ACL (Findings) 2024: 15747-15760 - [c124]Dongkai Wang, Shiyu Xuan, Shiliang Zhang:
LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model. CVPR 2024: 614-623 - [c123]Dongkai Wang, Shiliang Zhang:
Spatial-Aware Regression for Keypoint Localization. CVPR 2024: 624-633 - [c122]Shiyu Xuan, Qingpei Guo, Ming Yang, Shiliang Zhang:
Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs. CVPR 2024: 13838-13848 - [c121]Zehong Ma, Shiliang Zhang, Longhui Wei, Qi Tian:
OVMR: Open-Vocabulary Recognition with Multi-Modal References. CVPR 2024: 16571-16581 - [c120]Xuehui Ma, Yutong Chen, Shiliang Zhang, Yushuai Li, Fucai Qian, Zhiyong Sun:
Robust Quadratic Optimal Control of Linear Systems with Ellipsoid-Set Learning. ECC 2024: 2125-2131 - [c119]Zhihao Du, Shiliang Zhang, Kai Hu, Siqi Zheng:
FunCodec: A Fundamental, Reproducible and Integrable Open-Source Toolkit for Neural Speech Codec. ICASSP 2024: 591-595 - [c118]Fan Yu, Haoxu Wang, Ziyang Ma, Shiliang Zhang:
Hourglass-AVSR: Down-Up Sampling-Based Computational Efficiency Model for Audio-Visual Speech Recognition. ICASSP 2024: 7940-7944 - [c117]Xian Shi, Yexin Yang, Zerui Li, Yanni Chen, Zhifu Gao, Shiliang Zhang:
SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization Ability. ICASSP 2024: 10346-10350 - [c116]Fan Yu, Haoxu Wang, Xian Shi, Shiliang Zhang:
LCB-Net: Long-Context Biasing for Audio-Visual Speech Recognition. ICASSP 2024: 10621-10625 - [c115]Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Shiliang Zhang, Chong Deng, Yukun Ma, Hai Yu, Jiaqing Liu, Chong Zhang:
Loss Masking Is Not Needed In Decoder-Only Transformer For Discrete-Token-Based ASR. ICASSP 2024: 11056-11060 - [c114]Haoxu Wang, Fan Yu, Xian Shi, Yuezhang Wang, Shiliang Zhang, Ming Li:
SlideSpeech: A Large Scale Slide-Enriched Audio-Visual Corpus. ICASSP 2024: 11076-11080 - [c113]Ziyang Ma, Wen Wu, Zhisheng Zheng, Yiwei Guo, Qian Chen, Shiliang Zhang, Xie Chen:
Leveraging Speech PTM, Text LLM, And Emotional TTS For Speech Emotion Recognition. ICASSP 2024: 11146-11150 - [c112]Zimo Liu, Kangjun Liu, Mingyue Guo, Shiliang Zhang, Yaowei Wang:
CoTuning: A Large-Small Model Collaborating Distillation Framework for Better Model Generalization. ACM Multimedia 2024: 10487-10496 - [c111]Hui Zhang, Shiliang Zhang, Sabita Maharjan, Yan Zhang:
P4S: Privacy-Preserving Personalized Pricing Scheme for Smart Grid. SmartGridComm 2024: 21-26 - [c110]Shiliang Zhang, Sabita Maharjan, Raul Shahi, Xuehui Ma:
The impact of integration of renewable energy on imbalance settlement: Resilience analysis. SmartGridComm 2024: 98-104 - [i103]Hongfei Xue, Yuhao Liang, Bingshen Mu, Shiliang Zhang, Mengzhe Chen, Qian Chen, Lei Xie:
E-chat: Emotion-sensitive Spoken Dialogue System with Large Language Models. CoRR abs/2401.00475 (2024) - [i102]Fan Yu, Haoxu Wang, Xian Shi, Shiliang Zhang:
LCB-net: Long-Context Biasing for Audio-Visual Speech Recognition. CoRR abs/2401.06390 (2024) - [i101]Ziyang Ma, Guanrou Yang, Yifan Yang, Zhifu Gao, Jiaming Wang, Zhihao Du, Fan Yu, Qian Chen, Siqi Zheng, Shiliang Zhang, Xie Chen:
An Embarrassingly Simple Approach for LLM with Strong ASR Capacity. CoRR abs/2402.08846 (2024) - [i100]Shiyu Xuan, Shiliang Zhang:
Decoupled Contrastive Learning for Long-Tailed Recognition. CoRR abs/2403.06151 (2024) - [i99]Songnan Yang, Xiaohui Zhang, Shiliang Zhang, Xuehui Ma, Wenqi Bai, Yushuai Li, Tingwen Huang:
A Bionic Data-driven Approach for Long-distance Underwater Navigation with Anomaly Resistance. CoRR abs/2403.08808 (2024) - [i98]Haoyu Wang, Zhilu Zhang, Donglin Di, Shiliang Zhang, Wangmeng Zuo:
MV-VTON: Multi-View Virtual Try-On with Diffusion Models. CoRR abs/2404.17364 (2024) - [i97]Dongkai Wang, Shiyu Xuan, Shiliang Zhang:
LocLLM: Exploiting Generalizable Human Keypoint Localization via Large Language Model. CoRR abs/2406.04659 (2024) - [i96]Zehong Ma, Shiliang Zhang, Longhui Wei, Qi Tian:
OVMR: Open-Vocabulary Recognition with Multi-Modal References. CoRR abs/2406.04675 (2024) - [i95]Guanrou Yang, Ziyang Ma, Fan Yu, Zhifu Gao, Shiliang Zhang, Xie Chen:
MaLa-ASR: Multimedia-Assisted LLM-Based ASR. CoRR abs/2406.05839 (2024) - [i94]Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Shiliang Zhang, Chong Deng, Hai Yu, Jiaqing Liu, Yukun Ma, Chong Zhang:
Skip-Layer Attention: Bridging Abstract and Detailed Dependencies in Transformers. CoRR abs/2406.11274 (2024) - [i93]Keyu An, Qian Chen, Chong Deng, Zhihao Du, Changfeng Gao, Zhifu Gao, Yue Gu, Ting He, Hangrui Hu, Kai Hu, Shengpeng Ji, Yabin Li, Zerui Li, Heng Lu, Haoneng Luo, Xiang Lv, Bin Ma, Ziyang Ma, Chongjia Ni, Changhe Song, Jiaqi Shi, Xian Shi, Hao Wang, Wen Wang, Yuxuan Wang, Zhangyu Xiao, Zhijie Yan, Yexin Yang, Bin Zhang, Qinglin Zhang, Shiliang Zhang, Nan Zhao, Siqi Zheng:
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs. CoRR abs/2407.04051 (2024) - [i92]Zhihao Du, Qian Chen, Shiliang Zhang, Kai Hu, Heng Lu, Yexin Yang, Hangrui Hu, Siqi Zheng, Yue Gu, Ziyang Ma, Zhifu Gao, Zhijie Yan:
CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens. CoRR abs/2407.05407 (2024) - [i91]Cong Cong, Shiyu Xuan, Sidong Liu, Maurice Pagnucco, Shiliang Zhang, Yang Song:
Dataset Distillation for Histopathology Image Classification. CoRR abs/2408.09709 (2024) - [i90]Mingyu Cui, Yifan Yang, Jiajun Deng, Jiawen Kang, Shujie Hu, Tianzi Wang, Zhaoqing Li, Shiliang Zhang, Xie Chen, Xunying Liu:
Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR. CoRR abs/2409.08797 (2024) - [i89]Keyu An, Zerui Li, Zhifu Gao, Shiliang Zhang:
Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition. CoRR abs/2409.17746 (2024) - [i88]Keyu An, Shiliang Zhang, Zhijie Yan:
Are Transformers in Pre-trained LM A Good ASR Encoder? An Empirical Study. CoRR abs/2409.17750 (2024) - [i87]Wenqi Bai, Xiaohui Zhang, Shiliang Zhang, Songnan Yang, Yushuai Li, Tingwen Huang:
Long-distance Geomagnetic Navigation in GNSS-denied Environments with Deep Reinforcement Learning. CoRR abs/2410.15837 (2024) - [i86]Guanrou Yang, Fan Yu, Ziyang Ma, Zhihao Du, Zhifu Gao, Shiliang Zhang, Xie Chen:
Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap. CoRR abs/2410.16726 (2024)
- 2023
- [j57]Dongkai Wang, Shiliang Zhang:
Contextual Instance Decoupling for Instance-Level Human Analysis. IEEE Trans. Pattern Anal. Mach. Intell. 45(8): 9520-9533 (2023) - [j56]Shunan Mao, Yaowei Wang, Xiaoyu Wang, Shiliang Zhang:
Multi-proxy feature learning for robust fine-grained visual recognition. Pattern Recognit. 143: 109779 (2023) - [j55]Shiliang Zhang, Anton Hagermalm, Sanjin Slavnic, Elad Michael Schiller, Magnus Almgren:
Evaluation of Open-Source Tools for Differential Privacy. Sensors 23(14): 6509 (2023) - [j54]Yuchun Shu, Haoneng Luo, Shiliang Zhang, Longbiao Wang, Jianwu Dang:
A CIF-Based Speech Segmentation Method for Streaming E2E ASR. IEEE Signal Process. Lett. 30: 344-348 (2023) - [j53]Zhenyu Cui, Jiahuan Zhou, Yuxin Peng, Shiliang Zhang, Yaowei Wang:
DCR-ReID: Deep Component Reconstruction for Cloth-Changing Person Re-Identification. IEEE Trans. Circuits Syst. Video Technol. 33(8): 4415-4428 (2023) - [j52]Jianing Li, Yaowei Wang, Shiliang Zhang:
PolarPose: Single-Stage Multi-Person Pose Estimation in Polar Coordinates. IEEE Trans. Image Process. 32: 1108-1119 (2023) - [j51]Zhanzhou Feng, Shiliang Zhang:
Efficient Vision Transformer via Token Merger. IEEE Trans. Image Process. 32: 4156-4169 (2023) - [j50]Xiao Wang, Xiujun Shu, Shiliang Zhang, Bo Jiang, Yaowei Wang, Yonghong Tian, Feng Wu:
MFGNet: Dynamic Modality-Aware Filter Generation for RGB-T Tracking. IEEE Trans. Multim. 25: 4335-4348 (2023) - [c109]Mohan Shi, Jie Zhang, Zhihao Du, Fan Yu, Qian Chen, Shiliang Zhang, Li-Rong Dai:
A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings. APSIPA ASC 2023: 1943-1948 - [c108]Yangze Li, Fan Yu, Yuhao Liang, Pengcheng Guo, Mohan Shi, Zhihao Du, Shiliang Zhang, Lei Xie:
Sa-Paraformer: Non-Autoregressive End-To-End Speaker-Attributed ASR. ASRU 2023: 1-7 - [c107]Yuhao Liang, Mohan Shi, Fan Yu, Yangze Li, Shiliang Zhang, Zhihao Du, Qian Chen, Lei Xie, Yanmin Qian, Jian Wu, Zhuo Chen, Kong Aik Lee, Zhijie Yan, Hui Bu:
The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR. ASRU 2023: 1-8 - [c106]Zhanzhou Feng, Shiliang Zhang:
Evolved Part Masking for Self-Supervised Learning. CVPR 2023: 10386-10395 - [c105]Haoyu Lu, Nan Li, Tongtong Song, Longbiao Wang, Jianwu Dang, Xiaobao Wang, Shiliang Zhang:
Speech and Noise Dual-Stream Spectrogram Refine Network With Speech Distortion Loss For Robust Speech Recognition. ICASSP 2023: 1-5 - [c104]Jiaming Wang, Zhihao Du, Shiliang Zhang:
TOLD: a Novel Two-Stage Overlap-Aware Framework for Speaker Diarization. ICASSP 2023: 1-5 - [c103]Ruihan Xu, Haokui Zhang, Wenze Hu, Shiliang Zhang, Xiaoyu Wang:
ParCNetV2: Oversized Kernel with Enhanced Attention*. ICCV 2023: 5729-5739 - [c102]Dongkai Wang, Shiliang Zhang:
3D Human Mesh Recovery with Sequentially Global Rotation Estimation. ICCV 2023: 14907-14916 - [c101]Mohan Shi, Zhihao Du, Qian Chen, Fan Yu, Yangze Li, Shiliang Zhang, Jie Zhang, Li-Rong Dai:
CASA-ASR: Context-Aware Speaker-Attributed ASR. INTERSPEECH 2023: 411-415 - [c100]Yue Gu, Zhihao Du, Shiliang Zhang, Qian Chen, Jiqing Han:
Personality-aware Training based Speaker Adaptation for End-to-end Speech Recognition. INTERSPEECH 2023: 1249-1253 - [c99]Zhifu Gao, Zerui Li, Jiaming Wang, Haoneng Luo, Xian Shi, Mengzhe Chen, Yabin Li, Lingyun Zuo, Zhihao Du, Shiliang Zhang:
FunASR: A Fundamental End-to-End Speech Recognition Toolkit. INTERSPEECH 2023: 1593-1597 - [c98]Xian Shi, Haoneng Luo, Zhifu Gao, Shiliang Zhang, Zhijie Yan:
Accurate and Reliable Confidence Estimation Based on Non-Autoregressive End-to-End Speech Recognition System. INTERSPEECH 2023: 3247-3251 - [c97]Yuhao Liang, Fan Yu, Yangze Li, Pengcheng Guo, Shiliang Zhang, Qian Chen, Lei Xie:
BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR. INTERSPEECH 2023: 3487-3491 - [c96]Junjie Li, Meng Ge, Zexu Pan, Rui Cao, Longbiao Wang, Jianwu Dang, Shiliang Zhang:
Rethinking the Visual Cues in Audio-Visual Speaker Extraction. INTERSPEECH 2023: 3754-3758 - [c95]Xiaohuan Zhou, Jiaming Wang, Zeyu Cui, Shiliang Zhang, Zhijie Yan, Jingren Zhou, Chang Zhou:
MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for speech recognition. INTERSPEECH 2023: 4943-4947 - [c94]Keyu An, Xian Shi, Shiliang Zhang:
BAT: Boundary aware transducer for memory-efficient and low-latency ASR. INTERSPEECH 2023: 4963-4967 - [c93]Mohan Shi, Yuchun Shu, Lingyun Zuo, Qian Chen, Shiliang Zhang, Jie Zhang, Li-Rong Dai:
Semantic VAD: Low-Latency Voice Activity Detection for Speech Interaction. INTERSPEECH 2023: 5047-5051 - [c92]Junwei Zhao, Jianming Ye, Shiliang Zhang, Zhaofei Yu, Tiejun Huang:
Recognizing High-Speed Moving Objects with Spike Camera. ACM Multimedia 2023: 7657-7665 - [c91]Dongkai Wang, Shiliang Zhang, Yaowei Wang, Yonghong Tian, Tiejun Huang, Wen Gao:
HumVis: Human-Centric Visual Analysis System. ACM Multimedia 2023: 9396-9398 - [c90]Yu Liang, Shiliang Zhang, Li Ken Li, Xiaoyu Wang:
Unleashing the Full Potential of Product Quantization for Large-Scale Image Retrieval. NeurIPS 2023 - [i85]Xian Shi, Yanni Chen, Shiliang Zhang, Zhijie Yan:
Achieving Timestamp Prediction While Recognizing with Non-Autoregressive End-to-End ASR Model. CoRR abs/2301.12343 (2023) - [i84]Jiaming Wang, Zhihao Du, Shiliang Zhang:
TOLD: A Novel Two-Stage Overlap-Aware Framework for Speaker Diarization. CoRR abs/2303.05397 (2023) - [i83]Xian Shi, Haoneng Luo, Zhifu Gao, Shiliang Zhang, Zhijie Yan:
Accurate and Reliable Confidence Estimation Based on Non-Autoregressive End-to-End Speech Recognition System. CoRR abs/2305.10680 (2023) - [i82]Zhifu Gao, Zerui Li, Jiaming Wang, Haoneng Luo, Xian Shi, Mengzhe Chen, Yabin Li, Lingyun Zuo, Zhihao Du, Zhangyu Xiao, Shiliang Zhang:
FunASR: A Fundamental End-to-End Speech Recognition Toolkit. CoRR abs/2305.11013 (2023) - [i81]Mohan Shi, Yuchun Shu, Lingyun Zuo, Qian Chen, Shiliang Zhang, Jie Zhang, Li-Rong Dai:
Semantic VAD: Low-Latency Voice Activity Detection for Speech Interaction. CoRR abs/2305.12450 (2023) - [i80]Mohan Shi, Zhihao Du, Qian Chen, Fan Yu, Yangze Li, Shiliang Zhang, Jie Zhang, Li-Rong Dai:
CASA-ASR: Context-Aware Speaker-Attributed ASR. CoRR abs/2305.12459 (2023) - [i79]Yuhao Liang, Fan Yu, Yangze Li, Pengcheng Guo, Shiliang Zhang, Qian Chen, Lei Xie:
BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR. CoRR abs/2305.13716 (2023) - [i78]Haoyu Lu, Nan Li, Tongtong Song, Longbiao Wang, Jianwu Dang, Xiaobao Wang, Shiliang Zhang:
speech and noise dual-stream spectrogram refine network with speech distortion loss for robust speech recognition. CoRR abs/2305.17860 (2023) - [i77]Junjie Li, Meng Ge, Zexu Pan, Rui Cao, Longbiao Wang, Jianwu Dang, Shiliang Zhang:
Rethinking the visual cues in audio-visual speaker extraction. CoRR abs/2306.02625 (2023) - [i76]Xian Shi, Yexin Yang, Zerui Li, Shiliang Zhang:
SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization Ability. CoRR abs/2308.03266 (2023) - [i75]Xuehui Ma, Shiliang Zhang, Yushuai Li, Fucai Qian, Tingwen Huang:
Adaptive robust tracking control with active learning for linear systems with ellipsoidal bounded uncertainties. CoRR abs/2308.03727 (2023) - [i74]Yu Liang, Shiliang Zhang, Yaowei Wang, Sheng Xiao, Kenli Li, Xiaoyu Wang:
MixBCT: Towards Self-Adapting Backward-Compatible Training. CoRR abs/2308.06948 (2023) - [i73]Haoxu Wang, Fan Yu, Xian Shi, Yuezhang Wang, Shiliang Zhang, Ming Li:
SlideSpeech: A Large-Scale Slide-Enriched Audio-Visual Corpus. CoRR abs/2309.05396 (2023) - [i72]Zhihao Du, Shiliang Zhang, Kai Hu, Siqi Zheng:
FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec. CoRR abs/2309.07405 (2023) - [i71]Peng Wang, Yifan Yang, Zheng Liang, Tian Tan, Shiliang Zhang, Xie Chen:
Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer. CoRR abs/2309.07648 (2023) - [i70]Ziyang Ma, Wen Wu, Zhisheng Zheng, Yiwei Guo, Qian Chen, Shiliang Zhang, Xie Chen:
Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition. CoRR abs/2309.10294 (2023) - [i69]Luyao Cheng, Siqi Zheng, Qinglin Zhang, Hui Wang, Yafeng Chen, Qian Chen, Shiliang Zhang:
Improving Speaker Diarization using Semantic Information: Joint Pairwise Constraints Propagation. CoRR abs/2309.10456 (2023) - [i68]Yuhao Liang, Mohan Shi, Fan Yu, Yangze Li, Shiliang Zhang, Zhihao Du, Qian Chen, Lei Xie, Yanmin Qian, Jian Wu, Zhuo Chen, Kong Aik Lee, Zhijie Yan, Hui Bu:
The second multi-channel multi-party meeting transcription challenge (M2MeT 2.0): A benchmark for speaker-attributed ASR. CoRR abs/2309.13573 (2023) - [i67]Keyu An, Shiliang Zhang:
Exploring RWKV for Memory Efficient and Low Latency Streaming ASR. CoRR abs/2309.14758 (2023) - [i66]Shiyu Xuan, Qingpei Guo, Ming Yang, Shiliang Zhang:
Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs. CoRR abs/2310.00582 (2023) - [i65]Jiaming Wang, Zhihao Du, Qian Chen, Yunfei Chu, Zhifu Gao, Zerui Li, Kai Hu, Xiaohuan Zhou, Jin Xu, Ziyang Ma, Wen Wang, Siqi Zheng, Chang Zhou, Zhijie Yan, Shiliang Zhang:
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT. CoRR abs/2310.04673 (2023) - [i64]Yangze Li, Fan Yu, Yuhao Liang, Pengcheng Guo, Mohan Shi, Zhihao Du, Shiliang Zhang, Lei Xie:
SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR. CoRR abs/2310.04863 (2023) - [i63]Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Shiliang Zhang, Chong Deng, Yukun Ma, Hai Yu, Jiaqing Liu, Chong Zhang:
Loss Masking Is Not Needed in Decoder-only Transformer for Discrete-token Based ASR. CoRR abs/2311.04534 (2023) - [i62]Yunfei Chu, Jin Xu, Xiaohuan Zhou, Qian Yang, Shiliang Zhang, Zhijie Yan, Chang Zhou, Jingren Zhou:
Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models. CoRR abs/2311.07919 (2023) - [i61]Fan Yu, Haoxu Wang, Ziyang Ma, Shiliang Zhang:
Hourglass-AVSR: Down-Up Sampling-based Computational Efficiency Model for Audio-Visual Speech Recognition. CoRR abs/2312.08850 (2023) - [i60]Daniel Gerbi Duguma, Juliana Zhang, Meysam Aboutalebi, Shiliang Zhang, Catherine Banet, Cato Bjørkli, Chinmayi Prabhu Baramashetru, Frank Eliassen, Hui Zhang, Jonathan Muringani, Josef Noll, Knut Inge Fostervold, Lars Böcker, Lee Andrew Bygrave, Matin Bagherpour, Maunya Doroudi Moghadam, Olaf Owe, Poushali Sengupta, Roman Vitenberg, Sabita Maharjan, Thiago Garrett, Yushuai Li, Zhengyu Shan:
Privacy-preserving transactive energy systems: Key topics and open research challenges. CoRR abs/2312.11564 (2023) - [i59]Lingyun Zuo, Keyu An, Shiliang Zhang, Zhijie Yan:
Advancing VAD Systems Based on Multi-Task Learning with Improved Model Structures. CoRR abs/2312.14860 (2023) - [i58]Ziyang Ma, Zhisheng Zheng, Jiaxin Ye, Jinchao Li, Zhifu Gao, Shiliang Zhang, Xie Chen:
emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation. CoRR abs/2312.15185 (2023)
- 2022
- [j49]Dongkai Wang, Shiliang Zhang:
Unsupervised Person Re-Identification via Multi-Label Classification. Int. J. Comput. Vis. 130(12): 2924-2939 (2022) - [j48]Jianzhong He, Shiliang Zhang, Ming Yang, Yanhu Shan, Tiejun Huang:
BDCN: Bi-Directional Cascade Network for Perceptual Edge Detection. IEEE Trans. Pattern Anal. Mach. Intell. 44(1): 100-113 (2022) - [j47]Jianing Li, Shiliang Zhang, Qi Tian, Meng Wang, Wen Gao:
Pose-Guided Representation Learning for Person Re-Identification. IEEE Trans. Pattern Anal. Mach. Intell. 44(2): 622-635 (2022) - [j46]Xiaobin Liu, Shiliang Zhang:
Who is closer: A computational method for domain gap evaluation. Pattern Recognit. 122: 108293 (2022) - [j45]Xiujun Shu, Xiao Wang, Xianghao Zang, Shiliang Zhang, Yuanqi Chen, Ge Li, Qi Tian:
Large-Scale Spatio-Temporal Person Re-Identification: Algorithms and Benchmark. IEEE Trans. Circuits Syst. Video Technol. 32(7): 4390-4403 (2022) - [j44]Mingkui Tan, Gengqin Ni, Xu Liu, Shiliang Zhang, Xiangmiao Wu, Yaowei Wang, Runhao Zeng:
Bidirectional Posture-Appearance Interaction Network for Driver Behavior Recognition. IEEE Trans. Intell. Transp. Syst. 23(8): 13242-13254 (2022) - [j43]Jianming Ye, Jingdong Wang, Shiliang Zhang:
Distillation-Guided Residual Learning for Binary Convolutional Neural Networks. IEEE Trans. Neural Networks Learn. Syst. 33(12): 7765-7777 (2022) - [j42]