


Haizhou Li 0001
李海洲
Person information

- unicode name: 李海洲
- affiliation: Chinese University of Hong Kong (Shenzhen), China
- affiliation: National University of Singapore, Department of Electrical and Computer Engineering, Singapore
- affiliation (2006 - 2016): Nanyang Technological University, Singapore
- affiliation (2003 - 2016): Institute for Infocomm Research, A*STAR, Singapore
- affiliation (2011): University of New South Wales, Sydney, Australia
- affiliation (2009): University of Eastern Finland, Kuopio, Finland
- affiliation (PhD 1990): South China University of Technology, Guangzhou, China
Other persons with the same name
- Haizhou Li 0002 — Blaise Pascal University, Clermont-Ferrand, France
- Haizhou Li 0003 — City University of Hong Kong, Department of Computer Science, Hong Kong
2020 – today
- 2022
- [j135]Hongqiang Du, Lei Xie, Haizhou Li:
Noise-robust voice conversion with domain adversarial training. Neural Networks 148: 74-84 (2022)
- [j134]Kun Zhou, Berrak Sisman, Rui Liu, Haizhou Li:
Emotional voice conversion: Theory, databases and ESD. Speech Commun. 137: 1-18 (2022)
- [j133]Tianchi Liu, Rohan Kumar Das, Kong Aik Lee, Haizhou Li:
Neural Acoustic-Phonetic Approach for Speaker Verification With Phonetic Attention Mask. IEEE Signal Process. Lett. 29: 782-786 (2022)
- [j132]Zexu Pan, Xinyuan Qian, Haizhou Li:
Speaker Extraction With Co-Speech Gestures Cue. IEEE Signal Process. Lett. 29: 1467-1471 (2022)
- [j131]Haizhou Li:
A Unique ICASSP 2022: During an Unusual Time [Conference Highlights]. IEEE Signal Process. Mag. 39(2): 159-160 (2022)
- [j130]Zexu Pan, Ruijie Tao, Chenglin Xu, Haizhou Li:
Selective Listening by Synchronizing Speech With Lips. IEEE ACM Trans. Audio Speech Lang. Process. 30: 1650-1664 (2022)
- [j129]Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li:
Decoding Knowledge Transfer for Neural Text-to-Speech Training. IEEE ACM Trans. Audio Speech Lang. Process. 30: 1789-1802 (2022)
- [j128]Enze Su, Siqi Cai, Longhan Xie, Haizhou Li, Tanja Schultz:
STAnet: A Spatiotemporal Attention Network for Decoding Auditory Spatial Attention From EEG. IEEE Trans. Biomed. Eng. 69(7): 2233-2242 (2022)
- [j127]Siqi Cai, Enze Su, Longhan Xie, Haizhou Li:
EEG-Based Auditory Attention Detection via Frequency and Channel Neural Attention. IEEE Trans. Hum. Mach. Syst. 52(2): 256-266 (2022)
- [j126]Malu Zhang, Jiadong Wang, Jibin Wu, Ammar Belatreche, Burin Amornpaisannon, Zhixuan Zhang, Venkata Pavan Kumar Miriyala, Hong Qu, Yansong Chua, Trevor E. Carlson, Haizhou Li:
Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks. IEEE Trans. Neural Networks Learn. Syst. 33(5): 1947-1958 (2022)
- [c621]Chen Zhang, Luis Fernando D'Haro, Thomas Friedrichs, Haizhou Li:
MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation. AAAI 2022: 11657-11666
- [c620]Jinming Zhao, Tenggan Zhang, Jingwen Hu, Yuchen Liu, Qin Jin, Xinchao Wang, Haizhou Li:
M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database. ACL (1) 2022: 5699-5710
- [c619]Bin Wang, C.-C. Jay Kuo, Haizhou Li:
Just Rank: Rethinking Evaluation with Word and Sentence Similarities. ACL (1) 2022: 6060-6077
- [c618]Xiaoxue Gao, Chitralekha Gupta, Haizhou Li:
Genre-Conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music. ICASSP 2022: 791-795
- [c617]Marvin Borsdorf, Kevin Scheck, Haizhou Li, Tanja Schultz:
Experts Versus All-Rounders: Target Language Extraction for Multiple Target Languages. ICASSP 2022: 846-850
- [c616]Jinming Zhao, Ruichen Li, Qin Jin, Xinchao Wang, Haizhou Li:
Memobert: Pre-Training Model with Prompt-Based Learning for Multimodal Emotion Recognition. ICASSP 2022: 4703-4707
- [c615]Ruijie Tao, Kong Aik Lee, Rohan Kumar Das, Ville Hautamäki, Haizhou Li:
Self-Supervised Speaker Recognition with Loss-Gated Learning. ICASSP 2022: 6142-6146
- [c614]Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li:
L-SpEx: Localized Target Speaker Extraction. ICASSP 2022: 7287-7291
- [c613]Tianchi Liu, Rohan Kumar Das, Kong Aik Lee, Haizhou Li:
MFA: TDNN with Multi-Scale Frequency-Channel Attention for Text-Independent Speaker Verification with Short Utterances. ICASSP 2022: 7517-7521
- [c612]Qiquan Zhang, Qi Song, Zhaoheng Ni, Aaron Nicolson, Haizhou Li:
Time-Frequency Attention for Monaural Speech Enhancement. ICASSP 2022: 7852-7856
- [c611]Junchen Lu, Berrak Sisman, Rui Liu, Mingyang Zhang, Haizhou Li:
Visualtts: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over. ICASSP 2022: 8032-8036
- [c610]Jiadong Wang, Jibin Wu, Malu Zhang, Qi Liu, Haizhou Li:
A Hybrid Learning Framework for Deep Spiking Neural Networks with One-Spike Temporal Coding. ICASSP 2022: 8942-8946
- [c609]Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li:
ADD 2022: the first Audio Deep Synthesis Detection Challenge. ICASSP 2022: 9216-9220
- [i103]Kun Zhou, Berrak Sisman, Rajib Rana, Björn W. Schuller, Haizhou Li:
Emotion Intensity and its Control for Emotional Voice Conversion. CoRR abs/2201.03967 (2022)
- [i102]Hongqiang Du, Lei Xie, Haizhou Li:
Noise-robust voice conversion with domain adversarial training. CoRR abs/2201.10693 (2022)
- [i101]Tianchi Liu, Rohan Kumar Das, Kong Aik Lee, Haizhou Li:
MFA: TDNN with Multi-scale Frequency-channel Attention for Text-independent Speaker Verification with Short Utterances. CoRR abs/2202.01624 (2022)
- [i100]Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li, Zheng Lian, Bin Liu:
ADD 2022: the First Audio Deep Synthesis Detection Challenge. CoRR abs/2202.08433 (2022)
- [i99]Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li:
L-SpEx: Localized Target Speaker Extraction. CoRR abs/2202.09995 (2022)
- [i98]Bin Wang, C.-C. Jay Kuo, Haizhou Li:
Just Rank: Rethinking Evaluation with Word and Sentence Similarities. CoRR abs/2203.02679 (2022)
- [i97]Rui Wang, Qibing Bai, Junyi Ao, Long Zhou, Zhixiang Xiong, Zhihua Wei, Yu Zhang, Tom Ko, Haizhou Li:
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT. CoRR abs/2203.15610 (2022)
- [i96]Zexu Pan, Xinyuan Qian, Haizhou Li:
Speaker Extraction with Co-Speech Gestures Cue. CoRR abs/2203.16840 (2022)
- [i95]Zexu Pan, Meng Ge, Haizhou Li:
A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction. CoRR abs/2203.16843 (2022)
- [i94]Junyi Ao, Ziqiang Zhang, Long Zhou, Shujie Liu, Haizhou Li, Tom Ko, Lirong Dai, Jinyu Li, Yao Qian, Furu Wei:
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data. CoRR abs/2203.17113 (2022)
- [i93]Xiaoxue Gao, Chitralekha Gupta, Haizhou Li:
Genre-conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music. CoRR abs/2204.03307 (2022)
- [i92]Jinming Zhao, Tenggan Zhang, Jingwen Hu, Yuchen Liu, Qin Jin, Xinchao Wang, Haizhou Li:
M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database. CoRR abs/2205.10237 (2022)
- [i91]Rui Liu, Berrak Sisman, Björn W. Schuller, Guanglai Gao, Haizhou Li:
Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning. CoRR abs/2206.07229 (2022)
- [i90]Xiaoxue Gao, Chitralekha Gupta, Haizhou Li:
PoLyScribers: Joint Training of Vocal Extractor and Lyrics Transcriber for Polyphonic Music. CoRR abs/2207.07336 (2022)
- [i89]Kun Zhou, Berrak Sisman, Rajib Rana, Björn W. Schuller, Haizhou Li:
Speech Synthesis with Mixed Emotions. CoRR abs/2208.05890 (2022)
- 2021
- [j125]Jibin Wu, Qi Liu, Malu Zhang, Zihan Pan, Haizhou Li, Kay Chen Tan:
HuRAI: A brain-inspired computational model for human-robot auditory interface. Neurocomputing 465: 103-113 (2021)
- [j124]Rui Liu, Berrak Sisman, Yixing Lin, Haizhou Li:
FastTalker: A neural text-to-speech architecture with shallow and group autoregression. Neural Networks 141: 306-314 (2021)
- [j123]Hongqiang Du, Xiaohai Tian, Lei Xie, Haizhou Li:
Factorized WaveNet for voice conversion with limited data. Speech Commun. 130: 45-54 (2021)
- [j122]Tharshini Gunendradasan, Eliathamby Ambikairajah, Julien Epps, Vidhyasaharan Sethu, Haizhou Li:
An adaptive transmission line cochlear model based front-end for replay attack detection. Speech Commun. 132: 114-122 (2021)
- [j121]Bidisha Sharma, Xiaoxue Gao, Karthika Vijayan, Xiaohai Tian, Haizhou Li:
NHSS: A speech and singing parallel database. Speech Commun. 133: 9-22 (2021)
- [j120]Xinyuan Qian, Qi Liu, Jiadong Wang, Haizhou Li:
Three-Dimensional Speaker Localization: Audio-Refined Visual Scaling Factor Estimation. IEEE Signal Process. Lett. 28: 1405-1409 (2021)
- [j119]Berrak Sisman, Junichi Yamagishi, Simon King, Haizhou Li:
An Overview of Voice Conversion and Its Challenges: From Statistical Modeling to Deep Learning. IEEE ACM Trans. Audio Speech Lang. Process. 29: 132-157 (2021)
- [j118]Rui Liu, Berrak Sisman, Feilong Bao, Jichen Yang, Guanglai Gao, Haizhou Li:
Exploiting Morphological and Phonological Features to Improve Prosodic Phrasing for Mongolian Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 29: 274-285 (2021)
- [j117]Mingyang Zhang, Yi Zhou, Li Zhao, Haizhou Li:
Transfer Learning From Speech Synthesis to Voice Conversion With Non-Parallel Training Data. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1290-1302 (2021)
- [j116]Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li:
Expressive TTS Training With Frame and Style Reconstruction Loss. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1806-1818 (2021)
- [j115]Chen Zhang, Grandee Lee, Luis Fernando D'Haro, Haizhou Li:
D-Score: Holistic Dialogue Evaluation Without Reference. IEEE ACM Trans. Audio Speech Lang. Process. 29: 2502-2516 (2021)
- [j114]Zihan Pan, Malu Zhang, Jibin Wu, Jiadong Wang, Haizhou Li:
Multi-Tone Phase Coding of Interaural Time Difference for Sound Source Localization With Spiking Neural Networks. IEEE ACM Trans. Audio Speech Lang. Process. 29: 2656-2670 (2021)
- [j113]Chenglin Xu, Wei Rao, Jibin Wu, Haizhou Li:
Target Speaker Verification With Selective Auditory Attention for Single and Multi-Talker Speech. IEEE ACM Trans. Audio Speech Lang. Process. 29: 2696-2709 (2021)
- [j112]Yi Zhou, Xiaohai Tian, Haizhou Li:
Language Agnostic Speaker Embedding for Cross-Lingual Personalized Speech Generation. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3427-3439 (2021)
- [c608]Yan Zhang, Ruidan He, Zuozhu Liu, Lidong Bing, Haizhou Li:
Bootstrapped Unsupervised Sentence Representation Learning. ACL/IJCNLP (1) 2021: 5168-5180
- [c607]Chen Zhang, Yiming Chen, Luis Fernando D'Haro, Yan Zhang, Thomas Friedrichs, Grandee Lee, Haizhou Li:
DynaEval: Unifying Turn and Dialogue Level Evaluation. ACL/IJCNLP (1) 2021: 5676-5689
- [c606]Jinhu Li, Chitralekha Gupta, Haizhou Li:
Training Explainable Singing Quality Assessment Network with Augmented Data. APSIPA ASC 2021: 904-911
- [c605]Chitralekha Gupta, Jinhu Li, Haizhou Li:
Towards Reference-Independent Rhythm Assessment of Solo Singing. APSIPA ASC 2021: 912-919
- [c604]Yi Ma, Kong Aik Lee, Ville Hautamäki, Haizhou Li:
PL-EESR: Perceptual Loss Based End-to-End Robust Speaker Representation Extraction. ASRU 2021: 106-113
- [c603]Bidisha Sharma, Maulik C. Madhavi, Xuehao Zhou, Haizhou Li:
Exploring Teacher-Student Learning Approach for Multi-Lingual Speech-to-Intent Classification. ASRU 2021: 419-426
- [c602]Zongyang Du, Berrak Sisman, Kun Zhou, Haizhou Li:
Expressive Voice Conversion: A Joint Framework for Speaker Identity and Emotional Style Transfer. ASRU 2021: 594-601
- [c601]Sergey Nikonorov, Berrak Sisman, Mingyang Zhang, Haizhou Li:
DEEPA: A Deep Neural Analyzer for Speech and Singing Vocoding. ASRU 2021: 618-625
- [c600]Marvin Borsdorf, Haizhou Li, Tanja Schultz:
Target Language Extraction at Multilingual Cocktail Parties. ASRU 2021: 717-724
- [c599]Enze Su, Siqi Cai, Peiwen Li, Longhan Xie, Haizhou Li:
Auditory Attention Detection with EEG Channel Attention. EMBC 2021: 5804-5807
- [c598]Siqi Cai, Pengcheng Sun, Tanja Schultz, Haizhou Li:
Low-Latency Auditory Spatial Attention Detection Based on Spectro-Spatial Features from EEG. EMBC 2021: 5812-5815
- [c597]Yiming Chen, Yan Zhang, Chen Zhang, Grandee Lee, Ran Cheng, Haizhou Li:
Revisiting Self-training for Few-shot Learning of Language Model. EMNLP (1) 2021: 9125-9135
- [c596]Nana Hou, Chenglin Xu, Eng Siong Chng, Haizhou Li:
Learning Disentangled Feature Representations for Speech Enhancement Via Adversarial Training. ICASSP 2021: 666-670
- [c595]Kun Zhou, Berrak Sisman, Rui Liu, Haizhou Li:
Seen and Unseen Emotional Style Transfer for Voice Conversion with A New Emotional Speech Dataset. ICASSP 2021: 920-924
- [c594]Xinyuan Qian, Maulik C. Madhavi, Zexu Pan, Jiadong Wang, Haizhou Li:
Multi-Target DoA Estimation with an Audio-Visual Fusion Mechanism. ICASSP 2021: 4280-4284
- [c593]Rui Liu, Berrak Sisman, Haizhou Li:
Graphspeech: Syntax-Aware Graph Attention Network for Neural Speech Synthesis. ICASSP 2021: 6059-6063
- [c592]Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li:
Multi-Stage Speaker Extraction with Utterance and Frame-Level Reference Signals. ICASSP 2021: 6109-6113
- [c591]Lili Guo, Longbiao Wang, Chenglin Xu, Jianwu Dang, Eng Siong Chng, Haizhou Li:
Representation Learning with Spectro-Temporal-Channel Attention for Speech Emotion Recognition. ICASSP 2021: 6304-6308
- [c590]Rohan Kumar Das, Jichen Yang, Haizhou Li:
Data Augmentation with Signal Companding for Detection of Logical Access Attacks. ICASSP 2021: 6349-6353
- [c589]Zexu Pan, Ruijie Tao, Chenglin Xu, Haizhou Li:
Muse: Multi-Modal Target Speaker Extraction with Visual Cues. ICASSP 2021: 6678-6682
- [c588]Bidisha Sharma, Maulik C. Madhavi, Haizhou Li:
Leveraging Acoustic and Linguistic Embeddings from Pretrained Speech and Language Models for Intent Classification. ICASSP 2021: 7498-7502
- [c587]Qicong Xie, Xiaohai Tian, Guanghou Liu, Kun Song, Lei Xie, Zhiyong Wu, Hai Li, Song Shi, Haizhou Li, Fen Hong, Hui Bu, Xin Xu:
The Multi-Speaker Multi-Style Voice Cloning Challenge 2021. ICASSP 2021: 8613-8617
- [c586]Huiping Zhuang, Zhenyu Weng, Fulin Luo, Kar-Ann Toj, Haizhou Li, Zhiping Lin:
Accumulated Decoupled Learning with Gradient Staleness Mitigation for Convolutional Neural Networks. ICML 2021: 12935-12944
- [c585]Jiadong Wang, Xinyuan Qian, Zihan Pan, Malu Zhang, Haizhou Li:
GCC-PHAT with Speech-oriented Attention for Robotic Sound Source Localization. ICRA 2021: 5876-5883
- [c584]Qu Yang, Jibin Wu, Haizhou Li:
Rethinking Benchmarks for Neuromorphic Learning Algorithms. IJCNN 2021: 1-8
- [c583]Hongning Zhu, Kong Aik Lee, Haizhou Li:
Serialized Multi-Layer Multi-Head Attention for Neural Speaker Embedding. Interspeech 2021: 106-110
- [c582]Qiquan Zhang, Qi Song, Aaron Nicolson, Tian Lan, Haizhou Li:
Temporal Convolutional Network with Frequency Dimension Adaptive Attention for Speech Enhancement. Interspeech 2021: 166-170
- [c581]Xianghu Yue, Haizhou Li:
Phonetically Motivated Self-Supervised Speech Representation Learning. Interspeech 2021: 746-750
- [c580]Kun Zhou, Berrak Sisman, Haizhou Li:
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-Stage Sequence-to-Sequence Training. Interspeech 2021: 811-815
- [c579]Rohan Kumar Das, Maulik C. Madhavi, Haizhou Li:
Diagnosis of COVID-19 Using Auditory Acoustic Cues. Interspeech 2021: 921-925
- [c578]Li Zhang, Qing Wang, Kong Aik Lee, Lei Xie, Haizhou Li:
Multi-Level Transfer Learning from Near-Field to Far-Field Speaker Verification. Interspeech 2021: 1094-1098
- [c577]Yi Zhou, Xiaohai Tian, Zhizheng Wu, Haizhou Li:
Cross-Lingual Voice Conversion with a Cycle Consistency Loss on Linguistic Representation. Interspeech 2021: 1374-1378
- [c576]Marvin Borsdorf, Chenglin Xu, Haizhou Li, Tanja Schultz:
Universal Speaker Extraction in the Presence and Absence of Target Speakers for Speech of One and Two Talkers. Interspeech 2021: 1469-1473
- [c575]Wupeng Wang, Chenglin Xu, Meng Ge, Haizhou Li:
Neural Speaker Extraction with Speaker-Speech Cross-Attention Network. Interspeech 2021: 3535-3539
- [c574]Marvin Borsdorf, Chenglin Xu, Haizhou Li, Tanja Schultz:
GlobalPhone Mix-To-Separate Out of 2: A Multilingual 2000 Speakers Mixtures Database for Speech Separation. Interspeech 2021: 3905-3909
- [c573]Rui Liu, Berrak Sisman, Haizhou Li:
Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability. Interspeech 2021: 4648-4652
- [c572]Yidi Jiang, Bidisha Sharma, Maulik C. Madhavi, Haizhou Li:
Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification. Interspeech 2021: 4713-4717
- [c571]Meidan Ouyang, Rohan Kumar Das, Jichen Yang, Haizhou Li:
Capsule Network based End-to-end System for Detection of Replay Attacks. ISCSLP 2021: 1-5
- [c570]Ruijie Tao, Zexu Pan, Rohan Kumar Das, Xinyuan Qian, Mike Zheng Shou, Haizhou Li:
Is Someone Speaking?: Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection. ACM Multimedia 2021: 3927-3935
- [c569]Xinyuan Qian, Bidisha Sharma, Amine El Abridi, Haizhou Li:
SLoClas: A Database for Joint Sound Localization and Classification. O-COCOSDA 2021: 128-133
- [c568]Haizhou Li, Gina-Anne Levow, Zhou Yu, Chitralekha Gupta, Berrak Sisman, Siqi Cai, David Vandyke, Nina Dethlefs, Yan Wu, Junyi Jessy Li:
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue. SIGDIAL 2021
- [c567]Kun Zhou, Berrak Sisman, Haizhou Li:
Vaw-Gan For Disentanglement And Recomposition Of Emotional Elements In Speech. SLT 2021: 415-422
- [c566]Hongqiang Du, Xiaohai Tian, Lei Xie, Haizhou Li:
Optimizing Voice Conversion Network with Cycle Consistency Loss of Speaker Identity. SLT 2021: 507-513
- [e20]Deyi Xiong, Ridong Jiang, Yanfeng Lu, Minghui Dong, Haizhou Li:
International Conference on Asian Language Processing, IALP 2021, Singapore, December 11-13, 2021. IEEE 2021, ISBN 978-1-6654-8311-7
- [e19]Erik Marchi, Sabato Marco Siniscalchi, Sandro Cumani, Valerio Mario Salerno, Haizhou Li:
Increasing Naturalness and Flexibility in Spoken Dialogue Interaction - 10th International Workshop on Spoken Dialogue Systems, IWSDS 2019, Syracuse, Sicily, Italy, 24-26 April 2019. Lecture Notes in Electrical Engineering 714, Springer 2021, ISBN 978-981-15-9322-2
- [e18]Haizhou Li, Gina-Anne Levow, Zhou Yu, Chitralekha Gupta, Berrak Sisman, Siqi Cai, David Vandyke, Nina Dethlefs, Yan Wu, Junyi Jessy Li:
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGdial 2021, Singapore and Online, July 29-31, 2021. Association for Computational Linguistics 2021, ISBN 978-1-954085-81-7
- [e17]Haizhou Li, Shuzhi Sam Ge, Yan Wu, Agnieszka Wykowska, Hongsheng He, Xiaorui Liu, Dongyu Li, Jairo Pérez-Osorio:
Social Robotics - 13th International Conference, ICSR 2021, Singapore, November 10-13, 2021, Proceedings. Lecture Notes in Computer Science 13086, Springer 2021, ISBN 978-3-030-90524-8
- [i88]Bidisha Sharma, Maulik C. Madhavi, Haizhou Li:
Leveraging Acoustic and Linguistic Embeddings from Pretrained speech and language Models for Intent Classification. CoRR abs/2102.07370 (2021)
- [i87]Siqi Cai, Pengcheng Sun, Tanja Schultz, Haizhou Li:
Low-latency auditory spatial attention detection based on spectro-spatial features from EEG. CoRR abs/2103.03621 (2021)
- [i86]Chenglin Xu, Wei Rao, Jibin Wu, Haizhou Li:
Target Speaker Verification with Selective Auditory Attention for Single and Multi-talker Speech. CoRR abs/2103.16269 (2021)
- [i85]Kun Zhou, Berrak Sisman, Haizhou Li:
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training. CoRR abs/2103.16809 (2021)
- [i84]Rui Liu, Berrak Sisman, Haizhou Li:
Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability. CoRR abs/2104.01408 (2021)
- [i83]Xinyuan Qian, Maulik C. Madhavi, Zexu Pan, Jiadong Wang, Haizhou Li:
Multi-target DoA Estimation with an Audio-visual Fusion Mechanism. CoRR abs/2105.06107 (2021)
- [i82]Kun Zhou, Berrak Sisman, Rui Liu, Haizhou Li:
Emotional Voice Conversion: Theory, Databases and ESD. CoRR abs/2105.14762 (2021)
- [i81]Chen Zhang, Yiming Chen, Luis Fernando D'Haro, Yan Zhang, Thomas Friedrichs, Grandee Lee, Haizhou Li:
DynaEval: Unifying Turn and Dialogue Level Evaluation. CoRR abs/2106.01112 (2021)
- [i80]Li Zhang, Qing Wang, Kong Aik Lee, Lei Xie, Haizhou Li:
Multi-Level Transfer Learning from Near-Field to Far-Field Speaker Verification. CoRR abs/2106.09320 (2021)
- [i79]Zongyang Du, Berrak Sisman, Kun Zhou, Haizhou Li:
Expressive Voice Conversion: A Joint Framework for Speaker Identity and Emotional Style Transfer. CoRR abs/2107.03748 (2021)
- [i78]Hongning Zhu, Kong Aik Lee, Haizhou Li:
Serialized Multi-Layer Multi-Head Attention for Neural Speaker Embedding. CoRR abs/2107.06493 (2021)
- [i77]Ruijie Tao, Zexu Pan, Rohan Kumar Das, Xinyuan Qian, Mike Zheng Shou, Haizhou Li:
Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection. CoRR abs/2107.06592 (2021)
- [i76]Xinyuan Qian, Bidisha Sharma, Amine El Abridi, Haizhou Li:
SLoClas: A Database for Joint Sound Localization and Classification. CoRR abs/2108.02539 (2021)
- [i75]Yidi Jiang, Bidisha Sharma, Maulik C. Madhavi, Haizhou Li:
Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification. CoRR abs/2108.02598 (2021)
- [i74]Bidisha Sharma, Maulik C. Madhavi, Xuehao Zhou, Haizhou Li:
Exploring Teacher-Student Learning Approach for Multi-lingual Speech-to-Intent Classification. CoRR abs/2109.13486 (2021)
- [i73]