Yifan Gong 0001
Person information
- affiliation: Microsoft Corporation, Redmond, WA, USA
- affiliation: Texas Instruments Inc., Dallas, TX, USA
- affiliation: INRIA-Lorraine, Nancy, France
- affiliation (PhD): Henri Poincaré University, Department of Mathematics and Computer Science, Nancy, France
Other persons with the same name
- Yifan Gong 0002 — Chongqing University of Technology, Institute of Electrical and Electronic Engineering, China
- Yifan Gong 0003 — TuSimple, Beijing, China (and 1 more)
- Yifan Gong 0004 — Northeastern University, Boston, MA, USA (and 1 more)
- Yifan Gong 0005 — Xi'an Jiaotong University, School of Microelectronics, China
- Yifan Gong 0006 — Chinese Academy of Sciences, National Space Science Center, Beijing, China
- Yifan Gong 0007 — Hangzhou Dianzi University, School of Computer Science, China
- Yifan Gong 0008 — University of California, Design Lab, CA, USA
Journal Articles
- 2023
- [j23] Dong Yu, Yifan Gong, Michael A. Picheny, Bhuvana Ramabhadran, Dilek Hakkani-Tür, Rohit Prasad, Heiga Zen, Jan Skoglund, Jan Honza Cernocký, Lukás Burget, Abdelrahman Mohamed:
Twenty-Five Years of Evolution in Speech and Language Processing. IEEE Signal Process. Mag. 40(5): 27-39 (2023)
- 2021
- [j22] Liang Lu, Naoyuki Kanda, Jinyu Li, Yifan Gong:
Streaming End-to-End Multi-Talker Speech Recognition. IEEE Signal Process. Lett. 28: 803-807 (2021)
- [j21] Peidong Wang, Zhuo Chen, DeLiang Wang, Jinyu Li, Yifan Gong:
Speaker Separation Using Speaker Inventories and Estimated Speech. IEEE ACM Trans. Audio Speech Lang. Process. 29: 537-546 (2021)
- 2019
- [j20] Amit Das, Jinyu Li, Guoli Ye, Rui Zhao, Yifan Gong:
Advancing Acoustic-to-Word CTC Model With Attention and Mixed-Units. IEEE ACM Trans. Audio Speech Lang. Process. 27(12): 1880-1892 (2019)
- 2014
- [j19] Kaisheng Yao, Dong Yu, Li Deng, Yifan Gong:
A fast maximum likelihood nonlinear feature transformation method for GMM-HMM speaker adaptation. Neurocomputing 128: 145-152 (2014)
- [j18] Jinyu Li, Li Deng, Yifan Gong, Reinhold Haeb-Umbach:
An Overview of Noise-Robust Automatic Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 22(4): 745-777 (2014)
- 2009
- [j17] Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alex Acero:
A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions. Comput. Speech Lang. 23(3): 389-405 (2009)
- [j16] Dong Yu, Li Deng, Yifan Gong, Alex Acero:
A Novel Framework and Training Algorithm for Variable-Parameter Hidden Markov Models. IEEE Trans. Speech Audio Process. 17(7): 1348-1360 (2009)
- 2008
- [j15] Dong Yu, Li Deng, Jasha Droppo, Jian Wu, Yifan Gong, Alex Acero:
Robust Speech Recognition Using a Cepstral Minimum-Mean-Square-Error-Motivated Noise Suppressor. IEEE Trans. Speech Audio Process. 16(5): 1061-1070 (2008)
- 2007
- [j14] Xiaodong Cui, Yifan Gong:
A Study of Variable-Parameter Gaussian Mixture Hidden Markov Modeling for Noisy Speech Recognition. IEEE Trans. Speech Audio Process. 15(4): 1366-1376 (2007)
- 2005
- [j13] Yifan Gong:
A Method of Joint Compensation of Additive and Convolutive Distortions for Speaker-Independent Speech Recognition. IEEE Trans. Speech Audio Process. 13(5-2): 975-983 (2005)
- 2002
- [j12] Yifan Gong:
Noise-dependent Gaussian mixture classifiers for robust rejection decision. IEEE Trans. Speech Audio Process. 10(2): 57-64 (2002)
- 1999
- [j11] Mohamed Afify, Yifan Gong, Jean-Paul Haton:
A minimum cross-entropy approach to hidden Markov model adaptation. IEEE Signal Process. Lett. 6(6): 132-134 (1999)
- 1998
- [j10] Jan P. Verhasselt, Irina Illina, Jean-Pierre Martens, Yifan Gong, Jean Paul Haton:
Assessing the importance of the segmentation probability in segment-based speech recognition. Speech Commun. 24(1): 51-72 (1998)
- [j9] Irina Illina, Mohamed Afify, Yifan Gong:
Environment normalization training and environment adaptation using mixture stochastic trajectory model. Speech Commun. 26(4): 245-258 (1998)
- [j8] Mohamed Afify, Yifan Gong, Jean Paul Haton:
A general joint additive and convolutive bias compensation approach applied to noisy Lombard speech recognition. IEEE Trans. Speech Audio Process. 6(6): 524-538 (1998)
- 1997
- [j7] Yifan Gong:
Stochastic trajectory modeling and sentence searching for continuous speech recognition. IEEE Trans. Speech Audio Process. 5(1): 33-44 (1997)
- 1996
- [j6] Mohamed Afify, Yifan Gong, Jean Paul Haton:
Estimation of mixtures of stochastic dynamic trajectories: application to continuous speech recognition. Comput. Speech Lang. 10(1): 23-36 (1996)
- [j5] Olivier Siohan, Yifan Gong, Jean Paul Haton:
Comparative experiments of several adaptation approaches to noisy speech recognition using stochastic trajectory models. Speech Commun. 18(4): 335-352 (1996)
- 1995
- [j4] Yifan Gong:
Speech recognition in noisy environments: A survey. Speech Commun. 16(3): 261-291 (1995)
- 1993
- [j3] Yifan Gong, Jean Paul Haton:
Plausibility functions in continuous speech recognition: The VINICS system. Speech Commun. 13(1-2): 187-196 (1993)
- 1991
- [j2] Yifan Gong, Jean Paul Haton:
Signal-to-String Conversion Based on High Likelihood Regions Using Embedded Dynamic Programming. IEEE Trans. Pattern Anal. Mach. Intell. 13(3): 297-302 (1991)
- 1987
- [j1] Yifan Gong, Jean Paul Haton:
Time domain harmonic matching pitch estimation using time-dependent speech modeling. IEEE Trans. Acoust. Speech Signal Process. 35(10): 1386-1400 (1987)
Conference and Workshop Papers
- 2024
- [c184] Shaoshi Ling, Yuxuan Hu, Shuangbei Qian, Guoli Ye, Yao Qian, Yifan Gong, Ed Lin, Michael Zeng:
Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition. ICASSP 2024: 11046-11050
- [c183] Jing Yao, Xiaoyuan Yi, Yifan Gong, Xiting Wang, Xing Xie:
Value FULCRA: Mapping Large Language Models to the Multidimensional Spectrum of Basic Human Value. NAACL-HLT 2024: 8762-8785
- 2023
- [c182] Yan Huang, Piyush Behre, Guoli Ye, Shawn Chang, Yifan Gong:
Multi Transcription-Style Speech Transcription Using Attention-Based Encoder-Decoder Model. ASRU 2023: 1-6
- [c181] Eric Sun, Jinyu Li, Yuxuan Hu, Yimeng Zhu, Long Zhou, Jian Xue, Peidong Wang, Linquan Liu, Shujie Liu, Edward Lin, Yifan Gong:
Building High-Accuracy Multilingual ASR With Gated Language Experts and Curriculum Training. ASRU 2023: 1-7
- 2022
- [c180] Liang Lu, Jinyu Li, Yifan Gong:
Endpoint Detection for Streaming End-to-End Multi-Talker ASR. ICASSP 2022: 7312-7316
- [c179] Guoli Ye, Vadim Mazalov, Jinyu Li, Yifan Gong:
Have Best of Both Worlds: Two-Pass Hybrid and E2E Cascading Framework for Speech Recognition. ICASSP 2022: 7432-7436
- [c178] Zhong Meng, Yashesh Gaur, Naoyuki Kanda, Jinyu Li, Xie Chen, Yu Wu, Yifan Gong:
Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition. INTERSPEECH 2022: 2608-2612
- [c177] Yashesh Gaur, Nick Kibre, Jian Xue, Kangyuan Shu, Yuhui Wang, Issac Alphanso, Jinyu Li, Yifan Gong:
Streaming, Fast and Accurate on-Device Inverse Text Normalization for Automatic Speech Recognition. SLT 2022: 237-244
- [c176] Jeremy Heng Meng Wong, Yifan Gong:
Joint Speaker Diarisation and Tracking in Switching State-Space Model. SLT 2022: 605-612
- [c175] Jeremy Heng Meng Wong, Igor Abramovski, Xiong Xiao, Yifan Gong:
Diarisation Using Location Tracking with Agglomerative Clustering. SLT 2022: 613-619
- 2021
- [c174] Rui Zhao, Jian Xue, Jinyu Li, Wenning Wei, Lei He, Yifan Gong:
On Addressing Practical Challenges for RNN-Transducer. ASRU 2021: 526-533
- [c173] Eric Sun, Liang Lu, Zhong Meng, Yifan Gong:
Sequence-Level Self-Teaching Regularization. ICASSP 2021: 2945-2949
- [c172] Xiong Xiao, Naoyuki Kanda, Zhuo Chen, Tianyan Zhou, Takuya Yoshioka, Sanyuan Chen, Yong Zhao, Gang Liu, Yu Wu, Jian Wu, Shujie Liu, Jinyu Li, Yifan Gong:
Microsoft Speaker Diarization System for the Voxceleb Speaker Recognition Challenge 2020. ICASSP 2021: 5824-5828
- [c171] Jeremy Heng Meng Wong, Dimitrios Dimitriadis, Ken'ichi Kumatani, Yashesh Gaur, George Polovets, Partha Parthasarathy, Eric Sun, Jinyu Li, Yifan Gong:
Ensemble Combination between Different Time Segmentations. ICASSP 2021: 6768-6772
- [c170] Jeremy Heng Meng Wong, Xiong Xiao, Yifan Gong:
Hidden Markov Model Diarisation with Speaker Location Information. ICASSP 2021: 7158-7162
- [c169] Zhong Meng, Naoyuki Kanda, Yashesh Gaur, Sarangarajan Parthasarathy, Eric Sun, Liang Lu, Xie Chen, Jinyu Li, Yifan Gong:
Internal Language Model Training for Domain-Adaptive End-To-End Speech Recognition. ICASSP 2021: 7338-7342
- [c168] Yan Deng, Rui Zhao, Zhong Meng, Xie Chen, Bing Liu, Jinyu Li, Yifan Gong, Lei He:
Improving RNN-T for Domain Scaling Using Semi-Supervised Training with Neural TTS. Interspeech 2021: 751-755
- [c167] Yan Huang, Guoli Ye, Jinyu Li, Yifan Gong:
Rapid Speaker Adaptation for Conformer Transducer: Attention and Bias Are All You Need. Interspeech 2021: 1309-1313
- [c166] Vikas Joshi, Amit Das, Eric Sun, Rupesh R. Mehta, Jinyu Li, Yifan Gong:
Multiple Softmax Architecture for Streaming Multilingual End-to-End ASR Systems. Interspeech 2021: 1767-1771
- [c165] Liang Lu, Naoyuki Kanda, Jinyu Li, Yifan Gong:
Streaming Multi-Talker Speech Recognition with Joint Speaker Identification. Interspeech 2021: 1782-1786
- [c164] Zhong Meng, Yu Wu, Naoyuki Kanda, Liang Lu, Xie Chen, Guoli Ye, Eric Sun, Jinyu Li, Yifan Gong:
Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition. Interspeech 2021: 2596-2600
- [c163] Liang Lu, Zhong Meng, Naoyuki Kanda, Jinyu Li, Yifan Gong:
On Minimum Word Error Rate Training of the Hybrid Autoregressive Transducer. Interspeech 2021: 3435-3439
- [c162] Eric Sun, Jinyu Li, Zhong Meng, Yu Wu, Jian Xue, Shujie Liu, Yifan Gong:
Improving Multilingual Transformer Transducer Models by Reducing Language Confusions. Interspeech 2021: 3470-3474
- [c161] Zhong Meng, Sarangarajan Parthasarathy, Eric Sun, Yashesh Gaur, Naoyuki Kanda, Liang Lu, Xie Chen, Rui Zhao, Jinyu Li, Yifan Gong:
Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition. SLT 2021: 243-250
- 2020
- [c160] Hirofumi Inaguma, Yashesh Gaur, Liang Lu, Jinyu Li, Yifan Gong:
Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR. ICASSP 2020: 6064-6068
- [c159] Hu Hu, Rui Zhao, Jinyu Li, Liang Lu, Yifan Gong:
Exploring Pre-Training with Alignments for RNN Transducer Based End-to-End Speech Recognition. ICASSP 2020: 7079-7083
- [c158] Zhong Meng, Hu Hu, Jinyu Li, Changliang Liu, Yan Huang, Yifan Gong, Chin-Hui Lee:
L-Vector: Neural Label Embedding for Domain Adaptation. ICASSP 2020: 7389-7393
- [c157] Yan Huang, Yifan Gong:
Acoustic Model Adaptation for Presentation Transcription and Intelligent Meeting Assistant Systems. ICASSP 2020: 7394-7398
- [c156] Yan Huang, Lei He, Wenning Wei, William Gale, Jinyu Li, Yifan Gong:
Using Personalized Speech Synthesis and Neural Language Generator for Rapid Speaker Adaptation. ICASSP 2020: 7399-7403
- [c155] Eva Sharma, Guoli Ye, Wenning Wei, Rui Zhao, Yao Tian, Jian Wu, Lei He, Ed Lin, Yifan Gong:
Adaptation of RNN Transducer with Text-To-Speech Technology for Keyword Spotting. ICASSP 2020: 7484-7488
- [c154] Jinyu Li, Rui Zhao, Eric Sun, Jeremy Heng Meng Wong, Amit Das, Zhong Meng, Yifan Gong:
High-Accuracy and Low-Latency Speech Recognition with Two-Head Contextual Layer Trajectory LSTM Model. ICASSP 2020: 7699-7703
- [c153] Yan Huang, Jinyu Li, Lei He, Wenning Wei, William Gale, Yifan Gong:
Rapid RNN-T Adaptation Using Personalized Speech Synthesis and Neural Language Generator. INTERSPEECH 2020: 1256-1260
- [c152] Kshitiz Kumar, Bo Ren, Yifan Gong, Jian Wu:
Bandpass Noise Generation and Augmentation for Unified ASR. INTERSPEECH 2020: 1683-1687
- [c151] Jeremy Heng Meng Wong, Yashesh Gaur, Rui Zhao, Liang Lu, Eric Sun, Jinyu Li, Yifan Gong:
Combination of End-to-End and Hybrid Models for Speech Recognition. INTERSPEECH 2020: 1783-1787
- [c150] Kshitiz Kumar, Chaojun Liu, Yifan Gong, Jian Wu:
1-D Row-Convolution LSTM: Fast Streaming ASR at Accuracy Parity with LC-BLSTM. INTERSPEECH 2020: 2107-2111
- [c149] Jinyu Li, Rui Zhao, Zhong Meng, Yanqing Liu, Wenning Wei, Sarangarajan Parthasarathy, Vadim Mazalov, Zhenghao Wang, Lei He, Sheng Zhao, Yifan Gong:
Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability. INTERSPEECH 2020: 3590-3594
- [c148] Liang Lu, Changliang Liu, Jinyu Li, Yifan Gong:
Exploring Transformers for Large-Scale Speech Recognition. INTERSPEECH 2020: 5041-5045
- 2019
- [c147] Jinyu Li, Rui Zhao, Hu Hu, Yifan Gong:
Improving RNN Transducer Modeling for End-to-End Speech Recognition. ASRU 2019: 114-121
- [c146] Zhong Meng, Jinyu Li, Yashesh Gaur, Yifan Gong:
Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition. ASRU 2019: 268-275
- [c145] Takuya Yoshioka, Yan Huang, Aviv Hurvitz, Li Jiang, Sharon Koubi, Eyal Krupka, Ido Leichter, Changliang Liu, Partha Parthasarathy, Alon Vinnikov, Lingfeng Wu, Igor Abramovski, Xiong Xiao, Wayne Xiong, Huaming Wang, Zhenghao Wang, Jun Zhang, Yong Zhao, Tianyan Zhou, Cem Aksoylar, Zhuo Chen, Moshe David, Dimitrios Dimitriadis, Yifan Gong, Ilya Gurvich, Xuedong Huang:
Advances in Online Audio-Visual Meeting Transcription. ASRU 2019: 276-283
- [c144] Tianyan Zhou, Yong Zhao, Jinyu Li, Yifan Gong, Jian Wu:
CNN with Phonetic Attention for Text-Independent Speaker Verification. ASRU 2019: 718-725
- [c143] Zhong Meng, Yashesh Gaur, Jinyu Li, Yifan Gong:
Character-Aware Attention-Based End-to-End Speech Recognition. ASRU 2019: 949-955
- [c142] Xiong Xiao, Zhuo Chen, Takuya Yoshioka, Hakan Erdogan, Changliang Liu, Dimitrios Dimitriadis, Jasha Droppo, Yifan Gong:
Single-channel Speech Extraction Using Speaker Inventory and Attention Network. ICASSP 2019: 86-90
- [c141] Kshitiz Kumar, Tasos Anastasakos, Yifan Gong:
Word Characters and Phone Pronunciation Embedding for ASR Confidence Classifier. ICASSP 2019: 2712-2716
- [c140] Kshitiz Kumar, Yifan Gong:
Static and Dynamic State Predictions for Acoustic Model Combination. ICASSP 2019: 2782-2786
- [c139] Amit Das, Jinyu Li, Changliang Liu, Yifan Gong:
Universal Acoustic Modeling Using Neural Mixture Models. ICASSP 2019: 5681-5685
- [c138] Shi-Xiong Zhang, Yifan Gong, Dong Yu:
Encrypted Speech Recognition Using Deep Polynomial Networks. ICASSP 2019: 5691-5695
- [c137] Zhong Meng, Jinyu Li, Yifan Gong:
Adversarial Speaker Adaptation. ICASSP 2019: 5721-5725
- [c136] Ke Li, Jinyu Li, Guoli Ye, Rui Zhao, Yifan Gong:
Towards Code-switching ASR for End-to-end CTC Models. ICASSP 2019: 6076-6080
- [c135] Zhong Meng, Yong Zhao, Jinyu Li, Yifan Gong:
Adversarial Speaker Verification. ICASSP 2019: 6216-6220
- [c134] Zhong Meng, Jinyu Li, Yong Zhao, Yifan Gong:
Conditional Teacher-student Learning. ICASSP 2019: 6445-6449
- [c133] Jinyu Li, Liang Lu, Changliang Liu, Yifan Gong:
Improving Layer Trajectory LSTM with Future Context Frames. ICASSP 2019: 6550-6554
- [c132] Zhong Meng, Jinyu Li, Yifan Gong:
Attentive Adversarial Learning for Domain-invariant Training. ICASSP 2019: 6740-6744
- [c131] Zhong Meng, Yashesh Gaur, Jinyu Li, Yifan Gong:
Speaker Adaptation for Attention-Based End-to-End Speech Recognition. INTERSPEECH 2019: 241-245
- [c130] Eric Sun, Jinyu Li, Yifan Gong:
Layer Trajectory BLSTM. INTERSPEECH 2019: 1403-1407
- [c129] Yashesh Gaur, Jinyu Li, Zhong Meng, Yifan Gong:
Acoustic-to-Phrase Models for Speech Recognition. INTERSPEECH 2019: 2240-2244
- [c128] Liang Lu, Eric Sun, Yifan Gong:
Self-Teaching Networks. INTERSPEECH 2019: 2798-2802
- 2018
- [c127] Amit Das, Jinyu Li, Rui Zhao, Yifan Gong:
Advancing Connectionist Temporal Classification with Attention Modeling. ICASSP 2018: 4769-4773
- [c126] Zhuo Chen, Takuya Yoshioka, Xiong Xiao, Linyu Li, Michael L. Seltzer, Yifan Gong:
Efficient Integration of Fixed Beamformers and Speech Separation Networks for Multi-Channel Far-Field Speech Separation. ICASSP 2018: 5384-5388
- [c125] Jinyu Li, Rui Zhao, Zhuo Chen, Changliang Liu, Xiong Xiao, Guoli Ye, Yifan Gong:
Developing Far-Field Speaker System Via Teacher-Student Learning. ICASSP 2018: 5699-5703
- [c124] Jinyu Li, Guoli Ye, Amit Das, Rui Zhao, Yifan Gong:
Advancing Acoustic-to-Word CTC Model. ICASSP 2018: 5794-5798
- [c123] Zhong Meng, Jinyu Li, Yifan Gong, Biing-Hwang Juang:
Adversarial Teacher-Student Learning for Unsupervised Domain Adaptation. ICASSP 2018: 5949-5953
- [c122] Zhong Meng, Jinyu Li, Zhuo Chen, Yang Zhao, Vadim Mazalov, Yifan Gong, Biing-Hwang Juang:
Speaker-Invariant Training Via Adversarial Learning. ICASSP 2018: 5969-5973
- [c121] Yong Zhao, Jinyu Li, Shi-Xiong Zhang, Liping Chen, Yifan Gong:
Domain and Speaker Adaptation for Cortana Speech Recognition. ICASSP 2018: 5984-5988
- [c120] Zhong Meng, Jinyu Li, Yifan Gong, Biing-Hwang Fred Juang:
Cycle-Consistent Speech Enhancement. INTERSPEECH 2018: 1165-1169
- [c119] Jinyu Li, Changliang Liu, Yifan Gong:
Layer Trajectory LSTM. INTERSPEECH 2018: 1768-1772
- [c118] Zhong Meng, Jinyu Li, Yifan Gong, Biing-Hwang Fred Juang:
Adversarial Feature-Mapping for Speech Enhancement. INTERSPEECH 2018: 3259-3263
- [c117] Jinyu Li, Liang Lu, Changliang Liu, Yifan Gong:
Exploring Layer Trajectory LSTM with Depth Processing Units and Attention. SLT 2018: 456-462
- [c116] Ke Li, Jinyu Li, Yong Zhao, Kshitiz Kumar, Yifan Gong:
Speaker Adaptation for End-to-End CTC Models. SLT 2018: 542-549
- [c115] Zhuo Chen, Xiong Xiao, Takuya Yoshioka, Hakan Erdogan, Jinyu Li, Yifan Gong:
Multi-Channel Overlapped Speech Recognition with Location Guided Speech Extraction Network. SLT 2018: 558-565
- 2017
- [c114] Jinyu Li, Guoli Ye, Rui Zhao, Jasha Droppo, Yifan Gong:
Acoustic-to-word model without OOV. ASRU 2017: 111-117
- [c113] Zhong Meng, Zhuo Chen, Vadim Mazalov, Jinyu Li, Yifan Gong:
Unsupervised adaptation with domain separation networks for robust speech recognition. ASRU 2017: 214-221
- [c112] Zhuo Chen, Jinyu Li, Xiong Xiao, Takuya Yoshioka, Huaming Wang, Zhenghao Wang, Yifan Gong:
Cracking the cocktail party problem by multi-beam deep attractor network. ASRU 2017: 437-444
- [c111] Jinyu Li, Yan Huang, Yifan Gong:
Improved cepstra minimum-mean-square-error noise reduction algorithm for robust speech recognition. ICASSP 2017: 4865-4869
- [c110] Yong Zhao, Jinyu Li, Kshitiz Kumar, Yifan Gong:
Extended low-rank plus diagonal adaptation for deep and recurrent neural networks. ICASSP 2017: 5040-5044
- [c109] Jinyu Li, Michael L. Seltzer, Xi Wang, Rui Zhao, Yifan Gong:
Large-Scale Domain Adaptation via Teacher-Student Learning. INTERSPEECH 2017: 2386-2390
- [c108] Zhuo Chen, Yan Huang, Jinyu Li, Yifan Gong:
Improving Mask Learning Based Speech Enhancement System with Restoration Layers and Residual Connection. INTERSPEECH 2017: 3632-3636
- [c107] Michael Levit, Yan Huang, Shuangyu Chang, Yifan Gong:
Don't Count on ASR to Transcribe for You: Breaking Bias with Two Crowds. INTERSPEECH 2017: 3941-3945
- 2016
- [c106] Yajie Miao, Jinyu Li, Yongqiang Wang, Shi-Xiong Zhang, Yifan Gong:
Simplifying long short-term memory acoustic models for fast training and decoding. ICASSP 2016: 2284-2288
- [c105] Jinyu Li, Abdelrahman Mohamed, Geoffrey Zweig, Yifan Gong:
Exploring multidimensional lstms for large vocabulary ASR. ICASSP 2016: 4940-4944
- [c104] Yong Zhao, Jinyu Li, Yifan Gong:
Low-rank plus diagonal adaptation for deep neural networks. ICASSP 2016: 5005-5009
- [c103] Chaojun Liu, Yongqiang Wang, Kshitiz Kumar, Yifan Gong:
Investigations on speaker adaptation of LSTM RNN models for speech recognition. ICASSP 2016: 5020-5024
- [c102] Kshitiz Kumar, Chaojun Liu, Yifan Gong:
Non-negative intermediate-layer DNN adaptation for a 10-KB speaker adaptation profile. ICASSP 2016: 5285-5289
- [c101] Guoli Ye, Chaojun Liu, Yifan Gong:
Geo-location dependent deep neural network acoustic model for speech recognition. ICASSP 2016: 5870-5874
- [c100] Shi-Xiong Zhang, Rui Zhao, Chaojun Liu, Jinyu Li, Yifan Gong:
Recurrent support vector machines for speech recognition. ICASSP 2016: 5885-5889
- [c99] Yan Huang, Yongqiang Wang, Yifan Gong:
Semi-Supervised Training in Deep Learning Acoustic Model. INTERSPEECH 2016: 3848-3852
- [c98] Shi-Xiong Zhang, Zhuo Chen, Yong Zhao, Jinyu Li, Yifan Gong:
End-to-End attention based text-dependent speaker verification. SLT 2016: 171-178
- 2015
- [c97] Jinyu Li, Abdelrahman Mohamed, Geoffrey Zweig, Yifan Gong:
LSTM time and frequency recurrence for automatic speech recognition. ASRU 2015: 187-191
- [c96] Shi-Xiong Zhang, Chaojun Liu, Kaisheng Yao, Yifan Gong:
Deep neural support vector machines for speech recognition. ICASSP 2015: 4275-4279
- [c95] Yong Zhao, Jinyu Li, Jian Xue, Yifan Gong:
Investigating online low-footprint speaker adaptation using generalized linear regression and click-through data. ICASSP 2015: 4310-4314
- [c94] Yongqiang Wang, Jinyu Li, Yifan Gong:
Small-footprint high-performance deep neural network-based speech recognition using split-VQ. ICASSP 2015: 4984-4988
- [c93] Jui-Ting Huang, Jinyu Li, Yifan Gong:
An analysis of convolutional neural networks for speech recognition. ICASSP 2015: 4989-4993
- [c92] Kaustubh Kalgaonkar, Chaojun Liu, Yifan Gong, Kaisheng Yao:
Estimating confidence scores on ASR results using recurrent neural networks. ICASSP 2015: 4999-5003
- [c91] Kshitiz Kumar, Ziad Al Bawab, Yong Zhao, Chaojun Liu, Benoît Dumoulin, Yifan Gong:
Confidence-features and confidence-scores for ASR applications in arbitration and DNN speaker adaptation. INTERSPEECH 2015: 702-706
- [c90] Yan Huang, Yifan Gong:
Regularized sequence-level deep neural network model adaptation. INTERSPEECH 2015: 1081-1085
- [c89] Kshitiz Kumar, Chaojun Liu, Kaisheng Yao, Yifan Gong:
Intermediate-layer DNN adaptation for offline and session-based iterative speaker adaptation. INTERSPEECH 2015: 1091-1095
- [c88] Kshitiz Kumar, Chaojun Liu, Yifan Gong:
Delta-melspectra features for noise robustness to DNN-based ASR systems. INTERSPEECH 2015: 2445-2448
- [c87] Changliang Liu, Jinyu Li, Yifan Gong:
SVD-based universal DNN modeling for multiple scenarios. INTERSPEECH 2015: 3269-3273
- 2014
- [c86] Jinyu Li, Jui-Ting Huang, Yifan Gong:
Factorized adaptation for deep neural network. ICASSP 2014: 5537-5541
- [c85] Jian Xue, Jinyu Li, Dong Yu, Mike Seltzer, Yifan Gong:
Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network. ICASSP 2014: 6359-6363
- [c84] Yan Huang, Malcolm Slaney, Michael L. Seltzer, Yifan Gong:
Towards better performance with heterogeneous training data in acoustic modeling using deep neural networks. INTERSPEECH 2014: 845-849
- [c83] Kshitiz Kumar, Chaojun Liu, Yifan Gong:
Normalization of ASR confidence classifier scores via confidence mapping. INTERSPEECH 2014: 1199-1203
- [c82] Yan Huang, Dong Yu, Chaojun Liu, Yifan Gong:
A comparative analytic study on the Gaussian mixture and context dependent deep neural network hidden Markov models. INTERSPEECH 2014: 1895-1899
- [c81] Jinyu Li, Rui Zhao, Jui-Ting Huang, Yifan Gong:
Learning small-size DNN with output-distribution-based criteria. INTERSPEECH 2014: 1910-1914
- [c80] Rui Zhao, Jinyu Li, Yifan Gong:
Variable-component deep neural network for robust speech recognition. INTERSPEECH 2014: 2719-2723
- [c79] Yan Huang, Dong Yu, Chaojun Liu, Yifan Gong:
Multi-accent deep neural network acoustic model with accent-specific top layer using the KLD-regularized model adaptation. INTERSPEECH 2014: 2977-2981
- [c78] Rui Zhao, Jinyu Li, Yifan Gong:
Variable-activation and variable-input deep neural network for robust speech recognition. SLT 2014: 542-547
- 2013
- [c77] Jui-Ting Huang, Jinyu Li, Dong Yu, Li Deng, Yifan Gong:
Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers. ICASSP 2013: 7304-7308
- [c76] Po-Sen Huang, Kshitiz Kumar, Chaojun Liu, Yifan Gong, Li Deng:
Predicting speech recognition confidence using deep learning with word identity and score features. ICASSP 2013: 7413-7417
- [c75] Li Deng, Jinyu Li, Jui-Ting Huang, Kaisheng Yao, Dong Yu, Frank Seide, Michael L. Seltzer, Geoffrey Zweig, Xiaodong He, Jason D. Williams, Yifan Gong, Alex Acero:
Recent advances in deep learning for speech research at Microsoft. ICASSP 2013: 8604-8608
- [c74] Yan Huang, Dong Yu, Yifan Gong, Chaojun Liu:
Semi-supervised GMM and DNN acoustic model training with multi-system combination and confidence re-calibration. INTERSPEECH 2013: 2360-2364
- [c73] Jian Xue, Jinyu Li, Yifan Gong:
Restructuring of deep neural network acoustic models with singular value decomposition. INTERSPEECH 2013: 2365-2369
- 2012
- [c72] Jinyu Li, Michael L. Seltzer, Yifan Gong:
Improvements to VTS feature enhancement. ICASSP 2012: 4677-4680
- [c71] Kaisheng Yao, Yifan Gong, Chaojun Liu:
A Feature Space Transformation Method for Personalization using Generalized I-Vector Clustering. INTERSPEECH 2012: 1352-1355
- [c70] Jinyu Li, Michael L. Seltzer, Yifan Gong:
Efficient VTS Adaptation Using Jacobian Approximation. INTERSPEECH 2012: 1906-1909
- [c69] Jinyu Li, Dong Yu, Jui-Ting Huang, Yifan Gong:
Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM. SLT 2012: 131-136
- [c68] Kaisheng Yao, Dong Yu, Frank Seide, Hang Su, Li Deng, Yifan Gong:
Adaptation of context-dependent deep neural networks for automatic speech recognition. SLT 2012: 366-369
- 2010
- [c67] Jinyu Li, Dong Yu, Yifan Gong, Li Deng:
Unscented transform with online distortion estimation for HMM adaptation. INTERSPEECH 2010: 1660-1663
- 2009
- [c66] Dong Yu, Li Deng, Peng Liu, Jian Wu, Yifan Gong, Alex Acero:
Cross-lingual speech recognition under runtime resource constraints. ICASSP 2009: 4193-4196
- [c65] Hui Lin, Li Deng, Dong Yu, Yifan Gong, Alex Acero, Chin-Hui Lee:
A study on multilingual acoustic modeling for large vocabulary ASR. ICASSP 2009: 4333-4336
- 2008
- [c64] Dong Yu, Li Deng, Jasha Droppo, Jian Wu, Yifan Gong, Alex Acero:
A minimum-mean-square-error noise reduction algorithm on Mel-frequency cepstra for robust speech recognition. ICASSP 2008: 4041-4044
- [c63] Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alex Acero:
HMM adaptation using a phase-sensitive acoustic distortion model for environment-robust speech recognition. ICASSP 2008: 4069-4072
- [c62] Jinyu Li, Li Deng, Dong Yu, Jian Wu, Yifan Gong, Alex Acero:
Adaptation of compressed HMM parameters for resource-constrained speech recognition. ICASSP 2008: 4333-4336
- [c61] Dong Yu, Li Deng, Yifan Gong, Alex Acero:
Discriminative training of variable-parameter HMMs for noise robust speech recognition. INTERSPEECH 2008: 285-288
- [c60] Dong Yu, Li Deng, Yifan Gong, Alex Acero:
Parameter clustering and sharing in variable-parameter HMMs for noise robust speech recognition. INTERSPEECH 2008: 1253-1256
- [c59] Dong Yu, Li Deng, Jian Wu, Yifan Gong, Alex Acero:
Improvements on Mel-Frequency Cepstrum Minimum-Mean-Square-Error Noise Suppressor for Robust Speech Recognition. ISCSLP 2008: 69-72
- 2007
- [c58] Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alex Acero:
High-performance hmm adaptation with joint compensation of additive and convolutive distortions via Vector Taylor Series. ASRU 2007: 65-70
- 2006
- [c57] Xiaodong Cui, Yifan Gong:
Modeling Variance Variation in a Variable Parameter HMM Framework for Noise Robust Speech Recognition. ICASSP (1) 2006: 1117-1120
- 2004
- [c56] Alexis Bernard, Yifan Gong, Xiaodong Cui:
Can back-ends be more robust than front-ends? Investigation over the Aurora-2 database. ICASSP (1) 2004: 1025-1028
- 2003
- [c55] Xiaodong Cui, Yifan Gong:
Variable parameter Gaussian mixture hidden Markov modeling for speech recognition. ICASSP (1) 2003: 12-15
- [c54] Yifan Gong:
Model-space compensation of microphone and noise for speaker-independent speech recognition. ICASSP (1) 2003: 660-663
- 2002
- [c53] Yifan Gong:
Noise-robust open-set speaker recognition using noise-dependent Gaussian mixture classifier. ICASSP 2002: 133-136
- [c52] Yifan Gong, Lorin Netsch:
Experiments on speaker-independent voice command recognition using in-vehicle hands free speech. INTERSPEECH 2002: 801-804
- [c51] Yifan Gong:
A comparative study of approximations for parallel model combination of static and dynamic parameters. INTERSPEECH 2002: 1029-1032
- [c50] Yeshwant K. Muthusamy, Yifan Gong, Roshan Gupta:
The effects of speech compression on speech recognition and text-to-speech synthesis. INTERSPEECH 2002: 2229-2232
- 2000
- [c49] Jim Kleban, Yifan Gong:
HMM adaptation and microphone array processing for distant speech recognition. ICASSP 2000: 1411-1414
- [c48] Yifan Gong, Yu-Hung Kao:
Implementing a high accuracy speaker-independent continuous speech recognizer on a fixed-point DSP. ICASSP 2000: 3686-3689
- 1999
- [c47] Coimbatore S. Ramalingam, Yifan Gong, Lorin Netsch, Wallace W. Anderson, John J. Godfrey, Yu-Hung Kao:
Speaker-dependent name dialing in a car environment with out-of-vocabulary rejection. ICASSP 1999: 165-168
- [c46] Yifan Gong, John J. Godfrey:
Transforming HMMs for speaker-independent hands-free speech recognition in the car. ICASSP 1999: 297-300
- [c45] Yeshwant K. Muthusamy, Rajeev Agarwal, Yifan Gong, Vishu Viswanathan:
Speech-enabled information retrieval in the automobile environment. ICASSP 1999: 2259-2262
- 1997
- [c44] Mohamed Afify, Yifan Gong, Jean-Paul Haton:
A unified maximum likelihood approach to acoustic mismatch compensation: application to noisy Lombard speech recognition. ICASSP 1997: 839-842
- [c43] Irina Illina, Yifan Gong:
Elimination of trajectory folding phenomenon: HMM, trajectory mixture HMM and mixture stochastic trajectory model. ICASSP 1997: 1395-1398
- [c42] Jan P. Verhasselt, Irina Illina, Jean-Pierre Martens, Yifan Gong, Jean-Paul Haton:
The importance of segmentation probability in segment based speech recognizers. ICASSP 1997: 1407-1410
- [c41] Yifan Gong:
Source normalization training for HMM applied to noisy telephone speech recognition. EUROSPEECH 1997: 1555-1558
- [c40] Irina Illina, Yifan Gong:
Speaker normalization training for mixture stochastic trajectory model. EUROSPEECH 1997: 1855-1858
- [c39] Mohamed Afify, Yifan Gong, Jean Paul Haton:
Correlation based predictive adaptation of hidden Markov models. EUROSPEECH 1997: 2059-2062
- [c38] Mohamed Afify, Yifan Gong, Jean Paul Haton:
An acoustic subword unit approach to non-linguistic speech feature identification. EUROSPEECH 1997: 2291-2294
- 1996
- [c37] Olivier Siohan, Yifan Gong:
A semi-continuous stochastic trajectory model for phoneme-based continuous speech recognition. ICASSP 1996: 471-474
- [c36] Haizhou Li, Yifan Gong, Jean-Paul Haton:
Probabilistic mapping networks for speaker recognition. ICASSP 1996: 3374-3377
- [c35] Yifan Gong, Irina Illina, Jean Paul Haton:
Modelling long term variability information in mixture stochastic trajectory framework. ICSLP 1996: 334-337
- [c34] Irina Illina, Yifan Gong:
Stochastic trajectory model with state-mixture for continuous speech recognition. ICSLP 1996: 342-345
- [c33] Xiaohui Ma, Yifan Gong, Yuqing Fu, Jiren Lu, Jean Paul Haton:
A study on continuous Chinese speech recognition based on stochastic trajectory models. ICSLP 1996: 482-485
- [c32] Irina Illina, Yifan Gong:
Improvement in n-best search for continuous speech recognition. ICSLP 1996: 2147-2150
- 1995
- [c31] George Saon, Abdel Belaïd, Yifan Gong:
Stochastic trajectory modeling for recognition of unconstrained handwritten words. ICDAR 1995: 508-511
- [c30] Haizhou Li, Jean Paul Haton, Yifan Gong:
On MMI learning of Gaussian mixture for speaker models. EUROSPEECH 1995: 363-366
- [c29] Yifan Gong:
Evaluation of Bayes decision approach to automatic determination of thresholds for speaker verification. EUROSPEECH 1995: 367-370
- [c28] Olivier Siohan, Yifan Gong, Jean Paul Haton:
Noise adaptation using linear regression for continuous noisy speech recognition. EUROSPEECH 1995: 465-468
- [c27] Mohamed Afify, Yifan Gong, Jean Paul Haton:
Stochastic trajectory models for speech recognition: an extension to modelling time correlation. EUROSPEECH 1995: 515-518
- [c26] Haizhou Li, Jean Paul Haton, Jian Su, Yifan Gong:
Speaker recognition with temporal transition models. EUROSPEECH 1995: 617-620
- 1994
- [c25] Yifan Gong, Jean Paul Haton:
Stochastic trajectory modeling for speech recognition. ICASSP (1) 1994: 57-60
- [c24] William C. Treurniet, Yifan Gong:
Noise independent speech recognition for a variety of noise types. ICASSP (1) 1994: 437-440
- [c23] Mohamed Afify, Yifan Gong, Jean Paul Haton:
Nonlinear time alignment in stochastic trajectory models for speech recognition. ICSLP 1994: 291-294
- [c22] Olivier Siohan, Yifan Gong, Jean Paul Haton:
A comparison of three noisy speech recognition approaches. ICSLP 1994: 1031-1034
- [c21] George Saon, Abdel Belaïd, Yifan Gong:
Off-line Handwriting Recognition by Statistical Correlation. MVA 1994: 371-374
- 1993
- [c20]Yifan Gong, William C. Treurniet:
Duration of phones as function of utterance length and its use in automatic speech recognition. EUROSPEECH 1993: 315-318 - [c19]Olivier Siohan, Yifan Gong, Jean Paul Haton:
A Bayesian approach to phone duration adaptation for lombard speech recognition. EUROSPEECH 1993: 1639-1642 - [c18]Yifan Gong, Jean Paul Haton:
Iterative transformation and alignment for speech labeling. EUROSPEECH 1993: 1759-1762 - [c17]Feriel Mouria, Yifan Gong, Jean Paul Haton:
Use of explicit context-dependent phonemic model in continuous speech recognition. EUROSPEECH 1993: 2223-2226 - [c16]Yifan Gong:
Base transformation for environment adaptation in continuous speech recognition. EUROSPEECH 1993: 2227-2230 - 1992
- [c15]Yifan Gong, Jean-Paul Haton:
Nonlinear vectorial interpolation for speaker recognition. ICASSP 1992: 173-176 - [c14]Yifan Gong, Anne Boyer:
Hand-written text recognition based on a new formulation. ICPR (2) 1992: 112-115 - [c13]Yifan Gong, Olivier Siohan, Jean Paul Haton:
Minimization of speech alignment error by iterative transformation for speaker adaptation. ICSLP 1992: 377-380 - [c12]Yifan Gong, Jean Paul Haton:
DTW-based phonetic labeling using explicit phoneme duration constraints. ICSLP 1992: 863-866 - 1991
- [c11]Yifan Gong, Jean-Paul Haton:
Non-linear vector interpolation by neural network for phoneme identification in continuous speech. ICASSP 1991: 121-124 - [c10]Yifan Gong, Ying Cheng, Jean-Paul Haton:
Neural network coupled with IIR sequential adapter for phoneme recognition in continuous speech. ICASSP 1991: 153-156 - [c9]Yifan Gong, Jean-Paul Haton, Feriel Mouria:
Continuous speech recognition based on high plausibility regions. ICASSP 1991: 725-728 - [c8]Yifan Gong, Jean Paul Haton:
Comparing two phoneme identification methods using a continuous speech recognizer. EUROSPEECH 1991: 417-420 - [c7]Yifan Gong, Jean Paul Haton:
VINICS: a continuous speech recognizer based on a new robust formulation. EUROSPEECH 1991: 1221-1224 - 1990
- [c6]Yifan Gong, Jean-Paul Haton:
Text-independent speaker recognition by trajectory space comparison. ICASSP 1990: 285-288 - [c5]Yifan Gong, Jean-Paul Haton:
Towards a general signal interpretation system-signal-to-symbol conversion level. ICPR (2) 1990: 79-84 - [c4]Yifan Gong, Jean-Paul Haton:
A multiknowledge base system for continuous speech understanding. ICPR (2) 1990: 224-227 - 1989
- [c3]Yifan Gong, Anne Boyer, Jean Paul Haton:
Parallel construction of syntactic structure for continuous speech recognition. EUROSPEECH 1989: 2047-2050 - 1988
- [c2]Yifan Gong, Jean-Paul Haton:
A specialist society for continuous speech understanding. ICASSP 1988: 627-630 - 1987
- [c1]Yifan Gong, Jean Paul Haton:
Phoneme-based continuous speech recognition without pre-segmentation. ECST 1987: 1121-1124
Parts in Books or Collections
- 2017
- [p1]Yifan Gong, Yan Huang, Kshitiz Kumar, Jinyu Li, Chaojun Liu, Guoli Ye, Shi-Xiong Zhang, Yong Zhao, Rui Zhao:
Challenges in and Solutions to Deep Learning Network Acoustic Modeling in Speech Recognition Products at Microsoft. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 401-417
Informal and Other Publications
- 2024
- [i52]Alon Vinnikov, Amir Ivry, Aviv Hurvitz, Igor Abramovski, Sharon Koubi, Ilya Gurvich, Shai Pe'er, Xiong Xiao, Benjamin Martinez Elizalde, Naoyuki Kanda, Xiaofei Wang, Shalev Shaer, Stav Yagev, Yossi Asher, Sunit Sivasankaran, Yifan Gong, Min Tang, Huaming Wang, Eyal Krupka:
NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription. CoRR abs/2401.08887 (2024) - 2023
- [i51]Eric Sun, Jinyu Li, Yuxuan Hu, Yimeng Zhu, Long Zhou, Jian Xue, Peidong Wang, Linquan Liu, Shujie Liu, Edward Lin, Yifan Gong:
Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training. CoRR abs/2303.00786 (2023) - [i50]Shaoshi Ling, Guoli Ye, Rui Zhao, Yifan Gong:
Hybrid Attention-based Encoder-decoder Model for Efficient Language Model Adaptation. CoRR abs/2309.07369 (2023) - [i49]Jing Yao, Xiaoyuan Yi, Xiting Wang, Yifan Gong, Xing Xie:
Value FULCRA: Mapping Large Language Models to the Multidimensional Spectrum of Basic Human Values. CoRR abs/2311.10766 (2023) - 2022
- [i48]Liang Lu, Jinyu Li, Yifan Gong:
Endpoint Detection for Streaming End-to-End Multi-talker ASR. CoRR abs/2201.09979 (2022) - [i47]Yashesh Gaur, Nick Kibre, Jian Xue, Kangyuan Shu, Yuhui Wang, Issac Alphonso, Jinyu Li, Yifan Gong:
Streaming, fast and accurate on-device Inverse Text Normalization for Automatic Speech Recognition. CoRR abs/2211.03721 (2022) - 2021
- [i46]Zhong Meng, Naoyuki Kanda, Yashesh Gaur, Sarangarajan Parthasarathy, Eric Sun, Liang Lu, Xie Chen, Jinyu Li, Yifan Gong:
Internal Language Model Training for Domain-Adaptive End-to-End Speech Recognition. CoRR abs/2102.01380 (2021) - [i45]Liang Lu, Naoyuki Kanda, Jinyu Li, Yifan Gong:
Streaming Multi-talker Speech Recognition with Joint Speaker Identification. CoRR abs/2104.02109 (2021) - [i44]Rui Zhao, Jian Xue, Jinyu Li, Wenning Wei, Lei He, Yifan Gong:
On Addressing Practical Challenges for RNN-Transducer. CoRR abs/2105.00858 (2021) - [i43]Zhong Meng, Yu Wu, Naoyuki Kanda, Liang Lu, Xie Chen, Guoli Ye, Eric Sun, Jinyu Li, Yifan Gong:
Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition. CoRR abs/2106.02302 (2021) - [i42]Jeremy Heng Meng Wong, Igor Abramovski, Xiong Xiao, Yifan Gong:
Diarisation using location tracking with agglomerative clustering. CoRR abs/2109.10598 (2021) - [i41]Jeremy Heng Meng Wong, Yifan Gong:
Joint speaker diarisation and tracking in switching state-space model. CoRR abs/2109.11140 (2021) - [i40]Guoli Ye, Vadim Mazalov, Jinyu Li, Yifan Gong:
Have best of both worlds: two-pass hybrid and E2E cascading framework for speech recognition. CoRR abs/2110.04891 (2021) - [i39]Zhong Meng, Yashesh Gaur, Naoyuki Kanda, Jinyu Li, Xie Chen, Yu Wu, Yifan Gong:
Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition. CoRR abs/2110.05354 (2021) - 2020
- [i38]Zhong Meng, Yashesh Gaur, Jinyu Li, Yifan Gong:
Character-Aware Attention-Based End-to-End Speech Recognition. CoRR abs/2001.01795 (2020) - [i37]Zhong Meng, Jinyu Li, Yashesh Gaur, Yifan Gong:
Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition. CoRR abs/2001.01798 (2020) - [i36]Jinyu Li, Rui Zhao, Eric Sun, Jeremy Heng Meng Wong, Amit Das, Zhong Meng, Yifan Gong:
High-Accuracy and Low-Latency Speech Recognition with Two-Head Contextual Layer Trajectory LSTM Model. CoRR abs/2003.07482 (2020) - [i35]Hirofumi Inaguma, Yashesh Gaur, Liang Lu, Jinyu Li, Yifan Gong:
Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR. CoRR abs/2004.05009 (2020) - [i34]Zhong Meng, Hu Hu, Jinyu Li, Changliang Liu, Yan Huang, Yifan Gong, Chin-Hui Lee:
L-Vector: Neural Label Embedding for Domain Adaptation. CoRR abs/2004.13480 (2020) - [i33]Hu Hu, Rui Zhao, Jinyu Li, Liang Lu, Yifan Gong:
Exploring Pre-training with Alignments for RNN Transducer based End-to-End Speech Recognition. CoRR abs/2005.00572 (2020) - [i32]Liang Lu, Changliang Liu, Jinyu Li, Yifan Gong:
Exploring Transformers for Large-Scale Speech Recognition. CoRR abs/2005.09684 (2020) - [i31]Jinyu Li, Rui Zhao, Zhong Meng, Yanqing Liu, Wenning Wei, Sarangarajan Parthasarathy, Vadim Mazalov, Zhenghao Wang, Lei He, Sheng Zhao, Yifan Gong:
Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability. CoRR abs/2007.15188 (2020) - [i30]Peidong Wang, Zhuo Chen, DeLiang Wang, Jinyu Li, Yifan Gong:
Speaker Separation Using Speaker Inventories and Estimated Speech. CoRR abs/2010.10556 (2020) - [i29]Xiong Xiao, Naoyuki Kanda, Zhuo Chen, Tianyan Zhou, Takuya Yoshioka, Sanyuan Chen, Yong Zhao, Gang Liu, Yu Wu, Jian Wu, Shujie Liu, Jinyu Li, Yifan Gong:
Microsoft Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2020. CoRR abs/2010.11458 (2020) - [i28]Liang Lu, Zhong Meng, Naoyuki Kanda, Jinyu Li, Yifan Gong:
On Minimum Word Error Rate Training of the Hybrid Autoregressive Transducer. CoRR abs/2010.12673 (2020) - [i27]Zhong Meng, Sarangarajan Parthasarathy, Eric Sun, Yashesh Gaur, Naoyuki Kanda, Liang Lu, Xie Chen, Rui Zhao, Jinyu Li, Yifan Gong:
Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition. CoRR abs/2011.01991 (2020) - [i26]Liang Lu, Naoyuki Kanda, Jinyu Li, Yifan Gong:
Streaming end-to-end multi-talker speech recognition. CoRR abs/2011.13148 (2020) - 2019
- [i25]Ke Li, Jinyu Li, Yong Zhao, Kshitiz Kumar, Yifan Gong:
Speaker Adaptation for End-to-End CTC Models. CoRR abs/1901.01239 (2019) - [i24]Zhong Meng, Jinyu Li, Yong Zhao, Yifan Gong:
Conditional Teacher-Student Learning. CoRR abs/1904.12399 (2019) - [i23]Zhong Meng, Jinyu Li, Yifan Gong:
Attentive Adversarial Learning for Domain-Invariant Training. CoRR abs/1904.12400 (2019) - [i22]Zhong Meng, Yong Zhao, Jinyu Li, Yifan Gong:
Adversarial Speaker Verification. CoRR abs/1904.12406 (2019) - [i21]Zhong Meng, Jinyu Li, Yifan Gong:
Adversarial Speaker Adaptation. CoRR abs/1904.12407 (2019) - [i20]Shi-Xiong Zhang, Yifan Gong, Dong Yu:
Encrypted Speech Recognition using Deep Polynomial Networks. CoRR abs/1905.05605 (2019) - [i19]Liang Lu, Xiong Xiao, Zhuo Chen, Yifan Gong:
PyKaldi2: Yet another speech toolkit based on Kaldi and PyTorch. CoRR abs/1907.05955 (2019) - [i18]Liang Lu, Eric Sun, Yifan Gong:
Self-Teaching Networks. CoRR abs/1909.04157 (2019) - [i17]Jinyu Li, Rui Zhao, Hu Hu, Yifan Gong:
Improving RNN Transducer Modeling for End-to-End Speech Recognition. CoRR abs/1909.12415 (2019) - [i16]Zhong Meng, Yashesh Gaur, Jinyu Li, Yifan Gong:
Speaker Adaptation for Attention-Based End-to-End Speech Recognition. CoRR abs/1911.03762 (2019) - [i15]Takuya Yoshioka, Igor Abramovski, Cem Aksoylar, Zhuo Chen, Moshe David, Dimitrios Dimitriadis, Yifan Gong, Ilya Gurvich, Xuedong Huang, Yan Huang, Aviv Hurvitz, Li Jiang, Sharon Koubi, Eyal Krupka, Ido Leichter, Changliang Liu, Partha Parthasarathy, Alon Vinnikov, Lingfeng Wu, Xiong Xiao, Wayne Xiong, Huaming Wang, Zhenghao Wang, Jun Zhang, Yong Zhao, Tianyan Zhou:
Advances in Online Audio-Visual Meeting Transcription. CoRR abs/1912.04979 (2019) - 2018
- [i14]Amit Das, Jinyu Li, Rui Zhao, Yifan Gong:
Advancing Connectionist Temporal Classification With Attention Modeling. CoRR abs/1803.05563 (2018) - [i13]Jinyu Li, Guoli Ye, Amit Das, Rui Zhao, Yifan Gong:
Advancing Acoustic-to-Word CTC Model. CoRR abs/1803.05566 (2018) - [i12]Zhuo Chen, Jinyu Li, Xiong Xiao, Takuya Yoshioka, Huaming Wang, Zhenghao Wang, Yifan Gong:
Cracking the cocktail party problem by multi-beam deep attractor network. CoRR abs/1803.10924 (2018) - [i11]Zhong Meng, Jinyu Li, Yifan Gong, Biing-Hwang Juang:
Adversarial Teacher-Student Learning for Unsupervised Domain Adaptation. CoRR abs/1804.00644 (2018) - [i10]Zhong Meng, Jinyu Li, Zhuo Chen, Yong Zhao, Vadim Mazalov, Yifan Gong, Biing-Hwang Juang:
Speaker-Invariant Training via Adversarial Learning. CoRR abs/1804.00732 (2018) - [i9]Jinyu Li, Rui Zhao, Zhuo Chen, Changliang Liu, Xiong Xiao, Guoli Ye, Yifan Gong:
Developing Far-Field Speaker System Via Teacher-Student Learning. CoRR abs/1804.05166 (2018) - [i8]Jinyu Li, Changliang Liu, Yifan Gong:
Layer Trajectory LSTM. CoRR abs/1808.09522 (2018) - [i7]Zhong Meng, Jinyu Li, Yifan Gong, Biing-Hwang Juang:
Adversarial Feature-Mapping for Speech Enhancement. CoRR abs/1809.02251 (2018) - [i6]Zhong Meng, Jinyu Li, Yifan Gong, Biing-Hwang Juang:
Cycle-Consistent Speech Enhancement. CoRR abs/1809.02253 (2018) - [i5]Amit Das, Jinyu Li, Guoli Ye, Rui Zhao, Yifan Gong:
Advancing Acoustic-to-Word CTC Model with Attention and Mixed-Units. CoRR abs/1812.11928 (2018) - 2017
- [i4]Shi-Xiong Zhang, Zhuo Chen, Yong Zhao, Jinyu Li, Yifan Gong:
End-to-End Attention based Text-Dependent Speaker Verification. CoRR abs/1701.00562 (2017) - [i3]Jinyu Li, Michael L. Seltzer, Xi Wang, Rui Zhao, Yifan Gong:
Large-Scale Domain Adaptation via Teacher-Student Learning. CoRR abs/1708.05466 (2017) - [i2]Zhong Meng, Zhuo Chen, Vadim Mazalov, Jinyu Li, Yifan Gong:
Unsupervised Adaptation with Domain Separation Networks for Robust Speech Recognition. CoRR abs/1711.08010 (2017) - [i1]Jinyu Li, Guoli Ye, Rui Zhao, Jasha Droppo, Yifan Gong:
Acoustic-To-Word Model Without OOV. CoRR abs/1711.10136 (2017)
last updated on 2024-11-08 21:31 CET by the dblp team
all metadata released as open data under CC0 1.0 license