Остановите войну!
for scientists:
default search action
Rongjie Huang
Person information
Refine list
refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
2020 – today
- 2024
- [c24]Yu Zhang, Rongjie Huang, Ruiqi Li, Jinzheng He, Yan Xia, Feiyang Chen, Xinyu Duan, Baoxing Huai, Zhou Zhao:
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis. AAAI 2024: 19597-19605 - [c23]Rongjie Huang, Mingze Li, Dongchao Yang, Jiatong Shi, Xuankai Chang, Zhenhui Ye, Yuning Wu, Zhiqing Hong, Jiawei Huang, Jinglin Liu, Yi Ren, Yuexian Zou, Zhou Zhao, Shinji Watanabe:
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head. AAAI 2024: 23802-23804 - [i34]Zhenhui Ye, Tianyun Zhong, Yi Ren, Jiaqi Yang, Weichuang Li, Jiawei Huang, Ziyue Jiang, Jinzheng He, Rongjie Huang, Jinglin Liu, Chen Zhang, Xiang Yin, Zejun Ma, Zhou Zhao:
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis. CoRR abs/2401.08503 (2024) - [i33]Shengpeng Ji, Minghui Fang, Ziyue Jiang, Rongjie Huang, Jialong Zuo, Shulei Wang, Zhou Zhao:
Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models. CoRR abs/2402.12208 (2024) - [i32]Yongqi Wang, Ruofan Hu, Rongjie Huang, Zhiqing Hong, Ruiqi Li, Wenrui Liu, Fuming You, Tao Jin, Zhou Zhao:
Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt. CoRR abs/2403.11780 (2024) - 2023
- [c22]Jinzheng He, Jinglin Liu, Zhenhui Ye, Rongjie Huang, Chenye Cui, Huadai Liu, Zhou Zhao:
RMSSinger: Realistic-Music-Score based Singing Voice Synthesis. ACL (Findings) 2023: 236-248 - [c21]Rongjie Huang, Yi Ren, Ziyue Jiang, Chenye Cui, Jinglin Liu, Zhou Zhao:
FastDiff 2: Revisiting and Incorporating GANs and Diffusion Models in High-Fidelity Speech Synthesis. ACL (Findings) 2023: 6994-7009 - [c20]Ruiqi Li, Rongjie Huang, Lichao Zhang, Jinglin Liu, Zhou Zhao:
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment. ACL (Findings) 2023: 7074-7088 - [c19]Rongjie Huang, Chunlei Zhang, Yi Ren, Zhou Zhao, Dong Yu:
Prosody-TTS: Improving Prosody with Masked Autoencoder and Conditional Diffusion Model For Expressive Text-to-Speech. ACL (Findings) 2023: 8018-8034 - [c18]Rongjie Huang, Huadai Liu, Xize Cheng, Yi Ren, Linjun Li, Zhenhui Ye, Jinzheng He, Lichao Zhang, Jinglin Liu, Xiang Yin, Zhou Zhao:
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation. ACL (1) 2023: 8590-8604 - [c17]Zhenhui Ye, Rongjie Huang, Yi Ren, Ziyue Jiang, Jinglin Liu, Jinzheng He, Xiang Yin, Zhou Zhao:
CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-Training. ACL (1) 2023: 9317-9331 - [c16]Linjun Li, Tao Jin, Xize Cheng, Ye Wang, Wang Lin, Rongjie Huang, Zhou Zhao:
Contrastive Token-Wise Meta-Learning for Unseen Performer Visual Temporal-Aligned Translation. ACL (Findings) 2023: 10993-11007 - [c15]Ziyue Jiang, Qian Yang, Jialong Zuo, Zhenhui Ye, Rongjie Huang, Yi Ren, Zhou Zhao:
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models. ACL (Findings) 2023: 11655-11671 - [c14]Huadai Liu, Rongjie Huang, Xuan Lin, Wenqiang Xu, Maozong Zheng, Hong Chen, Jinzheng He, Zhou Zhao:
ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer. EMNLP 2023: 15957-15969 - [c13]Chenye Cui, Zhou Zhao, Yi Ren, Jinglin Liu, Rongjie Huang, Feiyang Chen, Zhefeng Wang, Baoxing Huai, Fei Wu:
VarietySound: Timbre-Controllable Video to Sound Generation Via Unsupervised Information Disentanglement. ICASSP 2023: 1-5 - [c12]Xize Cheng, Tao Jin, Rongjie Huang, Linjun Li, Wang Lin, Zehan Wang, Ye Wang, Huadai Liu, Aoxiong Yin, Zhou Zhao:
MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition. ICCV 2023: 15689-15699 - [c11]Rongjie Huang, Jinglin Liu, Huadai Liu, Yi Ren, Lichao Zhang, Jinzheng He, Zhou Zhao:
TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation. ICLR 2023 - [c10]Rongjie Huang, Jiawei Huang, Dongchao Yang, Yi Ren, Luping Liu, Mingze Li, Zhenhui Ye, Jinglin Liu, Xiang Yin, Zhou Zhao:
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models. ICML 2023: 13916-13932 - [c9]Zhiqing Hong, Chenye Cui, Rongjie Huang, Lichao Zhang, Jinglin Liu, Jinzheng He, Zhou Zhao:
UniSinger: Unified End-to-End Singing Voice Synthesis With Cross-Modality Information Matching. ACM Multimedia 2023: 7569-7579 - [i31]Rongjie Huang, Jiawei Huang, Dongchao Yang, Yi Ren, Luping Liu, Mingze Li, Zhenhui Ye, Jinglin Liu, Xiang Yin, Zhou Zhao:
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models. CoRR abs/2301.12661 (2023) - [i30]Dongchao Yang, Songxiang Liu, Rongjie Huang, Guangzhi Lei, Chao Weng, Helen Meng, Dong Yu:
InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt. CoRR abs/2301.13662 (2023) - [i29]Xize Cheng, Linjun Li, Tao Jin, Rongjie Huang, Wang Lin, Zehan Wang, Huangdai Liu, Ye Wang, Aoxiong Yin, Zhou Zhao:
MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition. CoRR abs/2303.05309 (2023) - [i28]Rongjie Huang, Mingze Li, Dongchao Yang, Jiatong Shi, Xuankai Chang, Zhenhui Ye, Yuning Wu, Zhiqing Hong, Jiawei Huang, Jinglin Liu, Yi Ren, Zhou Zhao, Shinji Watanabe:
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head. CoRR abs/2304.12995 (2023) - [i27]Zhenhui Ye, Jinzheng He, Ziyue Jiang, Rongjie Huang, Jiawei Huang, Jinglin Liu, Yi Ren, Xiang Yin, Zejun Ma, Zhou Zhao:
GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation. CoRR abs/2305.00787 (2023) - [i26]Dongchao Yang, Songxiang Liu, Rongjie Huang, Jinchuan Tian, Chao Weng, Yuexian Zou:
HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec. CoRR abs/2305.02765 (2023) - [i25]Ruiqi Li, Rongjie Huang, Lichao Zhang, Jinglin Liu, Zhou Zhao:
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment. CoRR abs/2305.04476 (2023) - [i24]Jinzheng He, Jinglin Liu, Zhenhui Ye, Rongjie Huang, Chenye Cui, Huadai Liu, Zhou Zhao:
RMSSinger: Realistic-Music-Score based Singing Voice Synthesis. CoRR abs/2305.10686 (2023) - [i23]Zhenhui Ye, Rongjie Huang, Yi Ren, Ziyue Jiang, Jinglin Liu, Jinzheng He, Xiang Yin, Zhou Zhao:
CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-training. CoRR abs/2305.10763 (2023) - [i22]Huadai Liu, Rongjie Huang, Jinzheng He, Gang Sun, Ran Shen, Xize Cheng, Zhou Zhao:
Wav2SQL: Direct Generalizable Speech-To-SQL Parsing. CoRR abs/2305.12552 (2023) - [i21]Huadai Liu, Rongjie Huang, Xuan Lin, Wenqiang Xu, Maozong Zheng, Hong Chen, Jinzheng He, Zhou Zhao:
ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer. CoRR abs/2305.12708 (2023) - [i20]Ziyue Jiang, Qian Yang, Jialong Zuo, Zhenhui Ye, Rongjie Huang, Yi Ren, Zhou Zhao:
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models. CoRR abs/2305.13612 (2023) - [i19]Rongjie Huang, Huadai Liu, Xize Cheng, Yi Ren, Linjun Li, Zhenhui Ye, Jinzheng He, Lichao Zhang, Jinglin Liu, Xiang Yin, Zhou Zhao:
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation. CoRR abs/2305.15403 (2023) - [i18]Jiawei Huang, Yi Ren, Rongjie Huang, Dongchao Yang, Zhenhui Ye, Chen Zhang, Jinglin Liu, Xiang Yin, Zejun Ma, Zhou Zhao:
Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation. CoRR abs/2305.18474 (2023) - [i17]Rongjie Huang, Chunlei Zhang, Yongqi Wang, Dongchao Yang, Luping Liu, Zhenhui Ye, Ziyue Jiang, Chao Weng, Zhou Zhao, Dong Yu:
Make-A-Voice: Unified Voice Synthesis With Discrete Representation. CoRR abs/2305.19269 (2023) - [i16]Luping Liu, Zijian Zhang, Yi Ren, Rongjie Huang, Xiang Yin, Zhou Zhao:
Detector Guidance for Multi-Object Text-to-Image Generation. CoRR abs/2306.02236 (2023) - [i15]Ziyue Jiang, Yi Ren, Zhenhui Ye, Jinglin Liu, Chen Zhang, Qian Yang, Shengpeng Ji, Rongjie Huang, Chunfeng Wang, Xiang Yin, Zejun Ma, Zhou Zhao:
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias. CoRR abs/2306.03509 (2023) - [i14]Yongqi Wang, Jionghao Bai, Rongjie Huang, Ruiqi Li, Zhiqing Hong, Zhou Zhao:
Speech-to-Speech Translation with Discrete-Unit-Based Style Transfer. CoRR abs/2309.07566 (2023) - [i13]Dongchao Yang, Jinchuan Tian, Xu Tan, Rongjie Huang, Songxiang Liu, Xuankai Chang, Jiatong Shi, Sheng Zhao, Jiang Bian, Xixin Wu, Zhou Zhao, Shinji Watanabe, Helen Meng:
UniAudio: An Audio Foundation Model Toward Universal Audio Generation. CoRR abs/2310.00704 (2023) - [i12]Haifeng Huang, Zehan Wang, Rongjie Huang, Luping Liu, Xize Cheng, Yang Zhao, Tao Jin, Zhou Zhao:
Chat-3D v2: Bridging 3D Scene and Large Language Models with Object Identifiers. CoRR abs/2312.08168 (2023) - [i11]Yu Zhang, Rongjie Huang, Ruiqi Li, Jinzheng He, Yan Xia, Feiyang Chen, Xinyu Duan, Baoxing Huai, Zhou Zhao:
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis. CoRR abs/2312.10741 (2023) - [i10]Xize Cheng, Rongjie Huang, Linjun Li, Tao Jin, Zehan Wang, Aoxiong Yin, Minglei Li, Xinyu Duan, Changpeng Yang, Zhou Zhao:
TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation. CoRR abs/2312.15197 (2023) - 2022
- [j1]Rongjie Huang, Guizhong Xie, Yudong Zhong, Hongrui Geng, Hao Li, Liangwen Wang:
Boundary element analysis of thin structures using a dual transformation method for weakly singular boundary integrals. Comput. Math. Appl. 113: 198-213 (2022) - [c8]Rongjie Huang, Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu, Yi Ren, Zhou Zhao:
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis. IJCAI 2022: 4157-4163 - [c7]Rongjie Huang, Chenye Cui, Feiyang Chen, Yi Ren, Jinglin Liu, Zhou Zhao, Baoxing Huai, Zhefeng Wang:
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation. ACM Multimedia 2022: 2525-2535 - [c6]Rongjie Huang, Zhou Zhao, Huadai Liu, Jinglin Liu, Chenye Cui, Yi Ren:
ProDiff: Progressive Fast Diffusion Model for High-Quality Text-to-Speech. ACM Multimedia 2022: 2595-2605 - [c5]Rongjie Huang, Yi Ren, Jinglin Liu, Chenye Cui, Zhou Zhao:
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech. NeurIPS 2022 - [c4]Lichao Zhang, Ruiqi Li, Shoutong Wang, Liqun Deng, Jinglin Liu, Yi Ren, Jinzheng He, Rongjie Huang, Jieming Zhu, Xiao Chen, Zhou Zhao:
M4Singer: A Multi-Style, Multi-Singer and Musical Score Provided Mandarin Singing Corpus. NeurIPS 2022 - [i9]Rongjie Huang, Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu, Yi Ren, Zhou Zhao:
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis. CoRR abs/2204.09934 (2022) - [i8]Rongjie Huang, Yi Ren, Jinglin Liu, Chenye Cui, Zhou Zhao:
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech Synthesis. CoRR abs/2205.07211 (2022) - [i7]Rongjie Huang, Zhou Zhao, Jinglin Liu, Huadai Liu, Yi Ren, Lichao Zhang, Jinzheng He:
TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation. CoRR abs/2205.12523 (2022) - [i6]Rongjie Huang, Zhou Zhao, Huadai Liu, Jinglin Liu, Chenye Cui, Yi Ren:
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech. CoRR abs/2207.06389 (2022) - [i5]Chenye Cui, Yi Ren, Jinglin Liu, Rongjie Huang, Zhou Zhao:
VarietySound: Timbre-Controllable Video to Sound Generation via Unsupervised Information Disentanglement. CoRR abs/2211.10666 (2022) - 2021
- [c3]Chenye Cui, Yi Ren, Jinglin Liu, Feiyang Chen, Rongjie Huang, Ming Lei, Zhou Zhao:
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model. Interspeech 2021: 2766-2770 - [c2]Rongjie Huang, Feiyang Chen, Yi Ren, Jinglin Liu, Chenye Cui, Zhou Zhao:
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus. ACM Multimedia 2021: 3945-3954 - [i4]Chenye Cui, Yi Ren, Jinglin Liu, Feiyang Chen, Rongjie Huang, Ming Lei, Zhou Zhao:
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model. CoRR abs/2106.09317 (2021) - [i3]Max W. Y. Lam, Jun Wang, Rongjie Huang, Dan Su, Dong Yu:
Bilateral Denoising Diffusion Models. CoRR abs/2108.11514 (2021) - [i2]Feiyang Chen, Rongjie Huang, Chenye Cui, Yi Ren, Jinglin Liu, Zhou Zhao, Nicholas Jing Yuan, Baoxing Huai:
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation. CoRR abs/2110.07468 (2021) - [i1]Rongjie Huang, Feiyang Chen, Yi Ren, Jinglin Liu, Chenye Cui, Zhou Zhao:
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus. CoRR abs/2112.10358 (2021)
2010 – 2019
- 2017
- [c1]Shubin Cai, Rongjie Huang, Ningsheng Yang, Jinwen Jiang, Zhong Ming, Zhengping Liang, Zhiguang Shan:
Research on Dynamic Safe Loading Techniques in Android Application Protection System. SmartCom 2017: 134-143
Coauthor Index
manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from , , and to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2024-04-25 01:34 CEST by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint