Qin Jin
2020 – today
- 2024
- [j18] Yawen Zeng, Ning Han, Keyu Pan, Qin Jin: Temporally Language Grounding With Multi-Modal Multi-Prompt Tuning. IEEE Trans. Multim. 26: 3366-3377 (2024)
- [c165] Liang Zhang, Qin Jin, Haoyang Huang, Dongdong Zhang, Furu Wei: Respond in my Language: Mitigating Language Inconsistency in Response Generation based on Large Language Models. ACL (1) 2024: 4177-4192
- [c164] Dingyi Yang, Chunru Zhan, Ziheng Wang, Biao Wang, Tiezheng Ge, Bo Zheng, Qin Jin: Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline. ACL (1) 2024: 9479-9493
- [c163] Zihao Yue, Liang Zhang, Qin Jin: Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective. ACL (1) 2024: 11766-11781
- [c162] Tenggan Zhang, Xinjie Zhang, Jinming Zhao, Li Zhou, Qin Jin: ESCoT: Towards Interpretable Emotional Support Dialogue Systems. ACL (1) 2024: 13395-13412
- [c161] Yuting Mei, Linli Yao, Qin Jin: UBiSS: A Unified Framework for Bimodal Semantic Summarization of Videos. ICMR 2024: 1034-1042
- [i78] Jiatong Shi, Yueqian Lin, Xinyi Bai, Keyi Zhang, Yuning Wu, Yuxun Tang, Yifeng Yu, Qin Jin, Shinji Watanabe: Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and KiSing-v2. CoRR abs/2401.17619 (2024)
- [i77] Zihao Yue, Liang Zhang, Qin Jin: Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective. CoRR abs/2402.14545 (2024)
- [i76] Boshen Xu, Sipeng Zheng, Qin Jin: POV: Prompt-Oriented View-Agnostic Learning for Egocentric Hand-Object Interaction in the Multi-View World. CoRR abs/2403.05856 (2024)
- [i75] Boshen Xu, Sipeng Zheng, Qin Jin: SPAFormer: Sequential 3D Part Assembly with Transformers. CoRR abs/2403.05874 (2024)
- [i74] Anwen Hu, Haiyang Xu, Jiabo Ye, Ming Yan, Liang Zhang, Bo Zhang, Chen Li, Ji Zhang, Qin Jin, Fei Huang, Jingren Zhou: mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding. CoRR abs/2403.12895 (2024)
- [i73] Zihao Yue, Yepeng Zhang, Ziheng Wang, Qin Jin: Movie101v2: Improved Movie Narration Benchmark. CoRR abs/2404.13370 (2024)
- [i72] Qingrong He, Kejun Lin, Shizhe Chen, Anwen Hu, Qin Jin: Think-Program-reCtify: 3D Situated Reasoning with Large Language Models. CoRR abs/2404.14705 (2024)
- [i71] Liang Zhang, Anwen Hu, Haiyang Xu, Ming Yan, Yichen Xu, Qin Jin, Ji Zhang, Fei Huang: TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning. CoRR abs/2404.16635 (2024)
- [i70] Zhaopei Huang, Jinming Zhao, Qin Jin: ECR-Chain: Advancing Generative Language Models to Better Emotion-Cause Reasoners through Reasoning Chains. CoRR abs/2405.10860 (2024)
- [i69] Dingyi Yang, Chunru Zhan, Ziheng Wang, Biao Wang, Tiezheng Ge, Bo Zheng, Qin Jin: Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline. CoRR abs/2405.14040 (2024)
- [i68] Boshen Xu, Ziheng Wang, Yang Du, Zhinan Song, Sipeng Zheng, Qin Jin: EgoNCE++: Do Egocentric Video-Language Models Really Understand Hand-Object Interactions? CoRR abs/2405.17719 (2024)
- [i67] Xuankai Chang, Jiatong Shi, Jinchuan Tian, Yuning Wu, Yuxun Tang, Yihan Wu, Shinji Watanabe, Yossi Adi, Xie Chen, Qin Jin: The Interspeech 2024 Challenge on Speech Processing Using Discrete Units. CoRR abs/2406.07725 (2024)
- [i66] Yuning Wu, Chunlei Zhang, Jiatong Shi, Yuxun Tang, Shan Yang, Qin Jin: TokSing: Singing Voice Synthesis based on Discrete Tokens. CoRR abs/2406.08416 (2024)
- [i65] Yuxun Tang, Yuning Wu, Jiatong Shi, Qin Jin: SingOMD: Singing Oriented Multi-resolution Discrete Representation Construction from Speech Models. CoRR abs/2406.08905 (2024)
- [i64] Fengyuan Zhang, Zhaopei Huang, Xinjie Zhang, Qin Jin: Adaptive Temporal Motion Guided Graph Convolution Network for Micro-expression Recognition. CoRR abs/2406.08997 (2024)
- [i63] Yuxun Tang, Jiatong Shi, Yuning Wu, Qin Jin: SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction. CoRR abs/2406.10911 (2024)
- [i62] Tenggan Zhang, Xinjie Zhang, Jinming Zhao, Li Zhou, Qin Jin: ESCoT: Towards Interpretable Emotional Support Dialogue Systems. CoRR abs/2406.10960 (2024)
- [i61] Yuting Mei, Linli Yao, Qin Jin: UBiSS: A Unified Framework for Bimodal Semantic Summarization of Videos. CoRR abs/2406.16301 (2024)
- [i60] Ye Wang, Yuting Mei, Sipeng Zheng, Qin Jin: QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds. CoRR abs/2406.16578 (2024)
- 2023
- [j17] Liang Zhang, Ludan Ruan, Anwen Hu, Qin Jin: Multimodal Pretraining from Monolingual to Multilingual. Mach. Intell. Res. 20(2): 220-232 (2023)
- [j16] Yun Zhang, Qi Lu, Qin Jin, Wanting Meng, Shuhu Yang, Shen Huang, Yanling Han, Zhonghua Hong, Zhansheng Chen, Weiliang Liu: Global Sea Surface Height Measurement From CYGNSS Based on Machine Learning. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 16: 841-852 (2023)
- [c160] Yuqi Liu, Luhui Xu, Pengfei Xiong, Qin Jin: Token Mixing: Parameter-Efficient Transfer Learning from Image-Language to Video-Language. AAAI 2023: 1781-1789
- [c159] Yawen Zeng, Qin Jin, Tengfei Bao, Wenfeng Li: Multi-Modal Knowledge Hypergraph for Diverse Image Retrieval. AAAI 2023: 3376-3383
- [c158] Ludan Ruan, Anwen Hu, Yuqing Song, Liang Zhang, Sipeng Zheng, Qin Jin: Accommodating Audio Modality in CLIP for Multimodal Processing. AAAI 2023: 9641-9649
- [c157] Liang Zhang, Anwen Hu, Jing Zhang, Shuo Hu, Qin Jin: MPMQA: Multimodal Question Answering on Product Manuals. AAAI 2023: 13958-13966
- [c156] Tao Qian, Fan Lou, Jiatong Shi, Yuning Wu, Shuai Guo, Xiang Yin, Qin Jin: UniLG: A Unified Structure-aware Framework for Lyrics Generation. ACL (1) 2023: 983-1001
- [c155] Anwen Hu, Shizhe Chen, Liang Zhang, Qin Jin: InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation. ACL (1) 2023: 3171-3185
- [c154] Zihao Yue, Qi Zhang, Anwen Hu, Liang Zhang, Ziheng Wang, Qin Jin: Movie101: A New Movie Understanding Benchmark. ACL (1) 2023: 4669-4684
- [c153] Dingyi Yang, Qin Jin: Attractive Storyteller: Stylized Visual Storytelling with Unpaired Text. ACL (1) 2023: 11053-11066
- [c152] Ludan Ruan, Yiyang Ma, Huan Yang, Huiguo He, Bei Liu, Jianlong Fu, Nicholas Jing Yuan, Qin Jin, Baining Guo: MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation. CVPR 2023: 10219-10228
- [c151] Sipeng Zheng, Boshen Xu, Qin Jin: Open-Category Human-Object Interaction Pre-training via Language Modeling Framework. CVPR 2023: 19392-19402
- [c150] Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Guohai Xu, Chenliang Li, Junfeng Tian, Qi Qian, Ji Zhang, Qin Jin, Liang He, Xin Lin, Fei Huang: UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model. EMNLP (Findings) 2023: 2841-2858
- [c149] Yuning Wu, Jiatong Shi, Tao Qian, Dongji Gao, Qin Jin: Phoneix: Acoustic Feature Processing Strategy for Enhanced Singing Pronunciation With Phoneme Distribution Predictor. ICASSP 2023: 1-5
- [c148] Anwen Hu, Shizhe Chen, Liang Zhang, Qin Jin: Explore and Tell: Embodied Visual Captioning in 3D Environments. ICCV 2023: 2482-2491
- [c147] Jieting Chen, Junkai Ding, Wenping Chen, Qin Jin: Knowledge Enhanced Model for Live Video Comment Generation. ICME 2023: 2267-2272
- [c146] Hongpeng Lin, Ludan Ruan, Wenke Xia, Peiyu Liu, Jingyuan Wen, Yixin Xu, Di Hu, Ruihua Song, Wayne Xin Zhao, Qin Jin, Zhiwu Lu: TikTalk: A Video-Based Dialogue Dataset for Multi-Modal Chitchat in Real World. ACM Multimedia 2023: 1303-1313
- [c145] Boshen Xu, Sipeng Zheng, Qin Jin: POV: Prompt-Oriented View-Agnostic Learning for Egocentric Hand-Object Interaction in the Multi-view World. ACM Multimedia 2023: 2807-2816
- [c144] Dingyi Yang, Hongyu Chen, Xinglin Hou, Tiezheng Ge, Yuning Jiang, Qin Jin: Visual Captioning at Will: Describing Images and Videos Guided by a Few Stylized Sentences. ACM Multimedia 2023: 5705-5715
- [c143] Yuchen Liu, Haoyu Zhang, Shichao Liu, Xiang Yin, Zejun Ma, Qin Jin: Emotionally Situated Text-to-Speech Synthesis in User-Agent Conversation. ACM Multimedia 2023: 5966-5974
- [c142] Zihao Yue, Anwen Hu, Liang Zhang, Qin Jin: Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation. NeurIPS 2023
- [c141] Zhaopei Huang, Jinming Zhao, Qin Jin: Two-Stage Adaptation for Cross-Corpus Multimodal Emotion Recognition. NLPCC (2) 2023: 431-443
- [c140] Weijing Chen, Linli Yao, Qin Jin: Rethinking Benchmarks for Cross-modal Image-text Retrieval. SIGIR 2023: 1241-1251
- [c139] Linli Yao, Weijing Chen, Qin Jin: CapEnrich: Enriching Caption Semantics for Web Images via Cross-modal Pre-trained Knowledge. WWW 2023: 2392-2401
- [i59] Hongpeng Lin, Ludan Ruan, Wenke Xia, Peiyu Liu, Jingyuan Wen, Yixin Xu, Di Hu, Ruihua Song, Wayne Xin Zhao, Qin Jin, Zhiwu Lu: TikTalk: A Multi-Modal Dialogue Dataset for Real-World Chitchat. CoRR abs/2301.05880 (2023)
- [i58] Ludan Ruan, Anwen Hu, Yuqing Song, Liang Zhang, Sipeng Zheng, Qin Jin: Accommodating Audio Modality in CLIP for Multimodal Processing. CoRR abs/2303.06591 (2023)
- [i57] Yuning Wu, Jiatong Shi, Tao Qian, Dongji Gao, Qin Jin: PHONEix: Acoustic Feature Processing Strategy for Enhanced Singing Pronunciation with Phoneme Distribution Predictor. CoRR abs/2303.08607 (2023)
- [i56] Liang Zhang, Anwen Hu, Jing Zhang, Shuo Hu, Qin Jin: MPMQA: Multimodal Question Answering on Product Manuals. CoRR abs/2304.09660 (2023)
- [i55] Weijing Chen, Linli Yao, Qin Jin: Rethinking Benchmarks for Cross-modal Image-text Retrieval. CoRR abs/2304.10824 (2023)
- [i54] Jieting Chen, Junkai Ding, Wenping Chen, Qin Jin: Knowledge Enhanced Model for Live Video Comment Generation. CoRR abs/2304.14657 (2023)
- [i53] Anwen Hu, Shizhe Chen, Liang Zhang, Qin Jin: InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation. CoRR abs/2305.06002 (2023)
- [i52] Linli Yao, Yuanmeng Zhang, Ziheng Wang, Xinglin Hou, Tiezheng Ge, Yuning Jiang, Qin Jin: Edit As You Wish: Video Description Editing with Multi-grained Commands. CoRR abs/2305.08389 (2023)
- [i51] Zihao Yue, Qi Zhang, Anwen Hu, Liang Zhang, Ziheng Wang, Qin Jin: Movie101: A New Movie Understanding Benchmark. CoRR abs/2305.12140 (2023)
- [i50] Zihao Yue, Anwen Hu, Liang Zhang, Qin Jin: Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation. CoRR abs/2306.13460 (2023)
- [i49] Qi Zhang, Sipeng Zheng, Qin Jin: No-frills Temporal Video Grounding: Multi-Scale Neighboring Attention and Zoom-in Boundary Detection. CoRR abs/2307.10567 (2023)
- [i48] Dingyi Yang, Hongyu Chen, Xinglin Hou, Tiezheng Ge, Yuning Jiang, Qin Jin: Visual Captioning at Will: Describing Images and Videos Guided by a Few Stylized Sentences. CoRR abs/2307.16399 (2023)
- [i47] Yuning Wu, Yifeng Yu, Jiatong Shi, Tao Qian, Qin Jin: A Systematic Exploration of Joint-training for Singing Voice Synthesis. CoRR abs/2308.02867 (2023)
- [i46] Anwen Hu, Shizhe Chen, Liang Zhang, Qin Jin: Explore and Tell: Embodied Visual Captioning in 3D Environments. CoRR abs/2308.10447 (2023)
- [i45] Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Guohai Xu, Chenliang Li, Junfeng Tian, Qi Qian, Ji Zhang, Qin Jin, Liang He, Xin Alex Lin, Fei Huang: UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model. CoRR abs/2310.05126 (2023)
- 2022
- [j15] Ludan Ruan, Qin Jin: Survey: Transformer based video-language pre-training. AI Open 3: 1-13 (2022)
- [j14] Yuqing Song, Shizhe Chen, Qin Jin, Wei Luo, Jun Xie, Fei Huang: Enhancing Neural Machine Translation With Dual-Side Multimodal Awareness. IEEE Trans. Multim. 24: 3013-3024 (2022)
- [c138] Linli Yao, Weiying Wang, Qin Jin: Image Difference Captioning with Pre-training and Contrastive Learning. AAAI 2022: 3108-3116
- [c137] Jinming Zhao, Tenggan Zhang, Jingwen Hu, Yuchen Liu, Qin Jin, Xinchao Wang, Haizhou Li: M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database. ACL (1) 2022: 5699-5710
- [c136] Yuchen Liu, Jinming Zhao, Jingwen Hu, Ruichen Li, Qin Jin: DialogueEIN: Emotion Interaction Network for Dialogue Affective Analysis. COLING 2022: 684-693
- [c135] Liyu Meng, Yuchen Liu, Xiaolong Liu, Zhaopei Huang, Wenqiang Jiang, Tenggan Zhang, Chuanhe Liu, Qin Jin: Valence and Arousal Estimation based on Multimodal Temporal-Aware Features for Videos in the Wild. CVPR Workshops 2022: 2344-2351
- [c134] Sipeng Zheng, Shizhe Chen, Qin Jin: VRDFormer: End-to-End Video Visual Relation Detection with Transformers. CVPR 2022: 18814-18824
- [c133] Tenggan Zhang, Chuanhe Liu, Xiaolong Liu, Yuchen Liu, Liyu Meng, Lei Sun, Wenqiang Jiang, Fengyuan Zhang, Jinming Zhao, Qin Jin: Multi-Task Learning Framework for Emotion Recognition In-the-Wild. ECCV Workshops (6) 2022: 143-156
- [c132] Sipeng Zheng, Shizhe Chen, Qin Jin: Few-Shot Action Recognition with Hierarchical Matching and Contrastive Learning. ECCV (4) 2022: 297-313
- [c131] Yuqi Liu, Pengfei Xiong, Luhui Xu, Shengming Cao, Qin Jin: TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval. ECCV (14) 2022: 319-335
- [c130] Qi Zhang, Yuqing Song, Qin Jin: Unifying Event Detection and Captioning as Sequence Generation via Pre-training. ECCV (36) 2022: 363-379
- [c129] Qi Zhang, Zihao Yue, Anwen Hu, Ziheng Wang, Qin Jin: MovieUN: A Dataset for Movie Understanding and Narrating. EMNLP (Findings) 2022: 1873-1885
- [c128] Yuwen Chen, Jian Ma, Peihu Zhu, Xiaoming Huang, Qin Jin: Leveraging Trust Relations to Improve Academic Patent Recommendation. HICSS 2022: 1-10
- [c127] Jinming Zhao, Ruichen Li, Qin Jin, Xinchao Wang, Haizhou Li: Memobert: Pre-Training Model with Prompt-Based Learning for Multimodal Emotion Recognition. ICASSP 2022: 4703-4707
- [c126] Tao Qian, Jiatong Shi, Shuai Guo, Peter Wu, Qin Jin: Training Strategies for Automatic Song Writing: A Unified Framework Perspective. ICASSP 2022: 4738-4742
- [c125] Shuai Guo, Jiatong Shi, Tao Qian, Shinji Watanabe, Qin Jin: SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy. INTERSPEECH 2022: 4272-4276
- [c124] Jiatong Shi, Shuai Guo, Tao Qian, Tomoki Hayashi, Yuning Wu, Fangzheng Xu, Xuankai Chang, Huazhe Li, Peter Wu, Shinji Watanabe, Qin Jin: Muskits: an End-to-end Music Processing Toolkit for Singing Voice Synthesis. INTERSPEECH 2022: 4277-4281
- [c123] Xavier Alameda-Pineda, Qin Jin, Vincent Oria, Laura Toni: M4MM '22: 1st International Workshop on Methodologies for Multimedia. ACM Multimedia 2022: 7394-7396
- [c122] Si Liu, Qin Jin, Luoqi Liu, Zongheng Tang, Linli Lin: PIC'22: 4th Person in Context Workshop. ACM Multimedia 2022: 7418-7419
- [c121] Liang Zhang, Anwen Hu, Qin Jin: Multi-Lingual Acquisition on Multimodal Pre-training for Cross-modal Retrieval. NeurIPS 2022
- [c120] Yida Zhao, Yuqing Song, Qin Jin: Progressive Learning for Image Retrieval with Hybrid-Modality Queries. SIGIR 2022: 1012-1021
- [e2] João Magalhães, Alberto Del Bimbo, Shin'ichi Satoh, Nicu Sebe, Xavier Alameda-Pineda, Qin Jin, Vincent Oria, Laura Toni: MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10 - 14, 2022. ACM 2022, ISBN 978-1-4503-9203-7 [contents]
- [i44] Linli Yao, Weiying Wang, Qin Jin: Image Difference Captioning with Pre-training and Contrastive Learning. CoRR abs/2202.04298 (2022)
- [i43] Liyu Meng, Yuchen Liu, Xiaolong Liu, Zhaopei Huang, Yuan Cheng, Meng Wang, Chuanhe Liu, Qin Jin: Multi-modal Emotion Estimation for in-the-wild Videos. CoRR abs/2203.13032 (2022)
- [i42] Shuai Guo, Jiatong Shi, Tao Qian, Shinji Watanabe, Qin Jin: SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy. CoRR abs/2203.17001 (2022)
- [i41] Yida Zhao, Yuqing Song, Qin Jin: Progressive Learning for Image Retrieval with Hybrid-Modality Queries. CoRR abs/2204.11212 (2022)
- [i40] Jiatong Shi, Shuai Guo, Tao Qian, Nan Huo, Tomoki Hayashi, Yuning Wu, Frank Xu, Xuankai Chang, Huazhe Li, Peter Wu, Shinji Watanabe, Qin Jin: Muskits: an End-to-End Music Processing Toolkit for Singing Voice Synthesis. CoRR abs/2205.04029 (2022)
- [i39] Jinming Zhao, Tenggan Zhang, Jingwen Hu, Yuchen Liu, Qin Jin, Xinchao Wang, Haizhou Li: M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database. CoRR abs/2205.10237 (2022)
- [i38] Liang Zhang, Anwen Hu, Qin Jin: Generalizing Multimodal Pre-training into Multilingual via Language Acquisition. CoRR abs/2206.11091 (2022)
- [i37] Yuqi Liu, Pengfei Xiong, Luhui Xu, Shengming Cao, Qin Jin: TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval. CoRR abs/2207.07852 (2022)
- [i36] Qi Zhang, Yuqing Song, Qin Jin: Unifying Event Detection and Captioning as Sequence Generation via Pre-Training. CoRR abs/2207.08625 (2022)
- [i35] Sipeng Zheng, Qi Zhang, Bei Liu, Qin Jin, Jianlong Fu: Exploring Anchor-based Detection for Ego4D Natural Language Query. CoRR abs/2208.05375 (2022)
- [i34] Linli Yao, Weijing Chen, Qin Jin: CapEnrich: Enriching Caption Semantics for Web Images via Cross-modal Pre-trained Knowledge. CoRR abs/2211.09371 (2022)
- [i33] Ludan Ruan, Yiyang Ma, Huan Yang, Huiguo He, Bei Liu, Jianlong Fu, Nicholas Jing Yuan, Qin Jin, Baining Guo: MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation. CoRR abs/2212.09478 (2022)
- 2021
- [j13] Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, Yuqi Huo, Jiezhong Qiu, Yuan Yao, Ao Zhang, Liang Zhang, Wentao Han, Minlie Huang, Qin Jin, Yanyan Lan, Yang Liu, Zhiyuan Liu, Zhiwu Lu, Xipeng Qiu, Ruihua Song, Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu: Pre-trained models: Past, present and future. AI Open 2: 225-250 (2021)
- [c119] Jinming Zhao, Ruichen Li, Qin Jin: Missing Modality Imagination Network for Emotion Recognition with Uncertain Missing Modalities. ACL/IJCNLP (1) 2021: 2608-2618
- [c118] Jingwen Hu, Yuchen Liu, Jinming Zhao, Qin Jin: MMGCN: Multimodal Fusion via Deep Graph Convolution Network for Emotion Recognition in Conversation. ACL/IJCNLP (1) 2021: 5666-5675
- [c117] Yuqing Song, Shizhe Chen, Qin Jin: Towards Diverse Paragraph Captioning for Untrimmed Videos. CVPR 2021: 11245-11254
- [c116] Jia Chen, Yike Wu, Shiwan Zhao, Qin Jin: Language Resource Efficient Learning for Captioning. EMNLP (Findings) 2021: 1887-1895
- [c115] Jiatong Shi, Shuai Guo, Nan Huo, Yuekai Zhang, Qin Jin: Sequence-To-Sequence Singing Voice Synthesis With Perceptual Entropy Loss. ICASSP 2021: 76-80
- [c114] Ruichen Li, Jinming Zhao, Qin Jin: Speech Emotion Recognition via Multi-Level Cross-Modal Distillation. Interspeech 2021: 4488-4492
- [c113] Bei Liu, Jianlong Fu, Shizhe Chen, Qin Jin, Alexander G. Hauptmann, Yong Rui: MMPT'21: International Joint Workshop on Multi-Modal Pre-Training for Multimedia Understanding. ICMR 2021: 694-695
- [c112] Tenggan Zhang, Zhaopei Huang, Ruichen Li, Jinming Zhao, Qin Jin: Multimodal Fusion Strategies for Physiological-emotion Analysis. MuSe @ ACM Multimedia 2021: 43-50
- [c111] Yuqing Song, Shizhe Chen, Qin Jin, Wei Luo, Jun Xie, Fei Huang: Product-oriented Machine Translation with Cross-modal Cross-lingual Pre-training. ACM Multimedia 2021: 2843-2852
- [c110] Anwen Hu, Shizhe Chen, Qin Jin: Question-controlled Text-aware Image Captioning. ACM Multimedia 2021: 3097-3105
- [c109] Ludan Ruan, Qin Jin: Efficient Proposal Generation with U-shaped Network for Temporal Sentence Grounding. MMAsia 2021: 26:1-26:7
- [e1] Bei Liu, Jianlong Fu, Shizhe Chen, Qin Jin, Alexander G. Hauptmann, Yong Rui: MMPT@ICMR2021: Proceedings of the 2021 Workshop on Multi-Modal Pre-Training for Multimedia Understanding, Taipei, Taiwan, August 21, 2021. ACM 2021, ISBN 978-1-4503-8530-5 [contents]
- [i32]