James R. Glass
Jim Glass 0001
Person information
- affiliation: Massachusetts Institute of Technology (MIT), CSAIL, Cambridge, MA, USA
Other persons with the same name
- Jim Glass 0002 — Electric Power Board of Chattanooga, TN, USA
2020 – today
- 2024
- [c347]Junmo Kang, Hongyin Luo, Yada Zhu, Jacob A. Hansen, James R. Glass, David D. Cox, Alan Ritter, Rogério Feris, Leonid Karlinsky:
Self-Specialization: Uncovering Latent Expertise within Large Language Models. ACL (Findings) 2024: 2681-2706
- [c346]Cheng-Yu Hsieh, Yung-Sung Chuang, Chun-Liang Li, Zifeng Wang, Long T. Le, Abhishek Kumar, James R. Glass, Alexander Ratner, Chen-Yu Lee, Ranjay Krishna, Tomas Pfister:
Found in the middle: Calibrating Positional Attention Bias Improves Long Context Utilization. ACL (Findings) 2024: 14982-14995
- [c345]Brian Chen, Nina Shvetsova, Andrew Rouditchenko, Daniel Kondermann, Samuel Thomas, Shih-Fu Chang, Rogério Feris, James R. Glass, Hilde Kuehne:
What, When, and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions. CVPR 2024: 18419-18429
- [c344]Wei Fang, Yung-Sung Chuang, James R. Glass:
Joint Inference of Retrieval and Generation for Passage Re-ranking. EACL (Findings) 2024: 2289-2298
- [c343]Yung-Sung Chuang, Linlu Qiu, Cheng-Yu Hsieh, Ranjay Krishna, Yoon Kim, James R. Glass:
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps. EMNLP 2024: 1419-1436
- [c342]Tianhua Zhang, Kun Li, Hongyin Luo, Xixin Wu, James R. Glass, Helen Meng:
Adaptive Query Rewriting: Aligning Rewriters through Marginal Probability of Conversational Answers. EMNLP 2024: 13444-13461
- [c341]Sameer Khurana, Nauman Dawalatabad, Antoine Laurent, Luis Vicente, Pablo Gimeno, Victoria Mingote, James R. Glass:
Cross-Lingual Transfer Learning for Low-Resource Speech Translation. ICASSP Workshops 2024: 670-674
- [c340]Alexander H. Liu, Sung-Lin Yeh, James R. Glass:
Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective. ICASSP 2024: 12051-12055
- [c339]Yuan Gong, Hongyin Luo, Alexander H. Liu, Leonid Karlinsky, James R. Glass:
Listen, Think, and Understand. ICLR 2024
- [c338]Yung-Sung Chuang, Yujia Xie, Hongyin Luo, Yoon Kim, James R. Glass, Pengcheng He:
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models. ICLR 2024
- [c337]Zhang-Wei Hong, Idan Shenfeld, Tsun-Hsuan Wang, Yung-Sung Chuang, Aldo Pareja, James R. Glass, Akash Srivastava, Pulkit Agrawal:
Curiosity-driven Red-teaming for Large Language Models. ICLR 2024
- [c336]Heng-Jui Chang, James R. Glass:
R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces. NAACL-HLT 2024: 642-662
- [c335]Tianhua Zhang, Jiaxin Ge, Hongyin Luo, Yung-Sung Chuang, Mingye Gao, Yuan Gong, Yoon Kim, Xixin Wu, Helen Meng, Jim Glass:
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning. NAACL-HLT (Findings) 2024: 4131-4155
- [i159]Alexander H. Liu, Sung-Lin Yeh, James R. Glass:
Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective. CoRR abs/2401.08833 (2024)
- [i158]Zhang-Wei Hong, Idan Shenfeld, Tsun-Hsuan Wang, Yung-Sung Chuang, Aldo Pareja, James R. Glass, Akash Srivastava, Pulkit Agrawal:
Curiosity-driven Red-teaming for Large Language Models. CoRR abs/2402.19464 (2024)
- [i157]Philip Schroeder, Nathaniel Morgan, Hongyin Luo, James R. Glass:
THREAD: Thinking Deeper with Recursive Spawning. CoRR abs/2405.17402 (2024)
- [i156]Andrew Rouditchenko, Yuan Gong, Samuel Thomas, Leonid Karlinsky, Hilde Kuehne, Rogério Feris, James R. Glass:
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation. CoRR abs/2406.10082 (2024)
- [i155]Tianhua Zhang, Kun Li, Hongyin Luo, Xixin Wu, James R. Glass, Helen Meng:
Adaptive Query Rewriting: Aligning Rewriters through Marginal Probability of Conversational Answers. CoRR abs/2406.10991 (2024)
- [i154]Junmo Kang, Leonid Karlinsky, Hongyin Luo, Zhen Wang, Jacob A. Hansen, Jim Glass, David D. Cox, Rameswar Panda, Rogério Feris, Alan Ritter:
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts. CoRR abs/2406.12034 (2024)
- [i153]Cheng-Yu Hsieh, Yung-Sung Chuang, Chun-Liang Li, Zifeng Wang, Long T. Le, Abhishek Kumar, James R. Glass, Alexander Ratner, Chen-Yu Lee, Ranjay Krishna, Tomas Pfister:
Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization. CoRR abs/2406.16008 (2024)
- [i152]Liming Wang, Yuan Gong, Nauman Dawalatabad, Marco Vilela, Katerina Placek, Brian Tracey, Yishu Gong, Alan Premasiri, Fernando Vieira, James R. Glass:
Automatic Prediction of Amyotrophic Lateral Sclerosis Progression using Longitudinal Speech Transformer. CoRR abs/2406.18625 (2024)
- [i151]Yung-Sung Chuang, Linlu Qiu, Cheng-Yu Hsieh, Ranjay Krishna, Yoon Kim, James R. Glass:
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps. CoRR abs/2407.07071 (2024)
- [i150]Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Kai-Wei Chang, Jiawei Du, Ke-Han Lu, Alexander H. Liu, Ho-Lam Chung, Yuan-Kuei Wu, Dongchao Yang, Songxiang Liu, Yi-Chiao Wu, Xu Tan, James R. Glass, Shinji Watanabe, Hung-yi Lee:
Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models. CoRR abs/2409.14085 (2024)
- [i149]Zhenting Qi, Hongyin Luo, Xuliang Huang, Zhuokai Zhao, Yibo Jiang, Xiangjun Fan, Himabindu Lakkaraju, James R. Glass:
Quantifying Generalization Complexity for Large Language Models. CoRR abs/2410.01769 (2024)
- [i148]Muhammad Jehanzeb Mirza, Mengjie Zhao, Zhuoyuan Mao, Sivan Doveh, Wei Lin, Paul Gavrikov, Michael Dorkenwald, Shiqi Yang, Saurav Jha, Hiromi Wakaki, Yuki Mitsufuji, Horst Possegger, Rogério Feris, Leonid Karlinsky, James R. Glass:
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models. CoRR abs/2410.06154 (2024)
- [i147]Kun Li, Tianhua Zhang, Xixin Wu, Hongyin Luo, James R. Glass, Helen Meng:
Decoding on Graphs: Faithful and Sound Reasoning on Knowledge Graphs through Generation of Well-Formed Chains. CoRR abs/2410.18415 (2024)
- [i146]Nour Jedidi, Yung-Sung Chuang, Leslie Shing, James R. Glass:
Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback. CoRR abs/2410.21242 (2024)
- [i145]Heng-Jui Chang, Hongyu Gong, Changhan Wang, James R. Glass, Yu-An Chung:
DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models. CoRR abs/2410.24177 (2024)
- 2023
- [c334]Tianxing He, Jingyu Zhang, Tianle Wang, Sachin Kumar, Kyunghyun Cho, James R. Glass, Yulia Tsvetkov:
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation. ACL (1) 2023: 12067-12097
- [c333]Yung-Sung Chuang, Wei Fang, Shang-Wen Li, Wen-tau Yih, James R. Glass:
Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question Answering. ACL (Findings) 2023: 12131-12147
- [c332]Jiaxin Ge, Hongyin Luo, Yoon Kim, James R. Glass:
Entailment as Robust Self-Learner. ACL (1) 2023: 13803-13817
- [c331]Yuan Gong, Alexander H. Liu, Hongyin Luo, Leonid Karlinsky, James R. Glass:
Joint Audio and Speech Understanding. ASRU 2023: 1-8
- [c330]Cheng-I Jeff Lai, Freda Shi, Puyuan Peng, Yoon Kim, Kevin Gimpel, Shiyu Chang, Yung-Sung Chuang, Saurabhchand Bhati, David D. Cox, David Harwath, Yang Zhang, Karen Livescu, James R. Glass:
Audio-Visual Neural Syntax Acquisition. ASRU 2023: 1-8
- [c329]Hongyin Luo, James R. Glass:
Logic Against Bias: Textual Entailment Mitigates Stereotypical Sentence Reasoning. EACL 2023: 1235-1246
- [c328]Hongyin Luo, Tianhua Zhang, Yung-Sung Chuang, Yuan Gong, Yoon Kim, Xixin Wu, Helen Meng, James R. Glass:
Search Augmented Instruction Learning. EMNLP (Findings) 2023: 3717-3729
- [c327]Nauman Dawalatabad, Sameer Khurana, Antoine Laurent, James R. Glass:
On Unsupervised Uncertainty-Driven Speech Pseudo-Label Filtering and Model Calibration. ICASSP 2023: 1-5
- [c326]Andrew Rouditchenko, Yung-Sung Chuang, Nina Shvetsova, Samuel Thomas, Rogério Feris, Brian Kingsbury, Leonid Karlinsky, David Harwath, Hilde Kuehne, James R. Glass:
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval. ICASSP 2023: 1-5
- [c325]Yuan Gong, Andrew Rouditchenko, Alexander H. Liu, David Harwath, Leonid Karlinsky, Hilde Kuehne, James R. Glass:
Contrastive Audio-Visual Masked Autoencoder. ICLR 2023
- [c324]Andrew Rouditchenko, Sameer Khurana, Samuel Thomas, Rogério Feris, Leonid Karlinsky, Hilde Kuehne, David Harwath, Brian Kingsbury, James R. Glass:
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages. INTERSPEECH 2023: 2268-2272
- [c323]Yuan Gong, Sameer Khurana, Leonid Karlinsky, James R. Glass:
Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers. INTERSPEECH 2023: 2798-2802
- [c322]Heng-Jui Chang, Alexander H. Liu, James R. Glass:
Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering. INTERSPEECH 2023: 2983-2987
- [c321]Alexander H. Liu, Heng-Jui Chang, Michael Auli, Wei-Ning Hsu, James R. Glass:
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning. NeurIPS 2023
- [c320]David Cheng-Han Chiang, Hung-yi Lee, Yung-Sung Chuang, James R. Glass:
Revealing the Blind Spot of Sentence Encoder Evaluation by HEROS. RepL4NLP@ACL 2023: 289-302
- [c319]Jingyu Zhang, James R. Glass, Tianxing He:
PCFG-Based Natural Language Interface Improves Generalization for Controlled Text Generation. *SEM@ACL 2023: 295-313
- [i144]Hongyin Luo, James R. Glass:
Logic Against Bias: Textual Entailment Mitigates Stereotypical Sentence Reasoning. CoRR abs/2303.05670 (2023)
- [i143]Brian Chen, Nina Shvetsova, Andrew Rouditchenko, Daniel Kondermann, Samuel Thomas, Shih-Fu Chang, Rogério Feris, James R. Glass, Hilde Kuehne:
What, when, and where? - Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions. CoRR abs/2303.16990 (2023)
- [i142]Tianhua Zhang, Hongyin Luo, Yung-Sung Chuang, Wei Fang, Luc Gaitskell, Thomas Hartvigsen, Xixin Wu, Danny Fox, Helen Meng, James R. Glass:
Interpretable Unified Language Checking. CoRR abs/2304.03728 (2023)
- [i141]Alexander H. Liu, Heng-Jui Chang, Michael Auli, Wei-Ning Hsu, James R. Glass:
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning. CoRR abs/2305.10005 (2023)
- [i140]Yuan Gong, Hongyin Luo, Alexander H. Liu, Leonid Karlinsky, James R. Glass:
Listen, Think, and Understand. CoRR abs/2305.10790 (2023)
- [i139]Heng-Jui Chang, Alexander H. Liu, James R. Glass:
Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering. CoRR abs/2305.11072 (2023)
- [i138]Andrew Rouditchenko, Sameer Khurana, Samuel Thomas, Rogério Feris, Leonid Karlinsky, Hilde Kuehne, David Harwath, Brian Kingsbury, James R. Glass:
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages. CoRR abs/2305.12606 (2023)
- [i137]Hongyin Luo, Yung-Sung Chuang, Yuan Gong, Tianhua Zhang, Yoon Kim, Xixin Wu, Danny Fox, Helen Meng, James R. Glass:
SAIL: Search-Augmented Instruction Learning. CoRR abs/2305.15225 (2023)
- [i136]Yung-Sung Chuang, Wei Fang, Shang-Wen Li, Wen-tau Yih, James R. Glass:
Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question Answering. CoRR abs/2305.17080 (2023)
- [i135]Jiaxin Ge, Hongyin Luo, Yoon Kim, James R. Glass:
Entailment as Robust Self-Learner. CoRR abs/2305.17197 (2023)
- [i134]Sameer Khurana, Nauman Dawalatabad, Antoine Laurent, Luis Vicente, Pablo Gimeno, Victoria Mingote, James R. Glass:
Improved Cross-Lingual Transfer Learning For Automatic Speech Translation. CoRR abs/2306.00789 (2023)
- [i133]David Cheng-Han Chiang, Yung-Sung Chuang, James R. Glass, Hung-yi Lee:
Revealing the Blind Spot of Sentence Encoder Evaluation by HEROS. CoRR abs/2306.05083 (2023)
- [i132]Yuan Gong, Sameer Khurana, Leonid Karlinsky, James R. Glass:
Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers. CoRR abs/2307.03183 (2023)
- [i131]Yung-Sung Chuang, Yujia Xie, Hongyin Luo, Yoon Kim, James R. Glass, Pengcheng He:
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models. CoRR abs/2309.03883 (2023)
- [i130]Tianhua Zhang, Jiaxin Ge, Hongyin Luo, Yung-Sung Chuang, Mingye Gao, Yuan Gong, Xixin Wu, Yoon Kim, Helen Meng, James R. Glass:
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning. CoRR abs/2309.10814 (2023)
- [i129]Yuan Gong, Alexander H. Liu, Hongyin Luo, Leonid Karlinsky, James R. Glass:
Joint Audio and Speech Understanding. CoRR abs/2309.14405 (2023)
- [i128]Junmo Kang, Hongyin Luo, Yada Zhu, James R. Glass, David D. Cox, Alan Ritter, Rogério Feris, Leonid Karlinsky:
Self-Specialization: Uncovering Latent Expertise within Large Language Models. CoRR abs/2310.00160 (2023)
- [i127]Cheng-I Jeff Lai, Freda Shi, Puyuan Peng, Yoon Kim, Kevin Gimpel, Shiyu Chang, Yung-Sung Chuang, Saurabhchand Bhati, David D. Cox, David Harwath, Yang Zhang, Karen Livescu, James R. Glass:
Audio-Visual Neural Syntax Acquisition. CoRR abs/2310.07654 (2023)
- [i126]Heng-Jui Chang, James R. Glass:
R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces. CoRR abs/2311.09117 (2023)
- 2022
- [j36]Gene-Ping Yang, Sung-Lin Yeh, Yu-An Chung, James R. Glass, Hao Tang:
Autoregressive Predictive Coding: A Comprehensive Study. IEEE J. Sel. Top. Signal Process. 16(6): 1380-1390 (2022)
- [j35]Sameer Khurana, Antoine Laurent, James R. Glass:
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-Level Cross-Lingual Speech Representation. IEEE J. Sel. Top. Signal Process. 16(6): 1493-1504 (2022)
- [j34]Yuan Gong, Alexander H. Liu, Andrew Rouditchenko, James R. Glass:
UAVM: Towards Unifying Audio and Visual Models. IEEE Signal Process. Lett. 29: 2437-2441 (2022)
- [c318]Yuan Gong, Cheng-I Lai, Yu-An Chung, James R. Glass:
SSAST: Self-Supervised Audio Spectrogram Transformer. AAAI 2022: 10699-10709
- [c317]Alexander H. Liu, SouYoung Jin, Cheng-I Lai, Andrew Rouditchenko, Aude Oliva, James R. Glass:
Cross-Modal Discrete Representation Learning. ACL (1) 2022: 3013-3035
- [c316]Jiabao Ji, Yoon Kim, James R. Glass, Tianxing He:
Controlling the Focus of Pretrained Language Generation Models. ACL (Findings) 2022: 3291-3306
- [c315]Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogério Feris, David Harwath, James R. Glass, Hilde Kuehne:
Everything at Once - Multi-modal Fusion Transformer for Video Retrieval. CVPR 2022: 19988-19997
- [c314]Nauman Dawalatabad, Yuan Gong, Sameer Khurana, Rhoda Au, James R. Glass:
Detecting Dementia from Long Neuropsychological Interviews. EMNLP (Findings) 2022: 5270-5283
- [c313]Yuan Gong, Jin Yu, James R. Glass:
Vocalsound: A Dataset for Improving Human Vocal Sounds Recognition. ICASSP 2022: 151-155
- [c312]Sameer Khurana, Antoine Laurent, James R. Glass:
Magic Dust for Cross-Lingual Adaptation of Monolingual Wav2vec-2.0. ICASSP 2022: 6647-6651
- [c311]R'mani Haulcy, Katerina Placek, Brian Tracey, Adam P. Vogel, James R. Glass:
Repetition Assessment for Speech and Language Disorders: A Study of the Logopenic Variant of Primary Progressive Aphasia. ICASSP 2022: 6932-6936
- [c310]Yuan Gong, Ziyi Chen, Iek-Heng Chu, Peng Chang, James R. Glass:
Transformer-Based Multi-Aspect Multi-Granularity Non-Native English Speaker Pronunciation Assessment. ICASSP 2022: 7262-7266
- [c309]Cheng-I Jeff Lai, Erica Cooper, Yang Zhang, Shiyu Chang, Kaizhi Qian, Yi-Lun Liao, Yung-Sung Chuang, Alexander H. Liu, Junichi Yamagishi, David D. Cox, James R. Glass:
On the Interplay between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis. ICASSP 2022: 8447-8451
- [c308]Alexander H. Liu, Cheng-I Lai, Wei-Ning Hsu, Michael Auli, Alexei Baevski, James R. Glass:
Simple and Effective Unsupervised Speech Synthesis. INTERSPEECH 2022: 843-847
- [c307]Christopher Song, David Harwath, Tuka Alhanai, James R. Glass:
Speak: A Toolkit Using Amazon Mechanical Turk to Collect and Validate Speech Audio Recordings. LREC 2022: 7253-7258
- [c306]Hongyin Luo, Shang-Wen Li, Mingye Gao, Seunghak Yu, James R. Glass:
Cooperative Self-training of Machine Reading Comprehension. NAACL-HLT 2022: 244-257
- [c305]Yung-Sung Chuang, Rumen Dangovski, Hongyin Luo, Yang Zhang, Shiyu Chang, Marin Soljacic, Shang-Wen Li, Scott Yih, Yoon Kim, James R. Glass:
DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings. NAACL-HLT 2022: 4207-4218
- [i125]Jiabao Ji, Yoon Kim, James R. Glass, Tianxing He:
Controlling the Focus of Pretrained Language Generation Models. CoRR abs/2203.01146 (2022)
- [i124]Yuan Gong, Sameer Khurana, Andrew Rouditchenko, James R. Glass:
CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification. CoRR abs/2203.06760 (2022)
- [i123]Alexander H. Liu, Cheng-I Jeff Lai, Wei-Ning Hsu, Michael Auli, Alexei Baevski, James R. Glass:
Simple and Effective Unsupervised Speech Synthesis. CoRR abs/2204.02524 (2022)
- [i122]Yung-Sung Chuang, Rumen Dangovski, Hongyin Luo, Yang Zhang, Shiyu Chang, Marin Soljacic, Shang-Wen Li, Wen-tau Yih, Yoon Kim, James R. Glass:
DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings. CoRR abs/2204.10298 (2022)
- [i121]Yuan Gong, Ziyi Chen, Iek-Heng Chu, Peng Chang, James R. Glass:
Transformer-Based Multi-Aspect Multi-Granularity Non-Native English Speaker Pronunciation Assessment. CoRR abs/2205.03432 (2022)
- [i120]Yuan Gong, Jin Yu, James R. Glass:
Vocalsound: A Dataset for Improving Human Vocal Sounds Recognition. CoRR abs/2205.03433 (2022)
- [i119]Sameer Khurana, Antoine Laurent, James R. Glass:
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation. CoRR abs/2205.08180 (2022)
- [i118]Vijay Gadepally, Gregory Angelides, Andrei Barbu, Andrew Bowne, Laura J. Brattain, Tamara Broderick, Armando Cabrera, Glenn Carl, Ronisha Carter, Miriam Cha, Emilie Cowen, Jesse Cummings, Bill Freeman, James R. Glass, Sam Goldberg, Mark Hamilton, Thomas Heldt, Kuan Wei Huang, Phillip Isola, Boris Katz, Jamie Koerner, Yen-Chen Lin, David Mayo, Kyle McAlpin, Taylor Perron, Jean E. Piou, Hrishikesh M. Rao, Hayley Reynolds, Kaira Samuel, Siddharth Samsi, Morgan Schmidt, Leslie Shing, Olga Simek, Brandon Swenson, Vivienne Sze, Jonathan Taylor, Paul Tylkin, Mark Veillette, Matthew L. Weiss, Allan B. Wollaber, Sophia Yuditskaya, Jeremy Kepner:
Developing a Series of AI Challenges for the United States Department of the Air Force. CoRR abs/2207.07033 (2022)
- [i117]Yuan Gong, Alexander H. Liu, Andrew Rouditchenko, James R. Glass:
UAVM: A Unified Model for Audio-Visual Learning. CoRR abs/2208.00061 (2022)
- [i116]Andrew Rouditchenko, Yung-Sung Chuang, Nina Shvetsova, Samuel Thomas, Rogério Feris, Brian Kingsbury, Leonid Karlinsky, David Harwath, Hilde Kuehne, James R. Glass:
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval. CoRR abs/2210.03625 (2022)
- [i115]Jingyu Zhang, James R. Glass, Tianxing He:
PCFG-based Natural Language Interface Improves Generalization for Controlled Text Generation. CoRR abs/2210.07431 (2022)
- [i114]Yuan Gong, Andrew Rouditchenko, Alexander H. Liu, David Harwath, Leonid Karlinsky, Hilde Kuehne, James R. Glass:
Contrastive Audio-Visual Masked Autoencoder. CoRR abs/2210.07839 (2022)
- [i113]Nauman Dawalatabad, Sameer Khurana, Antoine Laurent, James R. Glass:
On Unsupervised Uncertainty-Driven Speech Pseudo-Label Filtering and Model Calibration. CoRR abs/2211.07795 (2022)
- [i112]Tianxing He, Jingyu Zhang, Tianle Wang, Sachin Kumar, Kyunghyun Cho, James R. Glass, Yulia Tsvetkov:
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation. CoRR abs/2212.10020 (2022)
- 2021
- [j33]Yuan Gong, Yu-An Chung, James R. Glass:
PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3292-3306 (2021)
- [c304]Wei-Ning Hsu, David Harwath, Tyler Miller, Christopher Song, James R. Glass:
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units. ACL/IJCNLP (1) 2021: 5284-5300
- [c303]Mathew Monfort, SouYoung Jin, Alexander H. Liu, David Harwath, Rogério Feris, James R. Glass, Aude Oliva:
Spoken Moments: Learning Joint Audio-Visual Representations From Video Descriptions. CVPR 2021: 14871-14881
- [c302]Tianxing He, Jun Liu, Kyunghyun Cho, Myle Ott, Bing Liu, James R. Glass, Fuchun Peng:
Analyzing the Forgetting Problem in Pretrain-Finetuning of Open-domain Dialogue Response Models. EACL 2021: 1121-1133
- [c301]Tianxing He, Jingzhao Zhang, Zhiming Zhou, James R. Glass:
Exposure Bias versus Self-Recovery: Are Distortions Really Incremental for Autoregressive Text Generation? EMNLP (1) 2021: 5087-5102
- [c300]Yu-An Chung, Yonatan Belinkov, James R. Glass:
Similarity Analysis of Self-Supervised Speech Representations. ICASSP 2021: 3040-3044
- [c299]Cheng-I Lai, Yung-Sung Chuang, Hung-Yi Lee, Shang-Wen Li, James R. Glass:
Semi-Supervised Spoken Language Understanding via Self-Supervised Speech and Language Model Pretraining. ICASSP 2021: 7468-7472
- [c298]Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie W. Boggust, Rameswar Panda, Brian Kingsbury, Rogério Feris, David Harwath, James R. Glass, Michael Picheny, Shih-Fu Chang:
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. ICCV 2021: 7992-8001
- [c297]Yuan Gong, Yu-An Chung, James R. Glass:
AST: Audio Spectrogram Transformer. Interspeech 2021: 571-575
- [c296]Andrew Rouditchenko, Angie W. Boggust, David Harwath, Brian Chen, Dhiraj Joshi, Samuel Thomas, Kartik Audhkhasi, Hilde Kuehne, Rameswar Panda, Rogério Schmidt Feris, Brian Kingsbury, Michael Picheny, Antonio Torralba, James R. Glass:
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos. Interspeech 2021: 1584-1588
- [c295]R'mani Haulcy, James R. Glass:
CLAC: A Speech Corpus of Healthy English Speakers. Interspeech 2021: 2966-2970
- [c294]Andrew Rouditchenko, Angie W. Boggust, David Harwath, Samuel Thomas, Hilde Kuehne, Brian Chen, Rameswar Panda, Rogério Feris, Brian Kingsbury, Michael Picheny, James R. Glass:
Cascaded Multilingual Audio-Visual Learning from Videos. Interspeech 2021: 3006-3010
- [c293]Hongyin Luo, James R. Glass, Garima Lalwani, Yi Zhang, Shang-Wen Li:
Joint Retrieval-Extraction Training for Evidence-Aware Dialog Response Selection. Interspeech 2021: 3241-3245
- [c292]Ian Palmer, Andrew Rouditchenko, Andrei Barbu, Boris Katz, James R. Glass:
Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset. Interspeech 2021: 3650-3654
- [c291]