


Sreyan Ghosh
2020 – today
- 2025
[c43]Sreyan Ghosh, Mohammad Sadegh Rasooli, Michael Levit, Peidong Wang, Jian Xue, Dinesh Manocha, Jinyu Li:
Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation. ACL (Findings) 2025: 2466-2482
[c42]Sreyan Ghosh, Sonal Kumar, Chandra Kiran Reddy Evuru, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha:
ReCLAP: Improving Zero Shot Audio Classification by Describing Sounds. ICASSP 2025: 1-5
[c41]Chenxuan Liu, Liping Chen, Peiwang Tang, Weitai Zhang, Xiaoxi Li, Sreyan Ghosh, Zhongyi Ye, Mingjia Yu:
Large Language Models Are Efficient Learners as Zero-Shot Speech Translators. ICASSP 2025: 1-5
[c40]Chenxuan Liu, Liping Chen, Weitai Zhang, Xiaoxi Li, Peiwang Tang, Mingjia Yu, Sreyan Ghosh, Zhongyi Ye:
Adversarial Speech-Text Pre-Training for Speech Translation. ICASSP 2025: 1-5
[c39]Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Utkarsh Tyagi, Oriol Nieto, Zeyu Jin, Dinesh Manocha:
Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs. ICLR 2025
[c38]Sreyan Ghosh, Sonal Kumar, Zhifeng Kong, Rafael Valle, Bryan Catanzaro, Dinesh Manocha:
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data. ICLR 2025
[c37]S. Sakshi, Utkarsh Tyagi, Sonal Kumar, Ashish Seth, Ramaneswaran Selvakumar, Oriol Nieto, Ramani Duraiswami, Sreyan Ghosh, Dinesh Manocha:
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark. ICLR 2025
[c36]Sreyan Ghosh, Zhifeng Kong, Sonal Kumar, S. Sakshi, Jaehyeon Kim, Wei Ping, Rafael Valle, Dinesh Manocha, Bryan Catanzaro:
Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities. ICML 2025
[c35]Ramaneswaran Selvakumar, Sonal Kumar, Hemant Kumar Giri, Nishit Anand, Ashish Seth, Sreyan Ghosh, Dinesh Manocha:
Do Audio-Language Models Understand Linguistic Variations? NAACL (Short Papers) 2025: 899-913
[c34]Ashish Seth, Ramaneswaran Selvakumar, Sonal Kumar, Sreyan Ghosh, Dinesh Manocha:
PAT: Parameter-Free Audio-Text Aligner to Boost Zero-Shot Audio Classification. NAACL (Long Papers) 2025: 12376-12394
[c33]Sonal Kumar, Sreyan Ghosh, Utkarsh Tyagi, Anton Jeran Ratnarajah, Chandra Kiran Reddy Evuru, Ramani Duraiswami, Dinesh Manocha:
ProSE: Diffusion Priors for Speech Enhancement. NAACL (Long Papers) 2025: 12470-12483
[i54]Sreyan Ghosh, Zhifeng Kong, Sonal Kumar, S. Sakshi, Jaehyeon Kim, Wei Ping, Rafael Valle, Dinesh Manocha, Bryan Catanzaro:
Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities. CoRR abs/2503.03983 (2025)
[i53]Chao-Han Huck Yang, Sreyan Ghosh, Qing Wang, Jaeyeon Kim, Hengyi Hong, Sonal Kumar, Guirui Zhong, Zhifeng Kong, Sakshi Singh, Vaibhavi Lokegaonkar, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha, Gunhee Kim, Jun Du, Rafael Valle, Bryan Catanzaro:
Multi-Domain Audio Question Answering Toward Acoustic Content Reasoning in The DCASE 2025 Challenge. CoRR abs/2505.07365 (2025)
[i52]Arushi Goel, Sreyan Ghosh, Jaehyeon Kim, Sonal Kumar, Zhifeng Kong, Sang-gil Lee, Chao-Han Huck Yang, Ramani Duraiswami, Dinesh Manocha, Rafael Valle, Bryan Catanzaro:
Audio Flamingo 3: Advancing Audio Intelligence with Fully Open Large Audio Language Models. CoRR abs/2507.08128 (2025)
[i51]Ramaneswaran Selvakumar, Ashish Seth, Nishit Anand, Utkarsh Tyagi, Sonal Kumar, Sreyan Ghosh, Dinesh Manocha:
MultiVox: Benchmarking Voice Assistants for Multimodal Interactions. CoRR abs/2507.10859 (2025)
[i50]Zhifeng Kong, Arushi Goel, João Felipe Santos, Sreyan Ghosh, Rafael Valle, Wei Ping, Bryan Catanzaro:
Audio Flamingo Sound-CoT Technical Report: Improving Chain-of-Thought Reasoning in Sound Understanding. CoRR abs/2508.11818 (2025)
[i49]Ashish Seth, Utkarsh Tyagi, Ramaneswaran Selvakumar, Nishit Anand, Sonal Kumar, Sreyan Ghosh, Ramani Duraiswami, Chirag Agarwal, Dinesh Manocha:
EGOILLUSION: Benchmarking Hallucinations in Egocentric Video Understanding. CoRR abs/2508.12687 (2025)
[i48]Sonal Kumar, Simon Sedlácek, Vaibhavi Lokegaonkar, Fernando López, Wenyi Yu, Nishit Anand, Hyeonggon Ryu, Lichang Chen, Maxim Plicka, Miroslav Hlavácek, William Fineas Ellingwood, Sathvik Udupa, Siyuan Hou, Allison Ferner, Sara Barahona, Cecilia Bolaños, Satish Rahi, Laura Herrera-Alarcón, Satvik Dixit, Rupali S. Patil, Soham Deshmukh, Lasha Koroshinadze, Yao Liu, Leibny Paola Garcia Perera, Eleni Zanou, Themos Stafylakis, Joon Son Chung, David Harwath, Chao Zhang, Dinesh Manocha, Alicia Lozano-Diez, Santosh Kesiraju, Sreyan Ghosh, Ramani Duraiswami:
MMAU-Pro: A Challenging and Comprehensive Benchmark for Holistic Evaluation of Audio General Intelligence. CoRR abs/2508.13992 (2025)
[i47]Jinchuan Tian, Sang-gil Lee, Zhifeng Kong, Sreyan Ghosh, Arushi Goel, Chao-Han Huck Yang, Wenliang Dai, Zihan Liu, Hanrong Ye, Shinji Watanabe, Mohammad Shoeybi, Bryan Catanzaro, Rafael Valle, Wei Ping:
UALM: Unified Audio Language Model for Understanding, Generation and Reasoning. CoRR abs/2510.12000 (2025)
[i46]Hanrong Ye, Chao-Han Huck Yang, Arushi Goel, Wei Huang, Ligeng Zhu, Yuanhang Su, Sean Lin, An-Chieh Cheng, Zhen Wan, Jinchuan Tian, Yuming Lou, Dong Yang, Zhijian Liu, Yukang Chen, Ambrish Dantrey, Ehsan Jahangiri, Sreyan Ghosh, Daguang Xu, Ehsan Hosseini-Asl, Danial Mohseni-Taheri, Vidya Murali, Sifei Liu, Yao Lu, Oluwatobi Olabiyi, Yu-Chiang Frank Wang, Rafael Valle, Bryan Catanzaro, Andrew Tao, Song Han, Jan Kautz, Hongxu Yin, Pavlo Molchanov:
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM. CoRR abs/2510.15870 (2025)
[i45]Sakshi Singh, Vaibhavi Lokegaonkar, Neil Zhang, Ramani Duraiswami, Sreyan Ghosh, Dinesh Manocha, Lie Lu:
SPUR: A Plug-and-Play Framework for Integrating Spatial Audio Understanding and Reasoning into Large Audio-Language Models. CoRR abs/2511.06606 (2025)
[i44]Sreyan Ghosh, Arushi Goel, Lasha Koroshinadze, Sang-gil Lee, Zhifeng Kong, João Felipe Santos, Ramani Duraiswami, Dinesh Manocha, Weiping Ding, Mohammad Shoeybi, Bryan Catanzaro:
Music Flamingo: Scaling Music Understanding in Audio Language Models. CoRR abs/2511.10289 (2025)
- 2024
[c32]Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Utkarsh Tyagi, S. Sakshi, Sanjoy Chowdhury, Dinesh Manocha:
ASPIRE: Language-Guided Data Augmentation for Improving Robustness Against Spurious Correlations. ACL (Findings) 2024: 386-406
[c31]Sreyan Ghosh, Utkarsh Tyagi, Sonal Kumar, Chandra Kiran Reddy Evuru, Ramaneswaran S., S. Sakshi, Dinesh Manocha:
ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions. ACL (1) 2024: 726-748
[c30]Anton Ratnarajah, Sreyan Ghosh, Sonal Kumar, Purva Chiniya, Dinesh Manocha:
AV-RIR: Audio-Visual Room Impulse Response Estimation. CVPR 2024: 27154-27165
[c29]Sreyan Ghosh, Sonal Kumar, Ashish Seth, Chandra Kiran Reddy Evuru, Utkarsh Tyagi, S. Sakshi, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha:
GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities. EMNLP 2024: 6288-6313
[c28]Ashish Seth, Ramaneswaran Selvakumar, S. Sakshi, Sonal Kumar, Sreyan Ghosh, Dinesh Manocha:
EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning. EMNLP 2024: 6386-6400
[c27]Sreyan Ghosh, Sonal Kumar, Chandra Kiran Reddy Evuru, Ramani Duraiswami, Dinesh Manocha:
Recap: Retrieval-Augmented Audio Captioning. ICASSP 2024: 1161-1165
[c26]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
Stable Distillation: Regularizing Continued Pre-Training for Low-Resource Automatic Speech Recognition. ICASSP 2024: 10821-10825
[c25]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
FusDom: Combining in-Domain and Out-of-Domain Knowledge for Continuous Self-Supervised Learning. ICASSP 2024: 12572-12576
[c24]Sreyan Ghosh, Ashish Seth, Sonal Kumar, Utkarsh Tyagi, Chandra Kiran Reddy Evuru, Ramaneswaran S., Sakshi Singh, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha:
CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models. ICLR 2024
[c23]Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Ramaneswaran S., Deepali Aneja, Zeyu Jin, Ramani Duraiswami, Dinesh Manocha:
A Closer Look at the Limitations of Instruction Tuning. ICML 2024
[c22]Sreyan Ghosh, Sonal Kumar, Ashish Seth, Purva Chiniya, Utkarsh Tyagi, Ramani Duraiswami, Dinesh Manocha:
LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition. INTERSPEECH 2024
[c21]Sonal Kumar, Sreyan Ghosh, S. Sakshi, Utkarsh Tyagi, Dinesh Manocha:
Do Vision-Language Models Understand Compound Nouns? NAACL (Short Papers) 2024: 519-527
[c20]Chandra Kiran Reddy Evuru, Sreyan Ghosh, Sonal Kumar, Ramaneswaran S., Utkarsh Tyagi, Dinesh Manocha:
CoDa: Constrained Generation based Data Augmentation for Low-Resource NLP. NAACL-HLT (Findings) 2024: 3754-3769
[i43]Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Ramaneswaran S., Deepali Aneja, Zeyu Jin, Ramani Duraiswami, Dinesh Manocha:
A Closer Look at the Limitations of Instruction Tuning. CoRR abs/2402.05119 (2024)
[i42]Chandra Kiran Reddy Evuru, Sreyan Ghosh, Sonal Kumar, Ramaneswaran S., Utkarsh Tyagi, Dinesh Manocha:
CoDa: Constrained Generation based Data Augmentation for Low-Resource NLP. CoRR abs/2404.00415 (2024)
[i41]Sonal Kumar, Sreyan Ghosh, Sakshi Singh, Utkarsh Tyagi, Dinesh Manocha:
Do Vision-Language Models Understand Compound Nouns? CoRR abs/2404.00419 (2024)
[i40]Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Utkarsh Tyagi, Oriol Nieto, Zeyu Jin, Dinesh Manocha:
VDGD: Mitigating LVLM Hallucinations in Cognitive Prompts by Bridging the Visual Perception Gap. CoRR abs/2405.15683 (2024)
[i39]Sreyan Ghosh, Utkarsh Tyagi, Sonal Kumar, Chandra Kiran Reddy Evuru, S. Ramaneswaran, Sakshi Singh, Dinesh Manocha:
ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions. CoRR abs/2406.04286 (2024)
[i38]Sreyan Ghosh, Sonal Kumar, Ashish Seth, Purva Chiniya, Utkarsh Tyagi, Ramani Duraiswami, Dinesh Manocha:
LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition. CoRR abs/2406.04432 (2024)
[i37]Sreyan Ghosh, Sonal Kumar, Ashish Seth, Chandra Kiran Reddy Evuru, Utkarsh Tyagi, Sakshi Singh, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha:
GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities. CoRR abs/2406.11768 (2024)
[i36]Sreyan Ghosh, Sonal Kumar, Chandra Kiran Reddy Evuru, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha:
ReCLAP: Improving Zero Shot Audio Classification by Describing Sounds. CoRR abs/2409.09213 (2024)
[i35]Sreyan Ghosh, Sonal Kumar, Zhifeng Kong, Rafael Valle, Bryan Catanzaro, Dinesh Manocha:
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data. CoRR abs/2410.02056 (2024)
[i34]Ashish Seth, Ramaneswaran Selvakumar, S. Sakshi, Sonal Kumar, Sreyan Ghosh, Dinesh Manocha:
EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning. CoRR abs/2410.13179 (2024)
[i33]Sreyan Ghosh, Mohammad Sadegh Rasooli, Michael Levit, Peidong Wang, Jian Xue, Dinesh Manocha, Jinyu Li:
Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation. CoRR abs/2410.13198 (2024)
[i32]Ashish Seth, Ramaneswaran Selvakumar, Sonal Kumar, Sreyan Ghosh, Dinesh Manocha:
PAT: Parameter-Free Audio-Text Aligner to Boost Zero-Shot Audio Classification. CoRR abs/2410.15062 (2024)
[i31]Ramaneswaran Selvakumar, Sonal Kumar, Hemant Kumar Giri, Nishit Anand, Ashish Seth, Sreyan Ghosh, Dinesh Manocha:
Do Audio-Language Models Understand Linguistic Variations? CoRR abs/2410.16505 (2024)
[i30]S. Sakshi, Utkarsh Tyagi, Sonal Kumar, Ashish Seth, Ramaneswaran Selvakumar, Oriol Nieto, Ramani Duraiswami, Sreyan Ghosh, Dinesh Manocha:
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark. CoRR abs/2410.19168 (2024)
- 2023
[c19]Sreyan Ghosh, Utkarsh Tyagi, Manan Suri, Sonal Kumar, Ramaneswaran S., Dinesh Manocha:
ACLM: A Selective-Denoising based Generative Data Augmentation Approach for Low-Resource Complex NER. ACL (1) 2023: 104-125
[c18]Sreyan Ghosh, Manan Suri, Purva Chiniya, Utkarsh Tyagi, Sonal Kumar, Dinesh Manocha:
CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network. EMNLP 2023: 6159-6173
[c17]Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Ramaneswaran S., Sakshi Singh, Utkarsh Tyagi, Dinesh Manocha:
DALE: Generative Data Augmentation for Low-Resource Legal NLP. EMNLP 2023: 8511-8565
[c16]Sreyan Ghosh, Ashish Seth, Srinivasan Umesh, Dinesh Manocha:
MAST: Multiscale Audio Spectrogram Transformers. ICASSP 2023: 1-5
[c15]Vasista Sai Lodagala, Sreyan Ghosh, Srinivasan Umesh:
Data2vec-Aqc: Search for the Right Teaching Assistant in the Teacher-Student Training Setup. ICASSP 2023: 1-5
[c14]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
Unfused: Unsupervised Finetuning Using Self Supervised Distillation. ICASSP Workshops 2023: 1-5
[c13]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
SLICER: Learning Universal Audio Representations Using Low-Resource Self-Supervised Pre-Training. ICASSP 2023: 1-5
[c12]Sanjoy Chowdhury, Sreyan Ghosh, Subhrajyoti Dasgupta, Anton Ratnarajah, Utkarsh Tyagi, Dinesh Manocha:
AdVerb: Visually Guided Audio Dereverberation. ICCV 2023: 7850-7862
[c11]Sreyan Ghosh, Utkarsh Tyagi, S. Ramaneswaran, Harshvardhan Srivastava, Dinesh Manocha:
MMER: Multimodal Multi-task Learning for Speech Emotion Recognition. INTERSPEECH 2023: 1209-1213
[c10]Sreyan Ghosh, Utkarsh Tyagi, Sonal Kumar, Dinesh Manocha:
BioAug: Conditional Generation based Data Augmentation for Low-Resource Biomedical NER. SIGIR 2023: 1853-1858
[i29]Sreyan Ghosh, Manan Suri, Purva Chiniya, Utkarsh Tyagi, Sonal Kumar, Dinesh Manocha:
CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network. CoRR abs/2303.03387 (2023)
[i28]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation. CoRR abs/2303.05668 (2023)
[i27]Sreyan Ghosh, Utkarsh Tyagi, Sonal Kumar, Dinesh Manocha:
BioAug: Conditional Generation based Data Augmentation for Low-Resource Biomedical NER. CoRR abs/2305.10647 (2023)
[i26]Sreyan Ghosh, Utkarsh Tyagi, Manan Suri, Sonal Kumar, Ramaneswaran S., Dinesh Manocha:
ACLM: A Selective-Denoising based Generative Data Augmentation Approach for Low-Resource Complex NER. CoRR abs/2306.00928 (2023)
[i25]Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Utkarsh Tyagi, Sakshi Singh, Sanjoy Chowdhury, Dinesh Manocha:
ASPIRE: Language-Guided Augmentation for Robust Image Classification. CoRR abs/2308.10103 (2023)
[i24]Sanjoy Chowdhury, Sreyan Ghosh, Subhrajyoti Dasgupta, Anton Ratnarajah, Utkarsh Tyagi, Dinesh Manocha:
AdVerb: Visually Guided Audio Dereverberation. CoRR abs/2308.12370 (2023)
[i23]Sreyan Ghosh, Sonal Kumar, Chandra Kiran Reddy Evuru, Ramani Duraiswami, Dinesh Manocha:
RECAP: Retrieval-Augmented Audio Captioning. CoRR abs/2309.09836 (2023)
[i22]Sreyan Ghosh, Ashish Seth, Sonal Kumar, Utkarsh Tyagi, Chandra Kiran Reddy Evuru, Ramaneswaran S., Sakshi Singh, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha:
CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models. CoRR abs/2310.08753 (2023)
[i21]Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Ramaneswaran S., Sakshi Singh, Utkarsh Tyagi, Dinesh Manocha:
DALE: Generative Data Augmentation for Low-Resource Legal NLP. CoRR abs/2310.15799 (2023)
[i20]Anton Ratnarajah, Sreyan Ghosh, Sonal Kumar, Purva Chiniya, Dinesh Manocha:
AV-RIR: Audio-Visual Room Impulse Response Estimation. CoRR abs/2312.00834 (2023)
[i19]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
Stable Distillation: Regularizing Continued Pre-training for Low-Resource Automatic Speech Recognition. CoRR abs/2312.12783 (2023)
[i18]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
FusDom: Combining In-Domain and Out-of-Domain Knowledge for Continuous Self-Supervised Learning. CoRR abs/2312.13026 (2023)
- 2022
[j2]Sreyan Ghosh, Ashish Seth, Srinivasan Umesh:
Decorrelating Feature Spaces for Learning General-Purpose Audio Representations. IEEE J. Sel. Top. Signal Process. 16(6): 1402-1414 (2022)
[c9]Ramaneswaran S., Sean Benhur, Sreyan Ghosh:
Span Extraction Aided Improved Code-mixed Sentiment Classification. W-NUT@COLING 2022: 162-170
[c8]Sreyan Ghosh, Sonal Kumar, Yaman Kumar, Rajiv Ratn Shah, Srinivasan Umesh:
Span Classification with Structured Information for Disfluency Detection in Spoken Utterances. INTERSPEECH 2022: 3998-4002
[c7]Sreyan Ghosh, Samden Lepcha, Sakshi Singh, Rajiv Ratn Shah, Srinivasan Umesh:
DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances. INTERSPEECH 2022: 5185-5189
[c6]Vasista Sai Lodagala, Sreyan Ghosh, Srinivasan Umesh:
CCC-WAV2VEC 2.0: Clustering AIDED Cross Contrastive Self-Supervised Learning of Speech Representations. SLT 2022: 1-8
[c5]Vasista Sai Lodagala, Sreyan Ghosh, Srinivasan Umesh:
PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations. SLT 2022: 136-143
[i17]Sreyan Ghosh, Ashish Seth, Srinivasan Umesh:
DeLoRes: Decorrelating Latent Spaces for Low-Resource Audio Representation Learning. CoRR abs/2203.13628 (2022)
[i16]Sreyan Ghosh, Sonal Kumar, Yaman Kumar Singla, Rajiv Ratn Shah, Srinivasan Umesh:
Span Classification with Structured Information for Disfluency Detection in Spoken Utterances. CoRR abs/2203.16028 (2022)
[i15]Harshvardhan Srivastava, Sreyan Ghosh, Srinivasan Umesh:
MMER: Multimodal Multi-task learning for Emotion Recognition in Spoken Utterances. CoRR abs/2203.16794 (2022)
[i14]Sreyan Ghosh, Harshvardhan Srivastava, Srinivasan Umesh:
A Discourse Aware Sequence Learning Approach for Emotion Recognition in Conversations. CoRR abs/2203.16799 (2022)
[i13]Lodagala V. S. V. Durga Prasad, Sreyan Ghosh, Srinivasan Umesh:
PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations. CoRR abs/2203.16965 (2022)
[i12]Lodagala Durga Prasad, Ashish Seth, Sreyan Ghosh, Srinivasan Umesh:
Analyzing the factors affecting usefulness of Self-Supervised Pre-trained Representations for Speech Recognition. CoRR abs/2203.16973 (2022)
[i11]Vasista Sai Lodagala, Sreyan Ghosh, Srinivasan Umesh:
CCC-wav2vec 2.0: Clustering aided Cross Contrastive Self-supervised learning of speech representations. CoRR abs/2210.02592 (2022)
[i10]Vasista Sai Lodagala, Sreyan Ghosh, Srinivasan Umesh:
data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup. CoRR abs/2211.01246 (2022)
[i9]Sreyan Ghosh, Ashish Seth, Srinivasan Umesh, Dinesh Manocha:
MAST: Multiscale Audio Spectrogram Transformers. CoRR abs/2211.01515 (2022)
[i8]Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
SLICER: Learning universal audio representations using low-resource self-supervised pre-training. CoRR abs/2211.01519 (2022)
[i7]Sreyan Ghosh, Utkarsh Tyagi, Sonal Kumar, Manan Suri, Rajiv Ratn Shah:
A novel multimodal dynamic fusion network for disfluency detection in spoken utterances. CoRR abs/2211.14700 (2022)
- 2021
[c4]Zaki Mustafa Farooqi, Sreyan Ghosh, Rajiv Ratn Shah:
Leveraging Transformers for Hate Speech Detection in Conversational Code-Mixed Tweets. FIRE (Working Notes) 2021: 63-74
[c3]Sreyan Ghosh, Sonal Kumar:
Cisco at SemEval-2021 Task 5: What's Toxic?: Leveraging Transformers for Multiple Toxic Span Extraction from Online Comments. SemEval@ACL/IJCNLP 2021: 249-257
[i6]Sreyan Ghosh, Sonal Kumar, Harsh Jalan, Hemant Yadav, Rajiv Ratn Shah:
Cisco at AAAI-CAD21 shared task: Predicting Emphasis in Presentation Slides using Contextualised Embeddings. CoRR abs/2101.11422 (2021)
[i5]Sreyan Ghosh, Sonal Kumar:
Cisco at SemEval-2021 Task 5: What's Toxic?: Leveraging Transformers for Multiple Toxic Span Extraction from Online Comments. CoRR abs/2105.13959 (2021)
[i4]Sreyan Ghosh, Samden Lepcha, Sakshi Singh, Rajiv Ratn Shah:
Speech Toxicity Analysis: A New Spoken Language Processing Task. CoRR abs/2110.07592 (2021)
[i3]Sreyan Ghosh, Sandesh V. Katta, Ashish Seth, Srinivasan Umesh:
Deep Clustering For General-Purpose Audio Representations. CoRR abs/2110.08895 (2021)
[i2]Zaki Mustafa Farooqi, Sreyan Ghosh, Rajiv Ratn Shah:
Leveraging Transformers for Hate Speech Detection in Conversational Code-Mixed Tweets. CoRR abs/2112.09986 (2021)
- 2020
[c2]Hemant Yadav, Sreyan Ghosh, Yi Yu, Rajiv Ratn Shah:
End-to-End Named Entity Recognition from English Speech. INTERSPEECH 2020: 4268-4272
[i1]Hemant Yadav, Sreyan Ghosh, Yi Yu, Rajiv Ratn Shah:
End-to-end Named Entity Recognition from English Speech. CoRR abs/2005.11184 (2020)
2010 – 2019
- 2016
[j1]Sudip Pan, Diego Moreno, Sreyan Ghosh, Pratim Kumar Chattaraj, Gabriel Merino:
Structure and stability of noble gas bound EX3+ compounds (E = C, Ge, Sn, Pb; X = H, F, Cl, Br). J. Comput. Chem. 37(2): 226-236 (2016)
[c1]Marcel Penz, Sreyan Ghosh:
Embodied Material Guidance: Augmenting Material for Carving. FMT 2016: 100-104
last updated on 2026-01-16 01:26 CET by the dblp team
all metadata released as open data under CC0 1.0 license