


24th SPECOM 2022: Gurugram, India
- S. R. Mahadeva Prasanna, Alexey Karpov, K. Samudravijaya, Shyam S. Agrawal:
Speech and Computer - 24th International Conference, SPECOM 2022, Gurugram, India, November 14-16, 2022, Proceedings. Lecture Notes in Computer Science 13721, Springer 2022, ISBN 978-3-031-20979-6
- Eleonora Akinshina, Tatiana Y. Sherstinova:
Thematic Diversity of Everyday Russian Discourse: A Case Study Based on the ORD Corpus. 1-9
- Jahangir Alam, Woo Hyun Kang, Abderrahim Fathan:
Neural Embedding Extractors for Text-Independent Speaker Verification. 10-23
- Anastasia Avdeeva, Sergey Novoselov:
Deep Speaker Embeddings Based Online Diarization. 24-32
- Shikha Baghel, S. R. M. Prasanna, Prithwijit Guha:
Overlapped Speech Detection Using AM-FM Based Time-Frequency Representations. 33-43
- Oindrila Banerjee, D. Govind, Akhilesh Kumar Dubey, Suryakanth V. Gangashetty:
Significance of Dimensionality Reduction in CNN-Based Vowel Classification from Imagined Speech Using Electroencephalogram Signals. 44-55
- Shweta Bansal, Shambhu Sharan, Shyam S. Agrawal:
Study of Speech Recognition System Based on Transformer and Connectionist Temporal Classification Models for Low Resource Language. 56-63
- Rhythm Bhatia, Tomi H. Kinnunen:
An Initial Study on Birdsong Re-synthesis Using Neural Vocoders. 64-74
- Mrinmoy Bhattacharjee, S. R. Mahadeva Prasanna, Prithwijit Guha:
Speech Music Overlap Detection Using Spectral Peak Evolutions. 75-86
- Joyshree Chakraborty, Rohit Sinha, Priyankoo Sarmah:
Influence of Accented Speech in Automatic Speech Recognition: A Case Study on Assamese L1 Speakers Speaking Code Switched Hindi-English. 87-98
- Daniil Chernyshev, Boris V. Dobrov:
ClusterVote: Automatic Summarization Dataset Construction with Document Clusters. 99-113
- Shanatip Choosaksakunwiboon, Karla Pizzi, Ching-Yu Kao:
Comparing Unsupervised Detection Algorithms for Audio Adversarial Examples. 114-127
- Maria Chubarova, Tatiana Shevchenko:
Celtic English Continuum in Pitch Patterns of Spontaneous Talk: Evidence of Long-Term Contacts. 128-138
- Dadi Ramesh, Suresh Kumar Sanampudi:
Coherence Based Automatic Essay Scoring Using Sentence Embedding and Recurrent Neural Networks. 139-154
- Goutam Datta, Nisheeth Joshi, Kusum Gupta:
Analysis of Automatic Evaluation Metric on Low-Resourced Language: BERTScore vs BLEU Score. 155-162
- Denis Dresvyanskiy, Yamini Sinha, Matthias Busch, Ingo Siegert, Alexey Karpov, Wolfgang Minker:
DyCoDa: A Multi-modal Data Collection of Multi-user Remote Survival Game Recordings. 163-177
- José Vicente Egas López, Róbert Busa-Fekete, Gábor Gosztolya:
On the Use of Ensemble X-Vector Embeddings for Improved Sleepiness Detection. 178-187
- Abderrahim Fathan, Jahangir Alam, Woo Hyun Kang:
Multiresolution Decomposition Analysis via Wavelet Transforms for Audio Deepfake Detection. 188-200
- Parismita Gogoi, Priyankoo Sarmah, S. R. M. Prasanna:
Automatic Rhythm and Speech Rate Analysis of Mising Spontaneous Speech. 201-213
- Aleksey Grigorev, Anna V. Kurazhova, Egor Kleshnev, Aleksandr Nikolaev, Olga V. Frolova, Elena E. Lyakso:
An Electroglottographic Method for Assessing the Emotional State of the Speaker. 214-225
- Priyanka Gupta, Hemant A. Patil:
Significance of Distance on Pop Noise for Voice Liveness Detection. 226-237
- Vishwa Gupta, Gilles Boulianne:
CRIM's Speech Recognition System for OpenASR21 Evaluation with Conformer and Voice Activity Detector Embeddings. 238-251
- Alisa P. Gvozdeva, Alexander M. Lunichkin, Larisa G. Zaytseva, Elena A. Ogorodnikova, Irina G. Andreeva:
Joint Changes in First and Second Formants of /a/, /i/, /u/ Vowels in Babble Noise - a New Statistical Approach. 252-264
- Maria-Loulou Hajj, Martin Lenglet, Olivier Perrotin, Gérard Bailly:
Comparing NLP Solutions for the Disambiguation of French Heterophonic Homographs for End-to-End TTS Systems. 265-278
- Attila Zoltán Jenei, Gábor Kiss, Dávid Sztahó:
Detection of Speech Related Disorders by Pre-trained Embedding Models Extracted Biomarkers. 279-289
- Mélanie Jouaiti, Kerstin Dautenhahn:
Multi-label Dysfluency Classification. 290-301
- Mélanie Jouaiti, Kerstin Dautenhahn:
Harnessing Uncertainty - Multi-label Dysfluency Classification with Uncertain Labels. 302-311
- Aastha Kachhi, Anand Therattil, Priyanka Gupta, Hemant A. Patil:
Continuous Wavelet Transform for Severity-Level Classification of Dysarthria. 312-324
- Aastha Kachhi, Anand Therattil, Ankur T. Patil, Hardik B. Sailor, Hemant A. Patil:
Significance of Energy Features for Severity Classification of Dysarthria. 325-337
- Woo Hyun Kang, Jahangir Alam, Abderrahim Fathan:
An Analytic Study on Clustering-Based Pseudo-labels for Self-supervised Deep Speaker Verification. 338-348
- Irina S. Kipyatkova:
Investigation of Transfer Learning for End-to-End Russian Speech Recognition. 349-357
- Uliana E. Kochetkova, Pavel A. Skrelin, Rada German, Daria Novoselova:
Prosodic Features of Verbal Irony in Russian and French: Universal vs. Language-Specific. 358-371
- Liliya Komalova, Lyubov Kalyuzhnaya:
Categorization of Threatening Speech Acts. 372-381
- Evgeny Kostyuchenko, Ivan Rakhmanenko, Lidiya N. Balatskaya:
Assessment of Speech Quality During Speech Rehabilitation Based on the Solution of the Classification Problem. 382-390
- Dani Krebbers, Heysem Kaya, Alexey Karpov:
Multi-level Fusion of Fisher Vector Encoded BERT and Wav2vec 2.0 Embeddings for Native Language Identification. 391-403
- Devesh Kumar, Pavan Kumar V. Patil, Ayush Agarwal, S. R. Mahadeva Prasanna:
Fake Speech Detection Using OpenSMILE Features. 404-415
- Anna Leonteva, Tatiana Sokoreva:
Nonverbal Constituents of Argumentative Discourse: Gesture and Prosody Interaction. 416-425
- Seema Lokhandwala, Priyankoo Sarmah, Rohit Sinha:
Classifying Mahout and Social Interactions of Asian Elephants Based on Trumpet Calls. 426-437
- Elena E. Lyakso, Olga V. Frolova, Anton Matveev, Yuri Matveev, Aleksey Grigorev, Olesia Makhnytkina, Nersisson Ruban:
Recognition of the Emotional State of Children with Down Syndrome by Video, Audio and Text Modalities: Human and Automatic. 438-450
- Raghav Magazine, Ayush Agarwal, Anand Hedge, S. R. Mahadeva Prasanna:
Fake Speech Detection Using Modulation Spectrogram. 451-463
- Danila Mamontov, Wolfgang Minker, Alexey Karpov:
Self-Configuring Genetic Programming Feature Generation in Affect Recognition Tasks. 464-476
- Jose Mathew, Pranjal Sahu, Bhavuk Singhal, Aniket Joshi, Krishna Reddy Medikonda, Jairaj Sathyanarayana:
A Multi-modal Approach to Mining Intent from Code-Mixed Hindi-English Calls in the Hyperlocal-Delivery Domain. 477-493
- Jagabandhu Mishra, S. R. Mahadeva Prasanna:
Importance of Supra-Segmental Information and Self-Supervised Framework for Spoken Language Diarization Task. 494-507
- Anton Nesterenko, Ruslan Akhmerov, Yulia Matveeva, Anna Goremykina, Dmitry Astankov, Evgeniy Shuranov, Alexandra Shirshova:
Low-Resource Emotional Speech Synthesis: Transfer Learning and Data Requirements. 508-521
- Dariya Novokhrestova, Ilya A. Hodashinsky, Evgeny Kostyuchenko, Konstantin S. Sarin, Marina Bardamova:
Fuzzy Classifier for Speech Assessment in Speech Rehabilitation. 522-532
- Moumita Pakrashi, Shakuntala Mahanta:
Analysis-By-Synthesis Modeling of Bengali Intonation. 533-544
- K. S. Pavithra, H. M. Chandrashekar, Veena Karjigi:
Neural Network Based Curve Fitting to Enhance the Intelligibility of Dysarthric Speech. 545-553
- Pavel Posokhov, Anastasia Matveeva, Olesia Makhnytkina, Anton Matveev, Yuri Matveev:
Personalizing Retrieval-Based Dialogue Agents. 554-566
- Rodmonga Potapova, Vsevolod Potapov, Irina Kuryanova:
Forensic Identification of Foreign-Language Speakers by the Method of Structural-Melodic Analysis of Phonograms. 567-578
- Rodmonga Potapova, Vsevolod Potapov, Oleg Kuzmin:
Logistics Translator. Concept Vision on Future Interlanguage Computer Assisted Translation. 579-589
- Aditya Pusuluri, Aastha Kachhi, Hemant A. Patil:
Analysis of Time-Averaged Feature Extraction Techniques on Infant Cry Classification. 590-603
- Elena I. Riekhakaynen, Elena Zatevalova:
Should We Believe Our Eyes or Our Ears? Processing Incongruent Audiovisual Stimuli by Russian Listeners. 604-615
- Elena Ryumina, Denis Ivanko:
Emotional Speech Recognition Based on Lip-Reading. 616-625
- Anna Shestakova, Andrea Corradini:
Exploring the Use of Machine Learning for Resume Recommendations. 626-640
- Tatiana Sokoreva, Tatiana Shevchenko:
The Role of Pause in Interaction: A Case of Polylogue. 641-650
- Valery D. Solovyev, Musa Islamov, Venera Bayrasheva:
Dictionary with the Evaluation of Positivity/Negativity Degree of the Russian Words. 651-664
- Nikolaos Tsiftsis, Konstantinos Moustakas, Nikolaos D. Fakotakis:
Effects of Depth of Field on Focus Using a Virtual Reality Escape Room. 665-675
- Yaroslav Turovsky, Daniyar Wolf, Roman V. Meshcheryakov, Anastasia Iskhakova:
Dynamics of Frequency Characteristics of Visually Evoked Potentials of Electroencephalography During the Work with Brain-Computer Interfaces. 676-687
- Spoorthy Venkatesh, Shashidhar G. Koolagudi:
Device Robust Acoustic Scene Classification Using Adaptive Noise Reduction and Convolutional Recurrent Attention Neural Network. 688-699
- Zhandos Yessenbayev, Zhanibek Kozhirbayev:
Comparison of Word Embeddings of Unaligned Audio and Text Data Using Persistent Homology. 700-711
- Alexander Zatvornitskiy:
Low-Cost Training of Speech Recognition System for Hindi ASR Challenge 2022. 712-718
