


27th TSD 2024: Brno, Czech Republic - Part II
- Elmar Nöth, Ales Horák, Petr Sojka: Text, Speech, and Dialogue - 27th International Conference, TSD 2024, Brno, Czech Republic, September 9-13, 2024, Proceedings, Part II. Lecture Notes in Computer Science 15049, Springer 2024, ISBN 978-3-031-70565-6
Speech
- Gokul Srinivasagan, Munir Georges: Retrieval Augmented Spoken Language Generation for Transport Domain. 3-12
- Sven Aller, Mark Fishel: Adapting Audiovisual Speech Synthesis to Estonian. 13-23
- Dosti Aziz, Dávid Sztahó: Dysphonia Diagnosis Using Self-supervised Speech Models in Mono and Cross-Lingual Settings. 24-35
- Daniel Tihelka, Jindrich Matousek, Zdenek Hanzlícek, Lukás Vladar: Sentences vs Phrases in Neural Speech Synthesis. 36-45
- Jan Lehecka, Zdenek Hanzlícek, Jindrich Matousek, Daniel Tihelka: Zero-Shot vs. Few-Shot Multi-speaker TTS Using Pre-trained Czech SpeechT5 Model. 46-57
- Mohammed Hamzah Abed, Dávid Sztahó: Deep Speaker Embeddings for Speaker Verification of Children. 58-69
- Kwok Chin Yuen, Jia Qi Yip, Eng Siong Chng: Improved Alignment for Score Combination of RNN-T and CTC Decoder for Online Decoding. 70-80
- Erfan A. Shams, Julie Carson-Berndsen: Attention to Phonetics: A Visually Informed Explanation of Speech Transformers. 81-93
- Lukás Vladar, Jindrich Matousek: Effects of Training Strategies and the Amount of Speech Data on the Quality of Speech Synthesis. 94-104
- Santiago Andres Moreno-Acevedo, Juan Camilo Vásquez-Correa, Juan M. Martín-Doñas, Aitor Álvarez: Stream-based Active Learning for Speech Emotion Recognition via Hybrid Data Selection and Continuous Learning. 105-117
- Zdenek Hanzlícek: Data Alignment and Duration Modelling in VITS. 118-129
- Ilaria Manfredi: Multiword Expressions Resources for Italian: Presenting a Manually Annotated Spoken Corpus. 130-138
- David Portes, Ales Horák: Generating High-Quality F0 Embeddings Using the Vector-Quantized Variational Autoencoder. 139-148
- Abner Hernandez, Paula Andrea Pérez-Toro, Tomás Arias-Vergara, Juan Camilo Vásquez-Correa, Seung Hee Yang, Juan Rafael Orozco-Arroyave, Andreas K. Maier: Anonymizing Dysarthric Speech: Investigating the Effects of Voice Conversion on Pathological Information Preservation. 149-160
- Mala J. B., S. M. Alex Raj, Rajeev Rajan: X-Vector-Based Speaker Diarization Using Bi-LSTM and Interim Voting-Driven Post-processing. 161-173
- Thibault Bañeras Roux, Mickael Rouvier, Jane Wottawa, Richard Dufour: A Paradigm for Interpreting Metrics and Measuring Error Severity in Automatic Speech Recognition. 174-183
- Maros Jakubec, Roman Jarina, Eva Lieskovska, Peter Kasak, Michal Spisiak: Enhancing Speech Emotion Recognition Using Transfer Learning from Speaker Embeddings. 184-195
Dialogue
- Lucas Druart, Valentin Vielzeuf, Yannick Estève: Investigating Low-Cost LLM Annotation for Spoken Dialogue Understanding Datasets. 199-209
- Kwan Yeung Wong, Korris Fu-Lai Chung: PiCo-VITS: Leveraging Pitch Contours for Fine-Grained Emotional Speech Synthesis. 210-221
- Daniel Ortega, Steven Söhnel, Ngoc Thang Vu: Improving and Understanding Clarifying Question Generation in Conversational Search. 222-235
- Duygu Altinok: Explainable Multimodal Fusion for Dementia Detection From Text and Speech. 236-251
- Diego Alexander Lopez-Santander, Cristian David Ríos-Urrego, Christian Bergler, Elmar Nöth, Juan Rafael Orozco-Arroyave: Robust Classification of Parkinson's Speech: an Approximation to a Scenario With Non-controlled Acoustic Conditions. 252-262
- Ondrej Sotolár, Jaromír Plhák, David Smahel: Leveraging Conceptual Similarities to Enhance Modeling of Factors Affecting Adolescents' Well-Being. 263-274
- Ankit Kumar, Munir Georges: Joint-Average Mean and Variance Feature Matching (JAMVFM) Semi-supervised GAN with Additional-Objective Training Function for Intent Detection. 275-287
- Niko Kleer, Leon Weyand, Michael Feld, Klaus Berberich: Capturing Task-Related Information for Text-Based Grasp Classification Using Fine-Tuned Embeddings. 288-299
- Julian Wolter, Niko Kleer, Michael Feld: StepDP: A Step Towards Expressive and Pervasive Dialogue Platforms. 300-312
- Jeferson David Gallo-Aristizábal, Daniel Escobar-Grisales, Cristian David Ríos-Urrego, Elmar Nöth, Juan Rafael Orozco-Arroyave: Automatic Classification of Parkinson's Disease Using Wav2vec Embeddings at Phoneme, Syllable, and Word Levels. 313-323
