32nd EUSIPCO 2024: Lyon, France
- 32nd European Signal Processing Conference, EUSIPCO 2024, Lyon, France, August 26-30, 2024. IEEE 2024, ISBN 978-9-4645-9361-7
- Junqi Zhao, Xubo Liu, Jinzheng Zhao, Yi Yuan, Qiuqiang Kong, Mark D. Plumbley, Wenwu Wang:
Universal Sound Separation with Self-Supervised Audio Masked Autoencoder. 1-5
- Shrishail Baligar, Shawn D. Newsam:
McRTSE: Multi-channel Reverberant Target Sound Extraction. 6-10
- Arnout Roebben, Toon van Waterschoot, Marc Moonen:
Cascaded Noise Reduction and Acoustic Echo Cancellation Based on an Extended Noise Reduction. 11-15
- Yanjue Song, Doyeon Kim, Hong-Goo Kang, Nilesh Madhu:
Spectrum-Aware Neural Vocoder Based on Self-Supervised Learning for Speech Enhancement. 16-20
- George Close, Thomas Hain, Stefan Goetze:
Hallucination in Perceptual Metric-Driven Speech Enhancement Networks. 21-25
- Yiwei Ding, Christof Weiß:
Towards Robust Local Key Estimation with a Musically Inspired Neural Network. 26-30
- Aapo Hakala, Trevor Kincy, Tuomas Virtanen:
Automatic Live Music Song Identification Using Multi-level Deep Sequence Similarity Learning. 31-35
- Yuta Kusaka, Akira Maezawa:
Mobile-AMT: Real-Time Polyphonic Piano Transcription for In-the-Wild Recordings. 36-40
- Barbara Pascal, Mathieu Lagrange:
On the Robustness of Musical Timbre Perception Models: From Perceptual to Learned Approaches. 41-45
- Jiacheng Gou, Yuheng Song, Chuang Shi, Huiyong Li:
Self-Supervised Mean Opinion Score Prediction of Phase-Vocoder-Based Virtual Bass System. 46-50
- David Perera, Slim Essid, Gaël Richard:
Invariance-Based Layer Regularization for Sound Event Detection. 51-55
- Modan Tailleur, Junwon Lee, Mathieu Lagrange, Keunwoo Choi, Laurie M. Heller, Keisuke Imoto, Yuki Okamoto:
Correlation of Fréchet Audio Distance With Human Perception of Environmental Audio Is Embedding Dependent. 56-60
- Alexander Werning, Reinhold Haeb-Umbach:
Target-Specific Dataset Pruning for Compression of Audio Tagging Models. 61-65
- Manu Harju, Irene Martín-Morató, Toni Heittola, Annamaria Mesaros:
Sound Event Detection with Soft Labels: A New Perspective on Evaluation. 66-70
- Shunsuke Tsubaki, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, Keisuke Imoto:
Refining Knowledge Transfer on Audio-Image Temporal Agreement for Audio-Text Cross Retrieval. 71-75
- Yuuki Tachioka:
Outlier Exposure with Efficient Division of Positive and Negative Examples for Anomalous Sound Detection. 76-80
- Shakeel A. Sheikh, Ina Kodrasi:
Impact of Speech Mode in Automatic Pathological Speech Detection. 81-85
- Mahdi Amiri, Ina Kodrasi:
Test-Time Adaptation for Automatic Pathological Speech Detection in Noisy Environments. 86-90
- Kota Dohi, Yohei Kawaguchi:
Distributed Collaborative Anomalous Sound Detection by Embedding Sharing. 91-95
- Manjunath Mulimani, Annamaria Mesaros:
Online Domain-Incremental Learning Approach to Classify Acoustic Scenes in All Locations. 96-100
- Takezo Ohta, Yoshiaki Bando, Keisuke Imoto, Masaki Onishi:
A Sequential Audio Spectrogram Transformer for Real-Time Sound Event Detection. 101-105
- Haruto Yokota, Mert Bozkurtlar, Benjamin Yen, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai:
A Video Vision Transformer for Sound Source Localization. 106-110
- Gabrielle Flood, Filip Elvander:
Multi-Source Localization and Data Association for Time-Difference of Arrival Measurements. 111-115
- Thomas Deppisch, Jens Ahrens, Sebastià V. Amengual Garí, Paul Calamia:
Spatial Room Impulse Response Identification from Rotating Equatorial Microphone Arrays. 116-120
- Shoma Ayano, Li Li, Shogo Seki, Daichi Kitamura:
Audio Spotforming Using Nonnegative Tensor Factorization with Attractor-Based Regularization. 121-125
- Xinmeng Luan, Marco Olivieri, Mirco Pezzoli, Fabio Antonacci, Augusto Sarti:
Complex-Valued Physics-Informed Neural Network for Near-Field Acoustic Holography. 126-130
- Chanho Park, Hyunsik Kang, Thomas Hain:
Character Error Rate Estimation for Automatic Speech Recognition of Short Utterances. 131-135
- Cong-Thanh Do, Shuhei Imai, Rama Doddipatla, Thomas Hain:
Improving Accented Speech Recognition Using Data Augmentation Based on Unsupervised Text-to-Speech Synthesis. 136-140
- Thibault Bañeras Roux, Mickael Rouvier, Jane Wottawa, Richard Dufour:
A Comprehensive Analysis of Tokenization and Self-Supervised Learning in End-to-End Automatic Speech Recognition Applied on French Language. 141-145
- Jiawen Huang, Emmanouil Benetos:
Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model. 146-150
- Abdul Hannan, Alessio Brutti, Daniele Falavigna:
LDASR: An Experimental Study on Layer Drop Using Conformer-Based Architecture. 151-155
- Takuya Fujimura, Keisuke Imoto, Tomoki Toda:
Discriminative Neighborhood Smoothing for Generative Anomalous Sound Detection. 156-160
- Zied Mnasri, Jhony H. Giraldo, Thierry Bouwmans:
Anomalous Sound Detection For Road Surveillance Based On Graph Signal Processing. 161-165
- Kevin Wilkinghoff, Alessia Cornaggia-Urrigshardt:
Multi-Sample Dynamic Time Warping for Few-Shot Keyword Spotting. 166-170
- Jonas Lindenberger, Christof Pichler, S. Schuster, Oliver Lang, Markus Neumayer, Alexander Haberl, Clemens Staudinger, Bernhard Lehner, Christoph Feilmayr, Hannes Wegleiter, Mario Huemer:
Detection of an Approximately Periodic Sequence of Pulses for Acoustic Condition Monitoring. 171-175
- Sinisa Suzic, Irene Martín-Morató, Nikola Simic, Charitha Raghavaraju, Toni Heittola, Vuk Stanojev, Dragana Bajovic:
UNS Exterior Spatial Sound Events Dataset for Urban Monitoring. 176-180
- Zhaoyi Liu, Álvaro López-Chilet, Dading Chong, Sam Michiels, Jon Ander Gómez, Friedrich Wolf-Monheim, David Newton, Danny Hughes:
SRAD-CLF: Squeak and Rattle Anomaly Detection via Contrastive Learning Framework on Real Industrial Noise Recordings. 181-185
- Subhajit Saha, Md. Sahidullah, Swagatam Das:
Exploring Green AI for Audio Deepfake Detection. 186-190
- Shun Sawada:
Symbolic-Domain Musical Instrument Classification Using Knowledge Distillation from Audio-Teacher to Symbolic-Student. 191-195
- Hafsa Ouajdi, Oussama Hadder, Modan Tailleur, Mathieu Lagrange, Laurie M. Heller:
Detection of Deepfake Environmental Audio. 196-200
- Paul M. Baggenstoss:
Projected Belief Networks with Discriminative Alignment for Classifying Marine Mammals. 201-205
- Willem Alexander Klatt, Michael Bürger, Rainer Martin, Henning Puder:
Filter Synthesis for Robust Feedback Active Noise Control Using Non-Uniformly Discretized Fourier Spectra. 206-210
- Ryota Noguchi, Yosuke Sugiura, Tetsuya Shimamura:
Frequency-domain Feedback Active Noise Control using Weighted LMS Algorithm. 211-215
- Alireza Nezamdoust, Michele Scarpiniti, Aurelio Uncini, Danilo Comminiello:
Spline Adaptive Exponential Functional Link Filter for Nonlinear Acoustic Echo Cancellation. 216-220
- Kai Xie, Ziye Yang, Jie Chen, Junjie Li:
Attention-Based Dual Stream Interactive Network for Nonlinear Residual Echo Suppression. 221-225
- Zhimin Qiu, Hongsen He, Jingdong Chen, Jacob Benesty, Yi Yu:
Normalized Multichannel Frequency-Domain LMS Filter With Nearest Kronecker Product Decomposition for Blind Identification of Low-Rank Acoustic Systems. 226-230
- Amos Schreibman, Elior Hadad, Boris Rubenchik, Moshe Tzur, Eli Tzirkel-Hancock:
Single-Channel Speech Restoration Using Deep Speech Features Reconstruction. 231-235
- Satoru Emura:
Estimation of Output SI-SDR Solely from Enhanced Speech Signals in Diffusion-Based Generative Speech Enhancement Method. 236-240
- Wang Dai, Xiaofei Li, Archontis Politis, Tuomas Virtanen:
Reference Channel Selection by Multi-Channel Masking for End-to-End Multi-Channel Speech Enhancement. 241-245
- Yuancheng Luo:
Constant Directivity Loudspeaker Beamforming. 246-250
- Bilgesu Çakmak, Thomas Dietzen, Randall Ali, Patrick A. Naylor, Toon van Waterschoot:
Microphone Pair Selection for Sound Source Localization in Massive Arrays of Spatially Distributed Microphones. 251-255
- Jiachen Wang, Tomoki Toda:
Unsupervised Training of Neural Network-Based Virtual Microphone Estimator. 256-260
- Jinzheng Zhao, Xinyuan Qian, Yong Xu, Haohe Liu, Yin Cao, Davide Berghi, Wenwu Wang:
Text-Queried Target Sound Event Localization. 261-265
- Tommaso Gambini, Davide Albertini, Alberto Bernardini:
Sound-Intensity-Based Direction of Arrival Estimation Using Centro-Symmetric Sensor Arrays. 266-270
- Thiago Lobato, Roland Sottek:
A Process for Calibrating HRTFs Based on Differentiable Implicit Representations and Domain Adversarial Learning. 271-275
- David Sundström, Anton Björkman, Andreas Jakobsson, Filip Elvander:
Room Impulse Response Estimation Using Optimal Transport: Simulation-Informed Inference. 276-280
- Manish Kumar, Lachlan Birnie, Thushara D. Abhayapala, Sandra Arcos Holzinger, Amy Bastine, Daniel Grixti-Cheng, Prasanga N. Samarasinghe:
Speech Denoising in Multi-Noise Source Environments Using Multiple Microphone Devices Via Relative Transfer Matrix. 281-285
- Daniel Aleksander Krause, Archontis Politis, Annamaria Mesaros:
Sound Event Detection and Localization with Distance Estimation. 286-290
- Jesper Brunnström, Marc Moonen, Filip Elvander:
Robust Signal and Noise Covariance Matrix Estimation Using Riemannian Optimization. 291-295
- Andrea Napoli, Paul R. White:
Diversity-Based Sampling for Imbalanced Domain Adaptation. 296-300
- Thomas Muller, Stéphane Ragot, Pierrick Philippe, Pascal Scalart:
Post-Training Latent Dimension Reduction in Neural Audio Coding. 301-305
- Ahmed Adel Attia, Yashish M. Siriwardena, Carol Y. Espy-Wilson:
Improving Speech Inversion Through Self-Supervised Embeddings and Enhanced Tract Variables. 306-310
- Benoît Giniès, Xiaoyu Bie, Olivier Fercoq, Gaël Richard:
Using Random Codebooks for Audio Neural AutoEncoders. 311-315
- Luca Comanducci, Fabio Antonacci, Augusto Sarti:
Interpreting End-to-End Deep Learning Models for Speech Source Localization Using Layer-Wise Relevance Propagation. 316-320
- Paul Primus, Gerhard Widmer:
Fusing Audio and Metadata Embeddings Improves Language-Based Audio Retrieval. 321-325
- Jiahao Ji, Lixian Zhu, Haojie Zhang, Kun Qian, Kele Xu, Zikai Song, Bin Hu, Björn W. Schuller, Yoshiharu Yamamoto:
Weight Light, Hear Right: Heart Sound Classification with a Low-Complexity Model. 326-330
- Philipp Wagner, Andreas Triantafyllopoulos, Alexander Gebhard, Björn W. Schuller:
Audio-Based Step-Count Estimation for Running - Windowing and Neural Network Baselines. 331-335
- Leila Ben Letaifa, Jean-Luc Rouas:
Towards Green AI: Assessing the Robustness of Conformer and Transformer Models Under Compression. 336-340
- Mohammad Bokaei, Jesper Jensen, Simon Doclo, Jan Østergaard:
Deep Digital Joint Source-Channel Based Wireless Speech Transmission. 341-345
- Marcella Astrid, Enjie Ghorbel, Djamila Aouada:
Targeted Augmented Data for Audio Deepfake Detection. 346-350
- Tornike Karchkhadze, Hassan Salami Kavaki, Mohammad Rasool Izadi, Bryce Irvin, Mikolaj Kegler, Ari Hertz, Shuo Zhang, Marko Stamenovic:
Latent CLAP Loss for Better Foley Sound Synthesis. 351-355
- Behrad Taghibeyglou, Alexander Chow, Parker Mclaurin, Oviga Yasokaran, Rene Adams, Majida Mohammed, Mandeep Singh, Najib Ayas, Sachin R. Pendharkar, Fernanda R. Almeida, Valeria Rac, Azadeh Yadollahi:
Accessible Obstructive Sleep Apnea Screening Using Classical Acoustic Speech Representations. 356-360
- Meishu Song, Ilhan Aslan, Emilia Parada-Cabaleiro, Zijiang Yang, Elisabeth André, Yoshiharu Yamamoto, Björn W. Schuller:
Lecture Video Highlights Detection from Speech. 361-365
- Théo Nguyen, Yann Teytaut, Axel Roebel:
On Strategies to Exploit Dependencies Between Singing Voice Alignment and Separation. 366-370
- Gauri Deshpande, Björn W. Schuller:
Analysis of Respiratory Health Indicators in Speech-Breathing-Patterns. 371-375
- Harish Battula, Gauri Deshpande, Sachin Patel, Björn W. Schuller:
Heart Rate from Read-Speech Influenced by Physical Exercise. 376-380
- Yuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko, Noboru Harada:
Learning to Assess Subjective Impressions from Speech. 381-385
- Eirini Sisamaki, Vassilis Tsiaras, Yannis Stylianou:
Memory Efficient Neural Speech Synthesis Based on FastSpeech2 Using Attention Free Transformer. 386-390
- Sungwoong Hwang, Changhwan Kim:
LNACont: Language-Normalized Affine Coupling Layer with Contrastive Learning for Cross-Lingual Multi-Speaker Text-to-Speech. 391-395
- Antonio Jesús Muñoz-Montoro, Marco Olivieri, Mirco Pezzoli, Julio J. Carabias-Orti, Fabio Antonacci, Augusto Sarti:
Ray-Space Constrained Multichannel Nonnegative Matrix Factorization for Audio Source Separation. 396-400
- Shrishail Baligar, Mikolaj Kegler, Bryce Irvin, Marko Stamenovic, Shawn D. Newsam:
CATSE: A Context-Aware Framework for Causal Target Sound Extraction. 401-405
- Wojciech Czaja, Canran Ji, Shashank Sule, Matthias Wellershoff:
Neural Network-Based Speech Reconstruction from Undersampled STFT Magnitude Data. 406-410
- Christos Garoufis, Athanasia Zlatintsi, Petros Maragos:
Pre-training Music Classification Models via Music Source Separation. 411-415
- Xinyu Liang, Fredrik Cumlin, Victor Ungureanu, Chandan K. A. Reddy, Christian Schüldt, Saikat Chatterjee:
DeePMOS-$\mathcal{B}$: Deep Posterior Mean-Opinion-Score Using Beta Distribution. 416-420
- Robert Sutherland, George Close, Thomas Hain, Stefan Goetze, Jon Barker:
Using Speech Foundational Models in Loss Functions for Hearing Aid Speech Enhancement. 421-425
- Kuan-Chen Wang, You-Jin Li, Wei-Lun Chen, Yu-Wen Chen, Yi-Ching Wang, Ping-Cheng Yeh, Chao Zhang, Yu Tsao:
Bridging the Gap: Integrating Pre-Trained Speech Enhancement and Recognition Models for Robust Speech Recognition. 426-430
- Negar Riazifar, Nigel G. Stocks:
Alias-Free Level Crossing Sampling. 431-435
- Yuying Xie, Michael Kuhlmann, Frederik Rautenberg, Zheng-Hua Tan, Reinhold Haeb-Umbach:
Speaker and Style Disentanglement of Speech Based on Contrastive Predictive Coding Supported Factorized Variational Autoencoder. 436-440
- Francesca Ronchini, Luca Comanducci, Mirco Pezzoli, Fabio Antonacci, Augusto Sarti:
Room Transfer Function Reconstruction Using Complex-valued Neural Networks and Irregularly Distributed Microphones. 441-445
- Nicolas Cherel, Andrés Almansa, Yann Gousseau, Alasdair Newson:
Diffusion-Based Image Inpainting with Internal Learning. 446-450
- Zeynep Ovgu Yayci, Mehmet Türkan:
Sparse Features for Multi-Exposure Fusion. 451-455
- Takuro Matsui, Masaaki Ikehara:
Edge-Guided Low-Light Image Enhancement Based on GAN with Effective Modules. 456-460
- Matthieu Muller, Daniele Picone, Mauro Dalla Mura, Magnus O. Ulfarsson:
Pattern-Invariant Unrolling for Robust Demosaicking. 461-465
- André Ricardo Backes:
Texture Classification Using Features from Multi-level Local Binary Patterns. 466-470
- Sabrina Cynthia Triess, Timo Leitritz, Christian Jauch:
Exploring AI-Based Anonymization of Industrial Image and Video Data in the Context of Feature Preservation. 471-475
- Hubert Leterme, Kévin Polisano, Valérie Perrier, Karteek Alahari:
From CNNs to Shift-Invariant Twin Models Based on Complex Wavelets. 476-480
- Kota Mochida, Teppei Nakano, Shinya Fujie, Mari Wakabayashi, Tomomi Sato, Tetsuji Ogawa:
Exploring Robust and Explainable Design for Facial Expression-Based Emotional State Estimation in Children with Profound Intellectual Multiple Disabilities. 481-485
- Philippe Massicotte, Louis-Alexandre Leclaire, Mohamed Gaha, Guillaume Houle, Christian Buteau:
Automated Inventory of Electrical Distribution Assets Based on Image Recognition and Ground LiDAR. 486-490
- Simon Schabat, Bruno Colicchio, Jean-Baptiste Courbot, Alain Dieterlen, Radhia M'Kacher:
Chromosome localisation and segmentation in fluorescence microscopy images. 491-495
- Sergio Urrea, Pablo Gomez, Karen Fonseca, Hans Garcia, Henry Arguello:
Mismatch Correction for End-to-End Designed Phase-Encoded-Based Spectral Imaging System. 496-500
- Romario Gualdrón-Hurtado, Roman Jacome, Sergio Urrea, Henry Arguello, Luis Gonzalez:
Learning Point Spread Function Invertibility Assessment for Image Deconvolution. 501-505
- Suparna Rooj, Patitapaban Palo, Rahul Mahadik, Aurobinda Routray, Manas K. Mandal:
Improving Thermal Facial Emotion Recognition in Subject-Independent Scenarios Through Multigraph Convolutional Networks. 506-510
- Emmanuel Martinez, Roman Jacome, Andrés Jerez, Henry Arguello:
RGB-Guided Spectral Image Generation through GANs with Implicit Spectral Learning. 511-515
- Adam Wieckowski, Christian Stoffers, Heiko Schwarz, Benjamin Bross, Detlev Marpe:
Fast Dependent Quantization Using Trellis Pruning, Forward Context Adaptation and Vectorization. 516-520
- Tero Partanen, Miika Kotajärvi, Alexandre Mercat, Jarno Vanne:
Motion-Vector-Driven Lightweight ROI Tracking for Real-Time Saliency-Guided Video Encoding. 521-525
- Mohammad Ghasempour, Yiying Wei, Hadi Amirpour, Christian Timmerer:
Content-Aware Reference Frame Synthesis for Enhanced Inter Prediction. 526-530
- Amar Tious, Toinon Vigier, Vincent Ricordel:
Impact of Point Cloud Normals Compression on Objective Quality Assessment. 531-535
- Ryoichi Takashima, Fumiya Nakamura, Ryo Aihara, Tetsuya Takiguchi, Yusuke Itani:
Generation of Colored Subtitle Images Based on Emotional Information of Speech Utterances. 536-540
- André Ricardo Backes, Mostafa Khojastehnazhand:
Iranian Wheat Varieties Classification by Using a Fusion of Texture Features. 541-545
- El-Hadji Samba Diop, Abdel-Ouahab Boudraa, Ndéye N. Gueye:
2D Teager-Kaiser Analysis on Gaussian Noise. 546-550
- Sinian Li, Doruk Barokas Profeta, Justin Dauwels:
MoReSo: A DNN Framework Expediting Content-Based Video Image Retrieval (CBVIR). 551-555
- Shun Wang, Yolanda Vidal, Francesc Pozo:
Two-Dimensional Color Distribution Entropy: Validation and Application on Non-Contact Fault Diagnosis for Induction Motor. 556-560
- Ali Abbasi Boroujeni, Rahil Mahdian Toroghi, Hassan Zareian:
Object-Aware Anchor-Free Video Object Tracking using Attention Mechanism and Target Dynamics. 561-565
- Nils Defauw, Marielle Malfante, Olivier Antoni, Tiana A. Rakotovao, Suzanne Lesecq:
Efficient Binary Segmentation Through Dense Neural Networks in a Truncated Frequency Domain. 566-570
- Piyush Mishra, Philippe Roudot:
Comparative Study of Transformer Robustness for Multiple Particle Tracking Without Clutter. 571-575
- Baptiste Wagner, Denis Pellerin, Sylvain Huet:
Forgetting Analysis by Module Probing for Online Object Detection with Faster R-CNN. 576-580
- Gökhan Güney, Maurice Rohr, Sebastian Dill, Christoph Hoog Antink:
On the Usability of Structural Similarity for Action Unit Intensity Detection. 581-585
- Elisha Dayag, Kevin Bui, Fredrick Park, Jack Xin:
An Image Segmentation Model with Transformed Total Variation. 586-590
- Andrzej Sluzek:
Incremental Image Decolorization with Randomizing Factors. 591-595
- Meishu Song, Xin Jing, Emilia Parada-Cabaleiro, Zijiang Yang, Yoshiharu Yamamoto, Björn W. Schuller:
Temporal Oriented ResNet for Gaming Dimensional Emotion Prediction. 596-600
- Ban-Sok Shin, Luis Wientgens, Dmitriy Shutin:
Imaging on the Edge: GPU-Accelerated Traveltime Tomography on a Jetson Nano. 601-605
- Ryugo Morita, Hitoshi Nishimura, Ko Watanabe, Andreas Dengel, Jinjia Zhou:
Edge-Based Denoising Image Compression. 606-610
- Daniil Konstantinov, Sergey Lavrushkin, Dmitriy Vatolin:
Image Robustness to Adversarial Attacks on No-Reference Image-Quality Metrics. 611-615
- Liyun Gong, Miao Yu, Gautam Siddharth Kashyap, Sheldon McCall, Mamatha Thota, Saeid Pourroostaei Ardakani:
Innovate Spatial-Temporal Attention Network (STAN) for Accurate 3D Mice Pose Estimation with a Single Monocular RGB Camera. 616-620