ICASSP 2015: South Brisbane, Queensland, Australia
- 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2015, South Brisbane, Queensland, Australia, April 19-24, 2015. IEEE 2015, ISBN 978-1-4673-6997-8
AASP-L1: Microphone Array Source Localization
- Vinod V. Reddy, Andy W. H. Khong: Direction-of-arrival estimation of speech sources under aliasing conditions. 1-5
- Archontis Politis, Symeon Delikaris-Manias, Ville Pulkki: Direction-of-arrival and diffuseness estimation above spatial aliasing for symmetrical directional microphone arrays. 6-10
- Jesper Rindom Jensen, Jesper Kjær Nielsen, Mads Græsbøll Christensen, Søren Holdt Jensen: On frequency domain models for TDOA estimation. 11-15
- Mojtaba Farmani, Michael Syskind Pedersen, Zheng-Hua Tan, Jesper Jensen: Maximum likelihood approach to "informed" Sound Source Localization for Hearing Aid applications. 16-20
- Jes Thyssen, Ashutosh Pandey, Bengt Jonas Borgstrom: A novel Time-Delay-of-Arrival estimation technique for multi-microphone audio processing. 21-25
- Arun Parthasarathy, Saurabh Kataria, Lalan Kumar, Rajesh M. Hegde: Representation and modeling of spherical harmonics manifold for source localization. 26-30
AASP-L2: Reverberant Signal Analysis and Decomposition for Audio and Speech Processing
- Clement S. J. Doire, Mike Brookes, Patrick A. Naylor, Dave Betts, Christopher M. Hicks, Mohammad A. Dmour, Søren Holdt Jensen: Single-channel blind estimation of reverberation parameters. 31-35
- Christian Uhle, Emanuël A. P. Habets: Direct-ambient decomposition using parametric Wiener filtering with spatial cue control. 36-40
- Felicia Lim, Mark R. P. Thomas, Ivan J. Tashev: Blur kernel estimation approach to blind reverberation time estimation. 41-45
- James Eaton, Alastair H. Moore, Patrick A. Naylor, Jan Skoglund: Direct-to-Reverberant Ratio estimation using a null-steered beamformer. 46-50
- Nikolaos Stefanakis, Athanasios Mouchtaris: Foreground suppression for capturing and reproduction of crowded acoustic environments. 51-55
- Christian Schüldt, Peter Händel: Noise robust integration for blind and non-blind reverberation time estimation. 56-60
AASP-L3: Single-Channel Audio Source Separation
- Yanhui Tu, Jun Du, Li-Rong Dai, Chin-Hui Lee: Speech Separation based on signal-noise-dependent deep neural networks for robust speech recognition. 61-65
- Jonathan Le Roux, John R. Hershey, Felix Weninger: Deep NMF for speech separation. 66-70
- François G. Germain, Gautham J. Mysore: Speaker and noise independent online single-channel speech enhancement. 71-75
- Antoine Liutkus, Derry Fitzgerald, Zafar Rafii: Scalable audio separation with light Kernel Additive Modelling. 76-80
- Paul Magron, Roland Badeau, Bertrand David: Phase recovery in NMF for audio source separation: An insightful benchmark. 81-85
- Hirokazu Kameoka: Multi-resolution signal decomposition with time-domain spectrogram factorization. 86-90
AASP-L4: Multichannel Denoising and Dereverberation
- Adam Kuklasinski, Simon Doclo, Timo Gerkmann, Søren Holdt Jensen, Jesper Jensen: Multi-channel PSD estimators for speech dereverberation - A theoretical and experimental comparison. 91-95
- Ante Jukic, Nasser Mohammadiha, Toon van Waterschoot, Timo Gerkmann, Simon Doclo: Multi-channel linear prediction-based speech dereverberation with low-rank power spectrogram approximation. 96-100
- Masahito Togami: Variational Bayes state space model for acoustic echo reduction and dereverberation. 101-105
- Ofer Schwartz, Sharon Gannot, Emanuël A. P. Habets: Nested generalized sidelobe canceller for joint dereverberation and noise reduction. 106-110
- Rainer Martin, Masoumeh Azarpour, Gerald Enzner: Binaural speech enhancement with instantaneous coherence smoothing using the cepstral correlation coefficient. 111-115
- Shoko Araki, Tomoki Hayashi, Marc Delcroix, Masakiyo Fujimoto, Kazuya Takeda, Tomohiro Nakatani: Exploring multi-channel features for denoising-autoencoder-based speech enhancement. 116-120
AASP-L5: Music Information Extraction: Singing Voice and Music Structure
- Simon Leglaive, Romain Hennequin, Roland Badeau: Singing voice detection with deep recurrent neural networks. 121-125
- Jonathan Driedger, Meinard Müller: Extracting singing voice from music recordings by cascading audio decomposition techniques. 126-130
- Xiu Zhang, Wei Li, Bilei Zhu: Latent time-frequency component analysis: A novel pitch-based approach for singing voice separation. 131-135
- Prateek Verma, Vinutha T. P., Parthe Pandit, Preeti Rao: Structural segmentation of Hindustani concert audio with posterior features. 136-140
- Andre Holzapfel, Umut Simsekli, Sertan Sentürk, Ali Taylan Cemgil: Section-level modeling of musical audio for linking performances to scores in Turkish makam music. 141-145
- Nanzhu Jiang, Meinard Müller: Estimating double thumbnails for music recordings. 146-150
AASP-L6: Acoustic Event Detection and Classification
- Annamaria Mesaros, Toni Heittola, Onur Dikmen, Tuomas Virtanen: Sound event detection in real life recordings using coupled matrix factorization of spectral representations and class activity annotations. 151-155
- Keisuke Imoto, Nobutaka Ono: Acoustic scene analysis from acoustic event sequence with intermittent missing event. 156-160
- Mahesh Kumar Nandwana, Ali Ziaei, John H. L. Hansen: Robust unsupervised detection of human screams in noisy acoustic environments. 161-165
- Xueyuan Zhang, Qianhua He, Xiaohui Feng: Acoustic feature extraction by tensor-based sparse representation for sound effects classification. 166-170
- Justin Salamon, Juan Pablo Bello: Unsupervised feature learning for urban sound classification. 171-175
- Jonathan William Dennis, Tran Huy Dat, Haizhou Li: Combining robust spike coding with spiking neural networks for sound event classification. 176-180
AASP-P1: Music Analysis and Synthesis I, Signal Enhancement I
- Balaji Thoshkahna, Meinard Müller, Venkatesh Kulkarni, Nanzhu Jiang: Novel audio features for capturing tempo salience in music recordings. 181-185
- Camila de Andrade Scatolini, Gaël Richard, Benoit Fuentes: Multipitch estimation using a PLCA-based model: Impact of partial user annotation. 186-190
- H. G. Ranjani, Thippur V. Sreenivas: Multi-instrument detection in polyphonic music using Gaussian Mixture based factorial HMM. 191-195
- Satoshi Maruo, Kazuyoshi Yoshii, Katsutoshi Itoyama, Matthias Mauch, Masataka Goto: A feedback framework for improved chord recognition based on NMF-based approximate note transcription. 196-200
- Jiaolong Yu, Jacob Benesty, Gongping Huang, Jingdong Chen: Optimal single-channel noise reduction filtering matrices from the Pearson correlation coefficient perspective. 201-205
- Gongping Huang, Jingdong Chen, Jacob Benesty: Investigation of a parametric gain approach to single-channel speech enhancement. 206-210
- Marc Aubreville, Stefan Petrausch: Directionality assessment of adaptive binaural beamforming with noise suppression in hearing aids. 211-215
- Andreas Gaich, Pejman Mowlaee: On speech quality estimation of phase-aware single-channel speech enhancement. 216-220
- Mark J. Harvilla, Richard M. Stern: Efficient audio declipping using regularized least squares. 221-225
- Juin-Hwey Chen, Thomas Baker, Evan McCarthy, Jes Thyssen: System architectures and digital signal processing algorithms for enhancing the output audio quality of stereo FM broadcast receivers. 226-230
- Yuan Zeng, Richard C. Hendriks, Nikolay D. Gaubitch: On clock synchronization for multi-microphone speech processing in wireless acoustic sensor networks. 231-235
AASP-P2: Source Separation I, Audio Systems
- Ping-Keng Jao, Yi-Hsuan Yang, Brendt Wohlberg: Informed monaural source separation of music based on convolutional sparse coding. 236-240
- Tom Barker, Tuomas Virtanen, Niels Henrik Pontoppidan: Low-latency sound-source-separation using non-negative matrix factorisation with coupled analysis and synthesis dictionaries. 241-245
- Hui Zhang, Xueliang Zhang, Shuai Nie, Guanglai Gao, Wenju Liu: A pairwise algorithm for pitch estimation and speech separation using deep stacking network. 246-250
- Mahmoud Fakhry, Piergiorgio Svaizer, Maurizio Omologo: Audio source separation using a redundant library of source spectral bases for non-negative tensor factorization. 251-255
- Dalia El Badawy, Alexey Ozerov, Ngoc Q. K. Duong: Relative group sparsity for non-negative matrix factorization with application to on-the-fly audio source separation. 256-260
- Xin Guo, Stefan Uhlich, Yuki Mitsufuji: NMF-based blind source separation using a linear predictive coding error clustering criterion. 261-265
- Antoine Liutkus, Roland Badeau: Generalized Wiener filtering with fractional power spectrograms. 266-270
- Zafar Rafii, Antoine Liutkus, Bryan Pardo: A simple user interface system for recovering patterns repeating in time and frequency in mixtures of sounds. 271-275
- Daichi Kitamura, Nobutaka Ono, Hiroshi Sawada, Hirokazu Kameoka, Hiroshi Saruwatari: Efficient multichannel nonnegative matrix factorization exploiting rank-1 spatial model. 276-280
- Hirofumi Nakajima, Naoto Sakata, Kihiro Hashino: Non-linear distortion reduction for a loudspeaker based on recursive source equalization. 281-285
- Wataru Owaki, Kota Takahashi: Novel sound mixing method for voice and background music. 290-294
AASP-P3: Microphone Array Processing I, Fingerprinting, Watermarking
- Liheng Zhao, Jacob Benesty, Jingdong Chen: Optimal design of directivity patterns for endfire linear microphone arrays. 295-299
- Meng Guo, Jan Mark de Haan, Jesper Jensen: A simple modification to facilitate robust generalized sidelobe canceller for hearing aids. 300-304
- Xiaoguang Wu, Huawei Chen: On directivity factor of the first-order steerable differential microphone array. 305-309
- Satoru Emura: ℓ1-constrained MVDR-based selection of nonidentical directivities in microphone array. 310-314
- Ina Kodrasi, Daniel Marquardt, Simon Doclo: Curvature-based optimization of the trade-off parameter in the speech distortion weighted multichannel Wiener filter. 315-319
- Xiaofei Li, Laurent Girin, Radu Horaud, Sharon Gannot: Estimation of relative transfer function in the presence of stationary noise based on segmental power spectral density matrix subtraction. 320-324
- Weiqiao Zheng, Yue Xian Zou, Christian H. Ritz: Spectral mask estimation using deep neural networks for inter-sensor data ratio model based robust DOA estimation. 325-329
- Thibault Nowakowski, Laurent Daudet, Julien de Rosny: Microphone array position calibration in the frequency domain using a single unknown source. 330-334
- Lucas Ondel, Xavier Anguera, Jordi Luque: MASK+: Data-driven regions selection for acoustic fingerprinting. 335-339
- T. J. Tsai, Gerald Friedland, Xavier Anguera: An information-theoretic metric of fingerprint effectiveness. 340-344
- Nhut Minh Ngo, Masashi Unoki: Robust and reliable audio watermarking based on phase coding. 345-349
- Jianjun He, Woon-Seng Gan: Multi-shift principal component analysis based primary component extraction for spatial audio reproduction. 350-354
AASP-P4: Signal Enhancement II, Audio Coding
- Antoine Deleforge, Walter Kellermann: Phase-optimized K-SVD for signal extraction from underdetermined multichannel sparse mixtures. 355-359
- Sebastian Braun, Konrad Kowalczyk, Emanuël A. P. Habets: Residual noise control using a parametric multichannel Wiener filter. 360-364
- Martin Krawczyk-Becker, Dörte Fischer, Timo Gerkmann: Utilizing spectro-temporal correlations for an improved speech presence probability based noise power estimation. 365-369
- Chung-Chien Hsu, Kah-Meng Cheong, Jen-Tzung Chien, Tai-Shih Chi: Modulation Wiener filter for improving speech intelligibility. 370-374
- Robert Rehr, Timo Gerkmann: Cepstral noise subtraction for robust automatic speech recognition. 375-378
- Simon J. Godsill, Herbert Buchner, Jan Skoglund: Detection and suppression of keyboard transient noise in audio streams with auxiliary keybed microphone. 379-383
- Vinay Melkote, Malcolm Law, Rhonda Wilson: Hierarchical and Lossless Coding of audio objects in Dolby TrueHD. 384-388
- Christian R. Helmrich, Andreas Niedermeier, Sascha Disch, Florin Ghido: Spectral envelope reconstruction via IGF for audio transform coding. 389-393
- Niklas Koep, Magnus Schaefer, Peter Vary: Noise-shaping for closed-loop Multi-Channel Linear Prediction. 394-398
- Stephan Preihs, Jörn Ostermann: Globally optimized dynamic bit-allocation strategy for subband ADPCM-based low delay audio coding. 399-403
- Gerald Schuller, Jakob Abeßer, Christian Kehling: Parameter extraction for bass guitar sound models including playing styles. 404-408
AASP-P5: Music Information Retrieval I, Source Localization and Counting
- Simon Durand, Juan Pablo Bello, Bertrand David, Gaël Richard: Downbeat tracking with multiple features and deep neural networks. 409-413
- Axel Roebel, Jordi Pons, Marco Liuni, Mathieu Lagrange: On automatic drum transcription using non-negative matrix deconvolution and Itakura-Saito divergence. 414-418
- Mi Tian, György Fazekas, Dawn A. A. Black, Mark B. Sandler: On the use of the tempogram to describe audio content and its application to Music structural segmentation. 419-423
- Thomas Fillon, Cyril Joder, Simon Durand, Slim Essid: A Conditional Random Field system for beat tracking. 424-428
- Andrea Cogliati, Zhiyao Duan: Piano music transcription modeling note temporal evolution. 429-433
- Qin Yan, Cong Ding, Jingjing Yin, Yong Lv: Improving music auto-tagging with trigger-based context model. 434-438
- Mojtaba Farmani, Michael Syskind Pedersen, Zheng-Hua Tan, Jesper Jensen: On the influence of microphone array geometry on HRTF-based Sound Source Localization. 439-443
- Kai Wu, V. G. Reju, Andy W. H. Khong: Multi-source direction-of-arrival estimation in a reverberant environment using single acoustic vector sensor. 444-448
- Angelo M. C. R. Borzino, José Antonio Apolinário, Marcello L. R. de Campos: Robust DOA estimation of heavily noisy gunshot signals. 449-453
- Jesper Rindom Jensen, Mads Græsbøll Christensen: A joint audio-visual approach to audio localization. 454-458
- Oliver Walter, Lukas Drude, Reinhold Haeb-Umbach: Source counting in speech mixtures by nonparametric Bayesian estimation of an infinite Gaussian mixture model. 459-463
AASP-P6: Source Separation II, Spatial Audio I
- Keisuke Kinoshita, Tomohiro Nakatani: Modeling inter-node acoustic dependencies with Restricted Boltzmann Machine for distributed microphone array based BSS. 464-468
- Ana Ramírez López, Nobutaka Ono, Ulpu Remes, Kalle J. Palomäki, Mikko Kurimo: Designing multichannel source separation based on single-channel source separation. 469-473
- Waqas Rafique, Syed Mohsen Naqvi, Philip J. B. Jackson, Jonathon A. Chambers: IVA algorithms using a multivariate Student's t source prior for speech source separation in real room environments. 474-478
- Minje Kim, Paris Smaragdis, Gautham J. Mysore: Efficient manifold preserving audio source separation using locality sensitive hashing. 479-483
- Nathan Souviraà-Labastie, Emmanuel Vincent, Frédéric Bimbot: Music separation guided by cover tracks: Designing the joint NMF model. 484-488
- Il-Young Jeong, Kyogu Lee: Informed source separation from monaural music with limited binary time-frequency annotation. 489-493
- Yuki Murota, Daichi Kitamura, Shoichi Koyama, Hiroshi Saruwatari, Satoshi Nakamura: Statistical modeling of binaural signal and its application to binaural source separation. 494-498
- Hannes Gamper, Mark R. P. Thomas, Ivan J. Tashev: Estimation of multipath propagation delays and interaural time differences from 3-D head scans. 499-503
- Kohei Yatabe, Yasuhiro Oikawa: Optically visualized sound field reconstruction based on sparse selection of point sound sources. 504-508
- Nachanant Chitanont, Keita Yaginuma, Kohei Yatabe, Yasuhiro Oikawa: Visualization of sound field by means of Schlieren method with spatio-temporal filtering. 509-513
- Luca Remaggi, Philip J. B. Jackson, Wenwu Wang, Jonathon A. Chambers: A 3D model for room boundary estimation. 514-518
- Hanieh Khalilian, Ivan V. Bajic, Rodney G. Vaughan: Joint optimization of loudspeaker placement and radiation patterns for Sound Field Reproduction. 519-523
AASP-P7: Microphone Array Processing II, Audio Content Analysis
- Akihiko Sugiyama, Ryoji Miyahara: A directional noise suppressor with a specified beamwidth. 524-528
- Craig A. Anderson, Stefan Meier, Walter Kellermann, Paul D. Teal, Mark A. Poletti: Trinicon-BSS system incorporating robust dual beamformers for noise reduction. 529-533
- Kenta Niwa, Tatsuya Kako, Kazunori Kobayashi: Microphone array for increasing mutual information between sound sources and observation signals. 534-538
- Maja Taseska, Emanuël A. P. Habets: Minimum Bayes risk signal detection for speech enhancement based on a narrowband DOA model. 539-543
- Shmulik Markovich Golan, Sharon Gannot: Performance analysis of the covariance subtraction method for relative transfer function estimation and comparison to the covariance whitening method. 544-548
- Jingjing Yu, Kevin D. Donohue: Optimization for randomly described arrays based on geometry descriptors. 549-553
- Robin Scheibler, Ivan Dokmanic, Martin Vetterli: Raking echoes in the time domain. 554-558
- Haomin Zhang, Ian McLoughlin, Yan Song: Robust sound event recognition using convolutional neural networks. 559-563
- Olfa Fraj, Raja Ghozi, Meriem Jaïdane-Saïdane: Temporal entropy-based texturedness indicator for audio signals. 564-568
AASP-P8: Music Analysis and Synthesis II, Echo Control
- Sebastian Ewert, Mark D. Plumbley, Mark B. Sandler: A dynamic programming variant of non-negative matrix deconvolution for the transcription of struck string instruments. 569-573
- Yukara Ikemiya, Kazuyoshi Yoshii, Katsutoshi Itoyama: Singing voice analysis and editing based on mutually dependent F0 estimation and source separation. 574-578