ICASSP 2017: New Orleans, LA, USA
- 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2017, New Orleans, LA, USA, March 5-9, 2017. IEEE 2017, ISBN 978-1-5090-4117-6
- Gilles Puy, Alexey Ozerov, Ngoc Q. K. Duong, Patrick Pérez:
Informed source separation via compressive graph signal sampling. 1-5
- Sanjeel Parekh, Slim Essid, Alexey Ozerov, Ngoc Q. K. Duong, Patrick Pérez, Gaël Richard:
Motion informed audio source separation. 6-10
- Keiichi Osako, Yuki Mitsufuji, Rita Singh, Bhiksha Raj:
Supervised monaural source separation based on autoencoders. 11-15
- Dionyssos Kounades-Bastian, Laurent Girin, Xavier Alameda-Pineda, Sharon Gannot, Radu Horaud:
An EM algorithm for joint source separation and diarisation of multichannel convolutive speech mixtures. 16-20
- Yoshiki Mitsui, Daichi Kitamura, Shinnosuke Takamichi, Nobutaka Ono, Hiroshi Saruwatari:
Blind source separation based on independent low-rank matrix analysis with sparse regularization for time-series activity. 21-25
- Simon Leglaive, Roland Badeau, Gaël Richard:
Multichannel audio source separation: Variational inference of time-frequency sources from time-domain observations. 26-30
- Victor Bisot, Slim Essid, Gaël Richard:
Overlapping sound event detection with supervised Nonnegative Matrix Factorization. 31-35
- Romain Serizel, Victor Bisot, Slim Essid, Gaël Richard:
Supervised group nonnegative matrix factorisation with similarity constraints and applications to speaker identification. 36-40
- Elio Quinton, Ken O'Hanlon, Simon Dixon, Mark B. Sandler:
Tracking metrical structure changes with sparse-NMF. 41-45
- Clement Laroche, Hélène Papadopoulos, Matthieu Kowalski, Gaël Richard:
Drum extraction in single channel audio signals using multi-layer Non negative Matrix Factor Deconvolution. 46-50
- Delia Fano Yela, Sebastian Ewert, Derry FitzGerald, Mark B. Sandler:
Interference reduction in music recordings combining Kernel Additive Modelling and Non-Negative Matrix Factorization. 51-55
- Hirokazu Kameoka, Hideaki Kagami, Masahiro Yukawa:
Complex NMF with the generalized Kullback-Leibler divergence. 56-60
- Yi Luo, Zhuo Chen, John R. Hershey, Jonathan Le Roux, Nima Mesgarani:
Deep clustering and conventional networks for music separation: Stronger together. 61-65
- Lukas Pfeifenberger, Matthias Zöhrer, Franz Pernkopf:
DNN-based speech mask estimation for eigenvector beamforming. 66-70
- Zhong-Qiu Wang, DeLiang Wang:
Recurrent deep stacking networks for supervised speech separation. 71-75
- Minje Kim:
Collaborative Deep Learning for speech enhancement: A run-time model selection method using autoencoders. 76-80
- Yuma Koizumi, Kenta Niwa, Yusuke Hioka, Kazunori Kobayashi, Yoichi Haneda:
DNN-based source enhancement self-optimized by reinforcement learning using sound quality measurements. 81-85
- Paris Smaragdis, Shrikant Venkataramani:
A neural network alternative to non-negative audio models. 86-90
- Takuma Okamoto:
Analytical approach to 2.5D sound field control using a circular double-layer array of fixed-directivity loudspeakers. 91-95
- Christoph Hohnerlein, Jens Ahrens:
Perceptual evaluation of a multiband acoustic crosstalk canceler using a linear loudspeaker array. 96-100
- Satoru Emura:
Sound field estimation using two spherical microphone arrays. 101-105
- Youssef El Baba, Andreas Walther, Emanuël A. P. Habets:
Time of arrival disambiguation using the linear Radon transform. 106-110
- Natsuki Ueno, Shoichi Koyama, Hiroshi Saruwatari:
Listening-area-informed sound field reproduction based on circular harmonic expansion. 111-115
- Wen Zhang, Christian Hofmann, Michael Buerger, Thushara D. Abhayapala, Walter Kellermann:
Online secondary path modelling in wave-domain active noise control. 116-120
- Rui Lu, Kailun Wu, Zhiyao Duan, Changshui Zhang:
Deep ranking: Triplet MatchNet for music metric learning. 121-125
- Juncheng Li, Wei Dai, Florian Metze, Shuhui Qu, Samarjit Das:
A comparison of Deep Learning methods for environmental sound detection. 126-130
- Shawn Hershey, Sourish Chaudhuri, Daniel P. W. Ellis, Jort F. Gemmeke, Aren Jansen, R. Channing Moore, Manoj Plakal, Devin Platt, Rif A. Saurous, Bryan Seybold, Malcolm Slaney, Ron J. Weiss, Kevin W. Wilson:
CNN architectures for large-scale audio classification. 131-135
- Huy Phan, Philipp Koch, Lars Hertel, Marco Maaß, Radoslaw Mazur, Alfred Mertins:
CNN-LTE: A class of 1-X pooling convolutional neural networks on label tree embeddings for audio scene classification. 136-140
- Justin Salamon, Juan Pablo Bello, Andrew Farnsworth, Steve Kelling:
Fusing shallow and deep learning for bioacoustic bird species classification. 141-145
- Revathy Narasimhan, Xiaoli Z. Fern, Raviv Raich:
Simultaneous segmentation and classification of bird song using CNN. 146-150
- Vincent Mohammad Tavakoli, Jesper Rindom Jensen, Richard Heusdens, Jacob Benesty, Mads Græsbøll Christensen:
Distributed max-SINR speech enhancement with ad hoc microphone arrays. 151-155
- Jörn Anemüller, Hendrik Kayser:
Multi-channel signal enhancement with speech and noise covariance estimates computed by a probabilistic localization model. 156-160
- Antoine Deleforge, Yann Traonmilin:
Phase unmixing: Multichannel source separation with magnitude constraints. 161-165
- Antonio Canclini, Massimo Varini, Fabio Antonacci, Augusto Sarti:
Dictionary-based Equivalent Source Method for Near-Field Acoustic Holography. 166-170
- Christoph Böddeker, Patrick Hanebrink, Lukas Drude, Jahn Heymann, Reinhold Haeb-Umbach:
Optimizing neural-network supported acoustic beamforming by algorithmic differentiation. 171-175
- Yuji Koyano, Kohei Yatabe, Yasuhiro Oikawa:
Infinite-dimensional SVD for analyzing microphone array. 176-180
- Johanna Devaney, Michael I. Mandel:
An evaluation of score-informed methods for estimating fundamental frequency and power from polyphonic audio. 181-185
- Martin Weiss Hansen, Jesper Rindom Jensen, Mads Græsbøll Christensen:
Estimation of multiple pitches in stereophonic mixtures using a codebook-based approach. 186-190
- Rachel M. Bittner, Avery Wang, Juan Pablo Bello:
Pitch contour tracking in music using Harmonic Locked Loops. 191-195
- Stefan Balke, Christian Dittmar, Jakob Abeßer, Meinard Müller:
Data-driven solo voice enhancement for jazz music retrieval. 196-200
- Richard Vogl, Matthias Dorfer, Peter Knees:
Drum transcription from polyphonic music with recurrent neural networks. 201-205
- Ekaterina A. Krymova, Anil M. Nagathil, Denis Belomestny, Rainer Martin:
Segmentation of music signals based on explained variance ratio for applications in spectral complexity reduction. 206-210
- Linh Thi Thuc Tran, Henning F. Schepker, Simon Doclo, Hai Huyen Dam, Sven Nordholm:
Proportionate NLMS for adaptive feedback control in hearing aids. 211-215
- Masahiro Sunohara, Chiho Haruta, Nobutaka Ono:
Low-latency real-time blind source separation for hearing aids based on time-domain implementation of online independent vector analysis with truncation of non-causal components. 216-220
- Luca Giuliani, Luca Giulio Brayda, Sara Sansalone, Stefania Repetto, Michele Ricchetti:
Evaluation of a complementary hearing aid for spatial sound segregation. 221-225
- Saurabh Kataria, Clément Gaultier, Antoine Deleforge:
Hearing in a shoe-box: Binaural source position and wall absorption estimation using virtually supervised learning. 226-230
- Tobias May, Borys Kowalewski, Michal Fereczkowski, Ewen N. MacDonald:
Assessment of broadband SNR estimation for hearing aid applications. 231-235
- Elior Hadad, Daniel Marquardt, Wenqiang Pu, Sharon Gannot, Simon Doclo, Zhi-Quan Luo, Ivo Merks, Tao Zhang:
Comparison of two binaural beamforming approaches for hearing aids. 236-240
- Dong Yu, Morten Kolbæk, Zheng-Hua Tan, Jesper Jensen:
Permutation invariant training of deep models for speaker-independent multi-talker speech separation. 241-245
- Zhuo Chen, Yi Luo, Nima Mesgarani:
Deep attractor network for single-microphone speaker separation. 246-250
- Keisuke Kinoshita, Marc Delcroix, Atsunori Ogawa, Takuya Higuchi, Tomohiro Nakatani:
Deep mixture density network for statistical model-based feature enhancement. 251-255
- Enea Ceolini, Shih-Chii Liu:
Impact of low-precision deep regression networks on single-channel source separation. 256-260
- Stefan Uhlich, Marcello Porcu, Franck Giron, Michael Enenkl, Thomas Kemp, Naoya Takahashi, Yuki Mitsufuji:
Improving music source separation based on deep neural networks through data augmentation and network blending. 261-265
- Kenta Niwa, Yuma Koizumi, Tomoko Kawase, Kazunori Kobayashi, Yusuke Hioka:
Supervised source enhancement composed of nonnegative auto-encoders and complementarity subtraction. 266-270
- Zhong Meng, Shinji Watanabe, John R. Hershey, Hakan Erdogan:
Deep long short-term memory adaptive beamforming networks for multichannel robust speech recognition. 271-275
- Xueliang Zhang, Zhong-Qiu Wang, DeLiang Wang:
A speech enhancement algorithm by iterating single- and multi-microphone processing and its application to robust ASR. 276-280
- Yuan-Shan Lee, Chien-Yao Wang, Shu-Fan Wang, Jia-Ching Wang, Chung-Hsien Wu:
Fully complex deep neural network for phase-incorporating monaural source separation. 281-285
- Tomohiro Nakatani, Nobutaka Ito, Takuya Higuchi, Shoko Araki, Keisuke Kinoshita:
Integrating DNN-based and spatial clustering-based mask estimation for robust MVDR beamforming. 286-290
- Lufei Gao, Li Su, Yi-Hsuan Yang, Tan Lee:
Polyphonic piano note transcription with non-negative matrix factorization of differential spectrogram. 291-295
- Bilei Zhu, Fuzhang Wu, Ke Li, Yongjian Wu, Feiyue Huang, Yunsheng Wu:
Fusing transcription results from polyphonic and monophonic audio for singing melody transcription in polyphonic music. 296-300
- Luwei Yang, Akira Maezawa, Jordan B. L. Smith, Elaine Chew:
Probabilistic transcription of sung melody using a pitch dynamic model. 301-305
- Ken O'Hanlon, Sebastian Ewert, Johan Pauwels, Mark B. Sandler:
Improved template based chord recognition using the CRP feature. 306-310
- Chih Yi Kuan, Li Su, Yu-Hao Chin, Jia-Ching Wang:
Multi-pitch streaming of interwoven streams. 311-315
- Gilberto Bernardes, Matthew E. P. Davies, Carlos Guedes:
Automatic musical key estimation with adaptive mode bias. 316-320
- Georgina Tryfou, Maurizio Omologo:
A reassigned based singing voice pitch contour extraction method. 321-325
- Dairoku Kawai, Kazumasa Yamamoto, Seiichi Nakagawa:
Lyric recognition in monophonic singing using pitch-dependent DNN. 326-330
- Filip Elvander, Stefan Ingi Adalbjornsson, Johan Karlsson, Andreas Jakobsson:
Using optimal transport for estimating inharmonic pitch signals. 331-335
- Bin Liu, Jianhua Tao, Dawei Zhang, Yibin Zheng:
A novel pitch extraction based on jointly trained deep BLSTM Recurrent Neural Networks with bottleneck features. 336-340
- Henning F. Schepker, Linh Thi Thuc Tran, Sven Nordholm, Simon Doclo:
Null-steering beamformer for acoustic feedback cancellation in a multi-microphone earpiece optimizing the maximum stable gain. 341-345
- Anil M. Nagathil, Jan-Willem Schlattmann, Katrin Neumann, Rainer Martin:
A feature-based linear regression model for predicting perceptual ratings of music by cochlear implant listeners. 346-350
- Prasanga N. Samarasinghe, Thushara D. Abhayapala:
Blind estimation of directional properties of room reverberation using a spherical microphone array. 351-355
- Sahar Hashemgeloogerdi, Mark Bocko:
High precision robust modeling of long room responses using wavelet transform. 356-360
- Kenji Aono, Shantanu Chakrabartty, Toshihiko Yamasaki:
Infrasonic scene fingerprinting for authenticating speaker location. 361-365
- Mario Coutino, Martin Bo Møller, Jesper Kjær Nielsen, Richard Heusdens:
Greedy alternative for room geometry estimation from acoustic echoes: A subspace-based method. 366-370
- Mehrdad Heydarzadeh, Mehrdad Nourani, John Hansen, Shahin Hedayati Kia:
Non-invasive gearbox fault diagnosis using scattering transform of acoustic emission. 371-375
- Tatsuya Komatsu, Reishi Kondo:
Detection of anomaly acoustic scenes based on a temporal dissimilarity model. 376-380
- Hamza A. Javed, Benjamin Cauchi, Simon Doclo, Patrick A. Naylor, Stefan Goetze:
Measuring, modelling and predicting perceived reverberation. 381-385
- Charlotte Sorensen, Angeliki Xenaki, Jesper Bünsow Boldt, Mads Græsbøll Christensen:
Pitch-based non-intrusive objective intelligibility prediction. 386-390
- Sanna Wager, Liang Chen, Minje Kim, Christopher Raphael:
Towards expressive instrument synthesis through smooth frame-by-frame reconstruction: From string to woodwind. 391-395
- Derry FitzGerald, Zafar Rafii, Antoine Liutkus:
User assisted separation of repeating patterns in time and frequency using magnitude projections. 396-400
- Costas Yiallourides, Victoria Manning-Eid, Alastair H. Moore, Patrick A. Naylor:
A dynamic programming approach for automatic stride detection and segmentation in acoustic emission from the knee. 401-405
- Amit Das, Ivan Tashev, Shoaib Mohammed:
Ultrasound based gesture recognition. 406-410
- Shih-Yang Su, Cheng-Kai Chiu, Li Su, Yi-Hsuan Yang:
Automatic conversion of Pop music into chiptunes for 8-bit pixel art. 411-415
- Jesper Kjær Nielsen, Tobias Lindstrøm Jensen, Jesper Rindom Jensen, Mads Græsbøll Christensen, Søren Holdt Jensen:
Fast harmonic chirp summation. 416-420
- Wei Dai, Chia Dai, Shuhui Qu, Juncheng Li, Samarjit Das:
Very deep convolutional neural networks for raw waveforms. 421-425
- Feng-Xiang Ge, Ying Chen, Weichang Li:
Target detection and tracking via structured convex optimization. 426-430
- Mattia Verasani, Alberto Bernardini, Augusto Sarti:
Modeling Sallen-Key audio filters in the Wave Digital domain. 431-435
- Song Wang, Ruimin Hu, Shihong Chen, Xiaochen Wang, Bo Peng, Yuhong Yang, Weiping Tu:
Sound physical property matching between non central listening point and central listening point for NHK 22.2 system reproduction. 436-440
- Naoki Murata, Shoichi Koyama, Norihiro Takamune, Hiroshi Saruwatari:
Spatio-temporal sparse sound field decomposition considering acoustic source signal characteristics. 441-445
- Alejandro Cohen, Georg Stemmer, Seppo Ingalsuo, Shmulik Markovich Golan:
Combined Weighted Prediction Error and Minimum Variance Distortionless Response for dereverberation. 446-450
- Dmitry N. Zotkin, Nail A. Gumerov, Ramani Duraiswami:
Incident field recovery for an arbitrary-shaped scatterer. 451-455
- Jacob Donley, Christian H. Ritz, W. Bastiaan Kleijn:
Active speech control using wave-domain processing with a linear wall of dipole secondary sources. 456-460
- Hannes Gamper, David Johnston, Ivan J. Tashev:
Interaural time delay personalisation using incomplete head scans. 461-465
- Panji Setiawan, Wenyu Jin:
Compressing higher order ambisonics of a multizone soundfield. 466-470
- Kainan Chen, Jürgen T. Geiger, Walter Kellermann:
Robust audio localization with phase unwrapping. 471-475
- Toru Taniguchi, Taro Masuda:
Linear demixed domain multichannel nonnegative matrix factorization for speech enhancement. 476-480
- Reza Zolfaghari, Nicolas Epain, Craig T. Jin, Joan Alexis Glaunès, Anthony I. Tew:
Kernel principal component analysis of the ear morphology. 481-485
- Gongping Huang, Jacob Benesty, Jingdong Chen:
Study of the frequency-domain multichannel noise reduction problem with the householder transformation. 486-490
- Christopher Schymura, Juan Diego Rios Grajales, Dorothea Kolossa:
Monte Carlo exploration for active binaural localization. 491-495
- Lin Wang, Andrea Cavallaro:
Time-frequency processing for sound source localization from a micro aerial vehicle. 496-500
- Jesper Rindom Jensen, Mads Græsbøll Christensen, Andreas Jakobsson:
Harmonic minimum mean squared error filters for multichannel speech enhancement. 501-505
- Wenyu Jin, Mohammad Javad Taghizadeh, Kainan Chen, Wei Xiao:
Multi-channel noise reduction for hands-free voice communication on mobile phones. 506-510
- Vishnuvardhan Varanasi, Rajesh M. Hegde:
Robust online direction of arrival estimation using low dimensional spherical harmonic features. 511-515
- Sina Hafezi, Alastair H. Moore, Patrick A. Naylor:
Multiple source localization using Estimation Consistency in the Time-Frequency domain. 516-520
- Alastair H. Moore, Mike Brookes, Patrick A. Naylor:
Robust spherical harmonic domain interpolation of spatially sampled array manifolds. 521-525
- Symeon Delikaris-Manias, Despoina Pavlidi, Athanasios Mouchtaris, Ville Pulkki:
DOA estimation with histogram analysis of spatially constrained active intensity vectors. 526-530
- Paul Magron, Roland Badeau, Bertrand David:
Phase-dependent anisotropic Gaussian model for audio source separation. 531-535
- Francesco Nesta, Zbynek Koldovský:
Supervised independent vector analysis through pilot dependent components. 536-540
- Xiaofei Li, Laurent Girin, Radu Horaud:
Audio source separation based on convolutive transfer function and frequency-domain lasso optimization. 541-545
- Shoaib Mohammed, Ivan Tashev:
A statistical approach to semi-supervised speech enhancement with low-order non-negative matrix factorization. 546-550
- Kousuke Itakura, Yoshiaki Bando, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara:
Bayesian multichannel nonnegative matrix factorization for audio source separation and localization. 551-555
- Baldwin Dumortier, Emmanuel Vincent, Madalina Deaconu:
Recursive Bayesian estimation of the acoustic noise emitted by wind farms. 556-560
- Hideaki Kagami, Hirokazu Kameoka, Masahiro Yukawa:
A majorization-minimization algorithm with projected gradient updates for time-domain spectrogram factorization. 561-565
- Fatemeh Pishdadian, Bryan Pardo, Antoine Liutkus:
A Multi-resolution approach to Common Fate-based audio separation. 566-570