


IEEE/ACM Transactions on Audio, Speech and Language Processing, Volume 23
Volume 23, Number 1, January 2015
- Yong Xu, Jun Du, Li-Rong Dai, Chin-Hui Lee:
  A Regression Approach to Speech Enhancement Based on Deep Neural Networks. 7-19
- Huy Phan, Marco Maaß, Radoslaw Mazur, Alfred Mertins:
  Random Regression Forests for Acoustic Event Detection and Classification. 20-31
- Yuntao Wu, Amir Leshem, Jesper Rindom Jensen, Guisheng Liao:
  Joint Pitch and DOA Estimation Using the ESPRIT Method. 32-45
- Remi Decorsiere, Peter L. Søndergaard, Ewen N. MacDonald, Torsten Dau:
  Inversion of Auditory Spectrograms, Traditional Spectrograms, and Other Envelope Representations. 46-56
- Johann Poignant, Laurent Besacier, Georges Quénot:
  Unsupervised Speaker Identification in TV Broadcast Based on Written Names. 57-68
- Renjie Tong, Yingyue Zhou, Long Zhang, Guangzhao Bao, Zhongfu Ye:
  A Robust Time-Frequency Decomposition Model for Suppression of Mixed Gaussian-Impulse Noise in Audio Signals. 69-79
- Soodeh Ahani, Shahrokh Ghaemmaghami, Z. Jane Wang:
  A Sparse Representation-Based Wavelet Domain Speech Steganography Method. 80-91
- Arun Narayanan, DeLiang Wang:
  Improving Robustness of Deep Neural Network Acoustic Models via Speech Separation and Joint Adaptive Training. 92-101
- Rongfeng Su, Xunying Liu, Lan Wang:
  Automatic Complexity Control of Generalized Variable Parameter HMMs for Noise Robust Speech Recognition. 102-114
- Zixing Zhang, Eduardo Coutinho, Jun Deng, Björn W. Schuller:
  Cooperative Learning and its Application to Emotion Recognition from Speech. 115-126
- Pei-hao Su, Chuan-Hsun Wu, Lin-Shan Lee:
  A Recursive Dialogue Game for Personalized Computer-Aided Pronunciation Training. 127-141
- Alain Rakotomamonjy, Gilles Gasso:
  Histogram of Gradients of Time-Frequency Representations for Audio Scene Classification. 142-153
- Soudeh A. Khoubrouy, Issa M. S. Panahi, John H. L. Hansen:
  Howling Detection in Hearing Aids Based on Generalized Teager-Kaiser Operator. 154-161
- Jens Brehm Bagger Nielsen, Jakob Nielsen, Jan Larsen:
  Perception-Based Personalization of Hearing Aids Using Gaussian Processes and Active Learning. 162-173
- Jesper Rindom Jensen, Mads Græsbøll Christensen, Jacob Benesty, Søren Holdt Jensen:
  Joint Spatio-Temporal Filtering Methods for DOA and Fundamental Frequency Estimation. 174-185
- Jesper Jensen, Zheng-Hua Tan:
  Minimum Mean-Square Error Estimation of Mel-Frequency Cepstral Features-A Theoretically Consistent Approach. 186-197
- Carlos D. Martínez-Hinarejos, José-Miguel Benedí, Vicent Tamarit:
  Unsegmented Dialogue Act Annotation and Decoding With N-Gram Transducers. 198-211
- Lin Wang, Zhe Chen, Fuliang Yin:
  A Novel Hierarchical Decomposition Vector Quantization Method for High-Order LPC Parameters. 212-221
Volume 23, Number 2, February 2015
- Guang Hua, Jonathan Goh, Vrizlynn L. L. Thing:
  Time-Spread Echo-Based Audio Watermarking With Optimized Imperceptibility and Robustness. 227-239
- Ofer Schwartz, Sharon Gannot, Emanuël A. P. Habets:
  Multi-Microphone Speech Dereverberation and Noise Reduction Using Relative Early Transfer Functions. 240-251
- Emilio Molina, Lorenzo J. Tardón, Ana M. Barbancho, Isabel Barbancho:
  SiPTH: Singing Transcription Based on Hysteresis Defined on the Pitch-Time Curve. 252-263
- Haipeng Wang, Tan Lee, Cheung-Chi Leung, Bin Ma, Haizhou Li:
  Acoustic Segment Modeling with Spectral Clustering Methods. 264-277
- Vipul Arora, Laxmidhar Behera:
  Multiple F0 Estimation and Source Clustering of Polyphonic Music Audio Using PLCA and HMRFs. 278-287
- Ryosuke Sugiura, Yutaka Kamamoto, Noboru Harada, Hirokazu Kameoka, Takehiro Moriya:
  Resolution Warped Spectral Representation for Low-Delay and Low-Bit-Rate Audio Coder. 288-299
- Chao Weng, Biing-Hwang Fred Juang:
  Discriminative Training Using Non-Uniform Criteria for Keyword Spotting on Spontaneous Speech. 300-312
- Yoichi Matsuyama, Akihiro Saito, Shinya Fujie, Tetsunori Kobayashi:
  Automatic Expressive Opinion Sentence Generation for Enjoyable Conversational Systems. 313-326
- Petko Nikolov Petkov, W. Bastiaan Kleijn:
  Spectral Dynamics Recovery for Enhanced Speech Intelligibility in Noise. 327-338
- Ergun Biçici, Deniz Yuret:
  Optimizing Instance Selection for Statistical Machine Translation with Feature Decay Algorithms. 339-350
- Mengqiu Zhang, Rodney A. Kennedy, Thushara D. Abhayapala:
  Empirical Determination of Frequency Representation in Spherical Harmonics-Based HRTF Functional Modeling. 351-360
- Zuren Feng, Qing Zhou, Jun Zhang, Ping Jiang, Xuewen Yang:
  A Target Guided Subband Filter for Acoustic Event Detection in Noisy Environments Using Wavelet Packets. 361-372
- Naoki Hirayama, Koichiro Yoshino, Katsutoshi Itoyama, Shinsuke Mori, Hiroshi G. Okuno:
  Automatic Speech Recognition for Mixed Dialect Utterances by Mixing Dialect Language Models. 373-382
- Alexander Schasse, Timo Gerkmann, Rainer Martin, Wolfgang Sörgel, Thomas Pilgrim, Henning Puder:
  Two-Stage Filter-Bank System for Improved Single-Channel Noise Reduction in Hearing Aids. 383-393
- Boaz Schwartz, Sharon Gannot, Emanuël A. P. Habets:
  Online Speech Dereverberation Using Kalman Filter and EM Algorithm. 394-406
- Branislav Gerazov, Zoran A. Ivanovski:
  Kernel Power Flow Orientation Coefficients for Noise-Robust Speech Recognition. 407-419
Volume 23, Number 3, March 2015
- Haizhou Li, Marcello Federico, Xiaodong He, Helen M. Meng, Isabel Trancoso:
  Introduction to the Special Section on Continuous Space and Related Methods in Natural Language Processing. 427-430
- Heike Adel, Ngoc Thang Vu, Katrin Kirchhoff, Dominic Telaar, Tanja Schultz:
  Syntactic and Semantic Features For Code-Switching Factored Language Models. 431-440
- Xiaodong Zeng, Derek F. Wong, Lidia S. Chao, Isabel Trancoso:
  Graph-Based Lexicon Regularization for PCFG With Latent Annotations. 441-450
- Wenliang Chen, Min Zhang, Yue Zhang:
  Distributed Feature Representations for Dependency Parsing. 451-460
- Ruiji Fu, Jiang Guo, Bing Qin, Wanxiang Che, Haifeng Wang, Ting Liu:
  Learning Semantic Hierarchies: A Continuous Vector Space Approach. 461-471
- Rafael E. Banchs, Luis F. D'Haro, Haizhou Li:
  Adequacy-Fluency Metrics: Evaluating MT in the Continuous Space Model Framework. 472-482
- Deyi Xiong, Min Zhang, Xing Wang:
  Topic-Based Coherence Modeling for Statistical Machine Translation. 483-493
- Brian Hutchinson, Mari Ostendorf, Maryam Fazel:
  A Sparse Plus Low-Rank Exponential Language Model for Limited Resource Scenarios. 494-504
- Mohsen A. Rashwan, Ahmad A. Al Sallab, Hazem M. Raafat, Ahmed Rafea:
  Deep Learning Framework with Confused Sub-Set Resolution Architecture for Automatic Arabic Diacritization. 505-516
- Martin Sundermeyer, Hermann Ney, Ralf Schlüter:
  From Feedforward to Recurrent LSTM Neural Networks for Language Modeling. 517-529
- Grégoire Mesnil, Yann N. Dauphin, Kaisheng Yao, Yoshua Bengio, Li Deng, Dilek Hakkani-Tür, Xiaodong He, Larry P. Heck, Gökhan Tür, Dong Yu, Geoffrey Zweig:
  Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding. 530-539
- Ian McLoughlin, Haomin Zhang, Zhipeng Xie, Yan Song, Wei Xiao:
  Robust Sound Event Classification Using Deep Neural Networks. 540-552
- Dusan Zahoransky, Ivan Polásek:
  Text Search of Surnames in Some Slavic and Other Morphologically Rich Languages Using Rule Based Phonetic Algorithms. 553-563
- Yow-Bang Wang, Lin-Shan Lee:
  Supervised Detection and Unsupervised Discovery of Pronunciation Error Patterns for Computer-Assisted Language Learning. 564-579
- Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki:
  Voice Conversion Using RNN Pre-Trained by Recurrent Temporal Restricted Boltzmann Machines. 580-587
- Nicolas Obin, Pierre Lanchantin:
  Symbolic Modeling of Prosody: From Linguistics to Statistics. 588-599
Volume 23, Number 4, April 2015
- Langzhou Chen, Norbert Braunschweiler, Mark J. F. Gales:
  Speaker and Expression Factorization for Audiobook Data: Expressiveness and Transplantation. 605-618
- Xinjie Zhou, Xiaojun Wan, Jianguo Xiao:
  CLOpinionMiner: Opinion Target Extraction in a Cross-Language Scenario. 619-630
- Pan Zhou, Hui Jiang, Li-Rong Dai, Yu Hu, Qingfeng Liu:
  State-Clustering Based Multiple Deep Neural Networks Modeling Approach for Speech Recognition. 631-642
- Ying Hu, Guizhong Liu:
  Separation of Singing Voice Using Nonnegative Matrix Partial Co-Factorization for Singer Identification. 643-653
- Daichi Kitamura, Hiroshi Saruwatari, Hirokazu Kameoka, Yu Takahashi, Kazunobu Kondo, Satoshi Nakamura:
  Multichannel Signal Separation Combining Directional Clustering and Nonnegative Matrix Factorization with Spectrogram Restoration. 654-669
- Van-Khanh Mai, Dominique Pastor, Abdeldjalil Aïssa-El-Bey, Raphaël Le Bidan:
  Robust Estimation of Non-Stationary Noise Power Spectrum for Speech Enhancement. 670-682
- Eduardo Blanco, Dan I. Moldovan:
  A Semantic Logic-Based Approach to Determine Textual Similarity. 683-693
- Myung Jong Kim, Younggwan Kim, Hoirin Kim:
  Automatic Intelligibility Assessment of Dysarthric Speech Using Phonologically-Structured Sparse Linear Model. 694-704
- G. Aneeja, B. Yegnanarayana:
  Single Frequency Filtering Approach for Discriminating Speech and Nonspeech. 705-717
- Antoine Deleforge, Radu Horaud, Yoav Y. Schechner, Laurent Girin:
  Co-Localization of Audio Sources in Images Using Binaural Features and Locally-Linear Regression. 718-731
- David Dov, Ronen Talmon, Israel Cohen:
  Audio-Visual Voice Activity Detection Using Diffusion Maps. 732-745
- Maryam Habibi, Andrei Popescu-Belis:
  Keyword Extraction and Clustering for Document Recommendation in Conversations. 746-759
- Nursadul Mamun, Wissam A. Jassim, Muhammad S. A. Zilany:
  Prediction of Speech Intelligibility Using a Neurogram Orthogonal Polynomial Measure (NOPM). 760-773
- Enzo De Sena, Niccolò Antonello, Marc Moonen, Toon van Waterschoot:
  On the Modeling of Rectangular Geometries in Room Acoustic Simulations. 774-786
- Hao Huang, Haihua Xu, Xianhui Wang, Wushour Silamu:
  Maximum F1-Score Discriminative Training Criterion for Automatic Mispronunciation Detection. 787-797
- Chung-Che Wang, Jyh-Shing Roger Jang:
  Improving Query-by-Singing/Humming by Combining Melody and Lyric Information. 798-806
Volume 23, Number 5, May 2015
- Florian Krebs, Andre Holzapfel, Ali Taylan Cemgil, Gerhard Widmer:
  Inferring Metrical Structure in Music Using Particle Filters. 817-827
- Janghoon Cho, Chang D. Yoo:
  Underdetermined Convolutive BSS: Bayes Risk Minimization Based on a Mixture of Super-Gaussian Posterior Approximation. 828-839
- Hao Mu, Woon-Seng Gan, Ee-Leng Tan:
  An Objective Analysis Method for Perceptual Quality of a Virtual Bass System. 840-850
- Richard C. Hendriks, Joao B. Crespo, Jesper Jensen, Cees H. Taal:
  Optimal Near-End Speech Intelligibility Improvement Incorporating Additive Noise and Late Reverberation Under an Approximation of the Short-Time SII. 851-862
- Ahmed Hussen Abdelaziz, Steffen Zeiler, Dorothea Kolossa:
  Learning Dynamic Stream Weights For Coupled-HMM-Based Audio-Visual Speech Recognition. 863-876
- Reuven Berkun, Israel Cohen, Jacob Benesty:
  Combined Beamformers for Robust Broadband Regularized Superdirective Beamforming. 877-886
- Jeroen Breebaart:
  Evaluation of Statistical Inference Tests Applied to Subjective Audio Quality Data With Small Sample Size. 887-897
- Miroslav Zivanovic:
  Harmonic Bandwidth Companding for Separation of Overlapping Harmonics in Pitched Signals. 898-908
- Jen-Tzung Chien:
  Laplace Group Sensing for Acoustic Models. 909-922
- Ying Wei, Yinfeng Wang:
  Design of Low Complexity Adjustable Filter Bank for Personalized Hearing Aid Solutions. 923-931
- Alfonso Pérez Carrillo, Marcelo M. Wanderley:
  Indirect Acquisition of Violin Instrumental Controls from Audio Signal with Hidden Markov Models. 932-940
- André Mansikkaniemi, Mikko Kurimo:
  Adaptation of Morph-Based Speech Recognition for Foreign Names and Acronyms. 941-950
Volume 23, Number 6, June 2015
- Shih-Hung Liu, Kuan-Yu Chen, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu:
  Combining Relevance Language Modeling and Clarity Measure for Extractive Speech Summarization. 957-969
- Maciej Niedzwiecki, Marcin Ciolek, Krzysztof Cisowski:
  Elimination of Impulsive Disturbances From Stereo Audio Recordings Using Vector Autoregressive Modeling and Variable-order Kalman Filtering. 970-981
- Kun Han, Yuxuan Wang, DeLiang Wang, William S. Woods, Ivo Merks, Tao Zhang:
  Learning Spectral Mapping for Speech Dereverberation and Denoising. 982-992
- Peter Foster, Simon Dixon, Anssi Klapuri:
  Identifying Cover Songs Using Information-Theoretic Measures of Similarity. 993-1005
- Andreas Schwarz, Walter Kellermann:
  Coherent-to-Diffuse Power Ratio Estimation for Dereverberation. 1006-1018
- Milos Cernak, Philip N. Garner, Alexandros Lazaridis, Petr Motlícek, Xingyu Na:
  Incremental Syllable-Context Phonetic Vocoding. 1019-1030
- Mickael Rouvier, Stanislas Oger, Georges Linarès, Driss Matrouf, Bernard Mérialdo, Yingbo Li:
  Audio-Based Video Genre Identification. 1031-1041
- Hirokazu Kameoka, Kota Yoshizato, Tatsuma Ishihara, Kento Kadowaki, Yasunori Ohishi, Kunio Kashino:
  Generative Modeling of Voice Fundamental Frequency Contours. 1042-1053
- Dejan Markovic, Fabio Antonacci, Augusto Sarti, Stefano Tubaro:
  Multiview Soundfield Imaging in the Projective Ray Space. 1054-1067
- Alice P. Bates, Zubair Khalid, Rodney A. Kennedy:
  Novel Sampling Scheme on the Sphere for Head-Related Transfer Function Measurements. 1068-1081
- Mao-shen Jia, Ziyu Yang, Changchun Bao, Xiguang Zheng, Christian H. Ritz:
  Encoding Multiple Audio Objects Using Intra-Object Sparsity. 1082-1095
Volume 23, Number 7, July 2015
- Matt McVicar, Satoru Fukayama, Masataka Goto:
  AutoGuitarTab: Computer-Aided Composition of Rhythm and Lead Guitar Parts in the Tablature Space. 1105-1117
- Maarten Van Segbroeck, Ruchir Travadi, Shrikanth S. Narayanan:
  Rapid Language Identification. 1118-1129
- Damián Marelli, Robert Baumgartner, Piotr Majdak:
  Efficient Approximation of Head-Related Transfer Functions in Subbands for Accurate Sound Localization. 1130-1143
- Ching-feng Yeh, Lin-Shan Lee:
  An Improved Framework for Recognizing Highly Imbalanced Bilingual Code-Switched Lectures with Cross-Language Acoustic Modeling and Frame-Level Language Identification. 1144-1159
- Dogac Basaran, Ali Taylan Cemgil, Emin Anarim:
  A Probabilistic Model-Based Approach for Aligning Multiple Audio Sequences. 1160-1171
- Dongpeng Chen, Brian Kan-Wing Mak:
  Multitask Learning of Deep Neural Networks for Low-Resource Speech Recognition. 1172-1183
- Thomas Meyer, Najeh Hajlaoui, Andrei Popescu-Belis:
  Disambiguating Discourse Connectives for Statistical Machine Translation. 1184-1197
- Ulpu Remes, Ana Ramírez López, Kalle J. Palomäki, Mikko Kurimo:
  Bounded Conditional Mean Imputation with Observation Uncertainties and Acoustic Model Adaptation. 1198-1208
- Rui Wang, Hai Zhao, Bao-Liang Lu, Masao Utiyama, Eiichiro Sumita:
  Bilingual Continuous-Space Language Model Growing for Statistical Machine Translation. 1209-1220
- Tze Yuang Chong, Rafael E. Banchs, Engsiong Chng, Haizhou Li:
  Decoupling Word-Pair Distance and Co-occurrence Information for Effective Long History Context Language Modeling. 1221-1232
- Meng Sun, Yinan Li, Jort F. Gemmeke, Xiongwei Zhang:
  Speech Enhancement Under Low SNR Conditions Via Noise Estimation Using Sparse and Low-Rank NMF with Kullback-Leibler Divergence. 1233-1242
Volume 23, Number 8, August 2015
- Hajar Momeni, Hamid Reza Abutalebi, Aliakbar Tadaion:
  Joint Detection and Estimation of Speech Spectral Amplitude Using Noncontinuous Gain Functions. 1249-1258
- Jen-Tzung Chien:
  Hierarchical Pitman-Yor-Dirichlet Language Model. 1259-1272
- Mehdi Fallahpour, David Megías:
  Audio Watermarking Based on Fibonacci Numbers. 1273-1282
- Pejman Mowlaee, Josef Kulmer:
  Phase Estimation in Single-Channel Speech Enhancement: Limits-Potential. 1283-1294
- Mohamed Morchid, Mohamed Bouallegue, Richard Dufour, Georges Linarès, Driss Matrouf, Renato De Mori:
  Compact Multiview Representation of Documents Based on the Total Variability Space. 1295-1308
- Ryosuke Sugiura, Yutaka Kamamoto, Noboru Harada, Hirokazu Kameoka, Takehiro Moriya:
  Optimal Coding of Generalized-Gaussian-Distributed Frequency Spectra for Low-Delay Audio Coder With Powered All-Pole Spectrum Estimation. 1309-1321
- Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang, Ea-Ee Jan, Wen-Lian Hsu, Hsin-Hsi Chen:
  Extractive Broadcast News Summarization Leveraging Recurrent Neural Network Language Modeling Techniques. 1322-1334
- Zbynek Koldovský, Jirí Málek, Sharon Gannot:
  Spatial Source Subtraction Based on Incomplete Measurements of Relative Transfer Function. 1335-1347
- Dimitrios Dimitriadis, Enrico Bocchieri:
  Use of Micro-Modulation Features in Large Vocabulary Continuous Speech Recognition Tasks. 1348-1357
- Xun Wang, Yasuhisa Yoshida, Tsutomu Hirao, Katsuhito Sudoh, Masaaki Nagata:
  Summarization Based on Task-Oriented Discourse Parsing. 1358-1367
- Carlos Spa, Antón Rey, Erwin Hernández:
  A GPU Implementation of an Explicit Compact FDTD Algorithm with a Digital Impedance Filter for Room Acoustics Applications. 1368-1380
Volume 23, Number 9, September 2015
- Lin-Shan Lee, James R. Glass, Hung-yi Lee, Chun-an Chan:
  Spoken Content Retrieval - Beyond Cascading Speech Recognition with Text Retrieval. 1389-1420
- Yishan Jiao, Visar Berisha, Ming Tu, Julie Liss:
  Convex Weighting Criteria for Speaking Rate Estimation. 1421-1430
- Jianjun He, Woon-Seng Gan, Ee-Leng Tan:
  Primary-Ambient Extraction Using Ambient Spectrum Estimation for Immersive Spatial Audio Reproduction. 1431-1444
- Qing Shen, Wei Liu, Wei Cui, Siliang Wu, Yimin D. Zhang, Moeness G. Amin:
  Low-Complexity Direction-of-Arrival Estimation Based on Wideband Co-Prime Arrays. 1445-1456
- Yu-Ren Chien, Hsin-Min Wang, Shyh-Kang Jeng:
  An Acoustic-Phonetic Model of F0 Likelihood for Vocal Melody Extraction. 1457-1468
- Xiaodong Cui, Vaibhava Goel, Brian Kingsbury:
  Data Augmentation for Deep Neural Network Acoustic Modeling. 1469-1477
- Enzo De Sena, Hüseyin Hacihabiboglu, Zoran Cvetkovic, Julius O. Smith III:
  Efficient Synthesis of Room Acoustics via Scattering Delay Networks. 1478-1492
- Lin Wang, Timo Gerkmann, Simon Doclo:
  Noise Power Spectral Density Estimation Using MaxNSR Blocking Matrix. 1493-1508
- Ante Jukic, Toon van Waterschoot, Timo Gerkmann, Simon Doclo:
  Multi-Channel Linear Prediction-Based Speech Dereverberation With Sparse Priors. 1509-1520
- Pejman Mowlaee, Josef Kulmer:
  Harmonic Phase Estimation in Single-Channel Speech Enhancement Using Phase Decomposition and SNR Information. 1521-1532
Volume 23, Number 10, October 2015
- Sakari Tervo, Archontis Politis:
  Direction of Arrival Estimation of Reflections from Room Impulse Responses Using a Spherical Microphone Array. 1539-1551
- Jia-Ching Wang, Yu-Hao Chin, Bo-Wei Chen, Chang-Hong Lin, Chung-Hsien Wu:
  Speech Emotion Verification Using Emotion Variance Modeling and Discriminant Scale-Frequency Maps. 1552-1562
- Antonio Canclini, Paolo Bestagini, Fabio Antonacci, Marco Compagnoni, Augusto Sarti, Stefano Tubaro:
  A Robust and Low-Complexity Source Localization Algorithm for Asynchronous Distributed Microphone Networks. 1563-1575
- Jianjun He, Woon-Seng Gan, Ee-Leng Tan:
  Time-Shifting Based Primary-Ambient Extraction for Spatial Audio Reproduction. 1576-1588
- Pratik Shah, Ian Lewis, Steven L. Grant, Sylvain Angrignon:
  Nonlinear Acoustic Echo Cancellation Using Voltage and Current Feedback. 1589-1599
- Li Su, Yi-Hsuan Yang:
  Combining Spectral and Temporal Representations for Multipitch Estimation of Polyphonic Music. 1600-1612
- Toyota Fujioka, Yoshifumi Nagata, Masato Abe:
  High-Precision Harmonic Distortion Level Measurement of a Loudspeaker Using Adaptive Filters in a Noisy Environment. 1613-1622
- Tsz-Kin Hon, Lin Wang, Joshua D. Reiss, Andrea Cavallaro:
  Audio Fingerprinting for Multi-Device Self-Localization. 1623-1636
- Ye Tian, Zhe Chen, Fuliang Yin:
  Distributed IMM-Unscented Kalman Filter for Speaker Tracking in Microphone Array Networks. 1637-1647
- Na Li, Man-Wai Mak:
  SNR-Invariant PLDA Modeling in Nonparametric Subspace for Robust Speaker Verification. 1648-1659
- Juha Vilkamo, Symeon Delikaris-Manias:
  Perceptual Reproduction of Spatial Sound Using Loudspeaker-Signal-Domain Parametrization. 1660-1669
- Chao Weng, Dong Yu, Michael L. Seltzer, Jasha Droppo:
  Deep Neural Networks for Single-Channel Multi-Talker Speech Recognition. 1670-1679
- Marco Ruhland, Jörg Bitzer, Matthias Brandt, Stefan Goetze:
  Reduction of Gaussian, Supergaussian, and Impulsive Noise by Interpolation of the Binary Mask Residual. 1680-1691
- Yuval Dorfan, Sharon Gannot:
  Tree-Based Recursive Expectation-Maximization Algorithm for Localization of Acoustic Sources. 1692-1703
Volume 23, Number 11, November 2015
- Auxiliadora Sarmiento, Iván Durán-Díaz, Andrzej Cichocki, Sergio Cruces:
  A Contrast Function Based on Generalized Divergences for Solving the Permutation Problem in Convolved Speech Mixtures. 1713-1726
- Xiaojia Zhao, Yuxuan Wang, DeLiang Wang:
  Cochannel Speaker Identification in Anechoic and Reverberant Conditions. 1727-1736
- Liang-Yu Chen, Jyh-Shing Roger Jang:
  Automatic Pronunciation Scoring with Score Combination by Learning to Rank and Class-Normalized DP-Based Quantization. 1737-1749
- Duyu Tang, Bing Qin, Furu Wei, Li Dong, Ting Liu, Ming Zhou:
  A Joint Segmentation and Classification Framework for Sentence Level Sentiment Classification. 1750-1761
- Falk-Martin Hoffmann, Filippo Maria Fazi:
  Theoretical Study of Acoustic Circular Arrays With Tangential Pressure Gradient Sensors. 1762-1774
- Nathan Souviraà-Labastie, Anaïk Olivero, Emmanuel Vincent, Frédéric Bimbot:
  Multi-Channel Audio Source Separation Using Multiple Deformed References. 1775-1787
- Deepak Baby, Tuomas Virtanen, Jort F. Gemmeke, Hugo Van hamme:
  Coupled Dictionaries for Exemplar-Based Speech Enhancement and Automatic Speech Recognition. 1788-1799
- Md Tauhidul Islam, Celia Shahnaz, Wei-Ping Zhu, M. Omair Ahmad:
  Speech Enhancement Based on Student t Modeling of Teager Energy Operated Perceptual Wavelet Packet Coefficients and a Custom Thresholding Function. 1800-1811
- Quynh Ngoc Thi Do, Steven Bethard, Marie-Francine Moens:
  Domain Adaptation in Semantic Role Labeling Using a Neural Language Model and Linguistic Resources. 1812-1823
- Haricharan Aragonda, Chandra Sekhar Seelamantula:
  Demodulation of Narrowband Speech Spectrograms Using the Riesz Transform. 1824-1834
- Dung T. Tran, Emmanuel Vincent, Denis Jouvet:
  Nonparametric Uncertainty Estimation and Propagation for Noise Robust ASR. 1835-1846
- Mei Tu, Yu Zhou, Chengqing Zong:
  Exploring Diverse Features for Statistical Machine Translation Model Pruning. 1847-1857
- Greg Okopal, Scott Wisdom, Les Atlas:
  Speech Analysis With the Strong Uncorrelating Transform. 1858-1868
- Marcos F. Simón Gálvez, Stephen J. Elliott, Jordan Cheer:
  Time Domain Optimization of Filters Used in a Loudspeaker Array for Personal Audio. 1869-1878
- Mohammad Hadi Bokaei, Hossein Sameti, Yang Liu:
  Linear Discourse Segmentation of Multi-Party Meetings Based on Local and Global Information. 1879-1891
- Chung-Hsien Wu, Han-Ping Shen, Chun-Shan Hsu:
  Code-Switching Event Detection by Using a Latent Language Space Model and the Delta-Bayesian Information Criterion. 1892-1903
- Zhangli Chen, Volker Hohmann:
  Online Monaural Speech Enhancement Based on Periodicity Analysis and A Priori SNR Estimation. 1904-1916
- Saeed Sarreshtedari, Mohammad Ali Akhaee, Aliazam Abbasfar:
  A Watermarking Method for Digital Speech Self-Recovery. 1917-1925
- Niko Moritz, Jörn Anemüller, Birger Kollmeier:
  An Auditory Inspired Amplitude Modulation Filter Bank for Robust Feature Extraction in Automatic Speech Recognition. 1926-1937
- Yajie Miao, Hao Zhang, Florian Metze:
  Speaker Adaptive Training of Deep Neural Network Acoustic Models Using I-Vectors. 1938-1949
- Veronica Morfi, Gilles Degottex, Athanasios Mouchtaris:
  Speech Analysis and Synthesis with a Computationally Efficient Adaptive Harmonic Model. 1950-1962
- Jonathan William Dennis, Tran Huy Dat, Haizhou Li:
  Generalized Hough Transform for Speech Pattern Classification. 1963-1972
- Feng Deng, Changchun Bao, W. Bastiaan Kleijn:
  Sparse Hidden Markov Models for Speech Enhancement in Non-Stationary Noise Environments. 1973-1987
- Rishabh Ranjan, Woon-Seng Gan:
  Natural Listening over Headphones in Augmented Reality Using Adaptive Filtering Techniques. 1988-2002
- Ling-Hui Chen, Tuomo Raitio, Cassia Valentini-Botinhao, Zhen-Hua Ling, Junichi Yamagishi:
  A Deep Generative Architecture for Postfiltering in Statistical Parametric Speech Synthesis. 2003-2014
- Ho Seon Shin, Tim Fingscheidt, Hong-Goo Kang:
  A Priori SNR Estimation Using Air- and Bone-Conduction Microphones. 2015-2025
- Ji Wu, Miao Li, Chin-Hui Lee:
  A Probabilistic Framework for Representing Dialog Systems and Entropy-Based Dialog Management Through Dynamic Stochastic State Evolution. 2026-2035
- Sandro Cumani:
  Fast Scoring of Full Posterior PLDA Models. 2036-2045
- Vladimir Tourbabin, Boaz Rafaely:
  Direction of Arrival Estimation Using Microphone Array Processing for Moving Humanoid Robots. 2046-2058
- Y. J. Chu, S. C. Chan:
  A New Local Polynomial Modeling-Based Variable Forgetting Factor RLS Algorithm and Its Acoustic Applications. 2059-2069
- Fernando de-la-Calle-Silos, Francisco J. Valverde-Albacete, Ascensión Gallardo-Antolín, Carmen Peláez-Moreno:
  Morphologically Filtered Power-Normalized Cochleograms as Robust, Biologically Inspired Features for ASR. 2070-2080
- Tsutomu Hirao, Masaaki Nishino, Yasuhisa Yoshida, Jun Suzuki, Norihito Yasuda, Masaaki Nagata:
  Summarizing a Document by Trimming the Discourse Tree. 2081-2092
- Chao Pan, Jingdong Chen, Jacob Benesty:
  Theoretical Analysis of Differential Microphone Array Beamforming and an Improved Solution. 2093-2105
Volume 23, Number 12, December 2015
- Wanxiang Che, Yanyan Zhao, Honglei Guo, Zhong Su, Ting Liu:
  Sentence Compression for Aspect-Based Sentiment Analysis. 2111-2124
- Jonathan Sheaffer, Maarten van Walstijn, Boaz Rafaely, Konrad Kowalczyk:
  Binaural Reproduction of Finite Difference Simulations Using Spherical Array Processing. 2125-2135
- Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, Paris Smaragdis:
  Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation. 2136-2147
- Aaron Heidel, Hsiang-Hung Lu, Lin-Shan Lee:
  Finding Complex Features for Guest Language Fragment Recovery in Resource-Limited Code-Mixed Speech Recognition. 2148-2161
- Daniel Marquardt, Volker Hohmann, Simon Doclo:
  Interaural Coherence Preservation in Multi-Channel Wiener Filtering-Based Noise Reduction for Binaural Hearing Aids. 2162-2176
- Kai Yu, Kai Sun, Lu Chen, Su Zhu:
  Constrained Markov Bayesian Polynomial for Efficient Dialogue State Tracking. 2177-2188
- Craig A. Anderson, Paul D. Teal, Mark A. Poletti:
  Spatially Robust Far-field Beamforming Using the von Mises(-Fisher) Distribution. 2189-2197
- Jens Schröder, Stefan Goetze, Jörn Anemüller:
  Spectro-Temporal Gabor Filterbank Features for Acoustic Event Detection. 2198-2208
- Inseok Heo, William A. Sethares:
  Classification Based on Speech Rhythm via a Temporal Alignment of Spoken Sentences. 2209-2216
- Prasanga N. Samarasinghe, Thushara D. Abhayapala, Mark A. Poletti, Terence Betlehem:
  An Efficient Parameterization of the Room Transfer Function. 2217-2227
- Yong Xiang, Iynkaran Natgunanathan, Yue Rong, Song Guo:
  Spread Spectrum-Based High Embedding Capacity Watermarking Method for Audio Signals. 2228-2237
- In-Chul Yoo, Hyeontaek Lim, Dongsuk Yook:
  Formant-Based Robust Voice Activity Detection. 2238-2245
- Thomas Hueber, Laurent Girin, Xavier Alameda-Pineda, Gérard Bailly:
  Speaker-Adaptive Acoustic-Articulatory Inversion Using Cascaded Gaussian Mixture Regression. 2246-2259
- Hequn Bai, Gaël Richard, Laurent Daudet:
  Late Reverberation Synthesis: From Radiance Transfer to Feedback Delay Networks. 2260-2271
- Ilker Bayram:
  A Multichannel Audio Denoising Formulation Based on Spectral Sparsity. 2272-2285
- Héctor Delgado, Xavier Anguera, Corinne Fredouille, Javier Serrano:
  Fast Single- and Cross-Show Speaker Diarization Using Binary Key Speaker Modeling. 2286-2297
- Winston S. Percybrooks, Elliot Moore:
  A New HMM-Based Voice Conversion Methodology Evaluated on Monolingual and Cross-Lingual Conversion Tasks. 2298-2310
- Marwa Graja, Maher Jaoua, Lamia Hadrich Belguith:
  Statistical Framework with Knowledge Base Integration for Robust Speech Understanding of the Tunisian Dialect. 2311-2321
- Falco Strasser, Henning Puder:
  Adaptive Feedback Cancellation for Realistic Hearing Aid Applications. 2322-2333
- Yu Ting Yeung, Tan Lee, Cheung-Chi Leung:
  Supervised Single-Microphone Multi-Talker Speech Separation with Conditional Random Fields. 2334-2342
- Wenyu Jin, W. Bastiaan Kleijn:
  Theory and Design of Multizone Soundfield Reproduction Using Sparse Methods. 2343-2355
- Xionghu Zhong, James R. Hopgood:
  A Time-Frequency Masking Based Random Finite Set Particle Filtering Method for Multiple Acoustic Source Detection and Tracking. 2356-2370
- Karthika Vijayan, K. Sri Rama Murty:
  Analysis of Phase Spectrum of Speech Signals Using Allpass Modeling. 2371-2383
- Daniel Marquardt, Elior Hadad, Sharon Gannot, Simon Doclo:
  Theoretical Analysis of Linearly Constrained Multi-Channel Wiener Filtering Algorithms for Combined Noise Reduction and Binaural Cue Preservation in Binaural Hearing Aids. 2384-2397
- Matthias Zöhrer, Robert Peharz, Franz Pernkopf:
  Representation Learning for Single-Channel Source Separation and Bandwidth Extension. 2398-2409
- Hao Fang, Mari Ostendorf, Peter Baumann, Janet B. Pierrehumbert:
  Exponential Language Modeling Using Morphological Features and Multi-Task Learning. 2410-2421
- Michael A. Carlin, Mounya Elhilali:
  A Framework for Speech Activity Detection Using Adaptive Auditory Receptive Fields. 2422-2433
- Shinya Saito, Kunio Oishi, Toshihiro Furukawa:
  Convolutive Blind Source Separation Using an Iterative Least-Squares Algorithm for Non-Orthogonal Approximate Joint Diagonalization. 2434-2448
- Elior Hadad, Daniel Marquardt, Simon Doclo, Sharon Gannot:
  Theoretical Analysis of Binaural Transfer Function MVDR Beamformers with Interference Cue Preservation Constraints. 2449-2464
- Guang Yang, Richard F. Lyon, Emmanuel M. Drakakis:
  Psychophysical Evaluation of An Ultra-Low Power, Analog Biomimetic Cochlear Implant Processor Filterbank Architecture With Across Channels AGC. 2465-2473
