IEEE/ACM Transactions on Audio, Speech and Language Processing, Volume 22
Volume 22, Number 1, January 2014
- Li Deng, Steve Renals, Marcello Federico, Mari Ostendorf:
Editorial: Expanding the Technical Reach of our Transactions. 5
- Jalal Taghia, Rainer Martin:
Objective Intelligibility Measures Based on Mutual Information for Speech Subjected to Speech Enhancement Processing. 6-16
- Liang Lu, Arnab Ghoshal, Steve Renals:
Cross-Lingual Subspace Gaussian Mixture Models for Low-Resource Speech Recognition. 17-27
- Milica Gasic, Steve J. Young:
Gaussian Processes for POMDP-Based Dialogue Manager Optimization. 28-40
- Imen Marrakchi-Mezghani, Gaël Mahé, Sonia Djaziri Larbi, Meriem Jaïdane, Monia Turki-Hadj Alouane:
Nonlinear Audio Systems Identification Through Audio Input Gaussianization. 41-53
- Joao B. Crespo, Richard C. Hendriks:
Multizone Speech Reinforcement. 54-66
- Chao Pan, Jingdong Chen, Jacob Benesty:
Performance Study of the MVDR Beamformer as a Function of the Source Incidence Angle. 67-79
- Hung-yi Lee, Lin-Shan Lee:
Improved Semantic Retrieval of Spoken Content by Document/Query Expansion with Random Walk Over Acoustic Similarity Graphs. 80-94
- Volker Leutnant, Alexander Krueger, Reinhold Haeb-Umbach:
A New Observation Model in the Logarithmic Mel Power Spectral Domain for the Automatic Recognition of Noisy Reverberant Speech. 95-109
- Nancy F. Chen, Sharon W. Tam, Wade Shen, Joseph P. Campbell:
Characterizing Phonetic Transformations and Acoustic Differences Across English Dialects. 110-124
- Dejan Markovic, Konrad Kowalczyk, Fabio Antonacci, Christian Hofmann, Augusto Sarti, Walter Kellermann:
Estimation of Acoustic Reflection Coefficients Through Pseudospectrum Matching. 125-137
- Zhiyao Duan, Jinyu Han, Bryan Pardo:
Multi-pitch Streaming of Harmonic Sound Mixtures. 138-150
- Shilin Liu, Khe Chai Sim:
Temporally Varying Weight Regression: A Semi-Parametric Trajectory Model for Automatic Speech Recognition. 151-160
- Vikrant Singh Tomar, Richard C. Rose:
A Family of Discriminative Manifold Learning Algorithms and Their Application to Speech Recognition. 161-171
- Hironori Doi, Tomoki Toda, Keigo Nakamura, Hiroshi Saruwatari, Kiyohiro Shikano:
Alaryngeal Speech Enhancement Based on One-to-Many Eigenvoice Conversion. 172-183
- Ebru Arisoy, Stanley F. Chen, Bhuvana Ramabhadran, Abhinav Sethy:
Converting Neural Network Language Models into Back-off Language Models for Efficient Decoding in Automatic Speech Recognition. 184-192
- Craig T. Jin, Nicolas Epain, Abhaya Parthy:
Design, Optimization and Evaluation of a Dual-Radius Spherical Microphone Array. 193-204
- Rémi Mignot, Gilles Chardon, Laurent Daudet:
Low Frequency Interpolation of Room Impulse Responses Using Compressed Sensing. 205-216
- Mohammed Senoussaoui, Patrick Kenny, Themos Stafylakis, Pierre Dumouchel:
A Study of the Cosine Distance-Based Mean Shift for Telephone Speech Diarization. 217-227
- Hideyuki Tachibana, Nobutaka Ono, Shigeki Sagayama:
Singing Voice Enhancement in Monaural Music Signals Based on Two-stage Harmonic/Percussive Sound Separation on Multiple Resolution Spectrograms. 228-237
- Noam R. Shabtai, Boaz Rafaely:
Generalized Spherical Array Beamforming for Binaural Speech Reproduction. 238-247
- Sandro Cumani, Pietro Laface:
Factorized Sub-Space Estimation for Fast and Memory Effective I-vector Extraction. 248-259
- Yuan Zeng, Richard C. Hendriks:
Distributed Delay and Sum Beamformer for Speech Enhancement via Randomized Gossip. 260-273
- Zhenghua Li, Min Zhang, Wanxiang Che, Ting Liu, Wenliang Chen:
Joint Optimization for Chinese POS Tagging and Dependency Parsing. 274-286
Volume 22, Number 2, February 2014
- Dehong Gao, Wenjie Li, Xiaoyan Cai, Renxian Zhang, Ouyang You:
Sequential Summarization: A Full View of Twitter Trending Topics. 293-302
- Peter W. J. van Hengel, Johannes D. Krijnders:
A Comparison of Spectro-Temporal Representations of Audio Signals. 303-313
- Imed Zitouni, Yassine Benajiba:
Aligned-Parallel-Corpora Based Semi-Supervised Learning for Arabic Mention Detection. 314-324
- Emilio Molina, Ana M. Barbancho, Lorenzo J. Tardón, Isabel Barbancho:
Dissonance Reduction In Polyphonic Audio Using Harmonic Reorganization. 325-334
- Daniel Pak-Kong Lun, Tak-Wai Shen, K. C. Ho:
A Novel Expectation-Maximization Framework for Speech Enhancement in Non-Stationary Noise Environments. 335-346
- Stefano Cosentino, Tiago H. Falk, David McAlpine, Torsten Marquardt:
Cochlear Implant Filterbank Design and Optimization: A Simulation Study. 347-353
- Mehrez Souden, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani:
Location Feature Integration for Clustering-Based Speech Separation in Distributed Microphone Arrays. 354-367
- Heikki Kallasjoki, Jort F. Gemmeke, Kalle J. Palomäki:
Estimating Uncertainty to Improve Exemplar-Based Feature Enhancement for Noise Robust Speech Recognition. 368-380
- Taufiq Hasan, John H. L. Hansen:
Maximum Likelihood Acoustic Factor Analysis Models for Robust Speaker Verification in Noise. 381-391
- Ofer Schwartz, Sharon Gannot:
Speaker Tracking Using Recursive EM Algorithms. 392-402
- Yu Tsao, Shigeki Matsuda, Chiori Hori, Hideki Kashioka, Chin-Hui Lee:
A MAP-based Online Estimation Approach to Ensemble Speaker and Speaking Environment Modeling. 403-416
- Pui-Yu Hui, Helen Meng:
Latent Semantic Analysis for Multimodal User Input With Speech and Gestures. 417-429
- Jesper Jensen, Cees H. Taal:
Speech Intelligibility Prediction Based on Mutual Information. 430-440
- Andrea Primavera, Stefania Cecchi, Junfeng Li, Francesco Piazza:
Objective and Subjective Investigation on a Novel Method for Digital Reverberator Parameters Estimation. 441-452
- Matt Speed, Damian T. Murphy, David M. Howard:
Modeling the Vocal Tract Transfer Function Using a 3D Digital Waveguide Mesh. 453-464
- Hüseyin Hacihabiboglu:
Theoretical Analysis of Open Spherical Microphone Arrays for Acoustic Intensity Measurements. 465-476
- Taemin Cho, Juan Pablo Bello:
On the Relative Importance of Individual Components of Chord Recognition Systems. 477-492
- Takuma Otsuka, Katsuhiko Ishiguro, Hiroshi Sawada, Hiroshi G. Okuno:
Bayesian Nonparametrics for Microphone Array Processing. 493-504
- Jianjun He, Ee-Leng Tan, Woon-Seng Gan:
Linear Estimation Based Primary-Ambient Extraction for Stereo Audio Signals. 505-517
- Sira Gonzalez, Mike Brookes:
PEFAC - A Pitch Estimation Algorithm Robust to High Levels of Noise. 518-530
- Min Zhang, Xiangyu Duan, Wenliang Chen:
Bayesian Constituent Context Model for Grammar Induction. 531-541
- Dah-Chung Chang, Fei-Tao Chu:
Feedforward Active Noise Control With a New Variable Tap-Length and Step-Size Filtered-X LMS Algorithm. 542-555
- Matt McVicar, Raúl Santos-Rodriguez, Yizhao Ni, Tijl De Bie:
Automatic Chord Estimation from Audio: A Review of the State of the Art. 556-575
Volume 22, Number 3, March 2014
- Chung-Hsien Wu, Yi-Chin Huang, Chung-Han Lee, Jun-Cheng Guo:
Synthesis of Spontaneous Speech With Syllable Contraction Using State-Based Context-Dependent Voice Transformation. 585-595
- Manu Airaksinen, Tuomo Raitio, Brad H. Story, Paavo Alku:
Quasi Closed Phase Glottal Inverse Filtering Analysis With Weighted Linear Prediction. 596-607
- Jae-Mo Yang, Hong-Goo Kang:
Online Speech Dereverberation Algorithm Based on Adaptive Multichannel Linear Prediction. 608-619
- Afsaneh Asaei, Mohammad Golbabaee, Hervé Bourlard, Volkan Cevher:
Structured Sparsity Models for Reverberant Speech Separation. 620-633
- Rajan S. Rashobh, Andy W. H. Khong, Di Liu:
Multichannel Equalization in the KLT and Frequency Domains With Application to Speech Dereverberation. 634-646
- Prasanga N. Samarasinghe, Thushara D. Abhayapala, Mark A. Poletti:
Wavefield Analysis Over Large Areas Using Distributed Higher Order Microphones. 647-658
- Wen-Li Wei, Chung-Hsien Wu, Jen-Chun Lin, Han Li:
Exploiting Psychological Factors for Interaction Style Recognition in Spoken Conversation. 659-671
- Stanislaw Andrzej Raczynski, Emmanuel Vincent:
Genre-Based Music Language Modeling with Latent Hierarchical Pitman-Yor Process Allocation. 672-681
- Dalei Wu, Wei-Ping Zhu, M. N. S. Swamy:
The Theory of Compressive Sensing Matching Pursuit Considering Time-domain Noise with Application to Speech Enhancement. 682-696
- Tejaswi Nanjundaswamy, Kenneth Rose:
Cascaded Long Term Prediction for Enhanced Compression of Polyphonic Audio Signals. 697-710
- Kartik Audhkhasi, Andreas M. Zavou, Panayiotis G. Georgiou, Shrikanth S. Narayanan:
Theoretical Analysis of Diversity in an Ensemble of Automatic Speech Recognition Systems. 711-726
- Joonas Nikunen, Tuomas Virtanen:
Direction of Arrival Based Spatial Covariance Model for Blind Sound Source Separation. 727-739
Volume 22, Number 4, April 2014
- Jinyu Li, Li Deng, Yifan Gong, Reinhold Haeb-Umbach:
An Overview of Noise-Robust Automatic Speech Recognition. 745-777
- Ruhi Sarikaya, Geoffrey E. Hinton, Anoop Deoras:
Application of Deep Belief Networks for Natural Language Understanding. 778-784
- Romain Serizel, Marc Moonen, Bas van Dijk, Jan Wouters:
Low-rank Approximation Based Multichannel Wiener Filter Algorithms for Noise Reduction with Application in Cochlear Implants. 785-799
- Marco Crocco, Andrea Trucco:
Design of Superdirective Planar Arrays With Sparse Aperiodic Layouts for Processing Broadband Signals via 3-D Beamforming. 800-815
- José Ricardo Zapata, Matthew E. P. Davies, Emilia Gómez:
Multi-Feature Beat Tracking. 816-825
- Arun Narayanan, DeLiang Wang:
Investigation of Speech Separation as a Front-End for Noise Robust Speech Recognition. 826-835
- Xiaojia Zhao, Yuxuan Wang, DeLiang Wang:
Robust Speaker Identification in Noisy and Reverberant Conditions. 836-845
- Sandro Cumani, Oldrich Plchot, Pietro Laface:
On the use of i-vector posterior distributions in Probabilistic Linear Discriminant Analysis. 846-857
- Chung-Hsien Wu, Han-Ping Shen, Yan-Ting Yang:
Chinese-English Phone Set Construction for Code-Switching ASR Using Acoustic and DNN-Extracted Articulatory Features. 858-862
Volume 22, Number 5, May 2014
- Weibin Zhang, Pascale Fung:
Discriminatively Trained Sparse Inverse Covariance Matrices for Speech Recognition. 871-880
- Hung-yi Lee, Sz-Rung Shiang, Ching-feng Yeh, Yun-Nung Chen, Yu Huang, Sheng-yi Kong, Lin-Shan Lee:
Spoken Knowledge Organization by Semantic Structuring and a Prototype Course Lecture System for Personalized Learning. 881-896
- Leonardo Zão, Rosângela Coelho, Patrick Flandrin:
Speech Enhancement with EMD and Hurst-Based Mode Selection. 897-909
- Daniele Giacobello, Mads Græsbøll Christensen, Tobias Lindstrøm Jensen, Manohar N. Murthi, Søren Holdt Jensen, Marc Moonen:
Stable 1-Norm Error Minimization Based Linear Predictors for Speech Modeling. 910-920
- Yesenia Lacouture-Parodi, Emanuël A. P. Habets, Jingdong Chen, Jacob Benesty:
Multichannel Noise Reduction in the Karhunen-Loève Expansion Domain. 921-934
- Seyed Omid Sadjadi, John H. L. Hansen:
Blind Spectral Weighting for Robust Speaker Identification under Reverberation Mismatch. 935-943
- Gautam Varma Mantena, Sivanand Achanta, Kishore Prahallad:
Query-by-Example Spoken Term Detection using Frequency Domain Linear Prediction and Non-Segmental Dynamic Time Warping. 944-953
- Christopher Osterwise, Steven L. Grant:
On Over-Determined Frequency Domain BSS. 954-964
- Daniel P. Jarrett, Maja Taseska, Emanuël A. P. Habets, Patrick A. Naylor:
Noise Reduction in the Spherical Harmonic Domain Using a Tradeoff Beamformer and Narrowband DOA Estimates. 965-976
- Verena Rieser, Oliver Lemon, Simon Keizer:
Natural Language Generation as Incremental Planning Under Uncertainty: Adaptive Information Presentation for Statistical Dialogue Systems. 979-993
- Jordan Cheer, Stephen J. Elliott:
Comments on "Complete Parallel Narrowband Active Noise Control Systems". 993-994
Volume 22, Number 6, June 2014
- Vipul Arora, Laxmidhar Behera:
Musical Source Clustering and Identification in Polyphonic Audio. 1003-1012
- Rajeev C. Nongpiur:
Design of Minimax Broadband Beamformers that are Robust to Microphone Gain, Phase, and Position Errors. 1013-1022
- Arun Venkitaraman, Chandra Sekhar Seelamantula:
Binaural Signal Processing Motivated Generalized Analytic Signal Construction and AM-FM Demodulation. 1023-1036
- Jürgen T. Geiger, Felix Weninger, Jort F. Gemmeke, Martin Wöllmer, Björn W. Schuller, Gerhard Rigoll:
Memory-Enhanced Neural Networks and NMF for Robust ASR. 1037-1046
- Haiquan Zhao, Yi Yu, Shibin Gao, Xiangping Zeng, Zhengyou He:
Memory Proportionate APA with Individual Activation Factors for Acoustic Echo Cancellation. 1047-1055
- Mehrdad J. Gangeh, Pouria Fewzee, Ali Ghodsi, Mohamed S. Kamel, Fakhri Karray:
Multiview Supervised Dictionary Learning in Speech Emotion Recognition. 1056-1068
- Jae-Hun Choi, Joon-Hyuk Chang:
Dual-Microphone Voice Activity Detection Technique Based on Two-Step Power Level Difference Ratio. 1069-1081
- Xavier Alameda-Pineda, Radu Horaud:
A Geometric Approach to Sound Source Localization from Time-Delay Estimates. 1082-1095
- Klaus Reindl, Stefan Meier, Hendrik Barfuss, Walter Kellermann:
Minimum Mutual Information-Based Linearly Constrained Broadband Signal Extraction. 1096-1108
Volume 22, Number 7, July 2014
- Mohamad Hasan Bahari, Najim Dehak, Hugo Van hamme, Lukás Burget, Ahmed Ali, Jim Glass:
Non-Negative Factor Analysis of Gaussian Mixture Model Weight Adaptation for Language and Dialect Recognition. 1117-1129
- Guangzhao Bao, Yangfei Xu, Zhongfu Ye:
Learning a Discriminative Dictionary for Single-Channel Speech Separation. 1130-1138
- Ian J. Kelly, Francis M. Boland:
Detecting Arrivals in Room Impulse Responses With Dynamic Time Warping. 1139-1147
- Markus Guldenschuh, Raymond A. de Callafon:
Detection of Secondary-Path Irregularities in Active Noise Control Headphones. 1148-1157
- Sin-Horng Chen, Chiao-Hua Hsieh, Chen-Yu Chiang, Hsi-Chun Hsiao, Yih-Ru Wang, Yuan-Fu Liao, Hsiu-Min Yu:
Modeling of Speaking Rate Influences on Mandarin Speech Prosody and Its Application to Speaking Rate-controlled TTS. 1158-1171
- Danilo Comminiello, Michele Scarpiniti, Luis Antonio Azpicueta-Ruiz, Jerónimo Arenas-García, Aurelio Uncini:
Nonlinear Acoustic Echo Cancellation Based on Sparse Functional Link Representations. 1172-1183
- Wen Zhang, Thushara D. Abhayapala:
Three Dimensional Sound Field Reproduction using Multiple Circular Loudspeaker Arrays: Functional Analysis Guided Approach. 1184-1194
- Maja Taseska, Emanuël A. P. Habets:
Informed Spatial Filtering for Sound Extraction Using Distributed Microphone Arrays. 1195-1207
- Mo Shen, Daisuke Kawahara, Sadao Kurohashi:
Dependency Parse Reranking with Rich Subtree Features. 1208-1218
Volume 22, Number 8, August 2014
- Zhibao Li, Ka Fai Cedric Yiu, Sven Nordholm:
On the Indoor Beamformer Design With Reverberation. 1225-1235
- Matthew B. Hawes, Wei Liu:
Sparse Array Design for Wideband Beamforming With Reduced Complexity in Tapped Delay-Lines. 1236-1247
- Yi FanChiang, Cheng-Wen Wei, Yi-Le Meng, Yu-Wen Lin, Shyh-Jye Jou, Tian-Sheuan Chang:
Low Complexity Formant Estimation Adaptive Feedback Cancellation for Hearing Aids Using Pitch Based Processing. 1248-1259
- Simon Conan, Olivier Derrien, Mitsuko Aramaki, Sølvi Ystad, Richard Kronland-Martinet:
A Synthesis Model With Intuitive Control Capabilities for Rolling Sounds. 1260-1273
- Christian Schüldt, Peter Händel:
Decay Rate Estimators and Their Performance for Blind Reverberation Time Estimation. 1274-1284
- Sriram Ganapathy, Sri Harish Reddy Mallidi, Hynek Hermansky:
Robust Feature Extraction Using Modulation Filtering of Autoregressive Models. 1285-1295
- Bo Li, Khe Chai Sim:
A Spectral Masking Approach to Noise-Robust Speech Recognition Using Deep Neural Networks. 1296-1305
- Emre Yilmaz, Jort Florent Gemmeke, Hugo Van hamme:
Noise Robust Exemplar Matching Using Sparse Representations of Speech. 1306-1319
- Dominic Schmid, Gerald Enzner, Sarmad Malik, Dorothea Kolossa, Rainer Martin:
Variational Bayesian Inference for Multichannel Dereverberation and Noise Reduction. 1320-1335
Volume 22, Number 9, September 2014
- Bruno S. Masiero, Michael Vorländer:
A Framework for the Calculation of Dynamic Crosstalk Cancellation Filters. 1345-1354
- Alexander Schasse, Rainer Martin:
Estimation of Subband Speech Correlations for Noise Reduction via MVDR Processing. 1355-1365
- Michal Novotny, Jan Rusz, Roman Cmejla, Evzen Ruzicka:
Automatic Evaluation of Articulatory Disorders in Parkinson's Disease. 1366-1378
- Felicia Lim, Wancheng Zhang, Emanuël A. P. Habets, Patrick A. Naylor:
Robust Multichannel Dereverberation using Relaxed Multichannel Least Squares. 1379-1390
- Sina Hamidi Ghalehjegh, Richard C. Rose:
Linear Regression Based Acoustic Adaptation for the Subspace Gaussian Mixture Model. 1391-1402
- Jonathan Botts, Lauri Savioja:
Spectral and Pseudospectral Properties of Finite Difference Models Used in Audio and Room Acoustics. 1403-1412
- Yong Xiang, Iynkaran Natgunanathan, Song Guo, Wanlei Zhou, Saeid Nahavandi:
Patchwork-Based Audio Watermarking Method Robust to De-synchronization Attacks. 1413-1423
- Ian Vince McLoughlin:
Super-Audible Voice Activity Detection. 1424-1433
- Atiyeh Alinaghi, Philip J. B. Jackson, Qingju Liu, Wenwu Wang:
Joint Mixing Vector and Binaural Model Based Stereo Source Separation. 1434-1448
Volume 22, Number 10, October 2014
- Liheng Zhao, Jacob Benesty, Jingdong Chen:
Design of Robust Differential Microphone Arrays. 1455-1466