IEEE Transactions on Audio, Speech & Language Processing, Volume 14
Volume 14, Number 1, January 2006
- Lie Lu, Dan Liu, HongJiang Zhang:
Automatic mood detection and tracking of music audio signals. 5-18
- Ning Ma, Martin Bouchard, Rafik A. Goubran:
Speech enhancement using a masking threshold constrained Kalman filter and its heuristic implementations. 19-32
- James D. Gordy, Rafik A. Goubran:
On the perceptual performance limitations of echo cancellers in wideband telephony. 33-42
- Marcus Holmberg, David Gelbart, Werner Hemmert:
Automatic speech recognition with an adaptation model motivated by auditory processing. 43-49
- Thomas Blumensath, Mike E. Davies:
Sparse and shift-invariant representations of music. 50-57
- Sue Harding, Jon P. Barker, Guy J. Brown:
Mask estimation for missing data speech recognition based on statistics of binaural interaction. 58-67
- Slim Essid, Gaël Richard, Bertrand David:
Instrument recognition in polyphonic music based on automatic taxonomies. 68-80
- Fabian Mörchen, Alfred Ultsch, Michael Thies, Ingo Lohken:
Modeling timbre distance with temporal statistics from polyphonic music. 81-90
- Emmanuel Vincent:
Musical source separation using time-frequency source priors. 91-98
- Mads Græsbøll Christensen, Søren Holdt Jensen:
On perceptual distortion minimization and nonlinear least-squares frequency estimation. 99-109
- Alberto González, Maria de Diego, Miguel Ferrer, Gema Pinero:
Multichannel active noise equalization of interior noise. 110-122
- Yoichi Hinamoto, Hideaki Sakai:
Analysis of the filtered-X LMS algorithm and a related new algorithm for active control of multitonal noise. 123-130
- Norman H. Adams, Mark A. Bartsch, Gregory H. Wakefield:
Note segmentation and quantization for music information retrieval. 131-141
- Norman D. Cook, Takashi X. Fujisawa, Kazuaki Takami:
Evaluation of the affective valence of speech using pitch substructure. 142-151
- Anand D. Subramaniam, William R. Gardner, Bhaskar D. Rao:
Iterative joint source-channel decoding of speech spectrum parameters over an additive white Gaussian noise channel. 152-162
- Sriram Srinivasan, Jonas Samuelsson, W. Bastiaan Kleijn:
Codebook driven short-term predictor parameter estimation for speech enhancement. 163-176
- Yoshifumi Nagata, Toyota Fujioka, Masato Abe:
Speech enhancement based on auto gain control. 177-190
- Laurent Benaroya, Frédéric Bimbot, Rémi Gribonval:
Audio source separation with a single sensor. 191-199
- Kostas Kokkinakis, Asoke K. Nandi:
Multichannel blind deconvolution for source separation in convolutive mixtures of speech. 200-212
- Narendra K. Gupta, Gökhan Tür, Dilek Hakkani-Tür, Srinivas Bangalore, Giuseppe Riccardi, Mazin Gilbert:
The AT&T spoken language understanding system. 213-222
- Ben Milner, Alastair Bruce James:
Robust speech recognition over mobile and IP networks in burst-like packet loss. 223-231
- Ken Chen, Mark Hasegawa-Johnson, Aaron Cohen, Sarah Borys, Sung-Suk Kim, Jennifer Cole, Jeung-Yoon Choi:
Prosody dependent speech recognition on radio news corpus of American English. 232-245
- Néstor Becerra Yoma, Carlos Molina, Jorge F. Silva, Carlos Busso:
Modeling, estimating, and compensating low-bit rate coding distortion in speech recognition. 246-255
- Li Deng, Dong Yu, Alex Acero:
A bidirectional target-filtering model of speech coarticulation and reduction: two-stage implementation for phonetic recognition. 256-265
- Chung-Hsien Wu, Yu-Hsien Chiu, Chi-Jiun Shia, Chun-Yu Lin:
Automatic segmentation and identification of mixed-language speech using delta-BIC and LSA-based GMMs. 266-276
- Tomi Kinnunen, Evgeny Karpov, Pasi Fränti:
Real-time speaker identification and verification. 277-288
- Yang Shao, DeLiang Wang:
Model-based sequential organization in cochannel speech. 289-298
- Christof Faller:
Parametric multichannel audio coding: synthesis of coherence cues. 299-310
- Renat Vafin, W. Bastiaan Kleijn:
Rate-distortion optimized quantization in multistage audio coding. 311-320
- Antti J. Eronen, Vesa T. Peltonen, Juha T. Tuomi, Anssi Klapuri, Seppo Fagerlund, Timo Sorsa, Gaëtan Lorho, Jyri Huopaniemi:
Audio-based context recognition. 321-329
- Wei-Ho Tsai, Hsin-Min Wang:
Automatic singer recognition of popular music recordings via estimation and modeling of solo vocal signals. 330-341
- Anssi Klapuri, Antti J. Eronen, Jaakko Astola:
Analysis of the meter of acoustic musical signals. 342-355
- Vaibhava Goel, Shankar Kumar, William Byrne:
Corrections to "Segmental minimum Bayes-risk decoding for automatic speech recognition". 356-357
Volume 14, Number 2, March 2006
- Satoshi Nakamura, Konstantin Markov, Hiromi Nakaiwa, Gen-ichiro Kikui, Hisashi Kawai, Takatoshi Jitsuhiro, Jinsong Zhang, Hirofumi Yamamoto, Eiichiro Sumita, Seiichi Yamamoto:
The ATR multilingual speech-to-speech translation system. 365-376
- Liang Gu, Yuqing Gao, Fu-Hua Liu, Michael Picheny:
Concept-based speech-to-speech translation using maximum entropy models for statistical natural concept generation. 377-392
- Yasuhiro Akiba, Kenji Imamura, Eiichiro Sumita, Hiromi Nakaiwa, Shun'ichi Yamamoto, Hiroshi G. Okuno:
Using multiple edit distances to automatically grade outputs from machine translation systems. 393-402
- Tanja Schultz, Alan W. Black, Stephan Vogel, Monika Woszczyna:
Flexible speech translation systems. 403-411
- Alan Davis, Sven Nordholm, Roberto Togneri:
Statistical voice activity detection using low-variance spectrum estimation and an adaptive threshold. 412-424
- Li Deng, Alex Acero, Issam Bazzi:
Tracking vocal tract resonances using a quantized nonlinear function embedded in a temporal constraint. 425-434
- Kamran Mustafa, Ian C. Bruce:
Robust formant tracking for continuous speech with speaker variability. 435-444
- Huiqun Deng, Rabab K. Ward, Michael P. Beddoes, Murray Hodgson:
A new method for obtaining accurate estimates of vocal-tract filters and glottal waves from vowel sounds. 445-455
- Mike Brookes, Patrick A. Naylor, Jón Guðnason:
A quantitative assessment of group delay methods for identifying glottal closures in voiced speech. 456-466
- Ran D. Zilca, Brian Kingsbury, Jirí Navrátil, Ganesh N. Ramaswamy:
Pseudo pitch synchronous analysis of speech with applications to speaker recognition. 467-478
- Saeed Gazor, Reza Rashidi Far:
Adaptive maximum windowed likelihood multicomponent AM-FM signal decomposition. 479-491
- Qiang Fu, Peter Murphy:
Robust glottal source estimation based on joint source-filter model optimization. 492-501
- Etan Fisher, Joseph Tabrikian, Shlomo Dubnov:
Generalized likelihood ratio test for voiced-unvoiced decision in noisy speech using the harmonic model. 502-510
- Doroteo T. Toledano, Jesús Gómez Villardebó, Luis A. Hernández Gómez:
Initialization, training, and context-dependency in HMM-based formant tracking. 511-523
- Anand D. Subramaniam, William R. Gardner, Bhaskar D. Rao:
Low-complexity source coding using Gaussian mixture models, lattice vector quantization, and recursive coding with application to speech spectrum quantization. 524-532
- Thomas F. Quatieri, Kevin Brady, D. Messing, Joseph P. Campbell, William M. Campbell, Michael S. Brandstein, Clifford J. Weinstein, John D. Tardelli, Paul D. Gatewood:
Exploiting nonacoustic sensors for speech encoding. 533-544
- Hui Dong, Jerry D. Gibson:
Structures for SNR scalable speech coding. 545-557
- Udaya Bhaskar, Kumar Swaminathan:
Low bit-rate voice compression based on frequency domain interpolative techniques. 558-576
- Harald Gustafsson, Ulf A. Lindgren, Ingvar Claesson:
Low-complexity feature-mapped speech bandwidth extension. 577-588
- Olivier Pietquin, Thierry Dutoit:
A probabilistic framework for dialog simulation and optimal strategy learning. 589-599
- Bojana Gajic, Kuldip K. Paliwal:
Robust speech recognition in noisy environments based on subband spectral centroid histograms. 600-608
- Hossein Najaf-Zadeh, Peter Kabal:
Perceptual coding of narrow-band audio signals at low rates. 609-622
- Ashish Aggarwal, Shankar L. Regunathan, Kenneth Rose:
A trellis-based optimal parameter value selection for audio coding. 623-633
- Pongtep Angkititrakul, John H. L. Hansen:
Advances in phone-based modeling for automatic accent classification. 634-646
- Chung-Hsien Wu, Chia-Hsin Hsieh:
Multiple change-point audio segmentation and classification using an MDL-based Gaussian model. 647-657
- Ngwa A. Shusina, Boaz Rafaely:
Unbiased adaptive feedback cancellation in hearing aids by closed-loop identification. 658-665
- Hiroshi Saruwatari, Toshiya Kawamura, Tsuyoki Nishikawa, Akinobu Lee, Kiyohiro Shikano:
Blind source separation based on a fast-convergence algorithm combining ICA and beamforming. 666-678
- Ali Taylan Cemgil, Hilbert J. Kappen, David Barber:
A generative model for music transcription. 679-694
- Mitsuko Aramaki, Richard Kronland-Martinet:
Analysis-synthesis of impact sounds by real-time dynamic filtering. 695-705
- Kelvin Chee-Mun Lee, Woon-Seng Gan:
Bandwidth-efficient recursive pth-order equalization for correcting baseband distortion in parametric loudspeakers. 706-710
- L. E. Rees, Stephen J. Elliott:
Adaptive algorithms for active sound-profiling. 711-719
- Muhammad Tahir Akhtar, Masahide Abe, Masayuki Kawamata:
A new variable step size LMS algorithm-based method for improved online secondary path modeling in active noise control systems. 720-726
- Thomas Hain, Philip C. Woodland, Gunnar Evermann, Mark J. F. Gales, Xunying Liu, Gareth L. Moore, Daniel Povey, Lan Wang:
Corrections to "Automatic Transcription of Conversational Telephone Speech". 727-727
Volume 14, Number 3, May 2006
- S. Ramamohan, Samarendra Dandapat:
Sinusoidal model-based analysis and classification of stressed speech. 737-746
- Joon-Hyuk Chang, Nam Soo Kim:
A new structural approach in system identification with generalized analysis-by-synthesis for robust speech coding. 747-751
- Christoffer Asgaard Rødbro, Jesper Jensen, Richard Heusdens:
Rate-distortion optimal time-segmentation and redundancy selection for VoIP. 752-763
- Volodya Grancharov, Jonas Samuelsson, W. Bastiaan Kleijn:
On causal algorithms for speech enhancement. 764-773
- Mingyang Wu, DeLiang Wang:
A two-stage algorithm for one-microphone reverberant speech enhancement. 774-784
- Andy W. H. Khong, Patrick A. Naylor:
Stereophonic acoustic echo cancellation employing selective-tap adaptive algorithms. 785-796
- Jen-Tzung Chien, Chih-Hsien Huang:
Aggregate a posteriori linear regression adaptation. 797-807
- Jeih-Weih Hung, Lin-Shan Lee:
Optimization of temporal filters for constructing robust features in speech recognition. 808-832
- Ji Ming:
Noise compensation for speech recognition with arbitrary additive noise. 833-844
- Florian Hilger, Hermann Ney:
Quantile based histogram equalization for noise robust large vocabulary speech recognition. 845-854
- Shinji Watanabe, Atsushi Sako, Atsushi Nakamura:
Automatic determination of acoustic model topology using variational Bayesian estimation and clustering for large vocabulary continuous speech recognition. 855-872
- Hong-Kwang Jeff Kuo, Yuqing Gao:
Maximum entropy direct models for speech recognition. 873-881
- Khe Chai Sim, Mark J. F. Gales:
Minimum phone error training of precision matrix models. 882-889
- Jorge F. Silva, Shrikanth S. Narayanan:
Average divergence distance as a statistical discrimination measure for hidden Markov models. 890-906
- Rongqing Huang, John H. L. Hansen:
Advances in unsupervised audio classification and segmentation for the broadcast news and NGSW corpora. 907-919
- Nima Mesgarani, Malcolm Slaney, Shihab A. Shamma:
Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations. 920-930
- R. Sant'Ana, Rosangela Coelho, Abraham Alcaim:
Text-independent speaker recognition based on the Hurst parameter and the multidimensional fractional Brownian motion model. 931-940
- Enrique Vidal, Francisco Casacuberta, Luis Rodríguez, Jorge Civera, Carlos D. Martínez-Hinarejos:
Computer-assisted translation using speech recognition. 941-951
- Athanasios Mouchtaris, Jan Van der Spiegel, Paul Mueller:
Nonparallel training for voice conversion based on a parameter adaptation approach. 952-963
- Jack Mullen, David M. Howard, Damian T. Murphy:
Waveguide physical modeling of vocal tract acoustics: flexible formant bandwidth control from increased model dimensionality. 964-971
- K. Sreenivasa Rao, B. Yegnanarayana:
Prosody modification using instants of significant excitation. 972-980
- Ki-Seung Lee:
MLP-based phone boundary refining for a TTS database. 981-989
- Jerome R. Bellegarda:
A global, boundary-centric framework for unit selection text-to-speech synthesis. 990-997
- Cheng-Han Yang, Hsueh-Ming Hang:
Cascaded trellis-based rate-distortion control algorithm for MPEG-4 advanced audio coding. 998-1007
- Ben Supper, Tim Brookes, Francis Rumsey:
An auditory onset detection algorithm for improved automatic source localization. 1008-1017
- Woon-Seng Gan, Jun Yang, Khim Sia Tan, Meng Hwa Er:
A digital beamsteerer for difference frequency in a parametric array. 1018-1025
- Rui Cai, Lie Lu, Alan Hanjalic, HongJiang Zhang, Lian-Hong Cai:
A flexible framework for key audio effects detection and auditory context inference. 1026-1039
- Dimitrios K. Fragoulis, Constantin Papaodysseus, Mihalis Exarhos, George Roussopoulos, Thanasis Panagopoulos, Dimitrios Kamarotos:
Automated classification of piano-guitar notes. 1040-1050
- Harald Viste, Gianpaolo Evangelista:
A method for separation of overlapping partials based on similarity of temporal envelopes in multichannel mixtures. 1051-1061
- Serkan Kiranyaz, Ahmad Farooq Qureshi, Moncef Gabbouj:
A generic audio classification and segmentation approach for multimedia indexing and retrieval. 1062-1081
- Timothy J. Hazen:
Visual model structures and synchrony constraints for audio-visual speech recognition. 1082-1089
Volume 14, Number 4, July 2006
- John F. Pitrelli, Raimo Bakis, Ellen Eide, Raul Fernandez, Wael Hamza, Michael A. Picheny:
The IBM expressive text-to-speech synthesis system for American English. 1099-1108
- Chung-Hsien Wu, Chi-Chun Hsia, Te-Hsien Liu, Jhing-Fa Wang:
Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis. 1109-1116
- Marc Schröder:
Expressing degree of activation in synthetic speech. 1128-1136
- Mariët Theune, K. Meijs, Dirk Heylen, Roeland Ordelman:
Generating expressive speech for storytelling applications. 1137-1144
- Jianhua Tao, Yongguo Kang, Aijun Li:
Prosody conversion from neutral speech to emotional speech. 1145-1154
- Wentao Gu, Keikichi Hirose, Hiroya Fujisaki:
Modeling the effects of emphasis and question on fundamental frequency contours of Cantonese utterances. 1155-1170
- N. Campbell:
Conversational speech synthesis and the need for some laughter. 1171-1178
- Taishih Chi, Shihab A. Shamma:
Spectrum restoration from multiscale auditory phase singularities by generalized projections. 1179-1192
- Akira Watanabe, Tadashi Sakata:
Reliable methods for estimating relative vocal tract lengths from formant trajectories of common words. 1193-1204
- W. C. Chu:
Embedded quantization of line spectral frequencies using a multistage tree-structured vector quantizer. 1205-1217
- Jingdong Chen, Jacob Benesty, Yiteng Arden Huang, Simon Doclo:
New insights into the noise reduction Wiener filter. 1218-1234
- Yunxin Zhao, Rong Hu, Xiaolong Li:
Speedup convergence and reduce noise for enhanced speech separation and recognition. 1235-1244
- Jen-Tzung Chien, Bo-Cheng Chen:
A new independent component analysis for speech recognition and separation. 1245-1254
- Satya Dharanipragada, Karthik Visweswariah:
Gaussian mixture models with covariances or precisions in shared multiple subspaces. 1255-1266
- Brian Kan-Wing Mak, Roger Wend-Huu Hsiao, Simon Ka-Lung Ho, James T. Kwok:
Embedded kernel eigenvoice speaker adaptation and its implication to reference speaker weighting. 1267-1280
- Diamantino Caseiro, Isabel Trancoso:
A specialized on-the-fly algorithm for lexicon and language model composition. 1281-1291
- Toshihiko Abe, Masaaki Honda:
Sinusoidal model based on instantaneous frequency attractors. 1292-1300
- Hui Ye, Steve J. Young:
Quality-enhanced voice morphing using maximum likelihood transformations. 1301-1312
- Ashish Aggarwal, Shankar L. Regunathan, Kenneth Rose:
Efficient bit-rate scalability for weighted squared error optimization in audio coding. 1313-1327
- Olivier Derrien, Pierre Duhamel, Maurice Charbit, Gaël Richard:
A new quantization optimization algorithm for the MPEG advanced audio coder using a statistical subband model of the quantization noise. 1328-1339