ICASSP 1995: Detroit, Michigan, USA
- 1995 International Conference on Acoustics, Speech, and Signal Processing, ICASSP '95, Detroit, Michigan, USA, May 08-12, 1995. IEEE Computer Society 1995, ISBN 0-7803-2431-5
Volume 1
CELP Coding
- Masahiro Serizawa, Kazunori Ozawa:
4 kbps improved pitch prediction CELP speech coding with 20 ms frame. 1-4 - Peter Kroon, Michael C. Recchione:
A low-complexity toll-quality variable bit rate coder for CDMA cellular systems. 5-8 - Juin-Hwey Chen:
Toll-quality 16 kb/s CELP speech coding with very low complexity. 9-12 - Andrei Popescu, Nicolas Moreau, Claude Lamblin:
CELP coding using trellis-coded vector quantization of the excitation. 13-16 - Per Hedelin, Thomas Eriksson:
Interpolating the history improved excitation coding for high quality CELP coding. 17-20 - Cheung-Fat Chan:
Fast stochastic codebook search through the use of odd-symmetric crosscorrelation basis vectors. 21-24 - Torbjörn Wigren, Anders Bergström, Susanne Harrysson, Fredrik Jansson, Hans Nilson:
Improvements of background sound coding in linear predictive speech coders. 25-28 - Akitoshi Kataoka, Sachiko Hosaka, Jotaro Ikedo, Takehiro Moriya, Shinji Hayashi:
Improved CS-CELP speech coding in a noisy environment using a trained sparse conjugate codebook. 29-32 - Kazuhito Koishida, Keiichi Tokuda, Takao Kobayashi, Satoshi Imai:
CELP coding based on mel-cepstral analysis. 33-36 - Shude Zhang, Gordon B. Lockhart:
An embedded scheme for regular pulse excited (RPE) linear predictive coding. 37-40
Recognition: Large Vocabulary
- Lalit R. Bahl, S. Balakrishnan-Aiyer, Jerome R. Bellegarda, Martin Franz, P. S. Gopalakrishnan, David Nahamoo, Miroslav Novak, Mukund Padmanabhan, Michael A. Picheny, Salim Roukos:
Performance of the IBM large vocabulary continuous speech recognition system on the ARPA Wall Street Journal task. 41-44 - Douglas B. Paul:
New developments in the Lincoln stack-decoder based large-vocabulary CSR system. 45-48 - Xavier L. Aubert, Hermann Ney:
Large vocabulary continuous speech recognition using word graphs. 49-52 - Philippe Jeanrenaud, Ellen Eide, Upendra V. Chaudhari, John W. McDonough, Kenney Ng, Man-Hung Siu, Herbert Gish:
Reducing word error rate on conversational speech from the Switchboard corpus. 53-56 - Ren-Yuan Lyu, Lee-Feng Chien, Shiao-Hong Hwang, Hung-Yun Hsieh, Rung-Chiuan Yang, Bo-Ren Bai, Jia-Chi Weng, Yen-Ju Yang, Shi-Wei Lin, Keh-Jiann Chen, Chiu-yu Tseng, Lin-Shan Lee:
Golden Mandarin (III)-a user-adaptive prosodic-segment-based Mandarin dictation machine for Chinese language with very large vocabulary. 57-60 - Hsin-Min Wang, Jia-Lin Shen, Yen-Ju Yang, Chiu-yu Tseng, Lin-Shan Lee:
Complete recognition of continuous Mandarin speech for Chinese language with very large vocabulary but limited training data. 61-64 - Jean-Luc Gauvain, Lori Lamel, Martine Adda-Decker:
Developments in continuous speech dictation using the ARPA WSJ task. 65-68 - Michael M. Hochberg, Steve Renals, Anthony J. Robinson, Gary D. Cook:
Recent improvements to the ABBOT large vocabulary CSR system. 69-72 - Philip C. Woodland, C. J. Leggetter, Julian Odell, Valtcho Valtchev, Steve J. Young:
The 1994 HTK large vocabulary speech recognition system. 73-76 - Yuqing Gao, Hsiao-Wuen Hon, Zhiwei Lin, Gareth Loudon, S. Yogananthan, Baosheng Yuan:
Tangerine: a large vocabulary Mandarin dictation system. 77-80
ASR System & Corpora
- Tony Robinson, Jeroen Fransen, David Pye, Jonathan Foote, Steve Renals:
WSJCAM0: a British English speech corpus for large vocabulary continuous speech recognition. 81-84 - Yeshwant K. Muthusamy, Edward Holliman, Barbara Wheatley, Joseph Picone, John J. Godfrey:
Voice across Hispanic America: a telephone speech corpus of American Spanish. 85-88 - Yeonja Lim, Youngjik Lee:
Implementation of the POW (phonetically optimized words) algorithm for speech database. 89-92 - Xuedong Huang, Alex Acero, Fil Alleva, Mei-Yuh Hwang, Li Jiang, Milind Mahajan:
Microsoft Windows highly intelligent speech recognizer: Whisper. 93-96 - Laura Mayfield, Marsal Gavaldà, Wayne H. Ward, Alex Waibel:
Concept-based speech translation. 97-100 - John F. Pitrelli, Cynthia Fong, Suk Wong, Judith Spitz, Hong C. Leung:
PhoneBook: a phonetically-rich isolated-word telephone-speech database. 101-104 - Kathy L. Brown, E. Bryan George:
CTIMIT: a speech corpus for the cellular environment with applications to automatic speech recognition. 105-108 - Paul Duchnowski, Martin Hunke, Dietrich Büsching, Uwe Meier, Alex Waibel:
Toward movement-invariant automatic lip-reading and speech recognition. 109-112 - Víctor M. Jiménez, Antonio Castellanos, Enrique Vidal:
Some results with a trainable speech translation and understanding system. 113-116 - Nam-Yong Han, Hoi-Rin Kim, Kyu-Woong Hwang, Young-Mok Ahn, Joon-Hyung Ryoo:
A continuous speech recognition system using finite state network and Viterbi beam search for the automatic interpretation. 117-120
Robust Speech Recognition
- Ananth Sankar, Chin-Hui Lee:
Robust speech recognition based on stochastic matching. 121-124 - Olivier Siohan:
On the robustness of linear discriminant analysis as a preprocessing step for noisy speech recognition. 125-128 - Yasuhiro Minami, Sadaoki Furui:
A maximum likelihood procedure for a universal adaptation method based on HMM composition. 129-132 - Mark John Francis Gales, Steve J. Young:
A fast and flexible implementation of parallel model combination. 133-136 - Pedro J. Moreno, Bhiksha Raj, Evandro B. Gouvêa, Richard M. Stern:
Multivariate-Gaussian-based cepstral normalization for robust speech recognition. 137-140 - Leonardo Neumeyer, Mitchel Weintraub:
Robust speech recognition in noise using adaptation and mapping techniques. 141-144 - Seokyong Moon, Jenq-Neng Hwang:
Noisy speech recognition using robust inversion of hidden Markov models. 145-148 - Keizaburo Takagi, Hiroaki Hattori, Takao Watanabe:
Rapid environment adaptation for robust speech recognition. 149-152 - Hans-Günter Hirsch, Christoph Ehrlicher:
Noise estimation techniques for robust speech recognition. 153-156 - Devang Naik:
Pole-filtered cepstral mean subtraction. 157-160
Language Modeling
- P. Srinivasa Rao, Michael D. Monkowski, Salim Roukos:
Language model adaptation via minimum discrimination information. 161-164 - Masafumi Tamoto, Takeshi Kawabata:
Clustering word category based on binomial posteriori co-occurrence distribution. 165-168 - Sabine Deligne, Frédéric Bimbot:
Language modeling by variable length sequences: theoretical formulation and evaluation of multigrams. 169-172 - Harvey Lloyd-Thomas, Jerry H. Wright, Gareth J. F. Jones:
An integrated grammar/bigram language model using path scores. 173-176 - Sheryl R. Young:
Discourse structure for multi-speaker spontaneous spoken dialogs: incorporating heuristics into stochastic RTNs. 177-180 - Reinhard Kneser, Hermann Ney:
Improved backing-off for M-gram language modeling. 181-184 - Germán Bordel, M. Inés Torres, Enrique Vidal:
QWI: a method for improved smoothing in language modelling. 185-188 - Daniel Jurafsky, Chuck Wooters, Jonathan Segal, Andreas Stolcke, Eric Fosler, Gary N. Tajchman, Nelson Morgan:
Using a stochastic context-free grammar as a language model for speech recognition. 189-192 - Klaus Ries, Finn Dag Buø, Ye-Yi Wang:
Improved language modelling by unsupervised acquisition of structure. 193-196 - Claudia Pateras, Gregory Dudek, Renato De Mori:
Understanding referring expressions in a person-machine spoken dialogue. 197-200
Use of Knowledge in ASR
- Don X. Sun, Li Deng:
Analysis of acoustic-phonetic variations in fluent speech using TIMIT. 201-204 - Joerg P. Ueberla:
Analysing weaknesses of language models for speech recognition. 205-208 - Francis Jack Smith, Ji Ming, Peter O'Boyle, A. D. Irvine:
A hidden Markov model with optimized inter-frame dependence. 209-212 - Shigeki Sagayama, Satoshi Takahashi:
On the use of scalar quantization for fast HMM computation. 213-216 - Haakon Chevalier, Chuck Ingold, Carol Kunz, Chip Moore, Crispen Roven, Jon Yamron, Bradley Baker, Paul G. Bamberg, Sarah Bridle, Tracy Bruce, Amy Weader:
Large-vocabulary speech recognition in specialized domains. 217-220 - Ellen Eide, Herbert Gish, Philippe Jeanrenaud, Angela Mielke:
Understanding and improving speech recognition performance through the use of diagnostic tools. 221-224 - Egidio P. Giachin:
Phrase bigrams for continuous speech recognition. 225-228 - Carl D. Mitchell, Mary P. Harper, Leah H. Jamieson:
Using explicit segmentation to improve HMM phone recognition. 229-232 - Marco Saerens:
Viterbi algorithm for acoustic vectors generated by a linear stochastic differential equation on each state. 233-236 - Giuseppe Riccardi, Enrico Bocchieri, Roberto Pieraccini:
Non-deterministic stochastic language models for speech recognition. 237-240
Topics in Speech Coding
- Craig R. Watkins, Juin-Hwey Chen:
Improving 16 kb/s G.728 LD-CELP speech coder for frame erasure channels. 241-244 - Aamir Husain, Vladimir Cuperman:
Reconstruction of missing packets for CELP-based speech coders. 245-248 - Albert Shen, Benjamim Tang, Abeer Alwan, Greg Pottie:
A robust variable-rate speech coder. 249-252 - Ciará McElroy, Brian P. Murray, Anthony D. Fagan:
Wideband speech coding using multiple codebooks and glottal pulses. 253-256 - Nam Phamdo, Cheng-Chieh Lee, Rajiv Laroia:
Speech coding using ISI coded quantization. 257-260 - Ian S. Burnett, G. J. Bradley:
New techniques for multi-prototype waveform coding at 2.84 kb/s. 261-264 - Jes Thyssen, Henrik Nielsen, Steffen Duus Hansen:
Quantization of non-linear predictors in speech coding. 265-268 - Balázs Kövesi, Samir Saoudi, Jean-Marc Boucher, Z. Reguly:
A fast robust stochastic algorithm for vector quantizer design for nonstationary channels. 269-272 - Spiros Dimolitsas, Franklin L. Corcoran, Channasandra Ravishankar, Marion Baraniecki:
Voice quality of interconnected PCS, Japanese cellular, and public switched telephone networks. 273-276 - Kai Hung Lam, Oscar C. Au, C. C. Chan, K. F. Hui, S. F. Lau:
Objective speech measure for Chinese in wireless environment. 277-280
Wordspotting, Rejection, and Topic Identification
- Richard C. Rose, Biing-Hwang Juang, Chin-Hui Lee:
A training procedure for verifying string hypotheses in continuous speech recognition. 281-284 - Mazin G. Rahim, Chin-Hui Lee, Biing-Hwang Juang:
Robust utterance verification for connected digits recognition. 285-288 - Hiroshi Kanazawa, Mitsuyoshi Tachimori, Yoichi Takebayashi:
A hybrid wordspotting method for spontaneous speech understanding using word-based pattern matching and phoneme-based HMM. 289-292 - Tanja Schultz, Ivica Rogina:
Acoustic and language modeling of human and nonhuman noises for human-to-human spontaneous speech recognition. 293-296 - Mitchel Weintraub:
LVCSR log-likelihood ratio scoring for keyword spotting. 297-300 - Chakib Tadj, Franck Poirier:
Keyword spotting using supervised/unsupervised competitive learning. 301-304 - Stephen V. Kosonocky, Richard J. Mammone:
A continuous density neural tree network word spotting system. 305-308 - Gareth J. F. Jones, J. T. Foote, Karen Spärck Jones, Steve J. Young:
Video mail retrieval: the effect of word spotting accuracy on precision. 309-312 - Jerry H. Wright, Michael J. Carey, Eluned S. Parris:
Improved topic spotting through statistical modelling of keyword dependencies. 313-316 - Takeshi Kawabata:
Topic focusing mechanism for speech recognition based on probabilistic grammar and topic Markov model. 317-320
Speaker Recognition
- Shoji Hayakawa, Fumitada Itakura:
The influence of noise on the speaker recognition performance using the higher frequency band. 321-324 - Charles R. Jankowski Jr., Thomas F. Quatieri, Douglas A. Reynolds:
Measuring fine structure in speech: application to speaker identification. 325-328 - Douglas A. Reynolds, Marc A. Zissman, Thomas F. Quatieri, Gerald C. O'Leary, Beth A. Carlson:
The effects of telephone transmission degradations on speaker recognition performance. 329-332 - Michael Schmidt, Herbert Gish, Angela Mielke:
Covariance estimation methods for channel robust text-independent speaker identification. 333-336 - William Y. Hueng, Bhaskar D. Rao:
Channel and noise compensation for text dependent speaker verification over telephone. 337-340 - Joseph P. Campbell:
Testing with the YOHO CD-ROM voice verification corpus. 341-344 - Chi-Shi Liu, Hsiao-Chuan Wang, Frank K. Soong, Chao-Shih Huang:
An orthogonal polynomial representation of speech signals and its probabilistic model for text independent speaker verification. 345-348 - Kevin R. Farrell:
Text-dependent speaker verification using data fusion. 349-352 - Mohammad Mehdi Homayounpour, Gérard Chollet:
Neural net approaches to speaker verification: comparison with second order statistic measures. 353-356 - Han-Sheng Liou, Richard J. Mammone:
A subword neural tree network approach to text-dependent speaker verification. 357-360
Recognition: Feature Analysis
- Mark M. Thomson:
Statistical modeling of speech feature vector trajectories based on a piecewise continuous mean path. 361-364 - Euvaldo F. Cabral Jr., Graham Tattersall:
Trace-segmentation of isolated utterances for speech recognition. 365-368 - Ernst Günter Schukat-Talamazzini, Joachim Hornegger, Heinrich Niemann:
Optimal linear feature transformations for semi-continuous hidden Markov models. 369-372 - Rathinavelu Chengalvarayan, Li Deng:
Use of generalized dynamic feature parameters for speech recognition: maximum likelihood and minimum classification error approaches. 373-376 - Milan Z. Markovic, Branko D. Kovacevic, Milan M. Milosavljevic:
A statistical pattern recognition approach to robust recursive identification of non-stationary AR model of speech production system. 377-380 - Joseph Pencak, Douglas J. Nelson:
The NP speech activity detection algorithm. 381-384 - Li Deng, J. Nu, Hossein Sameti:
Improved speech modeling and recognition using multi-dimensional articulatory states as primitive speech units. 385-388 - Christophe Ris, Vincent Fontaine, Henri Leich:
Speech analysis based on Malvar wavelet transform. 389-392 - Samel Çelebi, José C. Príncipe:
Magnitude spectral estimation via Poisson moments with application to speech recognition. 393-396 - Nelson Morgan, Hervé Bourlard, Steven Greenberg, Hynek Hermansky, Su-Lin Wu:
Stochastic perceptual models of speech. 397-400
Topics in Noise and Recognition
- Phil D. Green, Martin P. Cooke, M. D. Crawford:
Auditory scene analysis and hidden Markov model recognition of speech in noise. 401-404 - Hynek Hermansky, Eric A. Wan, Carlos Avendaño:
Speech enhancement based on temporal processing. 405-408 - Sumeet Sandhu, Oded Ghitza:
A comparative study of mel cepstra and EIH for phone classification under adverse conditions. 409-412 - Khaled T. Assaleh:
Supplementary orthogonal cepstral features. 413-416 - Engin Erzin, A. Enis Çetin, Yasemin Yardimci:
Subband analysis for robust speech recognition in the presence of car noise. 417-420 - Shoji Kajita, Fumitada Itakura:
Robust speech feature extraction using SBCOR analysis. 421-424 - Asunción Moreno, Sergio Tortola, Josep Vidal, José A. R. Fonollosa:
New HOS-based parameter estimation methods for speech recognition in noisy environments. 429-432 - Ruikang Yang, Petri Haavisto:
Noise compensation for speech recognition in car noise environments. 433-436 - Saeed Vaseghi, Ben P. Milner:
Speech recognition in impulsive noise. 437-440
Recognition: Training Techniques
- Tetsuo Kosaka, Shoichi Matsunaga, Mikio Kuraoka:
Speaker-independent phone modeling based on speaker-dependent HMMs' composition and clustering. 441-444 - Petra Geutner:
Using morphology towards better large-vocabulary speech recognition systems. 445-448 - Yves Normandin:
Optimal splitting of HMM Gaussian mixture components with MMIE training. 449-452 - Tilo Sloboda:
Dictionary learning: performance through consistency. 453-456 - Yoshihoko Goto, Michael M. Hochberg, Daniel J. Mashao, Harvey F. Silverman:
Incremental MAP estimation of HMMs for efficient training and improved performance. 457-460 - J. T. Foote:
Discrete MMI probability models for HMM speech recognition. 461-464 - Abdelhamid Mellouk, Patrick Gallinari:
Global discrimination for neural predictive systems based on N-best algorithm. 465-468 - Jianming Song:
Enhancement of discriminative capabilities of HMM based recognizer through modification of Viterbi algorithm. 469-472 - Dimitri Kanevsky:
A generalization of the Baum algorithm to functions on non-linear manifolds. 473-476 - Thomas Kemp:
Data-driven codebook adaptation in phonetically tied SCHMMs. 477-479
Speech Coding Below 4 kb/s
- Benoit M. Mouy, Pierre E. de la Noue, G. Goudezeune:
NATO STANAG 4479: a standard for an 800 bps vocoder and channel coding in HF-ECCM system. 480-483