


INTERSPEECH 2007: Antwerp, Belgium
- 8th Annual Conference of the International Speech Communication Association, INTERSPEECH 2007, Antwerp, Belgium, August 27-31, 2007. ISCA 2007

Keynotes 1-4
- Victor Zue:

On organic interfaces. 1-8 - Sophie K. Scott:

The neural basis of speech perception - a view from functional imaging. 9-13 - Alex Waibel, Keni Bernardin, Matthias Wölfel:

Computer-supported human-human multilingual communication. 14-21 - Pierre-Yves Oudeyer:

Self-organization in the evolution of shared systems of speech sounds: a computational study. 22-29
Discriminative and Large Margin Techniques in Acoustic Modeling
- Jinyu Li, Chin-Hui Lee:

Soft margin feature extraction for automatic speech recognition. 30-33 - Yan Yin, Hui Jiang:

A fast optimization method for large margin estimation of HMMs based on second order cone programming. 34-37 - Hao-Zheng Li, Douglas D. O'Shaughnessy:

Frame margin probability discriminative training algorithm for noisy speech recognition. 38-41 - Fabio Valente, Jithendra Vepa, Christian Plahl, Christian Gollan, Hynek Hermansky, Ralf Schlüter:

Hierarchical neural networks feature extraction for LVCSR system. 42-45 - Peder A. Olsen, John R. Hershey:

Bhattacharyya error and divergence using variational importance sampling. 46-49 - Tingyao Wu, Jacques Duchateau, Dirk Van Compernolle:

Phoneme dependent frame selection preference. 50-53
Speech Production I, II
- Xinhui Zhou, Carol Y. Espy-Wilson, Mark Tiede, Suzanne Boyce:

An articulatory and acoustic study of "retroflex" and "bunched" American English rhotic sound based on MRI. 54-57 - Paula Martins, Inês Carbone, Augusto Silva, António J. S. Teixeira:

An MRI study of European Portuguese nasals. 58-61 - Sayoko Takano, Hiroki Matsuzaki, Kunitoshi Motoki:

A four-cube FEM model of the extrinsic and intrinsic tongue muscles to simulate the production of vowel /i/. 62-65 - Juan F. Torres, Elliot Moore:

Performance evaluation of glottal quality measures from the perspective of vocal tract filter consistency. 66-69 - Veena D. Singampalli, Philip J. B. Jackson:

Statistical identification of critical, dependent and redundant articulators. 70-73 - Chao Qin, Miguel Á. Carreira-Perpiñán:

An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping. 74-77
Phonetic Segmentation and Classification I, II
- Peter Karsmakers, Kristiaan Pelckmans, Johan A. K. Suykens, Hugo Van hamme:

Fixed-size kernel logistic regression for phoneme classification. 78-81 - Seung Seop Park, Jong Won Shin, Jong Kyu Kim, Nam Soo Kim:

A multiple-model based framework for automatic speech segmentation. 82-85 - Aren Jansen, Partha Niyogi:

Semi-supervised learning of speech sounds. 86-89 - Abhinav Parate, Ashish Verma, Jayanta Basak:

Evaluation of syllable stress using single class classifier. 90-93 - Mohammad Nurul Huda, Muhammad Ghulam, Junsei Horikawa, Tsuneo Nitta:

Distinctive phonetic feature (DPF) based phone segmentation using hybrid neural networks. 94-97 - Jean-Philippe Goldman, Mathieu Avanzi, Anne-Catherine Simon, Anne Lacheret, Antoine Auchlin:

A methodology for the automatic detection of perceived prominent syllables in spoken French. 98-101
Discourse, Dialog and Conversation
- Hiroki Mori, Hideki Kasuya:

Voice source and vocal tract variations as cues to emotional states perceived from expressive conversational speech. 102-105 - Fan Yang, Peter A. Heeman:

Exploring initiative strategies using computer simulation. 106-109 - Chiu-yu Tseng, Zhao-yu Su:

From one base form to multiple output styles - predicting stylistic dynamics of discourse prosody. 110-113 - Claudia Crocco, Renata Savy:

Topic in dialogue: prosodic and syntactic features. 114-117 - Michiko Watanabe, Yasuharu Den, Keikichi Hirose, Shusaku Miwa, Nobuaki Minematsu:

Features of pauses and conjunctions at syntactic and discourse boundaries in Japanese monologues. 118-121
Spoken Dialog Systems I, II
- Craig Wootton, Michael F. McTear, Terry Anderson:

Utilizing online content as domain knowledge in a multi-domain dynamic dialogue system. 122-125 - Boris W. van Schooten, Sophie Rosset, Olivier Galibert, Aurélien Max, Rieks op den Akker, Gabriel Illouz:

Handling speech input in the Ritel QA dialogue system. 126-129 - Woosung Kim:

Online call quality monitoring for automating agent-based call centers. 130-133 - Sebastian Möller, Klaus-Peter Engelbrecht, Antti Oulasvirta:

Analysis of communication failures for spoken dialogue systems. 134-137 - Sandra Mann, André Berton, Ute Ehrlich:

How to access audio files of large data bases using in-car speech dialogue systems. 138-141 - Kazunori Komatani, Tatsuya Kawahara, Hiroshi G. Okuno:

Analyzing temporal transition of real user's behaviors in a spoken dialogue system. 142-145 - J. Sherwani, Dong Yu, Tim Paek, Mary Czerwinski, Yun-Cheng Ju, Alex Acero:

Voicepedia: towards speech-based access to unstructured information. 146-149 - Vivek Kumar Rangarajan Sridhar, Srinivas Bangalore, Shrikanth S. Narayanan:

Exploiting prosodic features for dialog act tagging in a discriminative modeling framework. 150-153 - Hua Ai, Antonio Roque, Anton Leuski, David R. Traum:

Using information state to improve dialogue move identification in a spoken dialogue system. 154-157 - Shiu-Wah Chu, Ian M. O'Neill, Philip Hanna:

Using multiple strategies to manage spoken dialogue. 158-161 - Marcelo Quinderé, Luís Seabra Lopes, António J. S. Teixeira:

An information state based dialogue manager for a mobile robot. 162-165
Accent and Language Identification I, II
- Josef G. Bauer, Bernt Andrassy, Ekaterina Timoshenko:

Discriminative optimization of language adapted HMMs for a language identification system based on parallel phoneme recognizers. 166-169 - Khe Chai Sim, Haizhou Li:

Fusion of contrastive acoustic models for parallel phonotactic spoken language identification. 170-173 - Liang Wang, Eliathamby Ambikairajah, Eric H. C. Choi:

Multi-layer Kohonen self-organizing feature map for language identification. 174-177 - Bo Yin, Eliathamby Ambikairajah, Fang Chen:

Hierarchical language identification based on automatic language clustering. 178-181 - Ekaterina Timoshenko, Harald Höge:

Using speech rhythm for acoustic language identification. 182-185 - Kakeung Wong, Man-Hung Siu, Brian Mak:

A model-based estimation of phonotactic language verification performance. 186-189 - Mike Rosner, Paulseph-John Farrugia:

A tagging algorithm for mixed language identification in a noisy domain. 190-193 - Doroteo T. Toledano, Javier Gonzalez-Dominguez, Alejandro Abejón-Gonzalez, Danilo Spada, Ismael Mateos-Garcia, Joaquin Gonzalez-Rodriguez:

Improved language recognition using better phonetic decoders and fusion with MFCC and SDC features. 194-197
Education and Training
- Daniel Bolaños, Wayne H. Ward, Sarel van Vuuren, Javier Garrido Salas:

Syllable lattices as a basis for a children's speech reading tracker. 198-201 - Fuping Pan, Qingwei Zhao, Yonghong Yan:

Mandarin vowel pronunciation quality evaluation by using formant pattern recognition. 202-205 - Matthew Black, Joseph Tepperman, Sungbok Lee, Patti Price, Shrikanth S. Narayanan:

Automatic detection and classification of disfluent reading miscues in young children's speech for the purpose of assessment. 206-209 - Nobuaki Minematsu, K. Kamata, Satoshi Asakawa, Takehiko Makino, Tazuko Nishimura, Keikichi Hirose:

Structural assessment of language learners' pronunciation. 210-213 - Abdurrahman Samir, Sherif Mahdy Abdou, Ahmed Husien Khalil, Mohsen A. Rashwan:

Enhancing usability of CAPL system for Qur'an recitation learning. 214-217 - Febe de Wet, Christa van der Walt, Thomas Niesler:

Automatic large-scale oral language proficiency assessment. 218-221
Robust ASR I, II
- Yuki Denda, Takamasa Tanaka, Masato Nakayama, Takanobu Nishiura, Yoichi Yamashita:

Noise-robust hands-free voice activity detection with adaptive zero crossing detection using talker direction estimation. 222-225 - Agustín Álvarez-Marquina, Rafael Martínez, Pedro Gómez, Victor Nieto Lluis, V. Rodellar:

A robust mel-scale subband voice activity detector for a car platform. 226-229 - Kentaro Ishizuka, Tomohiro Nakatani, Masakiyo Fujimoto, Noboru Miyazaki:

Noise robust front-end processing with voice activity detection based on periodic to aperiodic component ratio. 230-233 - A. M. Toh, Roberto Togneri, Sven Nordholm:

Feature and distribution normalization schemes for statistical mismatch reduction in reverberant speech recognition. 234-237 - Matthew Gibson, Thomas Hain:

Temporal masking for unsupervised minimum Bayes risk speaker adaptation. 238-241 - Tsung-hsueh Hsieh, Jeih-Weih Hung:

Speech feature compensation based on pseudo stereo codebooks for robust speech recognition in additive noise environments. 242-245 - Dimitrios Dimitriadis, Petros Maragos, Stamatios Lefkimmiatis:

Multiband, multisensor robust features for noisy speech recognition. 246-249 - Akira Sasou, Hiroaki Kojima:

Noise robust speech recognition for voice driven wheelchair. 250-253
Adaptation in ASR I, II
- Yun Tang, Richard C. Rose:

Clustered maximum likelihood linear basis for rapid speaker adaptation. 254-257 - Wen Xuan Teng, Guillaume Gravier, Frédéric Bimbot, Frédéric Soufflet:

Rapid speaker adaptation by reference model interpolation. 258-261 - Randy Gomez, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano:

Rapid unsupervised speaker adaptation using single utterance based on MLLR and speaker selection. 262-265 - Brian Kan-Wing Mak, Roger Wend-Huu Hsiao:

Robustness of several kernel-based fast adaptation methods on noisy LVCSR. 266-269 - Janne Pylkkönen:

Estimating VTLN warping factors by distribution matching. 270-273 - Ming Liu, Xi Zhou, Mark Hasegawa-Johnson, Thomas S. Huang, Zhengyou Zhang:

Frequency domain correspondence for speaker normalization. 274-277 - Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa:

Unsupervised training of adaptation rate using q-learning in large vocabulary continuous speech recognition. 278-281 - Martin Karafiát, Lukás Burget, Jan Cernocký, Thomas Hain:

Application of CMLLR in narrow band wide band adapted systems. 282-285 - Christophe Lévy, Georges Linarès, Jean-François Bonastre:

Fast adaptation of GMM-based compact models. 286-289
Speaker Verification & Identification I-IV
- Zahi N. Karam, William M. Campbell:

A new kernel for SVM MLLR based speaker recognition. 290-293 - Kong-Aik Lee, Changhuai You, Haizhou Li, Tomi Kinnunen:

A GMM-based probabilistic sequence kernel for speaker verification. 294-297 - Hagai Aronowitz:

Speaker recognition using kernel-PCA and intersession variability modeling. 298-301 - Réda Dehak, Najim Dehak, Patrick Kenny, Pierre Dumouchel:

Linear and non linear kernel GMM supervector machines for speaker verification. 302-305 - Ignacio López-Moreno, Ismael Mateos-Garcia, Daniel Ramos, Joaquin Gonzalez-Rodriguez:

Support vector regression for speaker verification. 306-309 - Chris Longworth, Mark J. F. Gales:

Derivative and parametric kernels for speaker verification. 310-313
Spoken Data Retrieval I, II
- David R. H. Miller, Michael Kleber, Chia-Lin Kao, Owen Kimball, Thomas Colthurst, Stephen A. Lowe, Richard M. Schwartz, Herbert Gish:

Rapid and accurate spoken term detection. 314-317 - Yi-Cheng Pan, Hung-lin Chang, Berlin Chen, Lin-Shan Lee:

Subword-based position specific posterior lattices (s-PSPL) for indexing speech information. 318-321 - Andreas Merkel, Dietrich Klakow:

Improved methods for language model based question classification. 322-325 - Tomoyosi Akiba, Hirofumi Tsujimura:

Error-tolerant question answering for spoken documents. 326-329 - Dilek Hakkani-Tür, Gökhan Tür, Michael Levit:

Exploiting information extraction annotations for document retrieval in distillation tasks. 330-333 - Kishan Thambiratnam, Frank Seide:

Learning spoken document similarity and recommendation using supervised probabilistic latent semantic analysis. 334-337
Accent and Language Identification I, II
- David A. van Leeuwen, Khiet P. Truong:

An open-set detection evaluation methodology applied to language and emotion recognition. 338-341 - Xi Yang, Man-Hung Siu, Herbert Gish, Brian Mak:

Boosting with anti-models for automatic language identification. 342-345 - Fabio Castaldo, Daniele Colibro, Emanuele Dalmasso, Pietro Laface, Claudio Vair:

Acoustic language identification using fast discriminative training. 346-349 - Ming Li, Hongbin Suo, Xiao Wu, Ping Lu, Yonghong Yan:

Spoken language identification using score vector modeling and support vector machine. 350-353 - Ricardo de Córdoba, Luis Fernando D'Haro, Fernando Fernández Martínez, Javier Macías Guarasa, Javier Ferreiros:

Language identification based on n-gram frequency ranking. 354-357 - Wade Shen, Douglas A. Reynolds:

Improving phonotactic language recognition with acoustic adaptation. 358-361
Speech Perception I, II
- Michael C. W. Yip:

Spoken word recognition of Chinese homophones: a further investigation. 362-365 - Maria K. Wolters, Pauline Campbell, Christine DePlacido, Amy Liddell, David Owens:

The role of outer hair cell function in the perception of synthetic versus natural speech. 366-369 - Akiko Kusumoto, Alexander Kain, John-Paul Hosom, Jan P. H. van Santen:

Hybridizing conversational and clear speech. 370-373 - Sophie Dufour, Ulrich H. Frauenfelder:

Neighborhood density and neighborhood frequency effects in French spoken word recognition. 374-377 - Toshio Irino, Yoshie Aoki, Yoshie Hayashi, Hideki Kawahara, Roy D. Patterson:

Discrimination and recognition of scaled word sounds. 378-381 - László Tóth:

Benchmarking human performance on the acoustic and linguistic subtasks of ASR systems. 382-385 - Lin Yang, Jianping Zhang, Yonghong Yan:

Contributions of temporal fine structure cues to Chinese speech recognition in cochlear implant simulation. 386-389 - Xihong Wu, Jing Chen, Zhigang Yang, Qiang Huang, Mengyuan Wang, Liang Li:

Effect of number of masking talkers on speech-on-speech masking in Chinese. 390-393 - Odile Bagou, Sophie Dufour, Cécile Fougeron, Alain Content, Ulrich H. Frauenfelder:

Do different boundary types induce subtle acoustic cues to which French listeners are sensitive? 394-397 - Svante Stadler, Arne Leijon, Björn Hagerman:

An information theoretic approach to predict speech intelligibility for listeners with normal and impaired hearing. 398-401 - Travis Wade, Bernd Möbius:

Speaking rate effects in a landmark-based phonetic exemplar model. 402-405 - Kazumi Maniwa, Allard Jongman, Travis Wade:

Acoustic correlates of intelligibility enhancements in clearly produced fricatives. 406-409 - Tim Jürgens, Thomas Brand, Birger Kollmeier:

Modelling the human-machine gap in speech reception: microscopic speech intelligibility prediction for normal-hearing subjects with an auditory model. 410-413 - Ayako Ikeno, John H. L. Hansen:

Lombard speech impact on perceptual speaker recognition. 414-417 - Huiwen Goy, Kathleen Pichora-Fuller, Pascal van Lieshout, Gurjit Singh, Bruce Schneider:

Effect of within- and between-talker variability on word identification in noise by younger and older adults. 418-421 - H. Timothy Bunnell, N. Carolyn Schanen, Linda D. Vallino, Thierry G. Morlet, James B. Polikoff, Jennette D. Driscoll, James T. Mantell:

Speech perception in children with speech sound disorder. 422-425 - Huan Wang, Werner Hemmert:

Speech coding and information processing by auditory neurons. 426-429 - Annie C. Gilbert, Victor J. Boucher:

What do listeners attend to in hearing prosodic structures? investigating the human speech-parser using short-term recall. 430-433
Prosody: Prosodic Structure
- Yosuke Igarashi:

Pitch pattern alternation in Goshogawara Japanese: evidence for a prosodic phrase above the domain for downstep. 434-437 - Irina Nesterenko, Pavel A. Skrelin:

Some evidence on the phonetics and phonology of prosodic phrasing in Russian. 438-441 - Jan Volín, Radek Skarnitzl:

Temporal downtrends in Czech read speech. 442-445 - Hyongsil Cho, Daniel Hirst:

Empirical evidence for prosodic phrasing: pauses as linguistic annotation in Korean read speech. 446-449 - Markus Dreyer, Izhak Shafran:

Exploiting prosody for PCFGs with latent annotations. 450-453 - Qin Shi, Danning Jiang, Fanping Meng, Yong Qin:

Combining length distribution model with decision tree in prosodic phrase prediction. 454-457 - Li-chiung Yang:

Duration and pauses as boundary-markers in speech: a cross-linguistic study. 458-461
Prosodic Modeling I, II
- Jian Yu, Lixing Huang, Jianhua Tao, Xia Wang:

Modeling incompletion phenomenon in Mandarin dialog prosody. 462-465 - Anne Tamm, Kálmán Abari, Gábor Olaszy:

Accent assignment algorithm in Hungarian, based on syntactic analysis. 466-469 - Cheng-Yuan Lin, Pei-Chi Jao, Jyh-Shing Roger Jang:

An effective initial/final duration prediction method for corpus-based singing voice synthesis of Mandarin Chinese. 470-473 - Géza Németh, Márk Fék, Tamás Gábor Csapó:

Increasing prosodic variability of text-to-speech synthesizers. 474-477 - Damien Lolive, Nelly Barbot, Olivier Boëffard:

Unsupervised HMM classification of F0 curves. 478-481 - Ian Read, Stephen Cox:

Automatic pitch accent prediction for text-to-speech synthesis. 482-485 - Xinqiang Ni, Yining Chen, Frank K. Soong, Min Chu, Ping Zhang:

An unsupervised approach to automatic prosodic annotation. 486-489 - Zeynep Inanoglu, Steve J. Young:

A system for transforming the emotion in speech: combining data-driven conversion techniques for prosody and voice quality. 490-493 - Chen-Yu Chiang, Hsiu-Min Yu, Yih-Ru Wang, Sin-Horng Chen:

An automatic prosody labeling method for Mandarin speech. 494-497
Speech Analysis
- Koby Crammer:

A conservative aggressive subspace tracker. 498-501 - Mattias Nilsson, W. Bastiaan Kleijn:

Mutual information and the speech signal. 502-505 - Tony Ezzat, Jake V. Bouvrie, Tomaso A. Poggio:

Spectro-temporal analysis of speech using 2-d Gabor filters. 506-509 - Tomas Dekens, Mike Demol, Werner Verhelst, Piet Verhoeve:

A comparative study of speech rate estimation techniques. 510-513 - Tiago H. Falk, Hua Yuan, Wai-Yip Chan:

Spectro-temporal processing for blind estimation of reverberation time and single-ended quality measurement of reverberant speech. 514-517
Spectral Analysis, Formants and Vocal Tract Models
- Toon van Waterschoot, Marc Moonen:

Linear prediction of audio signals. 518-521 - Carlo Magi, Tom Bäckström, Paavo Alku:

Stabilised weighted linear prediction - a robust all-pole method for speech processing. 522-525 - Daniel Rudoy, Daniel N. Spendley, Patrick J. Wolfe:

Conditionally linear Gaussian models for estimating vocal tract resonances. 526-529 - Karl Schnell, Arild Lacroix:

Time-varying pre-emphasis and inverse filtering of speech. 530-533 - Joachim Thiemann, Peter Kabal:

Reconstructing audio signals from modified non-coherent Hilbert envelopes. 534-537 - Binh Phu Nguyen, Masato Akagi:

A flexible spectral modification method based on temporal decomposition and Gaussian mixture model. 538-541 - Jonathan Darch, Ben Milner:

A comparison of estimated and MAP-predicted formants and fundamental frequencies with a speech reconstruction application. 542-545 - Huiqun Deng, Douglas D. O'Shaughnessy:

Effect of incomplete glottal closures on estimates of glottal waves via inverse filtering of vowel sounds. 546-549 - Kaustubh Kalgaonkar, Mark A. Clements:

Vocal tract and area function estimation with both lip and glottal losses. 550-553 - Sunitha Guruprasad, B. Yegnanarayana, K. Sri Rama Murty:

Detection of instants of glottal closure using characteristics of excitation source. 554-557 - Nicolas Sturmel, Christophe d'Alessandro, Boris Doval:

A comparative evaluation of the zeros of z transform representation for voice source estimation. 558-561
Speech and Audio Processing for Intelligent Environments
- Aki Härmä:

Ambient telephony: scenarios and research challenges. 562-565 - Yasunari Obuchi, Akio Amano:

Always listening to you: creating exhaustive audio database in home environments. 566-569 - Joerg Schmalenstroeer, Reinhold Haeb-Umbach:

Joint speaker segmentation, localization and identification for streaming audio. 570-573 - Yan-Chen Lu, Martin Cooke, Heidi Christensen:

Active binaural distance estimation for dynamic sources. 574-577 - Bengt J. Borgström, Abeer Alwan:

A packetization and variable bitrate interframe compression scheme for vector quantizer-based distributed speech recognition. 578-581 - Matthias Wölfel:

Channel selection by class separability measures for automatic transcriptions on distant microphones. 582-585 - Danny Wyatt, Tanzeem Choudhury, Jeff A. Bilmes:

Conversation detection and speaker segmentation in privacy-sensitive situated speech data. 586-589 - Alberto Abad, Carlos Segura, Climent Nadeu, Javier Hernando:

Audio-based approaches to head orientation estimation in a smart-room. 590-593 - Valentin Ion, Reinhold Haeb-Umbach:

Multi-resolution soft features for channel-robust distributed speech recognition. 594-597
Language Modeling I, II
- Yi Su, Frederick Jelinek, Sanjeev Khudanpur:

Large-scale random forest language models for speech recognition. 598-601 - Yuya Akita, Yusuke Nemoto, Tatsuya Kawahara:

PLSA-based topic detection in meetings for adaptation of lexicon and language model. 602-605 - Atsushi Sako, Tetsuya Takiguchi, Yasuo Ariki:

Language modeling using PLSA-based topic HMM. 606-609 - Yi-Cheng Pan, Lin-Shan Lee:

Lexicon adaptation with reduced character error (LARCE) - a new direction in Chinese language modeling. 610-613 - Meng-Sung Wu, Jen-Tzung Chien:

Minimum rank error training for language modeling. 614-617 - Wen Wang, Andreas Stolcke:

Integrating MAP, marginals, and unsupervised language model adaptation. 618-621
Prosody Production and Perception
- Sasha Calhoun:

Predicting focus through prominence structure. 622-625 - Murtaza Bulut, Sungbok Lee, Shrikanth S. Narayanan:

Analysis of emotional speech prosody in terms of part of speech tags. 626-629 - Fang Liu, Yi Xu:

The neutral tone in question intonation in Mandarin. 630-633 - Amélie Rochet-Capellan, Jean-Luc Schwartz, Rafael Laboissière, Arturo Galvàn:

Pointing to a target while naming it with /pata/ or /tapa/: the effect of consonants and stress position on jaw-finger coordination. 634-637 - Øydis Hide, Steven Gillis, Paul Govaerts:

Suprasegmental aspects of pre-lexical speech in cochlear implanted children. 638-641 - Oliver Niebuhr:

Categorical perception in intonation: a matter of signal dynamics? 642-645
Multimodal Speech Recognition
- Noureddine Aboutabit, Denis Beautemps, Jeanne Clarke, Laurent Besacier:

A HMM recognition of consonant-vowel syllables from lip contours: the cued speech case. 646-649 - Patrick Lucey, Gerasimos Potamianos, Sridha Sridharan:

A unified approach to multi-pose audio-visual ASR. 650-653 - Rowan Seymour, Darryl Stewart, Ji Ming:

Audio-visual integration for robust speech recognition using maximum weighted stream posteriors. 654-657 - Thomas Hueber, Gérard Chollet, Bruce Denby, Gérard Dreyfus, Maureen Stone:

Continuous-speech phone recognition from ultrasound and optical images of the tongue and lips. 658-661 - Bo Zhu, Timothy J. Hazen, James R. Glass:

Multimodal speech recognition with ultrasonic sensors. 662-665 - David Dean, Patrick Lucey, Sridha Sridharan, Tim Wark:

Fused HMM-adaptation of multi-stream HMMs for audio-visual speech recognition. 666-669
Speech and Other Modalities
- Carlos Toshinori Ishi, Hiroshi Ishiguro, Norihiro Hagita:

Analysis of head motions and speech in spoken dialogue. 670-673 - Lars Bo Larsen, Kasper Løvborg Jensen, Søren Larsen, Morten Højfeldt Rasmussen:

A paradigm for mobile speech-centric services. 674-677 - Pavel Campr, Marek Hrúz, Milos Zelezný:

Design and recording of Czech sign language corpus for automatic sign language recognition. 678-681 - Jens Edlund, Jonas Beskow:

Pushy versus meek - using avatars to influence turn-taking behaviour. 682-685 - Michael Wand, Szu-Chen Stan Jou, Tanja Schultz:

Wavelet-based front-end for electromyographic speech recognition. 686-689 - Gaëlle Ferré, Roxane Bertrand, Philippe Blache, Robert Espesser, Stéphane Rauzy:

Intensive gestures in French and their multimodal correlates. 690-693 - Slim Ouni, Kaïs Ouni:

Aspects of visual speech in Arabic. 694-697 - Denis Burnham, Jessica Reynolds, Guillaume Vignali, Sandra Bollwerk, Caroline Jones:

Rigid vs non-rigid face and head motion in phone and tone perception. 698-701
Multimodal/Multimedia Signal Processing
- Hedvig Kjellström, Olov Engwall, Sherif Mahdy Abdou, Olle Bälter:

Audio-visual phoneme classification for pronunciation training applications. 702-705 - Katja Grauwinkel, Britta Dewitt, Sascha Fagel:

Visual information and redundancy conveyed by internal articulator dynamics in synthetic audiovisual speech. 706-709 - Wei Zhou, Zengfu Wang:

A speech rate related lip movement model for speech animation. 710-713 - Guanyong Wu, Jie Zhu:

An extension 2DPCA based visual feature extraction method for audio-visual speech recognition. 714-717 - Soo-Jong Lee, Jun Park, Eung-Kyeu Kim:

Preventing an external acoustic noise from being misrecognized as a speech recognition object by confirming the lip movement image signal. 718-721 - Gregor Hofer, Hiroshi Shimodaira:

Automatic head motion prediction from speech data. 722-725 - Yuki Denda, Takanobu Nishiura, Yoichi Yamashita:

Omnidirectional audio-visual talker localizer with dynamic feature fusion based on validity and reliability criteria. 726-729 - Nick Campbell, Damien Douxchamps:

Processing image and audio information for recognising discourse participation status through features of face and voice. 730-733
Speaker Verification & Identification I-IV
- José R. Calvo, Rafael Fernández, Gabriel Hernández:

Application of shifted delta cepstral features in speaker verification. 734-737 - Luciana Ferrer, M. Kemal Sönmez, Elizabeth Shriberg:

A smoothing kernel for spatially related features and its application to speaker verification. 738-741 - Delphine Charlet, Mikaël Collet, Frédéric Bimbot:

VZ-norm: an extension of z-norm to the multivariate case for anchor model based speaker verification. 742-745 - Howard Lei, Nikki Mirghafori:

Word-conditioned HMM supervectors for speaker recognition. 746-749 - Wei-Ho Tsai:

Speaker clustering using direct maximization of a BIC-based score. 750-753 - Alexandre Preti, Jean-François Bonastre, Driss Matrouf, François Capman, Bertrand Ravera:

Confidence measure based unsupervised target model adaptation for speaker verification. 754-757 - Huanjun Bao, Ming-Xing Xu, Thomas Fang Zheng:

Emotion attribute projection for speaker recognition on emotional speech. 758-761 - Shi-Xiong Zhang, Man-Wai Mak, Helen M. Meng:

High-level feature-based speaker verification via articulatory phonetic-class pronunciation modeling. 762-765 - T. Yingthawornsuk, H. Kaymaz Keskinpala, D. Mitchell Wilkes, Richard G. Shiavi, Ronald M. Salomon:

Direct acoustic feature using iterative EM algorithm and spectral energy for classifying suicidal speech. 766-769 - Claudio Garretón, Néstor Becerra Yoma, Fernando Huenupán, Carlos Molina:

On comparing and combining intra-speaker variability compensation and unsupervised model adaptation in speaker verification. 770-773 - Xianyu Zhao, Yuan Dong, Hao Yang, Jian Zhao, Liang Lu, Haila Wang:

Comparison of two kinds of speaker location representation for SVM-based speaker verification. 774-777 - Mireia Farrús, Javier Hernando, Pascual Ejarque:

Jitter and shimmer measurements for speaker recognition. 778-781 - Zhenyu Shan, Yingchun Yang, Ruizhi Ye:

Natural-emotion GMM transformation algorithm for emotional speaker recognition. 782-785 - Ivy H. Tseng, Olivier Verscheure, Deepak S. Turaga, Upendra V. Chaudhari:

Optimized one-bit quantization for adapted GMM-based speaker verification. 786-789 - Mitchell McLaren, Robbie Vogt, Brendan Baker, Sridha Sridharan:

A comparison of session variability compensation techniques for SVM-based speaker recognition. 790-793 - Benoit G. B. Fauve, Nicholas W. D. Evans, Neil Pearson, Jean-François Bonastre, John S. D. Mason:

Influence of task duration in text-independent speaker verification. 794-797
Speech Enhancement
- Kamil K. Wójcicki, Stephen So, Kuldip K. Paliwal:

The effect of the additivity assumption on time and frequency domain Wiener filtering for speech enhancement. 798-801 - Junfeng Li, Shuichi Sakamoto, Satoshi Hongo, Masato Akagi, Yôiti Suzuki:

Noise reduction based on adaptive β-order generalized spectral subtraction for speech enhancement. 802-805 - Amit Das, John H. L. Hansen:

Class constrained ROVER based speech enhancement. 806-809 - Erhan Deger, Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu, Md. Kamrul Hasan:

EMD based soft-thresholding for speech enhancement. 810-813 - Adam Borowicz, Alexander A. Petrovsky:

An approximate solution for perceptually constrained signal subspace speech enhancement method. 814-817 - Tim Fingscheidt, Suhadi Suhadi:

Quality assessment of speech enhancement systems by separation of enhanced speech, noise, and echo. 818-821 - Anis Ben Aicha, Sofia Ben Jebara:

Perceptual musical noise reduction using critical bands tonality coefficients and masking thresholds. 822-825 - Dirk Mauler, Anil M. Nagathil, Rainer Martin:

On optimal estimation of compressed speech for hearing aids. 826-829 - Richard C. Hendriks, Jesper Jensen, Richard Heusdens:

DFT domain subspace based noise tracking for speech enhancement. 830-833 - Nitish Krishnamurthy, John H. L. Hansen:

Noise tracking for speech systems in adverse environments. 834-837 - Abderrahman Essebbar, Tristan Poinsard:

Speech enhancement using multi-reference noise reduction in a vehicle environment. 838-841 - Ernst Warsitz, Reinhold Haeb-Umbach, Dang Hai Tran Vu:

Blind adaptive principal eigenvector beamforming for acoustical source separation. 842-845 - Zbynek Koldovský, Petr Tichavský:

Time-domain blind audio source separation using advanced ICA methods. 846-849 - Siu Wa Lee, Frank K. Soong, Pak-Chung Ching:

Model-based speech separation with single-microphone input. 850-853 - Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Masato Miyoshi:

Multi-step linear prediction based speech dereverberation in noisy reverberant environment. 854-857 - Seung Yeol Lee, Jong Won Shin, Hwan Sik Yun, Nam Soo Kim:

A statistical model based post-filtering algorithm for residual echo suppression. 858-861 - Xiaoshan Huang, Xiaoqun Zhao:

An optimal speech enhancement under speech uncertainty probability and masking property of auditory system. 862-865
Structure-based and Template-based Automatic Speech Recognition
- Viktoria Maier, Roger K. Moore:

Temporal episodic memory model: an evolution of minerva2. 866-869 - Gianpaolo Coro, Francesco Cutugno, Fulvio Caropreso:

Speech recognition with factorial-HMM syllabic acoustic models. 870-873 - Mathias De Wachter, Kris Demuynck, Patrick Wambacq, Dirk Van Compernolle:

Evaluating acoustic distance measures for template based recognition. 874-877 - Yan Han, Lou Boves:

Hierarchical acoustic modeling based on random-effects regression for automatic speech recognition. 878-881 - Annika Hämäläinen, Louis ten Bosch, Lou Boves:

Construction and analysis of multiple paths in syllable models. 882-885 - Carol Y. Espy-Wilson, Tarun Pruthi, Amit Juneja, Om Deshmukh:

Landmark-based approach to speech recognition: an alternative to HMMs. 886-889 - Satoshi Asakawa, Nobuaki Minematsu, Keikichi Hirose:

Automatic recognition of connected vowels only using speaker-invariant representation of speech dynamics. 890-893 - Roberto Togneri, Li Deng:

A structured speech model parameterized by recursive dynamics and neural networks. 894-897 - Li Deng, Helmer Strik:

Structure-based and template-based automatic speech recognition - comparing parametric and non-parametric approaches. 898-901 - David Grangier, Samy Bengio:

Learning the inter-frame distance for discriminative template-based keyword detection. 902-905 - Dong Yu, Li Deng, Alex Acero:

Handling phonetic context and speaker variation in a structure-based speech recognizer. 906-909
Robust ASR Against Noise and Reverberation
- Maarten Van Segbroeck, Hugo Van hamme:

Vector-quantization based mask estimation for missing data automatic speech recognition. 910-913 - Sébastien Demange, Christophe Cerisara, Jean Paul Haton:

Accurate marginalization range for missing data recognition. 914-917 - Marco Kühne, Roberto Togneri, Sven Nordholm:

Smooth soft mel-spectrographic masks based on blind sparse source separation. 918-921 - Jonathan Laidler, Martin Cooke, Neil D. Lawrence:

Model-driven detection of clean speech patches in noise. 922-925 - Richard M. Stern, Evandro B. Gouvêa, Govindarajan Thattai:

"polyaural" array processing for automatic speech recognition in degraded environments. 926-929 - Nicolás Morales, Liang Gu, Yuqing Gao:

Adding noise to improve noise robustness in speech recognition. 930-933
Language Resources and Tools
- Eric Fosler-Lussier, Laura Dilley, Na'im R. Tyson, Mark A. Pitt:

The Buckeye corpus of speech: updates and enhancements. 934-937 - Nora Barroso, Aitzol Ezeiza, N. Gilisagasti, Karmele López de Ipiña, A. López, Juan Miguel López:

Development of multimodal resources for multilingual information retrieval in the Basque context. 938-941 - Reva Schwartz, Wade Shen, Joseph P. Campbell, Shelley Paget, Julie Vonwiller, Dominique Estival, Christopher Cieri:

Construction of a phonotactic dialect corpus using semiautomatic annotation. 942-945 - Slim Abdennadher, Mohamed Aly, Dirk Bühler, Wolfgang Minker, Johannes Pittermann:

BECAM tool - a semi-automatic tool for bootstrapping emotion corpus annotation and management. 946-949 - Christopher Cieri, Linda Corson, David Graff, Kevin Walker:

Resources for new research directions in speaker recognition: the Mixer 3, 4 and 5 corpora. 950-953 - Peter A. Heeman, Andy McMillin, J. Scott Yaruss:

Intercoder reliability in annotating complex disfluencies. 954-957
Single-channel Speech Enhancement
- Mohammad H. Radfar, Richard M. Dansereau:

Single channel speech separation using maximum a posteriori estimation. 958-961 - Suhadi Suhadi, Tim Fingscheidt:

Speech enhancement with improved a posteriori SNR computation. 962-965 - Thang Tat Vu, Germine Seide, Masashi Unoki, Masato Akagi:

Method of LP-based blind restoration for improving intelligibility of bone-conducted speech. 966-969 - Tiago H. Falk, Svante Stadler, W. Bastiaan Kleijn, Wai-Yip Chan:

Noise suppression based on extending a speech-dominated modulation band. 970-973 - Amin Haji Abolhassani, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy, Mohamed Faouzi Harkat:

Speech enhancement using PCA and variance of the reconstruction error model identification. 974-977 - Jong Won Shin, Woohyung Lim, June Sig Sung, Nam Soo Kim:

Speech reinforcement based on partial specific loudness. 978-981
Phonetics and Phonology
- Tamara Rathcke, Jonathan Harrington:

The phonetics and phonology of high and low tones in two falling f0-contours in standard German. 982-985 - Tina John, Jonathan Harrington:

Temporal alignment of creaky voice in neutralised realisations of an underlying, post-nasal voicing contrast in German. 986-989 - Mike Demol, Werner Verhelst, Piet Verhoeve:

The duration of speech pauses in a multilingual environment. 990-993 - Dafydd Gibbon, Jolanta Bachan, Grazyna Demenko:

Syllable timing patterns in Polish: results from annotation mining. 994-997 - Constandinos Kalimeris, Stelios Bakamidis:

Minimal pairs and functional loads of sound contrasts obtained from a list of Modern Greek words. 998-1001 - Daan Wissing:

More on acoustic correlates of stress. 1002-1005 - Cécile Woehrling, Philippe Boula de Mareüil:

Comparing Praat and Snack formant measurements on two large corpora of northern and southern French. 1006-1009 - William J. Barry, Bistra Andreeva, Ingmar Steiner:

The phonetic exponency of phrasal accentuation in French and German. 1010-1013 - Christiana Christodoulou:

Phonetic geminates in Cypriot Greek: the case of voiceless plosives. 1014-1017 - Darcie Williams, François Poiré:

Predicting vowel duration in spontaneous Canadian French speech. 1018-1021 - Ivan Chow, François Poiré:

Rhotic variation and schwa epenthesis in Windsor French. 1022-1025 - Audrey Bürki, Cécile Fougeron, Cédric Gendrot:

On the categorical nature of the process involved in schwa elision in French. 1026-1029 - Yue-Ning Hu, Min Chu, Chao Huang, Yan-Ning Zhang:

Exploring tonal variations via context-dependent tone models. 1030-1033 - Philippe Martin, Jun Li:

Acoustic analysis of the neutral tone in Mandarin. 1034-1037 - Rerrario Shui-Ching Ho, Yoshinori Sagisaka:

F0 analysis of perceptual distance among Cantonese level tones. 1038-1041
Robust ASR I, II
- Yu Hu, Qiang Huo:

Irrelevant variability normalization based HMM training using VTS approximation of an explicit model of environmental distortions. 1042-1045 - Luis Buera, Antonio Miguel, Eduardo Lleida, Oscar Saz, Alfonso Ortega:

On the jointly unsupervised feature vector normalization and acoustic model compensation for robust speech recognition. 1046-1049 - Yu Tsao, Chin-Hui Lee:

An ensemble modeling approach to joint characterization of speaker and speaking environments. 1050-1053 - Shih-Hsiang Lin, Yao-Ming Yeh, Berlin Chen:

Cluster-based polynomial-fit histogram equalization (CPHEQ) for robust speech recognition. 1054-1057 - Pedro M. Martinez, José C. Segura, Luz García:

Robust distributed speech recognition using histogram equalization and correlation information. 1058-1061 - Jen-Tzung Chien, Koichi Shinoda, Sadaoki Furui:

Predictive minimum Bayes risk classification for robust speech recognition. 1062-1065 - Ning Ma, Jon Barker, Phil D. Green:

Applying word duration constraints by using unrolled HMMs. 1066-1069 - Xiong Xiao, Engsiong Chng, Haizhou Li:

Evaluating the temporal structure normalisation technique on the Aurora-4 task. 1070-1073 - Hynek Boril, Petr Fousek, Harald Höge:

Two-stage system for robust neutral/Lombard speech recognition. 1074-1077 - Takatoshi Jitsuhiro, Tomoji Toriyama, Kiyoshi Kogure:

Noise suppression using search strategy with multi-model compositions. 1078-1081 - Takanobu Nishiura, Yoshiki Hirano, Yuki Denda, Masato Nakayama:

Investigations into early and late reflections on distant-talking speech recognition toward suitable reverberation criteria. 1082-1085 - Stefan Windmann, Reinhold Haeb-Umbach:

An approach to iterative speech feature enhancement and recognition. 1086-1089 - Jeih-Weih Hung:

Optimization of temporal filters in the modulation frequency domain for constructing robust features in speech recognition. 1090-1093 - Rico Petrick, Kevin Lohde, Matthias Wolff, Rüdiger Hoffmann:

The harming part of room acoustics in automatic speech recognition. 1094-1097 - Yuan-Fu Liao, Yh-Her Yang, Chi-Hui Hsu, Cheng-Chang Lee, Jing-Teng Zeng:

A reference model weighting-based method for robust speech recognition. 1098-1101 - Babak Nasersharif, Ahmad Akbari, Mohammad Mehdi Homayounpour:

Mel sub-band filtering and compression for robust speech recognition. 1102-1105
Features for ASR
- Chang-Wen Hsu, Lin-Shan Lee:

Extended powered cepstral normalization (p-CN) with range equalization for robust features in speech recognition. 1106-1109 - Makoto Sakai, Norihide Kitaoka, Seiichi Nakagawa:

Selection of optimal dimensionality reduction method using Chernoff bound for segmental unit input HMM. 1110-1113 - Vivek Tyagi:

Fepstrum: an improved modulation spectrum for ASR. 1114-1117 - Dusan Macho:

Narrowband to wideband feature expansion for robust multilingual ASR. 1118-1121 - Weifeng Li, Hervé Bourlard:

Non-linear spectral contrast stretching for in-car speech recognition. 1122-1125 - Xiao-Bing Li, Douglas D. O'Shaughnessy:

Clustering-based two-dimensional linear discriminant analysis for speech recognition. 1126-1129 - Yotaro Kubo, Shigeki Okawa, Akira Kurematsu, Katsuhiko Shirai:

A study on temporal features derived by analytic signal. 1130-1133 - Stephen A. Zahorian, Tara Singh, Hongbing Hu:

Dimensionality reduction of speech features using nonlinear principal components analysis. 1134-1137 - D. Rama Sanand, D. Dinesh Kumar, Srinivasan Umesh:

Linear transformation approach to VTLN using dynamic frequency warping. 1138-1141 - Vladimir Fabregas Surigué de Alencar, Abraham Alcaim:

Features interpolation domain for distributed speech recognition and performance for ITU-T G.723.1 CODEC. 1142-1145 - Shoei Sato, Kazuo Onoe, Akio Kobayashi, Shinichi Homma, Toru Imai, Tohru Takagi, Tetsunori Kobayashi:

Dynamic integration of multiple feature streams for robust real-time LVCSR. 1146-1149 - Hironori Matsumasa, Tetsuya Takiguchi, Yasuo Ariki, Ichao Li, Toshitaka Nakabayashi:

PCA-based feature extraction for fluctuation in speaking style of articulation disorders. 1150-1153 - Fabio Valente, Jithendra Vepa, Hynek Hermansky:

Multi-stream features combination based on Dempster-Shafer rule for LVCSR system. 1154-1157 - Natasha Singh-Miller, Michael Collins, Timothy J. Hazen:

Dimensionality reduction for speech recognition using neighborhood components analysis. 1158-1161 - Dan Su, Xihong Wu, Huisheng Chi:

Probabilistic latent speaker analysis for large vocabulary speech recognition. 1162-1165 - S. R. Mahadeva Prasanna, Hynek Hermansky:

MRASTA and PLP in automatic speech recognition. 1166-1169
Objective Assessment of Voice and Speech Quality
- Markus Brückl:

Women's vocal aging: a longitudinal approach. 1170-1173 - Laurence Cnockaert, Jean Schoentgen, Canan Ozsancak, Pascal Auzou, Francis Grenez:

Effect of intensive voice therapy on vocal tremor for Parkinson speakers. 1174-1177 - Ali Alpan, Abdellah Kacha, Francis Grenez, Jean Schoentgen:

Assessment of vocal dysperiodicities in connected disordered speech. 1178-1181 - Anne-Maria Laukkanen, Jaromír Horácek, Pavel Svancara, Elina Lehtinen:

Effects of FE modelled consequences of tonsillectomy on perceptual evaluation of voice. 1182-1185 - Irma Verdonck-de Leeuw, Louis ten Bosch, Li Ying Chao, Rico N. P. M. Rinkel, Pepijn A. Borggreven, Lou Boves, C. René Leemans:

Speech quality after major surgery of the oral cavity and oropharynx with microvascular soft tissue reconstruction. 1186-1189 - Christel G. de Bruijn, Sandra P. Whiteside:

Voice fatigue and use of speech recognition: a study of voice quality ratings. 1190-1193 - Jean-François Bonastre, Corinne Fredouille, Alain Ghio, Antoine Giovanni, Gilles Pouchoulin, Joana Revis, Bernard Teston, P. Yu:

Complementary approaches for voice disorder assessment. 1194-1197 - Gilles Pouchoulin, Corinne Fredouille, Jean-François Bonastre, Alain Ghio, Antoine Giovanni:

Frequency study for the characterization of the dysphonic voices. 1198-1201 - Victor J. Boucher:

Acoustic correlates of laryngeal-muscle fatigue: findings for a phonometric prevention of acquired voice pathologies. 1202-1205 - Andreas K. Maier, Maria Schuster, Anton Batliner, Elmar Nöth, Emeka Nkenke:

Automatic scoring of the intelligibility in patients with cancer of the oral cavity. 1206-1209 - Jacques Duchateau, Leen Cleuren, Hugo Van hamme, Pol Ghesquière:

Automatic assessment of children's reading level. 1210-1213 - Carlos A. Ferrer-Riesgo, María Esperanza Hernández-Díaz, Eduardo González-Moreira:

Using waveform matching techniques in the measurement of shimmer in voiced signals. 1214-1217 - Rubén Fraile, Juan Ignacio Godino-Llorente, Nicolás Sáenz-Lechón, Víctor Osma-Ruiz, Pedro Gómez-Vilda:

Analysis of the impact of analogue telephone channel on MFCC parameters for voice pathology detection. 1218-1221 - Claudia Manfredi, Leonardo Bocchi, Giovanna Cantarella, Giorgio Peretti, Gabriele Guidi, Vincenzo Mezzatesta:

Objective parameters from videokymographic images: a user-friendly interface. 1222-1225
Speaker Verification & Identification I-IV
- Elizabeth Shriberg, Luciana Ferrer:

A text-constrained prosodic system for speaker verification. 1226-1229 - Asmaa El Hannani, Dijana Petrovska-Delacrétaz:

Fusing acoustic, phonetic and data-driven systems for text-independent speaker verification. 1230-1233 - Najim Dehak, Patrick Kenny, Pierre Dumouchel:

Continuous prosodic features and formant modeling with joint factor analysis for speaker verification. 1234-1237 - Claudio Vair, Daniele Colibro, Fabio Castaldo, Emanuele Dalmasso, Pietro Laface:

Loquendo - Politecnico di Torino's 2006 NIST speaker recognition evaluation system. 1238-1241 - Driss Matrouf, Nicolas Scheffer, Benoit G. B. Fauve, Jean-François Bonastre:

A straightforward and efficient implementation of the factor analysis model for speaker verification. 1242-1245 - Timothy J. Hazen, Daniel Schultz:

Multi-modal user authentication from video for mobile or variable-environment applications. 1246-1249
Discourse, Dialog and Emotion Expression
- David House:

Integrating audio and visual cues for speaker friendliness in multimodal speech synthesis. 1250-1253 - Wieneke Wesseling, R. J. J. H. van Son, Louis C. W. Pols:

The influence of masking words on the prediction of TRPs in a shadowed dialog. 1254-1257 - Kornel Laskowski, Susanne Burger:

Analysis of the occurrence of laughter in meetings. 1258-1261 - Pashiera Barkhuysen, Emiel Krahmer, Marc Swerts:

Incremental perception of acted and real emotional speech. 1262-1265 - David Schlangen, Raquel Fernández:

Speaking through a noisy channel - experiments on inducing clarification behaviour in human-human dialogue. 1266-1269 - Christophe d'Alessandro, Albert Rilliard, Sylvain Le Beux:

Computerized chironomy: evaluation of hand-controlled intonation reiteration. 1270-1273
Prosodic Modeling I, II
- Keikichi Hirose, Keiko Ochi, Nobuaki Minematsu:

Corpus-based generation of prosodic features from text based on generation process model. 1274-1277 - Jilei Tian, Jani Nurminen, Imre Kiss:

Novel eigenpitch-based prosody model for text-to-speech synthesis. 1278-1281 - Volker Strom, Ani Nenkova, Robert A. J. Clark, Yolanda Vazquez-Alvarez, Jason M. Brenier, Simon King, Dan Jurafsky:

Modelling prominence and emphasis improves unit-selection synthesis. 1282-1285 - Seiya Takada, Yuji Yagi, Keikichi Hirose, Nobuaki Minematsu:

A framework of reply speech generation for concept-to-speech conversion in spoken dialogue systems. 1286-1289 - Thorsten Stocksmeier, Stefan Kopp, Dafydd Gibbon:

Synthesis of prosodic attitudinal variants in German backchannel ja. 1290-1293 - Ke Li, Yoko Greenberg, Yoshinori Sagisaka:

Inter-language prosodic style modification experiment using word impression vector for communicative speech generation. 1294-1297
Resource Acquisition and Preparation; Resource and System Evaluation
- Ivan Habernal, Miloslav Konopík:

JAAE: the Java abstract annotation editor. 1298-1301 - Goshu Nagino, Makoto Shozakai, Kiyohiro Shikano:

How to judge reusability of existing speech corpora for target task by utilizing statistical multidimensional scaling. 1302-1305 - Peter Rutten:

Feasibility of constructing an expressive speech corpus from television soap opera dialogue. 1306-1309 - Rosemary Orr, Bernat González i Llinares, Françoise Petersen, Helge Hüttenrauch, Martin Böcker, Michael Tate:

Collection of empirical data for standardization of generic vocabularies in speech driven ICT devices and services. 1310-1313 - Antonio Marcos Selmini, Fábio Violaro:

Acoustic-phonetic features for refining the explicit speech segmentation. 1314-1317 - Benjamin Lecouteux, Georges Linarès, Frédéric Beaugendre, Pascal Nocera:

Text island spotting in large speech databases. 1318-1321 - Tim Paek, Yun-Cheng Ju, Christopher Meek:

People watcher: a game for eliciting human-transcribed data for automated directory assistance. 1322-1325 - Andrew L. Kun, Tim Paek, Zeljko Medenica:

The effect of speech interface accuracy on driving performance. 1326-1329 - Hua Zhang, Lijuan Wang, Frank K. Soong, Wenju Liu:

Context constrained-generalized posterior probability for verifying phone transcriptions. 1330-1333 - Pongtep Angkititrakul, DongGu Kwak, SangJo Choi, JeongHee Kim, Anh PhucPhan, Amardeep Sathyanarayana, John H. L. Hansen:

Getting start with UTDrive: driver-behavior modeling and assessment of distraction for in-vehicle speech systems. 1334-1337 - BalaKrishna Kolluru, Yoshihiko Gotoh:

Relative evaluation of informativeness in machine generated summaries. 1338-1341 - Toshiyuki Takezawa, Masahide Mizushima, Tohru Shimizu, Gen-ichiro Kikui:

A method for evaluating task-oriented spoken dialog translation systems based on communication efficiency. 1342-1345 - Charlotte van Hooijdonk, Edwin Commandeur, Reinier Cozijn, Emiel Krahmer, Erwin Marsi:

Using eye movements for online evaluation of speech synthesis. 1346-1349 - Jian Li, Dmitry Sityaev, Jie Hao:

Sentence level intelligibility evaluation for Mandarin text-to-speech systems using semantically unpredictable sentences. 1350-1353 - Judith M. Kessens, David A. van Leeuwen:

N-best: the Northern- and Southern-Dutch benchmark evaluation of speech recognition technology. 1354-1357 - Trym Holter, Svein Sørsdal:

A MAP based approach to adaptive speech intelligibility measurements. 1358-1361 - Sirinoot Boonsuk, Proadpran Punyabukkana, Atiwong Suchato:

Phone boundary detection using selective refinements and context-dependent acoustic features. 1362-1365
Speech Production I, II
- Sorin Dusan:

Vocal tract length during speech production. 1366-1369 - Nobuhiro Miki, Kyohei Hayashi:

Approximation method of subglottal system using ARMA filter. 1370-1373 - Asterios Toutios, Konstantinos G. Margaritis:

Enhancing acoustic-to-EPG mapping with lip position information. 1374-1377 - Tokihiko Kaburagi, Yosuke Tanabe:

A model of glottal flow incorporating viscous-inviscid interaction. 1378-1381 - Kilian G. Seeber:

Thinking outside the cube: modeling language processing tasks in a multiple resource paradigm. 1382-1385 - Julien Cisonni, Annemie Van Hirtum, Jan Willems, Xavier Pelorson:

Experimental validation of direct and inverse glottal flow models for unsteady flow conditions. 1386-1389 - Hideyuki Nomura, Tetsuo Funada:

Effect of unsteady glottal flow on the speech production process. 1390-1393 - Katrin Schneider, Bernd Möbius:

Word stress correlates in spontaneous child-directed speech in German. 1394-1397 - Michael Aron, Nicolas Ferveur, Erwan Kerrien, Marie-Odile Berger, Yves Laprie:

Acquisition and synchronization of multimodal articulatory data. 1398-1401 - Vincent Robert, Yves Laprie, Anne Bonneau:

A phonetic concatenative approach of labial coarticulation. 1402-1405 - Aseel Turkmani, Adrian Hilton, Philip J. B. Jackson, James D. Edge:

Visual analysis of lip coarticulation in VCV utterances. 1406-1409 - Matti Airas, Paavo Alku:

Comparison of multiple voice source parameters in different phonation types. 1410-1413 - Monja A. Knoll, Lisa Scharrer:

Acoustic and affective comparisons of natural and imaginary infant-, foreigner- and adult-directed speech. 1414-1417 - André Araújo, Luis M. T. Jesus, Isabel M. Costa:

Vowel production in two occlusal classes. 1418-1421 - Rajesh Khatiwada:

Nepalese retroflex stops: a static palatography study of inter- and intra-speaker variability. 1422-1425 - Charles A. Lamoureux, Victor J. Boucher:

Effects of testosterone levels on temporal and intonational aspects of speech: more exploratory data. 1426-1428
ASR: New Paradigms
- Tien Ping Tan, Laurent Besacier:

Modeling context and language variation for non-native speech recognition. 1429-1432 - Xufang Zhao, Douglas D. O'Shaughnessy:

An evaluation of cross-language adaptation and native speech training for rapid HMM construction based on very limited training data. 1433-1436 - Konstantin Markov, Satoshi Nakamura:

Never-ending learning with dynamic hidden Markov network. 1437-1440 - Catherine Breslin, Mark J. F. Gales:

Building multiple complementary systems using directed decision trees. 1441-1444 - Hiroaki Nanjo, Yuichi Oku, Takehiko Yoshimi:

Automatic speech recognition framework for multilingual audio contents. 1445-1448 - Ghazi Bouselmi, Dominique Fohr, Irina Illina:

Combined acoustic and pronunciation modelling for non-native speech recognition. 1449-1452 - Tadashi Emori, Yoshifumi Onishi, Koichi Shinoda:

Automatic estimation of scaling factors among probabilistic models in speech recognition. 1453-1456 - Emilian Stoimenov, John W. McDonough:

Memory efficient modeling of polyphone context with weighted finite-state transducers. 1457-1460 - Valeriy Pylypenko:

Extra large vocabulary continuous speech recognition algorithm based on information retrieval. 1461-1464 - I. Lee Hetherington:

PocketSUMMIT: small-footprint continuous speech recognition. 1465-1468 - Tobias Cincarek, Izumi Shindo, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano:

Development of preschool children subsystem for ASR and Q&A in a real-environment speech-oriented guidance task. 1469-1472 - Chengyuan Ma, Chin-Hui Lee:

A study on word detector design and knowledge-based pruning and rescoring. 1473-1476 - Thomas Colthurst, Tresi Arvizo, Chia-Lin Kao, Owen Kimball, Stephen A. Lowe, David R. H. Miller, Jim Van Sciver:

Parameter tuning for fast speech recognition. 1477-1480 - Louis ten Bosch, Bert Cranen:

A computational model for unsupervised word discovery. 1481-1484 - Bernd T. Meyer, Matthias Wächter, Thomas Brand, Birger Kollmeier:

Phoneme confusions in human and automatic speech recognition. 1485-1488 - Kengo Ohta, Masatoshi Tsuchiya, Seiichi Nakagawa:

Construction of spoken language model including fillers using filler prediction model. 1489-1492 - Raghunandan Kumaran, Jeff A. Bilmes, Katrin Kirchhoff:

Attention shift decoding for conversational speech recognition. 1493-1496
Speech and Language Technology for Less-resourced Languages
- Péter Mihajlik, Tibor Fegyó, Zoltán Tüske, Pavel Ircing:

A morpho-graphemic approach for the recognition of spontaneous speech in agglutinative languages - like Hungarian. 1497-1500 - Mei Yang, Jing Zheng, Andreas Kathol:

A semi-supervised learning approach for morpheme segmentation for an Arabic dialect. 1501-1504 - Gerhard B. Van Huyssteen, Martin J. Puttkammer:

Accelerating the annotation of lexical data for less-resourced languages. 1505-1508 - Christoph Draxler:

On web-based creation of speech resources for less-resourced languages. 1509-1512 - Miroslav Martinovic, Srdjdan Vesic, Goran Rakic:

Building an information retrieval system for Serbian - challenges and solutions. 1513-1516 - Guy De Pauw, Peter Waiganjo Wagacha:

Bootstrapping morphological analysis of Gĩkũyũ using unsupervised maximum entropy learning. 1517-1520 - Jerneja Zganec-Gros, Stanislav Gruden:

The VoiceTRAN machine translation system. 1521-1524 - Sérgio Paulo, Luís C. Oliveira:

MuLAS: a framework for automatically building multi-tier corpora. 1525-1528 - Jacquelijn Ringersma, Marc Kemps-Snijders:

Creating multimedia dictionaries of endangered languages using LEXUS. 1529-1532 - Hrafn Loftsson, Eiríkur Rögnvaldsson:

IceNLP: a natural language processing toolkit for Icelandic. 1533-1536 - Marius Peche, Marelie H. Davel, Etienne Barnard:

Phonotactic spoken language identification with limited training data. 1537-1540 - Solomon Teferra Abate, Wolfgang Menzel:

Automatic speech recognition for an under-resourced language - Amharic. 1541-1544 - Abdillahi Nimaan, Pascal Nocera, Frédéric Béchet, Jean-François Bonastre:

Information retrieval strategies for accessing African audio corpora. 1545-1548 - Vesa Siivola, Mathias Creutz, Mikko Kurimo:

Morfessor and variKN machine learning tools for speech and language technology. 1549-1552 - Markpong Jongtaveesataporn, Issara Thienlikit, Chai Wutiwiwatchai, Sadaoki Furui:

Towards better language modeling for Thai LVCSR. 1553-1556
Adaptation in ASR I, II
- Jonas Lööf, Ralf Schlüter, Hermann Ney:

Efficient estimation of speaker-specific projecting feature transforms. 1557-1560 - Mohamed Kamal Omar:

Regularized feature-based maximum likelihood linear regression for speech recognition. 1561-1564 - Santiago Omar Caballero Morales, Stephen J. Cox:

Modelling confusion matrices to improve speech recognition accuracy, with an application to dysarthric speech. 1565-1568 - Qiang Huo, Wei Li:

An active approach to speaker and task adaptation based on automatic analysis of vocabulary confusability. 1569-1572 - Jing Zheng, Andreas Stolcke:

fMPE-MAP: improved discriminative adaptation for modeling new domains. 1573-1576 - Timothy J. Hazen, Erik McDermott:

Discriminative MCE-based speaker adaptation of acoustic models for a spoken lecture processing task. 1577-1580
Speech Perception I, II
- Douglas Brungart, Nandini Iyer:

Time-compressed speech perception with speech and noise maskers. 1581-1584 - Anne Cutler, Martin Cooke, María Luisa García Lecumberri, Dennis Pasveer:

L2 consonant identification in noise: cross-language comparisons. 1585-1588 - Jennifer T. Le, Catherine T. Best, Michael D. Tyler, Christian Kroos:

Effects of non-native dialects on spoken word recognition. 1589-1592 - Julien Meyer, Fanny Meunier, Laure Dentel:

Identification of natural whistled vowels by non-whistlers. 1593-1596 - Alexandra Jesse, James M. McQueen:

Prelexical adjustments to speaker idiosyncrasies: are they position-specific? 1597-1600 - Holger Mitterer:

Top-down effects on compensation for coarticulation are not replicable. 1601-1604
Spoken Language Understanding
- Christian Raymond, Giuseppe Riccardi:

Generative and discriminative algorithms for spoken language understanding. 1605-1608 - Elias Iosif, Alexandros Potamianos:

A soft-clustering algorithm for automatic induction of semantic classes. 1609-1612 - Agustín Gravano, Stefan Benus, Julia Hirschberg, Shira Mitchell, Ilia Vovsha:

Classification of discourse functions of affirmative words in spoken dialogue. 1613-1616 - Bogdan Minescu, Géraldine Damnati, Frédéric Béchet, Renato de Mori:

Conditional use of word lattices, confusion networks and 1-best string hypotheses in a sequential interpretation strategy. 1617-1620 - Jáchym Kolár, Yang Liu, Elizabeth Shriberg:

Speaker adaptation of language models for automatic dialog act segmentation of meetings. 1621-1624 - Amparo Albalate, Dimitar Dimitrov, Roberto Pieraccini:

Unsupervised categorisation approaches for technical support automated agents. 1625-1628
Pitch Extraction I, II
- Michael Wohlmayr, Marián Képesi:

Joint position-pitch extraction from multichannel audio. 1629-1632 - Hyun Soo Kim:

Morphological pre-processing technique and its applications on speech signal. 1633-1636 - Patricia A. Pelle, Claudio Estienne:

A pitch extraction system based on phase locked loops and consensus decision. 1637-1640 - Milan Legát, Jindrich Matousek, Daniel Tihelka:

A robust multi-phase pitch-mark detection algorithm. 1641-1644 - Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu, Md. Kamrul Hasan:

Pitch estimation of noisy speech signals using empirical mode decomposition. 1645-1648 - Daniel Hirst, Hyongsil Cho, Sunhee Kim, Hyunji Yu:

Evaluating two versions of the Momel pitch modelling algorithm on a corpus of read speech in Korean. 1649-1652 - Hussein Hussein, Oliver Jokisch:

Hybrid electroglottograph and speech signal based algorithm for pitch marking. 1653-1656
Speech Coding and Transmission
- Saikat Chatterjee, Thippur V. Sreenivas:

Normalized two stage SVQ for minimum complexity wide-band LSF quantization. 1657-1660 - Peng Zhang, Changchun Bao:

A novel 2kb/s waveform interpolation speech coder based on non-negative matrix factorization. 1661-1664 - Ahmed Ismail, Yasser Dakroury, Hazem M. Abbas:

A novel energy distribution comparison approach for robust speech spectrum vector quantization. 1665-1668 - Ahmed Ismail, Yasser Dakroury, Hazem M. Abbas:

Novel low-band phase representation for low bit-rate speech coding. 1669-1672 - Chun-Feng Wu, Cheng-Lung Lee, Wen-Whei Chang:

Perceptual-based playout mechanisms for multi-stream voice over IP networks. 1673-1676 - Robert Zopf, Jes Thyssen, Juin-Hwey Chen:

Time-warping and re-phasing in packet loss concealment. 1677-1680 - Yannis Agiomyrgiannakis, Yannis Stylianou:

The harmonic model codec (HMC) framework for VoIP. 1681-1684 - Yannis Agiomyrgiannakis, Yannis Stylianou:

Bit-erasure channel decoding for GMM-based multiple description coding. 1685-1688 - Hua Yuan, Tiago H. Falk, Wai-Yip Chan:

Degradation-classification assisted single-ended quality measurement of speech. 1689-1692 - Alexander Raake, Sascha Spors, Jens Ahrens, Jitendra Ajmera:

Concept and evaluation of a downward-compatible system for spatial teleconferencing using automatic speaker clustering. 1693-1696 - Min-Ki Lee, Kyung-Tae Kim, Hong-Goo Kang, Dae Hee Youn:

Speech quality estimation using packet loss effects in CELP-type speech coders. 1697-1700 - Masahiro Oshikiri, Hiroyuki Ehara, Toshiyuki Morii, Tomofumi Yamanashi, Kaoru Satoh, Koji Yoshida:

An 8-32 kbit/s scalable wideband coder extended with MDCT-based bandwidth extension on top of a 6.8 kbit/s narrowband CELP coder. 1701-1704
Topics in Acoustic Modeling
- Robert Wielgat, Tomasz P. Zielinski, Pawel Swietojanski, Piotr Zoladz, Daniel Król, Tomasz Wozniak, Stanislaw Grabias:

Comparison of HMM and DTW methods in automatic recognition of pathological phoneme pronunciation. 1705-1708 - Kai Yu, Mark J. F. Gales, Philip C. Woodland:

Unsupervised training with directed manual transcription for recognising Mandarin broadcast audio. 1709-1712 - Hao Wu, Xihong Wu:

Context dependent syllable acoustic model for continuous Chinese speech recognition. 1713-1716 - Dimitris Oikonomidis, Vassilios Diakoloukas, Vassilios Digalakis:

A sub-optimal Viterbi-like search for linear dynamic models classification. 1717-1720 - Georg Heigold, Ralf Schlüter, Hermann Ney:

On the equivalence of Gaussian HMM and Gaussian HMM-like hidden conditional random fields. 1721-1724 - Stefano Scanzio, Pietro Laface, Roberto Gemello, Franco Mana:

Speeding-up neural network training using sentence and frame selection. 1725-1728 - Linquan Liu, Thomas Fang Zheng, Makoto Akabane, Ruxin Chen, Wenhu Wu:

Using a small development set to build a robust dialectal Chinese speech recognizer. 1729-1732
Confidence Measures (and Related Topics)
- Carlos Molina, Néstor Becerra Yoma, Fernando Huenupán, Claudio Garretón:

Unsupervised re-scoring of observation probability in Viterbi based on reinforcement learning by using confidence measure and HMM neighborhood. 1733-1736 - Shiuan-Sung Lin, François Yvon:

Optimization on decoding graphs by discriminative training. 1737-1740 - Stéphane Huet, Guillaume Gravier, Pascale Sébillot:

Morphosyntactic processing of n-best lists for improved recognition and confidence measure computation. 1741-1744 - Xiang Li, Juan M. Huerta:

How predictable is ASR confidence in dialog applications? 1745-1748 - Alexandre Allauzen:

Error detection in confusion network. 1749-1752 - Takanobu Oba, Takaaki Hori, Atsushi Nakamura:

An approach to efficient generation of high-accuracy and compact error-corrective models for speech recognition. 1753-1756 - Hamed Ketabdar, Mirko Hannemann, Hynek Hermansky:

Detection of out-of-vocabulary words in posterior based ASR. 1757-1760
Grapheme-to-Phoneme Conversion
- Daniela Braga, Luís Pinto Coelho, Fernando Gil Vianna Resende Jr.:

Homograph ambiguity resolution in front-end design for Portuguese TTS systems. 1761-1764 - Ghinwa F. Choueiter, Stephanie Seneff, James R. Glass:

New word acquisition using subword modeling. 1765-1768 - Samuel Thomas, Ashish Verma:

Language identification of person names using CF-IOF based weighing function. 1769-1772 - Henk van den Heuvel, Jean-Pierre Martens, Nanneke Konings:

G2P conversion of names: what can we do (better)? 1773-1776 - Ausdang Thangthai, Chai Wutiwiwatchai, Anocha Rugchatjaroen, Sittipong Saychum:

A learning method for Thai phonetization of English words. 1777-1780 - Steffen Werner, Rüdiger Hoffmann:

Spontaneous speech synthesis by pronunciation variant selection - a comparison to natural speech. 1781-1784 - Nikos Tsourakis, Vassilios Digalakis:

A generic methodology of converting transliterated text to phonetic strings case study: Greeklish. 1785-1788 - Rita Singh, Evandro B. Gouvêa, Bhiksha Raj:

Probabilistic deduction of symbol mappings for extension of lexicons. 1789-1792
Lexical and Prosodic Modeling
- Sergey Astrov, Joachim Hofer, Harald Höge:

Use of syllable center detection for improved duration modeling in Chinese Mandarin connected digits recognition. 1793-1796 - Thomas Pellegrini, Lori Lamel:

Using phonetic features in unsupervised word decompounding for ASR with application to a less-represented language. 1797-1800 - Sheng Qiang, Yao Qian, Frank K. Soong, Congfu Xu:

Robust F0 modeling for Mandarin speech recognition in noise. 1801-1804 - Dino Seppi, Daniele Falavigna, Georg Stemmer, Roberto Gretter:

Word duration modeling for word graph rescoring in LVCSR. 1805-1808 - Fabio Tamburini, Petra Wagner:

On automatic prominence detection for German. 1809-1812 - Sankaranarayanan Ananthakrishnan, Shrikanth S. Narayanan:

Prosody-enriched lattices for improved syllable recognition. 1813-1816 - Joel Pinto, Andrew Lovitt, Hynek Hermansky:

Exploiting phoneme similarities in hybrid HMM-ANN keyword spotting. 1817-1820 - C. E. Liu, Kishan Thambiratnam, Frank Seide:

Online vocabulary adaptation using limited adaptation data. 1821-1824
Speech Recognition by Automatic Attribute Transcription
- Chin-Hui Lee, Mark A. Clements, Sorin Dusan, Eric Fosler-Lussier, Keith Johnson, Biing-Hwang Juang, Lawrence R. Rabiner:

An overview on automatic speech attribute transcription (ASAT). 1825-1828 - Ilana Bromberg, Qian Qian, Jun Hou, Jinyu Li, Chengyuan Ma, Brett Matthews, Antonio Moreno-Daniel, Jeremy Morris, Sabato Marco Siniscalchi, Yu Tsao, Yu Wang:

Detection-based ASR in the automatic speech attribute transcription project. 1829-1832 - Chi-Yueh Lin, Hsiao-Chuan Wang:

Attribute-based Mandarin speech recognition using conditional random fields. 1833-1836 - Helmer Strik, Khiet P. Truong, Febe de Wet, Catia Cucchiarini:

Comparing classifiers for pronunciation error detection. 1837-1840 - Jarek Krajewski, Bernd J. Kröger:

Using prosodic and spectral characteristics for sleepiness detection. 1841-1844 - Brian M. Ore, Raymond E. Slyh:

Score fusion for articulatory feature detection. 1845-1848
Speaker Diarization
- Scott Otterson:

Improved location features for meeting speaker diarization. 1849-1852 - Kyu Jeong Han, Shrikanth S. Narayanan:

A robust stopping criterion for agglomerative hierarchical clustering in a speaker diarization system. 1853-1856 - Marijn Huijbregts, Chuck Wooters:

The blame game: performance analysis of speaker diarization system components. 1857-1860 - Hagai Aronowitz:

Trainable speaker diarization. 1861-1864 - Jing Huang, Etienne Marcheret, Karthik Visweswariah:

Improving speaker diarization for CHIL lecture meetings. 1865-1868 - Viet Bac Le, Odile Mella, Dominique Fohr:

Speaker diarization using normalized cross likelihood ratio. 1869-1872
First and Second Language Learning
- Wai-Sum Lee:

Tone production by the speakers of different age-and-gender groups. 1873-1876 - Nan Xu, Denis Burnham, Christine Kitamura:

Vowels and tones in infant directed speech: hyperarticulation for both, but different developmental patterns. 1877-1880 - Eon-Suk Ko:

Acquisition of vowel duration in children speaking American English. 1881-1884 - Hiroko Hirano, Keikichi Hirose, Goh Kawai, Wentao Gu, Nobuaki Minematsu:

F0 models show Chinese speakers of Japanese insert intonational boundaries and drop pitch. 1885-1888 - Paola Escudero, Jelle Kastelein, Klara A. Weiand, R. J. J. H. van Son:

Formal modelling of L1 and L2 perceptual learning: computational linguistics versus machine learning. 1889-1892 - Mirjam Broersma:

Kettle hinders cat, shadow does not hinder shed: activation of 'almost embedded' words in nonnative listening. 1893-1896
Speech Synthesis I, II
- Sacha Krstulovic, Anna Hunecke, Marc Schröder:

An HMM-based speech synthesis system applied to German and its adaptation to a limited set of expressive football announcements. 1897-1900 - Liang Gu, Wei Zhang, Lazkin Tahir, Yuqing Gao:

Statistical vowelization of Arabic text for speech synthesis in speech-to-speech translation systems. 1901-1904 - Wu Liu, Dezhi Huang, Yuan Dong, Xinnian Mao, Haila Wang:

A pair-based language model for the robust lexical analysis in Chinese text-to-speech synthesis. 1905-1908 - Ranniery Maia, Tomoki Toda, Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda:

A trainable excitation model for HMM-based speech synthesis. 1909-1912 - Jochen Steigner, Marc Schröder:

Cross-language phonemisation in German text-to-speech synthesis. 1913-1916 - Ryuki Tachibana, Tohru Nagano, Gakuto Kurata, Masafumi Nishimura, Noboru Babaguchi:

Preliminary experiments toward automatic generation of new TTS voices from recorded speech alone. 1917-1920
Phonetic Segmentation and Classification I, II
- Xiaochuan Niu, Jan P. H. van Santen:

Dual-channel acoustic detection of nasalization states. 1921-1924 - Tarun Pruthi, Carol Y. Espy-Wilson:

Acoustic parameters for the automatic detection of vowel nasalization. 1925-1928 - Jun Hou, Lawrence R. Rabiner, Sorin Dusan:

On the use of time-delay neural networks for highly accurate classification of stop consonants. 1929-1932 - Ladan Golipour, Douglas D. O'Shaughnessy:

A new approach for phoneme segmentation of speech signals. 1933-1936 - Veronique Stouten, Kris Demuynck, Hugo Van hamme:

Automatically learning the units of speech by non-negative matrix factorisation. 1937-1940 - Ozlem Kalinli, Shrikanth S. Narayanan:

A saliency-based auditory attention model with applications to unsupervised prominent syllable detection in speech. 1941-1944 - Sung Jun An, Young-Ik Kim, Rhee Man Kil:

Zero-crossing-based ratio masking for sound segregation. 1945-1948 - Satomi Tanaka, Minoru Tsuzaki, Hiroaki Kato, Yoshinori Sagisaka:

Event detection of speech signals based on auditory processing with a dynamic compressive gammachirp filterbank. 1949-1952 - Odette Scharenborg, Mirjam Ernestus, Vincent Wan:

Segmentation of speech: child's play? 1953-1956 - Andrew Errity, John McKenna, Barry Kirkpatrick:

Dimensionality reduction methods applied to both magnitude and phase derived features. 1957-1960
Voice Conversion and Modification
- Zdenek Hanzlícek, Jindrich Matousek:

F0 transformation within the voice conversion framework. 1961-1964 - Daniel Erro, Asunción Moreno:

Weighted frequency warping for voice conversion. 1965-1968 - Daniel Erro, Asunción Moreno:

Frame alignment method for cross-lingual voice conversion. 1969-1972 - Jani Nurminen, Jilei Tian, Victor Popa:

Voicing level control with application in voice conversion. 1973-1976 - Winston S. Percybrooks, Elliot Moore:

New algorithm for LPC residual estimation from LSF vectors for a voice conversion system. 1977-1980 - Yamato Ohtani, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano:

Speaker adaptive training for one-to-many eigenvoice conversion based on Gaussian mixture model. 1981-1984 - Petko Nikolov Petkov, W. Bastiaan Kleijn:

Improving the phase vocoder approach to pitch-shifting. 1985-1988 - Larbi Mesbahi, Vincent Barreaud, Olivier Boëffard:

Comparing GMM-based speech transformation systems. 1989-1992
Speaker Verification & Identification I-IV
- Michael Gerber, René Beutler, Beat Pfister:

Quasi text-independent speaker-verification based on pattern matching. 1993-1996 - Yosef A. Solewicz, Moshe Koppel:

Virtual fusion for speaker recognition. 1997-2000 - Yi-Hsiang Chao, Wei-Ho Tsai, Shih-Sian Cheng, Hsin-Min Wang, Ruei-Chuan Chang:

Evolutionary minimum verification error learning of the alternative hypothesis model for LLR-based speaker verification. 2001-2004 - Seiichi Nakagawa, Kouhei Asakawa, Longbiao Wang:

Speaker recognition by combining MFCC and phase information. 2005-2008 - Sandeep Manocha, Carol Y. Espy-Wilson:

A semi-automatic approach for speaker mining of tapped telephone conversations. 2009-2012 - Hao Yang, Yuan Dong, Xianyu Zhao, Jian Zhao, Liang Lu, Haila Wang:

Cluster adaptive training weights as features in SVM-based speaker verification. 2013-2016 - Hideki Okamoto, Mariko Kojima, Tomoko Matsui, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano:

Study on speaker verification with non-audible murmur segments. 2017-2020 - Xugang Lu, Jianwu Dang:

Dimension reduction for speaker identification based on mutual information. 2021-2024 - Jonas Lindh, Anders Eriksson:

Robustness of long time measures of fundamental frequency. 2025-2028 - Vinod Prakash, John H. L. Hansen:

Score distribution scaling for speaker recognition. 2029-2032 - Andrew C. Morris, Jacques C. Koreman, B. Ly-Van, Harin Sellahewa, Sabah Jassim, R. Llarena Gómez:

Global features for rapid identity verification with dynamic biometric data. 2033-2036 - Tuan Van Pham, Michael Neffe, Gernot Kubin:

Robust voice activity detection for narrow-bandwidth speaker verification under adverse environments. 2037-2040 - Fernando Huenupán, Néstor Becerra Yoma, Carlos Molina, Claudio Garretón:

Speaker verification with multiple classifier fusion using Bayes based confidence measure. 2041-2044 - Girija Chetty, Michael Wagner:

Audiovisual speaker identity verification based on lip motion features. 2045-2048 - Gökhan Tür, Elizabeth Shriberg, Andreas Stolcke, Sachin S. Kajarekar:

Duration and pronunciation conditioned lexical modeling for speaker verification. 2049-2052 - Jean-François Bonastre, Driss Matrouf, Corinne Fredouille:

Artificial impostor voice transformation effects on false acceptance rates. 2053-2056
Improved Acoustic Modeling for ASR
- Jen-Wei Kuo, Hung-Yi Lo, Hsin-Min Wang:

Improved HMM/SVM methods for automatic phoneme segmentation. 2057-2060 - Takahiro Shinozaki, Tatsuya Kawahara:

Gaussian mixture optimization for HMM based on efficient cross-validation. 2061-2064 - Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda:

Model-space MLLR for trajectory HMMs. 2065-2068 - Hamed Ketabdar, Hervé Bourlard:

In-context phone posteriors as complementary features for tandem ASR. 2069-2072 - Qian Qian, Xiaodong He, Li Deng:

Phone-discriminating minimum classification error (p-MCE) training for phonetic recognition. 2073-2076 - Lori Lamel, Abdelkhalek Messaoudi, Jean-Luc Gauvain:

Improved acoustic modeling for transcribing Arabic broadcast data. 2077-2080 - Erik McDermott, Atsushi Nakamura:

String and lattice based discriminative training for the corpus of spontaneous Japanese lecture transcription task. 2081-2084 - Byung Ok Kang, Ho-Young Jung, Yunkeun Lee:

Discriminative noise adaptive training approach for an environment migration. 2085-2088 - Jia-Yu Chen, Peder A. Olsen, John R. Hershey:

Word confusability - measuring hidden Markov model similarity. 2089-2092 - Thomas Deselaers, Georg Heigold, Hermann Ney:

Speech recognition with state-based nearest neighbour classifiers. 2093-2096 - Remco Teunen, Masami Akamine:

HMM-based speech recognition using decision trees instead of GMMs. 2097-2100 - Christian Gollan, Stefan Hahn, Ralf Schlüter, Hermann Ney:

An improved method for unsupervised training of LVCSR systems. 2101-2104 - Mohamed Kamal Omar:

A variational approach to robust maximum likelihood estimation for speech recognition. 2105-2108 - Kai Yu, Rob A. Rutenbar:

Generating small, accurate acoustic models with a modified Bayesian information criterion. 2109-2112 - Peter Bell, Simon King:

Sparse Gaussian graphical models for speech recognition. 2113-2116 - Sakriani Sakti, Konstantin Markov, Satoshi Nakamura:

An HMM acoustic model incorporating various additional knowledge sources. 2117-2120 - Matti Varjokallio, Mikko Kurimo:

Comparison of subspace methods for Gaussian mixture models in speech recognition. 2121-2124
Multilingualism in Speech and Language Processing
- Tanja Schultz, Alan W. Black, Sameer Badaskar, Matthew Hornyak, John Kominek:

SPICE: web-based tools for rapid language adaptation in speech processing systems. 2125-2128 - Filip Deprez, Jan Odijk, Jan De Moortel:

Introduction to multilingual corpus-based concatenative speech synthesis. 2129-2132 - Frederik Stouten, Jean-Pierre Martens:

Recognition of foreign names spoken by native speakers. 2133-2136 - Ricardo de Córdoba, Luis Fernando D'Haro, Fernando Fernández Martínez, Juan Manuel Montero, Roberto Barra-Chicote:

Language identification using several sources of information with a multiple-Gaussian classifier. 2137-2140 - Carmen del Solar, Guillermo Pérez García, Eva Florencio, David Moral, Gabriel Amores Carredano, Pilar Manchón Portillo:

Dynamic language change in MIMUS. 2141-2144
Systems for LVCSR and Rich Transcription I, II
- Jonas Lööf, Christian Gollan, Stefan Hahn, Georg Heigold, Björn Hoffmeister, Christian Plahl, David Rybach, Ralf Schlüter, Hermann Ney:

The RWTH 2007 TC-STAR evaluation system for European English and Spanish. 2145-2148 - Chin-Wei Eugene Koh, Hanwu Sun, Tin Lay Nwe, Trung Hieu Nguyen, Bin Ma, Engsiong Chng, Haizhou Li, Susanto Rahardja:

Using direction of arrival estimate and acoustic feature information in speaker diarization. 2149-2152 - Fernando Batista, Diamantino Caseiro, Nuno J. Mamede, Isabel Trancoso:

Recovering punctuation marks for automatic speech recognition. 2153-2156 - Jui-Feng Yeh, Chung-Hsien Wu, Wei-Yen Wu:

Disfluency correction of spontaneous speech using conditional random fields with variable-length features. 2157-2160 - Jing Huang, Etienne Marcheret, Karthik Visweswariah, Vit Libal, Gerasimos Potamianos:

Detection, diarization, and transcription of far-field lecture speech. 2161-2164 - Timothy J. Hazen, Brennan Sherry, Mark Adler:

Speech-based annotation and retrieval of digital photographs. 2165-2168
Language Learning and Assessment
- Joseph Tepperman, Abe Kazemzadeh, Shrikanth S. Narayanan:

A text-free approach to assessing nonnative intonation. 2169-2172 - John Lee, Stephanie Seneff:

Automatic generation of cloze items for prepositions. 2173-2176 - Christopher J. Waple, Hongcui Wang, Tatsuya Kawahara, Yasushi Tsubota, Masatake Dantsuji:

Evaluating and optimizing Japanese tutor system featuring dynamic question generation and interactive guidance. 2177-2180 - Catia Cucchiarini, Ambra Neri, Febe de Wet, Helmer Strik:

ASR-based pronunciation training: scoring accuracy and pedagogical effectiveness of a system for Dutch L2 learners. 2181-2184 - Joseph Tepperman, Matthew Black, Patti Price, Sungbok Lee, Abe Kazemzadeh, Matteo Gerosa, Margaret Heritage, Abeer Alwan, Shrikanth S. Narayanan:

A Bayesian network classifier for word-level reading assessment. 2185-2188
Multimodal Interaction: Analysis and Technology
- Hartwig Holzapfel, Alex Waibel:

Behavior models for learning and receptionist dialogs. 2189-2192 - Markku Turunen, Jaakko Hakulinen, Anssi Kainulainen, Aleksi Melto, Topi Hurtig:

Design of a rich multimodal interface for mobile spoken route guidance. 2193-2196 - Mariët Theune, Dennis Hofs, Marco van Kessel:

The virtual guide: a direction giving embodied conversational agent. 2197-2200 - Sudeep Gandhe, David R. Traum:

Creating spoken dialogue characters from corpora without annotations. 2201-2204 - Pui-Yu Hui, Zhengyu Zhou, Helen M. Meng:

Complementarity and redundancy in multimodal user inputs with speech and pen gestures. 2205-2208 - Linda Bell, Joakim Gustafson:

Children's convergence in referring expressions to graphical objects in a speech-enabled computer game. 2209-2212
Emotion
- Hiromi Kawatsu, Sumio Ohno:

An analysis of individual differences in the f0 contour and the duration of anger utterances at several degrees. 2213-2216 - Yoshiko Arimoto, Sumio Ohno, Hitoshi Iida:

Acoustic features of anger utterances during natural dialog. 2217-2220 - Fadi Biadsy, Julia Hirschberg, Andrew Rosenberg, Wisam Dakka:

Comparing American and Palestinian perceptions of charisma using acoustic-prosodic and lexical analysis. 2221-2224 - Carlos Busso, Sungbok Lee, Shrikanth S. Narayanan:

Using neutral speech models for emotional speech analysis. 2225-2228 - N. Satoh, Katsuya Yamauchi, Shoichi Matsunaga, Masaru Yamashita, R. Nakagawa, Kazuyuki Shinohara:

Emotion clustering using the results of subjective opinion tests for emotion recognition in infants' cries. 2229-2232 - Roberto Barra-Chicote, Juan Manuel Montero, Javier Macías Guarasa, Juana M. Gutiérrez-Arriola, Javier Ferreiros, José Manuel Pardo:

On the limitations of voice conversion techniques in emotion identification tasks. 2233-2236 - Kate Dupuis, Kathleen Pichora-Fuller:

Use of lexical and affective prosodic cues to emotion by younger and older adults. 2237-2240 - Purnima Gupta, Nitendra Rajput:

Two-stream emotion recognition for call center monitoring. 2241-2244 - Ioulia Grichkovtsova, Anne Lacheret, Michel Morel:

The role of intonation and voice quality in the affective speech perception. 2245-2248 - Bogdan Vlasenko, Björn W. Schuller, Andreas Wendemuth, Gerhard Rigoll:

Combining frame and turn-level information for robust recognition of emotions within speech. 2249-2252
Speakers: Expression, Emotion and Personality Recognition
- Björn W. Schuller, Anton Batliner, Dino Seppi, Stefan Steidl, Thurid Vogt, Johannes Wagner, Laurence Devillers, Laurence Vidrascu, Noam Amir, Loïc Kessous, Vered Aharonson:

The relevance of feature type for the automatic classification of emotional user states: low level descriptors and functionals. 2253-2256 - Minh-Quang Vu, Laurent Besacier, Eric Castelli:

Automatic question detection: prosodic-lexical features and crosslingual experiments. 2257-2260 - Makoto Tachibana, Keigo Kawashima, Junichi Yamagishi, Takao Kobayashi:

Performance evaluation of HMM-based style classification with a small amount of training data. 2261-2264 - Khiet P. Truong, David A. van Leeuwen:

Visualizing acoustic similarities between emotions in speech: an acoustic map of emotions. 2265-2268 - Hao Hu, Ming-Xing Xu, Wei Wu:

Fusion of global statistical and segmental spectral features for speech emotion recognition. 2269-2272 - Vidhyasaharan Sethu, Eliathamby Ambikairajah, Julien Epps:

Group delay features for emotion detection. 2273-2276 - Christian A. Müller, Felix Burkhardt:

Combining short-term cepstral and long-term pitch features for automatic recognition of speaker age. 2277-2280 - Frank Enos, Elizabeth Shriberg, Martin Graciarena, Julia Hirschberg, Andreas Stolcke:

Detecting deception using critical segments. 2281-2284 - Takashi Nose, Yoichi Kato, Takao Kobayashi:

Style estimation of speech based on multiple regression hidden semi-Markov model. 2285-2288 - Chi Zhang, John H. L. Hansen:

Analysis and classification of speech mode: whispered through shouted. 2289-2292
First Language, Second Language, Cross-language
- Melissa Bettoni-Techio, Andréia S. Rauber, Rosana Denise Koerich:

Perception and production of word-final alveolar stops by Brazilian Portuguese learners of English. 2293-2296 - Denise Cristina Kluge, Andréia S. Rauber, Mara Silvia Reis, Ricardo Augusto Hoffmann Bion:

The relationship between the perception and production of English nasal codas by Brazilian learners of English. 2297-2300 - Takafumi Utashiro, Goh Kawai:

CALL courseware for learning reactive tokens in face-to-face dialogs. 2301-2304 - Shinya Kiriyama, Ryo Tsuji, Tomohiko Kasami, Shogo Ishikawa, Naofumi Otani, Hiroaki Horiuchi, Yoichi Takebayashi, Shigeyoshi Kitazawa:

The developmental analysis of demonstrative expression skills utilizing a multimodal infant behavior corpus. 2305-2308 - Elena E. Lyakso, Olga V. Frolova:

Russian vowels system acoustic features development in ontogenesis. 2309-2312 - Petra van Alphen, Elise de Bree, Paula Fikkert, Frank Wijnen:

The role of metrical stress in comprehension and production in Dutch children at-risk of dyslexia. 2313-2316 - Seiichi Nakagawa, Kei Ohta:

A statistical method of evaluating pronunciation proficiency for presentation in English. 2317-2320 - Akiyo Joto, Yoshiki Nagase, Seiya Funatsu:

The intelligibility and its relations to acoustic characteristics of English /s/ and /esh/ produced by native speakers of Japanese. 2321-2324 - Martijn Goudbeek, Daniel Swingley, Keith R. Kluender:

The limits of multidimensional category learning. 2325-2328 - Maria Uther, James Uther, Panos Athanasopoulos, Pushpendra Singh, Reiko Akahane-Yamada:

Mobile adaptive CALL (MAC): a lightweight speech-based intervention for mobile language learners. 2329-2332 - Catherine T. Best, Pierre A. Hallé, Jennifer S. Pardo:

English and French speakers' perception of voicing distinctions in non-native lateral consonant syllable onsets. 2333-2336 - Francisco Lacerda, Lisa Gustavsson:

Predicting the consequences of vocalizations in early infancy. 2337-2340 - David Weenink, Guangqin Chen, Zongyan Chen, Stefan de Konink, Dennis Vierkant, Eveline van Hagen, R. J. J. H. van Son:

Learning tone distinctions for Mandarin Chinese. 2341-2344 - Catherine Lai, Kyle Gorman, Jiahong Yuan, Mark Y. Liberman:

Perception of disfluency: language differences and listener bias. 2345-2348
Language Modeling I, II
- Hiroki Yamazaki, Koji Iwano, Koichi Shinoda, Sadaoki Furui, Haruo Yokota:

Dynamic language model adaptation using presentation slides for lecture speech recognition. 2349-2352 - Cosmin Munteanu, Gerald Penn, Ronald Baecker:

Web-based language modelling for automatic lecture transcription. 2353-2356 - Tanel Alumäe, Toomas Kirt:

LSA-based language model adaptation for highly inflected languages. 2357-2360 - Aaron Heidel, Hung-An Chang, Lin-Shan Lee:

Language model adaptation using latent Dirichlet allocation and an efficient topic inference algorithm. 2361-2364 - Sibel Yaman, Jen-Tzung Chien, Chin-Hui Lee:

Structural Bayesian language modeling and adaptation. 2365-2368 - Ciro Martins, António J. S. Teixeira, João Paulo Neto:

Vocabulary selection for a broadcast news transcription system using a morpho-syntactic approach. 2369-2372 - Nguyen Bach, Mohamed Noamany, Ian R. Lane, Tanja Schultz:

Handling OOV words in Arabic ASR via flexible morphological constraints. 2373-2376 - Raquel Justo, M. Inés Torres:

Phrases in category-based language models for Spanish and Basque ASR. 2377-2380 - Ebru Arisoy, Hasim Sak, Murat Saraclar:

Language modeling for automatic Turkish broadcast news transcription. 2381-2384
Spoken Data Retrieval I, II
- Roy Wallace, Robbie Vogt, Sridha Sridharan:

A phonetic search approach to the 2006 NIST spoken term detection evaluation. 2385-2388 - Yoshiaki Itoh, Kohei Iwata, Kazunori Kojima, Masaaki Ishigame, Kazuyo Tanaka, Shi-wook Lee:

An integration method of retrieval results using plural subword models for vocabulary-free spoken document retrieval. 2389-2392 - Dimitra Vergyri, Izhak Shafran, Andreas Stolcke, Venkata Ramana Rao Gadde, Murat Akbacak, Brian Roark, Wen Wang:

The SRI/OGI 2006 spoken term detection system. 2393-2396 - Masataka Goto, Jun Ogata, Kouichirou Eto:

Podcastle: a web 2.0 approach to speech recognition research. 2397-2400 - Nathalie Camelin, Frédéric Béchet, Géraldine Damnati, Renato de Mori:

Speech mining in noisy audio message corpus. 2401-2404 - Jian Shao, Qingwei Zhao, Pengyuan Zhang, Zhaojie Liu, Yonghong Yan:

A fast fuzzy keyword spotting algorithm based on syllable confusion network. 2405-2408 - Wooil Kim, John H. L. Hansen:

Advances in SpeechFind: transcript reliability estimation employing confidence measure based on discriminative sub-word model for SDR. 2409-2412 - Benoît Favre, Jean-François Bonastre, Patrice Bellot:

An interactive timeline for speech database browsing. 2413-2416
Novel Techniques for the NATO Non-native Air-traffic Control and HIWIRE Cockpit Databases
- Stéphane Pigeon, Wade Shen, Aaron D. Lawson, David A. van Leeuwen:

Design and characterization of the non-native military air traffic communications database (nnMATC). 2417-2420 - Wade Shen, Douglas A. Reynolds:

A comparison of speaker clustering and speech recognition techniques for air situational awareness. 2421-2424 - Dimitrios Dimitriadis, José C. Segura, Luz García, Alexandros Potamianos, Petros Maragos, Vassilis Pitsikalis:

Advanced front-end for robust speech recognition in extremely adverse environments. 2425-2428 - Roberto Gemello, Franco Mana, Stefano Scanzio:

Experiments on HIWIRE database using denoising and adaptation with a hybrid HMM-ANN model. 2429-2432 - Brett Y. Smolenski:

Detection and removal of switching noise in push-to-talk and voice operated exchange communications systems. 2433-2436 - Luis Buera, Antonio Miguel, Oscar Saz, Eduardo Lleida, Alfonso Ortega:

Evaluation of the combined use of MEMLIN and MLLR on the non-native adaptation task of HIWIRE project database. 2437-2440
Systems for Spoken Language Translation I, II
- Daniel Déchelotte, Holger Schwenk, Gilles Adda, Jean-Luc Gauvain:

Improved machine translation of speech-to-text outputs. 2441-2444 - Shirin Saleem, Krishna Subramanian, Rohit Prasad, David Stallard, Chia-Lin Kao, Prem Natarajan, Raid Suleiman:

Improvements in machine translation for English/Iraqi speech translation. 2445-2448 - Evgeny Matusov, Dustin Hillard, Mathew Magimai-Doss, Dilek Hakkani-Tür, Mari Ostendorf, Hermann Ney:

Improving speech translation with automatic boundary prediction. 2449-2452 - Roldano Cattoni, Nicola Bertoldi, Marcello Federico:

Punctuating confusion networks for speech translation. 2453-2456 - Aarthi M. Reddy, Richard C. Rose, Alain Désilets:

Integration of ASR and machine translation models in a document translation task. 2457-2460 - Yik-Cheung Tam, Tanja Schultz:

Bilingual LSA-based translation lexicon adaptation for spoken language translation. 2461-2464
Articulatory Features
- Korin Richmond:

A multitask learning perspective on acoustic-articulatory inversion. 2465-2468 - Chao Qin, Miguel Á. Carreira-Perpiñán:

A comparison of acoustic features for articulatory inversion. 2469-2472 - Odette Scharenborg, Vincent Wan:

Can unquantised articulatory feature continuums be modelled? 2473-2476 - Milind S. Shah, Prem C. Pandey:

Estimation of place of articulation in stop consonants for visual feedback. 2477-2480 - Blaise Potard, Yves Laprie:

Compact representations of the articulatory-to-acoustic mapping. 2481-2484 - Joe Frankel, Mathew Magimai-Doss, Simon King, Karen Livescu, Özgür Çetin:

Articulatory feature classifiers trained on 2000 hours of telephone speech. 2485-2488
Wideband Speech Processing
- Amr H. Nour-Eldin, Peter Kabal:

Objective analysis of the effect of memory inclusion on bandwidth extension of narrowband speech. 2489-2492 - Bernd Geiser, Hervé Taddei, Peter Vary:

Artificial bandwidth extension without side information for ITU-T G.729.1. 2493-2496 - Hannu Pulakka, Paavo Alku, Laura Laaksonen, Päivi Valve:

The effect of highband harmonic structure in the artificial bandwidth expansion of telephone speech. 2497-2500 - Shingo Kuroiwa, Masashi Takashina, Satoru Tsuge, Fuji Ren:

Artificial bandwidth extension for speech signals using speech recognition. 2501-2504 - Driss Guerchi, Tamer Rabie, Abdelrhani Louzi:

Voicing-based codebook in low-rate wideband CELP coding. 2505-2508 - Ethan Robert Duni, Bhaskar D. Rao:

Performance of speaker-dependent wideband speech coding. 2509-2512
Accessibility Issues
- Philippe Dreuw, David Rybach, Thomas Deselaers, Morteza Zahedi, Hermann Ney:

Speech recognition techniques for a sign language recognition system. 2513-2516 - Keigo Nakamura, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano:

Impact of various small sound source signals on voice conversion accuracy in speech communication aid for laryngectomees. 2517-2520 - Petr Cerva, Jan Nouza:

Design and development of voice controlled aids for motor-handicapped persons. 2521-2524 - Kouichi Katsurada, Yuji Okuma, Makoto Yano, Yurie Iribe, Tsuneo Nitta:

Management of static/dynamic properties in a multimodal interaction system. 2525-2528 - Rubén San Segundo, Alicia Pérez, Daniel Ortiz, Luis Fernando D'Haro, M. Inés Torres, Francisco Casacuberta:

Evaluation of alternatives on speech to sign language translation. 2529-2532 - Géza Németh, Gábor Olaszy, Mátyás Bartalis, Géza Kiss, Csaba Zainkó, Péter Mihajlik:

Speech based drug information system for aged and visually impaired persons. 2533-2536 - Waldo Nogueira, Tamás Harczos, Bernd Edler, Jörn Ostermann, Andreas Büchner:

Automatic speech recognition with a cochlear implant front-end. 2537-2540 - Soo-Young Suk, Hiroaki Kojima:

Voice activated powered wheelchair with non-voice rejection algorithm. 2541-2544 - Laurianne Sitbon, Patrice Bellot, Philippe Blache:

Phonetic based sentence level rewriting of questions typed by dyslexic spellers in an information retrieval context. 2545-2548
New Application Areas
- André Berton, Peter Regel-Brietzmann, Hans Ulrich Block, Stefanie Schachtl, Manfred Gehrke:

How to integrate speech-operated internet information dialogs into a car. 2549-2552 - James R. Glass, Timothy J. Hazen, D. Scott Cyphers, Igor Malioutov, David Huynh, Regina Barzilay:

Recent progress in the MIT spoken lecture processing project. 2553-2556 - Philipp Fischer, Andreas Österle, André Berton, Peter Regel-Brietzmann:

How to personalize speech applications for web-based information in a car. 2557-2560 - Satoshi Ikeda, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno:

Topic estimation with domain extensibility for guiding user's out-of-grammar utterances in multi-domain spoken dialogue systems. 2561-2564 - Ryota Nishimura, Norihide Kitaoka, Seiichi Nakagawa:

Prosody change and response timing analysis in spontaneously spoken dialogs and their modeling in a spoken dialog system. 2565-2568 - Satoshi Tamura, Kunihiko Takamatsu, Shinji Ogura, Satoru Hayamizu:

GEMSIS - a novel application of speech recognition to emergency and disaster medicine. 2569-2572 - Rachel Coulston, Esther Klabbers, Jacques de Villiers, John-Paul Hosom:

Application of speech technology in a home based assessment kiosk for early detection of Alzheimer's disease. 2573-2576 - Olga Vybornova, Monica Gemo, Ronald Moncarey, Benoît Macq:


