


INTERSPEECH 2010: Makuhari, Japan
- Takao Kobayashi, Keikichi Hirose, Satoshi Nakamura:
INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, September 26-30, 2010. ISCA 2010
Keynotes
- Steve J. Young:
Still talking to machines (cognitively speaking). 1-10 - Tohru Ifukube:
Sound-based assistive technology supporting "seeing", "hearing" and "speaking" for the disabled and the elderly. 11-19 - Chiu-yu Tseng:
Beyond sentence prosody. 20-29
Special Session: Models of Speech - In Search of Better Representations
- Hosung Nam, Vikramjit Mitra, Mark Tiede, Elliot Saltzman, Louis Goldstein, Carol Y. Espy-Wilson, Mark Hasegawa-Johnson:
A procedure for estimating gestural scores from natural speech. 30-33 - Yen-Liang Shue, Gang Chen, Abeer Alwan:
On the interdependencies between voice quality, glottal gaps, and voice-source related acoustic measures. 34-37 - Hideki Kawahara, Masanori Morise, Toru Takahashi, Hideki Banno, Ryuichi Nisimura, Toshio Irino:
Simplification and extension of non-periodic excitation source representations for high-quality speech manipulation systems. 38-41 - Sadao Hiroya, Takemi Mochida:
Phase equalization-based autoregressive model of speech signals. 42-45 - Yi Xu, Santitham Prom-on:
Articulatory-functional modeling of speech prosody: a review. 46-49 - Humberto M. Torres, Hansjörg Mixdorff, Jorge A. Gurlekian, Hartmut R. Pfitzinger:
Two new estimation methods for a superpositional intonation model. 50-53
ASR: Acoustic Models I-III
- Simon Wiesler, Georg Heigold, Markus Nußbaum-Thom, Ralf Schlüter, Hermann Ney:
A discriminative splitting criterion for phonetic decision trees. 54-57 - Mark J. F. Gales, Kai Yu:
Canonical state models for automatic speech recognition. 58-61 - Pierre L. Dognin, John R. Hershey, Vaibhava Goel, Peder A. Olsen:
Restructuring exponential family mixture models. 62-65 - Françoise Beaufays, Vincent Vanhoucke, Brian Strope:
Unsupervised discovery and training of maximally dissimilar cluster models. 66-69 - Khe Chai Sim:
Probabilistic state clustering using conditional random field for context-dependent acoustic modelling. 70-73 - Xie Sun, Yunxin Zhao:
Integrate template matching and statistical modeling for speech recognition. 74-77 - George Saon, Hagen Soltau:
Boosting systems for LVCSR. 1341-1344 - Vaibhava Goel, Tara N. Sainath, Bhuvana Ramabhadran, Peder A. Olsen, David Nahamoo, Dimitri Kanevsky:
Incorporating sparse representation phone identification features in automatic speech recognition using exponential families. 1345-1348 - Xin Chen, Yunxin Zhao:
Integrating MLP features and discriminative training in data sampling based ensemble acoustic modeling. 1349-1352 - Jui-Ting Huang, Mark Hasegawa-Johnson:
Semi-supervised training of Gaussian mixture models by conditional entropy minimization. 1353-1356 - Guangchuan Shi, Yu Shi, Qiang Huo:
A study of irrelevant variability normalization based training and unsupervised online adaptation for LVCSR. 1357-1360 - Roger Hsiao, Florian Metze, Tanja Schultz:
Improvements to generalized discriminative feature transformation for speech recognition. 1361-1364 - Karel Veselý, Lukás Burget, Frantisek Grézl:
Parallel training of neural networks for speech recognition. 2934-2937 - Rita Singh, Benjamin Lambert, Bhiksha Raj:
The use of sense in unsupervised training of acoustic models for ASR systems. 2938-2941 - Jun Du, Yu Hu, Hui Jiang:
Boosted mixture learning of Gaussian mixture HMMs for speech recognition. 2942-2945 - Volker Leutnant, Reinhold Haeb-Umbach:
On the exploitation of hidden Markov models and linear dynamic models in a hybrid decoder architecture for continuous speech recognition. 2946-2949 - Alberto Abad, Thomas Pellegrini, Isabel Trancoso, João Paulo Neto:
Context dependent modelling approaches for hybrid speech recognizers. 2950-2953 - Yotaro Kubo, Shinji Watanabe, Atsushi Nakamura, Tetsunori Kobayashi:
A regularized discriminative training method of acoustic models derived by minimum relative entropy discrimination. 2954-2957 - Hank Liao, Christopher Alberti, Michiel Bacchiani, Olivier Siohan:
Decision tree state clustering with word and syllable features. 2958-2961 - Hiroshi Fujimura, Takashi Masuko, Mitsuyoshi Tachimori:
A duration modeling technique with incremental speech rate normalization. 2962-2965 - Martin Wöllmer, Yang Sun, Florian Eyben, Björn W. Schuller:
Long short-term memory networks for noise robust speech recognition. 2966-2969 - Tsuneo Nitta, Takayuki Onoda, Masashi Kimura, Yurie Iribe, Kouichi Katsurada:
One-model speech recognition and synthesis based on articulatory movement HMMs. 2970-2973 - Xiaodong Cui, Jian Xue, Pierre L. Dognin, Upendra V. Chaudhari, Bowen Zhou:
Acoustic modeling with bootstrap and restructuring for low-resourced languages. 2974-2977 - Tetsuo Kosaka, Keisuke Goto, Takashi Ito, Masaharu Katoh:
Lecture speech recognition by combining word graphs of various acoustic models. 2978-2981 - Khe Chai Sim, Shilin Liu:
Semi-parametric trajectory modelling using temporally varying feature mapping for speech recognition. 2982-2985 - Dong Yu, Li Deng:
Deep-structured hidden conditional random fields for phonetic recognition. 2986-2989 - Jonathan Malkin, Jeff A. Bilmes:
Semi-supervised learning for improved expression of uncertainty in discriminative classifiers. 2990-2993 - Peder A. Olsen, Vaibhava Goel, Charles A. Micchelli, John R. Hershey:
Modeling posterior probabilities using the linear exponential family. 2994-2997
Spoken Dialogue Systems I, II
- Fabrice Lefèvre, François Mairesse, Steve J. Young:
Cross-lingual spoken language understanding from unaligned data using discriminative classification models and machine translation. 78-81 - Rajesh Balchandran, Leonid Rachevsky, Bhuvana Ramabhadran, Miroslav Novak:
Techniques for topic detection based processing in spoken dialog systems. 82-85 - Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin:
Optimizing spoken dialogue management with fitted value iteration. 86-89 - Filip Jurcícek, Blaise Thomson, Simon Keizer, François Mairesse, Milica Gasic, Kai Yu, Steve J. Young:
Natural belief-critic: a reinforcement algorithm for parameter estimation in statistical spoken dialogue systems. 90-93 - Alexander Schmitt, Michael Scholz, Wolfgang Minker, Jackson Liscombe, David Suendermann:
Is it possible to predict task completion in automated troubleshooters? 94-97 - David Suendermann, Jackson Liscombe, Roberto Pieraccini:
Minimally invasive surgery for spoken dialog systems. 98-101
Spoken Dialogue Systems II
- Ramón López-Cózar, David Griol:
New technique to enhance the performance of spoken dialogue systems based on dialogue states-dependent language models and grammatical rules. 2998-3001 - Lluís F. Hurtado, Joaquin Planells, Encarna Segarra, Emilio Sanchis, David Griol:
A stochastic finite-state transducer approach to spoken dialog management. 3002-3005 - Romain Laroche, Philippe Bretier, Ghislain Putois:
Enhanced monitoring tools and online dialogue optimisation merged into a new spoken dialogue system design experience. 3006-3009 - Romain Laroche, Ghislain Putois, Philippe Bretier:
Optimising a handcrafted dialogue system design. 3010-3013 - Felix Putze, Tanja Schultz:
Utterance selection for speech acts in a cognitive tourguide scenario. 3014-3017 - Gabriel Parent, Maxine Eskénazi:
Lexical entrainment of real users in the Let's Go spoken dialog system. 3018-3021 - Silvia Quarteroni, Meritxell González, Giuseppe Riccardi, Sebastian Varges:
Combining user intention and error modeling for statistical dialog simulators. 3022-3025 - Jaakko Hakulinen, Markku Turunen, Raúl Santos de la Cámara, Nigel T. Crook:
Parallel processing of interruptions and feedback in companions affective dialogue system. 3026-3029 - Antoine Raux, Neville Mehta, Deepak Ramachandran, Rakesh Gupta:
Dynamic language modeling using Bayesian networks for spoken dialog systems. 3030-3033 - Sunao Hara, Norihide Kitaoka, Kazuya Takeda:
Automatic detection of task-incompleted dialog for spoken dialog system based on dialog act n-gram. 3034-3037 - Wei-Bin Liang, Chung-Hsien Wu, Yu-Cheng Hsiao:
Dialogue act detection in error-prone spoken dialogue systems using partial sentence tree and latent dialogue act matrix. 3038-3041 - Tatsuya Kawahara, Kouhei Sumi, Zhi-Qiang Chang, Katsuya Takanashi:
Detection of hot spots in poster conversations based on reactive tokens of audience. 3042-3045 - Yoichi Matsuyama, Shinya Fujie, Hikaru Taniyama, Tetsunori Kobayashi:
Psychological evaluation of a group communication activation robot in a party game. 3046-3049 - Kyoko Matsuyama, Kazunori Komatani, Ryu Takeda, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno:
Analyzing user utterances in barge-in-able spoken dialogue system for improving identification accuracy. 3050-3053 - Mattias Heldner, Jens Edlund, Julia Hirschberg:
Pitch similarity in the vicinity of backchannels. 3054-3057 - Khiet P. Truong, Ronald Poppe, Dirk Heylen:
A rule-based backchannel prediction model using pitch and pause information. 3058-3061
Speech Perception: Factors Influencing Perception
- Paul Boersma, Katerina Chládková:
Detecting categorical perception in continuous discrimination data. 102-105 - Titia Benders, Paola Escudero:
The interrelation between the stimulus range and the number of response categories in vowel categorization. 106-109 - Marie Nilsenová, Martijn Goudbeek, Luuk Kempen:
The relation between pitch perception preference and emotion identification. 110-113 - Takashi Otake, James M. McQueen, Anne Cutler:
Competition in the perception of spoken Japanese words. 114-117 - Makiko Sadakata, Lotte van der Zanden, Kaoru Sekiyama:
Influence of musical training on perception of L2 speech. 118-121 - Donald Derrick, Bryan Gick:
Full body aero-tactile integration in speech perception. 122-125
Prosody: Models
- Tomás Dubeda, Katalin Mády:
Nucleus position within the intonation phrase: a typological study of English, Czech and Hungarian. 126-129 - Yong-cheol Lee, Satoshi Nambu:
Focus-sensitive operator or focus inducer: always and only. 130-133 - Jiahong Yuan, Mark Liberman:
F0 declination in English and Mandarin broadcast news speech. 134-137 - Katrin Schweitzer, Michael Walsh, Bernd Möbius, Hinrich Schütze:
Frequency of occurrence effects on pitch accent realisation. 138-141 - César González Ferreras, Carlos Vivaracho-Pascual, David Escudero Mancebo, Valentín Cardeñoso-Payo:
On the automatic ToBI accent type identification from data. 142-145 - Andrew Rosenberg:
AutoBI - a tool for automatic ToBI annotation. 146-149
Speech Synthesis: Unit Selection and Others
- Volker Strom, Simon King:
A classifier-based target cost for unit selection speech synthesis trained on perceptual data. 150-153 - Wei Zhang, Xiaodong Cui:
Applying scalable phonetic context similarity in unit selection of concatenative text-to-speech. 154-157 - Mitsuaki Isogai, Hideyuki Mizuno:
Speech database reduction method for corpus-based TTS system. 158-161 - Heng Lu, Zhen-Hua Ling, Si Wei, Li-Rong Dai, Ren-Hua Wang:
Automatic error detection for unit selection speech synthesis using log likelihood ratio based SVM classifier. 162-165 - Hanna Silén, Elina Helander, Jani Nurminen, Konsta Koppinen, Moncef Gabbouj:
Using robust Viterbi algorithm and HMM-modeling in unit selection TTS to replace units of poor quality. 166-169 - Yeon-Jun Kim, Marc C. Beutnagel:
Automatic detection of abnormal stress patterns in unit selection synthesis. 170-173 - Daniel Tihelka, Jirí Kala, Jindrich Matousek:
Enhancements of Viterbi search for fast unit selection synthesis. 174-177 - Thomas Ewender, Beat Pfister:
Accurate pitch marking for prosodic modification of speech segments. 178-181 - Shifeng Pan, Meng Zhang, Jianhua Tao:
A novel hybrid approach for Mandarin speech synthesis. 182-185 - Josafá de Jesus Aguiar Pontes, Sadaoki Furui:
Modeling liaison in French by using decision trees. 186-189 - Jian Luan, Jian Li:
Improvement on plural unit selection and fusion. 190-193 - Alok Parlikar, Alan W. Black, Stephan Vogel:
Improving speech synthesis of machine translation output. 194-197 - Ghislain Putois, Jonathan Chevelu, Cédric Boidin:
Paraphrase generation to improve text-to-speech synthesis. 198-201
ASR: Search, Decoding and Confidence Measures I, II
- Chang Woo Han, Shin Jae Kang, Chul Min Lee, Nam Soo Kim:
Phone mismatch penalty matrices for two-stage keyword spotting via multi-pass phone recognizer. 202-205 - Petr Motlícek, Fabio Valente, Philip N. Garner:
English spoken term detection in multilingual recordings. 206-209 - Icksang Han, Chiyoun Park, Jeongmi Cho, Jeongsu Kim:
A hybrid approach to robust word lattice generation via acoustic-based word detection. 210-213 - Volker Steinbiss, Martin Sundermeyer, Hermann Ney:
Direct observation of pruning errors (DOPE): a search analysis tool. 214-217 - David Rybach, Michael Riley:
Direct construction of compact context-dependency transducers from data. 218-221 - Miroslav Novak:
Incremental composition of static decoding graphs with label pushing. 222-225 - Zhanlei Yang, Wenju Liu:
A novel path extension framework using steady segment detection for Mandarin speech recognition. 226-229 - Ralf Schlüter, Markus Nußbaum-Thom, Hermann Ney:
On the relation of Bayes risk, word error, and word posteriors in ASR. 230-233 - David Nolden, Hermann Ney, Ralf Schlüter:
Time conditioned search in automatic speech recognition reconsidered. 234-237 - Satoshi Kobashikawa, Taichi Asami, Yoshikazu Yamaguchi, Hirokazu Masataki, Satoshi Takahashi:
Efficient data selection for speech recognition based on prior confidence estimation using speech and context independent models. 238-241 - Atsunori Ogawa, Atsushi Nakamura:
A novel confidence measure based on marginalization of jointly estimated error cause probabilities. 242-245 - Julien Fayolle, Fabienne Moreau, Christian Raymond, Guillaume Gravier, Patrick Gros:
CRF-based combination of contextual features to improve a posteriori word-level confidence measures. 1942-1945 - Martin Wöllmer, Florian Eyben, Björn W. Schuller, Gerhard Rigoll:
Recognition of spontaneous conversational speech using long short-term memory phoneme predictions. 1946-1949 - Thomas Pellegrini, Isabel Trancoso:
Improving ASR error detection with non-decoder based features. 1950-1953 - Ladan Golipour, Douglas D. O'Shaughnessy:
Phoneme classification and lattice rescoring based on a k-NN approach. 1954-1957 - Jeff A. Bilmes, Hui Lin:
Online adaptive learning for speech recognition decoding. 1958-1961 - Takaaki Hori, Shinji Watanabe, Atsushi Nakamura:
Improvements of search error risk minimization in Viterbi beam search for speech recognition. 1962-1965
Special-Purpose Speech Applications
- Robin Hofe, Stephen R. Ell, Michael J. Fagan, James M. Gilbert, Phil D. Green, Roger K. Moore, Sergey I. Rybchenko:
Evaluation of a silent speech interface based on magnetic sensing. 246-249 - Rubén San Segundo, Verónica López-Ludeña, Raquel Martín, Syaheerah L. Lutfi, Javier Ferreiros, Ricardo de Córdoba, José Manuel Pardo:
Advanced speech communication system for deaf people. 250-253 - Sethserey Sam, Eric Castelli, Laurent Besacier:
Unsupervised acoustic model adaptation for multi-origin non native ASR. 254-257 - Dilek Hakkani-Tür, Dimitra Vergyri, Gökhan Tür:
Speech-based automated cognitive status assessment. 258-261 - Toru Imai, Shinichi Homma, Akio Kobayashi, Takahiro Oku, Shoei Sato:
Speech recognition with a seamlessly updated language model for real-time closed-captioning. 262-265 - Takuya Nishimoto, Takayuki Watanabe:
The comparison between the deletion-based methods and the mixing-based methods for audio CAPTCHA systems. 266-269 - Martine Adda-Decker, Lori Lamel, Natalie D. Snoeren:
Comparing mono- & multilingual acoustic seed models for a low e-resourced language: a case-study of Luxembourgish. 270-273 - R. J. J. H. van Son, Irene Jacobi, Frans J. M. Hilgers:
Manipulating tracheoesophageal speech. 274-277 - David Imseng, Hervé Bourlard, Mathew Magimai-Doss:
Towards mixed language speech recognition systems. 278-281 - Etienne Barnard, Johan Schalkwyk, Charl Johannes van Heerden, Pedro J. Moreno:
Voice search for development. 282-285 - Gina-Anne Levow, Susan Duncan, Edward T. King:
Cross-cultural investigation of prosody in verbal feedback in interactional rapport. 286-289 - Mary Tai Knox, Gerald Friedland:
Multimodal speaker diarization using oriented optical flow histograms. 290-293 - Catherine Middag, Yvan Saeys, Jean-Pierre Martens:
Towards an ASR-free objective analysis of pathological speech. 294-297
Speech Analysis
- Keith W. Godin, John H. L. Hansen:
Session variability contrasts in the MARP corpus. 298-301 - Kazuhiro Kondo, Yusuke Takano:
Estimation of two-to-one forced selection intelligibility scores by speech recognizers using noise-adapted models. 302-305 - Thomas Schaaf, Florian Metze:
Analysis of gender normalization using MLP and VTLN features. 306-309 - Guillaume Aimetti, Roger K. Moore, Louis ten Bosch:
Discovering an optimal set of minimally contrasting acoustic speech units: a point of focus for whole-word pattern matching. 310-313 - Themos Stafylakis, Xavier Anguera:
Improvements to the equal-parameter BIC for speaker diarization. 314-317