


IEEE/ACM Transactions on Audio, Speech and Language Processing, Volume 30, 2022
- Qianying Liu, Wenyu Guan, Sujian Li, Fei Cheng, Daisuke Kawahara, Sadao Kurohashi: RODA: Reverse Operation Based Data Augmentation for Solving Math Word Problems. 1-11
- Kai Zhen, Jongmo Sung, Mi Suk Lee, Seungkwon Beack, Minje Kim: Scalable and Efficient Neural Speech Coding: A Hybrid Design. 12-25
- Sen Yang, Yang Liu, Dawei Feng, Dongsheng Li: Text Generation From Data With Dynamic Planning. 26-34
- Stefan Liebich, Peter Vary: Occlusion Effect Cancellation in Headphones and Hearing Devices - The Sister of Active Noise Cancellation. 35-48
- Zhuosheng Zhang, Haojie Yu, Hai Zhao, Masao Utiyama: Which Apple Keeps Which Doctor Away? Colorful Word Representations With Visual Oracles. 49-59
- Zhenyu Wang, John H. L. Hansen: Multi-Source Domain Adaptation for Text-Independent Forensic Speaker Recognition. 60-75
- Kengtao Zheng, Nankai Lin, Shengyi Jiang: Unsupervised Character Embedding Correction and Candidate Word Denoising. 76-86
- Bing Ma, Haifeng Sun, Jingyu Wang, Qi Qi, Jianxin Liao: Extractive Dialogue Summarization Without Annotation Based on Distantly Supervised Machine Reading Comprehension in Customer Service. 87-97
- Shengcai Liu, Ning Lu, Cheng Chen, Ke Tang: Efficient Combinatorial Optimization for Word-Level Adversarial Textual Attack. 98-111
- Alessandro Terenzi, Nicola Ortolani, Inês Nolasco, Emmanouil Benetos, Stefania Cecchi: Comparison of Feature Extraction Methods for Sound-Based Classification of Honey Bee Activity. 112-122
- Shuiyang Mao, P. C. Ching, Tan Lee: Enhancing Segment-Based Speech Emotion Recognition by Iterative Self-Learning. 123-134
- Abdolreza Sabzi Shahrebabaki, Giampiero Salvi, Torbjørn Svendsen, Sabato Marco Siniscalchi: Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models. 135-147
- Javier Jorge, Adrià Giménez, Joan Albert Silvestre-Cerdà, Jorge Civera, Alberto Sanchís, Alfons Juan: Live Streaming Speech Recognition Using Deep Bidirectional LSTM Acoustic Models and Interpolated Language Models. 148-161
- P. V. Muhammed Shifas, Catalin Zorila, Yannis Stylianou: End-to-End Neural Based Modification of Noisy Speech for Speech-in-Noise Intelligibility Improvement. 162-173
- Joon-Young Yang, Joon-Hyuk Chang: VACE-WPE: Virtual Acoustic Channel Expansion Based on Neural Networks for Weighted Prediction Error-Based Speech Dereverberation. 174-189
- Chenpeng Du, Kai Yu: Phone-Level Prosody Modelling With GMM-Based MDN for Diverse and Controllable Speech Synthesis. 190-201
- Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-Yi Lee: Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning. 202-217
- Mixiao Hou, Zheng Zhang, Qi Cao, David Zhang, Guangming Lu: Multi-View Speech Emotion Recognition Via Collective Relation Construction. 218-229
- Da-Rong Liu, Po-Chun Hsu, Yi-Chen Chen, Sung-Feng Huang, Shun-Po Chuang, Da-Yi Wu, Hung-yi Lee: Learning Phone Recognition From Unpaired Audio and Phone Sequences Based on Generative Adversarial Network. 230-243
- Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu: Word-Region Alignment-Guided Multimodal Neural Machine Translation. 244-259
- Zhuosheng Zhang, Yiqing Zhang, Hai Zhao: Syntax-Aware Multi-Spans Generation for Reading Comprehension. 260-268
- Pengfei Zhu, Zhuosheng Zhang, Hai Zhao, Xiaoguang Li: DUMA: Reading Comprehension With Transposition Thinking. 269-279
- Jiayuan Xie, Ningxin Peng, Yi Cai, Tao Wang, Qingbao Huang: Diverse Distractor Generation for Constructing High-Quality Multiple Choice Questions. 280-291
- Jie Zhang, Guanghui Zhang: A Parametric Unconstrained Beamformer Based Binaural Noise Reduction for Assistive Hearing. 292-304
- Luca Turchet, Johan Pauwels: Music Emotion Recognition: Intention of Composers-Performers Versus Perception of Musicians, Non-Musicians, and Listening Machines. 305-316
- Wenxin Hou, Han Zhu, Yidong Wang, Jindong Wang, Tao Qin, Renjun Xu, Takahiro Shinozaki: Exploiting Adapters for Cross-Lingual Low-Resource Speech Recognition. 317-329
- Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita: Integrating Prior Translation Knowledge Into Neural Machine Translation. 330-339
- Keqi Deng, Gaofeng Cheng, Runyan Yang, Yonghong Yan: Alleviating ASR Long-Tailed Problem by Decoupling the Learning of Representation and Classification. 340-354
- Zuchao Li, Junru Zhou, Hai Zhao, Kevin Parnow: HPSG-Inspired Joint Neural Constituent and Dependency Parsing in O($n^3$) Time Complexity. 355-366
- Xuan Shi, Erica Cooper, Junichi Yamagishi: Use of Speaker Recognition Approaches for Learning and Evaluating Embedding Representations of Musical Instrument Sounds. 367-377
- Zengwei Yao, Wenjie Pei, Fanglin Chen, Guangming Lu, David Zhang: Stepwise-Refining Speech Separation Network via Fine-Grained Encoding in High-Order Latent Domain. 378-393
- Yanmin Qian, Zhikai Zhou: Optimizing Data Usage for Low-Resource Speech Recognition. 394-403
- Narla John Metilda Sagaya Mary, Srinivasan Umesh, Sandesh Varadaraju Katta: S-Vectors and TESA: Speaker Embeddings and a Speaker Authenticator Based on Transformer Encoder. 404-413
- Bengt J. Borgström: Bayesian Estimation of PLDA in the Presence of Noisy Training Labels, With Applications to Speaker Verification. 414-428
- Menglong Lu, Zhen Huang, Binyang Li, Yunxiang Zhao, Zheng Qin, Dong Sheng Li: SIFTER: A Framework for Robust Rumor Detection. 429-442
- Lantian Li, Dong Wang, Jiawen Kang, Renyu Wang, Jing Wu, Zhendong Gao, Xiao Chen: A Principle Solution for Enroll-Test Mismatch in Speaker Recognition. 443-455
- Feiran Yang: Analysis of Deficient-Length Partitioned-Block Frequency-Domain Adaptive Filters. 456-467
- Hui Jiang, Linfeng Song, Yubin Ge, Fandong Meng, Junfeng Yao, Jinsong Su: An AST Structure Enhanced Decoder for Code Generation. 468-476
- Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi: Optimizing Tandem Speaker Verification and Anti-Spoofing Systems. 477-488
- Xin Ni, Jia Ren: FC-U2-Net: A Novel Deep Neural Network for Singing Voice Separation. 489-494
- Neil Zeghidour, Alejandro Luebs, Ahmed Omran, Jan Skoglund, Marco Tagliasacchi: SoundStream: An End-to-End Neural Audio Codec. 495-507
- Wageesha Manamperi, Thushara D. Abhayapala, Jihui Zhang, Prasanga N. Samarasinghe: Drone Audition: Sound Source Localization Using On-Board Microphones. 508-519
- Qian Li, Hao Peng, Jianxin Li, Jia Wu, Yuanxing Ning, Lihong Wang, Philip S. Yu, Zheng Wang: Reinforcement Learning-Based Dialogue Guided Event Extraction to Exploit Argument Relations. 520-533
- Santiago Ruiz, Toon van Waterschoot, Marc Moonen: Distributed Combined Acoustic Echo Cancellation and Noise Reduction in Wireless Acoustic Sensor and Actuator Networks. 534-547
- Lukas Grinewitschus, Peter Jung: The Harmonic Shift Algorithm for Efficient Multi-Pitch Detection. 548-561
- Ziyao Lu, Xiang Li, Yang Liu, Chulun Zhou, Jianwei Cui, Bin Wang, Min Zhang, Jinsong Su: Exploring Multi-Stage Information Interactions for Multi-Source Neural Machine Translation. 562-570
- Jingxuan Yang, Si Li, Sheng Gao, Jun Guo: CorefDPR: A Joint Model for Coreference Resolution and Dropped Pronoun Recovery in Chinese Conversations. 571-581
- Timuçin Berk Atalay, Zühre Sü Gül, Enzo De Sena, Zoran Cvetkovic, Hüseyin Hacihabiboglu: Scattering Delay Network Simulator of Coupled Volume Acoustics. 582-593
- Yi Zhang, Lei Li, Yunfang Wu, Qi Su, Xu Sun: Alleviating the Knowledge-Language Inconsistency: A Study for Deep Commonsense Knowledge. 594-604
- Ke Tan, Zhong-Qiu Wang, DeLiang Wang: Neural Spectrospatial Filtering. 605-621
- Qianren Mao, Jianxin Li, Chenghua Lin, Congwen Chen, Hao Peng, Lihong Wang, Philip S. Yu: Adaptive Pre-Training and Collaborative Fine-Tuning: A Win-Win Strategy to Improve Review Analysis Tasks. 622-634
- Zifeng Cheng, Zhiwei Jiang, Yafeng Yin, Cong Wang, Qing Gu: Learning to Classify Open Intent via Soft Labeling and Manifold Mixup. 635-645
- Xiaochun An, Frank K. Soong, Lei Xie: Disentangling Style and Speaker Attributes for TTS Style Transfer. 646-658
- Zhuang Chen, Tieyun Qian: Retrieve-and-Edit Domain Adaptation for End2End Aspect Based Sentiment Analysis. 659-672
- Jian Liu, Mengshi Yu, Yufeng Chen, Jinan Xu: Cross-Domain Slot Filling as Machine Reading Comprehension: A New Perspective. 673-685
- Yongkang Liu, Qingbao Huang, Jing Li, Linzhang Mo, Yi Cai, Qing Li: SSAP: Storylines and Sentiment Aware Pre-Trained Model for Story Ending Generation. 686-694
- Ying Zhou, Xuefeng Liang, Yu Gu, Yifei Yin, Longshan Yao: Multi-Classifier Interactive Learning for Ambiguous Speech Emotion Recognition. 695-705
- Poul Hoang, Jan Mark de Haan, Zheng-Hua Tan, Jesper Jensen: Multichannel Speech Enhancement With Own Voice-Based Interfering Speech Suppression for Hearing Assistive Devices. 706-720
- Weijie Yu, Chen Xu, Jun Xu, Liang Pang, Ji-Rong Wen: Distribution Distance Regularized Sequence Representation for Text Matching in Asymmetrical Domains. 721-733
- Heming Wang, DeLiang Wang: Neural Cascade Architecture With Triple-Domain Loss for Speech Enhancement. 734-743
- Riccardo R. De Lucia, Antonio Canclini, Fabio Antonacci, Augusto Sarti: Group Dictionary Equivalent Source Method for Sparse Nearfield Acoustic Holography. 744-757
- Tong Ma, Ying Wei, Xin Lou: Reconfigurable Nonuniform Filter Bank for Hearing Aid Systems. 758-771
- Victoria Mingote, Antonio Miguel, Dayana Ribas, Alfonso Ortega, Eduardo Lleida: aDCF Loss Function for Deep Metric Learning in End-to-End Text-Dependent Speaker Verification Systems. 772-784
- Quansheng Tu, Huawei Chen: Theoretical Lower Bounds on the Performance of the First-Order Differential Microphone Arrays With Sensor Imperfections. 785-801
- Taihui Wang, Feiran Yang, Jun Yang: Convolutive Transfer Function-Based Multichannel Nonnegative Matrix Factorization for Overdetermined Blind Source Separation. 802-815
- Yi Zhang, Guangyou Zhou, Zhiwen Xie, Jimmy Xiangji Huang: HGEN: Learning Hierarchical Heterogeneous Graph Encoding for Math Word Problem Solving. 816-828
- Eduardo Fonseca, Xavier Favory, Jordi Pons, Frederic Font, Xavier Serra: FSD50K: An Open Dataset of Human-Labeled Sound Events. 829-852
- Yi Lei, Shan Yang, Xinsheng Wang, Lei Xie: MsEmoTTS: Multi-Scale Emotion Transfer, Prediction, and Control for Emotional Speech Synthesis. 853-864
- Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen: NeuralDPS: Neural Deterministic Plus Stochastic Model With Multiband Excitation for Noise-Controllable Waveform Generation. 865-878
- Simon Stone, Yingming Gao, Peter Birkholz: Articulatory Synthesis of Vocalized /r/ Allophones in German. 879-889
- Prashant Serai, Vishal Sunder, Eric Fosler-Lussier: Hallucination of Speech Recognition Errors With Sequence to Sequence Learning. 890-900
- Bin Wu, Sakriani Sakti, Jinsong Zhang, Satoshi Nakamura: Modeling Unsupervised Empirical Adaptation by DPGMM and DPGMM-RNN Hybrid Model to Extract Perceptual Features for Low-Resource ASR. 901-916
- Mi Zhang, Tieyun Qian, Bing Liu: Exploit Feature and Relation Hierarchy for Relation Extraction. 917-930
- Wenxiang Jiao, Xing Wang, Shilin He, Zhaopeng Tu, Irwin King, Michael R. Lyu: Exploiting Inactive Examples for Natural Language Generation With Data Rejuvenation. 931-943
- Youzhi Tu, Man-Wai Mak: Aggregating Frame-Level Information in the Spectral Domain With Self-Attention for Speaker Embedding. 944-957
- Zhixing Tan, Zeyuan Yang, Meng Zhang, Qun Liu, Maosong Sun, Yang Liu: Dynamic Multi-Branch Layers for On-Device Neural Machine Translation. 958-967
- Weiwei Lin, Man-Wai Mak: Mixture Representation Learning for Deep Speaker Embedding. 968-978
- Peng Zhu, Dawei Cheng, Fangzhou Yang, Yifeng Luo, Dingjiang Huang, Weining Qian, Aoying Zhou: Improving Chinese Named Entity Recognition by Large-Scale Syntactic Dependency Graph. 979-991
- Xiaobo Liang, Lijun Wu, Juntao Li, Tao Qin, Min Zhang, Tie-Yan Liu: Multi-Teacher Distillation With Single Model for Neural Machine Translation. 992-1002
- Xiaofeng Chen, Guohua Wang, Haopeng Ren, Yi Cai, Ho-fung Leung, Tao Wang: Task-Adaptive Feature Fusion for Generalized Few-Shot Relation Classification in an Open World Environment. 1003-1015
- Yu-Chen Lin, Cheng Yu, Yi-Te Hsu, Szu-Wei Fu, Yu Tsao, Tei-Wei Kuo: SEOFP-NET: Compression and Acceleration of Deep Neural Networks for Speech Enhancement Using Sign-Exponent-Only Floating-Points. 1016-1031
- Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Naoyuki Kamo, Shoko Araki: Switching Independent Vector Analysis and its Extension to Blind and Spatially Guided Convolutional Beamforming Algorithms. 1032-1047
- Jianhua Geng, Sifan Wang, Qinglai Liu, Xin Lou: Multi-Level Time-Frequency Bins Selection for Direction of Arrival Estimation Using a Single Acoustic Vector Sensor. 1048-1060
- Qinzhuo Wu, Qi Zhang, Xuanjing Huang: Automatic Math Word Problem Generation With Topic-Expression Co-Attention Mechanism and Reinforcement Learning. 1061-1072
- Michael Nigro, Sridhar Krishnan: Multimodal System for Audio Scene Source Counting and Analysis. 1073-1082
- Yishu Peng, Sheng Zhang, Jiashu Zhang, Wei Xing Zheng: Combined-Sample Multiband-Structured Subband Filtering Algorithms. 1083-1092
- Shoukang Hu, Xurong Xie, Mingyu Cui, Jiajun Deng, Shansong Liu, Jianwei Yu, Mengzhe Geng, Xunying Liu, Helen Meng: Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks. 1093-1107
- Xudong Dang, Wen Ma, Emanuël A. P. Habets, Hongyan Zhu: TDOA-Based Robust Sound Source Localization With Sparse Regularization in Wireless Acoustic Sensor Networks. 1108-1123
- Shan Gao, Jing Lin, Xihong Wu, Tianshu Qu: Sparse DNN Model for Frequency Expanding of Higher Order Ambisonics Encoding Process. 1124-1135
- Giovanni Pepe, Leonardo Gabrielli, Stefano Squartini, Carlo Tripodi, Nicolo Strozzi: Deep Optimization of Parametric IIR Filters for Audio Equalization. 1136-1149
- Moa Lee, Junmo Lee, Joon-Hyuk Chang: Non-Autoregressive Fully Parallel Deep Convolutional Neural Speech Synthesis. 1150-1159
- Liam Barrett, Junchao Hu, Peter Howell: Systematic Review of Machine Learning Approaches for Detecting Developmental Stuttering. 1160-1172
- Sang-Hoon Lee, Hyeong-Rae Noh, Woo-Jeoung Nam, Seong-Whan Lee: Duration Controllable Voice Conversion via Phoneme-Based Information Bottleneck. 1173-1183
- Zhihong Shao, Zhongqin Wu, Minlie Huang: AdvExpander: Generating Natural Language Adversarial Examples by Expanding Text. 1184-1196
- Dhanunjaya Varma Devalraju, Padmanabhan Rajan: Multiview Embeddings for Soundscape Classification. 1197-1206
- Chengyu Wang, Suyang Dai, Yipeng Wang, Fei Yang, Minghui Qiu, Kehan Chen, Wei Zhou, Jun Huang: ARoBERT: An ASR Robust Pre-Trained Language Model for Spoken Language Understanding. 1207-1218
- Jonah Ong, Ba-Tuong Vo, Sven Nordholm, Ba-Ngu Vo, Diluka Moratuwage, Changbeom Shim: Audio-Visual Based Online Multi-Source Separation. 1219-1234
- Leyang Cui, Yafu Li, Yue Zhang: Label Attention Network for Structured Prediction. 1235-1248
- Sarinah Sutojo, Tobias May, Steven van de Par: Segmentation of Multitalker Mixtures Based on Local Feature Contrasts and Auditory Glimpses. 1249-1262
- Hao Gao, Xuelei Feng, Yong Shen: Weighted Loudspeaker Placement Method for Sound Field Reproduction. 1263-1276
- Gongping Huang, Jacob Benesty, Israel Cohen, Jingdong Chen: Kronecker Product Multichannel Linear Filtering for Adaptive Weighted Prediction Error-Based Speech Dereverberation. 1277-1289
- Takehiro Sugimoto: Loudness-Level-Chasing Algorithm for Multiformat Live Audio Production. 1290-1304
- Junshuang Wu, Richong Zhang, Yongyi Mao, Jinpeng Huai: Dealing With Hierarchical Types and Label Noise in Fine-Grained Entity Typing. 1305-1318
- Anton Ragni, Mark J. F. Gales, Oliver Rose, Katherine M. Knill, Alexandros Kastanos, Qiujia Li, Preben Ness: Increasing Context for Estimating Confidence Scores in Automatic Speech Recognition. 1319-1329
- Zhongxin Bai, Jianyu Wang, Xiao-Lei Zhang, Jingdong Chen: End-to-End Speaker Verification via Curriculum Bipartite Ranking Weighted Binary Cross-Entropy. 1330-1344
- Shang-Yi Chuang, Hsin-Min Wang, Yu Tsao: Improved Lite Audio-Visual Speech Enhancement. 1345-1359
- Gaofeng Cheng, Haoran Miao, Runyan Yang, Keqi Deng, Yonghong Yan: ETEH: Unified Attention-Based End-to-End ASR and KWS Architecture. 1360-1373
- Ashutosh Pandey, DeLiang Wang: Self-Attending RNN for Speech Enhancement to Improve Cross-Corpus Generalization. 1374-1385
- Di Jin, Shuyang Gao, Seokhwan Kim, Yang Liu, Dilek Hakkani-Tür: Towards Textual Out-of-Domain Detection Without In-Domain Labels. 1386-1395
- K. Mrinalini, P. Vijayalakshmi, T. Nagarajan: SBSim: A Sentence-BERT Similarity-Based Evaluation Metric for Indian Language Neural Machine Translation Systems. 1396-1406
- Changhong Wang, Emmanouil Benetos, Vincent Lostanlen, Elaine Chew: Adaptive Scattering Transforms for Playing Technique Recognition. 1407-1421
- Danwei Cai, Weiqing Wang, Ming Li: Incorporating Visual Information in Audio Based Self-Supervised Speaker Recognition. 1422-1435
- Yu Luo, Lina Pu: EC-ANC: Edge Case-Enhanced Active Noise Cancellation for True Wireless Stereo Earbuds. 1436-1447
- Tao Li, Xinsheng Wang, Qicong Xie, Zhichao Wang, Lei Xie: Cross-Speaker Emotion Disentangling and Transfer for End-to-End Speech Synthesis. 1448-1460