ICASSP 2024: Seoul, Korea
- IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2024, Seoul, Republic of Korea, April 14-19, 2024. IEEE 2024, ISBN 979-8-3503-4485-1
- Jiwei Shen, Hu Lu, Hao Zhang, Shujing Lyu, Yue Lu:
Enhanced Deep Reinforcement Learning for Parcel Singulation in Non-Stationary Environments. 1-5 - Yaowei Li, Yating Liu, Xuxin Cheng, Zhihong Zhu, Hongxiang Li, Bang Yang, Zhiqi Huang:
KC-Prompt: End-To-End Knowledge-Complementary Prompting for Rehearsal-Free Continual Learning. 1-5 - Miao Jiang, Min Li, Junxing Ren, Weiqing Huang:
HOICS: Zero-Shot HOI Detection via Compatibility Self-Learning. 1-5 - Xuanhao Zhang, Hui Kou, Chenjie Xia, Hao Cai, Bo Liu:
Small-Footprint Automatic Speech Recognition System using Two-Stage Transfer Learning based Symmetrized Ternary Weight Network. 1-5 - Zhenjiao Liu, Xiao Wang, Xiaodi Huang, Guanlin Li, Ke Sun, Zhikui Chen:
Incomplete Multi-View Representation Learning Through Anchor Graph-Based GCN and Information Bottleneck. 1-5 - Samuel Fernández-Menduiña, J. Rapp, Hassan Mansour, M. Greiff, Kieran Parsons:
Tracking Beyond the Unambiguous Range with Modulo Single-Photon Lidar. 6-10 - Yhonatan Kvich, Yonina C. Eldar:
Modulo Sampling and Recovery in Shift-Invariant Spaces. 11-15 - Chaoqun Gong, Yuqin Dai, Ronghui Li, Achun Bao, Jun Li, Jian Yang, Yachao Zhang, Xiu Li:
Text2Avatar: Text to 3D Human Avatar Generation with Codebook-Driven Body Controllable Attribute. 16-20 - Tao Chen, Minxing Li, Ziming Liu:
The Joint Grid-Free DOA and Polarization Estimation Algorithm based on Atomic Norm Minimization. 21-25 - Shaolei Feng, Xiaoguang Lu, Deshana Kaushal Desai, Lei Guan:
A Learning-Based System for Automatic Intentional Non-Adherence Detection from Dosing Videos. 26-30 - Jingqing Ruan, Runpeng Xie, Xuantang Xiong, Shuang Xu, Bo Xu:
MaDE: Multi-Scale Decision Enhancement for Multi-Agent Reinforcement Learning. 31-35 - Yuanbo Wen, Tao Gao, Ziqi Li, Jing Zhang, Ting Chen:
Encoder-Minimal and Decoder-Minimal Framework for Remote Sensing Image Dehazing. 36-40 - Tao Chen, Qi An, Minxing Li:
An Error Self-Corrected DOA Estimation Model for Sparse Array Based on ANM. 41-45 - Yijia Zhang, Deepak Mishra, Hassan Habibi Gharakheili, Derrick Wing Kwan Ng:
UAV Operation Time Minimization for Wireless-Powered Data Collection. 46-50 - Christophe El Zeinaty, Glenn Herrou, Wassim Hamidouche, Daniel Ménard:
Dicetrack: Lightweight Dice Classification on Resource-Constrained Platforms with Optimized Deep Learning Models. 51-55 - Kaiyuan Hu, Hongjie Liao, Mingxiao Li, Fangxin Wang:
MMCOUNT: Stationary Crowd Counting System Based on Commodity Millimeter-Wave Radar. 56-60 - Zirui Wan, Saeid Sanei:
Crowd Modeling and Control Via Cooperative Adaptive Filtering. 61-65 - Pavlo Hilei, Marian Petruk, Ievgen Korotkyi, Oleg Farenyuk:
Deep Learning AMR Model Inference Acceleration with CFU for Edge Systems. 66-70 - Masahito Togami, Jean-Marc Valin, Karim Helwani, Ritwik Giri, Umut Isik, Michael M. Goodwin:
Real-Time Stereo Speech Enhancement with Spatial-Cue Preservation Based on Dual-Path Structure. 71-75 - Deeksha Chandola, Enas Altarawneh, Michael Jenkin, Manos Papagelis:
SERC-GCN: Speech Emotion Recognition In Conversation Using Graph Convolutional Networks. 76-80 - Tenghao Cai, Lei Li, Tsung-Hui Chang:
Sensing-Assisted Distributed User Scheduling and Beamforming in Multi-Cell mmWave Networks. 81-85 - Jiayuan Gao, Yingwei Zhang, Yiqiang Chen, Tengxiang Zhang, Boshi Tang, Xiaoyu Wang:
Unsupervised Human Activity Recognition Via Large Language Models and Iterative Evolution. 91-95 - Tao Chen, Ziming Liu, Lei Zhan:
ANM-Based Source Localization Under Mixed Field. 96-100 - Ran Wang, Jing Sun, Cheng Xu, Ruixue Li, Shihong Duan, Xiaotong Zhang:
Reinforcement Learning Compensated Filter for Multi-Agents Cooperative Localization. 101-105 - Entong He, Yuxiang Yang, Chenshu Wu:
Quantum Ranging Enhanced TDoA Localization. 106-110 - Haoyu Wang, Jinbo Chen, Dongheng Zhang, Zhi Lu, Changwei Wu, Yang Hu, Qibin Sun, Yan Chen:
Contactless Radar Heart Rate Variability Monitoring Via Deep Spatio-Temporal Modeling. 111-115 - Nikolaos Palaiodimopoulos, Vítor Fortes Rey, Matthias Tschöpe, Christina Jörg, Paul Lukowicz, Maximilian Kiefer-Emmanouilidis:
Quantum Inspired Image Augmentation Applicable to Waveguides and Optical Image Transfer Via Anderson Localization. 116-120 - Anestis Kaimakamidis, Ioannis Pitas:
Political Tweet Sentiment Analysis for Public Opinion Polling. 121-125 - V. R. J. Deville, C. M. Lievers, Jonathan H. Manton:
Enhanced Axle-Based Vehicle Classification Using Angle-Based Micro-Doppler Signature. 126-130 - Su Fong Chien, David Chieng, Samuel Y. C. Chen, Charilaos C. Zarakovitis, Heng Siong Lim, Y. H. Xu:
Applying Hybrid Quantum LSTM for Indoor Localization Based on RSSI. 131-135 - Hengxi Zhang, Zhendong Shi, Yuanquan Hu, Wenbo Ding, Ercan E. Kuruoglu, Xiao-Ping Zhang:
Optimizing Trading Strategies in Quantitative Markets Using Multi-Agent Reinforcement Learning. 136-140 - Yan Zhang, Xin Liu, Zuping Zhang:
Motif-Matching Based Sub-Braingraph Level Networks for Noisy Resting-State fMRI Analysis. 141-145 - Judith Herrmann, Raphael Kunert, Ron Hachmon, Aviv Markus, Allison Gunby-Mann, Sarel Cohen, Tobias Friedrich, Peter Chin:
Detecting Continuous Gravitational Waves Using Generated Training Data. 146-150 - Titan Yuan, Filip Maksimovic, David C. Burnett, Kristofer S. J. Pister:
Hardware-Limited Time Constant Estimation Using a Weighted Linear Regression. 151-155 - Kunwar Pritiraj Rajput, Linlong Wu, M. R. Bhavani Shankar, Pramod K. Varshney:
Joint Transmit Precoders and Passive Reflection Beamformer Design in IRS-Aided IoT Networks. 156-160 - Zhiqiang Zhou, Linxiao Yang, Qingsong Wen, Liang Sun:
RobustTSVar: A Robust Time Series Variance Estimation Algorithm. 161-165 - Xu Wang, Dongheng Zhang, Fengquan Zhan, Xuecheng Xie, Pengcheng Huang, Yang Hu, Yan Chen:
RoFi: Robust WiFi Intrusion Detection via Distribution Matching. 166-170 - Wuxia Hu, Yang Yang, Yonina C. Eldar, Chunyan Feng, Caili Guo:
Digital Task-Oriented Communication with Hardware-Limited Task-Based Quantization. 171-175 - Shuai Yang, Dongheng Zhang, Jinbo Chen, Fang Zhou, Guanzhong Wang, Qibin Sun, Yan Chen:
Automotive Radar Interference Mitigation Via SINR Maximization. 176-180 - Keshab K. Parhi:
A Low-Latency FFT-IFFT Cascade Architecture. 181-185 - Seyed Ali Ghazi Asgar, Kaan Sel, Anando Paul, Roderic I. Pettigrew, Roozbeh Jafari:
Cuffless Blood Pressure Estimation Using Magnetic Flux In A Ring Form Factor. 186-190 - Xuantang Xiong, Linghui Meng, Jingqing Ruan, Shuang Xu, Bo Xu:
UNeC: Unsupervised Exploring In Controllable Space. 191-195 - Jia-Yu Yang, Chih-I Ho, Pei-Yun Tsai, Hung-Ju Lin, Tzung-Dau Wang:
MAML-Based 24-Hour Personalized Blood Pressure Estimation from Wrist Photoplethysmography Signals in Free-Living Context. 196-200 - Shuyi Ren, Beichen Huang, Xiaoyang Li, Kaiming Shen:
Aerial-IRS-Assisted Load Balancing In Downlink Networks. 201-205 - Yu-Min Chiu, Ching-Te Chiu, Dao-Heng Luo:
Multi-Layer Relation Knowledge Distillation For Fingerprint Restoration. 206-210 - Toivo Henningson, Stefan Ingi Adalbjörnsson, Anders Berkeman, Carl Drougge, Xavante Erickson, Alexander Hunt:
A Concept for a SLAM Back End Hardware Accelerator. 211-215 - Ganlin Zhang, Dongheng Zhang, Hongyu Deng, Yun Wu, Fengquan Zhan, Yan Chen:
Practical Challenge and Solution for IRS-Aided Indoor Localization System. 216-220 - Qu Yang, Qianhui Liu, Nan Li, Meng Ge, Zeyang Song, Haizhou Li:
SVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks. 221-225 - Zeyang Song, Jibin Wu, Malu Zhang, Mike Zheng Shou, Haizhou Li:
Spiking-Leaf: A Learnable Auditory Front-End for Spiking Neural Networks. 226-230 - Zheng Si, Chao Liu, Jianyu Liu, Yinhao Zhou:
Application of SNNS Model Based On Multi-Dimensional Attention In Drone Radio Frequency Signal Classification. 231-235 - Yize Sun, Jiarui Liu, Yunpu Ma, Volker Tresp:
Differentiable Quantum Architecture Search For Job Shop Scheduling Problem. 236-240 - Peichao Wang, Qian He:
Low-Complexity GLRT Based Quickest Detection With Unknown Parameters. 241-245 - Irtaza Shahid, Khaldoon Al-Naimi, Ting Dang, Yang Liu, Fahim Kawsar, Alessandro Montanari:
Towards Enabling DPOAE Estimation on Single-Speaker Earbuds. 246-250 - Bo Han, Liangjian Han:
Efficient 3D Position Estimation in Badminton Scene. 251-255 - Kevin Wilkinghoff, Keisuke Imoto:
F1-EV score: Measuring The Likelihood of Estimating a Good Decision Threshold for Semi-Supervised Anomaly Detection. 256-260 - Xinlei Niu, Jing Zhang, Christian Walder, Charles Patrick Martin:
SoundLoCD: An Efficient Conditional Discrete Contrastive Latent Diffusion Model for Text-to-Sound Generation. 261-265 - Christopher Hahne, Michel Hayoz, Raphael Sznitman:
StofNet: Super-Resolution Time of Flight Network. 266-270 - Yiming Li, Xiangdong Wang, Hong Liu, Rui Tao, Long Yan, Kazushige Ouchi:
Semi-Supervised Sound Event Detection with Local and Global Consistency Regularization. 271-275 - Kevin Wilkinghoff:
Self-Supervised Learning for Anomalous Sound Detection. 276-280 - Yushu Wu, Xiao Quan, Mohammad Rasool Izadi, Chuan-Che Jeff Huang:
"It is Okay to be Uncommon": Quantizing Sound Event Detection Networks on Hardware Accelerators with Uncommon Sub-Byte Support. 281-285 - Shansong Liu, Atin Sakkeer Hussain, Chenshuo Sun, Ying Shan:
Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning. 286-290 - Heinrich Dinkel, Yongqing Wang, Zhiyong Yan, Junbo Zhang, Yujun Wang:
CED: Consistent Ensemble Distillation for Audio Tagging. 291-295 - Ali Gökçe, Hüseyin Hacihabiboglu:
Semi-Blind Estimation of Direct-to-Reverberant Energy Ratio Using Residual Energy Test Statistics. 296-300 - Haojie Wei, Xueke Cao, Wenbo Xu, Tangpeng Dan, Yueguo Chen:
DJCM: A Deep Joint Cascade Model for Singing Voice Separation and Vocal Pitch Estimation. 301-305 - Rhiannon Mogridge, George Close, Robert Sutherland, Thomas Hain, Jon Barker, Stefan Goetze, Anton Ragni:
Non-Intrusive Speech Intelligibility Prediction for Hearing-Impaired Users Using Intermediate ASR Features and Human Memory Models. 306-310 - Jiayi Zhang, Rita Singh:
Vocal Fold Dynamics for Automatic Detection of Amyotrophic Lateral Sclerosis from Voice. 311-315 - Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-Weon Jung, François G. Germain, Jonathan Le Roux, Shinji Watanabe:
Improving Audio Captioning Models with Fine-Grained Audio Features, Text Embedding Supervision, and LLM Mix-Up Augmentation. 316-320 - Yoshihide Tomita, Shoichi Koyama, Hiroshi Saruwatari:
Localizing Acoustic Energy in Sound Field Synthesis by Directionally Weighted Exterior Radiation Suppression. 321-325 - Jia Qi Yip, Shengkui Zhao, Yukun Ma, Chongjia Ni, Chong Zhang, Hao Wang, Trung Hieu Nguyen, Kun Zhou, Dianwen Ng, Eng Siong Chng, Bin Ma:
SPGM: Prioritizing Local Features for Enhanced Speech Separation Performance. 326-330 - Mahesh Kumar Nandwana, Yifan He, Joseph Liu, Xiao Yu, Charles Shang, Eloi du Bois, Morgan McGuire, Kiran Bhat:
Voice Toxicity Detection Using Multi-Task Learning. 331-335 - Benjamin Elizalde, Soham Deshmukh, Huaming Wang:
Natural Language Supervision For General-Purpose Audio Representations. 336-340 - Yao Qiu, Jinchao Zhang, Yong Shan, Jie Zhou:
Enhancing Note-Level Singing Transcription Model with Unlabeled and Weakly Labeled Data. 341-345 - Yo Sasaki, Yasushige Nakayama:
Simultaneous Interior and Exterior Sound Field Synthesis Using Cylindrical and Spherical Loudspeaker Arrays. 346-350 - George Close, William Ravenscroft, Thomas Hain, Stefan Goetze:
Multi-CMGAN+/+: Leveraging Multi-Objective Speech Quality Metric Prediction for Speech Enhancement. 351-355 - Johannes Zeitler, Michael Krause, Meinard Müller:
Soft Dynamic Time Warping with Variable Step Weights. 356-360 - Yi-Chiao Wu, Dejan Markovic, Steven Krenn, Israel D. Gebru, Alexander Richard:
ScoreDec: A Phase-Preserving High-Fidelity Audio Codec with a Generalized Score-Based Diffusion Post-Filter. 361-365 - Ali Vosoughi, Luca Bondi, Ho-Hsiang Wu, Chenliang Xu:
Learning Audio Concepts from Counterfactual Natural Language. 366-370 - Soham Deshmukh, Benjamin Elizalde, Dimitra Emmanouilidou, Bhiksha Raj, Rita Singh, Huaming Wang:
Training Audio Captioning Models without Audio. 371-375 - Pranay Manocha, Donald Williamson, Adam Finkelstein:
Corn: Co-Trained Full- and No-Reference Speech Quality Assessment. 376-380 - Jozef Coldenhoff, Andrew Harper, Paul Kendrick, Tijana Stojkovic, Milos Cernak:
Multi-Channel MOSRA: Mean Opinion Score and Room Acoustics Estimation Using Simulated Data and A Teacher Model. 381-385 - Idan Cohen, Sharon Gannot, Ofir Lindenbaum:
Unsupervised Acoustic Scene Mapping Based on Acoustic Features and Dimensionality Reduction. 386-390 - Manuel Milling, Andreas Triantafyllopoulos, Iosif Tsangko, Simon David Noel Rampp, Björn Wolfgang Schuller:
Bringing the Discussion of Minima Sharpness to the Audio Domain: A Filter-Normalised Evaluation for Acoustic Scene Classification. 391-395 - Chih-Cheng Chang, Li Su:
Beast: Online Joint Beat and Downbeat Tracking Based on Streaming Transformer. 396-400 - Yiqun Zhang, Xinmeng Xu, Weiping Tu:
Improving Acoustic Echo Cancellation by Exploring Speech and Echo Affinity with Multi-Head Attention. 401-405 - Pavan Seshadri, Chaeyeon Han, Bon-Woo Koo, Noah Posner, Subhrajit Guhathakurta, Alexander Lerch:
ASPED: An Audio Dataset for Detecting Pedestrians. 406-410 - Yuki Okamoto, Keisuke Imoto, Shinnosuke Takamichi, Ryotaro Nagase, Takahiro Fukumori, Yoichi Yamashita:
Environmental Sound Synthesis from Vocal Imitations and Sound Event Labels. 411-415 - Mattes Ohlenbusch, Christian Rollwage, Simon Doclo:
Multi-Microphone Noise Data Augmentation for DNN-Based Own Voice Reconstruction for Hearables in Noisy Environments. 416-420 - Shulin He, Jinjiang Liu, Hao Li, Yang Yang, Fei Chen, Xueliang Zhang:
3S-TSE: Efficient Three-Stage Target Speaker Extraction for Real-Time and Low-Resource Applications. 421-425 - Yi Luo, Rongzhi Gu:
Improving Music Source Separation with SIMO Stereo Band-Split RNN. 426-430 - Yichi Wang, Jie Zhang, Shihao Chen, Weitai Zhang, Zhongyi Ye, Xinyuan Zhou, Lirong Dai:
A Study of Multichannel Spatiotemporal Features and Knowledge Distillation on Robust Target Speaker Extraction. 431-435 - Clara Borrelli, James Rae, Dogac Basaran, Matt McVicar, Mehrez Souden, Matthias Mauch:
Resource-Constrained Stereo Singing Voice Cancellation. 436-440 - Zhengding Luo, Dongyuan Shi, Xiaoyi Shen, Woon-Seng Gan:
Unsupervised Learning Based End-to-End Delayless Generative Fixed-Filter Active Noise Control. 441-445 - Younglo Lee, Shukjae Choi, Byeong-Yeol Kim, Zhongqiu Wang, Shinji Watanabe:
Boosting Unknown-Number Speaker Separation with Transformer Decoder-Based Attractor. 446-450 - Youqiang Zheng, Weiping Tu, Li Xiao, Xinmeng Xu:
Srcodec: Split-Residual Vector Quantization for Neural Speech Codec. 451-455 - Haocheng Guo, Xiaohuai Le, Kai Chen, Jing Lu:
A Light-Weight State Detection Model for Kalman-Filter-Based Acoustic Feedback Cancellation with Rapid Recovery from Abrupt Path Changes. 456-460 - Anbin Qi, Xiang Xie, Jing Wang:
Mtdiffusion: Multi-Task Diffusion Model With Dual-Unet for Foley Sound Generation. 461-465 - Chenglong Jiang, Ying Gao, Hao Jin, Linrong Pan, Wing W. Y. Ng:
Fastmandarin: Efficient Local Modeling for Natural Mandarin Speech Synthesis. 461-465 - Shrishti Saha Shetu, Soumitro Chakrabarty, Oliver Thiergart, Edwin Mabande:
Ultra Low Complexity Deep Learning Based Noise Suppression. 466-470 - Carlotta Anemüller, Oliver Thiergart, Emanuël A. P. Habets:
Binaural Rendering of Heterogeneous Sound Sources with Extent. 471-475 - Jan Büthe, Ahmed Mustafa, Jean-Marc Valin, Karim Helwani, Michael M. Goodwin:
NOLACE: Improving Low-Complexity Speech Codec Enhancement Through Adaptive Temporal Shaping. 476-480 - Wei Tsung Lu, Ju-Chiang Wang, Qiuqiang Kong, Yun-Ning Hung:
Music Source Separation With Band-Split RoPE Transformer. 481-485 - Yiming Li, Xiangdong Wang, Hong Liu:
Audio-Free Prompt Tuning for Language-Audio Models. 491-495 - Pengyu Wang, Xiaofei Li:
RVAE-EM: Generative Speech Dereverberation Based On Recurrent Variational Auto-Encoder And Convolutive Transfer Function. 496-500 - Weilong Huang, Cheng Xue, Jinwei Feng, W. Bastiaan Kleijn:
A Practical Online Multichannel Dereverberation Approach with Data-Reuse Technique. 501-505 - Yile Angela Zhang, Fei Ma, Thushara D. Abhayapala, Prasanga N. Samarasinghe, Amy Bastine:
An Active Noise Control System Based On Soundfield Interpolation Using A Physics-Informed Neural Network. 506-510 - Fan Zhang, Chao Pan, Jacob Benesty, Jingdong Chen:
Directional Gain Based Noise Covariance Matrix Estimation for MVDR Beamforming. 511-515 - Soonhyeon Choi, Jung-Woo Choi:
Noisy-Arcmix: Additive Noisy Angular Margin Loss Combined With Mixup For Anomalous Sound Detection. 516-520 - Dichucheng Li, Yinghao Ma, Weixing Wei, Qiuqiang Kong, Yulun Wu, Mingjin Che, Fan Xia, Emmanouil Benetos, Wei Li:
Mertech: Instrument Playing Technique Detection Using Self-Supervised Pretrained Model with Multi-Task Finetuning. 521-525 - Haesun Joung, Kyogu Lee:
Music Auto-Tagging with Robust Music Representation Learned via Domain Adversarial Training. 526-530 - Théo Mariotte, Antonio Almudévar, Marie Tahon, Alfonso Ortega Giménez:
An Explainable Proxy Model for Multilabel Audio Segmentation. 531-535 - Jae-Won Kim, Byeongho Jo, Seungkwon Beack, Hochong Park:
Pre-Echo Reduction in Transform Audio Coding via Temporal Envelope Control with Machine Learning Based Estimation. 536-540 - Wuyang Liu, Yanzhen Ren:
Semantic Proximity Alignment: Towards Human Perception-Consistent Audio Tagging by Aligning with Label Text Description. 541-545 - Jordi Pons, Xiaoyu Liu, Santiago Pascual, Joan Serrà:
GASS: Generalizing Audio Source Separation with Large-Scale Data. 546-550