


30th ACM Multimedia 2022: Lisboa, Portugal
- João Magalhães, Alberto Del Bimbo, Shin'ichi Satoh, Nicu Sebe, Xavier Alameda-Pineda, Qin Jin, Vincent Oria, Laura Toni:

MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10 - 14, 2022. ACM 2022, ISBN 978-1-4503-9203-7
Keynote Talks
- Yoelle Maarek:

Alexa, let's work together! How Alexa Helps Customers Complete Tasks with Verbal and Visual Guidance in the Alexa Prize TaskBot Challenge. 1-2 - Nuria Oliver:

Data Science against COVID-19: The Valencian Experience. 3-4 - Douwe Kiela:

Grounding, Meaning and Foundation Models: Adventures in Multimodal Machine Learning. 5
Oral Session I: Engaging Users with Multimedia -- Emotional and Social Signals
- Rui Li, Yiting Wang, Wei-Long Zheng, Bao-Liang Lu:

A Multi-view Spectral-Spatial-Temporal Masked Autoencoder for Decoding Emotions with Self-supervised Learning. 6-14 - Teng Sun, Wenjie Wang, Liqiang Jing, Yiran Cui, Xuemeng Song, Liqiang Nie:

Counterfactual Reasoning for Out-of-distribution Multimodal Sentiment Analysis. 15-23 - Yuanyuan Liu, Wei Dai, Chuanxu Feng, Wenbin Wang, Guanghao Yin, Jiabei Zeng, Shiguang Shan:

MAFW: A Large-scale, Multi-modal, Compound Affective Database for Dynamic Facial Expression Recognition in the Wild. 24-32 - Shengzhe Liu, Xin Zhang, Jufeng Yang:

SER30K: A Large-Scale Dataset for Sticker Emotion Recognition. 33-41
Poster Session I: Engaging Users with Multimedia -- Emotional and Social Signals
- Jicai Pan, Shangfei Wang, Lin Fang:

Representation Learning through Multimodal Attention and Time-Sync Comments for Affective Video Content Analysis. 42-50 - Xujin Li, Wei Wei, Shuang Qiu, Huiguang He:

TFF-Former: Temporal-Frequency Fusion Transformer for Zero-training Decoding of Two BCI Tasks. 51-59 - Yuedong Chen, Xu Yang, Tat-Jen Cham, Jianfei Cai:

Towards Unbiased Visual Emotion Recognition via Causal Intervention. 60-69 - Michal Balazia, Philipp Müller, Ákos Levente Tánczos, August von Liechtenstein, François Brémond:

Bodily Behaviors in Social Interaction: Novel Annotations and State-of-the-Art Evaluation. 70-79 - Niki Maria Foteinopoulou, Ioannis Patras:

Learning from Label Relationships in Human Affect. 80-89 - Ziyi Ye, Xiaohui Xie, Yiqun Liu, Zhihong Wang, Xuesong Chen, Min Zhang, Shaoping Ma:

Brain Topography Adaptive Network for Satisfaction Modeling in Interactive Information Access System. 90-100 - Yan Wang, Yixuan Sun, Wei Song, Shuyong Gao, Yiwen Huang, Zhaoyu Chen, Weifeng Ge, Wenqiang Zhang:

DPCNet: Dual Path Multi-Excitation Collaborative Network for Facial Expression Representation Learning in Videos. 101-110 - Yingjie Chen, Chong Chen, Xiao Luo, Jianqiang Huang, Xian-Sheng Hua, Tao Wang, Yun Liang:

Pursuing Knowledge Consistency: Supervised Hierarchical Contrastive Learning for Facial Action Unit Recognition. 111-119 - Shiqing Zhang, Ruixin Liu, Yijiao Yang, Xiaoming Zhao, Jun Yu:

Unsupervised Domain Adaptation Integrating Transformer and Mutual Information for Cross-Corpus Speech Emotion Recognition. 120-129 - Zhen Xing, Weimin Tan, Ruian He, Yangle Lin, Bo Yan:

Co-Completion for Occluded Facial Expression Recognition. 130-140 - Weichen Yu, Hongyuan Yu, Yan Huang, Liang Wang:

Generalized Inter-class Loss for Gait Recognition. 141-150 - Fan Qi, Zixin Zhang, Xianshan Yang, Huaiwen Zhang, Changsheng Xu:

Feeling Without Sharing: A Federated Video Emotion Recognition Framework Via Privacy-Agnostic Hybrid Aggregation. 151-160 - Jianjian Shao, Zhenqian Wu, Yuanyan Luo, Shudong Huang, Xiaorong Pu, Yazhou Ren:

Self-Paced Label Distribution Learning for In-The-Wild Facial Expression Recognition. 161-169 - Yong Zhao, Haifeng Chen, Hichem Sahli, Ke Lu, Dongmei Jiang:

Uncertainty-Aware Semi-Supervised Learning of 3D Face Rigging from Single Image. 170-179 - Junyu Chen, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang:

A Unified Framework against Topology and Class Imbalance. 180-188 - Yang Yu, Dong Zhang, Shoushan Li:

Unified Multi-modal Pre-training for Few-shot Sentiment Analysis with Prompt-based Learning. 189-198 - Zhicheng Zhang, Jufeng Yang:

Temporal Sentiment Localization: Listen and Look in Untrimmed Videos. 199-208 - Xinyu Cheng, Wei Wei, Changde Du, Shuang Qiu, Sanli Tian, Xiaojun Ma, Huiguang He:

VigilanceNet: Decouple Intra- and Inter-Modality Learning for Multimodal Vigilance Estimation in RSVP-Based BCI. 209-217 - Lijuan Wang, Guoli Jia, Ning Jiang, Haiying Wu, Jufeng Yang:

EASE: Robust Facial Expression Recognition via Emotion Ambiguity-SEnsitive Cooperative Networks. 218-227 - Bo-Kai Ruan, Ling Lo, Hong-Han Shuai, Wen-Huang Cheng:

Mimicking the Annotation Process for Recognizing the Micro Expressions. 228-236
Oral Session II: Engaging User with Multimedia -- Multimedia Search and Recommendation
- Peng-Fei Zhang, Guangdong Bai, Zi Huang, Xin-Shun Xu:

Machine Unlearning for Image Retrieval: A Generative Scrubbing Approach. 237-245 - Jianfeng Dong, Xianke Chen, Minsong Zhang, Xun Yang, Shujie Chen, Xirong Li, Xun Wang:

Partially Relevant Video Retrieval. 246-257 - Fangxiong Xiao, Lixi Deng, Jingjing Chen, Houye Ji, Xiaorui Yang, Zhuoye Ding, Bo Long:

From Abstract to Details: A Generative Multimodal Fusion Framework for Recommendation. 258-267 - Weili Guan, Xuemeng Song, Haoyu Zhang, Meng Liu, Chung-Hsing Yeh, Xiaojun Chang:

Bi-directional Heterogeneous Graph Hashing towards Efficient Outfit Recommendation. 268-276 - MeiYu Liang, Junping Du, Xiaowen Cao, Yang Yu, Kangkang Lu, Zhe Xue, Min Zhang:

Semantic Structure Enhanced Contrastive Adversarial Hash Network for Cross-media Representation Learning. 277-285 - Dan Song, Yue Yang, Weizhi Nie, Xuanya Li, An-An Liu:

Cross-Domain 3D Model Retrieval Based On Contrastive Learning And Label Propagation. 286-295 - Zhixin Ma, Chong-Wah Ngo:

Interactive Video Corpus Moment Retrieval using Reinforcement Learning. 296-306 - Chao Huang, Yabo Liu, Zheng Zhang, Chengliang Liu, Jie Wen, Yong Xu, Yaowei Wang:

Hierarchical Graph Embedded Pose Regularity Learning via Spatio-Temporal Transformer for Abnormal Behavior Detection. 307-315 - Yue Zhao, Weizhi Nie, Zan Gao, Anan Liu:

HMTN: Hierarchical Multi-scale Transformer Network for 3D Shape Recognition. 316-324 - Peng-Fei Zhang, Zi Huang, Guangdong Bai, Xin-Shun Xu:

IDEAL: High-Order-Ensemble Adaptation Network for Learning with Noisy Labels. 325-333 - Yu Zheng, Chen Gao, Jingtao Ding, Lingling Yi, Depeng Jin, Yong Li, Meng Wang:

DVR: Micro-Video Recommendation Optimizing Watch-Time-Gain under Duration Bias. 334-345
Poster Session II: Engaging User with Multimedia -- Multimedia Search and Recommendation
- Bolin Zhang, Chao Yang, Bin Jiang, Xiaokang Zhou:

Video Moment Retrieval with Hierarchical Contrastive Learning. 346-355 - Avinash Madasu, Junier Oliva, Gedas Bertasius:

Learning to Retrieve Videos by Asking Questions. 356-365 - Jinan Sun, Haixin Wang, Xiao Luo, Shikun Zhang, Wei Xiang, Chong Chen, Xian-Sheng Hua:

HEART: Towards Effective Hash Codes under Label Noise. 366-375 - Zongshen Mu, Yueting Zhuang, Jie Tan, Jun Xiao, Siliang Tang:

Learning Hybrid Behavior Patterns for Multimedia Recommendation. 376-384 - Feiyu Chen, Junjie Wang, Yinwei Wei, Hai-Tao Zheng, Jie Shao:

Breaking Isolation: Multimodal Graph Fusion for Multimedia Recommendation by Edge-wise Modulation. 385-394 - Jianwei Zhu, Zhixin Li, Yufei Zeng, Jiahui Wei, Huifang Ma:

Image-Text Matching with Fine-Grained Relational Dependency and Bidirectional Attention-Based Generative Networks. 395-403 - Yuxi Sun, Shanshan Feng, Xutao Li, Yunming Ye, Jian Kang, Xu Huang:

Visual Grounding in Remote Sensing Images. 404-412 - Guolong Wang, Xun Wu, Zhaoyuan Liu, Junchi Yan:

Prompt-based Zero-shot Video Moment Retrieval. 413-421 - Yabing Wang, Jianfeng Dong, Tianxiang Liang, Minsong Zhang, Rui Cai, Xun Wang:

Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning. 422-433 - Ziyue Wang, Aozhu Chen, Fan Hu, Xirong Li:

Learn to Understand Negation in Video Retrieval. 434-443 - Yongjie Zhu, Chunhui Han, Yuefeng Zhan, Bochen Pang, Zhaoju Li, Hao Sun, Si Li, Boxin Shi, Nan Duan, Weiwei Deng, Ruofei Zhang, Liangjie Zhang, Qi Zhang:

AdsCVLR: Commercial Visual-Linguistic Representation Modeling in Sponsored Search. 444-452 - Junfeng Tu, Xueliang Liu, Zongxiang Lin, Richang Hong, Meng Wang:

Differentiable Cross-modal Hashing via Multimodal Transformers. 453-461 - Zhixin Ling, Zhen Xing, Jiangtong Li, Li Niu:

Multi-Level Region Matching for Fine-Grained Sketch-Based Image Retrieval. 462-470 - Xiaolin Zheng, Jiajie Su, Weiming Liu, Chaochao Chen:

DDGHM: Dual Dynamic Graph with Hybrid Metric Training for Cross-Domain Sequential Recommendation. 471-481 - Yong Zhuang, Tong Yu, Junda Wu, Shiqu Wu, Shuai Li:

Spatial-Temporal Aligned Multi-Agent Learning for Visual Dialog Systems. 482-490 - Huafeng Liu, Liping Jing, Dahai Yu, Mingjie Zhou, Michael Ng:

Learning Intrinsic and Extrinsic Intentions for Cold-start Recommendation with Neural Stochastic Processes. 491-500 - Pingting Hong, Dayan Wu, Bo Li, Weiping Wang:

Camera-specific Informative Data Augmentation Module for Unbalanced Person Re-identification. 501-510 - Zhiqiang Guo, Guohui Li, Jianjun Li, Huaicong Chen:

TopicVAE: Topic-aware Disentanglement Representation Learning for Enhanced Recommendation. 511-520 - Chao Huang, Chengliang Liu, Zheng Zhang, Zhihao Wu, Jie Wen, Qiuping Jiang, Yong Xu:

Pixel-Level Anomaly Detection via Uncertainty-aware Prototypical Transformer. 521-530 - Lei Tan, Pingyang Dai, Rongrong Ji, Yongjian Wu:

Dynamic Prototype Mask for Occluded Person Re-Identification. 531-540 - Nan Pu, Yu Liu, Wei Chen, Erwin M. Bakker, Michael S. Lew:

Meta Reconciliation Normalization for Lifelong Person Re-Identification. 541-549 - Lin Wang, Wanqian Zhang, Dayan Wu, Fei Zhu, Bo Li:

Attack is the Best Defense: Towards Preemptive-Protection Person Re-Identification. 550-559 - Kai Chen, Weihua Chen, Tao He, Rong Du, Fan Wang, Xiuyu Sun, Yuchen Guo, Guiguang Ding:

TAGPerson: A Target-Aware Generation Pipeline for Person Re-identification. 560-571 - Dayan Wu, Qinghang Su, Bo Li, Weiping Wang:

Efficient Hash Code Expansion by Recycling Old Bits. 572-580 - Desheng Cai, Shengsheng Qian, Quan Fang, Jun Hu, Changsheng Xu:

Adaptive Anti-Bottleneck Multi-Modal Graph Learning Network for Personalized Micro-video Recommendation. 581-590 - Uttaran Bhattacharya, Gang Wu, Stefano Petrangeli, Viswanathan Swaminathan, Dinesh Manocha:

Show Me What I Like: Detecting User-Specific Video Highlights Using Content-Based Multi-Head Attention. 591-600 - Kai Wang, Yifan Wang, Xing Xu, Xin Liu, Weihua Ou, Huimin Lu:

Prototype-based Selective Knowledge Distillation for Zero-Shot Sketch Based Image Retrieval. 601-609 - Siyuan Li, Xing Xu, Zailei Zhou, Yang Yang, Guoqing Wang, Heng Tao Shen:

ARRA: Absolute-Relative Ranking Attack against Image Retrieval. 610-618 - Xiaoyu Du, Zike Wu, Fuli Feng, Xiangnan He, Jinhui Tang:

Invariant Representation Learning for Multimedia Recommendation. 619-628 - Tianyuan Xu, Xueliang Liu, Zhen Huang, Dan Guo, Richang Hong, Meng Wang:

Early-Learning regularized Contrastive Learning for Cross-Modal Retrieval with Noisy Labels. 629-637 - Yiwei Ma, Guohai Xu, Xiaoshuai Sun, Ming Yan, Ji Zhang, Rongrong Ji:

X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval. 638-647 - Yi Zhong, Chengyao Wang, Shiyong Li, Zhu Zhou, Yaowei Wang, Wei-Shi Zheng:

Mixed Supervision for Instance Learning in Object Detection with Few-shot Annotation. 648-658 - Zeyu Ma, Wei Ju, Xiao Luo, Chong Chen, Xian-Sheng Hua, Guangming Lu:

Improved Deep Unsupervised Hashing via Prototypical Learning. 659-667 - Rui Wang, Feng Chen, Jun Tang, Pu Yan:

Adaptive Camera Margin for Mask-guided Domain Adaptive Person Re-identification. 668-677 - Shengshan Hu, Ziqi Zhou, Yechao Zhang, Leo Yu Zhang, Yifeng Zheng, Yuanyuan He, Hai Jin:

BadHash: Invisible Backdoor Attacks against Deep Hashing with Clean Label. 678-686 - Xiaohao Liu, Zhulin Tao, Jiahong Shao, Lifang Yang, Xianglin Huang:

EliMRec: Eliminating Single-modal Bias in Multimedia Recommendation. 687-695 - Zhicheng Sun, Yadong Mu:

Patch-based Knowledge Distillation for Lifelong Person Re-Identification. 696-707
Oral Session III: Engaging User with Multimedia -- Summarization, Analytics, and Storytelling
- Xiaodong Chen, Wu Liu, Xinchen Liu, Yongdong Zhang, Jungong Han, Tao Mei:

MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition. 708-718 - Xun Jiang, Xing Xu, Zhiguo Chen, Jingran Zhang, Jingkuan Song, Fumin Shen, Huimin Lu, Heng Tao Shen:

DHHN: Dual Hierarchical Hybrid Network for Weakly-Supervised Audio-Visual Video Parsing. 719-727
Poster Session III: Engaging User with Multimedia -- Summarization, Analytics, and Storytelling
- Dixin Luo, Yutong Wang, Angxiao Yue, Hongteng Xu:

Weakly-Supervised Temporal Action Alignment Driven by Unbalanced Spectral Fused Gromov-Wasserstein Distance. 728-739 - Jiehang Xie, Xuanbai Chen, Shao-Ping Lu, Yulu Yang:

A Knowledge Augmented and Multimodal-Based Framework for Video Summarization. 740-749 - Dizhan Xue, Shengsheng Qian, Quan Fang, Changsheng Xu:

MMT: Image-guided Story Ending Generation with Multimodal Memory Transformer. 750-758 - Niankai Zhang, Junli Zhao, Fuqing Duan, Zhenkuan Pan, Zhongke Wu, Mingquan Zhou, Xianfeng Gu:

An End-to-End Conditional Generative Adversarial Network Based on Depth Map for 3D Craniofacial Reconstruction. 759-768 - Bowen Li, Philip H. S. Torr, Thomas Lukasiewicz:

Clustering Generative Adversarial Networks for Story Visualization. 769-778 - Jiayin Cai, Changlin Li, Xin Tao, Chun Yuan, Yu-Wing Tai:

DeViT: Deformed Vision Transformers in Video Inpainting. 779-789 - Ming Yao, Yu Bai, Wei Du, Xuejun Zhang, Heng Quan, Fuli Cai, Hongwei Kang:

Multi-Level Spatiotemporal Network for Video Summarization. 790-798
Oral Session IV: Experience -- Interactions and Quality of Experience
- Li Yang, Mai Xu, Tie Liu, Liangyu Huo, Xinbo Gao:

TVFormer: Trajectory-guided Visual Quality Assessment on 360° Images with Transformers. 799-808 - Zheng Lin, Zheng-Peng Duan, Zhao Zhang, Chun-Le Guo, Ming-Ming Cheng:

KnifeCut: Refining Thin Part Segmentation with Cutting Lines. 809-817 - Minju Kim, Yuhyun Lee, Jungjin Lee:

Multi-view Layout Design for VR Concert Experience. 818-826 - Kui Jiang, Zhongyuan Wang, Chen Chen, Zheng Wang, Laizhong Cui, Chia-Wen Lin:

Magic ELF: Image Deraining Meets Association Learning and Transformer. 827-836 - Liang Liao, Kangmin Xu, Haoning Wu, Chaofeng Chen, Wenxiu Sun, Qiong Yan, Weisi Lin:

Exploring the Effectiveness of Video Perceptual Representation in Blind Video Quality Assessment. 837-846 - Mengshun Hu, Kui Jiang, Zhixiang Nie, Zheng Wang:

You Only Align Once: Bidirectional Interaction for Spatial-Temporal Video Super-Resolution. 847-855 - Wei Sun, Xiongkuo Min, Wei Lu, Guangtao Zhai:

A Deep Learning based No-reference Quality Assessment Model for UGC Videos. 856-865
Poster Session IV: Experience -- Interactions and Quality of Experience
- Szu-Wei Fu, Yaran Fan, Yasaman Hosseinkashi, Jayant Gupchup, Ross Cutler:

Improving Meeting Inclusiveness using Speech Interruption Analysis. 887-895 - Yaohui Li, Yuzhe Yang, Huaxiong Li, Haoxing Chen, Liwu Xu, Leida Li, Yaqian Li, Yandong Guo:

Transductive Aesthetic Preference Propagation for Personalized Image Aesthetics Assessment. 896-904 - Zheng Lin, Zhao Zhang, Linghao Han, Shao-Ping Lu:

Multi-Mode Interactive Image Segmentation. 905-914 - Nasim Jamshidi Avanaki, Steven Schmidt, Thilo Michael, Saman Zadtootaghaj, Sebastian Möller:

Deep-BVQM: A Deep-learning Bitstream-based Video Quality Model. 915-923 - Anton Ratnarajah, Zhenyu Tang, Rohith Aralikatti, Dinesh Manocha:

MESH2IR: Neural Acoustic Impulse Response Generator for Complex 3D Scenes. 924-933 - Wei Zhou, Zhou Wang:

Quality Assessment of Image Super-Resolution: Balancing Deterministic and Statistical Fidelity. 934-942 - Chaofan Zhang, Shiguang Liu:

No-reference Omnidirectional Image Quality Assessment Based on Joint Network. 943-951 - Abhishek Kumar, Lik-Hang Lee, Jagmohan Chauhan, Xiang Su, Mohammad Ashraful Hoque, Susanna Pirttikangas, Sasu Tarkoma, Pan Hui:

PassWalk: Spatial Authentication Leveraging Lateral Shift and Gaze on Mobile Headsets. 952-960 - Jun Fu, Chen Hou, Wei Zhou, Jiahua Xu, Zhibo Chen:

Adaptive Hypergraph Convolutional Network for No-Reference 360-degree Image Quality Assessment. 961-969 - Xingran Liao, Baoliang Chen, Hanwei Zhu, Shiqi Wang, Mingliang Zhou, Sam Kwong:

DeepWSD: Projecting Degradations in Perceptual Space to Wasserstein Distance in Deep Feature Space. 970-978 - Bohua Peng, Mobarakol Islam, Mei Tu:

Angular Gap: Reducing the Uncertainty of Image Difficulty through Model Calibration. 979-987 - Min Wang, Hao Yang, Qing Cheng:

GCL: Graph Calibration Loss for Trustworthy Graph Neural Network. 988-996 - Yixuan Gao, Xiongkuo Min, Yucheng Zhu, Jing Li, Xiao-Ping Zhang, Guangtao Zhai:

Image Quality Assessment: From Mean Opinion Score to Opinion Score Distribution. 997-1005 - Zihan Zhou, Yong Xu, Ruotao Xu, Yuhui Quan:

No-Reference Image Quality Assessment Using Dynamic Complex-Valued Neural Model. 1006-1015 - Tong Shao, Deming Zhai, Junjun Jiang, Xianming Liu:

Hybrid Conditional Deep Inverse Tone Mapping. 1016-1024 - Yili Jin, Junhua Liu, Fangxin Wang, Shuguang Cui:

Where Are You Looking?: A Large-Scale Dataset of Head and Gaze Behavior for 360-Degree Videos and a Pilot Study. 1025-1034
Oral Session V: Experience -- Art and Culture
- Zhengyan Tong, Xiaohang Wang, Shengchao Yuan, Xuanhong Chen, Junjie Wang, Xiangzhong Fang:

Im2Oil: Stroke-Based Oil Painting Rendering with Linearly Controllable Fineness Via Adaptive Sampling. 1035-1046 - Chen Zhang, LuChin Chang, Songruoyao Wu, Xu Tan, Tao Qin, Tie-Yan Liu, Kejun Zhang:

ReLyMe: Improving Lyric-to-Melody Generation by Incorporating Lyric-Melody Relationships. 1047-1056 - Zihao Wang, Kejun Zhang, Yuxing Wang, Chen Zhang, Qihao Liang, Pengfei Yu, Yongsheng Feng, Wenbo Liu, Yikai Wang, Yuntai Bao, Yiheng Yang:

SongDriver: Real-time Music Accompaniment Generation without Logical Latency nor Exposure Bias. 1057-1067 - Yijun Wang, Tao Liang, Jianxin Lin:

CACOLIT: Cross-domain Adaptive Co-learning for Imbalanced Image-to-Image Translation. 1068-1076 - Kyungwon Lee, Yu-Kyung Jang, Jaewoo Jung, Dong Hwan Kim, Hyun-Jean Lee, Seung Ah Lee:

EuglPollock: Rethinking Interspecies Collaboration through Art Making. 1077-1084
Poster Session V: Experience -- Art and Culture
- Nisha Huang, Fan Tang, Weiming Dong, Changsheng Xu:

Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion. 1085-1094 - Zhizhong Wang, Zhanjie Zhang, Lei Zhao, Zhiwen Zuo, Ailin Li, Wei Xing, Dongming Lu:

AesUST: Towards Aesthetic-Enhanced Universal Style Transfer. 1095-1106 - Matthias Springstein, Stefanie Schneider, Christian Althaus, Ralph Ewerth:

Semi-supervised Human Pose Estimation in Art-historical Images. 1107-1116 - Shenglan Cui, Fang Liu, Tongqing Zhou, Mohan Zhang:

Understanding and Identifying Artwork Plagiarism with the Wisdom of Designers: A Case Study on Poster Artworks. 1117-1127 - Quanwei Yang, Xinchen Liu, Wu Liu, Hongtao Xie, Xiaoyan Gu, Lingyun Yu, Yongdong Zhang:

REMOT: A Region-to-Whole Framework for Realistic Human Motion Transfer. 1128-1137 - Zixuan Wang, Jia Jia, Haozhe Wu, Junliang Xing, Jinghe Cai, Fanbo Meng, Guowen Chen, Yanfeng Wang:

GroupDancer: Music to Multi-People Dance Synthesis with Style Collaboration. 1138-1146 - Daqian Shi, Xiaolei Diao, Lida Shi, Hao Tang, Yang Chi, Chuntao Li, Hao Xu:

CharFormer: A Glyph Fusion based Attentive Framework for High-precision Character Image Denoising. 1147-1155 - Guang Yang, Wu Liu, Xinchen Liu, Xiaoyan Gu, Juan Cao, Jintao Li:

Delving into the Frequency: Temporally Consistent Human Motion Transfer in the Fourier Space. 1156-1166 - Zhimeng Zhang, Yu Ding:

Adaptive Affine Transformation: A Simple and Effective Operation for Spatial Misaligned Image Generation. 1167-1176 - Daqian Shi, Xiaolei Diao, Hao Tang, Xiaomin Li, Hao Xing, Hao Xu:

RCRN: Real-world Character Image Restoration Network via Skeleton Extraction. 1177-1185 - Yupei Lin, Sen Zhang, Tianshui Chen, Yongyi Lu, Guangping Li, Yukai Shi:

Exploring Negatives in Contrastive Learning for Unpaired Image-to-Image Translation. 1186-1194 - Xiang Chang, Fei Chao, Changjing Shang, Qiang Shen:

Sundial-GAN: A Cascade Generative Adversarial Networks Framework for Deciphering Oracle Bone Inscriptions. 1195-1203 - Xueyao Zhang, Jinchao Zhang, Yao Qiu, Li Wang, Jie Zhou:

Structure-Enhanced Pop Music Generation via Harmony-Aware Learning. 1204-1213 - Xingzhong Hou, Boxiao Liu, Shuai Zhang, Lulin Shi, Zite Jiang, Haihang You:

Dynamic Weighted Semantic Correspondence for Few-Shot Image Generative Adaptation. 1214-1222 - Zhejing Hu, Xiao Ma, Yan Liu, Gong Chen, Yongxu Liu:

The Beauty of Repetition in Machine Composition Scenarios. 1223-1231 - Xin Huang, Dong Liang, Hongrui Cai, Juyong Zhang, Jinyuan Jia:

CariPainter: Sketch Guided Interactive Caricature Generation. 1232-1240 - Jieun Lee, Hyeonwoo Kim, Jonghwa Shim, Eenjun Hwang:

Cartoon-Flow: A Flow-Based Generative Adversarial Network for Arbitrary-Style Photo Cartoonization. 1241-1251
Oral Session VI: Experience -- Multimedia Applications
- Yiling Wu, Xinfeng Zhang, Yaowei Wang, Qingming Huang:

Span-based Audio-Visual Localization. 1252-1260 - Jibin Gao, Junfu Pu, Honglun Zhang, Ying Shan, Wei-Shi Zheng:

PC-Dance: Posture-controllable Music-driven Dance Synthesis. 1261-1269 - Haipeng Liu, Yang Wang, Meng Wang, Yong Rui:

Delving Globally into Texture and Structure for Image Inpainting. 1270-1278 - Zeyu Ma, Yang Yang, Guoqing Wang, Xing Xu, Heng Tao Shen, Mingxing Zhang:

Rethinking Open-World Object Detection in Autonomous Driving Scenarios. 1279-1288 - Zhihua Hu, Bo Duan, Yanfeng Zhang, Mingwei Sun, Jingwei Huang:

MVLayoutNet: 3D Layout Reconstruction with Multi-view Panoramas. 1289-1298 - Jiaming Li, Hongtao Xie, Lingyun Yu, Yongdong Zhang:

Wavelet-enhanced Weakly Supervised Local Feature Learning for Face Forgery Detection. 1299-1308 - Xiaoyu Ma, Yaqi Wang, Chang Liu, Suiyu Zhang, Dingguo Yu:

ADGNet: Attention Discrepancy Guided Deep Neural Network for Blind Image Quality Assessment. 1309-1318 - Jingjing Wu, Pengyuan Lyu, Guangming Lu, Chengquan Zhang, Kun Yao, Wenjie Pei:

Decoupling Recognition from Detection: Single Shot Self-Reliant Scene Text Spotter. 1319-1328 - Chaofeng Chen, Xinyu Shi, Yipeng Qin, Xiaoming Li, Xiaoguang Han, Tao Yang, Shihui Guo:

Real-World Blind Super-Resolution via Feature Matching with Implicit High-Resolution Priors. 1329-1338 - Mengya Han, Heliang Zheng, Chaoyue Wang, Yong Luo, Han Hu, Bo Du:

Leveraging GAN Priors for Few-Shot Part Segmentation. 1339-1347 - Bo Fang, Wenhao Wu, Chang Liu, Yu Zhou, Dongliang He, Weiping Wang:

MaMiCo: Macro-to-Micro Semantic Correspondence for Self-supervised Video Representation Learning. 1348-1357 - Jinwang Pan, Deming Zhai, Yuanchao Bai, Junjun Jiang, Debin Zhao, Xianming Liu:

ChebyLighter: Optimal Curve Estimation for Low-light Image Enhancement. 1358-1366 - Xiaotong Lu, Teng Xi, Baopu Li, Gang Zhang, Weisheng Dong, Guangming Shi:

Bayesian based Re-parameterization for DNN Model Pruning. 1367-1375 - Dejia Xu, Hayk Poghosyan, Shant Navasardyan, Yifan Jiang, Humphrey Shi, Zhangyang Wang:

ReCoRo: Region-Controllable Robust Light Enhancement with User-Specified Imprecise Masks. 1376-1386 - Aaron Chadha, Ioannis Katsavounidis, Ayan Kumar Bhunia, Cosmin Stejerean, Muhammad Umar Karim Khan, Yiannis Andreopoulos:

Domain-Specific Fusion Of Objective Video Quality Metrics. 1387-1395 - Wen Yang, Jinjian Wu, Jupo Ma, Leida Li, Weisheng Dong, Guangming Shi:

Learning for Motion Deblurring with Hybrid Frames and Events. 1396-1404 - Yulei Lu, Yawei Luo, Li Zhang, Zheyang Li, Yi Yang, Jun Xiao:

Bidirectional Self-Training with Multiple Anisotropic Prototypes for Domain Adaptive Semantic Segmentation. 1405-1415 - Hui Lin, Zhiheng Ma, Xiaopeng Hong, Yaowei Wang, Zhou Su:

Semi-supervised Crowd Counting via Density Agency. 1416-1426 - Huachen Fang, Jinjian Wu, Leida Li, Junhui Hou, Weisheng Dong, Guangming Shi:

AEDNet: Asynchronous Event Denoising with Spatial-Temporal Correlation among Irregular Data. 1427-1435 - Hansen Feng, Lizhi Wang, Yuzhi Wang, Hua Huang:

Learnability Enhancement for Low-light Raw Denoising: Where Paired Real Data Meets Noise Modeling. 1436-1444 - Qian Cao, Xu Chen, Ruihua Song, Hao Jiang, Guang Yang, Zhao Cao:

Multi-Modal Experience Inspired AI Creation. 1445-1454 - Boming Zhao, Bangbang Yang, Zhenyang Li, Zuoyue Li, Guofeng Zhang, Jiashu Zhao, Dawei Yin, Zhaopeng Cui, Hujun Bao:

Factorized and Controllable Neural Re-Rendering of Outdoor Scene for Photo Extrapolation. 1455-1464 - Zhuowen Yuan, Zhengxin You, Sheng Li, Zhenxing Qian, Xinpeng Zhang, Alex C. Kot:

On Generating Identifiable Virtual Faces. 1465-1473 - Peijia Zheng, Zhiwei Cai, Huicong Zeng, Jiwu Huang:

Keyword Spotting in the Homomorphic Encrypted Domain Using Deep Complex-Valued CNN. 1474-1483 - Zhangkai Ni, Wenhan Yang, Hanli Wang, Shiqi Wang, Lin Ma, Sam Kwong:

Cycle-Interactive Generative Adversarial Network for Robust Unsupervised Low-Light Enhancement. 1484-1492 - Yunhao Li, Zhenbo Yu, Yucheng Zhu, Bingbing Ni, Guangtao Zhai, Wei Shen:

Skeleton2Humanoid: Animating Simulated Characters for Physically-plausible Motion In-betweening. 1493-1502 - Jiahao Li, Bin Li, Yan Lu:

Hybrid Spatial-Temporal Entropy Modelling for Neural Video Compression. 1503-1511 - Shuai Li, Kaixin Wang, Yanbo Gao, Xun Cai, Mao Ye:

Geometric Warping Error Aware CNN for DIBR Oriented View Synthesis. 1512-1521
Poster Session VI: Experience -- Multimedia Applications
- Jinbao Wang, Guoyang Xie, Yawen Huang, Yefeng Zheng, Yaochu Jin, Feng Zheng:

FedMed-ATL: Misaligned Unpaired Cross-Modality Neuroimage Synthesis via Affine Transform Loss. 1522-1531 - Rui Ma, Mengxi Guo, Yi Hou, Fan Yang, Yuan Li, Huizhu Jia, Xiaodong Xie:

Towards Blind Watermarking: Combining Invertible and Non-invertible Mechanisms. 1532-1542 - Kaixiong Gong, Shuang Li, Shugang Li, Rui Zhang, Chi Harold Liu, Qiang Chen:

Improving Transferability for Domain Adaptive Detection Transformers. 1543-1551 - Dariusz Mikulowski:

Support for Teaching Mathematics of the Blind by Sighted Tutors Through Multisensual Access to Formulas with Braille Converters and Speech. 1552-1560 - Yunning Cao, Ye Ma, Min Zhou, Chuanbin Liu, Hongtao Xie, Tiezheng Ge, Yuning Jiang:

Geometry Aligned Variational Transformer for Image-conditioned Layout Generation. 1561-1571 - Xianggang Yu, Jiapeng Tang, Yipeng Qin, Chenghong Li, Xiaoguang Han, Linchao Bao, Shuguang Cui:

PVSeRF: Joint Pixel-, Voxel- and Surface-Aligned Radiance Field for Single-Image Novel View Synthesis. 1572-1583 - Chaowei Fang, Dingwen Zhang, Liang Wang, Yulun Zhang, Lechao Cheng, Junwei Han:

Cross-Modality High-Frequency Transformer for MR Image Super-Resolution. 1584-1592 - Xintian Wu, Hanbin Zhao, Liangli Zheng, Shouhong Ding, Xi Li:

Adma-GAN: Attribute-Driven Memory Augmented GANs for Text-to-Image Generation. 1593-1602 - Chang Tang, Zhenglai Li, Weiqing Yan, Guanghui Yue, Wei Zhang:

Efficient Multiple Kernel Clustering via Spectral Perturbation. 1603-1611 - Yang Yang, Jingshuai Zhang, Fan Gao, Xiaoru Gao, Hengshu Zhu:

DOMFN: A Divergence-Orientated Multi-Modal Fusion Network for Resume Assessment. 1612-1620 - Ping Wei, Sheng Li, Xinpeng Zhang, Ge Luo, Zhenxing Qian, Qing Zhou:

Generative Steganography Network. 1621-1629 - Haiping Wang, Yuan Liu, Zhen Dong, Wenping Wang:

You Only Hypothesize Once: Point Cloud Registration with Rotation-equivariant Descriptors. 1630-1641 - Dingkang Yang, Shuai Huang, Haopeng Kuang, Yangtao Du, Lihua Zhang:

Disentangled Representation Learning for Multimodal Emotion Recognition. 1642-1651 - Yi Huang, Xiaoshan Yang, Ji Zhang, Changsheng Xu:

Relative Alignment Network for Source-Free Multimodal Video Domain Adaptation. 1652-1660 - Lin Yuan, Linguo Liu, Xiao Pu, Zhao Li, Hongbo Li, Xinbo Gao:

PRO-Face: A Generic Framework for Privacy-preserving Recognizable Obfuscation of Face Images. 1661-1669 - Xuanhan Wang, Yan Dai, Lianli Gao, Jingkuan Song:

Skeleton-based Action Recognition via Adaptive Cross-Form Learning. 1670-1678 - Yi Zhang, Weixuan Liang, Xinwang Liu, Sisi Dai, Siwei Wang, Liyang Xu, En Zhu:

Sample Weighted Multiple Kernel K-means via Min-Max optimization. 1679-1687 - Hanlei Zhang, Hua Xu, Xin Wang, Qianrui Zhou, Shaojie Zhao, Jiayan Teng:

MIntRec: A New Dataset for Multimodal Intent Recognition. 1688-1697 - Zhangming Li, Shengsheng Qian, Jie Cao, Quan Fang, Changsheng Xu:

Adaptive Transformer-Based Conditioned Variational Autoencoder for Incomplete Social Event Classification. 1698-1707 - Dingkang Yang, Haopeng Kuang, Shuai Huang, Lihua Zhang:

Learning Modality-Specific and -Agnostic Representations for Asynchronous Multimodal Language Sequences. 1708-1717 - Zijin Wu, Xingyi Li, Juewen Peng, Hao Lu, Zhiguo Cao, Weicai Zhong:

DoF-NeRF: Depth-of-Field Meets Neural Radiance Fields. 1718-1729 - Mingjin Zhang, Haichen Bai, Jing Zhang, Rui Zhang, Chaoyue Wang, Jie Guo, Xinbo Gao:

RKformer: Runge-Kutta Transformer with Random-Connection Attention for Infrared Small Target Detection. 1730-1738 - Liqiang Yin, Ruize Han, Wei Feng, Song Wang:

Self-Supervised Human Pose based Multi-Camera Video Synchronization. 1739-1748 - Zhekai Du, Jingjing Li, Lin Zuo, Lei Zhu, Ke Lu:

Energy-Based Domain Generalization for Face Anti-Spoofing. 1749-1757 - Jiajian Zhao, Yifan Zhao, Xiaowu Chen, Jia Li:

Revisiting Stochastic Learning for Generalizable Person Re-identification. 1758-1768 - Zhuo Chen, Chaoyue Wang, Haimei Zhao, Bo Yuan, Xiu Li:

D2Animator: Dual Distillation of StyleGAN For High-Resolution Face Animation. 1769-1778 - Lijian Gao, Ling Zhou, Qirong Mao, Ming Dong:

Adaptive Hierarchical Pooling for Weakly-supervised Sound Event Detection. 1779-1787 - Juze Zhang, Jingya Wang, Ye Shi, Fei Gao, Lan Xu, Jingyi Yu:

Mutual Adaptive Reasoning for Monocular 3D Multi-Person Pose Estimation. 1788-1796 - Fengjun Li, Xin Feng, Fanglin Chen, Guangming Lu, Wenjie Pei:

Learning Generalizable Latent Representations for Novel Degradations in Super-Resolution. 1797-1807 - Run Wang, Haoxuan Li, Lingzhou Mu, Jixing Ren, Shangwei Guo, Li Liu, Liming Fang, Jing Chen, Lina Wang:

Rethinking the Vulnerability of DNN Watermarking: Are Watermarks Robust against Naturalness-aware Perturbations? 1808-1818 - Xiao Pan, Peike Li, Zongxin Yang, Huiling Zhou, Chang Zhou, Hongxia Yang, Jingren Zhou, Yi Yang:

In-N-Out Generative Learning for Dense Unsupervised Video Segmentation. 1819-1827 - Rishubh Parihar, Ankit Dhiman, Tejan Karmali, Venkatesh Babu R.:

Everything is There in Latent Space: Attribute Editing and Attribute Style Manipulation by StyleGAN Latent Space Exploration. 1828-1836 - Dongyu She, Kun Xu:

An Image-to-video Model for Real-Time Video Enhancement. 1837-1846 - Xuesong Niu, Jili Gu, Guoxin Zhang, Pengfei Wan, Zhongyuan Wang:

Learning an Inference-accelerated Network from a Pre-trained Model with Frequency-enhanced Feature Distillation. 1847-1856 - Mingjin Zhang, Ke Yue, Jing Zhang, Yunsong Li, Xinbo Gao:

Exploring Feature Compensation and Cross-level Correlation for Infrared Small Target Detection. 1857-1865 - Fuming You, Jingjing Li, Zhi Chen, Lei Zhu:

Pixel Exclusion: Uncertainty-aware Boundary Discovery for Active Cross-Domain Semantic Segmentation. 1866-1874 - Mingjia Li, Yuanbin Fu, Xinhui Li, Xiaojie Guo:

Deep Flexible Structure Preserving Image Smoothing. 1875-1883 - Taeheon Kim, Youngjoon Yu, Yong Man Ro:

Defending Physical Adversarial Attack on Object Detection via Adversarial Patch-Feature Energy. 1905-1913 - Shankhanil Mitra, Rajiv Soundararajan:

Multiview Contrastive Learning for Completely Blind Video Quality Assessment of User Generated Content. 1914-1924 - Lechao Cheng, Chaowei Fang, Dingwen Zhang, Guanbin Li, Gang Huang:

Compound Batch Normalization for Long-tailed Image Classification. 1925-1934 - Yalan Ye

, Ziqi Liu, Yangwuyong Zhang, Jingjing Li, Hengtao Shen:
Alleviating Style Sensitivity then Adapting: Source-free Domain Adaptation for Medical Image Segmentation. 1935-1944 - Jian Liu, Yufeng Chen, Jinan Xu:

Multimedia Event Extraction From News With a Unified Contrastive Learning Framework. 1945-1953 - Bolun Zheng, Xiaokai Pan, Hua Zhang, Xiaofei Zhou, Gregory G. Slabaugh, Chenggang Yan, Shanxin Yuan:

DomainPlus: Cross Transform Domain Learning towards High Dynamic Range Imaging. 1954-1963 - Shuai Wang, Da Yang, Yubin Wu, Yang Liu, Hao Sheng:

Tracking Game: Self-adaptative Agent based Multi-object Tracking. 1964-1972 - Gangwei Jiang, Shiyao Wang, Tiezheng Ge, Yuning Jiang, Ying Wei, Defu Lian:

Self-Supervised Text Erasing with Controllable Image Synthesis. 1973-1983 - Zijie Wang, Aichun Zhu, Jingyi Xue, Xili Wan, Chao Liu, Tian Wang, Yifeng Li:

Look Before You Leap: Improving Text-based Person Retrieval by Learning A Consistent Cross-modal Common Manifold. 1984-1992 - Xingxing Zhang, Zhizhe Liu, Weikai Yang, Liyuan Wang, Jun Zhu:

The More, The Better? Active Silencing of Non-Positive Transfer for Efficient Multi-Domain Few-Shot Classification. 1993-2001 - Lu Zhang, Yang Wang, Jiaogen Zhou, Chenbo Zhang, Yinglu Zhang, Jihong Guan, Yatao Bian, Shuigeng Zhou:

Hierarchical Few-Shot Object Detection: Problem, Benchmark and Method. 2002-2011 - Renshuai Tao, Tianbo Wang, Ziyang Wu, Cong Liu, Aishan Liu, Xianglong Liu:

Few-shot X-ray Prohibited Item Detection: A Benchmark and Weak-feature Enhancement Network. 2012-2020 - Shilv Cai, Zhijun Zhang, Liqun Chen, Luxin Yan, Sheng Zhong, Xu Zou:

High-Fidelity Variable-Rate Image Compression via Invertible Activation Transformation. 2021-2031 - Xudong Mao, Liujuan Cao, Aurele Tohokantche Gnanha, Zhenguo Yang, Qing Li, Rongrong Ji:

Cycle Encoding of a StyleGAN Encoder for Improved Reconstruction and Editability. 2032-2041 - Yeqi Bai, Tao Ma, Lipo Wang, Zhenjie Zhang:

Speech Fusion to Face: Bridging the Gap Between Human's Vocal Characteristics and Facial Imaging. 2042-2050 - Wei Li, Tianzhao Yang, Xiao Wu, Xian-Jun Du, Jian-Jun Qiao:

Learning Action-guided Spatio-temporal Transformer for Group Activity Recognition. 2051-2060 - Yangyang Guo, Liqiang Nie, Yongkang Wong, Yibing Liu, Zhiyong Cheng, Mohan S. Kankanhalli:

A Unified End-to-End Retriever-Reader Framework for Knowledge-based VQA. 2061-2069 - Tengyu Ma, Long Ma, Xin Fan, Zhongxuan Luo, Risheng Liu:

PIA: Parallel Architecture with Illumination Allocator for Joint Enhancement and Detection in Low-Light. 2070-2078 - Abhinav Aggarwal, Yash Pandya, Lokesh A. Ravindranathan, Laxmi S. Ahire, Manivel Sethu, Kaustav Nandy:

Robust Actor Recognition in Entertainment Multimedia at Scale. 2079-2087 - Yufan Zhang, Junkai Man, Peng Sun:

MF-Net: A Novel Few-shot Stylized Multilingual Font Generation Method. 2088-2096 - Yuan Sun, Dezhong Peng, Haixiao Huang, Zhenwen Ren:

Feature and Semantic Views Consensus Hashing for Image Set Classification. 2097-2105 - Che Sun, Yunde Jia, Yuwei Wu:

Evidential Reasoning for Video Anomaly Detection. 2106-2114 - Danni Xu, Ruimin Hu, Zheng Wang, Linbo Luo, Dengshi Li, Wenjun Zeng:

Gaze- and Spacing-flow Unveil Intentions: Hidden Follower Discovery. 2115-2123 - Hongcheng Zhang, Xu Zhao, Dongqi Wang:

Semi-supervised Learning for Multi-label Video Action Detection. 2124-2134 - Bo Zhang, Jiakang Yuan, Baopu Li, Tao Chen, Jiayuan Fan, Botian Shi:

Learning Cross-Image Object Semantic Relation in Transformer for Few-Shot Fine-Grained Image Classification. 2135-2144 - Mengshun Hu, Kui Jiang, Liang Liao, Zhixiang Nie, Jing Xiao, Zheng Wang:

Progressive Spatial-temporal Collaborative Network for Video Frame Interpolation. 2145-2153 - Xinwei Xue, Jia He, Long Ma, Yi Wang, Xin Fan, Risheng Liu:

Best of Both Worlds: See and Understand Clearly in the Dark. 2154-2162 - Xin Jin, Tianyu He, Xu Shen, Tongliang Liu, Xinchao Wang, Jianqiang Huang, Zhibo Chen, Xian-Sheng Hua:

Meta Clustering Learning for Large-scale Unsupervised Person Re-identification. 2163-2172 - Xiaotong Luo, Mingliang Dai, Yulun Zhang, Yuan Xie, Ding Liu, Yanyun Qu, Yun Fu, Junping Zhang:

Adjustable Memory-efficient Image Super-resolution via Individual Kernel Sparsity. 2173-2181 - Ning Wang, Jing Zhang, Lefei Zhang, Dacheng Tao:

GT-MUST: Gated Try-on by Learning the Mannequin-Specific Transformation. 2182-2190 - Chen Long, Wenxiao Zhang, Ruihui Li, Hao Wang, Zhen Dong, Bisheng Yang:

PC2-PU: Patch Correlation and Point Correlation for Effective Point Cloud Upsampling. 2191-2201 - Luoyuan Xu, Tao Guan, Yuesong Wang, Yawei Luo, Zhuo Chen, Wenkai Liu, Wei Yang:

Self-Supervised Multi-view Stereo via Adjacent Geometry Guided Volume Completion. 2202-2210 - Zhenbo Shi, Zhi Chen, Zhenbo Xu, Wei Yang, Liusheng Huang:

AtHom: Two Divergent Attentions Stimulated By Homomorphic Training in Text-to-Image Synthesis. 2211-2219 - Zhiqiang Fu, Yao Zhao, Dongxia Chang, Yiming Wang, Jie Wen, Xingxing Zhang, Guodong Guo:

One-step Low-Rank Representation for Clustering. 2220-2228 - Syed Muhammad Israr, Feng Zhao:

Customizing GAN Using Few-shot Sketches. 2229-2238 - Mustafa Shukor, Bharath Bhushan Damodaran, Xu Yao, Pierre Hellier:

Video Coding using Learned Latent GAN Compression. 2239-2248 - Qiujing Lu, Yipeng Zhang, Mingjian Lu, Vwani Roychowdhury:

Action-conditioned On-demand Motion Generation. 2249-2257 - Wenxu Shi, Lei Zhang, Weijie Chen, Shiliang Pu:

Universal Domain Adaptive Object Detector. 2258-2266 - Han Fang, Zhaoyang Jia, Zehua Ma, Ee-Chien Chang, Weiming Zhang:

PIMoG: An Effective Screen-shooting Noise-Layer Simulation for Deep-Learning-Based Watermarking Network. 2267-2275 - Puneet Mathur, Atula Tejaswi Neerkaje, Malika Chhibber, Ramit Sawhney, Fuming Guo, Franck Dernoncourt, Sanghamitra Dutta, Dinesh Manocha:

MONOPOLY: Financial Prediction from MONetary POLicY Conference Videos Using Multimodal Cues. 2276-2285 - Pan Mu, Haotian Qian, Cong Bai:

Structure-Inferred Bi-level Model for Underwater Image Enhancement. 2286-2295 - Yazhou Xing, Yu Li, Xintao Wang, Ye Zhu, Qifeng Chen:

Composite Photograph Harmonization with Complete Background Cues. 2296-2304 - Ke Qiu, Yawen Lai, Shiyi Liu, Ronggang Wang:

Self-supervised Multi-view Stereo via Inter and Intra Network Pseudo Depth. 2305-2313 - Zhenzhong Kuang, Longbin Teng, Zhou Yu, Jun Yu, Jianping Fan, Mingliang Xu:

Delegate-based Utility Preserving Synthesis for Pedestrian Image Anonymization. 2314-2323 - Mingqian Wang, Yujun Zhang, Wei Feng, Lei Zhu, Song Wang:

Video Instance Lane Detection via Deep Temporal and Geometry Consistency Constraints. 2324-2332 - Xu Liu, Jianing Li, Xianqi Zhang, Jingyuan Sun, Xiaopeng Fan, Yonghong Tian:

Learning Visible Surface Area Estimation for Irregular Objects. 2333-2343 - Qinwei Chang, Leichao Huang, Shaoteng Liu, Hualuo Liu, Tianshu Yang, Yexin Wang:

Blind Robust Video Watermarking Based on Adaptive Region Selection and Channel Reference. 2344-2350 - Yongqi Zhai, Luyang Tang, Yi Ma, Rui Peng, Ronggang Wang:

Disparity-based Stereo Image Compression with Aligned Cross-View Priors. 2351-2360 - Junkun Yuan, Xu Ma, Defang Chen, Kun Kuang, Fei Wu, Lanfen Lin:

Label-Efficient Domain Generalization via Collaborative Exploration and Generalization. 2361-2370 - Wufan Wang, Lei Zhang, Hua Huang:

Progressive Unsupervised Learning of Local Descriptors. 2371-2379 - Dong Zhang, Jinhui Tang, Kwang-Ting Cheng:

Graph Reasoning Transformer for Image Parsing. 2380-2389 - Qiang Liu, Tongqing Zhou, Zhiping Cai, Yonghao Tang:

Opportunistic Backdoor Attacks: Exploring Human-imperceptible Vulnerabilities on Speech Recognition Systems. 2390-2398 - Zhiqiang Gao, Shufei Zhang, Kaizhu Huang, Qiufeng Wang, Rui Zhang, Chaoliang Zhong:

Certifying Better Robust Generalization for Unsupervised Domain Adaptation. 2399-2410 - Yu Yin, Joseph P. Robinson, Yun Fu:

Multimodal In-bed Pose and Shape Estimation under the Blankets. 2411-2419 - Xiaoyu Han, Shengping Zhang, Qinglin Liu, Zonglin Li, Chenyang Wang:

Progressive Limb-Aware Virtual Try-On. 2420-2429 - Anna Zhu, Zhanhui Yin, Brian Kenji Iwana, Xinyu Zhou, Shengwu Xiong:

Text Style Transfer based on Multi-factor Disentanglement and Mixture. 2430-2440 - Zhaoyi Wan, Dejia Xu, Zhangyang Wang, Jian Wang, Jiebo Luo:

Cloud2Sketch: Augmenting Clouds with Imaginary Sketches. 2441-2451 - Daiheng Gao, Xindi Zhang, Xingyu Chen, Andong Tan, Bang Zhang, Pan Pan, Ping Tan:

CycleHand: Increasing 3D Pose Estimation Ability on In-the-wild Monocular Image through Cyclic Flow. 2452-2463 - Ziwen He, Wei Wang, Weinan Guan, Jing Dong, Tieniu Tan:

Defeating DeepFakes via Adversarial Visual Reconstruction. 2464-2472 - Xichu Ma, Yuchen Wang, Ye Wang:

Content based User Preference Modeling in Music Generation. 2473-2482 - Liliang Chen, Jiaqi Li, Han Huang, Yandong Guo:

CrossHuman: Learning Cross-guidance from Multi-frame Images for Human Reconstruction. 2483-2494 - Zhiqian Lin, Jiangke Lin, Lincheng Li, Yi Yuan, Zhengxia Zou:

High-Quality 3D Face Reconstruction with Affine Convolutional Networks. 2495-2503 - Astitva Srivastava, Chandradeep Pokhariya, Sai Sagar Jinka, Avinash Sharma:

xCloth: Extracting Template-free Textured 3D Clothes from a Monocular Image. 2504-2512 - Kangneng Zhou, Xiaobin Zhu, Daiheng Gao, Kai Lee, Xinjie Li, Xu-Cheng Yin:

SD-GAN: Semantic Decomposition for Face Image Synthesis with Discrete Attribute. 2513-2524 - Rongjie Huang, Chenye Cui, Feiyang Chen, Yi Ren, Jinglin Liu, Zhou Zhao, Baoxing Huai, Zhefeng Wang:

SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation. 2525-2535 - Yinpeng Chen, Zhiyu Pan, Min Shi, Hao Lu, Zhiguo Cao, Weicai Zhong:

Design What You Desire: Icon Generation from Orthogonal Application and Theme Labels. 2536-2546 - Zhaohui Jing, Youjian Zhang, Chaoyue Wang, Daqing Liu, Yong Xia:

Semantically-Consistent Dynamic Blurry Image Generation for Image Deblurring. 2547-2555 - Xintao Wang, Chao Dong, Ying Shan:

RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization. 2556-2564 - Shuoyi Chen, Mang Ye, Bo Du:

Rotation Invariant Transformer for Recognizing Object in UAVs. 2565-2574 - Feifei Shao, Yawei Luo, Ping Liu, Jie Chen, Yi Yang, Yulei Lu, Jun Xiao:

Active Learning for Point Cloud Semantic Segmentation via Spatial-Structural Diversity Reasoning. 2575-2585 - Ji Zhang, Jingkuan Song, Lianli Gao, Hengtao Shen:

Free-Lunch for Cross-Domain Few-Shot Learning: Style-Aware Episodic Training with Robust Contrastive Learning. 2586-2594 - Rongjie Huang, Zhou Zhao, Huadai Liu, Jinglin Liu, Chenye Cui, Yi Ren:

ProDiff: Progressive Fast Diffusion Model for High-Quality Text-to-Speech. 2595-2605 - Yifeng Zhou, Chuming Lin, Donghao Luo, Yong Liu, Ying Tai, Chengjie Wang, Mingang Chen:

Joint Learning Content and Degradation Aware Feature for Blind Super-Resolution. 2606-2616 - Wenjing Wang, Zhengbo Xu, Haofeng Huang, Jiaying Liu:

Self-Aligned Concave Curve: Illumination Enhancement for Unsupervised Adaptation. 2617-2626 - Hong Ding, Fei Luo, Caoqing Jiang, Gang Fu, Zipei Chen, Shenghong Hu, Chunxia Xiao:

Photorealistic Style Transfer via Adaptive Filtering and Channel Seperation. 2627-2635 - Junyu Chen, Qianqian Xu, Zhiyong Yang, Ke Ma, Xiaochun Cao, Qingming Huang:

Recurrent Meta-Learning against Generalized Cold-start Problem in CTR Prediction. 2636-2644 - Liutao Yang, Rongjun Ge, Shichang Feng, Daoqiang Zhang:

Learning Projection Views for Sparse-View CT Reconstruction. 2645-2653 - Peichi Zhou, Dingbo Lu, Chen Li, Jian Zhang, Long Liu, Changbo Wang:

Unsupervised Textured Terrain Generation via Differentiable Rendering. 2654-2662 - Nikita Drobyshev, Jenya Chelishev, Taras Khakhulin, Aleksei Ivakhnenko, Victor Lempitsky, Egor Zakharov:

MegaPortraits: One-shot Megapixel Neural Head Avatars. 2663-2671 - Xin Ding, Tsuyoshi Takatani, Zhongyuan Wang, Ying Fu, Yinqiang Zheng:

Event-guided Video Clip Generation from Blurry Images. 2672-2680 - Jianhui Chang, Jian Zhang, Youmin Xu, Jiguo Li, Siwei Ma, Wen Gao:

Consistency-Contrast Learning for Conceptual Coding. 2681-2690 - Mandi Luo, Jie Cao, Ran He:

Order-aware Human Interaction Manipulation. 2691-2699 - Zipei Chen, Xiao Lu, Ling Zhang, Chunxia Xiao:

Semi-supervised Video Shadow Detection via Image-assisted Pseudo-label Generation. 2700-2708 - Xiaohao Xu, Jinglu Wang, Xiang Ming, Yan Lu:

Towards Robust Video Object Segmentation with Adaptive Object Calibration. 2709-2718 - Chengming Xu, Chen Liu, Siqian Yang, Yabiao Wang, Shijie Zhang, Lijie Jia, Yanwei Fu:

Split-PU: Hardness-aware Training Strategy for Positive-Unlabeled Learning. 2719-2729 - Jialei Xu, Xianming Liu, Yuanchao Bai, Junjun Jiang, Kaixuan Wang, Xiaozhi Chen, Xiangyang Ji:

Multi-Camera Collaborative Depth Prediction via Consistent Structure Estimation. 2730-2738 - Wenxue Cui, Shaohui Liu, Debin Zhao:

Fast Hierarchical Deep Unfolding Network for Image Compressed Sensing. 2739-2748 - Hongming Luo, Fei Zhou, Kin-Man Lam, Guoping Qiu:

Restoration of User Videos Shared on Social Media. 2749-2757 - Chenyang Qi, Junming Chen, Xin Yang, Qifeng Chen:

Real-time Streaming Video Denoising with Bidirectional Buffers. 2758-2766 - Yudong Liang, Bin Wang, Wenqi Ren, Jiaying Liu, Wenjian Wang, Wangmeng Zuo:

Learning Hierarchical Dynamics with Spatial Adjacency for Image Enhancement. 2767-2776 - Tao Xiang, Hangcheng Liu, Shangwei Guo, Hantao Liu, Tianwei Zhang:

Text's Armor: Optimized Local Adversarial Perturbation Against Scene Text Editing Attacks. 2777-2785 - Jiayun Fu, Bin B. Zhu, Haidong Zhang, Yayi Zou, Song Ge, Weiwei Cui, Yun Wang, Dongmei Zhang, Xiaojing Ma, Hai Jin:

ChartStamp: Robust Chart Embedding for Real-World Applications. 2786-2795 - Yan Hong, Li Niu, Jianfu Zhang, Liqing Zhang:

Few-shot Image Generation Using Discrete Content Representation. 2796-2804 - Jiaxin Zhang, Canjie Luo, Lianwen Jin, Fengjun Guo, Kai Ding:

Marior: Margin Removal and Iterative Content Rectification for Document Dewarping in the Wild. 2805-2815 - Wenhan Yang, Rizhao Cai, Alex C. Kot:

Image Inpainting Detection via Enriched Attentive Pattern with Near Original Image Augmentation. 2816-2824 - Haojia Lin, Lijiang Li, Xiawu Zheng, Fei Chao, Rongrong Ji:

Searching Lightweight Neural Network for Image Signal Processing. 2825-2833 - Zhengxin You, Qichao Ying, Sheng Li, Zhenxing Qian, Xinpeng Zhang:

Image Generation Network for Covert Transmission in Online Social Network. 2834-2842 - Bin Yang, Mang Ye, Jun Chen, Zesen Wu:

Augmented Dual-Contrastive Aggregation Learning for Unsupervised Visible-Infrared Person Re-Identification. 2843-2851 - Nikhil Bansal, Kartik Gupta, Kiruthika Kannan, Sivani Pentapati, Ravi Kiran Sarvadevabhatla:

DrawMon: A Distributed System for Detection of Atypical Sketch Content in Concurrent Pictionary Games. 2852-2861 - Jiali You, Zhenwen Ren, Quansen Sun, Yuan Sun, Xingfeng Li:

Approximate Shifted Laplacian Reconstruction for Multiple Kernel Clustering. 2862-2870 - Wujin Li, Jiawei Zhan, Jinbao Wang, Bizhong Xia, Bin-Bin Gao, Jun Liu, Chengjie Wang, Feng Zheng:

Towards Continual Adaptation in Industrial Anomaly Detection. 2871-2880 - Cheng Xiong, Guorui Feng, Xinran Li, Xinpeng Zhang, Chuan Qin:

Neural Network Model Protection with Piracy Identification and Tampering Localization Capability. 2881-2889 - Gang He, Kepeng Xu, Li Xu, Chang Wu, Ming Sun, Xing Wen, Yu-Wing Tai:

SDRTV-to-HDRTV via Hierarchical Dynamic Context Feature Mapping. 2890-2898 - Chen Tang, Haoyu Zhai, Kai Ouyang, Zhi Wang, Yifei Zhu, Wenwu Zhu:

Arbitrary Bit-width Network: A Joint Layer-Wise Quantization and Adaptive Inference Approach. 2899-2908 - Yiqin Zhao, Sheng Wei, Tian Guo:

Privacy-preserving Reflection Rendering for Augmented Reality. 2909-2918
Oral Session VII: Multimedia Systems -- Systems and Middleware
- Zitai Wang, Qianqian Xu, Ke Ma, Xiaochun Cao, Qingming Huang:

Confederated Learning: Going Beyond Centralization. 2939-2947 - Insoo Lee, Seyeon Kim, Sandesh Dhawaskar Sathyanarayana, Kyungmin Bin, Song Chong, Kyunghan Lee, Dirk Grunwald, Sangtae Ha:

R-FEC: RL-based FEC Adjustment for Better QoE in WebRTC. 2948-2956
Poster Session VII: Multimedia Systems -- Systems and Middleware
- Xingshuo Han, Guowen Xu, Yuan Zhou, Xuehuan Yang, Jiwei Li, Tianwei Zhang:

Physical Backdoor Attacks to Lane Detection Systems in Autonomous Driving. 2957-2968 - Haochen Wang, Jie Liu, Yongtuo Liu, Subhransu Maji, Jan-Jakob Sonke, Efstratios Gavves:

Dynamic Transformer for Few-shot Instance Segmentation. 2969-2977 - Hao Pan, Feitong Tan, Wenhao Li, Yi-Chao Chen, Guangtao Xue:

OISSR: Optical Image Stabilization Based Super Resolution on Smartphone Cameras. 2978-2986 - Iryanto Jaya, Yusen Li, Wentong Cai:

Improving Scalability, Sustainability and Availability via Workload Distribution in Edge-Cloud Gaming. 2987-2995 - Shahram Ghandeharizadeh:

Display of 3D Illuminations using Flying Light Specks. 2996-3005
Oral Session VIII: Multimedia Systems -- Transport and Delivery
- Nuowen Kan, Yuankun Jiang, Chenglin Li, Wenrui Dai, Junni Zou, Hongkai Xiong:

Improving Generalization for Neural Adaptive Video Streaming via Meta Reinforcement Learning. 3006-3016 - Taslim Murad, Anh Nguyen, Zhisheng Yan:

DAO: Dynamic Adaptive Offloading for Video Analytics. 3017-3025 - Rui-Xiao Zhang, Changpeng Yang, Xiaochan Wang, Tianchi Huang, Chenglei Wu, Jiangchuan Liu, Lifeng Sun:

AggCast: Practical Cost-effective Scheduling for Large-scale Cloud-edge Crowdsourced Live Streaming. 3026-3034 - Shengzhong Liu, Tianshi Wang, Jinyang Li, Dachun Sun, Mani B. Srivastava, Tarek F. Abdelzaher:

AdaMask: Enabling Machine-Centric Video Streaming with Adaptive Frame Masking for DNN Inference Offloading. 3035-3044
Poster Session VIII: Multimedia Systems -- Transport and Delivery
- Tiesong Zhao, Weize Feng, Hongji Zeng, Yiwen Xu, Yuzhen Niu, Jiaying Liu:

Learning-Based Video Coding with Joint Deep Compression and Enhancement. 3045-3054 - Han Gao, Jinzhong Cui, Mao Ye, Shuai Li, Yu Zhao, Xiatian Zhu:

Structure-Preserving Motion Estimation for Learned Video Compression. 3055-3063 - Tianchi Huang, Chao Zhou, Lianchen Jia, Rui-Xiao Zhang, Lifeng Sun:

Learned Internet Congestion Control for Short Video Uploading. 3064-3075 - Wenhao Tang, Sheng Huang, Xiaoxian Zhang, Luwen Huangfu:

PicT: A Slim Weakly Supervised Vision Transformer for Pavement Distress Classification. 3076-3084 - Hang Yuan, Wei Gao, Ge Li, Zhu Li:

Rate-Distortion-Guided Learning Approach with Cross-Projection Information for V-PCC Fast CU Decision. 3085-3093 - Shishir Subramanyam, Irene Viola, Jack Jansen, Evangelos Alexiou, Alan Hanjalic, Pablo César:

Evaluating the Impact of Tiled User-Adaptive Real-Time Point Cloud Streaming on VR Remote Communication. 3094-3103 - Devdeep Ray, Vicente Bobadilla Riquelme, Srinivasan Seshan:

Prism: Handling Packet Loss for Ultra-low Latency Video. 3104-3114 - Jin Zhou, Na Li, Yao Liu, Shuochao Yao, Songqing Chen:

Exploring Spherical Autoencoder for Spherical Video Content Processing. 3115-3123 - Jianxin Shi, Lingjun Pu, Xinjing Yuan, Qianyun Gong, Jingdong Xu:

Sophon: Super-Resolution Enhanced 360° Video Streaming with Visual Saliency-aware Prefetch. 3124-3133 - Tzu-Kuan Hung, I-Chun Huang, Samuel Rhys Cox, Wei Tsang Ooi, Cheng-Hsin Hsu:

Error Concealment of Dynamic 3D Point Cloud Streaming. 3134-3142 - Yiyun Lu, Yifei Zhu, Zhi Wang:

Personalized 360-Degree Video Streaming: A Meta-Learning Approach. 3143-3151
Oral Session IX: Multimedia Systems -- Data Systems Management and Indexing
- Evgenia Romanenkova, Alexander Stepikin, Matvey Morozov, Alexey Zaytsev:

InDiD: Instant Disorder Detection via a Principled Neural Network. 3152-3162 - An Qin, Mengbai Xiao, Ben Huang, Xiaodong Zhang:

Maze: A Cost-Efficient Video Deduplication System at Web-scale. 3163-3172
Poster Session IX: Multimedia Systems -- Data Systems Management and Indexing
- Chengyin Xu, Zenghao Chai, Zhengzhuo Xu, Chun Yuan, Yanbo Fan, Jue Wang:

HyP2 Loss: Beyond Hypersphere Metric Space for Multi-label Image Retrieval. 3173-3184 - Heng Lian, John Scovil Atwood, Bojian Hou, Jian Wu, Yi He:

Online Deep Learning from Doubly-Streaming Data. 3185-3194 - Hyunmin Jung, Hyuk-Jae Lee, Chae-Eun Rhee:

Re-ordered Micro Image based High Efficient Residual Coding in Light Field Compression. 3195-3204 - Yu Mao, Yufei Cui, Tei-Wei Kuo, Chun Jason Xue:

Accelerating General-purpose Lossless Compression via Simple and Scalable Parameterization. 3205-3213
Oral Session X: Understanding Multimedia Content -- Multimodal Fusion and Embeddings
- Mengzhu Wang, Jianlong Yuan, Qi Qian, Zhibin Wang, Hao Li:

Semantic Data Augmentation based Distance Metric Learning for Domain Generalization. 3214-3223 - Yuehao Yin, Bin Zhu, Jingjing Chen, Lechao Cheng, Yu-Gang Jiang:

Mix-DANN and Dynamic-Modal-Distillation for Video Domain Adaptation. 3224-3233 - Liqiang Nie, Leigang Qu, Dai Meng, Min Zhang, Qi Tian, Alberto Del Bimbo:

Search-oriented Micro-video Captioning. 3234-3243 - Jiannan Ge, Hongtao Xie, Shaobo Min, Pandeng Li, Yongdong Zhang:

Dual Part Discovery Network for Zero-Shot Learning. 3244-3252 - Yi Bin, Wenhao Shi, Jipeng Zhang, Yujuan Ding, Yang Yang, Heng Tao Shen:

Non-Autoregressive Cross-Modal Coherence Modelling. 3253-3261 - Ning Liao, Yifeng Liu, Xiaobo Li, Chenyi Lei, Guoxin Wang, Xian-Sheng Hua, Junchi Yan:

CoHOZ: Contrastive Multimodal Prompt Tuning for Hierarchical Open-set Zero-shot Recognition. 3262-3271 - Zhi-Qi Cheng, Qi Dai, Siyao Li, Teruko Mitamura, Alexander Hauptmann:

GSRFormer: Grounded Situation Recognition Transformer with Alternate Semantic Attention Refinement. 3272-3281 - Qinyi Du, Qingqing Wang, Keqian Li, Jidong Tian, Liqiang Xiao, Yaohui Jin:

CALM: Commen-Sense Knowledge Augmentation for Document Image Understanding. 3282-3290 - Dapeng Chen, Min Wang, Haobin Chen, Lin Wu, Jing Qin, Wei Peng:

Cross-Modal Retrieval with Heterogeneous Graph Embedding. 3291-3300 - Yujie Mo, Yuhuan Chen, Liang Peng, Xiaoshuang Shi, Xiaofeng Zhu:

Simple Self-supervised Multiplex Graph Representation Learning. 3301-3309 - Tom Braude, Idan Schwartz, Alexander G. Schwing, Ariel Shamir:

Ordered Attention for Coherent Visual Storytelling. 3310-3318 - Zhong Wang, Lin Zhang, Ying Shen, Yicong Zhou:

LVI-ExC: A Target-free LiDAR-Visual-Inertial Extrinsic Calibration Framework. 3319-3327 - Xiangming Gu, Longshen Ou, Danielle Ong, Ye Wang:

MM-ALT: A Multimodal Automatic Lyric Transcription System. 3328-3337 - Yachao Zhang, Miaoyu Li, Yuan Xie, Cuihua Li, Cong Wang, Zhizhong Zhang, Yanyun Qu:

Self-supervised Exclusive Learning for 3D Segmentation with Cross-Modal Unsupervised Domain Adaptation. 3338-3346 - Yafei Zhang, Yongzeng Wang, Huafeng Li, Shuang Li:

Cross-Compatible Embedding and Semantic Consistent Feature Construction for Sketch Re-identification. 3347-3355 - Liang Yang, Weihang Peng, Wenmiao Zhou, Bingxin Niu, Junhua Gu, Chuan Wang, Yuanfang Guo, Dongxiao He, Xiaochun Cao:

Difference Residual Graph Neural Networks. 3356-3364
Poster Session X: Understanding Multimedia Content -- Multimodal Fusion and Embeddings
- Man Zhou, Jie Huang, Keyu Yan, Gang Yang, Aiping Liu, Chongyi Li, Feng Zhao:

Normalization-based Feature Selection and Restitution for Pan-sharpening. 3365-3374 - Man Zhou, Jie Huang, Chongyi Li, Hu Yu, Keyu Yan, Naishan Zheng, Feng Zhao:

Adaptively Learning Low-high Frequency Information Integration for Pan-sharpening. 3375-3384 - Rongyao Hu, Liang Peng, Jiangzhang Gan, Xiaoshuang Shi, Xiaofeng Zhu:

Complementary Graph Representation Learning for Functional Neuroimaging Identification. 3385-3393 - Jiwei Guo, Jiajia Tang, Weichen Dai, Yu Ding, Wanzeng Kong:

Dynamically Adjust Word Representations Using Unaligned Multimodal Information. 3394-3402 - Weiqing Yan, Jindong Xu, Jinglei Liu, Guanghui Yue, Chang Tang:

Bipartite Graph-based Discriminative Feature Learning for Multi-View Clustering. 3403-3411 - Xingfeng Li, Quansen Sun, Zhenwen Ren, Yinghui Sun:

Dynamic Incomplete Multi-view Imputing and Clustering. 3412-3420 - Shudong Huang, Yixi Liu, Yazhou Ren, Ivor W. Tsang, Zenglin Xu, Jiancheng Lv:

Learning Smooth Representation for Multi-view Subspace Clustering. 3421-3429 - Mianzhao Wang, Fan Shi, Xu Cheng, Meng Zhao, Yao Zhang, Chen Jia, Weiwei Tian, Shengyong Chen:

LFBCNet: Light Field Boundary-aware and Cascaded Interaction Network for Salient Object Detection. 3430-3439 - Junpu Zhang, Liang Li, Siwei Wang, Jiyuan Liu, Yue Liu, Xinwang Liu, En Zhu:

Multiple Kernel Clustering with Dual Noise Minimization. 3440-3450 - Hui Cui, Lei Zhu, Jingjing Li, Zheng Zhang, Weili Guan:

Webly Supervised Image Hashing with Lightweight Semantic Transfer Network. 3451-3460 - Chenxi Ma, Bo Yan, Qing Lin, Weimin Tan, Siming Chen:

Rethinking Super-Resolution as Text-Guided Details Generation. 3461-3469 - Nan Yin, Li Shen, Baopu Li, Mengzhu Wang, Xiao Luo, Chong Chen, Zhigang Luo, Xian-Sheng Hua:

DEAL: An Unsupervised Domain Adaptive Framework for Graph-level Classification. 3470-3479 - Pinci Yang, Xin Wang, Xuguang Duan, Hong Chen, Runze Hou, Cong Jin, Wenwu Zhu:

AVQA: A Dataset for Audio-Visual Question Answering on Videos. 3480-3491 - Jinyu Yang, Zhe Li, Feng Zheng, Ales Leonardis, Jingkuan Song:

Prompting for Multi-Modal Tracking. 3492-3500 - Anjun Chen, Xiangyu Wang, Shaohao Zhu, Yanxu Li, Jiming Chen, Qi Ye:

mmBody Benchmark: 3D Body Reconstruction Dataset and Analysis for Millimeter Wave Radar. 3501-3510 - Haizhuang Liu, Huimin Ma, Yilin Wang, Bochao Zou, Tianyu Hu, Rongquan Wang, Jiansheng Chen:

Eliminating Spatial Ambiguity for Weakly Supervised 3D Object Detection without Spatial Labels. 3511-3520 - Zhongwei Qiu, Qiansheng Yang, Jian Wang, Dongmei Fu:

Dynamic Graph Reasoning for Multi-person 3D Pose Estimation. 3521-3529 - Junlong Li, Yiheng Xu, Tengchao Lv, Lei Cui, Cha Zhang, Furu Wei:

DiT: Self-supervised Pre-training for Document Image Transformer. 3530-3539 - Nathan Louis, Jason J. Corso, Tylan N. Templin, Travis D. Eliason, Daniel P. Nicolella:

Learning to Estimate External Forces of Human Motion in Video. 3540-3548 - Meihuizi Jia, Xin Shen, Lei Shen, Jinhui Pang, Lejian Liao, Yang Song, Meng Chen, Xiaodong He:

Query Prior Matters: A MRC Framework for Multimodal Named Entity Recognition. 3549-3558 - Md Fahim Faysal Khan, Anusha Devulapally, Siddharth Advani, Vijaykrishnan Narayanan:

Robust Multimodal Depth Estimation using Transformer based Generative Adversarial Networks. 3559-3568 - Fu'ze Cong, Shibiao Xu, Li Guo, Yinbing Tian:

Caption-Aware Medical VQA via Semantic Focusing and Progressive Cross-Modality Comprehension. 3569-3577 - Guiyang Luo, Hui Zhang, Quan Yuan, Jinglin Li:

Complementarity-Enhanced and Redundancy-Minimized Collaboration Network for Multi-agent Perception. 3578-3586 - Qian Yang, Yunxin Li, Baotian Hu, Lin Ma, Yuxin Ding, Min Zhang:

Chunk-aware Alignment and Lexical Constraint for Visual Entailment with Natural Language Explanations. 3587-3597 - Xuelin Zhu, Jiuxin Cao, Jiawei Ge, Weijia Liu, Bo Liu:

Two-Stream Transformer for Multi-Label Image Classification. 3598-3607 - Dulanga Weerakoon, Vigneshwaran Subbaraju, Tuan Tran, Archan Misra:

SoftSkip: Empowering Multi-Modal Dynamic Pruning for Single-Stage Referring Comprehension. 3608-3616 - Ronghao Dang, Zhuofan Shi, Liuyi Wang, Zongtao He, Chengju Liu, Qijun Chen:

Unbiased Directed Object Attention Graph for Object Navigation. 3617-3627 - Meng Sun, Ju Ren, Xin Wang, Wenwu Zhu, Yaoxue Zhang:

FastPR: One-stage Semantic Person Retrieval via Self-supervised Learning. 3628-3636 - Yingchen Yu, Fangneng Zhan, Rongliang Wu, Jiahui Zhang, Shijian Lu, Miaomiao Cui, Xuansong Xie, Xian-Sheng Hua, Chunyan Miao:

Towards Counterfactual Image Manipulation via CLIP. 3637-3645 - Jiaqing Fan, Tiankang Su, Kaihua Zhang, Qingshan Liu:

Bidirectionally Learning Dense Spatio-temporal Feature Propagation Network for Unsupervised Video Object Segmentation. 3646-3655 - Shuyong Gao, Haozhe Xing, Wei Zhang, Yan Wang, Qianyu Guo, Wenqiang Zhang:

Weakly Supervised Video Salient Object Detection via Point Supervision. 3656-3665 - Rui Yan, Peng Huang, Xiangbo Shu, Junhao Zhang, Yonghua Pan, Jinhui Tang:

Look Less Think More: Rethinking Compositional Action Recognition. 3666-3675 - Xinhang Wan, Jiyuan Liu, Weixuan Liang, Xinwang Liu, Yi Wen, En Zhu:

Continual Multi-view Clustering. 3676-3684 - Tiejian Zhang, Xinwang Liu, En Zhu, Sihang Zhou, Zhibin Dong:

Efficient Anchor Learning-based Multi-view Clustering - A Late Fusion Method. 3685-3693 - Xianshuai Cao, Yuliang Shi, Jihu Wang, Han Yu, Xinjun Wang, Zhongmin Yan:

Cross-modal Knowledge Graph Contrastive Learning for Machine Learning Method Recommendation. 3694-3702 - Zan Gao, Hongwei Wei, Weili Guan, Weizhi Nie, Meng Liu, Meng Wang:

Multigranular Visual-Semantic Embedding for Cloth-Changing Person Re-identification. 3703-3711 - Liang Li, Baihua Zheng, Weiwei Sun:

Adaptive Structural Similarity Preserving for Unsupervised Cross Modal Hashing. 3712-3721 - Hao Sun, Hongyi Wang, Jiaqing Liu, Yen-Wei Chen, Lanfen Lin:

CubeMLP: An MLP-based Model for Multimodal Sentiment Analysis and Depression Estimation. 3722-3729 - Bicheng Guo, Tao Chen, Shibo He, Haoyu Liu, Lilin Xu, Peng Ye, Jiming Chen:

Generalized Global Ranking-Aware Neural Architecture Ranker for Efficient Image Classifier Search. 3730-3741 - Jinxiang Liu, Chen Ju, Weidi Xie, Ya Zhang:

Exploiting Transformation Invariance and Equivariance for Self-supervised Sound Localisation. 3742-3753 - Yanbin Hao, Jingru Duan, Hao Zhang, Bin Zhu, Pengyuan Zhou, Xiangnan He:

Unsupervised Video Hashing with Multi-granularity Contextualization and Multi-structure Preservation. 3754-3763 - Haiyang Liu, Naoya Iwamoto, Zihao Zhu, Zhengqing Li, You Zhou, Elif Bozkurt, Bo Zheng:

DisCo: Disentangled Implicit Content and Rhythm Learning for Diverse Co-Speech Gestures Synthesis. 3764-3773 - Man-Sheng Chen, Tuo Liu, Chang-Dong Wang, Dong Huang, Jian-Huang Lai:

Adaptively-weighted Integral Space for Fast Multiview Clustering. 3774-3782 - Zhiying Jiang, Zengxi Zhang, Xin Fan, Risheng Liu:

Towards All Weather and Unobstructed Multi-Spectral Image Stitching: Algorithm and Benchmark. 3783-3791 - Shizhe Hu, Ruilin Geng, Zhaoxu Cheng, Chaoyang Zhang, Guoliang Zou, Zhengzheng Lou, Yangdong Ye:

A Parameter-free Multi-view Information Bottleneck Clustering Method by Cross-view Weighting. 3792-3800 - Mengze Li, Tianbao Wang, Haoyu Zhang, Shengyu Zhang, Zhou Zhao, Wenqiao Zhang, Jiaxu Miao, Shiliang Pu, Fei Wu:

HERO: HiErarchical spatio-tempoRal reasOning with Contrastive Action Correspondence for End-to-End Video Object Grounding. 3801-3810 - Xiaoyu Zhou, Xiaotong Song, Hao Wu, Jingran Zhang, Xing Xu:

MAVT-FG: Multimodal Audio-Visual Transformer for Weakly-supervised Fine-Grained Recognition. 3811-3819 - Haichao Shi, Xiaoyu Zhang, Changsheng Li, Lixing Gong, Yong Li, Yongjun Bao:

Dynamic Graph Modeling for Weakly-Supervised Temporal Action Localization. 3820-3828 - Miaoyu Li, Yachao Zhang, Yuan Xie, Zuodong Gao, Cuihua Li, Zhizhong Zhang, Yanyun Qu:

Cross-Domain and Cross-Modal Knowledge Distillation in Domain Adaptation for 3D Semantic Segmentation. 3829-3837 - Eric Zhongcong Xu, Zeyang Song, Satoshi Tsutsui, Chao Feng, Mang Ye, Mike Zheng Shou:

AVA-AVD: Audio-visual Speaker Diarization in the Wild. 3838-3847 - Bo Peng, Liren He, Yining Qiu, Dong Wu, Mingmin Chi:

Image-Signal Correlation Network for Textile Fiber Identification. 3848-3856 - Derong Xu, Tong Xu, Shiwei Wu, Jingbo Zhou, Enhong Chen:

Relation-enhanced Negative Sampling for Multimodal Knowledge Graph Completion. 3857-3866 - Wuxuan Shi, Mang Ye, Bo Du:

Symmetric Uncertainty-Aware Feature Transmission for Depth Super-Resolution. 3867-3876 - Jiawei Fan, Yu Zhao, Xie Yu, Lihua Ma, Junqi Liu, Fangqiu Yi, Boxun Li:

DTR: An Information Bottleneck Based Regularization Framework for Video Action Recognition. 3877-3885 - Jin Yuan, Feng Hou, Yangzhou Du, Zhongchao Shi, Xin Geng, Jianping Fan, Yong Rui:

Self-Supervised Graph Neural Network for Multi-Source Domain Adaptation. 3907-3916 - Ho Yin Au, Jie Chen, Junkun Jiang, Yike Guo:

ChoreoGraph: Music-conditioned Automatic Dance Choreography over a Style and Tempo Consistent Dynamic Graph. 3917-3925 - Rui Peng, Tao Zhang, Bing Li, Yitong Wang:

Pixelwise Adaptive Discretization with Uncertainty Sampling for Depth Completion. 3926-3935 - Zhe Xue, Junping Du, Hai Zhu, Zhongchao Guan, Yunfei Long, Yu Zang, Meiyu Liang:

Robust Diversified Graph Contrastive Network for Incomplete Multi-view Clustering. 3936-3944 - Xiyu Wang, Yuecong Xu, Jianfei Yang, Kezhi Mao:

Calibrating Class Weights with Multi-Modal Information for Partial Video Domain Adaptation. 3945-3954 - Duo Chen, Zixin Tang, Yiguang Liu:

Cyclical Fusion: Accurate 3D Reconstruction via Cyclical Monotonicity. 3955-3964 - Tengfei Liang, Yi Jin, Wu Liu, Songhe Feng, Tao Wang, Yidong Li:

Keypoint-Guided Modality-Invariant Discriminative Learning for Visible-Infrared Person Re-identification. 3965-3973 - Gang Yang, Li Zhang, Man Zhou, Aiping Liu, Xun Chen, Zhiwei Xiong, Feng Wu:

Model-Guided Multi-Contrast Deep Unfolding Network for MRI Super-resolution Reconstruction. 3974-3982 - Fei Zhao, Chunhui Li, Zhen Wu, Shangyu Xing, Xinyu Dai:

Learning from Different text-image Pairs: A Relation-enhanced Graph Convolutional Network for Multimodal NER. 3983-3992 - Shuo Wang, Xinyu Zhang, Yanbin Hao, Chengbing Wang, Xiangnan He:

Multi-directional Knowledge Transfer for Few-Shot Learning. 3993-4002 - Yiming Sun, Bing Cao, Pengfei Zhu, Qinghua Hu:

DetFusion: A Detection-driven Infrared and Visible Image Fusion Network. 4003-4011 - Cuiqun Chen, Mang Ye, Meibin Qi, Bo Du:

Sketch Transformer: Asymmetrical Disentanglement Learning from Dynamic Synthesis. 4012-4020 - Jinxiang Lai, Siqian Yang, Guannan Jiang, Xi Wang, Yuxi Li, Zihui Jia, Xiaochen Chen, Jun Liu, Bin-Bin Gao, Wei Zhang, Yuan Xie, Chengjie Wang:

Rethinking the Metric in Few-shot Learning: From an Adaptive Multi-Distance Perspective. 4021-4030 - Yuanbin Wang, Leyan Zhu, Shaofei Huang, Tianrui Hui, Xiaojie Li, Fei Wang, Si Liu:

Cross-Modality Domain Adaptation for Freespace Detection: A Simple yet Effective Baseline. 4031-4042 - Jin Xie, Rao Muhammad Anwer, Hisham Cholakkal, Jing Nie, Jiale Cao, Jorma Laaksonen, Fahad Shahbaz Khan:

Learning a Dynamic Cross-Modal Network for Multispectral Pedestrian Detection. 4043-4052 - Haihan Wang, Shangfei Wang, Lin Fang:

Two-Stage Multi-Scale Resolution-Adaptive Network for Low-Resolution Face Recognition. 4053-4062 - Xuan Zhang, Xun Liang, Xiangping Zheng, Bo Wu, Yuhui Guo:

When True Becomes False: Few-Shot Link Prediction beyond Binary Relations through Mining False Positive Entities. 4063-4071 - Hanjia Lyu, Jiebo Luo:

Understanding Political Polarization via Jointly Modeling Users, Connections and Multimodal Contents on Heterogeneous Graphs. 4072-4082
Oral Session XI: Understanding Multimedia Content -- Vision and Language
- Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei:

LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking. 4083-4091 - Daizong Liu, Xiaoye Qu, Wei Hu:

Reducing the Vision and Language Bias for Temporal Sentence Grounding. 4092-4101 - Luchuan Song, Xiaodan Li, Zheng Fang, Zhenchao Jin, Yuefeng Chen, Chenliang Xu:

Face Forgery Detection via Symmetric Transformer. 4102-4111 - Zaisheng Li, Yi Li, Liang Qiao, Pengfei Li, Zhanzhan Cheng, Yi Niu, Shiliang Pu, Xi Li:

End-to-End Compound Table Understanding with Multi-Modal Modeling. 4112-4121 - Yiyuan Zhang, Yuqi Ji:

Modality Eigen-Encodings Are Keys to Open Modality Informative Containers. 4122-4131 - Yue Ma, Yali Wang, Yue Wu, Ziyu Lyu, Siran Chen, Xiu Li, Yu Qiao:

Visual Knowledge Graph for Human Action Reasoning in Videos. 4132-4141 - Feilong Chen, Duzhen Zhang, Xiuyi Chen, Jing Shi, Shuang Xu, Bo Xu:

Unsupervised and Pseudo-Supervised Vision-Language Alignment in Visual Dialog. 4142-4153 - Jingqun Tang, Su Qiao, Benlei Cui, Yuhang Ma, Sheng Zhang, Dimitrios Kanoulas:

You Can even Annotate Text with Voice: Transcription-only-Supervised Text Spotting. 4154-4163 - Chao Bi, Shuhui Wang, Zhe Xue, Shengbo Chen, Qingming Huang:

Inferential Visual Question Generation. 4164-4174 - Gal-Lev Shalev, Gabi Shalev, Joseph Keshet:

A Baseline for Detecting Out-of-Distribution Examples in Image Captioning. 4175-4184 - Jingyuan Xu, Hongtao Xie, Chuanbin Liu, Yongdong Zhang:

Proxy Probing Decoder for Weakly Supervised Object Localization: A Baseline Investigation. 4185-4193 - Yusheng Zhao, Jinyu Chen, Chen Gao, Wenguan Wang, Lirong Yang, Haibing Ren, Huaxia Xia, Si Liu:

Target-Driven Structured Transformer Planner for Vision-Language Navigation. 4194-4203 - Xingchen Li, Long Chen, Wenbo Ma, Yi Yang, Jun Xiao:

Integrating Object-aware and Interaction-aware Knowledge for Weakly Supervised Scene Graph Generation. 4204-4213 - Mingkun Yang, Minghui Liao, Pu Lu, Jing Wang, Shenggao Zhu, Hualin Luo, Qi Tian, Xiang Bai:

Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition. 4214-4223 - Xudong Tian, Jun Liu, Zhizhong Zhang, Chengjie Wang, Yanyun Qu, Yuan Xie, Lizhuang Ma:

Hierarchical Walking Transformer for Object Re-Identification. 4224-4232 - Siying Wu, Xueyang Fu, Feng Wu, Zheng-Jun Zha:

Cross-modal Semantic Alignment Pre-training for Vision-and-Language Navigation. 4233-4241 - Rundong He, Zhongyi Han, Xiankai Lu, Yilong Yin:

RONF: Reliable Outlier Synthesis under Noisy Feature Space for Out-of-Distribution Detection. 4242-4251 - Yasunori Ohishi, Marc Delcroix, Tsubasa Ochiai, Shoko Araki, Daiki Takeuchi, Daisuke Niizumi, Akisato Kimura, Noboru Harada, Kunio Kashino:

ConceptBeam: Concept Driven Target Speech Extraction. 4252-4260 - Haoyu Cao, Xin Li, Jiefeng Ma, Deqiang Jiang, Antai Guo, Yiqing Hu, Hao Liu, Yinsong Liu, Bo Ren:

Query-driven Generative Network for Document Information Extraction in the Wild. 4261-4271 - Dezhi Peng, Xinyu Wang, Yuliang Liu, Jiaxin Zhang, Mingxin Huang, Songxuan Lai, Jing Li, Shenggao Zhu, Dahua Lin, Chunhua Shen, Xiang Bai, Lianwen Jin:

SPTS: Single-Point Text Spotting. 4272-4281 - Yiyang Ma, Huan Yang, Bei Liu, Jianlong Fu, Jiaying Liu:

AI Illustrator: Translating Raw Descriptions into Images by Prompt-based Cross-Modal Generation. 4282-4290 - Xiaoyu Zhang, Yulin Jin, Tao Wang, Jian Lou, Xiaofeng Chen:

Purifier: Plug-and-play Backdoor Mitigation for Pre-trained Models Via Anomaly Activation Suppression. 4291-4299 - Junsheng Wang, Tiantian Gong, Zhixiong Zeng, Changchang Sun, Yan Yan:

C3CMR: Cross-Modality Cross-Instance Contrastive Learning for Cross-Media Retrieval. 4300-4308 - Aihua Zheng, Peng Pan, Hongchao Li, Chenglong Li, Bin Luo, Chang Tan, Ruoran Jia:

Progressive Attribute Embedding for Accurate Cross-modality Person Re-ID. 4309-4317 - Lihua Zhou, Mao Ye, Xiatian Zhu, Shuaifeng Li, Yiguang Liu:

Class Discriminative Adversarial Learning for Unsupervised Domain Adaptation. 4318-4326 - Zhuowei Chen, Zhendong Mao, Shancheng Fang, Bo Hu:

Background Layout Generation and Object Knowledge Transfer for Text-to-Image Generation. 4327-4335 - Rengang Li, Baoyu Fan, Xiaochuan Li, Runze Zhang, Zhenhua Guo, Kun Zhao, Yaqian Zhao, Weifeng Gong, Endong Wang:

Towards Further Comprehension on Referring Expression with Rationale. 4336-4344 - Mengqi Huang, Zhendong Mao, Penghui Wang, Quan Wang, Yongdong Zhang:

DSE-GAN: Dynamic Semantic Evolution Generative Adversarial Network for Text-to-Image Generation. 4345-4354 - Hao Wei, Shuhui Wang, Xinzhe Han, Zhe Xue, Bin Ma, Xiaoming Wei, Xiaolin Wei:

Synthesizing Counterfactual Samples for Effective Image-Text Matching. 4355-4364 - Jingjing Zhang, Shancheng Fang, Zhendong Mao, Zhiwei Zhang, Yongdong Zhang:

Fine-tuning with Multi-modal Entity Prompts for News Image Captioning. 4365-4373 - Yangjun Mao, Long Chen, Zhihong Jiang, Dong Zhang, Zhimeng Zhang, Jian Shao, Jun Xiao:

Rethinking the Reference-based Distinctive Image Captioning. 4374-4384 - Alex Falcon, Giuseppe Serra, Oswald Lanz:

A Feature-space Multimodal Data Augmentation Technique for Text-video Retrieval. 4385-4394
Poster Session XI: Understanding Multimedia Content -- Vision and Language
- Zejun Li, Zhihao Fan, Huaixiao Tou, Jingjing Chen, Zhongyu Wei, Xuanjing Huang:

MVPTR: Multi-Level Semantic Alignment for Vision-Language Pre-Training via Multi-Stage Learning. 4395-4405 - Prince Jha, Gaël Dias, Alexis Lechervy, José G. Moreno, Anubhav Jangra, Sebastião Pais, Sriparna Saha:

Combining Vision and Language Representations for Patch-based Identification of Lexico-Semantic Relations. 4406-4415 - Weidong Chen, Dexiang Hong, Yuankai Qi, Zhenjun Han, Shuhui Wang, Laiyun Qing, Qingming Huang, Guorong Li:

Multi-Attention Network for Compressed Video Referring Object Segmentation. 4416-4425 - Kai Niu, Linjiang Huang, Yan Huang, Peng Wang, Liang Wang, Yanning Zhang:

Cross-modal Co-occurrence Attributes Alignments for Person Search by Language. 4426-4434 - Heqian Qiu, Hongliang Li, Taijin Zhao, Lanxiao Wang, Qingbo Wu, Fanman Meng:

RefCrowd: Grounding the Target in Crowd with Referring Expressions. 4435-4444 - Qiming Yang, Kai Zhang, Chaoxiang Lan, Zhi Yang, Zheyang Li, Wenming Tan, Jun Xiao, Shiliang Pu:

Unified Normalization for Accelerating and Stabilizing Transformers. 4445-4455 - Hui Zhu, Yongchun Lü, Hongbin Wang, Xunyi Zhou, Qin Ma, Yanhong Liu, Ning Jiang, Xin Wei, Linchengxi Zeng, Xiaofang Zhao:

Enhancing Semi-Supervised Learning with Cross-Modal Knowledge. 4456-4465 - Zi Qian, Xin Wang, Xuguang Duan, Hong Chen, Wenwu Zhu:

Dynamic Spatio-Temporal Modular Network for Video Question Answering. 4466-4477 - Xiao Wang, Tian Gan, Yinwei Wei, Jianlong Wu, Dai Meng, Liqiang Nie:

Micro-video Tagging via Jointly Modeling Social Influence and Tag Relation. 4478-4486 - Qiang Zhou, Chaohui Yu, Hao Luo, Zhibin Wang, Hao Li:

MimCo: Masked Image Modeling Pre-training with Contrastive Teacher. 4487-4495 - Gaoxiang Cong, Liang Li, Zhenhuan Liu, Yunbin Tu, Weijun Qin, Shenyuan Zhang, Chengang Yan, Wenyu Wang, Bin Jiang:

LS-GAN: Iterative Language-based Image Manipulation via Long and Short Term Consistency Reasoning. 4496-4504 - Chuanpeng Yang, Fuqing Zhu, Guihua Liu, Jizhong Han, Songlin Hu:

Multimodal Hate Speech Detection via Cross-Domain Knowledge Transfer. 4505-4514 - Zhiyuan Ma, Jianjun Li, Guohui Li, Kaiyan Huang:

CMAL: A Novel Cross-Modal Associative Learning Framework for Vision-Language Pre-Training. 4515-4524 - Xujie Zhang, Yu Sha, Michael C. Kampffmeyer, Zhenyu Xie, Zequn Jie, Chengwen Huang, Jianqing Peng, Xiaodan Liang:

ARMANI: Part-level Garment-Text Alignment for Unified Cross-Modal Fashion Design. 4525-4535 - Daizong Liu, Wei Hu:

Skimming, Locating, then Perusing: A Human-Like Framework for Natural Language Video Localization. 4536-4545 - Guangzhi Wang, Yangyang Guo, Yongkang Wong, Mohan S. Kankanhalli:

Distance Matters in Human-Object Interaction Detection. 4546-4554 - Chen-Wei Xie, Jianmin Wu, Yun Zheng, Pan Pan, Xian-Sheng Hua:

Token Embeddings Alignment for Cross-Modal Retrieval. 4555-4563 - Zan-Xia Jin, Mike Zheng Shou, Fang Zhou, Satoshi Tsutsui, Jingyan Qin, Xu-Cheng Yin:

From Token to Word: OCR Token Evolution via Contrastive Learning and Semantic Matching for Text-VQA. 4564-4572 - Xinyu Huang, Youcai Zhang, Ying Cheng, Weiwei Tian, Ruiwei Zhao, Rui Feng, Yuejie Zhang, Yaqian Li, Yandong Guo, Xiaobo Zhang:

IDEA: Increasing Text Diversity via Online Multi-Label Recognition for Vision-Language Pre-training. 4573-4583 - Guohao Li, Hu Yang, Feng He, Zhifan Feng, Yajuan Lyu, Hua Wu, Haifeng Wang:

CLOP: Video-and-Language Pre-Training with Knowledge Regularizations. 4584-4593 - Yudong Li, Xianxu Hou, Zhe Zhao, Linlin Shen, Xuefeng Yang, Kimmo Yan:

Talk2Face: A Unified Sequence-based Framework for Diverse Face Generation and Analysis Tasks. 4594-4604 - Zhenyu Wu, Zhou Ren, Yi Wu, Zhangyang Wang, Gang Hua:

TxVAD: Improved Video Action Detection by Transformers. 4605-4613 - Xin Li, Yan Zheng, Yiqing Hu, Haoyu Cao, Yunfei Wu, Deqiang Jiang, Yinsong Liu, Bo Ren:

Relational Representation Learning in Visually-Rich Documents. 4614-4624 - Zihao Wang, Junli Wang, Changjun Jiang:

Unified Multimodal Model with Unlikelihood Training for Visual Dialog. 4625-4634 - Manyi Zhang, Yuxin Ren, Zihao Wang, Chun Yuan:

Tackling Instance-Dependent Label Noise with Dynamic Distribution Calibration. 4635-4644 - Muhammad Umer Anwaar, Zhihui Pan, Martin Kleinsteuber:

On Leveraging Variational Graph Embeddings for Open World Compositional Zero-Shot Learning. 4645-4654 - Feifei Zhang, Ming Yan, Ji Zhang, Changsheng Xu:

Comprehensive Relationship Reasoning for Composed Query Based Image Retrieval. 4655-4664 - Ramtin Hosseini, Pengtao Xie:

Image Understanding by Captioning with Differentiable Architecture Search. 4665-4673 - Muqi Huang, Lefei Zhang:

Atrous Pyramid Transformer with Spectral Convolution for Image Inpainting. 4674-4683 - Ding Ma, Xiangqian Wu:

QuadTreeCapsule: QuadTree Capsules for Deep Regression Tracking. 4684-4693 - Qixin Deng, Binh Huy Le, Aobo Jin, Zhigang Deng:

End-to-End 3D Face Reconstruction with Expressions and Specular Albedos from Single In-the-wild Images. 4694-4703 - Yunqing He, Tongwei Ren, Jinhui Tang, Gangshan Wu:

Heterogeneous Learning for Scene Graph Generation. 4704-4713 - Yicong Li, Xiang Wang, Junbin Xiao, Tat-Seng Chua:

Equivariant and Invariant Grounding for Video Question Answering. 4714-4722 - Yan Yu, Yuchen Zhai, Yin Zhang:

Align and Adapt: A Two-stage Adaptation Framework for Unsupervised Domain Adaptation. 4723-4732 - Yutong Tan, Zheng Lin, Peng Fu, Mingyu Zheng, Lanrui Wang, Yanan Cao, Weiping Wang:

Detach and Attach: Stylized Image Captioning without Paired Stylized Dataset. 4733-4741 - Wei Zhang, Xiaohong Zhang, Sheng Huang, Yuting Lu, Kun Wang:

PixelSeg: Pixel-by-Pixel Stochastic Semantic Segmentation for Ambiguous Medical Images. 4742-4750 - Wei Zhang, Xiaohong Zhang, Sheng Huang, Yuting Lu, Kun Wang:

A Probabilistic Model for Controlling Diversity and Accuracy of Ambiguous Medical Image Segmentation. 4751-4759 - Ziyu Zhao, Zhenyao Wu, Xinyi Wu, Canyu Zhang, Song Wang:

Crossmodal Few-shot 3D Point Cloud Semantic Segmentation. 4760-4768 - Ben Fei, Weidong Yang, Wen-Ming Chen, Lipeng Ma:

VQ-DcTr: Vector-Quantized Autoencoder With Dual-channel Transformer Points Splitting for 3D Point Cloud Completion. 4769-4778 - Baoli Sun, Xinchen Ye, Tiantian Yan, Zhihui Wang, Haojie Li, Zhiyong Wang:

Fine-grained Action Recognition with Robust Motion Representation Decoupling and Concentration. 4779-4788 - Sheng Fang, Shuhui Wang, Junbao Zhuo, Qingming Huang, Bin Ma, Xiaoming Wei, Xiaolin Wei:

Concept Propagation via Attentional Knowledge Graph Reasoning for Video-Text Retrieval. 4789-4800 - Jingye Wang, Ruoyi Du, Dongliang Chang, Kongming Liang, Zhanyu Ma:

Domain Generalization via Frequency-domain-based Feature Disentanglement and Interaction. 4821-4829 - Runpeng Hou, Ziyuan Ye, Chengyu Yang, Linhao Fu, Chao Liu, Quanying Liu:

Immunofluorescence Capillary Imaging Segmentation: Cases Study. 4830-4838 - Siyuan Liang, Aishan Liu, Jiawei Liang, Longkang Li, Yang Bai, Xiaochun Cao:

Imitated Detectors: Stealing Knowledge of Black-box Object Detectors. 4839-4847 - Wu Zheng, Li Jiang, Fanbin Lu, Yangyang Ye, Chi-Wing Fu:

Boosting Single-Frame 3D Object Detection by Simulating Multi-Frame Point Clouds. 4848-4856 - Fengbin Zhu, Wenqiang Lei, Fuli Feng, Chao Wang, Haozhou Zhang, Tat-Seng Chua:

Towards Complex Document Understanding By Discrete Reasoning. 4857-4866 - Hanlin Li, Guanting Dong, Yueyi Zhang, Xiaoyan Sun, Zhiwei Xiong:

RPPformer-Flow: Relative Position Guided Point Transformer for Scene Flow Estimation. 4867-4876 - Wenjin Wang, Zhengjie Huang, Bin Luo, Qianglong Chen, Qiming Peng, Yinxu Pan, Weichong Yin, Shikun Feng, Yu Sun, Dianhai Yu, Yin Zhang:

mmLayout: Multi-grained MultiModal Transformer for Document Understanding. 4877-4886 - Haoran Wang, Di Xu, Dongliang He, Fu Li, Zhong Ji, Jungong Han, Errui Ding:

Boosting Video-Text Retrieval with Explicit High-Level Semantics. 4887-4898 - Hengyi Zhou, Longjun Liu, Haonan Zhang, Nanning Zheng:

Rethinking the Mechanism of the Pattern Pruning and the Circle Importance Hypothesis. 4899-4908 - Xinya Wu, Duo Zheng, Ruonan Wang, Jiashen Sun, Minzhen Hu, Fangxiang Feng, Xiaojie Wang, Huixing Jiang, Fan Yang:

A Region-based Document VQA. 4909-4920 - Hui Lu, Xuan Cheng, Wentao Xia, Pan Deng, Minghui Liu, Tianshu Xie, Xiaomin Wang, Ming Liu:

CyclicShift: A Data Augmentation Method For Enriching Data Patterns. 4921-4929 - Jinqiang Wang, Rui Hu, Chaoquan Jiang, Rui Hu, Jitao Sang:

Counterexample Contrastive Learning for Spurious Correlation Elimination. 4930-4938 - Tao Jin, Zhou Zhao, Meng Zhang, Xingshan Zeng:

MC-SLT: Towards Low-Resource Signer-Adaptive Sign Language Translation. 4939-4947 - Yang Qin, Dezhong Peng, Xi Peng, Xu Wang, Peng Hu:

Deep Evidential Learning with Noisy Correspondence for Cross-modal Retrieval. 4948-4956 - Hongyu Gao, Chao Zhu, Mengyin Liu, Weibo Gu, Hongfa Wang, Wei Liu, Xu-Cheng Yin:

CAliC: Accurate and Efficient Image-Text Retrieval via Contrastive Alignment and Visual Contexts Modeling. 4957-4966 - Meng Cao, Ji Jiang, Long Chen, Yuexian Zou:

Correspondence Matters for Video Referring Expression Comprehension. 4967-4976 - Zheng Wang, Zhenwei Gao, Xing Xu, Yadan Luo, Yang Yang, Heng Tao Shen:

Point to Rectangle Matching for Image Text Retrieval. 4977-4986 - Ruijie Hou, Yanran Li, Ningyu Zhang, Yulin Zhou, Xiaosong Yang, Zhao Wang:

Shifting Perspective to See Difference: A Novel Multi-view Method for Skeleton based Action Recognition. 4987-4995 - Yi Zhang, Junyang Wang, Jitao Sang:

Counterfactually Measuring and Eliminating Social Bias in Vision-Language Pre-training Models. 4996-5004 - Jiaming Zhang, Qi Yi, Jitao Sang:

Towards Adversarial Attack on Vision-Language Pre-training Models. 5005-5013 - Wei Wang, Yu Zhou, Jiahao Lyu, Dayan Wu, Guoqing Zhao, Ning Jiang, Weiping Wang:

TPSNet: Reverse Thinking of Thin Plate Splines for Arbitrary Shape Scene Text Representation. 5014-5025 - Zhengcong Fei:

Efficient Modeling of Future Context for Image Captioning. 5026-5035 - Banglei Guan, Ji Zhao:

Relative Pose Estimation for Multi-Camera Systems from Point Correspondences with Scale Ratio. 5036-5044 - Jun Peng, Han Pan, Yiyi Zhou, Jing He, Xiaoshuai Sun, Yan Wang, Yongjian Wu, Rongrong Ji:

Towards Open-Ended Text-to-Face Generation, Combination and Manipulation. 5045-5054 - Dongqing Wu, Huihui Li, Cang Gu, Lei Guo, Hang Liu:

Improving Fusion of Region Features and Grid Features via Two-Step Interaction for Image-Text Retrieval. 5055-5064 - Weixin An, Yingjie Yue, Yuanyuan Liu, Fanhua Shang, Hongying Liu:

A Numerical DEs Perspective on Unfolded Linearized ADMM Networks for Inverse Problems. 5065-5073 - Yonghui Wang, Wengang Zhou, Zhenbo Lu, Houqiang Li:

UDoc-GAN: Unpaired Document Illumination Correction with Background Light Prior. 5074-5082 - Juncheng Li, Junlin Xie, Linchao Zhu, Long Qian, Siliang Tang, Wenqiao Zhang, Haochen Shi, Shengyu Zhang, Longhui Wei, Qi Tian, Yueting Zhuang:

Dilated Context Integrated Network with Cross-Modal Consensus for Temporal Emotion Localization in Videos. 5083-5092 - Dong Wang, Yicheng Liu, Liangji Fang, Fanhua Shang, Yuanyuan Liu, Hongying Liu:

Balanced Gradient Penalty Improves Deep Long-Tailed Learning. 5093-5101 - Jinlu Zhang, Yujin Chen, Zhigang Tu:

Uncertainty-Aware 3D Human Pose Estimation from Monocular Video. 5102-5113 - Wenpeng Xing, Jie Chen:

MVSPlenOctree: Fast and Generic Reconstruction of Radiance Fields in PlenOctree from Multi-view Stereo. 5114-5122 - Junkun Jiang, Jie Chen, Yike Guo:

A Dual-Masked Auto-Encoder for Robust Motion Capture with Spatial-Temporal Skeletal Token Completion. 5123-5131 - Jun Peng, Xiaoxiong Du, Yiyi Zhou, Jing He, Yunhang Shen, Xiaoshuai Sun, Rongrong Ji:

Learning Dynamic Prior Knowledge for Text-to-Face Pixel Synthesis. 5132-5141 - Jingzheng Li, Hailong Sun:

Correct Twice at Once: Learning to Correct Noisy Labels for Robust Deep Learning. 5142-5151 - Zhihong Chen, Guanbin Li, Xiang Wan:

Align, Reason and Learn: Enhancing Medical Vision-and-Language Pre-training with Knowledge. 5152-5161 - Lingwei Dang, Yongwei Nie, Chengjiang Long, Qing Zhang, Guiqing Li:

Diverse Human Motion Prediction via Gumbel-Softmax Sampling from an Auxiliary Space. 5162-5171 - Meng Wang, Chaoyue Wang, Xiaojie Guo, Jiawan Zhang:

Towards High-Fidelity Face Normal Estimation. 5172-5180 - Yuxuan Wang, Jiakai Wang, Zixin Yin, Ruihao Gong, Jingyi Wang, Aishan Liu, Xianglong Liu:

Generating Transferable Adversarial Examples against Vision Transformers. 5181-5190 - Yan Xia, Zhou Zhao, Shangwei Ye, Yang Zhao, Haoyuan Li, Yi Ren:

Video-Guided Curriculum Learning for Spoken Video Grounding. 5191-5200 - Chen Li, Li Song, Xueyi Zou, Jiaming Guo, Youliang Yan, Wenjun Zhang:

Multi-Scale Coarse-to-Fine Transformer for Frame Interpolation. 5201-5209 - Pengpeng Zeng, Jinkuan Zhu, Jingkuan Song, Lianli Gao:

Progressive Tree-Structured Prototype Network for End-to-End Image Captioning. 5210-5218 - Miaohui Wang, Zhuowei Xu, Yuanhao Gong, Wuyuan Xie:

S-CCR: Super-Complete Comparative Representation for Low-Light Image Quality Inference In-the-wild. 5219-5227 - Mohammed M. Alghamdi, He Wang, Andrew J. Bulpitt, David C. Hogg:

Talking Head from Speech Audio using a Pre-trained Image Generator. 5228-5236 - Junjie Li, Zilei Wang, Yuan Gao, Xiaoming Hu:

Exploring High-quality Target Domain Information for Unsupervised Domain Adaptive Semantic Segmentation. 5237-5245 - Aishwarya Agarwal, Biplab Banerjee, Fabio Cuzzolin, Subhasis Chaudhuri:

Semantics-Driven Generative Replay for Few-Shot Class Incremental Learning. 5246-5254 - Lingling Gao, Yanli Ji, Yang Yang, Heng Tao Shen:

Global-Local Cross-View Fisher Discrimination for View-Invariant Action Recognition. 5255-5264 - Chenchen Ye, Lizi Liao, Suyu Liu, Tat-Seng Chua:

Reflecting on Experiences for Response Generation. 5265-5273 - Rengang Li, Cong Xu, Zhenhua Guo, Baoyu Fan, Runze Zhang, Wei Liu, Yaqian Zhao, Weifeng Gong, Endong Wang:

AI-VQA: Visual Question Answering based on Agent Interaction with Interpretability. 5274-5282 - Bo Xu, Jiake Xie, Han Huang, Ziwen Li, Cheng Lu, Yong Tang, Yandong Guo:

Situational Perception Guided Image Matting. 5283-5293 - Zhenjie Yu, Kai Chen, Shuang Li, Bingfeng Han, Chi Harold Liu, Shuigen Wang:

ROMA: Cross-Domain Region Similarity Matching for Unpaired Nighttime Infrared to Daytime Visible Video Translation. 5294-5302 - Liming Zhai, Qing Guo, Xiaofei Xie, Lei Ma, Yi Estelle Wang, Yang Liu:

A3GAN: Attribute-Aware Anonymization Networks for Face De-identification. 5303-5313 - Zijie Wang, Aichun Zhu, Jingyi Xue, Xili Wan, Chao Liu, Tian Wang, Yifeng Li:

CAIBC: Capturing All-round Information Beyond Color for Text-based Person Retrieval. 5314-5322 - Miao Zhang, Shuang Xu, Yongri Piao, Dongxiang Shi, Shusen Lin, Huchuan Lu:

PreyNet: Preying on Camouflaged Objects. 5323-5332 - Hanzhe Sun, Jun Liu, Zhizhong Zhang, Chengjie Wang, Yanyun Qu, Yuan Xie, Lizhuang Ma:

Not All Pixels Are Matched: Dense Contrastive Learning for Cross-Modality Person Re-Identification. 5333-5341 - Shiting Xu, Zhiheng Zhou, Junyuan Shang:

Asymmetric Adversarial-based Feature Disentanglement Learning for Cross-Database Micro-Expression Recognition. 5342-5350 - Yuhua Sun, Tailai Zhang, Xingjun Ma, Pan Zhou, Jian Lou, Zichuan Xu, Xing Di, Yu Cheng, Lichao Sun:

Backdoor Attacks on Crowd Counting. 5351-5360 - Kangcheng Liu:

Robust Industrial UAV/UGV-Based Unsupervised Domain Adaptive Crack Recognitions with Depth and Edge Awareness: From System and Database Constructions to Real-Site Inspections. 5361-5370 - Ziqiang Li, Yongxin Ge, Jiaruo Yu, Zhongming Chen:

Forcing the Whole Video as Background: An Adversarial Learning Strategy for Weakly Temporal Action Localization. 5371-5379 - Yifu Ding, Haotong Qin, Qinghua Yan, Zhenhua Chai, Junjie Liu, Xiaolin Wei, Xianglong Liu:

Towards Accurate Post-Training Quantization for Vision Transformer. 5380-5388 - Zhaoyang Jia, Yan Lu, Houqiang Li:

Neighbor Correspondence Matching for Flow-based Video Frame Synthesis. 5389-5397 - Xuewen Yang, Yingru Liu, Xin Wang:

ReFormer: The Relational Transformer for Image Captioning. 5398-5406 - Yu Xiong, Fabian Caba Heilbron, Dahua Lin:

Transcript to Video: Efficient Clip Sequencing from Texts. 5407-5416 - Senbo Yan, Liang Peng, Chuer Yu, Zheng Yang, Haifeng Liu, Deng Cai:

Domain Reconstruction and Resampling for Robust Salient Object Detection. 5417-5426 - Ye Liu, Liang Wan, Huazhu Fu, Jing Qin, Lei Zhu:

Phase-based Memory Network for Video Dehazing. 5427-5435 - Jun-Hao Zhuang, Yi-Si Luo, Xile Zhao, Tai-Xiang Jiang, Bichuan Guo:

UConNet: Unsupervised Controllable Network for Image and Video Deraining. 5436-5445 - Ziqi Jiang, Shengyu Zhang, Siyuan Yao, Wenqiao Zhang, Sihan Zhang, Juncheng Li, Zhou Zhao, Fei Wu:

Weakly-supervised Disentanglement Network for Video Fingerspelling Detection. 5446-5455 - Hongxiang Huang, Daihui Yang, Gang Dai, Zhen Han, Yuyi Wang, Kin-Man Lam, Fan Yang, Shuangping Huang, Yongge Liu, Mengchao He:

AGTGAN: Unpaired Image Translation for Photographic Ancient Character Generation. 5456-5467 - Yiren Song:

CLIPTexture: Text-Driven Texture Synthesis. 5468-5476 - Junjie Wang, Zhenbo Yu, Zhengyan Tong, Hang Wang, Jinxian Liu, Wenjun Zhang, Xiaoyan Wu:

OCR-Pose: Occlusion-aware Contrastive Representation for Unsupervised 3D Human Pose Estimation. 5477-5485 - Wencan Huang, Zhou Zhao, Jinzheng He, Mingmin Zhang:

DualSign: Semi-Supervised Sign Language Production with Balanced Multi-Modal Multi-Task Dual Transformation. 5486-5495 - Ce Zheng, Matías Mendieta, Pu Wang, Aidong Lu, Chen Chen:

A Lightweight Graph Transformer Network for Human Mesh Reconstruction from 2D Human Pose. 5496-5507 - Yue He, Minyue Jiang, Xiaoqing Ye, Liang Du, Zhikang Zou, Wei Zhang, Xiao Tan, Errui Ding:

Repainting and Imitating Learning for Lane Detection. 5508-5516 - Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao:

Paired Cross-Modal Data Augmentation for Fine-Grained Image-to-Text Retrieval. 5517-5526 - Yulu Zhang, Liang Sang, Marcin Grzegorzek, John See, Cong Yang:

BlumNet: Graph Component Detection for Object Skeleton Extraction. 5527-5536 - Zihan Ding, Zi-han Ding, Tianrui Hui, Junshi Huang, Xiaoming Wei, Xiaolin Wei, Si Liu:

PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding. 5537-5546 - Guangchen Shi, Yirui Wu, Jun Liu, Shaohua Wan, Wenhai Wang, Tong Lu:

Incremental Few-Shot Semantic Segmentation via Embedding Adaptive-Update and Hyper-class Representation. 5547-5556 - Zhenyu Wu, Lin Wang, Wei Wang, Tengfei Shi, Chenglizhao Chen, Aimin Hao, Shuo Li:

Synthetic Data Supervised Salient Object Detection. 5557-5565 - Zhiyin Shao, Xinyu Zhang, Meng Fang, Zhifeng Lin, Jian Wang, Changxing Ding:

Learning Granularity-Unified Representations for Text-to-Image Person Re-identification. 5566-5574 - Cheng Chen, Ji Zhang, Jingkuan Song, Lianli Gao:

Class Gradient Projection For Continual Learning. 5575-5583 - Song Chang, Youfang Lin, Shuo Zhang:

Flexible Hybrid Lenses Light Field Super-Resolution using Layered Refinement. 5584-5592 - Jingliang Li, Zhengda Lu, Yiqun Wang, Ying Wang, Jun Xiao:

DS-MVSNet: Unsupervised Multi-view Stereo via Depth Synthesis. 5593-5601 - Min Zhang, Zhihong Pan, Xin Zhou, C.-C. Jay Kuo:

Enhancing Image Rescaling using Dual Latent Variables in Invertible Neural Network. 5602-5610 - Qi Liu, Nianjuan Jiang, Jiangbo Lu, Mingang Chen, Ran Yi, Lizhuang Ma:

ScatterNet: Point Cloud Learning via Scatters. 5611-5619 - Wenxuan Ma, Jinming Zhang, Shuang Li, Chi Harold Liu, Yulin Wang, Wei Li:

Making The Best of Both Worlds: A Domain-Oriented Transformer for Unsupervised Domain Adaptation. 5620-5629 - Shengeng Tang, Richang Hong, Dan Guo, Meng Wang:

Gloss Semantic-Enhanced Network with Online Back-Translation for Sign Language Production. 5630-5638 - Bo Ju, Zhikang Zou, Xiaoqing Ye, Minyue Jiang, Xiao Tan, Errui Ding, Jingdong Wang:

Paint and Distill: Boosting 3D Object Detection with Semantic Passing Network. 5639-5648 - Shuangrui Ding, Rui Qian, Hongkai Xiong:

Dual Contrastive Learning for Spatio-temporal Representation. 5649-5658 - Huilin Zhu, Jingling Yuan, Zhengwei Yang, Xian Zhong, Zheng Wang:

Fine-Grained Fragment Diffusion for Cross Domain Crowd Counting. 5659-5668 - Teng Yang, Yue Wang, Lu Zhang, Jinqing Qi, Huchuan Lu:

Depth-inspired Label Mining for Unsupervised RGB-D Salient Object Detection. 5669-5677 - Yongqi Wang, Zhou Zhao:

FastLTS: Non-Autoregressive End-to-End Unconstrained Lip-to-Speech Synthesis. 5678-5687 - Ruitong Gan, Junsong Fan, Yuxi Wang, Zhaoxiang Zhang:

Interact with Open Scenes: A Life-long Evolution Framework for Interactive Segmentation Models. 5688-5697 - Duo Zheng, Fandong Meng, Qingyi Si, Hairun Fan, Zipeng Xu, Jie Zhou, Fangxiang Feng, Xiaojie Wang:

Visual Dialog for Spotting the Differences between Pairs of Similar Images. 5698-5709 - Xiang-Jun Shen, Zhaorui Xu, Liangjun Wang, Zechao Li:

Time and Memory Efficient Large-Scale Canonical Correlation Analysis in Fourier Domain. 5710-5718
Oral Session XII: Understanding Multimedia Content -- Media Interpretation
- Jie Zhang, Yin Zhao, Kai Qian:

Enlarging the Long-time Dependencies via RL-based Memory Network in Movie Affective Analysis. 5739-5750 - Shuhan Zhong, Sizhe Song, Guanyao Li, S.-H. Gary Chan:

A Tree-Based Structure-Aware Transformer Decoder for Image-To-Markup Generation. 5751-5760 - Junbao Zhuo, Yan Zhu, Shuhao Cui, Shuhui Wang, Bin Ma, Qingming Huang, Xiaoming Wei, Xiaolin Wei:

Zero-shot Video Classification with Appropriate Web and Task Knowledge Transfer. 5761-5772 - Hao Zhang, Lechao Cheng, Yanbin Hao, Chong-Wah Ngo:

Long-term Leap Attention, Short-term Periodic Shift for Video Classification. 5773-5782 - Jianjun Xu, Hongtao Xie, Hai Xu, Yuxin Wang, Sun'ao Liu, Yongdong Zhang:

Boat in the Sky: Background Decoupling and Object-aware Pooling for Weakly Supervised Semantic Segmentation. 5783-5792 - Shuang Wang, Lianli Gao, Xinyu Lyu, Yuyu Guo, Pengpeng Zeng, Jingkuan Song:

Dynamic Scene Graph Generation via Temporal Prior Inference. 5793-5801 - Xinyao Li, Zhekai Du, Jingjing Li, Lei Zhu, Ke Lu:

Source-Free Active Domain Adaptation via Energy-Based Locality Preserving Transfer. 5802-5810 - Jingbei Li, Yi Meng, Xixin Wu, Zhiyong Wu, Jia Jia, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang:

Inferring Speaking Styles from Multi-modal Conversational Context by Multi-scale Relational Graph Convolutional Networks. 5811-5820 - Cláudio Bartolomeu, Rui Nóbrega, David Semedo:

Understanding News Text and Images Connection with Context-enriched Multimodal Transformers. 5821-5832 - Daichi Zhang, Fanzhao Lin, Yingying Hua, Pengju Wang, Dan Zeng, Shiming Ge:

Deepfake Video Detection with Spatiotemporal Dropout Transformer. 5833-5841 - Jiaqi Ma, Shengyuan Yan, Lefei Zhang, Guoli Wang, Qian Zhang:

ELMformer: Efficient Raw Image Restoration with a Locally Multiplicative Transformer. 5842-5852 - Hongbo Sun, Xiangteng He, Yuxin Peng:

SIM-Trans: Structure Information Modeling Transformer for Fine-grained Visual Categorization. 5853-5861 - Zhipeng Yu, Qianqian Xu, Yangbangyan Jiang, Haoyu Qin, Qingming Huang:

Pay Attention to Your Positive Pairs: Positive Pair Aware Contrastive Knowledge Distillation. 5862-5870 - Menglu Wang, Xueyang Fu, Jiawei Liu, Zheng-Jun Zha:

JPEG Compression-aware Image Forgery Localization. 5871-5879
Poster Session XII: Understanding Multimedia Content -- Media Interpretation
- Yi Tan, Yanbin Hao, Hao Zhang, Shuo Wang, Xiangnan He:

Hierarchical Hourglass Convolutional Network for Efficient Video Classification. 5880-5891 - Jin Wei, Yuan Zhang, Yu Zhou, Gangyan Zeng, Zhi Qiao, Youhui Guo, Haiying Wu, Hongbin Wang, Weiping Wang:

TextBlock: Towards Scene Text Spotting without Fine-grained Detection. 5892-5902 - Jianyuan Ni, Anne H. H. Ngu, Yan Yan:

Progressive Cross-modal Knowledge Distillation for Human Action Recognition. 5903-5912 - Zijie Yang, Lingxi Xie, Xinyue Huo, Sheng Tang, Qi Tian, Yongdong Zhang:

Finding the Host from the Lesion by Iteratively Mining the Registration Graph. 5913-5922 - Jonathan Samuel Lumentut, In Kyu Park:
3D Body Reconstruction Revisited: Exploring the Test-time 3D Body Mesh Refinement Strategy via Surrogate Adaptation. 5923-5933 - Felix Ott, David Rügamer, Lucas Heublein, Bernd Bischl, Christopher Mutschler:

Domain Adaptation for Time-Series Classification to Mitigate Covariate Shift. 5934-5943 - Pavel Korshunov, Sébastien Marcel:

Face Anthropometry Aware Audio-visual Age Verification. 5944-5951 - Xiaoxuan Chai, Junchi Zhou, Hang Zhou, Jui-Hsin Lai:

PDD-GAN: Prior-based GAN Network with Decoupling Ability for Single Image Dehazing. 5952-5960 - Yechao Xu, Zhengxing Sun, Qian Li, Yunhan Sun, Shoutong Luo:

Active Patterns Perceived for Stochastic Video Prediction. 5961-5969 - Nan Song, Chi Zhang, Guosheng Lin:
Few-shot Open-set Recognition Using Background as Unknowns. 5970-5979 - Yibo Wang, Yunhu Ye, Yuanpeng Mao, Yanwei Yu, Yuanping Song:

Self-supervised Scene Text Segmentation with Object-centric Layered Representations Augmented by Text Regions. 5980-5989 - Cunling Bian, Wei Feng, Song Wang:
Self-Supervised Representation Learning for Skeleton-Based Group Activity Recognition. 5990-5998 - Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao:

Graph-DETR3D: Rethinking Overlapping Regions for Multi-View 3D Object Detection. 5999-6008 - Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Ran Yi, Shouhong Ding, Lizhuang Ma:
Adaptive Mixture of Experts Learning for Generalizable Face Anti-Spoofing. 6009-6018 - Meijie Zhang, Jianwu Li, Tianfei Zhou:

Multi-Granular Semantic Mining for Weakly Supervised Semantic Segmentation. 6019-6028 - Siwei Su, Haijian Wang, Meng Yang:
Consistency Learning based on Class-Aware Style Variation for Domain Generalizable Semantic Segmentation. 6029-6038 - Yinsong Xu, Zhuqing Jiang, Aidong Men, Yang Liu, Qingchao Chen:
Delving into the Continuous Domain Adaptation. 6039-6049 - Zihua Liu, Songyan Zhang, Zhicheng Wang, Masatoshi Okutomi:
Digging Into Normal Incorporated Stereo Matching. 6050-6060 - Wenjing Huang, Shikui Tu, Lei Xu:

Box-FaceS: A Bidirectional Method for Box-Guided Face Component Editing. 6061-6071 - Xuhao Jiang, Weimin Tan, Ri Cheng, Shili Zhou, Bo Yan:

Learning Parallax Transformer Network for Stereo Image JPEG Artifacts Removal. 6072-6082 - Ri Cheng, Yuqi Sun, Bo Yan, Weimin Tan, Chenxi Ma:

Geometry-Aware Reference Synthesis for Multi-View Image Super-Resolution. 6083-6093 - Xinyan Zu, Haiyang Yu, Bin Li, Xiangyang Xue:
Chinese Character Recognition with Augmented Character Profile Matching. 6094-6102 - Qianyue Bao, Fang Liu, Yang Liu, Licheng Jiao, Xu Liu, Lingling Li:

Hierarchical Scene Normality-Binding Modeling for Anomaly Detection in Surveillance Videos. 6103-6112 - Haiyang Ying, Jinzhi Zhang, Yuzhe Chen, Zheng Cao, Jing Xiao, Ruqi Huang, Lu Fang:
ParseMVS: Learning Primitive-aware Surface Representations for Sparse Multi-view Stereopsis. 6113-6124 - Jiong Wang, Zhou Zhao, Fei Wu:

Set-Based Face Recognition Beyond Disentanglement: Burstiness Suppression With Variance Vocabulary. 6125-6135 - Jinkai Zheng, Xinchen Liu, Xiaoyan Gu, Yaoqi Sun, Chuang Gan, Jiyong Zhang, Wu Liu, Chenggang Yan:
Gait Recognition in the Wild with Multi-hop Temporal Switch. 6136-6145 - Zan Gao, Shenghao Chen, Yangyang Guo, Weili Guan, Jie Nie, Anan Liu:

Generic Image Manipulation Localization through the Lens of Multi-scale Spatial Inconsistence. 6146-6154 - Wenmiao Hu, Yichen Zhang, Yuxuan Liang, Yifang Yin, Andrei Georgescu, An Tran, Hannes Kruppa, See-Kiong Ng, Roger Zimmermann:

Beyond Geo-localization: Fine-grained Orientation of Street-view Images by Cross-view Matching with Satellite Imagery. 6155-6164 - Chen Qian, Hui Zhang:
Region-based Pixels Integration Mechanism for Weakly Supervised Semantic Segmentation. 6165-6173 - Zhongwei Qiu, Qiansheng Yang, Jian Wang, Dongmei Fu:

IVT: An End-to-End Instance-guided Video Transformer for 3D Pose Estimation. 6174-6182 - Rui Cao, Kaiyi Zhang, Yang Chen, Ximing Yang, Cheng Jin:
Point Cloud Completion via Multi-Scale Edge Convolution and Attention. 6183-6192 - Suiyi Zhao, Zhao Zhang, Richang Hong, Mingliang Xu, Haijun Zhang, Meng Wang, Shuicheng Yan:
CRNet: Unsupervised Color Retention Network for Blind Motion Deblurring. 6193-6201 - Yanyan Wei, Zhao Zhang, Huan Zheng, Richang Hong, Yi Yang, Meng Wang:
SGINet: Toward Sufficient Interaction Between Single Image Deraining and Semantic Segmentation. 6202-6210 - Jiahuan Ren, Zhao Zhang, Richang Hong, Mingliang Xu, Haijun Zhang, Mingbo Zhao, Meng Wang:

Robust Low-Rank Convolution Network for Image Denoising. 6211-6219 - Suiyi Zhao, Zhao Zhang, Richang Hong, Mingliang Xu, Yi Yang, Meng Wang:
FCL-GAN: A Lightweight and Real-Time Baseline for Unsupervised Blind Image Deblurring. 6220-6229 - Huabin Liu, Weixian Lv, John See, Weiyao Lin:
Task-adaptive Spatial-Temporal Video Sampler for Few-shot Action Recognition. 6230-6240 - Jiashuo Yu, Ying Cheng, Rui-Wei Zhao, Rui Feng, Yuejie Zhang:
MM-Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing. 6241-6249 - Sindhu B. Hegde, K. R. Prajwal, Rudrabha Mukhopadhyay, Vinay P. Namboodiri, C. V. Jawahar:

Lip-to-Speech Synthesis for Arbitrary Speakers in the Wild. 6250-6258 - Chaofan Chen, Xiaoshan Yang, Ming Yan, Changsheng Xu:
Attribute-guided Dynamic Routing Graph Network for Transductive Few-shot Learning. 6259-6268 - Ye Liu, Lingfeng Qiao, Di Yin, Zhuoxuan Jiang, Xinghua Jiang, Deqiang Jiang, Bo Ren:

OS-MSL: One Stage Multimodal Sequential Link Framework for Scene Segmentation and Classification. 6269-6277 - Jiashuo Yu, Jinyu Liu, Ying Cheng, Rui Feng, Yuejie Zhang:
Modality-aware Contrastive Instance Learning with Self-Distillation for Weakly-Supervised Audio-Visual Violence Detection. 6278-6287 - Zhicai Wang, Yanbin Hao, Xingyu Gao, Hao Zhang, Shuo Wang, Tingting Mu, Xiangnan He:
Parameterization of Cross-token Relations with Relative Positional Encoding for Vision MLP. 6288-6299 - Jian-Jun Qiao, Zhi-Qi Cheng, Xiao Wu, Wei Li, Ji Zhang:
Real-time Semantic Segmentation with Parallel Multiple Views Feature Augmentation. 6300-6308 - Jie Huang, Man Zhou, Yajing Liu, Mingde Yao, Feng Zhao, Zhiwei Xiong:

Exposure-Consistency Representation Learning for Exposure Correction. 6309-6317 - Jiawei Zhan, Jun Liu, Wei Tang, Guannan Jiang, Xi Wang, Bin-Bin Gao, Tianliang Zhang, Wenlong Wu, Wei Zhang, Chengjie Wang, Yuan Xie:
Global Meets Local: Effective Multi-Label Image Classification via Category-Aware Weak Supervision. 6318-6326 - Qi He, Zhaoquan Yuan, Xiao Wu, Jun-Yan He:

Domain-Specific Conditional Jigsaw Adaptation for Enhancing transferability and Discriminability. 6327-6336 - Guang Yu, Siqi Wang, Zhiping Cai, Xinwang Liu, Chengkun Wu:
Effective Video Abnormal Event Detection by Learning A Consistency-Aware High-Level Feature Extractor. 6337-6346 - Yiran Wang, Zhiyu Pan, Xingyi Li, Zhiguo Cao, Ke Xian, Jianming Zhang:
Less is More: Consistent Video Depth Estimation with Masked Frames Modeling. 6347-6358 - Huan Zheng, Zhao Zhang, Haijun Zhang, Yi Yang, Shuicheng Yan, Meng Wang:

Deep Multi-Resolution Mutual Learning for Image Inpainting. 6359-6367 - Linhai Zhuo, Yuqian Fu, Jingjing Chen, Yixin Cao, Yu-Gang Jiang:
TGDM: Target Guided Dynamic Mixup for Cross-Domain Few-Shot Learning. 6368-6376 - Zizheng Yang, Mingde Yao, Jie Huang, Man Zhou, Feng Zhao:

SIR-Former: Stereo Image Restoration Using Transformer. 6377-6385 - Zhengming Zhou, Qiulei Dong:

Learning Occlusion-aware Coarse-to-Fine Depth Map for Self-supervised Monocular Depth Estimation. 6386-6395 - Arghya Pal, Sailaja Rajanala, Raphael C.-W. Phan, KokSheik Wong:
Guess-It-Generator: Generating in a Lewis Signaling Framework through Logical Reasoning. 6396-6405 - Mengmeng Liu, Zhi Ma, Tao Li, Yanfeng Jiang, Kai Wang:

Long-Term Person Re-identification with Dramatic Appearance Change: Algorithm and Benchmark. 6406-6415 - Chuanming Wang, Huiyuan Fu, Huadong Ma:
PaCL: Part-level Contrastive Learning for Fine-grained Few-shot Image Classification. 6416-6424 - Gang Xu, Qibin Hou, Le Zhang, Ming-Ming Cheng:
FMNet: Frequency-Aware Modulation Network for SDR-to-HDR Translation. 6425-6435 - Ji Zhang, Zhi-Qi Cheng, Xiao Wu, Wei Li, Jian-Jun Qiao:
CrossNet: Boosting Crowd Counting with Localization. 6436-6444 - Chen Wang, Xian Wu, Yuan-Chen Guo, Song-Hai Zhang, Yu-Wing Tai, Shi-Min Hu:
NeRF-SR: High Quality Neural Radiance Fields using Supersampling. 6445-6454 - Xinpeng Li, Xiaojiang Peng:

Rail Detection: An Efficient Row-based Network and a New Benchmark. 6455-6463 - Yanyan Wei, Zhao Zhang, Mingliang Xu, Richang Hong, Jicong Fan, Shuicheng Yan:
Robust Attention Deraining Network for Synchronous Rain Streaks and Raindrops Removal. 6464-6472 - Weihong Lin, Zheng Sun, Chixiang Ma, Mingze Li, Jiawei Wang, Lei Sun, Qiang Huo:

TSRFormer: Table Structure Recognition with Transformers. 6473-6482 - Jinghao Zhang, Jie Huang, Mingde Yao, Man Zhou, Feng Zhao:

Structure- and Texture-Aware Learning for Low-Light Image Enhancement. 6483-6492 - Fengyi Zhang, Hui Zeng, Tianjun Zhang, Lin Zhang:
CLUT-Net: Learning Adaptively Compressed Representations of 3DLUTs for Lightweight Image Enhancement. 6493-6501 - Pedro Ramoneda, Dasaem Jeong, Eita Nakamura, Xavier Serra, Marius Miron:
Automatic Piano Fingering from Partially Annotated Scores using Autoregressive Neural Networks. 6502-6510 - Sindhu B. Hegde, Rudrabha Mukhopadhyay, Vinay P. Namboodiri, C. V. Jawahar:

Extreme-scale Talking-Face Video Upsampling with Audio-Visual Priors. 6511-6520 - Naishan Zheng, Jie Huang, Qi Zhu, Man Zhou, Feng Zhao, Zheng-Jun Zha:

Enhancement by Your Aesthetic: An Intelligible Unsupervised Personalized Enhancer for Low-Light Images. 6521-6529 - Han Ling, Quansen Sun, Zhenwen Ren, Yazhou Liu, Hongyuan Wang, Zichen Wang:
Scale-flow: Estimating 3D Motion from Video. 6530-6538 - Danna Xue, Fei Yang, Pei Wang, Luis Herranz, Jinqiu Sun, Yu Zhu, Yanning Zhang:
SlimSeg: Slimmable Semantic Segmentation with Boundary Supervision. 6539-6548 - Huiyu Duan, Wei Shen, Xiongkuo Min, Danyang Tu, Jing Li, Guangtao Zhai:

Saliency in Augmented Reality. 6549-6558 - Ye Deng, Siqi Hui, Sanping Zhou, Deyu Meng, Jinjun Wang:
T-former: An Efficient Transformer for Image Inpainting. 6559-6568 - Hao Liu, Bin Chen, Bo Wang, Chunpeng Wu, Feng Dai, Peng Wu:
Cycle Self-Training for Semi-Supervised Object Detection with Distribution Consistency Reweighting. 6569-6578 - Jiahui Zhang, Fangneng Zhan, Rongliang Wu, Yingchen Yu, Wenqing Zhang, Song Bai, Xiaoqin Zhang, Shijian Lu:
VMRF: View Matching Neural Radiance Fields. 6579-6587 - Yuqian Fu, Yu Xie, Yanwei Fu, Jingjing Chen, Yu-Gang Jiang:
ME-D2N: Multi-Expert Domain Decompositional Network for Cross-Domain Few-Shot Learning. 6609-6617 - Xiao Wang, Zheng Wang, Wu Liu, Xin Xu, Qijun Zhao, Shin'ichi Satoh:

Towards Causality Inference for Very Important Person Localization. 6618-6626 - Keyang Cheng, Yu Si, Hao Zhou, Rabia Tahir:

MMDV: Interpreting DNNs via Building Evaluation Metrics, Manual Manipulation and Decision Visualization. 6627-6635 - Chengjie Ge, Xueyang Fu, Zheng-Jun Zha:

Learning Dual Convolutional Dictionaries for Image De-raining. 6636-6644 - Hu Yu, Jie Huang, Yajing Liu, Qi Zhu, Man Zhou, Feng Zhao:
Source-Free Domain Adaptation for Real-World Image Dehazing. 6645-6654 - Xiangyu Miao, Shangfei Wang:

Knowledge Guided Representation Disentanglement for Face Recognition from Low Illumination Images. 6655-6663 - Tao Zhou, Wenhan Luo, Zhiguo Shi, Jiming Chen, Qi Ye:
APPTracker: Improving Tracking Multiple Objects in Low-Frame-Rate Videos. 6664-6674 - Jiaxu Leng, Jia Wang, Xinbo Gao, Bo Hu, Ji Gan, Chenqiang Gao:
ICNet: Joint Alignment and Reconstruction via Iterative Collaboration for Video Super-Resolution. 6675-6684 - Junshan Hu, Chaoxu Guo, Liansheng Zhuang, Biao Wang, Tiezheng Ge, Yuning Jiang, Houqiang Li:

Estimation of Reliable Proposal Quality for Temporal Action Detection. 6685-6695 - Zenggui Chen, Zhouhui Lian:

Semi-supervised Semantic Segmentation via Prototypical Contrastive Learning. 6696-6705 - Chiawei Kuo, Yi-Ting Tsai, Hong-Han Shuai, Yi-Ren Yeh, Ching-Chun Huang:

Towards Understanding Cross Resolution Feature Matching for Surveillance Face Recognition. 6706-6716 - Yurui Zhu, Xueyang Fu, Chengzhi Cao, Xi Wang, Qibin Sun, Zheng-Jun Zha:

Single Image Shadow Detection via Complementary Mechanism. 6717-6726 - Qiqi Bao, Rui Zhu, Bowen Gang, Pengyang Zhao, Wenming Yang, Qingmin Liao:

Distilling Resolution-robust Identity Knowledge for Texture-Enhanced Face Hallucination. 6727-6736 - Jia Wang, Tianhao Lan, Jie Chen, Chengwen Luo, Chao Wu, Jianqiang Li:
Phoneme-Aware Adaptation with Discrepancy Minimization and Dynamically-Classified Vector for Text-independent Speaker Verification. 6737-6745 - Jiaxu Leng, Mingpi Tan, Xinbo Gao, Wen Lu, Zongyi Xu:

Anomaly Warning: Learning and Memorizing Future Semantic Patterns for Unsupervised Ex-ante Potential Anomaly Prediction. 6746-6754 - Yuxi Mi, Yuge Huang, Jiazhen Ji, Hongquan Liu, Xingkun Xu, Shouhong Ding, Shuigeng Zhou:
DuetFace: Collaborative Privacy-Preserving Face Recognition via Channel Splitting in the Frequency Domain. 6755-6764 - Youze Xue, Jiansheng Chen, Yudong Zhang, Cheng Yu, Huimin Ma, Hongbing Ma:

3D Human Mesh Reconstruction by Learning to Sample Joint Adaptive Tokens for Transformers. 6765-6773 - Yanling Tian, Di Chen, Yunan Liu, Shanshan Zhang, Jian Yang:

Grouped Adaptive Loss Weighting for Person Search. 6774-6782 - Weilai Xiang, Hongyu Yang, Di Huang, Yunhong Wang:

Multi-view Gait Video Synthesis. 6783-6791 - Yuwei Zhou, Xin Wang, Hong Chen, Xuguang Duan, Chaoyu Guan, Wenwu Zhu:

Curriculum-NAS: Curriculum Weight-Sharing Neural Architecture Search. 6792-6801 - Ya-Nan Zhang, Linlin Shen, Qiufu Li:

Content and Gradient Model-driven Deep Network for Single Image Reflection Removal. 6802-6812 - Haoru Zhao, Zhaorui Gu, Bing Zheng, Haiyong Zheng:
TransCNN-HAE: Transformer-CNN Hybrid AutoEncoder for Blind Image Inpainting. 6813-6821 - Tangwen Qian, Yongjun Xu, Zhao Zhang, Fei Wang:

Trajectory Prediction from Hierarchical Perspective. 6822-6830 - Zhiyuan Zhao, Qingjie Liu, Yunhong Wang:

Exploring Effective Knowledge Transfer for Few-shot Object Detection. 6831-6839 - Xinhua Cheng, Mengxi Jia, Qian Wang, Jian Zhang:

More is better: Multi-source Dynamic Parsing Attention for Occluded Person Re-identification. 6840-6849 - Gyumin Shim, Minsoo Lee, Jaegul Choo:

ReFu: Refine and Fuse the Unobserved View for Detail-Preserving Single-Image 3D Human Reconstruction. 6850-6859 - Mingii Choi, Sangyeong Lee, Heesun Jung, Jong-Uk Hou:
Transformers in Spectral Domain for Estimating Image Geometric Transformation. 6860-6867
Brave New Ideas Session
- Renrui Zhang, Ziyao Zeng, Ziyu Guo, Yafeng Li:

Can Language Understand Depth? 6868-6874 - Yongkang Wong, Shaojing Fan, Yangyang Guo, Ziwei Xu, Karen Stephen, Rishabh Sheoran, Anusha Bhamidipati, Vivek Barsopia, Jianquan Liu, Mohan S. Kankanhalli:
Compute to Tell the Tale: Goal-Driven Narrative Generation. 6875-6882 - Jitao Sang, Xian Zhao, Jiaming Zhang, Zhiyu Lin:

Benign Adversarial Attack: Tricking Models for Goodness. 6883-6889 - Kurtis Haut, Caleb Wohn, Victor Antony, Aidan Goldfarb, Melissa Welsh, Dillanie Sumanthiran, Md. Rafayet Ali, Ehsan Hoque:

Demographic Feature Isolation for Bias Research using Deepfakes. 6890-6897 - Yoko Yamakata, Akihisa Ishino, Akiko Sunto, Sosuke Amano, Kiyoharu Aizawa:

Recipe-oriented Food Logging for Nutritional Management. 6898-6904
Doctoral Consortium
- Vignesh V. Menon:
Video Coding Enhancements for HTTP Adaptive Streaming. 6905-6909 - Xiaoyu Lin:

Unsupervised Multi-object Tracking via Dynamical VAE and Variational Inference. 6910-6914 - Igor Morawski:
Enabling Effective Low-Light Perception using Ubiquitous Low-Cost Visible-Light Cameras. 6915-6919 - Manuel Silva:
Interaction with Immersive Cultural Heritage Environments: Using XR Technologies to Represent Multiple Perspectives on Serralves Museum. 6920-6924 - Maurits J. R. Bleeker:

Multi-modal Learning Algorithms and Network Architectures for Information Extraction and Retrieval. 6925-6929 - Travis Seng:

Enriching Existing Educational Video Datasets to Improve Slide Classification and Analysis. 6930-6934 - Diogo Tavares:

Zero-shot Generalization of Multimodal Dialogue Agents. 6935-6939 - Diogo Silva:
The First Impression: Understanding the Impact of Multimodal System Responses on User Behavior in Task-oriented Agents. 6940-6943
Technical Demonstrators
- Wei Xu, Bowen Tian, Lijie Luo, Weiming Yang, Xianke Wang, Lei Wu:

SingMaster: A Sight-singing Evaluation System of "Shoot and Sing" Based on Smartphone. 6944-6946 - Kele Xu, Ming Feng, Weiquan Huang:
Seeing Speech: Magnetic Resonance Imaging-Based Vocal Tract Deformation Visualization Using Cross-Modal Transformer. 6947-6949 - Antonio Origlia, Martina Di Bratto, Maria Di Maro, Sabrina Mennella:
Developing Embodied Conversational Agents in the Unreal Engine: The FANTASIA Plugin. 6950-6951 - Yuanfeng Song, Rongzhong Lian, Yixin Chen, Di Jiang, Xuefang Zhao, Conghui Tan, Qian Xu, Raymond Chi-Wing Wong:

A Platform for Deploying the TFE Ecosystem of Automatic Speech Recognition. 6952-6954 - Ignacio Reimat, Yanni Mei, Evangelos Alexiou, Jack Jansen, Jie Li, Shishir Subramanyam, Irene Viola, Johan Oomen, Pablo César:
Mediascape XR: A Cultural Heritage Experience in Social VR. 6955-6957 - Ziyi Wang, Xingqi Wang, Zeyu Jin, Xiaohan Li, Shikun Sun, Jia Jia:
AI Carpet: Automatic Generation of Aesthetic Carpet Pattern. 6958-6960 - Yuki Tajima, Shota Okubo, Tomoaki Konno, Toshiharu Horiuchi, Tatsuya Kobayashi:
Sync Sofa: Sofa-type Side-by-side Communication Experience Based on Multimodal Expression. 6961-6963 - Xin Jin, Shu Zhao, Le Zhang, Xin Zhao, Qiang Deng, Chaoen Xiao:
Attribute Controllable Beautiful Caucasian Face Generation by Aesthetics Driven Reinforcement Learning. 6964-6966 - Giuseppe Becchi, Andrea Ferracani, Filippo Principi, Alberto Del Bimbo:

An AI Powered Re-Identification System for Real-time Contextual Multimedia Applications. 6967-6969 - Zhilong Zhou, Shiyao Wang, Tiezheng Ge, Yuning Jiang:

A High-resolution Image-based Virtual Try-on System in Taobao E-commerce Scenario. 6970-6972 - Wei Duan, Zhe Zhang, Yi Yu, Keizo Oyama:
Interpretable Melody Generation from Lyrics with Discrete-Valued Adversarial Training. 6973-6975 - Chuanhang Yan, Yu Sun, Qian Bao, Jinhui Pang, Wu Liu, Tao Mei:
WOC: A Handy Webcam-based 3D Online Chatroom. 6976-6978 - Pin-Xuan Liu, Tse-Yu Pan, Hsin-Shih Lin, Hung-Kuo Chu, Min-Chun Hu:
BetterSight: Immersive Vision Training for Basketball Players. 6979-6981 - Florent Geniet, Valérie Gouet-Brunet, Mathieu Brédif:

ALEGORIA: Joint Multimodal Search and Spatial Navigation into the Geographic Iconographic Heritage. 6982-6984 - Lorenzo Agnolucci, Leonardo Galteri, Marco Bertini, Alberto Del Bimbo:
Restoration of Analog Videos Using Swin-UNet. 6985-6987 - Shing Ming Wong, Chien-Wen Chen, Tse-Yu Pan, Hung-Kuo Chu, Min-Chun Hu:
GetWild: A VR Editing System with AI-Generated 3D Object and Terrain. 6988-6990 - Ting-Yang Kao, Tse-Yu Pan, Chen-Ni Chen, Tsung-Hsun Tsai, Hung-Kuo Chu, Min-Chun Hu:
ScoreActuary: Hoop-Centric Trajectory-Aware Network for Fine-Grained Basketball Shot Analysis. 6991-6993 - Tiago Fornelos, Pedro Valente, Rafael Ferreira, Diogo Tavares, Diogo Silva, David Semedo, João Magalhães, Nuno Correia:
A Conversational Shopping Assistant for Online Virtual Stores. 6994-6996 - Rafael Ferreira, Diogo Silva, Diogo Tavares, Frederico Vicente, Mariana Bonito, Gustavo Gonçalves, Rui Margarido, Paula Figueiredo, Helder Rodrigues, David Semedo, João Magalhães:
TWIZ: The Multimodal Conversational Task Wizard. 6997-6999 - Maria Giovanna Donadio, Filippo Principi, Andrea Ferracani, Marco Bertini, Alberto Del Bimbo:

Engaging Museum Visitors with Gamification of Body and Facial Expressions. 7000-7002
Grand Challenges
- Lutharsanen Kunam, Luca Rossetto, Abraham Bernstein:
A Multi-Stream Approach for Video Understanding. 7003-7007 - Weilong Chen, Chenghao Huang, Weimin Yuan, Xiaolu Chen, Wenhao Hu, Xinran Zhang, Yanru Zhang:

Title-and-Tag Contrastive Vision-and-Language Transformer for Social Media Popularity Prediction. 7008-7012 - Meng Liu, Shuyan Zhai, Yongqiang Li, Weili Guan, Liqiang Nie:
A Baseline for ViCo Conversational Head Generation Challenge. 7013-7015 - Chuin Hong Yap, Moi Hoon Yap, Adrian K. Davison, Connah Kendrick, Jingting Li, Su-Jing Wang, Ryan Cunningham:
3D-CNN for Facial Micro- and Macro-expression Spotting on Long Video Sequences using Temporal Oriented Reference Frame. 7016-7020 - Chao Zhou, Yixuan Ban, Yangchao Zhao, Liang Guo, Bing Yu:

PDAS: Probability-Driven Adaptive Streaming for Short Video. 7021-7025 - Tamás Grósz, Dejan Porjazovski, Yaroslav Getman, Sudarsana Reddy Kadiri, Mikko Kurimo:
Wav2vec2-based Paralinguistic Systems to Recognise Vocalised Emotions and Stuttering. 7026-7029 - Si-Ze Qian, Yuhong Xie, Zipeng Pan, Yuan Zhang, Tao Lin:

DAM: Deep Reinforcement Learning based Preload Algorithm with Action Masking for Short Video Streaming. 7030-7034 - Ricong Huang, Weizhi Zhong, Guanbin Li:

Audio-driven Talking Head Generation with Transformer and 3D Morphable Model. 7035-7039 - Siyang Sun, Xiong Xiong, Yun Zheng:
Two stage Multi-Modal Modeling for Video Interaction Analysis in Deep Video Understanding Challenge. 7040-7044 - Jianmin Wu, Liming Zhao, Dangwei Li, Chen-Wei Xie, Siyang Sun, Yun Zheng:

Deeply Exploit Visual and Language Information for Social Media Popularity Prediction. 7045-7049 - Ailin Huang, Zhewei Huang, Shuchang Zhou:

Perceptual Conversational Head Generation with Regularized Driver and Enhanced Renderer. 7050-7054 - Chen-Wei Xie, Siyang Sun, Liming Zhao, Jianmin Wu, Dangwei Li, Yun Zheng:

Deep Video Understanding with a Unified Multi-Modal Retrieval Framework. 7055-7059 - Kang You, Kele Xu, Boqing Zhu, Ming Feng, Dawei Feng, Bo Liu, Tian Gao, Bo Ding:
Masked Modeling-based Audio Representation for ACM Multimedia 2022 Computational Paralinguistics ChallengE. 7060-7064 - Wei Zhao, Peng Xiao, Rongju Zhang, Yijun Wang, Jianxin Lin:

Semantic-aware Responsive Listener Head Synthesis. 7065-7069 - Yingwei Pan, Yehao Li, Jianjie Luo, Jun Xu, Ting Yao, Tao Mei:

Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training. 7070-7074 - Keith Curtis, George Awad, Shahzad Rajput, Ian Soboroff:

The ACM Multimedia 2022 Deep Video Understanding Grand Challenge. 7075-7078 - Tian Lv, Yu-Hui Wen, Zhiyao Sun, Zipeng Ye, Yong-Jin Liu:

Generating Smooth and Facial-Details-Enhanced Talking Head Video: A Perspective of Pre and Post Processes. 7079-7083 - Xutong Zuo, Yishu Li, Mohan Xu, Wei Tsang Ooi, Jiangchuan Liu, Junchen Jiang, Xinggong Zhang, Kai Zheng, Yong Cui:

Bandwidth-Efficient Multi-video Prefetching for Short Video Streaming. 7084-7088 - Xiaochen Cai, Hengxing Cai, Boqing Zhu, Kele Xu, Weiwei Tu, Dawei Feng:
Multiple Temporal Fusion based Weakly-supervised Pre-training Techniques for Video Categorization. 7089-7093 - Sean Campos, Devesh Khandelwal, Shwetha C. Nagaraj, Fred Nugen, Alberto Todeschini:
Deep Learning-Based Acoustic Mosquito Detection in Noisy Conditions Using Trainable Kernels and Augmentations. 7094-7098 - Fuyan Ma, Ziyu Ma, Bin Sun, Shutao Li:

TA-CNN: A Unified Network for Human Behavior Analysis in Multi-Person Conversations. 7099-7103 - Shakeel A. Sheikh, Md. Sahidullah, Slim Ouni, Fabrice Hirsch:

End-to-End and Self-Supervised Learning for ComParE 2022 Stuttering Sub-Challenge. 7104-7108 - Philipp Müller, Michael Dietz, Dominik Schiller, Dominike Thomas, Hali Lindsay, Patrick Gebhard, Elisabeth André, Andreas Bulling:
MultiMediate'22: Backchannel Detection and Agreement Estimation in Group Interactions. 7109-7114 - Ximing Wu, Lei Zhang, Laizhong Cui:

QoE-aware Download Control and Bitrate Adaptation for Short Video Streaming. 7115-7119 - Björn W. Schuller, Anton Batliner, Shahin Amiriparian, Christian Bergler, Maurice Gerczuk, Natalie Holz, Pauline Larrouy-Maestri, Sebastian P. Bayerl, Korbinian Riedhammer, Adria Mallol-Ragolta, Maria Pateraki, Harry Coppock, Ivan Kiskin, Marianne Sinka, Stephen J. Roberts:
The ACM Multimedia 2022 Computational Paralinguistics Challenge: Vocalisations, Stuttering, Activity, & Mosquitoes. 7120-7124 - Xinqi Fan, Ali Raza Shahid, Hong Yan:
Adaptive Dual Motion Model for Facial Micro-Expression Generation. 7125-7129 - Chih-Chung Hsu, Pi-Ju Tsai, Ting-Chun Yeh, Xiu-Yu Hou:
A Comprehensive Study of Spatiotemporal Feature Learning for Social Medial Popularity Prediction. 7130-7134 - Moreno La Quatra, Lorenzo Vaiani, Alkis Koudounas, Luca Cagliero, Paolo Garza, Elena Baralis:
How Much Attention Should we Pay to Mosquitoes? 7135-7139 - Tuan-Vinh La, Minh-Son Dao, Quang-Tien Tran, Thanh-Phuc Tran, Anh-Duy Tran, Duc-Tien Dang-Nguyen:
A Combination of Visual-Semantic Reasoning and Text Entailment-based Boosting Algorithm for Cheapfake Detection. 7140-7144 - Quang-Tien Tran, Thanh-Phuc Tran, Minh-Son Dao, Tuan-Vinh La, Anh-Duy Tran, Duc-Tien Dang-Nguyen:
A Textual-Visual-Entailment-based Unsupervised Algorithm for Cheapfake Detection. 7145-7149 - Sirui Zhao, Shukang Yin, Huaying Tang, Rijin Jin, Yifan Xu, Tong Xu, Enhong Chen:

Fine-grained Micro-Expression Generation based on Thin-Plate Spline and Relative AU Constraint. 7150-7154 - Gulshan Sharma, Abhinav Dhall, Ramanathan Subramanian:
A Transformer Based Approach for Activity Detection. 7155-7159 - Wenhao Leng, Sirui Zhao, Yiming Zhang, Shifeng Liu, Xinglong Mao, Hao Wang, Tong Xu, Enhong Chen:

ABPN: Apex and Boundary Perception Network for Micro- and Macro-Expression Spotting. 7160-7164 - Beibei Zhang, Yaqun Fang, Tongwei Ren, Gangshan Wu:

Multimodal Analysis for Deep Video Understanding with Video Language Transformer. 7165-7169 - Jingting Li, Moi Hoon Yap, Wen-Huang Cheng, John See, Xiaopeng Hong, Xiaobai Li, Su-Jing Wang, Adrian K. Davison, Yante Li, Zizhao Dong:
MEGC2022: ACM Multimedia 2022 Micro-Expression Grand Challenge. 7170-7174 - Yuan Zhao, Xin Tong, Zichong Zhu, Jianda Sheng, Lei Dai, Lingling Xu, Xuehai Xia, Yu Jiang, Jiao Li:
Rethinking Optical Flow Methods for Micro-Expression Spotting. 7175-7179 - Muhannad Alkaddour, Abhinav Dhall, Usman Tariq, Hasan Al-Nashash, Fares Al-Shargie:
Sentiment-aware Classifier for Out-of-Context Caption Detection. 7180-7184 - Penggang Qin, Jiarui Yu, Yan Gao, Derong Xu, Yunkai Chen, Shiwei Wu, Tong Xu, Enhong Chen, Yanbin Hao:
Unified QA-aware Knowledge Graph Generation Based on Multi-modal Modeling. 7185-7189 - Garima Sharma, Kalin Stefanov, Abhinav Dhall, Jianfei Cai:
Graph-based Group Modelling for Backchannel Detection. 7190-7194 - Claude Montacié, Marie-José Caraty, Nikola Lackovic:
Audio Features from the Wav2Vec 2.0 Embeddings for the ACM Multimedia 2022 Stuttering Challenge. 7195-7199 - Yunpeng Tan, Fangyu Liu, Bowei Li, Zheng Zhang, Bo Zhang:

An Efficient Multi-View Multimodal Data Processing Framework for Social Media Popularity Prediction. 7200-7204 - Jun Yu, Zhongpeng Cai, Zepeng Liu, Guochen Xie, Peng He:

Facial Expression Spotting Based on Optical Flow Features. 7205-7209 - Jun Yu, Guochen Xie, Zhongpeng Cai, Peng He, Fang Gao, Qiang Ling:

Micro Expression Generation with Thin-plate Spline Motion Model and Face Parsing. 7210-7214 - Raksha Ramesh, Vishal Anand, Zifan Chen, Yifei Dong, Yun Chen, Ching-Yung Lin:
Leveraging Text Representation and Face-head Tracking for Long-form Multimodal Semantic Relation Understanding. 7215-7219 - Miriam Redi, Georges Quénot:

Overview of the Multimedia Grand Challenges 2022. 7220-7222
Interactive Arts
- Manuel Silva, Luana Santos, Luís Teixeira, José Vasco Carvalho:
All is Noise: In Search of Enlightenment, a VR Experience. 7223-7224 - Johnny DiBlasi, Carlos Castellanos, Bello Bello:

Beauty: Machine Microbial Interface as Artistic Experimentation. 7225-7226 - Xinrui Wang, Yulu Song, Xiaohui Wang:

Being's Spread: Mirror of Life Interconnection. 7227-7228 - Tiago Rorke:

CAPTCHA the Flag: Interactive Plotter Livestream. 7229-7230 - Bo Shui, Xiaohui Wang:
Cellular Trending: Fragmented Information Dissemination on Social Media Through Generative Lens. 7231-7232 - Sofia Hinckel Dias, Sara Rodrigues Silva, Beatriz Rodrigues Silva, Rui Nóbrega:
Collaboration Superpowers: The Process of Crafting an Interactive Storytelling Animation. 7233-7234 - Varvara Guljajeva, Mar Canet Sola:
Dream Painter: An Interactive Art Installation Bridging Audience Interaction, Robotics, and Creative AI. 7235-7236 - Jorge Forero, Gilberto Bernardes, Mónica Mendes:
Emotional Machines: Toward Affective Virtual Environments. 7237-7238 - Jiaxiang You, Yinyu Chen, Xiaohui Wang:

Fragrance In Sight: Personalized Perfume Production Based on Style Recognition. 7239-7240 - Ze Gao, Anqi Wang, Pan Hui, Tristan Braud:
Meditation in Motion: Interactive Media Art Visualization Based on Ancient Tai Chi Chuan. 7241-7242 - Hugo Pauget Ballesteros, Gilles Azzaro, Jean Mélou, Yvain Quéau, Jean-Denis Durou:

Read Your Voice: A Playful Interactive Sound Encoder/Decoder. 7243-7244 - Tai-Chen Tsai, Tse-Yu Pan, Min-Chun Hu, Ya-Lun Tao:
StimulusLoop: Game-Actuated Mutuality Artwork for Evoking Affective State. 7245-7247 - Emily Graber, Charles Picasso, Elaine Chew:
Viva Contemporary! Mobile Music Laboratory. 7248-7249 - Yuqian Sun, Chenhang Cheng, Ying Xu, Yihua Li, Chang Hee Lee, Ali Asadipour:
Wander: An AI-driven Chatbot to Visit the Future Earth. 7250-7251
Industry session
- Zhenyu Zhang, Bowen Yu, Haiyang Yu, Tingwen Liu, Cheng Fu, Jingyang Li, Chengguang Tang, Jian Sun, Yongbin Li:
Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration. 7252-7260 - Shiyao Wang, Qi Liu, Yicheng Zhong, Zhilong Zhou, Tiezheng Ge, Defu Lian, Yuning Jiang:
CreaGAN: An Automatic Creative Generation Framework for Display Advertising. 7261-7269 - Qinghui Sun, Jie Gu, Xiaoxiao Xu, Renjun Xu, Ke Liu, Bei Yang, Hong Liu, Huan Xu:
Learning Interest-oriented Universal User Representation via Self-supervision. 7270-7278 - Ruicheng Liu, Jialing Liang, Peiquan Jin, Yi Wang:

MMH-index: Enhancing Apache Lucene with High-Performance Multi-Modal Indexing and Searching. 7279-7289 - Qi Yang, Sergey I. Nikolenko, Alfred Huang, Aleksandr Farseev:
Personality-Driven Social Multimedia Content Recommendation. 7290-7299 - Junwu Zhang, Mang Ye, Yao Yang:
Learnable Privacy-Preserving Anonymization for Pedestrian Images. 7300-7308 - Wenke Huang, Mang Ye, Bo Du, Xiang Gao:
Few-Shot Model Agnostic Federated Learning. 7309-7316 - He Li, Mang Ye, Cong Wang, Bo Du:
Pyramidal Transformer with Conv-Patchify for Person Re-identification. 7317-7326
Open Source session
- Sachin Mehta, Farzad Abdolhosseini, Mohammad Rastegari:

CVNets: High Performance Library for Computer Vision. 7327-7330 - Yue Zhou, Xue Yang, Gefan Zhang, Jiabao Wang, Yanyi Liu, Liping Hou, Xue Jiang, Xingzhao Liu, Junchi Yan, Chengqi Lyu, Wenwei Zhang, Kai Chen:
MMRotate: A Rotated Object Detection Benchmark using PyTorch. 7331-7334 - Stéphane Massonnet, Marco Romanelli, Rémi Lebret, Niels Poulsen, Karl Aberer:

MoZuMa: A Model Zoo for Multimedia Applications. 7335-7338 - Wei Gao, Hang Yuan, Yang Guo, Lvfang Tao, Zhanyuan Cai, Ge Li:
OpenHardwareVC: An Open Source Library for 8K UHD Video Coding Hardware Implementation. 7339-7342 - Abdelhak Bentaleb, Zhengdao Zhan, Farzad Tashtarian, May Lim, Saad Harous, Christian Timmerer, Hermann Hellwagner, Roger Zimmermann:
Low Latency Live Streaming Implementation in DASH and HLS. 7343-7346 - Wei Gao

, Hua Ye, Ge Li, Huiming Zheng, Yuyang Wu, Liang Xie:
OpenPointCloud: An Open-Source Algorithm Library of Deep Learning Based Point Cloud Compression. 7347-7350 - Haodong Duan, Jiaqi Wang

, Kai Chen, Dahua Lin:
PYSKL: Towards Good Practices for Skeleton Action Recognition. 7351-7354 - Liang Qiao, Hui Jiang, Ying Chen, Can Li, Pengfei Li, Zaisheng Li

, Baorui Zou, Dashan Guo, Yingda Xu, Yunlu Xu, Zhanzhan Cheng, Yi Niu:
DavarOCR: A Toolbox for OCR and Multi-Modal Document Understanding. 7355-7358 - Yuwei Zhou, Hong Chen, Zirui Pan, Chuanhao Yan, Fanqi Lin, Xin Wang, Wenwu Zhu:

CurML: A Curriculum Machine Learning Library. 7359-7363
Reproducibility session
- Xin Jin, Ke Liu, Dongqing Zou, Zhonglan Li, Heng Huang, Vajira Thambawita:

Reproducibility Companion Paper: Focusing on Persons: Colorizing Old Images Learning from Modern Historical Movies. 7364-7367
Tutorial Overviews
- Fernando Pereira:

Deep Learning-based Point Cloud Coding for Immersive Experiences. 7368-7370 - Yiannis Andreopoulos, Cosmin Stejerean:
Advances in Quality Assessment Of Video Streaming Systems: Algorithms, Methods, Tools. 7371 - Zheng Wang, Dan Xu, Zhedong Zheng, Kui Jiang:

Multimedia Content Understanding in Harsh Environments. 7372-7373 - Ioannis Pitas, Ioannis Mademlis:
Autonomous UAV Cinematography. 7374-7376 - Xin Wang, Xiaohan Lan, Wenwu Zhu:

Video Grounding and Its Generalization. 7377-7379 - Federico Becattini, Tiberio Uricchio:
Memory Networks. 7380-7382 - Jakub Lokoc, Klaus Schoeffmann, Werner Bailer, Luca Rossetto, Björn Þór Jónsson:
Open Challenges of Interactive Video Search and Evaluation. 7383-7385
Workshop Overviews
- Hideo Saito, Thomas B. Moeslund, Rainer Lienhart:
MMSports'22: 5th International ACM Workshop on Multimedia Content Analysis in Sports. 7386-7388 - Shahin Amiriparian, Lukas Christ, Andreas König, Eva-Maria Meßner, Alan Cowen, Erik Cambria, Björn W. Schuller:
MuSe 2022 Challenge: Multimodal Humour, Emotional Reactions, and Stress. 7389-7391 - Wei Gao, Ge Li, Hui Yuan, Raouf Hamzaoui, Zhu Li, Shan Liu:
APCCPA '22: 1st International Workshop on Advances in Point Cloud Compression, Processing and Analysis. 7392-7393 - Xavier Alameda-Pineda, Qin Jin, Vincent Oria, Laura Toni:

M4MM '22: 1st International Workshop on Methodologies for Multimedia. 7394-7396 - Jingting Li, Moi Hoon Yap, Wen-Huang Cheng, John See, Xiaopeng Hong, Xiaobai Li, Su-Jing Wang:
FME '22: 2nd Workshop on Facial Micro-Expression: Advanced Techniques for Multi-Modal Facial Expression Analysis. 7397-7399 - Mohan S. Kankanhalli, Jianquan Liu, Yongkang Wong, Karen Stephen, Rishabh Sheoran, Anusha Bhamidipati:
NarSUM '22: 1st Workshop on User-centric Narrative Summarization of Long Videos. 7400-7401 - Yoko Yamakata, Atsushi Hashimoto, Jingjing Chen:

CEA++'22: 1st International Workshop on Multimedia for Cooking, Eating, and related APPlications. 7402-7404 - Jianhua Tao, Jiangyan Yi, Cunhang Fan, Ruibo Fu, Shan Liang, Pengyuan Zhang, Haizhou Li, Helen Meng, Dong Yu, Masato Akagi:

DDAM '22: 1st International Workshop on Deepfake Detection for Audio Multimedia. 7405-7406 - Dingwen Zhang, Chaowei Fang, Wu Liu, Xinchen Liu, Jingkuan Song, Hongyuan Zhu, Wenbing Huang, John Smith:

HCMA'22: 3rd International Workshop on Human-Centric Multimedia Analysis. 7407-7409 - Luca Rossetto, Werner Bailer, Jakub Lokoc, Klaus Schoeffmann:
IMuR 2022: Introduction to the 2nd Workshop on Interactive Multimedia Retrieval. 7410-7411 - Irene Viola, Hadi Amirpour, Maria Torres Vega:
IXR '22: 1st Workshop on Interactive eXtended Reality. 7412-7413 - Stavroula G. Mougiakakou, Giovanni Maria Farinella, Keiji Yanai, Dario Allegra:

MADiMa'22: 7th International Workshop on Multimedia Assisted Dietary Management. 7414-7415 - Xuemeng Song, Jingjing Chen, Federico Becattini, Weili Guan, Yibing Zhan, Tat-Seng Chua:
MCFR'22: 1st Workshop on Multimedia Computing towards Fashion Recommendation. 7416-7417 - Si Liu, Qin Jin, Luoqi Liu, Zongheng Tang, Linli Lin:

PIC'22: 4th Person in Context Workshop. 7418-7419 - Ravi Prakash, Mylène C. Q. Farias, Marcelo M. Carvalho, Ryan P. McMahan:
PIES-ME '22: 1st Workshop on Photorealistic Image and Environment Synthesis for Multimedia Experiments. 7420-7422 - Jing Li, Patrick Le Callet, Xinbo Gao, Zhi Li, Wen Lu, Jiachen Yang, Junle Wang:

QoEVMA'22: 2nd Workshop on Quality of Experience (QoE) in Visual Multimedia Applications. 7423-7425 - Valérie Gouet-Brunet, Ronak Kosti, Li Weng:

SUMAC '22: 4th ACM International workshop on Structuring and Understanding of Multimedia heritAge Contents. 7426-7427 - Liang Liao, Dan Xu, Yang Wu, Xiao Wang, Jing Xiao:

UoLMM'22: 2nd International Workshop on Robust Understanding of Low-quality Multimedia Data: Unitive Enhancement, Analysis and Evaluation. 7428-7430
