ICMR 2023: Thessaloniki, Greece
- Ioannis Kompatsiaris, Jiebo Luo, Nicu Sebe, Angela Yao, Vasileios Mezaris, Symeon Papadopoulos, Adrian Popescu, Zi Helen Huang:

Proceedings of the 2023 ACM International Conference on Multimedia Retrieval, ICMR 2023, Thessaloniki, Greece, June 12-15, 2023. ACM 2023
Regular Long Papers
- Nitish Nag, Hyungik Oh, Mengfan Tang, Mingshu Shi, Ramesh C. Jain:
Integrative Multi-Modal Computing for Personal Health Navigation. 1-9
- Hugo Schindler, Adrian Popescu, Van-Khoa Nguyen, Jerome Deshayes-Chossart:
Raising User Awareness about the Consequences of Online Photo Sharing. 10-19
- Sven Schultze, Ani Withöft, Larbi Abdenebaoui, Susanne Boll:
Explaining Image Aesthetics Assessment: An Interactive Approach. 20-28
- Omar Adjali, Paul Grimal, Olivier Ferret, Sahar Ghannay, Hervé Le Borgne:
Explicit Knowledge Integration for Knowledge-Aware Visual Question Answering about Named Entities. 29-38
- Shuo Chen, Ying-Jun Du, Pascal Mettes, Cees G. M. Snoek:
Multi-Label Meta Weighting for Long-Tailed Dynamic Scene Graph Generation. 39-47
- Ying He, Gongqing Wu, Desheng Cai, Xuegang Hu:
Cross-View Sample-Enriched Graph Contrastive Learning Network for Personalized Micro-video Recommendation. 48-56
- Konstantin Schall, Kai Uwe Barthel, Nico Hezel, Klaus Jung:
Improving Image Encoders for General-Purpose Nearest Neighbor Search and Classification. 57-66
- Giacomo Nebbia, Adriana Kovashka:
Hypernymization of named entity-rich captions for grounding-based multi-modal pretraining. 67-75
- Yizhao Gao, Zhiwu Lu:
CMMT: Cross-Modal Meta-Transformer for Video-Text Retrieval. 76-84
- Jiazhi Guan, Hang Zhou, Zhizhi Guo, Tianshu Hu, Lirui Deng, Chengbin Quan, Meng Fang, Youjian Zhao:
Dual-Modality Co-Learning for Unveiling Deepfake in Spatio-Temporal Space. 85-94
- Jiaxin Deng, Dong Shen, Haojie Pan, Xiangyu Wu, Ximan Liu, Gaofeng Meng, Fan Yang, Tingting Gao, Ruiji Fu, Zhongyuan Wang:
A Unified Model for Video Understanding and Knowledge Embedding with Heterogeneous Knowledge Graph Dataset. 95-104
- Chiyu Zhang, Zaiyan Dai, Peng Cao, Jun Yang:
Edge Enhanced Image Style Transfer via Transformers. 105-114
- Juheon Hwang, Jiwoo Kang, Kyoungoh Lee, Sanghoon Lee:
Unlocking Potential of 3D-aware GAN for More Expressive Face Generation. 115-124
- Yuze Wang, Junyi Wang, Yansong Qu, Yue Qi:
RIP-NeRF: Learning Rotation-Invariant Point-based Neural Radiance Field for Fine-grained Editing and Compositing. 125-134
- Tiancong Cheng, Ying Zhang, Yifang Yin, Roger Zimmermann, Zhiwen Yu, Bin Guo:
A Multi-Teacher Assisted Knowledge Distillation Approach for Enhanced Face Image Authentication. 135-143
- Ying Zhang, Lilei Zheng, Vrizlynn L. L. Thing, Roger Zimmermann, Bin Guo, Zhiwen Yu:
FaceLivePlus: A Unified System for Face Liveness Detection and Face Verification. 144-152
- Bing Han, Jianshu Li, Wenqi Ren, Man Luo, Jian Liu, Xiaochun Cao:
SIGMA-DF: Single-Side Guided Meta-Learning for Deepfake Detection. 153-161
- Yizhe Zhu, Jialin Gao, Xi Zhou:
AVForensics: Audio-driven Deepfake Video Detection with Masking Strategy in Self-supervision. 162-171
- Marco Arazzi, Marco Cotogni, Antonino Nocera, Luca Virgili:
Predicting Tweet Engagement with Graph Neural Networks. 172-180
- Peiwang Tang, Qinghua Zhang, Xianchao Zhang:
A Recurrent Neural Network based Generative Adversarial Network for Long Multivariate Time Series Forecasting. 181-189
- Victoria Sherratt, Kevin Pimbblet, Nina Dethlefs:
Multi-channel Convolutional Neural Network for Precise Meme Classification. 190-198
- Yankun Wu, Yuta Nakashima, Noa Garcia:
Not Only Generative Art: Stable Diffusion for Content-Style Disentanglement in Art Analysis. 199-208
- Wen-Jiin Tsai, Yi-Cheng Tien:
Attention-based Video Virtual Try-On. 209-216
- Soyun Choi, Youjia Zhang, Sungeun Hong:
Intra-inter Modal Attention Blocks for RGB-D Semantic Segmentation. 217-225
- Cheng-Yu Fang, Xian-Feng Han:
Joint Geometric-Semantic Driven Character Line Drawing Generation. 226-233
- Zeqing Xia, Zhouhui Lian:
CurveSDF: Binary Image Vectorization Using Signed Distance Fields. 234-242
- Yusong Wang, Dongyuan Li, Kotaro Funakoshi, Manabu Okumura:
EMP: Emotion-guided Multi-modal Fusion and Contrastive Learning for Personality Traits Recognition. 243-252
- Zefan Zhang, Yi Ji, Chunping Liu:
Knowledge-Aware Causal Inference Network for Visual Dialog. 253-261
- Chun Zhang, Keyan Ren, Qingyun Bian, Yu Shi:
Less is More: Decoupled High-Semantic Encoding for Action Recognition. 262-271
- Ziwei Xiong, Han Wang:
Dual-Stream Multimodal Learning for Topic-Adaptive Video Highlight Detection. 272-279
- Ruilin Zhang, Haiyang Zheng, Hongpeng Wang:
TDEC: Deep Embedded Image Clustering with Transformer and Distribution Information. 280-288
- Beibei Zhang, Yaqun Fang, Fan Yu, Jia Bei, Tongwei Ren:
MMSF: A Multimodal Sentiment-Fused Method to Recognize Video Speaking Style. 289-297
- Guoxing Yang, Haoyu Lu, Zelong Sun, Zhiwu Lu:
Shot Retrieval and Assembly with Text Script for Video Montage Generation. 298-306
- Shenshen Li, Xing Xu, Fumin Shen, Yang Yang:
Multi-granularity Separation Network for Text-Based Person Retrieval with Bidirectional Refinement Regularization. 307-315
- Tiening Sun, Zhong Qian, Peifeng Li, Qiaoming Zhu:
Graph Interactive Network with Adaptive Gradient for Multi-Modal Rumor Detection. 316-324
- Harsh Sinha, Adriana Kovashka:
Towards Shape-regularized Learning for Mitigating Texture Bias in CNNs. 325-334
- Mingqi Chen, Feng Shuang, Shaodong Li, Xi Liu:
ASCS-Reinforcement Learning: A Cascaded Framework for Accurate 3D Hand Pose Estimation. 335-342
- Yangming Zhou, Yuzhou Yang, Qichao Ying, Zhenxing Qian, Xinpeng Zhang:
Multi-modal Fake News Detection on Social Media via Multi-grained Information Fusion. 343-352
- Mingjun Li, Shuo Xu, Feng Su:
Learning and Fusing Multi-Scale Representations for Accurate Arbitrary-Shaped Scene Text Recognition. 353-361
- Chunhong Cao, Huawei Fu, Gai Li, Mengyang Wang, Xieping Gao:
Modeling Functional Brain Networks with Multi-Head Attention-based Region-Enhancement for ADHD Classification. 362-369
- Chunhong Cao, Gai Li, Huawei Fu, Xingxing Li, Xieping Gao:
SPAE: Spatial Preservation-based Autoencoder for ADHD functional brain networks modelling. 370-377
- Bingchao Wu, Yangyuxuan Kang, Bei Guan, Yongji Wang:
We Are Not So Similar: Alleviating User Representation Collapse in Social Recommendation. 378-387
- Pengzhi Li, Yikang Ding, Linge Li, Jingwei Guan, Zhiheng Li:
Towards Practical Consistent Video Depth Estimation. 388-397
- Jiancheng Pan, Qing Ma, Cong Bai:
Reducing Semantic Confusion: Scene-aware Aggregation Network for Remote Sensing Cross-modal Retrieval. 398-406
- Jialin Tian, Xing Xu, Zuo Cao, Gong Zhang, Fumin Shen, Yang Yang:
Zero-shot Sketch-based Image Retrieval with Adaptive Balanced Discriminability and Generalizability. 407-415
- Liang Li, Weiwei Sun:
Label-wise Deep Semantic-Alignment Hashing for Cross-Modal Retrieval. 416-424
- Ying Li, Chunming Guan, Jiaquan Gao:
TsP-Tran: Two-Stage Pure Transformer for Multi-Label Image Retrieval. 425-433
- Maria Pegia, Björn Þór Jónsson, Anastasia Moumtzidou, Ilias Gialampoukidis, Stefanos Vrochidis, Ioannis Kompatsiaris:
MuseHash: Supervised Bayesian Hashing for Multimodal Image Representation. 434-442
- Siteng Huang, Qiyao Wei, Donglin Wang:
Reference-Limited Compositional Zero-Shot Learning. 443-451
- Haram Choi, Cheolwoong Na, Jinseop Kim, Jihoon Yang:
Exploration of Lightweight Single Image Denoising with Transformers and Truly Fair Training. 452-461
- Feng Zhao, Min Zhang, Tiancheng Huang, Donglin Wang:
TAGM: Task-Aware Graph Model for Few-shot Node Classification. 462-471
- Yutian Luo, Yizhao Gao, Zhiwu Lu:
Learning with Adaptive Knowledge for Continual Image-Text Modeling. 472-480
- Wenxiu Geng, Xiangxian Li, Yulong Bian:
A Dual-branch Enhanced Multi-task Learning Network for Multimodal Sentiment Analysis. 481-489
- Yu Zang, Zhe Xue, Shilong Ou, Yunfei Long, Hai Zhou, Junping Du:
FedPcf : An Integrated Federated Learning Framework with Multi-Level Prospective Correction Factor. 490-498
- Lina Sun, Yewen Li, Yumin Dong:
Learning From Expert: Vision-Language Knowledge Distillation for Unsupervised Cross-Modal Hashing Retrieval. 499-507
- Yaoqing Li, Sheng-Hua Zhong, Shuai Li, Yan Liu:
A Robust Deep Learning Enhanced Monocular SLAM System for Dynamic Environments. 508-515
- Yingnan Fu, Wenyuan Cai, Ming Gao, Aoying Zhou:
Symbol Location-Aware Network for Improving Handwritten Mathematical Expression Recognition. 516-524
Regular Short Papers
- Daichi Suzuki, Go Irie, Kiyoharu Aizawa:
Text-to-Image Fashion Retrieval with Fabric Textures. 525-529
- Panagiota Alexoudi, Ioannis Mademlis, Ioannis Pitas:
Escaping local minima in deep reinforcement learning for video summarization. 530-534
- Florian Spiess, Ralph Gasser, Silvan Heller, Heiko Schuldt, Luca Rossetto:
A Comparison of Video Browsing Performance between Desktop and Virtual Reality Interfaces. 535-539
- Zhexu Shen, Liang Yang, Zhihan Yang, Hongfei Lin:
More Than Simply Masking: Exploring Pre-training Strategies for Symbolic Music Understanding. 540-544
- Pu Ching, Hung-Kuo Chu, Min-Chun Hu:
SOFA: Style-based One-shot 3D Facial Animation Driven by 2D landmarks. 545-549
- Kun He, Changyu Li, Jie Shao:
Strong-Weak Cross-View Interaction Network for Stereo Image Super-Resolution. 550-554
- Jiabao Sheng, Saikit Lam, Zhe Li, Jiang Zhang, Xinzhi Teng, Yuanpeng Zhang, Jing Cai:
Multi-view Contrastive Learning with Additive Margin for Adaptive Nasopharyngeal Carcinoma Radiotherapy Prediction. 555-559
- Shuiying Liao, Yujuan Ding, P. Y. Mok:
Recommendation of Mix-and-Match Clothing by Modeling Indirect Personal Compatibility. 560-564
- Arun Zachariah, Praveen Rao:
Video Retrieval for Everyday Scenes With Common Objects. 565-570
- Tse-Yu Pan, Herman Prawiro, Jian-Wei Peng, Wen-Cheng Chen, Hung-Kuo Chu, Min-Chun Hu:
Offensive Tactics Recognition in Broadcast Basketball Videos Based on 2D Camera View Player Heatmaps. 571-575
- Meishan Liu, Meng Jian, Ge Shi, Ye Xiang, Lifang Wu:
Graph Contrastive Learning on Complementary Embedding for Recommendation. 576-580
- Sahar Tahmasebi, Sherzod Hakimov, Ralph Ewerth, Eric Müller-Budack:
Improving Generalization for Multimodal Fake News Detection. 581-585
- Christos Koutlis, Manos Schinas, Symeon Papadopoulos:
MemeFier: Dual-stage Modality Fusion for Image Meme Classification. 586-591
- Aristotelis Ballas, Christos Diou:
CNNs with Multi-Level Attention for Domain Generalization. 592-596
- Werner Bailer, Rahel Arnold, Vera Benz, Davide Coccomini, Anastasios Gkagkas, Gylfi Þór Guðmundsson, Silvan Heller, Björn Þór Jónsson, Jakub Lokoc, Nicola Messina, Nick Pantelidis, Jiaxin Wu:
Improving Query and Assessment Quality in Text-Based Interactive Video Retrieval Evaluation. 597-601
- Iacopo Ghinassi, Lin Wang, Chris Newell, Matthew Purver:
Multimodal Topic Segmentation of Podcast Shows with Pre-trained Neural Encoders. 602-606
- Georgios Orfanidis, Konstantinos Ioannidis, Anastasios Tefas, Stefanos Vrochidis, Ioannis Kompatsiaris:
Tweaking EfficientDet for frugal training. 607-611
- Mingyuan Ge, Yewen Li, Longfei Ma, Mingyong Li:
Deep Enhanced-Similarity Attention Cross-modal Hashing Learning. 612-616
- Kai Feng, Tao Liu, Heng Zhang, Zihao Meng, Zemin Miao:
TNOD: Transformer Network with Object Detection for Tag Recommendation. 617-621
- Tianqi Zhao, Ming Kong, Tian Liang, Qiang Zhu, Kun Kuang, Fei Wu:
CLAP: Contrastive Language-Audio Pre-training Model for Multi-modal Sentiment Analysis. 622-626
Brave New Ideas Paper
- David Alonso del Barrio, Daniel Gatica-Perez:
Framing the News: From Human Perception to Large Language Model Inferences. 627-635
Doctoral Symposium Paper
- Shenshen Li:
Dual-Path Semantic Construction Network for Composed Query-Based Image Retrieval. 636-639
Reproducibility Track Paper
- Mitchell Lee, Chris Lee, Sanjay Penmetsa, Min Chen, Mizuki Miyashita, Naatosi Fish, Bo Wu, Omar Shahbaz Khan:
Reproducibility Companion Paper: MeTILDA - Platform for Melodic Transcription in Language Documentation and Application. 640-643
Technical Demonstrations
- Kento Terauchi, Keiji Yanai:
CalorieCam360: Simultaneous Eating Action Recognition of Multiple People Using an Omnidirectional Camera. 644-648
- Giuseppe Amato, Paolo Bolettieri, Fabio Carrara, Fabrizio Falchi, Claudio Gennaro, Nicola Messina, Lucia Vadicamo, Claudio Vairo:
VISIONE: A Large-Scale Video Retrieval System with Advanced Search Functionalities. 649-653
- Kai Uwe Barthel, Nico Hezel, Konstantin Schall, Klaus Jung:
navigu.net: NAvigation in Visual Image Graphs gets User-friendly. 654-658
- Manos Schinas, Panagiotis Galopoulos, Symeon Papadopoulos:
MAAM: Media Asset Annotation and Management. 659-663
- Stefanos Stoikos, David Kauchak, Douglas Turnbull, Alexandra Papoutsaki:
Cross-Language Music Recommendation Exploration. 664-668
Keynote Talk Abstracts
- Nozha Boujemaa, Abdelrahman Hassan, Giorgi Kokaia, Pratyush Kumar Sinha:
How Responsible LLMs are beneficial to search and exploration in Retail industry. 669
- Jürgen Gall:
Efficient CNNs and Transformers for Video Understanding and Image Synthesis. 670
- Elisa Ricci:
Recognizing Actions in Videos under Domain Shift. 671
Tutorial Abstract
- Kai Uwe Barthel:
Algorithms for Generating and Evaluating Visually Sorted Grid Layouts. 672-673
Workshop Abstracts
- Guillaume Habault, Minh-Son Dao, Michael Alexander Riegler, Duc-Tien Dang-Nguyen, Yuta Nakashima, Cathal Gurrin:
ICDAR'23: Intelligent Cross-Data Analysis and Retrieval. 674-675
- Luca Cuccovillo, Bogdan Ionescu, Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Adrian Popescu:
MAD '23 Workshop: Multimedia AI against Disinformation. 676-677
- Cathal Gurrin, Björn Þór Jónsson, Duc-Tien Dang-Nguyen, Graham Healy, Jakub Lokoc, Liting Zhou, Luca Rossetto, Minh-Triet Tran, Wolfgang Hürst, Werner Bailer, Klaus Schoeffmann:
Introduction to the Sixth Annual Lifelog Search Challenge, LSC'23. 678-679