


19th ICDAR 2025: Wuhan, China - Part IV
- Xu-Cheng Yin, Dimosthenis Karatzas, Daniel Lopresti:
Document Analysis and Recognition - ICDAR 2025 - 19th International Conference, Wuhan, China, September 16-21, 2025, Proceedings, Part IV. Lecture Notes in Computer Science 16026, Springer 2026, ISBN 978-3-032-04626-0
Poster Papers
- Nil Biescas, Sanket Biswas, Josep Lladós, Jordy Van Landeghem:
Where Layout Meets Language: Lightweight Spatial Enhancement to Large Language Models for Document Understanding. 3-20
- K. M. Megha Mariam, C. V. Jawahar:
Attend to What I Say: Highlighting Relevant Content on Slides. 21-37
- Long Yin, Kai Yin, Hui Zhao:
A Unified Framework for Knowledge-Intensive Numerical Reasoning over Financial Document. 38-59
- Irfan Ali, Liliana Lo Presti, Igor Spano, Marco La Cascia:
A Unified Attention-Based Model for Segmenting Compound Words in Sanskrit. 60-76
- Malte Sönnichsen, Mayar Elfares, Yao Wang, Ralf Küsters, Alina Roitberg, Andreas Bulling:
AttentionLeak: What Does Human Attention Reveal About Information Visualisation? 77-95
- Saifullah Saifullah, Stefan Agne, Andreas Dengel, Sheraz Ahmed:
DP-DocLDM: Differentially Private Document Image Generation Using Latent Diffusion Models. 96-119
- François Wieckowiak, Véronique Eglin, Tony Bonnet, Stéphane Bres, Laëtitia Rousseau:
A Multimodal Evaluation Pipeline for Mathematical Expression Recognition: Comparisons of Datasets, Metrics, and Models. 120-136
- Nam Quan Nguyen, Xuan Phong Pham, Tuan-Anh Tran:
SepFormer: Coarse-to-Fine Separator Regression Network for Table Structure Recognition. 137-153
- Florent Meyer, Laurent Guichard, Denis Coquenet, Guillaume Gravier, Yann Soullard, Bertrand Coüasnon:
Relaxed Syntax Modeling in Transformers for Future-Proof License Plate Recognition. 154-171
- Zeng Li, Jin Wei, Zhijie Shen, Yaqiang Wu, Gangyan Zeng, Dongbao Yang, Zhi Qiao, Yu Zhou:
PerturbCTC: Improving Alignment in Scene Text Recognition with Feature Perturbation Based CTC. 172-190
- Mingzhao Xie, Wandong Xue, Dongming Chen, Dongqi Wang:
Text Detection in Industrial Design Drawings via Multi-dimensional Feature Fusion and Differentiable Binarization. 191-207
- Xin Liu, Wen Huang, Junhui Chen, Xingyi Wang, Jian Peng:
OracleProtoPNet: Oracle Character Recognition with Interpretability. 208-223
- Tom Simon, William Mocaer, Pierrick Tranouez, Clément Chatelain, Thierry Paquet:
Classifying the Unknown: In-Context Learning for Open-Vocabulary Text and Symbol Recognition. 224-243
- Diego Belzarena, Seginus Mowlavi, Aitor Artola, Camilo Mariño, Marina Gardella, Ignacio Ramírez, Antoine Tadros, Roy He, Natalia Bottaioli, Boshra Rajaei, Gregory Randall, Jean-Michel Morel:
Improving OCR Using Internal Document Redundancy. 244-260
- Julien Delaunay, Tran Thi Hong Hanh, Carlos-Emiliano González-Gallardo, Georgeta Bordea, Nicolas Sidere, Antoine Doucet, Olivier de Viron:
Multidisciplinary End-to-End Document-Level Relation Extraction from Scientific Literature. 261-280
- Hung Tuan Nguyen, Thanh-Nghia Truong, Nam Tuan Ly, Masaki Nakagawa, Toshihiko Horie:
Automated Recognition and Scoring of Handwritten Short Answer: Insights from Japanese Elementary and Junior High Schools. 281-293
- Qi Dong, Lei Kang, Maura Pintor, Dimosthenis Karatzas:
Position-Aware Stamp-Like Adversarial Attack for Document Classification. 294-310
- Youjie Li, Shiqiang Zheng, Guijia Zhang, Qifeng Chen, Changsheng Chen:
DocAI-TL: Structured Document Tampering Localization with DocAI Model. 311-328
- Nauman Riaz, Stefan Agne, Andreas Dengel, Sheraz Ahmed:
DocForgeNet: Dual Cross-Stream Fusion Network for Robust Forgery Detection in Scanned Documents. 329-346
- Tuo Wang, Yixiao Zhou, Tongwei Zhang, Zhicheng He, Yumeng Zhao, Xiaoqing Lyu:
SSSI: Self-prompted Segmentation of Scientific Illustrations. 347-361
- Menghui Liu, Guanghui Wang, Lang Yu, Yilan Yang, Lingfeng Shen, Heng Li:
E-FCOS: Enhanced Historical Text Detection with Fast Fourier Transform Denoising and Adaptive Multi-scale Fusion. 362-381
- Guangrui Fan:
DocPINN: A Neural PDE-Based Framework for Document Image Dewarping. 382-397
- Zhongjiang He, Ye Yuan, An Zhao, Han Fang, Hao Sun, Kongming Liang, Zhanyu Ma:
SelectVision: Adaptive Vision Resolution Selection for Visual Document Understanding. 398-413
- Yiming Wang, Hongxi Wei, Heng Wang, Shiwen Sun:
SFRD: Handwritten Mathematical Expressions Generation by Spatial-Aware Feature Refinement Diffusion. 414-428
- Tim Raven, Vincent Christlein, Gernot A. Fink:
Interpretable Writer Recognition via Vectors of Locally Aggregated Characters. 429-445
- Demin Zhang, Jiahao Lyu, Zhijie Shen, Yu Zhou:
Class-Agnostic Region-of-Interest Matching in Document Images. 446-464
- Aniket Gurav, Sukalpa Chanda, Narayanan C. Krishnan:
Beyond Memorization: Training-Free Style Mixing for Variability in Handwritten Text Generation Using Writer Embedding Injection in Pretrained Diffusion Models. 465-484
- Utathya Aich, Shinjini Chakraborty, Deepan Sadhukhan, Swarnendu Ghosh, Tulika Saha:
HiLEx: Image-Based Hierarchical Layout Extraction from Question Papers. 485-505
- Souparni Mazumder, Sanket Biswas, Aniket Pal, Alloy Das, Umapada Pal, Josep Lladós:
Doc2GraphFormer: Bridging Structured Graph Learning with Transformer Attention for Efficient Document Understanding. 506-522
- Alexander Vogel, Omar Moured, Yufan Chen, Jiaming Zhang, Rainer Stiefelhagen:
RefChartQA: Grounding Visual Answer on Chart Images Through Instruction Tuning. 523-537
- Yixin Zhao, Yuyi Zhang, Lianwen Jin:
MCCD: A Multi-attribute Chinese Calligraphy Character Dataset Annotated with Script Styles, Dynasties, and Calligraphers. 538-555
- Michal Turski, Mateusz Chilinski, Lukasz Borchmann:
Unchecked and Overlooked: Addressing the Checkbox Blind Spot in Large Language Models with CheckboxQA. 556-569
- Junyi Yuan, Jian Zhang, Fangyu Wu, Huanda Lu, Dongming Lu, Qiufeng Wang:
Towards Cross-Modal Retrieval in Chinese Cultural Heritage Documents: Dataset and Solution. 570-586
- Sachin Raja, Ajoy Mondal, C. V. Jawahar:
EviFiVQA: A Benchmark for Evidence-Grounded Multi-hop Reasoning in Financial VQA. 587-604
- Yulia S. Chernyshova, Daniil A. Ilyukhin, Vladimir V. Arlazarov:
MIDV-UP: A Dataset of Pakistani and Iranian ID Documents. 605-619
- Oreen Yousuf, Abdulmalik Aminu, Musa Salih Muhammad, Bashir Usman, Mustapha Kurfi Hashim, Joakim Nivre, Beáta Megyesi, Christian Høgel:
A Handwritten Text Recognition Dataset for Ajami Manuscripts in Fulfulde and Hausa. 620-637
- Josh Knize, Kenny Davila:
Optimizing Chart Image Classification: A Study of Data Augmentation and Training Strategies. 638-655















