


8th PRCV 2025: Shanghai, China - Part VII
- Josef Kittler, Hongkai Xiong, Jian Yang, Xilin Chen, Jiwen Lu, Weiyao Lin, Jingyi Yu, Weishi Zheng:
Pattern Recognition and Computer Vision - 8th Chinese Conference, PRCV 2025, Shanghai, China, October 15-18, 2025, Proceedings, Part VII. Lecture Notes in Computer Science 16278, Springer 2026, ISBN 978-981-95-5675-5

Document Analysis and Recognition
- Xugong Qin, Jiuqiang Tian, Jiayi Sheng, Tiantian Xia, Yuyi Wang, Chengrui Li, Gangyan Zeng:
Towards Fine-Grained Document Tampering Detection: New Dataset and Benchmark. 3-19
- Yu Liu, Yuqiu Kong, Yang Ding, Fang Liu, Lei Wang, Cunrui Wang:
Elegantly Written V2: Next-Scale Prediction for Enhancing Online Chinese Handwriting. 20-34
- Yazhou Zhang, Chunwang Zou, Qimeng Liu, Lu Rong, Ben Yao, Zheng Lian, Qiuchi Li, Peng Zhang, Jing Qin:
Are MLLMs Trapped in the Visual Room? 35-49
- Yaxuan Hu, Zhongyuan Wang, Yan Xiong:
Connectivity Relationship Recognition in Piping and Instrumentation Diagrams Using Graph Neural Networks. 50-64
- Yichen Shi, Yuzhi Liu, Zhuofu Tao, Li Huang, Yuhao Gao, Ting-Jung Lin, Lei He:
Symbol and Footprint Database for Electronic Components by Agentic Recognition and Generation. 65-78
- Linying Wang, Mayire Ibrayim, Peichao Jiang, Wenjie Xu:
TriBiaNet: Hierarchical-Biaffine Multimodal Fusion for Key Information Extraction in Visually Rich Documents. 79-93
- Qiuman Tan, Kun Xu, Wancheng Jing, Xin Cheng, WenSheng Hu:
Bottom-Up Multi-document Localization Method in Unconstrained Environments. 94-108
- Haowei Xu, Jiaxin Zhang, Hiuyi Cheng, Peirong Zhang, Xuhan Zheng, Lianwen Jin:
Towards Real-World Document Specular Highlight Removal: The DocHighlight Dataset and DocSHRNet Method. 109-124
- Zhiyuan Chen, Yaping Zhang, Zhiyang Zhang, Yupu Liang, Yue Xu, Yunfei Lu, Dandan Tu, Chengqing Zong, Yu Zhou:
Boosting Document Image Translation via Layout-Aware Semantic Paragraph Clustering. 125-139

Action Recognition
- Lingjie Zeng, Hailun Zhang, Xinrui Wang, Zhen Zhai, Qijun Zhao, Hanyang Lin:
Divide-and-Specialize CLIP: A Text-Guided Multi-expert Framework for Fine-Grained Action Recognition. 143-157
- Shenghong Zhong, Bi Zeng, Jinjie Wang, Yujun Zhu:
A Two-Stage Multimodal Framework for Real-Time Item Pickup and Return Recognition in Unmanned Retail Stores. 158-173
- Bingbing Zhang, Yuanchen Ma, Meng Li, Jianxin Zhang, Qiang Zhang:
Few-Shot Action Recognition Based on Visual-Language Prototype Hierarchical Temporal Enhancement. 174-187
- Bingbing Zhang, Yongqi Li, Meng Li, Jianxin Zhang, Qiang Zhang:
Cross Attention Guided Multimodal Network for Video Action Recognition. 188-202
- Tuyun Shang, Yanbin Hao, Ming Pei, Kun Li, Huixia Ben, Shuo Wang:
Cross-modal Feature Enhancement and Contrastive Alignment for Micro-gesture Recognition. 203-217
- Chenyou Fan, Kehui Tan, Yanzhao Chen, Tianqi Pang, Haiqi Jiang, Junjie Hu:
Enhancing Human Trajectory Prediction with Reinforcement Learning from Quantified Human Preferences. 218-232
- Yuehan Jiang, Hongjun Li:
Fusing Rigid Skeletal Nodes to Graph Convolutional Networks for Fuzzy Action Recognition. 233-247
- Zhewen Zhou, Bing Li, Qing Guo, Chunlei Li, Guangshuai Gao:
PACTFormer: Peak-Aware Cross-Temporal Transformer for Temporal Action Detection. 248-261
- Bingbing Zhang, Yongqi Li, Meng Li, Jianxin Zhang, Qiang Zhang:
High-Order Multimodal Multi-task Video Action Recognition. 262-276
- Chong Cao, Mingliang Xue, Shu Cao, Wanquan Liu, Xiaodong Duan:
Hierarchy-Aware Harmonization Network for Open-Vocabulary HOI Detection. 277-291
- Ruizhao Zhai, Wanru Xu, Zhenjiang Miao, Yi Tian, Ping Guo, Qinghao Kong:
Causal Debiasing Network for Action Quality Assessment. 292-306
- Junshi Yang, Shenglan Liu, Xuhan Sheng, Gang Yan, Yiheng Zhou, Lin Feng, Jiajun Fan:
Bridging the Point to Boundary Gap for Point-Supervised Temporal Action Localization with Single-Stage Inference. 307-320
- Tian Bai, Zhengjie Ni, Hongjun Li:
MPAF-SECL: Multi-path Adaptive Fusion with Semantic-Enhanced Cross-Modal Learning for Skeleton-Based Action Recognition. 321-335

Face Recognition and Pose Recognition
- Mingyue Li, Fengchen Shi, Jiashuo Mi, Ruizhong Du:
Saliency Semantic Ranking with Quality Restoration Synergistic Adversarial Attack on Face Recognition. 339-353
- Yang Gao, Xiaoqi An, Di Wang, Fei Gao, Lin Zhao:
Deformable Epipolar Transformer for Robust 3D Human Pose Estimation. 354-368
- Yucheng Jin, Jinyan Chen, Ziyue He, Baojun Han, Furan An:
STAR-Pose: Efficient Low-Resolution Video Human Pose Estimation via Spatial-Temporal Adaptive Super-Resolution. 369-383
- Yuhao Wang, Qixuan Su, Decheng Liu, Chunlei Peng:
Imperceptible Face Forgery Attack via Adversarial Semantic Mask. 384-399
- Zhaoli Zhu, Jikai Zhang, Xianghao Zeng, Chenjie Xie:
ViTRotPose: Light-Weight Human Pose Estimation Based On Vision Transformer and Rotational Position Coding. 400-413
- Wenjie Sun, Jiayu Ying, Wenqi Wang:
MTRAN: A Multi-type Regional Attention Network for Facial Expression Recognition. 414-426
- Kaidi Hu, Xiaoyu Liu, Guojiao Zhao, Ruigang Yang:
LLMamba-Net: A Lightweight Network Integrating Linear Mamba for Facial Expression Recognition. 427-439
- Yuying Zhao, Jiani Hu, Chun-Guang Li:
Variational Vision Transformer with Anti-Over-Smoothing Strategy for Robust Face Recognition. 440-454
- Xin Li, Bingxin Xu, Hongzhe Liu, Weiguo Pan, Cheng Xu:
YOLODF: YOLO-Based Spatial-Frequency Interaction Mining for General Deepfake Detection. 455-470
- Enfan Lan, Yifan Yang, Chenyang Zhao, Dong Liu, Jingtai Liu:
FGI-Gaze: Gaze Target Detection via Filtered Human-Environment Gaze Interaction. 471-485
- Wenxiao Tang, Shichen Wang, Zhonghua Miao:
Linlot: Limb Connection Relationship Constraints and Keypoint Localization Refinement for Pose Estimation. 486-503
- Jinghong Zheng, Changlong Jiang, Jiaqi Li, Haohong Kuang, Hang Xu, Tingbing Yan:
UniPose: Unified Cross-Modality Pose Prior Propagation Towards RGB-D Data for Weakly Supervised 3D Human Pose Estimation. 504-518
- Tianbao Zhang, Jian Zhao, Yuer Li, Zheng Zhu, Ping Hu, Zhaoxin Fan, Wenjun Wu, Xuelong Li:
AsynFusion: Towards Asynchronous Latent Consistency Models for Decoupled Whole-Body Audio-Driven Avatars. 519-535
- Libo Lv, Tianyi Wang, Mengxiao Huang, Ruixia Liu, Yinglong Wang:
A Spatial-Frequency Aware Multi-scale Fusion Network for Real-Time Deepfake Detection. 536-550
- Cheng Li, Xianzhong Liu, Jiayang Yu:
DDANet: A Dual-Path Direction-Aware Network with Cross-Direction Attention for Facial Expression Recognition. 551-564

Character Recognition
- Yue Zhang, Zhehao Zhang, Sen Feng, Fanghui Zhang, Guoqi Liu, Yigang Cen:
Pedestrian Open-Attribute Recognition via Dynamic Semantic Masking. 567-580
- Xiangshu Ruan, Mingyu Fan, Yijian Wu, Dan Cheng:
STAF: Symbol-Targeted Adversarial Flow for Handwritten Mathematical Expression Recognition. 581-595
- Tianyu Fang, Kunchi Li, Yun Wu, Da-Han Wang:
Dual Manifold Volume-Balanced Framework for Long-Tailed Oracle Character Recognition. 596-610
- Wenkai Li, Yongbin Mu, Miaomiao Xu, Mieradilijiang Maimaiti, Yanbing Li, Wushour Silamu:
RAST: Residual-Attentive and Scale-Aware Transformer for Robust Scene Text Recognition. 611-625
- Chuanlong Liu, Shuangying Li, Miaomiao Xu, Mieradilijiang Maimaiti, Wushour Silamu:
Visual-Semantic Dual-Decoder Collaboration for Scene Text Recognition. 626-641
- Yaolin Weng, Chuanlong Liu, Miaomiao Xu, Mieradilijiang Maimaiti, Wushour Silamu:
MixFormer: A Cross-Modal Transformer for Arbitrary-Shaped Scene Text Detection. 642-656















