


18th ECCV 2024: Milan, Italy - Part VIII
- Ales Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol:
Computer Vision - ECCV 2024 - 18th European Conference, Milan, Italy, September 29-October 4, 2024, Proceedings, Part VIII. Lecture Notes in Computer Science 15066, Springer 2025, ISBN 978-3-031-73241-6
- Mattia Segù, Luigi Piccinelli, Siyuan Li, Luc Van Gool, Fisher Yu, Bernt Schiele:
Walker: Self-supervised Multiple Object Tracking by Walking on Temporal Appearance Graphs. 1-18
- Sumin Lee, Yooseung Wang, Sangmin Woo, Changick Kim:
Spatio-Temporal Proximity-Aware Dual-Path Model for Panoramic Activity Recognition. 19-36
- Ali Hatamizadeh, Jiaming Song, Guilin Liu, Jan Kautz, Arash Vahdat:
DiffiT: Diffusion Vision Transformers for Image Generation. 37-55
- Zirui Shao, Feiyu Gao, Hangdi Xing, Zepeng Zhu, Zhi Yu, Jiajun Bu, Qi Zheng, Cong Yao:
WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation. 56-74
- Changshuo Wang, Meiqing Wu, Siew-Kei Lam, Xin Ning, Shangshu Yu, Ruiping Wang, Weijun Li, Thambipillai Srikanthan:
GPSFormer: A Global Perception and Local Structure Fitting-Based Transformer for Point Cloud Understanding. 75-92
- Ke Fan, Junshu Tang, Weijian Cao, Ran Yi, Moran Li, Jingyu Gong, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Lizhuang Ma:
FreeMotion: A Unified Framework for Number-Free Text-to-Motion Synthesis. 93-109
- Zheng Jiang, Jinqing Zhang, Yanan Zhang, Qingjie Liu, Zhenghui Hu, Baohui Wang, Yunhong Wang:
FSD-BEV: Foreground Self-distillation for Multi-view 3D Object Detection. 110-126
- Yang Miao, Francis Engelmann, Olga Vysotska, Federico Tombari, Marc Pollefeys, Dániel Béla Baráth:
SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs. 127-150
- Chenming Zhu, Tai Wang, Wenwei Zhang, Kai Chen, Xihui Liu:
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities. 151-168
- Renrui Zhang, Dongzhi Jiang, Yichi Zhang, Haokun Lin, Ziyu Guo, Pengshuo Qiu, Aojun Zhou, Pan Lu, Kai-Wei Chang, Yu Qiao, Peng Gao, Hongsheng Li:
MATHVERSE: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? 169-186
- Zhonghan Zhao, Wenhao Chai, Xuan Wang, Boyi Li, Shengyu Hao, Shidong Cao, Tian Ye, Gaoang Wang:
See and Think: Embodied Agent in Virtual Environment. 187-204
- Guangcheng Chen, Yicheng He, Li He, Hong Zhang:
PISR: Polarimetric Neural Implicit Surface Reconstruction for Textureless and Specular Objects. 205-222
- Xinpeng Liu, Yong-Lu Li, Ailing Zeng, Zizheng Zhou, Yang You, Cewu Lu:
Bridging the Gap Between Human Motion and Action Semantics via Kinematic Phrases. 223-240
- Ofir Abramovich, Niv Nayman, Sharon Fogel, Inbal Lavi, Ron Litman, Shahar Tsiper, Royee Tichauer, Srikar Appalaraju, Shai Mazor, R. Manmatha:
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding. 241-259
- Zhihao Li, Biao Hou, Siteng Ma, Zitong Wu, Xianpeng Guo, Bo Ren, Licheng Jiao:
Masked Angle-Aware Autoencoder for Remote Sensing Images. 260-278
- Yi Wu, Ziqiang Li, Heliang Zheng, Chaoyue Wang, Bin Li:
Infinite-ID: Identity-Preserved Personalization via ID-Semantics Decoupling Paradigm. 279-296
- Zhi-Fan Wu, Lianghua Huang, Wei Wang, Yanheng Wei, Yu Liu:
MultiGen: Zero-Shot Image Generation from Multi-modal Prompts. 297-313
- Xianyu Chen, Ming Jiang, Qi Zhao:
GazeXplain: Learning to Predict Natural Language Explanations of Visual Scanpaths. 314-333
- Yifeng Zhang, Ming Jiang, Qi Zhao:
Learning Chain of Counterfactual Thought for Bias-Robust Vision-Language Reasoning. 334-351
- Hanrong Ye, Jason Kuen, Qing Liu, Zhe Lin, Brian L. Price, Dan Xu:
SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis. 352-370
- Ishan Rajendrakumar Dave, Fabian Caba Heilbron, Mubarak Shah, Simon Jenni:
Sync from the Sea: Retrieving Alignable Videos from Large-Scale Datasets. 371-388
- Ishan Rajendrakumar Dave, Mamshad Nayeem Rizve, Mubarak Shah:
FinePseudo: Improving Pseudo-labelling Through Temporal-Alignablity for Semi-supervised Fine-Grained Action Recognition. 389-408
- Yu Liu, Fatimah Binti Khalid, Lei Wang, Youxi Zhang, Cunrui Wang:
Elegantly Written: Disentangling Writer and Character Styles for Enhancing Online Chinese Handwriting. 409-425
- Sipeng Zheng, Bohan Zhou, Yicheng Feng, Ye Wang, Zongqing Lu:
UniCode: Learning a Unified Codebook for Multimodal Large Language Models. 426-443
- Baifeng Shi, Ziyang Wu, Maolin Mao, Xin Wang, Trevor Darrell:
When Do We Not Need Larger Vision Models? 444-462
- Xianglong He, Junyi Chen, Sida Peng, Di Huang, Yangguang Li, Xiaoshui Huang, Chun Yuan, Wanli Ouyang, Tong He:
GVGEN: Text-to-3D Generation with Volumetric Representation. 463-479
- Zhening Liu, Xinjie Zhang, Jiawei Shao, Zehong Lin, Jun Zhang:
Bidirectional Stereo Image Compression with Cross-Dimensional Entropy Model. 480-496
