Computer Vision and Image Understanding, Volume 249, 2024
- Deyu Lin, Huanxin Wang, Xin Lei, Weidong Min, Chenguang Yao, Yuan Zhong, Yong Liang Guan: DSU-GAN: A robust frontal face recognition approach based on generative adversarial network. 104128
- Boyuan Zhang, Jiaxu Li, Yucheng Shi, Yahong Han, Qinghua Hu: VADS: Visuo-Adaptive DualStrike attack on visual question answer. 104137
- Kainat Riaz, Muhammad Latif Anjum, Wajahat Hussain, Rohan Manzoor: Targeted adversarial attack on classic vision pipelines. 104140
- Giulia Martinelli, Nicola Garau, Niccoló Bisagno, Nicola Conci: MoMa: Skinned motion retargeting using masked pose modeling. 104141
- Francesco Tassone, Luca Maiano, Irene Amerini: Continuous fake media detection: Adapting deepfake detectors to new generative techniques. 104143
- Nada Osman, Guglielmo Camporese, Lamberto Ballan: Multi-modal transformer with language modality distillation for early pedestrian action anticipation. 104144
- Carmen Bisogni, Vincenzo Loia, Michele Nappi, Chiara Pero: Acoustic features analysis for explainable machine learning-based audio spoofing detection. 104145
- Jianjian Yin, Shuai Yan, Tao Chen, Yi Chen, Yazhou Yao: Class Probability Space Regularization for semi-supervised semantic segmentation. 104146
- Xuezhi Xiang, Xiaoheng Li, Weijie Bao, Yulong Qiao, Abdulmotaleb El-Saddik: DBMHT: A double-branch multi-hypothesis transformer for 3D human pose estimation in video. 104147
- Ngo Thien Thu, Hoang Ngoc Tran, Md. Delowar Hossain, Eui-Nam Huh: LightSOD: Towards lightweight and efficient network for salient object detection. 104148
- Inpyo Song, Moonwook Ryu, Jangwon Lee: Action-conditioned contrastive learning for 3D human pose and shape estimation in videos. 104149
- Rongchang Li, Tianyang Xu, Xiaojun Wu, Linze Li, Xiao Yang, Zhongwei Shen, Josef Kittler: M-adapter: Multi-level image-to-video adaptation for video action recognition. 104150
- Junding Sun, Yabei Li, Xiaosheng Wu, Chaosheng Tang, Shuihua Wang, Eugene Yu-Dong Zhang: HAD-Net: An attention U-based network with hyper-scale shifted aggregating and max-diagonal sampling for medical image segmentation. 104151
- Yukun Liu, Zhaohui Luo, Daming Shi: A convex Kullback-Leibler optimization for semi-supervised few-shot learning. 104152
- Qi Zheng, Chaoyue Wang, Dadong Wang: Bypass network for semantics driven image paragraph captioning. 104154
- Oriol Barbany, Adrià Colomé, Carme Torras: Deformable surface reconstruction via Riemannian metric preservation. 104155
- Xujie He, Jing Jin, Yu Jiang, Dandan Li: A lightweight convolutional neural network-based feature extractor for visible images. 104157
- Zhichao Fu, Anran Wu, Shuwen Yang, Tianlong Ma, Liang He: CAFNet: Context aligned fusion for depth completion. 104158
- Zeno Sambugaro, Nicola Garau, Niccoló Bisagno, Nicola Conci: Agglomerator++: Interpretable part-whole hierarchies and latent space representations in neural networks. 104159
- Andrea Alfarano, Luca Maiano, Lorenzo Papa, Irene Amerini: Estimating optical flow: A comprehensive review of the state of the art. 104160
- Xubo Luo, Jinshuo Zhang, Liping Wang, Dongmei Niu: HBANet: A hybrid boundary-aware attention network for infrared and visible image fusion. 104161
- Qing Ye, Xiuju Xu, Rui Li, Yongmei Zhang: Human-object interaction detection algorithm based on graph structure and improved cascade pyramid network. 104162
- Luca Zanella, Benedetta Liberatori, Willi Menapace, Fabio Poiesi, Yiming Wang, Elisa Ricci: Delving into CLIP latent space for Video Anomaly Recognition. 104163
- Xuezhi Xiang, Dianang Li, Xi Wang, Xiankun Zhou, Yulong Qiao: VIDF-Net: A Voxel-Image Dynamic Fusion method for 3D object detection. 104164
- Jianchao Li, Wei Zhou, Kai Wang, Haifeng Hu: Triple-Stream Commonsense Circulation Transformer Network for Image Captioning. 104165
- Zhan Li, Feng Liu: Scalable video transformer for full-frame video prediction. 104166
- Zhenyu Li, Pengjie Xu: Pyramid transformer-based triplet hashing for robust visual place recognition. 104167
- Zongwei Wu, Zhuyun Zhou, Guillaume Allibert, Christophe Stolz, Cédric Demonceaux, Chao Ma: Transformer fusion for indoor RGB-D semantic segmentation. 104174
- Qiuxia Lai, Yongwei Nie, Yu Li, Hanqiu Sun, Qiang Xu: Spatial attention for human-centric visual understanding: An Information Bottleneck method. 104180
- Wenxi Li, Yuchen Guo, Jilai Zheng, Haozhe Lin, Chao Ma, Lu Fang, Xiaokang Yang: Bridging the gap between object detection in close-up and high-resolution wide shots. 104181
- Jiacheng Yao, Jing Zhang, Hui Zhang, Li Zhuo: LCMA-Net: A light cross-modal attention network for streamer re-identification in live video. 104183
- Nan Che, Jiang Liu, Fei Yu, Lechao Cheng, Yuxuan Wang, Yuehua Li, Chenrui Liu: Multimodality-guided Visual-Caption Semantic Enhancement. 104139
- Maximilian Menke, Thomas Wenzel, Andreas Schwung: AWADA: Foreground-focused adversarial learning for cross-domain object detection. 104153
- Jiahui Hu, Yonghua Lu, Xiyuan Ye, Qiang Feng, Lihua Zhou: A fast differential network with adaptive reference sample for gaze estimation. 104156
- Chunying Liu, Guangwei Gao, Fei Wu, Zhenhua Guo, Yi Yu: An efficient feature reuse distillation network for lightweight image super-resolution. 104178
- The Van Le, Jin Young Lee: Specular highlight removal using Quaternion transformer. 104179
- Xuan Wang, Lijun Sun, Jinglei Yi, Yongchao Song, Qiang Zheng, Abdellah Chehri: Efficient degradation representation learning network for remote sensing image super-resolution. 104182
- Eduardo S. Ribeiro, Lourenço R. G. Araújo, Gabriel T. L. Chaves, Antônio P. Braga: Distance-based loss function for deep feature space learning of convolutional neural networks. 104184
- Pengxiang Xu, Yang He, Jian Yang, Shanshan Zhang: Uncertainty guided test-time training for face forgery detection. 104185
- Yanchao Liu, Xina Cheng, Yuan Li, Takeshi Ikenaga: Bidirectional temporal and frame-segment attention for sparse action segmentation of figure skating. 104186
- Lia Morra, Antonio Santangelo, Pietro Basci, Luca Piano, Fabio Garcea, Fabrizio Lamberti, Massimo Leone: For a semiotic AI: Bridging computer vision and visual semiotics for computational observation of large scale facial image archives. 104187
- Maria De Marsico, Giordano Dionisi, Donato Francesco Pio Stanco: FTM: The Face Truth Machine - Hand-crafted features from micro-expressions to support lie detection. 104188
- Xiaoting Yin, Hao Shi, Jiaan Chen, Ze Wang, Yaozu Ye, Kailun Yang, Kaiwei Wang: Exploring event-based human pose estimation with 3D event representations. 104189
- Georgios Albanis, Nikolaos Zioulis, Kostas Kolomvatsos: BundleMoCap++: Efficient, robust and smooth motion capture from sparse multiview videos. 104190
- Chen Liang, Shuang Bai: Found missing semantics: Supplemental prototype network for few-shot semantic segmentation. 104191
- Yudong Li, Sanyuan Zhao, Jianbing Shen: A simple but effective vision transformer framework for visible-infrared person re-identification. 104192
- Zhijie Han, Wansong Qin, Yalu Wang, Qixiang Wang, Yongbin Shi: MultiSubjects: A multi-subject video dataset for single-person basketball action recognition from basketball gym. 104193
- Nianchang Huang, Yang Yang, Qiang Zhang, Jungong Han, Jin Huang: Lightweight cross-modal transformer for RGB-D salient object detection. 104194
- Jian Wang, Mengyu Luo, Xinlei Chen, Heming Xu, Junseok Kim: A novel image inpainting method based on a modified Lengyel-Epstein model. 104195
- Siyan Sun, Wenqian Yang, Hong Peng, Jun Wang, Zhicai Liu: A semantic segmentation method integrated convolutional nonlinear spiking neural model with Transformer. 104196
- Yu Liu, Jianghao Li, Yanyi Zhang, Qi Jia, Weimin Wang, Nan Pu, Nicu Sebe: PMGNet: Disentanglement and entanglement benefit mutually for compositional zero-shot learning. 104197
- He Huang, Sha Tao: Hyperspectral image classification with token fusion on GPU. 104198
- Xu Liang, Chen Li, Lihua Tian: Generative adversarial network for semi-supervised image captioning. 104199
- Shiqin Yue, Ziyi Zhang, Ying Shi, Yonghua Cai: WGS-YOLO: A real-time object detector based on YOLO framework for autonomous driving. 104200
- Hongchun Lu, Min Han: MT-DSNet: Mix-mask teacher-student strategies and dual dynamic selection plug-in module for fine-grained image recognition. 104201
- Hongsong Wang, Jianhua Zhao, Jie Gui: Region-aware image-based human action retrieval with transformers. 104202
- Yihan Yang, Ming Xu, Jason F. Ralph, Yuchen Ling, Xiaonan Pan: An end-to-end tracking framework via multi-view and temporal feature aggregation. 104203
- Qingzheng Xu, Huiqiang Chen, Heming Du, Hu Zhang, Szymon Lukasik, Tianqing Zhu, Xin Yu: M3A: A multimodal misinformation dataset for media authenticity analysis. 104205
- Hannah Schieber, Fabian Deuser, Bernhard Egger, Norbert Oswald, Daniel Roth: NeRFtrinsic Four: An end-to-end trainable NeRF jointly optimizing diverse intrinsic and extrinsic camera parameters. 104206
- Wenmin Chen, Xiaowei Xu, Xiaodong Wang, Huasong Zhou, Zewen Li, Yangming Chen: Invisible backdoor attack with attention and steganography. 104208
- Florinel-Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu, Mubarak Shah: Reverse Stable Diffusion: What prompt was used to generate this image? 104210
- Xuezhi Xiang, Xiaoheng Li, Xuzhao Liu, Yulong Qiao, Abdulmotaleb El-Saddik: A GCN and Transformer complementary network for skeleton-based action recognition. 104213