"Grid-VLP: Revisiting Grid Features for Vision-Language Pre-training."

Ming Yan et al. (2021)