A. J. Piergiovanni
2020 – today
- 2024
- [c33] Xi Chen, Josip Djolonga, Piotr Padlewski, Basil Mustafa, Soravit Changpinyo, Jialin Wu, Carlos Riquelme Ruiz, Sebastian Goodman, Xiao Wang, Yi Tay, Siamak Shakeri, Mostafa Dehghani, Daniel Salz, Mario Lucic, Michael Tschannen, Arsha Nagrani, Hexiang Hu, Mandar Joshi, Bo Pang, Ceslee Montgomery, Paulina Pietrzyk, Marvin Ritter, A. J. Piergiovanni, Matthias Minderer, Filip Pavetic, Austin Waters, Gang Li, Ibrahim Alabdulmohsin, Lucas Beyer, Julien Amelot, Kenton Lee, Andreas Peter Steiner, Yang Li, Daniel Keysers, Anurag Arnab, Yuanzhong Xu, Keran Rong, Alexander Kolesnikov, Mojtaba Seyedhosseini, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, Radu Soricut:
On Scaling Up a Multilingual Vision and Language Model. CVPR 2024: 14432-14444
- [c32] A. J. Piergiovanni, Isaac Noble, Dahun Kim, Michael S. Ryoo, Victor Gomes, Anelia Angelova:
Mirasol3B: A Multimodal Autoregressive Model for Time-Aligned and Contextual Modalities. CVPR 2024: 26794-26804
- [c31] Jie Mei, A. J. Piergiovanni, Jenq-Neng Hwang, Wei Li:
SLVP: Self-Supervised Language-Video Pre-Training for Referring Video Object Segmentation. WACV (Workshops) 2024: 507-517
- 2023
- [j3] Weicheng Kuo, A. J. Piergiovanni, Dahun Kim, Xiyang Luo, Benjamin Caine, Wei Li, Abhijit S. Ogale, Luowei Zhou, Andrew M. Dai, Zhifeng Chen, Claire Cui, Anelia Angelova:
MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks. Trans. Mach. Learn. Res. 2023 (2023)
- [c30] A. J. Piergiovanni, Weicheng Kuo, Anelia Angelova:
Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning. CVPR 2023: 2214-2224
- [c29] Maxwell Mbabilla Aladago, A. J. Piergiovanni:
Compound Tokens: Channel Fusion for Vision-Language Representation Learning. Tiny Papers @ ICLR 2023
- [c28] Xi Chen, Xiao Wang, Soravit Changpinyo, A. J. Piergiovanni, Piotr Padlewski, Daniel Salz, Sebastian Goodman, Adam Grycner, Basil Mustafa, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Nan Ding, Keran Rong, Hassan Akbari, Gaurav Mishra, Linting Xue, Ashish V. Thapliyal, James Bradbury, Weicheng Kuo:
PaLI: A Jointly-Scaled Multilingual Language-Image Model. ICLR 2023
- [c27] Weicheng Kuo, Yin Cui, Xiuye Gu, A. J. Piergiovanni, Anelia Angelova:
Open-Vocabulary Object Detection upon Frozen Vision and Language Models. ICLR 2023
- [i38] Weicheng Kuo, A. J. Piergiovanni, Dahun Kim, Xiyang Luo, Benjamin Caine, Wei Li, Abhijit S. Ogale, Luowei Zhou, Andrew M. Dai, Zhifeng Chen, Claire Cui, Anelia Angelova:
MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks. CoRR abs/2303.16839 (2023)
- [i37] Xi Chen, Josip Djolonga, Piotr Padlewski, Basil Mustafa, Soravit Changpinyo, Jialin Wu, Carlos Riquelme Ruiz, Sebastian Goodman, Xiao Wang, Yi Tay, Siamak Shakeri, Mostafa Dehghani, Daniel Salz, Mario Lucic, Michael Tschannen, Arsha Nagrani, Hexiang Hu, Mandar Joshi, Bo Pang, Ceslee Montgomery, Paulina Pietrzyk, Marvin Ritter, A. J. Piergiovanni, Matthias Minderer, Filip Pavetic, Austin Waters, Gang Li, Ibrahim Alabdulmohsin, Lucas Beyer, Julien Amelot, Kenton Lee, Andreas Peter Steiner, Yang Li, Daniel Keysers, Anurag Arnab, Yuanzhong Xu, Keran Rong, Alexander Kolesnikov, Mojtaba Seyedhosseini, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, Radu Soricut:
PaLI-X: On Scaling up a Multilingual Vision and Language Model. CoRR abs/2305.18565 (2023)
- [i36] A. J. Piergiovanni, Anelia Angelova:
Joint Adaptive Representations for Image-Language Learning. CoRR abs/2305.19924 (2023)
- [i35] Vardaan Pahuja, A. J. Piergiovanni, Anelia Angelova:
Diversifying Joint Vision-Language Tokenization Learning. CoRR abs/2306.03421 (2023)
- [i34] A. J. Piergiovanni, Isaac Noble, Dahun Kim, Michael S. Ryoo, Victor Gomes, Anelia Angelova:
Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities. CoRR abs/2311.05698 (2023)
- 2022
- [c26] A. J. Piergiovanni, Kairo Morton, Weicheng Kuo, Michael S. Ryoo, Anelia Angelova:
Video Question Answering with Iterative Video-Text Co-tokenization. ECCV (36) 2022: 76-94
- [c25] Weicheng Kuo, Fred Bertsch, Wei Li, A. J. Piergiovanni, Mohammad Saffar, Anelia Angelova:
FindIt: Generalized Localization with Natural Language Queries. ECCV (36) 2022: 502-520
- [i33] Weicheng Kuo, Fred Bertsch, Wei Li, A. J. Piergiovanni, Mohammad Saffar, Anelia Angelova:
FindIt: Generalized Localization with Natural Language Queries. CoRR abs/2203.17273 (2022)
- [i32] A. J. Piergiovanni, Wei Li, Weicheng Kuo, Mohammad Saffar, Fred Bertsch, Anelia Angelova:
Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering. CoRR abs/2205.00949 (2022)
- [i31] A. J. Piergiovanni, Kairo Morton, Weicheng Kuo, Michael S. Ryoo, Anelia Angelova:
Video Question Answering with Iterative Video-Text Co-Tokenization. CoRR abs/2208.00934 (2022)
- [i30] A. J. Piergiovanni, Weicheng Kuo, Anelia Angelova:
Pre-training image-language transformers for open-vocabulary tasks. CoRR abs/2209.04372 (2022)
- [i29] Xi Chen, Xiao Wang, Soravit Changpinyo, A. J. Piergiovanni, Piotr Padlewski, Daniel Salz, Sebastian Goodman, Adam Grycner, Basil Mustafa, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Nan Ding, Keran Rong, Hassan Akbari, Gaurav Mishra, Linting Xue, Ashish V. Thapliyal, James Bradbury, Weicheng Kuo, Mojtaba Seyedhosseini, Chao Jia, Burcu Karagol Ayan, Carlos Riquelme, Andreas Steiner, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, Radu Soricut:
PaLI: A Jointly-Scaled Multilingual Language-Image Model. CoRR abs/2209.06794 (2022)
- [i28] Weicheng Kuo, Yin Cui, Xiuye Gu, A. J. Piergiovanni, Anelia Angelova:
F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models. CoRR abs/2209.15639 (2022)
- [i27] Maxwell Mbabilla Aladago, A. J. Piergiovanni:
Compound Tokens: Channel Fusion for Vision-Language Representation Learning. CoRR abs/2212.01447 (2022)
- [i26] A. J. Piergiovanni, Weicheng Kuo, Anelia Angelova:
Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning. CoRR abs/2212.03229 (2022)
- 2021
- [c24] A. J. Piergiovanni, Anelia Angelova, Michael S. Ryoo, Irfan Essa:
Unsupervised Discovery of Actions in Instructional Videos. BMVC 2021: 283
- [c23] Juhana Kangaspunta, A. J. Piergiovanni, Rico Jonschkowski, Michael S. Ryoo, Anelia Angelova:
Adaptive Intermediate Representations for Video Understanding. CVPR Workshops 2021: 1602-1612
- [c22] A. J. Piergiovanni, Michael S. Ryoo:
Recognizing Actions in Videos From Unseen Viewpoints. CVPR 2021: 4124-4132
- [c21] A. J. Piergiovanni, Vincent Casser, Michael S. Ryoo, Anelia Angelova:
4D-Net for Learned Multi-Modal Alignment. ICCV 2021: 15415-15425
- [c20] Michael S. Ryoo, A. J. Piergiovanni, Anurag Arnab, Mostafa Dehghani, Anelia Angelova:
TokenLearner: Adaptive Space-Time Tokenization for Videos. NeurIPS 2021: 12786-12797
- [i25] A. J. Piergiovanni, Michael S. Ryoo:
Recognizing Actions in Videos from Unseen Viewpoints. CoRR abs/2103.16516 (2021)
- [i24] Juhana Kangaspunta, A. J. Piergiovanni, Rico Jonschkowski, Michael S. Ryoo, Anelia Angelova:
Adaptive Intermediate Representations for Video Understanding. CoRR abs/2104.07135 (2021)
- [i23] A. J. Piergiovanni, Anelia Angelova, Michael S. Ryoo, Irfan A. Essa:
Unsupervised Action Segmentation for Instructional Videos. CoRR abs/2106.03738 (2021)
- [i22] Michael S. Ryoo, A. J. Piergiovanni, Anurag Arnab, Mostafa Dehghani, Anelia Angelova:
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? CoRR abs/2106.11297 (2021)
- [i21] A. J. Piergiovanni, Anelia Angelova, Michael S. Ryoo, Irfan A. Essa:
Unsupervised Discovery of Actions in Instructional Videos. CoRR abs/2106.14733 (2021)
- [i20] A. J. Piergiovanni, Vincent Casser, Michael S. Ryoo, Anelia Angelova:
4D-Net for Learned Multi-Modal Alignment. CoRR abs/2109.01066 (2021)
- 2020
- [j2] Alan Wu, A. J. Piergiovanni, Michael S. Ryoo:
Model-Based Robot Imitation with Future Image Similarity. Int. J. Comput. Vis. 128(5): 1360-1374 (2020)
- [j1] Alan Wu, A. J. Piergiovanni, Michael S. Ryoo:
Correction to: Model-Based Robot Imitation with Future Image Similarity. Int. J. Comput. Vis. 128(5): 1375 (2020)
- [c19] A. J. Piergiovanni, Anelia Angelova, Michael S. Ryoo:
Differentiable Grammars for Videos. AAAI 2020: 11874-11881
- [c18] A. J. Piergiovanni, Anelia Angelova, Michael S. Ryoo:
Evolving Losses for Unsupervised Video Representation Learning. CVPR 2020: 130-139
- [c17] Xiaofang Wang, Xuehan Xiong, Maxim Neumann, A. J. Piergiovanni, Michael S. Ryoo, Anelia Angelova, Kris M. Kitani, Wei Hua:
AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification. ECCV (8) 2020: 449-465
- [c16] A. J. Piergiovanni, Anelia Angelova, Alexander Toshev, Michael S. Ryoo:
Adversarial Generative Grammars for Human Activity Prediction. ECCV (2) 2020: 507-523
- [c15] Michael S. Ryoo, A. J. Piergiovanni, Juhana Kangaspunta, Anelia Angelova:
AssembleNet++: Assembling Modality Representations via Attention Connections. ECCV (20) 2020: 654-671
- [c14] Michael S. Ryoo, A. J. Piergiovanni, Mingxing Tan, Anelia Angelova:
AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures. ICLR 2020
- [c13] A. J. Piergiovanni, Michael S. Ryoo:
AViD Dataset: Anonymized Videos from Diverse Countries. NeurIPS 2020
- [c12] A. J. Piergiovanni, Michael S. Ryoo:
Learning Multimodal Representations for Unseen Activities. WACV 2020: 506-515
- [i19] A. J. Piergiovanni, Anelia Angelova, Michael S. Ryoo:
Evolving Losses for Unsupervised Video Representation Learning. CoRR abs/2002.12177 (2020)
- [i18] A. J. Piergiovanni, Michael S. Ryoo:
AViD Dataset: Anonymized Videos from Diverse Countries. CoRR abs/2007.05515 (2020)
- [i17] Xiaofang Wang, Xuehan Xiong, Maxim Neumann, A. J. Piergiovanni, Michael S. Ryoo, Anelia Angelova, Kris M. Kitani, Wei Hua:
AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification. CoRR abs/2007.12034 (2020)
- [i16] A. J. Piergiovanni, Anelia Angelova, Alexander Toshev, Michael S. Ryoo:
Adversarial Generative Grammars for Human Activity Prediction. CoRR abs/2008.04888 (2020)
- [i15] Michael S. Ryoo, A. J. Piergiovanni, Juhana Kangaspunta, Anelia Angelova:
AssembleNet++: Assembling Modality Representations via Attention Connections. CoRR abs/2008.08072 (2020)
2010 – 2019
- 2019
- [c11] Alan Wu, A. J. Piergiovanni, Michael S. Ryoo:
Model-based Behavioral Cloning with Future Image Similarity Learning. CoRL 2019: 1062-1077
- [c10] A. J. Piergiovanni, Michael S. Ryoo:
Early Detection of Injuries in MLB Pitchers From Video. CVPR Workshops 2019: 2431-2438
- [c9] A. J. Piergiovanni, Michael S. Ryoo:
Representation Flow for Action Recognition. CVPR 2019: 9945-9953
- [c8] A. J. Piergiovanni, Anelia Angelova, Alexander Toshev, Michael S. Ryoo:
Evolving Space-Time Neural Architectures for Videos. ICCV 2019: 1793-1802
- [c7] A. J. Piergiovanni, Michael S. Ryoo:
Temporal Gaussian Mixture Layer for Videos. ICML 2019: 5152-5161
- [c6] A. J. Piergiovanni, Alan Wu, Michael S. Ryoo:
Learning Real-World Robot Policies by Dreaming. IROS 2019: 7680-7687
- [i14] A. J. Piergiovanni, Anelia Angelova, Michael S. Ryoo:
Learning Differentiable Grammars for Continuous Data. CoRR abs/1902.00505 (2019)
- [i13] A. J. Piergiovanni, Michael S. Ryoo:
Early Detection of Injuries in MLB Pitchers from Video. CoRR abs/1904.08916 (2019)
- [i12] Michael S. Ryoo, A. J. Piergiovanni, Mingxing Tan, Anelia Angelova:
AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures. CoRR abs/1905.13209 (2019)
- [i11] A. J. Piergiovanni, Anelia Angelova, Michael S. Ryoo:
Evolving Losses for Unlabeled Video Representation Learning. CoRR abs/1906.03248 (2019)
- [i10] Alan Wu, A. J. Piergiovanni, Michael S. Ryoo:
Model-based Behavioral Cloning with Future Image Similarity Learning. CoRR abs/1910.03157 (2019)
- [i9] A. J. Piergiovanni, Anelia Angelova, Michael S. Ryoo:
Tiny Video Networks. CoRR abs/1910.06961 (2019)
- 2018
- [c5] A. J. Piergiovanni, Michael S. Ryoo:
Fine-Grained Activity Recognition in Baseball Videos. CVPR Workshops 2018: 1740-1748
- [c4] Alan Wu, A. J. Piergiovanni, Michael S. Ryoo:
Action-Conditioned Convolutional Future Regression Models for Robot Imitation Learning. CVPR Workshops 2018: 2035-2037
- [c3] A. J. Piergiovanni, Michael S. Ryoo:
Learning Latent Super-Events to Detect Multiple Activities in Videos. CVPR 2018: 5304-5313
- [i8] A. J. Piergiovanni, Michael S. Ryoo:
Activity Detection with Latent Sub-event Hierarchy Learning. CoRR abs/1803.06316 (2018)
- [i7] A. J. Piergiovanni, Michael S. Ryoo:
Fine-grained Activity Recognition in Baseball Videos. CoRR abs/1804.03247 (2018)
- [i6] A. J. Piergiovanni, Alan Wu, Michael S. Ryoo:
Learning Real-World Robot Policies by Dreaming. CoRR abs/1805.07813 (2018)
- [i5] A. J. Piergiovanni, Michael S. Ryoo:
Learning Shared Multimodal Embeddings with Unpaired Data. CoRR abs/1806.08251 (2018)
- [i4] A. J. Piergiovanni, Michael S. Ryoo:
Representation Flow for Action Recognition. CoRR abs/1810.01455 (2018)
- [i3] A. J. Piergiovanni, Anelia Angelova, Alexander Toshev, Michael S. Ryoo:
Evolving Space-Time Neural Architectures for Videos. CoRR abs/1811.10636 (2018)
- 2017
- [c2] A. J. Piergiovanni, Chenyou Fan, Michael S. Ryoo:
Learning Latent Subevents in Activity Videos Using Temporal Attention Filters. AAAI 2017: 4247-4254
- [i2] A. J. Piergiovanni, Michael S. Ryoo:
Learning Latent Super-Events to Detect Multiple Activities in Videos. CoRR abs/1712.01938 (2017)
- 2016
- [i1] A. J. Piergiovanni, Chenyou Fan, Michael S. Ryoo:
Temporal attention filters for human activity recognition in videos. CoRR abs/1605.08140 (2016)
- 2015
- [c1] A. J. Piergiovanni, Alan Jern:
Computational principles underlying people's behavior explanations. CogSci 2015
last updated on 2024-10-08 21:30 CEST by the dblp team
all metadata released as open data under CC0 1.0 license