Sotabase
Career
· Research Lead, Allen Institute for Artificial Intelligence (AI2), 2024–
· Assistant Professor, University of Washington, 2022–
· Research Scientist, Facebook (Meta), 2021–
· PhD Candidate, Artificial Intelligence Laboratory, Stanford University, 2016–
Publications (154)
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
International Journal of Computer Vision · 2016
6,280
cited
On the Opportunities and Risks of Foundation Models
arXiv.org · 2021
5,795
cited
Dense-Captioning Events in Videos
IEEE International Conference on Computer Vision · 2017
1,451
cited
Visual Relationship Detection with Language Priors
European Conference on Computer Vision · 2016
1,197
cited
Image retrieval using scene graphs
Computer Vision and Pattern Recognition · 2015
1,167
cited
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Annual Meeting of the Association for Computational Linguistics · 2023
763
cited
DataComp: In search of the next generation of multimodal datasets
Neural Information Processing Systems · 2023
596
cited
Action Genome: Actions As Compositions of Spatio-Temporal Scene Graphs
Computer Vision and Pattern Recognition · 2019
391
cited
Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval
VL@EMNLP · 2015
362
cited
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models
arXiv.org · 2024
356
cited
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
IEEE International Conference on Computer Vision · 2023
351
cited
BLINK: Multimodal Large Language Models Can See but Not Perceive
European Conference on Computer Vision · 2024
326
cited
Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias
Neural Information Processing Systems · 2023
323
cited
Explanations Can Reduce Overreliance on AI Systems During Decision-Making
Proc. ACM Hum. Comput. Interact. · 2022
254
cited
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models
Neural Information Processing Systems · 2024
196
cited
CREPE: Can Vision-Language Foundation Models Reason Compositionally?
Computer Vision and Pattern Recognition · 2022
187
cited
Holodeck: Language Guided Generation of 3D Embodied AI Environments
Computer Vision and Pattern Recognition · 2023
186
cited
RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics
arXiv.org · 2024
164
cited
NVILA: Efficient Frontier Visual Language Models
Computer Vision and Pattern Recognition · 2024
154
cited
AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning
Computer Vision and Pattern Recognition · 2021
146
cited