Sotabase
Career
· Assistant Professor, MIT EECS CSAIL, 2025–
· Researcher, MIT Medical Vision Group, 2025–
· Research Scientist, Databricks, 2024–
· Research Ph.D. Intern, Apple, 2022–
· Member of Theory of Distributed Systems Group, MIT CSAIL
Publications (46)
On the Opportunities and Risks of Foundation Models
arXiv.org · 2021 · 5,795 cited
ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval · 2020 · 1,816 cited
Holistic Evaluation of Language Models
Trans. Mach. Learn. Res. · 2023 · 1,324 cited
ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction
North American Chapter of the Association for Computational Linguistics · 2021 · 595 cited
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
arXiv.org · 2023 · 524 cited
Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP
arXiv.org · 2022 · 348 cited
ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems
North American Chapter of the Association for Computational Linguistics · 2023 · 206 cited
Learning Passage Impacts for Inverted Indexes
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval · 2021 · 179 cited
DSPy: Compiling Declarative Language Model Calls into State-of-the-Art Pipelines
International Conference on Learning Representations · 2024 · 135 cited
When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards
Annual Meeting of the Association for Computational Linguistics · 2024 · 119 cited
Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs
Conference on Empirical Methods in Natural Language Processing · 2024 · 117 cited
PLAID: An Efficient Engine for Late Interaction Retrieval
International Conference on Information and Knowledge Management · 2022 · 117 cited
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models
North American Chapter of the Association for Computational Linguistics · 2024 · 116 cited
Relevance-guided Supervision for OpenQA with ColBERT
Transactions of the Association for Computational Linguistics · 2020 · 109 cited
Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval
Neural Information Processing Systems · 2021 · 68 cited
GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
arXiv.org · 2025 · 66 cited
UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers
Conference on Empirical Methods in Natural Language Processing · 2023 · 57 cited
Hindsight: Posterior-guided training of retrievers for improved open-ended generation
International Conference on Learning Representations · 2021 · 47 cited
Introducing Neural Bag of Whole-Words with ColBERTer: Contextualized Late Interactions using Enhanced Reduction
International Conference on Information and Knowledge Management · 2022 · 39 cited
Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together
Conference on Empirical Methods in Natural Language Processing · 2024 · 38 cited
Omar Khattab | Researcher Profile | Sotabase