Omer Levy | Researcher Profile | Sotabase

Career

· Researcher, Character AI, 2024–

· Research Scientist, Google DeepMind, 2024–

· Research Scientist, Meta FAIR, 2024–

· Lecturer, Tel Aviv University, 2020–

Publications (135)

RoBERTa: A Robustly Optimized BERT Pretraining Approach
arXiv.org · 2019 · 28,249 citations

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
Annual Meeting of the Association for Computational Linguistics · 2019 · 12,207 citations

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
BlackboxNLP@EMNLP · 2018 · 8,129 citations

SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
Neural Information Processing Systems · 2019 · 2,638 citations

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
arXiv.org · 2022 · 2,201 citations

SpanBERT: Improving Pre-training by Representing and Predicting Spans
Transactions of the Association for Computational Linguistics · 2019 · 2,108 citations

Neural Word Embedding as Implicit Matrix Factorization
Neural Information Processing Systems · 2014 · 1,990 citations

What Does BERT Look at? An Analysis of BERT’s Attention
BlackboxNLP@ACL · 2019 · 1,852 citations

word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method
arXiv.org · 2014 · 1,688 citations

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
arXiv.org · 2025 · 1,636 citations

Improving Distributional Similarity with Lessons Learned from Word Embeddings
Transactions of the Association for Computational Linguistics · 2015 · 1,388 citations

code2vec: learning distributed representations of code
Proc. ACM Program. Lang. · 2018 · 1,281 citations

Are Sixteen Heads Really Better than One?
Neural Information Processing Systems · 2019 · 1,250 citations

Annotation Artifacts in Natural Language Inference Data
North American Chapter of the Association for Computational Linguistics · 2018 · 1,235 citations

Transformer Feed-Forward Layers Are Key-Value Memories
Conference on Empirical Methods in Natural Language Processing · 2020 · 1,180 citations

Dependency-Based Word Embeddings
Annual Meeting of the Association for Computational Linguistics · 2014 · 1,178 citations

LIMA: Less Is More for Alignment
Neural Information Processing Systems · 2023 · 1,157 citations

Generalization through Memorization: Nearest Neighbor Language Models
International Conference on Learning Representations · 2019 · 985 citations

Zero-Shot Relation Extraction via Reading Comprehension
Conference on Computational Natural Language Learning · 2017 · 785 citations

code2seq: Generating Sequences from Structured Representations of Code
International Conference on Learning Representations · 2018 · 775 citations
