Sotabase
Career
· Research Intern, FutureHouse, 2025–
· Research Intern, Microsoft, 2022–
· PhD Student, Stanford University, 2018–
· Research Assistant, University of Maryland, 2018–
Publications (27)
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
arXiv.org · 2022
2,792 cited
Prompting GPT-3 To Be Reliable
International Conference on Learning Representations · 2022
345 cited
The Prompt Report: A Systematic Survey of Prompting Techniques
arXiv.org · 2024
243 cited
Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP
arXiv.org · 2021
199 cited
CharBERT: Character-aware Pre-trained Language Model
International Conference on Computational Linguistics · 2020
123 cited
The Prompt Report: A Systematic Survey of Prompt Engineering Techniques
2024
96 cited
Better Robustness by More Coverage: Adversarial and Mixup Data Augmentation for Robust Finetuning
Findings · 2020
74 cited
Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations
Annual Meeting of the Association for Computational Linguistics · 2023
62 cited
What does BERT Learn from Multiple-Choice Reading Comprehension Datasets?
arXiv.org · 2019
55 cited
Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions
arXiv.org · 2024
52 cited
Getting MoRE out of Mixture of Language Model Reasoning Experts
Conference on Empirical Methods in Natural Language Processing · 2023
45 cited
Benchmarking Robustness of Machine Reading Comprehension Models
Findings · 2020
44 cited
Re-Examining Calibration: The Case of Question Answering
Conference on Empirical Methods in Natural Language Processing · 2022
41 cited
Better Robustness by More Coverage: Adversarial Training with Mixup Augmentation for Robust Fine-tuning
arXiv.org · 2020
36 cited
What’s in a Name? Answer Equivalence For Open-Domain Question Answering
Conference on Empirical Methods in Natural Language Processing · 2021
34 cited
Configurable Foundation Models: Building LLMs from a Modular Perspective
arXiv.org · 2024
23 cited
Sentiment Aware Neural Machine Translation
Conference on Empirical Methods in Natural Language Processing · 2019
17 cited
Sub-Character Tokenization for Chinese Pretrained Language Models
Transactions of the Association for Computational Linguistics · 2021
17 cited
Contextual Experience Replay for Self-Improvement of Language Agents
Annual Meeting of the Association for Computational Linguistics · 2025
15 cited
Dataset Mention Extraction and Classification
Proceedings of the Workshop on Extracting Structured Knowledge from Scientific Publications · 2019
13 cited
Chenglei Si | Researcher Profile | Sotabase