Sotabase
Career
Research Assistant, MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), 2020–
Publications (34)
Towards Scene Understanding: Unsupervised Monocular Depth Estimation With Semantic-Aware Representation
Computer Vision and Pattern Recognition · 2019 · 249 cited
A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation
Neural Information Processing Systems · 2018 · 229 cited
Listen, Think, and Understand
International Conference on Learning Representations · 2023 · 223 cited
Contrastive Audio-Visual Masked Autoencoder
International Conference on Learning Representations · 2022 · 168 cited
Joint Audio and Speech Understanding
Automatic Speech Recognition & Understanding · 2023 · 118 cited
Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies
Interspeech · 2020 · 93 cited
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition
Neural Information Processing Systems · 2021 · 86 cited
Towards End-to-End Unsupervised Speech Recognition
Spoken Language Technology Workshop · 2022 · 84 cited
Towards audio language modeling - an overview
arXiv.org · 2024 · 60 cited
Cross-Modal Discrete Representation Learning
Annual Meeting of the Association for Computational Linguistics · 2021 · 53 cited
Codec-SUPERB: An In-Depth Analysis of Sound Codec Models
Annual Meeting of the Association for Computational Linguistics · 2024 · 52 cited
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning
IEEE International Conference on Acoustics, Speech, and Signal Processing · 2019 · 52 cited
Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model
IEEE International Conference on Acoustics, Speech, and Signal Processing · 2018 · 47 cited
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
Neural Information Processing Systems · 2023 · 37 cited
Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering
Interspeech · 2023 · 31 cited
UAVM: Towards Unifying Audio and Visual Models
IEEE Signal Processing Letters · 2022 · 30 cited
Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation
Annual Meeting of the Association for Computational Linguistics · 2020 · 24 cited
Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities
arXiv.org · 2025 · 22 cited
Simple and Effective Unsupervised Speech Synthesis
Interspeech · 2022 · 19 cited
End-to-End Whispered Speech Recognition with Frequency-Weighted Approaches and Pseudo Whisper Pre-training
Spoken Language Technology Workshop · 2020 · 18 cited
Alexander H. Liu | Researcher Profile | Sotabase