Alexander H. Liu | Researcher Profile | Sotabase

Career

· Research Assistant, MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), 2020–

Publications (34)

Towards Scene Understanding: Unsupervised Monocular Depth Estimation With Semantic-Aware Representation

Computer Vision and Pattern Recognition · 2019

249 citations

A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation

Neural Information Processing Systems · 2018

229 citations

Listen, Think, and Understand

International Conference on Learning Representations · 2023

223 citations

Contrastive Audio-Visual Masked Autoencoder

International Conference on Learning Representations · 2022

168 citations

Joint Audio and Speech Understanding

Automatic Speech Recognition & Understanding · 2023

118 citations

Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies

Interspeech · 2020


PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition

Neural Information Processing Systems · 2021


Towards End-to-End Unsupervised Speech Recognition

Spoken Language Technology Workshop · 2022


Towards audio language modeling - an overview

arXiv.org · 2024


Cross-Modal Discrete Representation Learning

Annual Meeting of the Association for Computational Linguistics · 2021


Codec-SUPERB: An In-Depth Analysis of Sound Codec Models

Annual Meeting of the Association for Computational Linguistics · 2024


Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning

IEEE International Conference on Acoustics, Speech, and Signal Processing · 2019


Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model

IEEE International Conference on Acoustics, Speech, and Signal Processing · 2018


DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning

Neural Information Processing Systems · 2023


Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering

Interspeech · 2023


UAVM: Towards Unifying Audio and Visual Models

IEEE Signal Processing Letters · 2022


Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation

Annual Meeting of the Association for Computational Linguistics · 2020


Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities

arXiv.org · 2025


Simple and Effective Unsupervised Speech Synthesis

Interspeech · 2022


End-to-End Whispered Speech Recognition with Frequency-Weighted Approaches and Pseudo Whisper Pre-training

Spoken Language Technology Workshop · 2020

