Heng-Jui Chang | Researcher Profile | Sotabase

Career

· Research Intern, MIT Sussman Lab, 2023–

· Research Assistant, MIT CSAIL, 2022–

Publications (17)

DistilHuBERT: Speech Representation Learning by Layer-Wise Distillation of Hidden-Unit BERT

IEEE International Conference on Acoustics, Speech, and Signal Processing · 2021

Cited by 206

SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities

Annual Meeting of the Association for Computational Linguistics · 2022

Cited by 124

A Large-Scale Evaluation of Speech Foundation Models

IEEE/ACM Transactions on Audio Speech and Language Processing · 2024

SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model

Spoken Language Technology Workshop · 2022

Towards Lifelong Learning of End-to-end ASR

Interspeech · 2021

DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning

Neural Information Processing Systems · 2023

Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering

Interspeech · 2023

End-to-End Whispered Speech Recognition with Frequency-Weighted Approaches and Pseudo Whisper Pre-training

Spoken Language Technology Workshop · 2020

Mandarin-English Code-switching Speech Recognition with Self-supervised Speech Representation Models

arXiv.org · 2021

Non-Autoregressive Mandarin-English Code-Switching Speech Recognition

Automatic Speech Recognition & Understanding · 2021

M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval

IEEE International Conference on Acoustics, Speech, and Signal Processing · 2022

CoLLD: Contrastive Layer-to-Layer Distillation for Compressing Multilingual Pre-Trained Speech Encoders

IEEE International Conference on Acoustics, Speech, and Signal Processing · 2023

R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces

North American Chapter of the Association for Computational Linguistics · 2023

DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models

Interspeech · 2024

SpeechCLIP+: Self-Supervised Multi-Task Representation Learning for Speech via CLIP and Speech-Image Data

IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW) · 2024

USAD: Universal Speech and Audio Representation via Distillation

arXiv.org · 2025

End-to-end Whispered Speech Recognition with Frequency-weighted Approaches and Layer-wise Transfer Learning

arXiv.org · 2020
