Sotabase
Career
Ph.D. student in Computer Science, UC Berkeley · 2026–
Publications (27)
Improving Multi-Modal Learning with Uni-Modal Teachers
arXiv.org · 2021
71 cited
On Uni-Modal Feature Learning in Supervised Multi-Modal Learning
International Conference on Machine Learning · 2023
70 cited
Neural Dubber: Dubbing for Videos According to Scripts
Neural Information Processing Systems · 2021
52 cited
Atss-Net: Target Speaker Separation via Attention-based Neural Network
Interspeech · 2020
41 cited
Deep Speech Synthesis from MRI-Based Articulatory Representations
Interspeech · 2023
28 cited
Unconstrained Dysfluency Modeling for Dysfluent Speech Transcription and Detection
Automatic Speech Recognition & Understanding · 2023
27 cited
Learning Visual Styles from Audio-Visual Associations
European Conference on Computer Vision · 2022
26 cited
Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities
arXiv.org · 2025
22 cited
Sams-Net: A Sliced Attention-based Neural Network for Music Source Separation
International Symposium on Chinese Spoken Language Processing · 2019
21 cited
CVC: Contrastive Learning for Non-parallel Voice Conversion
Interspeech · 2020
14 cited
Full-Duplex-Bench v1.5: Evaluating Overlap Handling for Full-Duplex Speech Models
2025
13 cited
Radio2Speech: High Quality Speech Recovery from Radio Frequency Signals
Interspeech · 2022
8 cited
The DKU Speech Activity Detection and Speaker Identification Systems for Fearless Steps Challenge Phase-02
Interspeech · 2020
8 cited
Self-Supervised Audio-Visual Soundscape Stylization
European Conference on Computer Vision · 2024
7 cited
Optimal Mapping Loss: A Faster Loss for End-to-End Speaker Diarization
The Speaker and Language Recognition Workshop · 2020
5 cited
EMO-Reasoning: Benchmarking Emotional Reasoning Capabilities in Spoken Dialogue Systems
arXiv.org · 2025
3 cited
Audio Texture Manipulation by Exemplar-Based Analogy
IEEE International Conference on Acoustics, Speech, and Signal Processing · 2025
2 cited
AV-EMO-Reasoning: Benchmarking Emotional Reasoning Capabilities in Omni-modal LLMs with Audio-visual Cues
arXiv.org · 2025
1 cited
Neural Dubber: Dubbing for Silent Videos According to Scripts
arXiv.org · 2021
1 cited
The Sound of Simulation: Learning Multimodal Sim-to-Real Robot Policies with Generative Audio
2025
1 cited
Tingle Li | Researcher Profile | Sotabase