Tingle Li | Researcher Profile | Sotabase

Career

· Ph.D. student in Computer Science, UC Berkeley · 2026–

Publications (27)

Improving Multi-Modal Learning with Uni-Modal Teachers

arXiv.org · 2021

On Uni-Modal Feature Learning in Supervised Multi-Modal Learning

International Conference on Machine Learning · 2023

Neural Dubber: Dubbing for Videos According to Scripts

Neural Information Processing Systems · 2021

Atss-Net: Target Speaker Separation via Attention-based Neural Network

Interspeech · 2020

Deep Speech Synthesis from MRI-Based Articulatory Representations

Interspeech · 2023

Unconstrained Dysfluency Modeling for Dysfluent Speech Transcription and Detection

Automatic Speech Recognition & Understanding · 2023

Learning Visual Styles from Audio-Visual Associations

European Conference on Computer Vision · 2022

Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities

arXiv.org · 2025

Sams-Net: A Sliced Attention-based Neural Network for Music Source Separation

International Symposium on Chinese Spoken Language Processing · 2019

CVC: Contrastive Learning for Non-parallel Voice Conversion

Interspeech · 2020

Full-Duplex-Bench v1.5: Evaluating Overlap Handling for Full-Duplex Speech Models

2025

Radio2Speech: High Quality Speech Recovery from Radio Frequency Signals

Interspeech · 2022

The DKU Speech Activity Detection and Speaker Identification Systems for Fearless Steps Challenge Phase-02

Interspeech · 2020

Self-Supervised Audio-Visual Soundscape Stylization

European Conference on Computer Vision · 2024

Optimal Mapping Loss: A Faster Loss for End-to-End Speaker Diarization

The Speaker and Language Recognition Workshop · 2020

EMO-Reasoning: Benchmarking Emotional Reasoning Capabilities in Spoken Dialogue Systems

arXiv.org · 2025

Audio Texture Manipulation by Exemplar-Based Analogy

IEEE International Conference on Acoustics, Speech, and Signal Processing · 2025

AV-EMO-Reasoning: Benchmarking Emotional Reasoning Capabilities in Omni-modal LLMs with Audio-visual Cues

arXiv.org · 2025

Neural Dubber: Dubbing for Silent Videos According to Scripts

arXiv.org · 2021

The Sound of Simulation: Learning Multimodal Sim-to-Real Robot Policies with Generative Audio

2025
