Recovering Speech from Vibrations: Cross-Modal Distillation for Laser Transcription

Sun 28.06 13:00 - 13:30

Abstract: Sensing audio with non-acoustic modalities such as millimeter-wave radar and laser-based systems has emerged as an active research area with significant implications for privacy, security, and robust speech processing. These approaches recover speech-related information from the vibration measurements that such sensors capture. Prior work spans a wide range of techniques, from classical signal-processing pipelines to modern machine-learning and deep-learning models, enabling applications such as speech reconstruction, eavesdropping, automatic speech recognition, and noise-robust enhancement. Some systems rely on radar or laser sensing as a standalone audio surrogate, while others fuse radar-derived features with microphone signals to improve robustness in noisy or non-line-of-sight environments. Experimental results across the literature demonstrate that recovering intelligible speech or discriminative speech features from radar- or laser-sensed vibrations is feasible under controlled conditions. However, performance remains sensitive to practical factors including sensing distance, object material and geometry, environmental interference, multipath effects, and task complexity. Not all speech-related tasks can yet be solved reliably, particularly in unconstrained real-world scenarios. Overall, the field is evolving rapidly, with open challenges in robustness, generalization, and deployment that offer several promising directions for future research.

Speaker

Emily Bederov

Technion

  • Advisors: Israel Cohen

  • Academic Degree: M.Sc.