Remote body temperature measurement using infrared thermography has been widely deployed worldwide to detect feverish persons, but its accuracy is affected by various factors, including ambient temperature and sensor-subject distance. We present a novel compensation model that addresses the interacting influence of ambient temperature and sensor-subject distance during remote facial temperature screening in real-world settings. We derived our model from on-site data collected over 12 months and demonstrated a significant linear relationship between ambient temperature and the temperature measured by a thermal camera. In addition, the interaction between the effects of sensor-subject distance and ambient temperature on the measured temperature is significant. Our model reduces the mean absolute error (MAE) of the measurement by 23.5% and outperforms the best existing models. The model can also extend the detection distance by up to 46% while keeping sensitivity and specificity above 90%.
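As a rough illustration of what a linear model with an ambient-temperature and distance interaction term can look like, the sketch below fits such a model by least squares. It is not the published model: the function names, the reference conditions, and the synthetic coefficients are all assumptions made up for this example.

```python
import numpy as np

def predict(coef, t_ambient, distance):
    """Linear model with an interaction term (assumed form, not the paper's)."""
    return (coef[0] + coef[1] * t_ambient + coef[2] * distance
            + coef[3] * t_ambient * distance)

def fit_interaction_model(t_ambient, distance, t_measured):
    """Least-squares fit of the four coefficients of the assumed model."""
    X = np.column_stack([np.ones_like(t_ambient), t_ambient,
                         distance, t_ambient * distance])
    coef, *_ = np.linalg.lstsq(X, t_measured, rcond=None)
    return coef

def compensate(t_measured, t_ambient, distance, coef, ref_ambient, ref_distance):
    """Shift a reading to the value the model predicts at reference conditions."""
    return (t_measured - predict(coef, t_ambient, distance)
            + predict(coef, ref_ambient, ref_distance))

# Synthetic, noise-free data with arbitrary coefficients (illustration only).
true_coef = np.array([30.0, 0.2, -0.5, 0.01])
ta, d = np.meshgrid(np.linspace(18.0, 30.0, 5), np.linspace(0.5, 3.0, 4))
ta, d = ta.ravel(), d.ravel()
readings = predict(true_coef, ta, d)

coef = fit_interaction_model(ta, d, readings)
corrected = compensate(readings, ta, d, coef, ref_ambient=24.0, ref_distance=1.0)
```

On noise-free synthetic data the fit recovers the chosen coefficients, and every corrected reading collapses to the single value predicted at the reference conditions.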
SID
Data-Driven and Optics-Inspired Decomposition of Global Pupil Swim in VR/AR for an Improved Perception Model of Motion Discomfort
Phoebe L. Ching, Tsz Tai Chan, Yudong He, and 2 more authors
VR HMD users can observe dynamic distortion (or global pupil swim). Our earlier study correlated pupil swim with selected optic flow patterns and mathematically modeled discomfort. This study decomposes global pupil swim into a linear sum of orthogonal basis patterns to better predict its perceptual effects and thereby improve the perception model.
2022
ICASSP
Harvesting Partially-Disjoint Time-Frequency Information for Improving Degenerate Unmixing Estimation Technique
Yudong He, He Wang, Qifeng Chen, and 1 more author
In ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022
The degenerate unmixing estimation technique (DUET) is one of the most efficient blind source separation algorithms for the challenging situation in which the number of sources exceeds the number of microphones. However, as a time-frequency mask-based method, DUET erroneously retains interference components when source signals overlap in both the frequency and time domains. In this paper, to avoid this erroneous retention, we propose to replace masking with multiple linear spatial filters (e.g., the minimum variance distortionless response filter) that extract the desired signals. These filters are constructed from the information embedded in the detected single-source points, that is, time-frequency points contributed by a single source. Compared with the conventional DUET, our method achieved an improvement of more than 5 dB in the source-to-interference ratio and of 2 to 5 dB in the source-to-distortion ratio. The findings are substantiated by unmixing simulations using live-recorded mixture signals from up to four sources. Audio examples can be found on the web page: "https://ydcnanhe.github.io/demo-icassp2022/"
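For readers unfamiliar with the minimum variance distortionless response (MVDR) filter mentioned in the abstract, the following is a minimal sketch of the standard textbook formula w = R^{-1} d / (d^H R^{-1} d) for a single frequency bin. It is not the paper's implementation: the steering vector, the random data, and the diagonal loading value are assumptions chosen only for illustration.

```python
import numpy as np

def mvdr_weights(R, d):
    """Standard MVDR weights: w = R^{-1} d / (d^H R^{-1} d)."""
    Rinv_d = np.linalg.solve(R, d)       # R^{-1} d without forming the inverse
    return Rinv_d / (d.conj() @ Rinv_d)  # normalise for unit gain towards d

# Hypothetical two-microphone setup at one frequency bin.
rng = np.random.default_rng(0)
d = np.array([1.0, 0.8 * np.exp(-1j * 0.5)])  # assumed steering vector
X = rng.standard_normal((2, 200)) + 1j * rng.standard_normal((2, 200))

# Sample covariance with a small diagonal loading term (assumed value).
R = X @ X.conj().T / X.shape[1] + 1e-3 * np.eye(2)

w = mvdr_weights(R, d)
response = w.conj() @ d  # gain of the filter towards the target direction
```

The distortionless constraint means the filter passes the target direction with unit gain (`response` is 1 up to rounding) while minimising the output power contributed by everything else.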
Ph.D. THESIS
Improving Analytic Algorithms for Speech Separation
Humans can listen to others in noisy environments, but it is difficult for a humanoid robot to separate the speech of individual persons from a mixture of these sounds. Known as the cocktail party problem, this challenge has intrigued scientists and engineers for more than half a century. Using the sparse characteristic of speech signals, this thesis improves three binaural audio separation algorithms in different scenarios: 1) weak target signal extraction, 2) fast separation, and 3) under-determined reverberant speech separation (i.e., the binaural speech separation problem with more than two mixed sources in the presence of echoes). First, the thesis improves a previously reported audio cancellation kernel to separate weak target signals. Our new version of the cancellation kernel achieves comparable or even better results at 3000 times the speed thanks to our analytical solutions. These solutions originated from our realization that the whole extraction process can be performed in the time-frequency domain using the short-time Fourier transform. Second, the thesis improves the degenerate unmixing estimation technique (DUET), one of the fastest algorithms in speech separation. As a binary masking technique, DUET cannot completely separate speech signals, resulting in poor performance. We applied post-filtering with multiple linear spatial filters to the mask separation results and obtained significantly better separation performance. Third, the thesis improves the l1 minimization commonly used in audio separation algorithms. Speech separation can be converted to an l1 minimization problem that aims to minimize the l1 norm of the reconstructed signal. We derived and tested a new weighted l1 norm and showed that it can outperform the unweighted l1 norm. The new algorithm can be solved using the same l1 minimization solver but converges faster than the unweighted l1 minimization.
The improved l1 minimization algorithm has been shown to work in the presence of reverberation and with more than two mixed speech sources.
2019
Hearing Aid De-noising Algorithm for Chinese Speaker: Deep Learning Based End-to-end Speech Enhancement Model
Jun Hui, Yue Wei, Yudong He, and 1 more author
In 2019 Beijing International Audiology Conference, 2019