publications
2025
- A Novel Weighted Sparse Component Analysis for Underdetermined Blind Source Separation. Yudong He, Baeck Hyun Woo, and Richard H.Y. So. 2025
@article{HE2025ICASSP,
  title = {A Novel Weighted Sparse Component Analysis for Underdetermined Blind Source Separation},
  year = {2025},
  author = {He, Yudong and Woo, Baeck Hyun and So, Richard H.Y.},
  keywords = {blind source separation, underdetermined linear system, sparse component analysis, weighted l1 minimization},
  organization = {IEEE},
}
2024
- Imbalanced Data Clustering using Equilibrium K-Means. Yudong He. arXiv, 2024
Centroid-based clustering algorithms, such as hard K-means (HKM) and fuzzy K-means (FKM), suffer from a learning bias towards large clusters. Their centroids tend to be crowded in large clusters, compromising performance when the true underlying data groups vary in size (i.e., imbalanced data). To address this, we propose a new clustering objective function based on the Boltzmann operator, which introduces a novel centroid repulsion mechanism in which data points surrounding a centroid repel other centroids. Larger clusters repel more, effectively mitigating the learning bias towards large clusters. The proposed algorithm, called equilibrium K-means (EKM), is simple, alternating between two steps; resource-saving, with the same time and space complexity as FKM; and scalable to large datasets via batch learning. We extensively evaluate the performance of EKM on synthetic and real-world datasets. The results show that EKM performs competitively on balanced data and significantly outperforms benchmark algorithms on imbalanced data. Deep clustering experiments demonstrate that EKM is a better alternative to HKM and FKM on imbalanced data, as it yields more discriminative representations. Additionally, we reformulate HKM, FKM, and EKM in a general form of gradient descent and demonstrate how this general form facilitates a uniform study of K-means algorithms.
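For readers unfamiliar with the Boltzmann operator mentioned in the abstract, the following is a minimal sketch of the operator itself, applied to squared point-to-centroid distances. It is not the EKM algorithm: the function names, the sign convention for `alpha`, and the use of squared Euclidean distance are illustrative assumptions.

```python
import math


def boltzmann(values, alpha):
    """Boltzmann operator: a softmax-weighted average of `values`.

    alpha = 0 gives the arithmetic mean; a large negative alpha
    approaches min(values); a large positive alpha approaches max(values).
    """
    # Subtract the largest exponent before exponentiating for numerical stability.
    shift = max(alpha * v for v in values)
    weights = [math.exp(alpha * v - shift) for v in values]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


def smoothed_min_distance(point, centroids, alpha):
    """Smooth stand-in for the distance to the nearest centroid (hypothetical helper)."""
    squared = [
        sum((p - c) ** 2 for p, c in zip(point, centroid))
        for centroid in centroids
    ]
    return boltzmann(squared, alpha)


if __name__ == "__main__":
    centroids = [(0.0, 0.0), (3.0, 0.0)]
    # With a strongly negative alpha the result is close to the smallest squared distance.
    print(smoothed_min_distance((1.0, 0.0), centroids, alpha=-10.0))
```

With `alpha` strongly negative the operator behaves like the hard minimum used by HKM, while values closer to zero let every centroid contribute to the result.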
- A novel property to modify weighted l1 minimization for improved compressed sensing. Yudong He, Baeck Hyun Woo, Fauzan Abdurrahim, and Richard H.Y. So. 2024
Weighted l1 minimization schemes are common methods to achieve compressed sensing (CS). However, they fail when prior knowledge is inaccurate or the weights are improperly scaled, because inappropriately large weights cause large and destructive errors in signal recovery. This paper proposes a theory-based algorithm to identify and correct such destructive weights for each signal entry. The enhancement is achieved through a novel sparsity-inducing property (SIP), which establishes a necessary condition for successful signal recovery. SIP outperforms existing properties such as coherence, the restricted isometry property, and the nullspace property by indicating which signal entries fail to be recovered. This unique advantage enables us to correct destructive weights that do not satisfy the SIP condition, making signal recovery successful where it previously failed. Results from many numerical experiments demonstrate that our proposed method can improve the signal recovery capability, robustness, and stability of weighted l1 minimization for a wide range of applications, including sparse and compressive signal recovery, noise-aware recovery, sparse error correction, fast image acquisition, and sub-Nyquist sampling.
@article{HE2024109828,
  title = {A novel property to modify weighted l1 minimization for improved compressed sensing},
  journal = {Signal Processing},
  pages = {109828},
  year = {2024},
  issn = {0165-1684},
  doi = {10.1016/j.sigpro.2024.109828},
  url = {https://www.sciencedirect.com/science/article/pii/S0165168424004481},
  author = {He, Yudong and Woo, Baeck Hyun and Abdurrahim, Fauzan and So, Richard H.Y.},
  keywords = {Sparsity, Sparsity-inducing property, Weighted minimization, Compressed sensing, Underdetermined linear system},
  organization = {Elsevier},
}
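As background for the entry above, the standard noiseless form of weighted l1 minimization can be written as below. This is the textbook formulation, not the paper's SIP condition or its weight-correction algorithm; the symbols $A$ (sensing matrix), $y$ (measurements), $x$ (signal), and $w_i$ (nonnegative weights) are notation assumed here for illustration.

```latex
\begin{equation*}
  \hat{x} \;=\; \arg\min_{x \in \mathbb{R}^{n}} \; \sum_{i=1}^{n} w_i \,\lvert x_i \rvert
  \quad \text{subject to} \quad A x = y,
  \qquad A \in \mathbb{R}^{m \times n},\; m < n,\; w_i \ge 0 .
\end{equation*}
```

Setting every $w_i = 1$ gives ordinary l1 minimization. A large $w_i$ pushes the corresponding entry $x_i$ towards zero, which is why a large weight placed on a truly nonzero entry can harm recovery.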
2023
- Remote mass facial temperature screening in varying ambient temperatures and distances. Chu Chu Qiu, Jing Wei Chin, Kwan Long Wong, Tsz Tai Chan, Yudong He, and Richard H. Y. So. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2023
Remote body temperature measurement using infrared thermography has been widely deployed worldwide to detect feverish persons, but the measurement accuracy is affected by various factors, including ambient temperature and sensor-subject distance. We present a novel compensation model to address the undesirable interacting influence of ambient temperature and sensor-subject distance during remote facial temperature screening in real-world settings. We derived our model from site data collected over 12 months and demonstrated a significant linear relationship between ambient temperature and the temperature measured by a thermal camera. In addition, the interaction between the effects of sensor-subject distance and ambient temperature on the measured temperature is significant. Our model reduces the measurement error (mean absolute error, MAE) by 23.5% and outperforms the best existing models. The model can also extend the detection distance by up to 46% with sensitivity and specificity over 90%.
@inproceedings{qiu2023remote,
  title = {Remote mass facial temperature screening in varying ambient temperatures and distances},
  author = {Qiu, Chu Chu and Chin, Jing Wei and Wong, Kwan Long and Chan, Tsz Tai and He, Yudong and So, Richard H. Y.},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  pages = {6067--6075},
  year = {2023},
}
- Data-Driven and Optics-Inspired Decomposition of Global Pupil Swim in VR/AR for an Improved Perception Model of Motion Discomfort. Phoebe L. Ching, Tsz Tai Chan, Yudong He, and 2 more authors. In SID Display Week 2023, 2023
VR HMD users can observe dynamic distortion (or global pupil swim). Our earlier study correlated pupil swim with selected optic flow patterns and mathematically modeled discomfort. This study decomposed global pupil swim into a linear sum of orthogonal basis patterns to better predict its perceptual effects and improve the perception model.
2022
- Harvesting Partially-Disjoint Time-Frequency Information for Improving Degenerate Unmixing Estimation Technique. Yudong He, He Wang, Qifeng Chen, and Richard HY So. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022
The degenerate unmixing estimation technique (DUET) is one of the most efficient blind source separation algorithms for the challenging situation in which the number of sources exceeds the number of microphones. However, as a time-frequency mask-based method, DUET erroneously retains interference components when source signals overlap in both the frequency and time domains. In this paper, to avoid the erroneous retention, we propose to use multiple linear spatial filters (e.g., the minimum variance distortionless response filter) instead of masking to extract the desired signals. These filters are constructed from the information embedded in the detected single-source points, that is, time-frequency points contributed by a single source. Compared with conventional DUET, our method achieved an improvement of more than 5 dB in the source-to-interference ratio and of 2 to 5 dB in the source-to-distortion ratio. The findings are substantiated by unmixing simulations using live-recorded mixture signals from up to four sources. Audio examples can be found on the web page: "https://ydcnanhe.github.io/demo-icassp2022/"
@inproceedings{he2022harvesting,
  title = {Harvesting Partially-Disjoint Time-Frequency Information for Improving Degenerate Unmixing Estimation Technique},
  author = {He, Yudong and Wang, He and Chen, Qifeng and So, Richard HY},
  booktitle = {ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages = {506--510},
  year = {2022},
  organization = {IEEE},
  url = {https://ieeexplore.ieee.org/abstract/document/9747679},
}
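The abstract above names the minimum variance distortionless response (MVDR) filter as one possible spatial filter. The following is a minimal sketch of the textbook MVDR weights for a single frequency bin with two microphones, w = R^{-1} d / (d^H R^{-1} d). It is not the paper's pipeline: how the covariance matrix `R` and steering vector `d` are estimated from single-source points is not shown, and the function names and example values are hypothetical.

```python
import cmath


def mvdr_weights_2mic(R, d):
    """Textbook MVDR weights for two microphones at one frequency bin.

    R: 2x2 Hermitian covariance matrix (nested lists), assumed invertible.
    d: steering vector of the target source (length 2).
    """
    (a, b), (c, e) = R
    det = a * e - b * c
    if abs(det) < 1e-12:
        raise ValueError("covariance matrix is singular")
    # Closed-form 2x2 inverse applied to d: R^{-1} d.
    rinv_d = [(e * d[0] - b * d[1]) / det, (-c * d[0] + a * d[1]) / det]
    # Normalizer d^H R^{-1} d enforces the distortionless constraint.
    denom = d[0].conjugate() * rinv_d[0] + d[1].conjugate() * rinv_d[1]
    return [v / denom for v in rinv_d]


def apply_filter(w, x):
    """Filter output w^H x for one time-frequency point."""
    return sum(wi.conjugate() * xi for wi, xi in zip(w, x))


if __name__ == "__main__":
    # Hypothetical example values, chosen only for illustration.
    R = [[2.0, 0.5 + 0.5j], [0.5 - 0.5j, 1.5]]
    d = [1.0, 0.8 * cmath.exp(-0.3j)]
    w = mvdr_weights_2mic(R, d)
    # The target direction passes through with unit gain.
    print(abs(apply_filter(w, d)))
```

The normalization guarantees unit gain towards the target steering vector while minimizing output power from all other directions, which is why such a filter can suppress overlapping interference that a binary mask would retain.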
- Improving Analytic Algorithms for Speech Separation. Yudong He. 2022
Humans can listen to others in noisy environments, but it is difficult for a humanoid robot to separate the speech of individual persons from a mixture of these sounds. Known as the cocktail party problem, this challenge has intrigued scientists and engineers for more than half a century. Using the sparse characteristic of speech signals, this thesis improves three binaural audio separation algorithms in different scenarios: 1) weak target signal extraction, 2) fast separation, and 3) under-determined reverberant speech separation (i.e., the binaural speech separation problem with more than two mixed sources in the presence of echoes). First, the thesis improves a previously reported audio cancellation kernel to separate weak target signals. Our new version of the cancellation kernel achieves comparable or even better results at 3000 times the speed, thanks to our analytical solutions. This solution originated from our realization that the whole extraction process can be performed in the time-frequency domain by the short-time Fourier transform. Second, the thesis improves the degenerate unmixing estimation technique (DUET), one of the fastest algorithms in speech separation. As a binary masking technique, DUET cannot completely separate speech signals, resulting in poor performance. We applied post-filtering with multiple linear spatial filters to improve the mask separation results and achieved significantly better separation performance. Third, the thesis improves the l1 minimization commonly used in audio separation algorithms. Speech separation can be converted to an l1 minimization problem that aims to minimize the l1 norm of the reconstructed signal. We derived and tested a new weighted l1 norm and showed that it can outperform the unweighted l1 norm. The new algorithm can be solved using the same l1 minimization solver but converges faster than the unweighted l1 minimization. The improved l1 minimization algorithm has been shown to work in the presence of reverberation and with more than two mixed speech sources.
@thesis{Thesis,
  author = {He, Yudong},
  title = {Improving Analytic Algorithms for Speech Separation},
  year = {2022},
  institution = {Department of Industrial Engineering and Decision Analytics, The Hong Kong University of Science and Technology},
  type = {phdthesis},
  url = {https://lbezone.hkust.edu.hk/bib/991013114556803412},
}
2019
- Hearing Aid De-noising Algorithm for Chinese Speaker: Deep Learning Based End-to-end Speech Enhancement Model. Jun Hui, Yue Wei, Yudong He, and 1 more author. In 2019 Beijing International Audiology Conference, 2019