publications | Yudong He

2025

TFS
An Equilibrium Approach to Clustering Surpassing Fuzzy C-Means on Imbalanced Data

Yudong He

2025


Most fuzzy clustering algorithms are based on the well-known Bezdek’s fuzzy C-means (FCM). However, FCM fails when the data is imbalanced (class sizes are highly unequal). This issue arises because in FCM all data points have only attraction to each cluster prototype (i.e., centroid), causing centroids to be biased towards the large class with the most data points. This paper proposes a novel equilibrium K-means (EKM) for imbalanced data, where data points exert both attraction and repulsion on centroids. The equilibrium between opposing forces reduces the learning bias towards large clusters, leading to more meaningful clusters on imbalanced data. Unlike FCM, EKM explicitly models the relationship between membership and centroid via equality constraints, avoiding the pitfalls of the uniform effect. We derive closed-form centroid update equations proven to converge exponentially fast. Experiments are conducted on four artificial and 16 real-world datasets. The results demonstrate that EKM significantly outperforms 13 state-of-the-art methods on imbalanced datasets while maintaining competitive performance on balanced data. EKM achieves an average improvement of 0.22 in normalized mutual information, 0.31 in adjusted Rand index, and 0.21 in clustering accuracy over FCM on 10 real-world imbalanced datasets, with comparable computational efficiency in theory and practice.
@article{he2025an,
  title = {An Equilibrium Approach to Clustering Surpassing Fuzzy C-Means on Imbalanced Data},
  year = {2025},
  author = {He, Yudong},
  keywords = {K-means, fuzzy clustering, equilibrium clustering, imbalance learning, uniform effect},
  organization = {IEEE},
  url = {https://ieeexplore.ieee.org/abstract/document/11098663?casa_token=34VgPpXUfXwAAAAA:BydtmRcjFn00k11mthZZRlwnGOozj_F458gm96RRvt19r2rJptkaVRFp_fwznSX8zDpBMCWs},
}
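The attraction-only behaviour of FCM described in this entry's abstract can be illustrated with the textbook FCM updates. The following is a minimal sketch on 1-D data, assuming the standard Bezdek formulation with fuzzifier m = 2. The function names and the `eps` guard are hypothetical choices for illustration. It does not implement EKM, whose update equations are not reproduced on this page.

```python
def fcm_memberships(points, centroids, m=2.0, eps=1e-12):
    """Standard FCM membership update for 1-D data.

    u[n][k] = 1 / sum_j (d_nk / d_nj) ** (2 / (m - 1)),
    where d_nk is the distance from point n to centroid k.
    eps (an assumption of this sketch) avoids division by zero.
    """
    power = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [abs(x - c) + eps for c in centroids]
        row = [1.0 / sum((d_k / d_j) ** power for d_j in dists) for d_k in dists]
        memberships.append(row)
    return memberships


def fcm_centroids(points, memberships, m=2.0):
    """Standard FCM centroid update: mean of the points weighted by u ** m."""
    n_clusters = len(memberships[0])
    centroids = []
    for k in range(n_clusters):
        weights = [row[k] ** m for row in memberships]
        centroids.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
    return centroids


def fcm(points, centroids, m=2.0, n_iter=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(n_iter):
        u = fcm_memberships(points, centroids, m)
        centroids = fcm_centroids(points, u, m)
    return centroids, fcm_memberships(points, centroids, m)
```

Calling `fcm([0.0, 0.1, 0.2, 0.3, 5.0], [0.0, 5.0])` returns the final centroids and memberships. Every membership weight is strictly positive, so every point pulls on every centroid and nothing pushes a centroid away. This is the attraction-only behaviour the abstract contrasts with EKM.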
KBS
Semi-supervised equilibrium K-means for imbalanced data clustering

Yudong He

2025


Equilibrium K-means (EKM) represents a novel advancement in fuzzy clustering methodologies, outperforming Bezdek’s fuzzy C-means (FCM) algorithm when applied to datasets with imbalanced distributions. Nonetheless, EKM is inherently limited by its inability to integrate supervision information, rendering it less effective in scenarios wherein partial data labels are accessible. In this paper, we propose a semi-supervised variant of EKM (SSEKM) that can effectively leverage supervision knowledge. The effects of supervision knowledge on the model and convergence behavior are elucidated via theoretical analysis. Empirical evaluations conducted on four synthetic and 16 real-world datasets from medical, biological, and industrial sectors indicate that SSEKM exhibits competitive performance against semi-supervised FCM (SSFCM) and other state-of-the-art semi-supervised fuzzy clustering algorithms on balanced datasets and surpasses them on imbalanced datasets. Additionally, SSEKM maintains a computational complexity comparable to SSFCM and offers a higher convergence speed than most comparative algorithms.
@article{he2025semi,
  title = {Semi-supervised equilibrium K-means for imbalanced data clustering},
  year = {2025},
  author = {He, Yudong},
  keywords = {Fuzzy clustering, Equilibrium K-means, Semi-supervised learning, Imbalanced clustering},
  organization = {Elsevier},
  doi = {https://doi.org/10.1016/j.knosys.2025.113990},
  url = {https://www.sciencedirect.com/science/article/pii/S0950705125010354?utm_campaign=STMJ_220042_AUTH_SERV_PA&utm_medium=email&utm_acid=304786498&SIS_ID=&dgcid=STMJ_220042_AUTH_SERV_PA&CMX_ID=&utm_in=DM581693&utm_source=AC_},
}
ICASSP
A Novel Weighted Sparse Component Analysis for Underdetermined Blind Source Separation

Yudong He, Baeck Hyun Woo, and Richard H.Y. So

2025


Sparse component analysis (SCA) is a popular underdetermined blind speech separation (UBSS) method. It models all sources as having an identical distribution. Because speech signals are not identically distributed, SCA performs suboptimally. Some studies have improved the performance of SCA by weighting the sources through a reweighting scheme. However, they are not UBSS methods because they assume that the mixing process is known. This paper proposes a novel weighting scheme, called sparse spatial component analysis (SSCA), that does not require knowledge of the mixing process. In SSCA, weights, sources, and the parameters for modeling the mixing process are jointly optimized, making it a UBSS method. Simulation experiments show that for instantaneous mixtures, SSCA outperforms SCA and reweighted SCA, improving the source-to-distortion ratio (SDR) by 4 dB and reducing the computational time by 40%. Further, experiments using real-world recordings reveal that SSCA outperforms multichannel non-negative matrix factorization and full-rank covariance analysis (FCA) in terms of SDR. SSCA is also 200% faster than FCA.
@article{HE2025ICASSP,
  title = {A Novel Weighted Sparse Component Analysis for Underdetermined Blind Source Separation},
  year = {2025},
  author = {He, Yudong and Woo, Baeck Hyun and So, Richard H.Y.},
  keywords = {blind source separation, underdetermined linear system, sparse component analysis, weighted l1 minimization},
  organization = {IEEE},
}
J. SP
A novel property to modify weighted l1 minimization for improved compressed sensing

Yudong He, Baeck Hyun Woo, Fauzan Abdurrahim, and Richard H.Y. So

2025


Weighted l1 minimization schemes are common methods to achieve compressed sensing (CS). However, they fail in the presence of inaccurate prior knowledge or improper scaling of weights due to inappropriately assigned large weights causing large and destructive errors in signal recovery. This paper proposes a theory-based algorithm to identify and correct such destructive weights for each signal entry. The enhancement is achieved through a novel sparsity-inducing property (SIP) which establishes a necessary condition for successful signal recovery. SIP outperforms existing properties such as coherence, restricted isometry property, and nullspace property by indicating which signal entries fail to be recovered. This unique advantage enables us to correct destructive weights that do not satisfy the SIP condition, making signal recovery successful where it previously failed. Results from many numerical experiments demonstrate that our proposed method can improve the signal recovery capability, robustness, and stability of the weighted l1 minimization for a wide range of applications, including sparse and compressive signal recovery, noise-aware recovery, sparse error correction, fast image acquisition, and sub-Nyquist sampling.
@article{HE2025109828,
  title = {A novel property to modify weighted l1 minimization for improved compressed sensing},
  journal = {Signal Processing},
  volume = {230},
  pages = {109828},
  year = {2025},
  issn = {0165-1684},
  doi = {https://doi.org/10.1016/j.sigpro.2024.109828},
  url = {https://www.sciencedirect.com/science/article/pii/S0165168424004481},
  author = {He, Yudong and Woo, Baeck Hyun and Abdurrahim, Fauzan and So, Richard H.Y.},
  keywords = {Sparsity, Sparsity-inducing property, Weighted minimization, Compressed sensing, Underdetermined linear system},
  organization = {Elsevier},
}
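For readers unfamiliar with the problem this entry's abstract refers to, the generic weighted l1 minimization program is commonly written as below. This is the textbook formulation, not the paper's modified algorithm. The symbols A (sensing matrix), y (measurements), x (signal to recover), and w (per-entry weights) are notation assumed here, since the abstract does not define any.

```latex
\min_{x \in \mathbb{R}^{n}} \; \sum_{i=1}^{n} w_i \,\lvert x_i \rvert
\quad \text{subject to} \quad A x = y,
\qquad w_i > 0 .
```

Setting every w_i = 1 recovers ordinary l1 minimization. A larger w_i penalizes a nonzero x_i more heavily, so a large weight placed on an entry that is truly nonzero works against recovery. This is the kind of destructive weight the abstract describes.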

2024

arXiv

Imbalanced Data Clustering using Equilibrium K-Means

Yudong He

2024


Centroid-based clustering algorithms, such as hard K-means (HKM) and fuzzy K-means (FKM), suffer from a learning bias towards large clusters. Their centroids tend to be crowded in large clusters, compromising performance when the true underlying data groups vary in size (i.e., imbalanced data). To address this, we propose a new clustering objective function based on the Boltzmann operator, which introduces a novel centroid repulsion mechanism, where data points surrounding the centroids repel other centroids. Larger clusters repel more, effectively mitigating the issue of large cluster learning bias. The proposed new algorithm, called equilibrium K-means (EKM), is simple, alternating between two steps; resource-saving, with the same time and space complexity as FKM; and scalable to large datasets via batch learning. We extensively evaluate the performance of EKM on synthetic and real-world datasets. The results show that EKM performs competitively on balanced data and significantly outperforms benchmark algorithms on imbalanced data. Deep clustering experiments demonstrate that EKM is a better alternative to HKM and FKM on imbalanced data as more discriminative representation can be obtained. Additionally, we reformulate HKM, FKM, and EKM in a general form of gradient descent and demonstrate how this general form facilitates a uniform study of K-means algorithms.
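The abstract names the Boltzmann operator but does not state its form. As a hedged illustration, the sketch below implements the standard definition: a softmax-weighted average that approaches the minimum of its inputs as alpha goes to negative infinity and reduces to the arithmetic mean at alpha = 0. Using it on point-to-centroid distances as a smooth stand-in for the hard minimum of K-means is an assumption of this sketch, and the function name `boltzmann` is hypothetical.

```python
import math


def boltzmann(values, alpha):
    """Boltzmann operator: sum_i v_i * exp(alpha * v_i) / sum_i exp(alpha * v_i).

    alpha -> -inf approaches min(values); alpha -> +inf approaches max(values);
    alpha == 0 gives the arithmetic mean.
    """
    scaled = [alpha * v for v in values]
    top = max(scaled)  # subtract the max before exponentiating for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def smooth_min_distance(point, centroids, alpha=-5.0):
    """Smoothed minimum of squared 1-D distances from a point to the centroids.

    Illustrative only: the negative alpha and the squared distance are
    assumptions, not values taken from the paper.
    """
    return boltzmann([(point - c) ** 2 for c in centroids], alpha)
```

Unlike a hard minimum, the smoothed value depends on every centroid, so its gradient with respect to each centroid is generally nonzero.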

2023

CVPR
Remote mass facial temperature screening in varying ambient temperatures and distances

Chu Chu Qiu, Jing Wei Chin, Kwan Long Wong, Tsz Tai Chan, Yudong He, and Richard H. Y. So

In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2023


Remote body temperature measurement using infrared thermography has been widely deployed worldwide to detect feverish persons, but the measurement accuracy is affected by various factors including ambient temperature and sensor-subject distance. We present a novel compensation model to address the undesirable interacting influence of ambient temperature and sensor-subject distance during remote facial temperature screening in real-world settings. We derived our model from on-site data collected over 12 months and demonstrated a significant linear relationship between ambient temperature and the temperature measured by a thermal camera. In addition, the interaction between the effects of sensor-subject distance and ambient temperature on the measured temperature is significant. Our model reduces the measurement error (mean absolute error, MAE) by 23.5% and outperforms the best existing models. The model can also extend the detection distance by up to 46% with sensitivity and specificity over 90%.
@inproceedings{qiu2023remote,
  title = {Remote mass facial temperature screening in varying ambient temperatures and distances},
  author = {Qiu, Chu Chu and Chin, Jing Wei and Wong, Kwan Long and Chan, Tsz Tai and He, Yudong and So, Richard H. Y.},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  pages = {6067-6075},
  year = {2023},
}
SID

Data-Driven and Optics-Inspired Decomposition of Global Pupil Swim in VR/AR for an Improved Perception Model of Motion Discomfort

Phoebe L. Ching, Tsz Tai Chan, Yudong He, and 2 more authors

In SID Display Week 2023, 2023


VR HMD users can observe dynamic distortion (or global pupil swim). Our earlier study correlated pupil swim to selected optic flow patterns and mathematically modeled discomfort. This study decomposed global pupil swim into a linear sum of orthogonal basis patterns to better predict its perceptual effects and improve the perception model.

2022

ICASSP
Harvesting Partially-Disjoint Time-Frequency Information for Improving Degenerate Unmixing Estimation Technique

Yudong He, He Wang, Qifeng Chen, and Richard H.Y. So

In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022


The degenerate unmixing estimation technique (DUET) is one of the most efficient blind source separation algorithms for the challenging situation in which the number of sources exceeds the number of microphones. However, as a time-frequency mask-based method, DUET erroneously retains interference components when source signals overlap in both the time and frequency domains. In this paper, to avoid this erroneous retention, we propose to replace masking with multiple linear spatial filters (e.g., the minimum variance distortionless response filter) to extract the desired signals. These filters are constructed from the information embedded in the detected single-source points, that is, time-frequency points contributed by a single source. Compared with the conventional DUET, our method achieved an improvement of more than 5 dB in the source-to-interference ratio and of 2 to 5 dB in the source-to-distortion ratio. The findings are substantiated by unmixing simulations using live-recorded mixture signals from up to four sources. Audio examples can be found at https://ydcnanhe.github.io/demo-icassp2022/
@inproceedings{he2022harvesting,
  title = {Harvesting Partially-Disjoint Time-Frequency Information for Improving Degenerate Unmixing Estimation Technique},
  author = {He, Yudong and Wang, He and Chen, Qifeng and So, Richard HY},
  booktitle = {ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages = {506--510},
  year = {2022},
  organization = {IEEE},
  url = {https://ieeexplore.ieee.org/abstract/document/9747679},
}
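The minimum variance distortionless response (MVDR) filter named in this entry's abstract has a well-known closed form, given below for reference. The symbols are assumed notation, since the abstract defines none: x is the vector of microphone signals at one time-frequency point, a is the steering (mixing) vector of the desired source, and R is the spatial covariance matrix of the signal to be suppressed. How the paper estimates a and R from single-source points is not stated on this page and is not reproduced here.

```latex
\mathbf{w}_{\mathrm{MVDR}}
  = \arg\min_{\mathbf{w}} \; \mathbf{w}^{\mathsf{H}} \mathbf{R}\, \mathbf{w}
    \quad \text{s.t.} \quad \mathbf{w}^{\mathsf{H}} \mathbf{a} = 1
  = \frac{\mathbf{R}^{-1}\mathbf{a}}{\mathbf{a}^{\mathsf{H}} \mathbf{R}^{-1} \mathbf{a}},
\qquad
\hat{s} = \mathbf{w}_{\mathrm{MVDR}}^{\mathsf{H}} \mathbf{x}.
```

The constraint passes the desired source undistorted while the minimization suppresses everything else. Unlike a binary mask, this linear filter can therefore attenuate interference at time-frequency points where sources overlap.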
Ph.D. THESIS
Improving Analytic Algorithms for Speech Separation

Yudong He

2022


Humans can listen to others in noisy environments, but it is difficult for a humanoid robot to separate the speech of individual persons from a mixture of these sounds. Known as the cocktail party problem, this challenge has intrigued scientists and engineers for more than half a century. Using the sparse characteristic of speech signals, this thesis improves three binaural audio separation algorithms in different scenarios: 1) weak target signal extraction, 2) fast separation, and 3) under-determined reverberant speech separation (i.e., binaural speech separation with more than two mixed sources in the presence of echoes). First, the thesis improves a previously reported audio cancellation kernel to separate weak target signals. Our new version of the cancellation kernel achieves comparable or even better results at 3000 times the speed thanks to our analytical solutions. These solutions originated from our realization that the whole extraction process can be performed in the time-frequency domain by the short-time Fourier transform. Second, the thesis improves the degenerate unmixing estimation technique (DUET), one of the fastest algorithms in speech separation. As a binary masking technique, DUET cannot completely separate speech signals, resulting in poor performance. We applied post-filtering with multiple linear spatial filters to improve the mask separation results, which yielded significantly better separation performance. Third, the thesis improves the l1 minimization commonly used in audio separation algorithms. Speech separation can be converted to an l1 minimization problem that aims to minimize the l1 norm of the reconstructed signal. We derived and tested a new weighted l1 norm and showed that it can outperform the unweighted l1 norm. The new algorithm can be solved using the same l1 minimization solver but converges faster than the unweighted l1 minimization.
The improved l1 minimization algorithm has been shown to work in the presence of reverberation and with more than two mixed speech sources.
@thesis{Thesis,
  author = {He, Yudong},
  title = {Improving Analytic Algorithms for Speech Separation},
  year = {2022},
  institution = {Department of Industrial Engineering and Decision Analytics, The Hong Kong University of Science and Technology},
  type = {phdthesis},
  url = {https://lbezone.hkust.edu.hk/bib/991013114556803412},
}

2019

Hearing Aid De-noising Algorithm for Chinese Speaker: Deep Learning Based End-to-end Speech Enhancement Model

Jun Hui, Yue Wei, Yudong He, and 1 more author

In 2019 Beijing International Audiology Conference, 2019