Project

2025

Compressed Vocabulary Expansion Makes Stronger Recommender Systems

Haochen Zhang March 18, 2025 April 16, 2025

large-language-model recommender-system memory-efficiency vocabulary-expansion

Recommender systems play a pivotal role in providing relevant content to users. With the rapid development of large language models (LLMs), researchers have begun utilizing LLMs to build more powerful recommender systems. However, existing approaches that focus on aligning LLMs with recommendation tasks do not fully leverage their sequential information processing capabilities, leading to suboptimal performance.

Low-Rank Optimization for Memory-efficient Large Language Model Pretraining

Haochen Zhang March 18, 2025 April 16, 2025

large-language-model low-rank-optimization memory-efficiency

This project explores improved low-rank optimization methods for memory-efficient LLM pretraining.

Automatic Active Lesion Tracking in Multiple Sclerosis Using Unsupervised Machine Learning

Jason Uwaeze March 17, 2025 April 16, 2025

diagnostic-imaging unsupervised-learning multiple-sclerosis brain-mri

This study uses Non-linear Dimensionality Reduction (NLDR) techniques to identify active lesions in magnetic resonance imaging (MRI). We applied Locally Linear Embedding (LLE) and Isometric Feature Mapping (Isomap) to MRI data from 40 multiple sclerosis patients, achieving median Dice scores of 0.74 and 0.78 for active lesion segmentation, respectively [Paper][Code].
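
The Dice score quoted above measures the overlap between a predicted lesion mask and a reference mask: twice the number of shared positive voxels divided by the total number of positive voxels in both masks. The following is a minimal sketch of that formula only, not the study's evaluation code; the function name `dice_score`, the flat 0/1 mask representation, and the toy masks are illustrative assumptions.

```python
def dice_score(pred, truth):
    """Dice coefficient between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    # Voxels marked positive in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total positives across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks (hypothetical data, not from the study).
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 1, 1, 0, 1, 0]
score = dice_score(predicted, reference)
print(round(score, 3))
```

A score of 1 means the two masks coincide exactly, and 0 means they share no positive voxel.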

Patch-Based Convolutional Neural Networks for Multiple Microstructural Features Detection in Nuclear Fuel

Jason Uwaeze March 17, 2025 April 16, 2025

scanning-electron-microscopy energy-dispersive-spectroscopy convolutional-neural-networks semantic-segmentation nuclear-energy

This work introduces a new framework for identifying microstructures in irradiated U-10Zr (wt. %) metallic fuel with limited annotated data. The framework includes the creation of a reliable annotated dataset with paired SEM and ground truth data from EDS maps, the application of CNNs for microstructure identification, and the validation of model performance. We evaluate several models, including Patch-based U-Net, Attention U-Net, and Residual U-Net, finding that Patch-based U-Net exhibits superior segmentation performance and consistency. This approach reduces reliance on EDS detectors and helps accelerate the nuclear material analysis process, highlighting the potential of advanced deep learning techniques to improve microstructural understanding of nuclear materials.

Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization

Yu-Neng (Allen) Chuang February 23, 2025 April 16, 2025

on-device-llm-routing uncertainty-quantification efficient-llm-inference wearable-device

This work offers an accessible and reproducible pipeline for uncertainty-based routing from benchmarking to generalization.
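
One common way to realize uncertainty-based routing is to answer on the device when the small model's output distribution is confident and to escalate to a larger model otherwise. The sketch below illustrates that idea under stated assumptions: predictive entropy as the uncertainty measure, a fixed threshold of 0.5 nats, and the names `predictive_entropy` and `route` are all hypothetical choices, not the pipeline described in this project.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def route(probs, threshold=0.5):
    """Keep the query on-device when entropy is low, else send it to the server.

    The entropy measure and the threshold value are assumptions for illustration.
    """
    return "on-device" if predictive_entropy(probs) <= threshold else "server"


# Hypothetical next-token distributions from a small on-device model.
confident = [0.97, 0.01, 0.01, 0.01]
uncertain = [0.4, 0.3, 0.2, 0.1]

print(route(confident))
print(route(uncertain))
```

In practice the threshold would be tuned on a benchmark to trade answer quality against the cost of calling the larger model.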

2024

Asynchronous Decentralized Federated Lifelong Learning (ADFLL)

Guangyao Zheng December 30, 2024 April 16, 2025

personalized-healthcare distributed-learning continual-learning medical-imaging low-compute-optimization

This project introduces the Asynchronous Decentralized Federated Lifelong Learning (ADFLL) framework, an innovative approach to federated learning that addresses the limitations of synchronous training schedules and the lack of lifelong learning in conventional machine learning frameworks for medical applications. ADFLL enables asynchronous and continual learning across agents, allowing them to leverage both their own experiences and knowledge shared by others. The framework was evaluated using deep reinforcement learning (DRL) for landmark localization tasks across diverse imaging modalities, orientations, and sequences. Experimental results demonstrated that ADFLL outperforms baseline models in collaborative learning, showing superior performance on both in-distribution and out-of-distribution test sets. This robust, efficient, and flexible framework is well-suited for deployment in real-world applications requiring privacy-preserving and lifelong collaborative learning. The paper, "Towards a Collective Medical Imaging AI: Enabling Continual Learning from Peers," was published at Medical Imaging with Deep Learning (MIDL) 2024. GitHub repository: https://github.com/guangyaoz/ADFLL.

Biomarker Stroke Prediction

Guangyao Zheng December 30, 2024 April 16, 2025

personalized-healthcare biomarker few-shot-learning

This project explores the link between mitochondrial oxidative phosphorylation (OxPhos) abnormalities and stroke risk in patients with advanced congestive heart failure (CHF) undergoing continuous-flow left ventricular assist device (CF-LVAD) implantation. Stroke remains a significant complication for this patient population, and prior ischemic events may predispose individuals to systemic mitochondrial dysfunction, exacerbating their risk of new strokes post-implantation.

Arrhythmia Detection and ECG Explainability

Guangyao Zheng December 30, 2024 April 16, 2025

personalized-healthcare multi-label-classification wearable-device cardiovascular-health explainability interpretability

This project addresses the critical challenges of arrhythmia detection and classification, particularly in the context of wearable electrocardiogram (ECG) monitoring devices. Unlike clinically controlled environments, wearable devices operate in noisy, real-world conditions, which complicates the accurate identification of arrhythmias. Additionally, the inherent imbalance in the ratio of normal heartbeats to arrhythmic ones, along with the diverse combinations of arrhythmia types, further compounds the difficulty of the task.

Learning-augmented Maximum Independent Set

Chen Wang September 05, 2024 April 16, 2025

foundations-of-machine-learning algorithm-design graph-mining theoretical-machine-learning beyond-worst-case-analysis

We investigated algorithms for the Maximum Independent Set (MIS) problem in the presence of a learning-augmented membership oracle. Given a graph $G=(V,E)$ with $n=|V|$ vertices, the MIS problem asks for a largest subset of vertices $I^\star \subseteq V$ such that for any pair of vertices $u,v \in I^\star$, $(u,v)\not\in E$, i.e., no two selected vertices are neighbors. The MIS problem is NP-hard, and it is also NP-hard to approximate within a factor of $n^{1-\delta}$ for any constant $\delta>0$.
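
To make the definition concrete, the sketch below checks whether a vertex set is independent and runs a simple greedy heuristic that visits vertices predicted to be in the solution first. It is an illustration only, not the algorithm studied in this project: the function names, the representation of the oracle as a Python callable returning a boolean prediction, and the star-graph example are all assumptions.

```python
def is_independent_set(edges, candidate):
    """True if no edge has both endpoints in the candidate set."""
    chosen = set(candidate)
    return not any(u in chosen and v in chosen for u, v in edges)


def greedy_with_oracle(vertices, edges, oracle):
    """Greedy independent set that tries oracle-predicted members first.

    `oracle(v)` is a hypothetical membership prediction; it may be wrong.
    The output is always independent, whatever the oracle says.
    """
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    # Stable sort: vertices predicted to be in the solution come first.
    order = sorted(vertices, key=lambda v: not oracle(v))
    result = set()
    for v in order:
        # Add v only if none of its neighbors was already selected.
        if neighbors[v].isdisjoint(result):
            result.add(v)
    return result


# Star graph: center 0 joined to leaves 1..4; the optimum is the four leaves.
vertices = [0, 1, 2, 3, 4]
edges = [(0, 1), (0, 2), (0, 3), (0, 4)]

with_oracle = greedy_with_oracle(vertices, edges, lambda v: v != 0)
without_oracle = greedy_with_oracle(vertices, edges, lambda v: False)
print(sorted(with_oracle), sorted(without_oracle))
```

On this star graph, the uninformed greedy order picks the center and gets stuck with a single vertex, while an accurate oracle steers the same procedure to the four leaves.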