RGC General Research Fund (PI: Prof. Bo Han, Department of Computer Science, Hong Kong Baptist University)
Project Award Information
Award Number: RGC GRF 12200725
Title: Towards Dynamic Knowledge-aware Federated Learning for Foundation Models
Principal Investigator (PI): Prof. Bo Han, Department of Computer Science, Hong Kong Baptist University
Project Summary
Foundation Models (FMs) have become pivotal in advancing artificial intelligence, offering a flexible framework for applications across various industry sectors. Take healthcare as an example: it cost Hong Kong HK$243.2 billion (8.5% of GDP) in 2021-22, and applying FMs can improve the efficiency of healthcare systems, e.g., by providing informative assistance in analyzing patient records. However, training FMs usually relies on a central server that aggregates massive amounts of data, which increases the risk of privacy leakage and data monopolization. In this context, Federated Learning (FL) emerges as a promising approach that can collaboratively train FMs while protecting privacy, since clients contribute to FM training without sending their raw data (see the illustrative sketch after this summary). Recent advancements, e.g., TogetherAI, have shown promising outcomes in this field. However, directly applying existing FL frameworks to train FMs is impeded by the following challenges:
How to align the dynamic data requirements of FMs with the static data assumptions of FL? FMs necessitate continuous updates, e.g., clients often receive fresh data that should be incorporated to keep the model's knowledge current, whereas existing FL paradigms assume largely static client data and neglect this alignment.
How to deal with data leakage when pre-training FMs in an FL manner? FL fundamentally relies on gradient exchange, but exchanged gradients can be exploited by adversaries to infer sensitive information, leading to the risk of data leakage.
How to deal with imperfect data when fine-tuning FMs in an FL scheme? Data quality plays a crucial role in fine-tuning FMs, but the data collected by clients may be noisy or even maliciously altered, posing threats to FM performance.
In summary, this project aims to develop a new FL paradigm for FMs in dynamic knowledge environments, which can be further deployed to broader scientific and industrial applications.
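To make the collaboration pattern above concrete, the following is a minimal, illustrative FedAvg-style sketch: each client computes a local model update on its own private data, and only the update (never the raw data) is sent to the server for weighted averaging. The function names, the linear-regression task, and all parameters are hypothetical simplifications for illustration; this is not the paradigm proposed in this project.

```python
# Minimal FedAvg-style sketch (illustrative only): clients share model
# updates, never raw data, and the server averages the updates.
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=1):
    """One client's local training: plain gradient descent on a
    linear-regression (MSE) loss over the client's private (X, y)."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # MSE gradient
        w -= lr * grad
    return w

def fedavg_round(global_w, client_data):
    """One communication round: aggregate client updates weighted
    by local dataset size (FedAvg-style averaging)."""
    updates, sizes = [], []
    for X, y in client_data:
        updates.append(local_update(global_w, X, y))
        sizes.append(len(y))
    return np.average(np.stack(updates), axis=0, weights=np.asarray(sizes, float))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_w = np.array([2.0, -1.0])
    # Three clients, each holding private data that never leaves the client.
    clients = []
    for n in (50, 80, 120):
        X = rng.normal(size=(n, 2))
        y = X @ true_w + 0.1 * rng.normal(size=n)
        clients.append((X, y))
    w = np.zeros(2)
    for _ in range(20):  # communication rounds
        w = fedavg_round(w, clients)
    print("estimated weights:", w)
```

Note that the exchanged updates (weights or gradients) are precisely what an adversary may attempt to invert, which motivates the data-leakage challenge described above.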
Research Publications
The following papers focus on federated foundation model learning:
The following papers focus on federated foundation model pre-training:
The following papers focus on federated foundation model fine-tuning:
Collaborators
University: TBD
Institute: TBD
Industry: TBD
Acknowledgement
This material is based upon work supported by the RGC under Grant No. 12200725. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the RGC.