Bo Han


RGC General Research Fund

(PI: Prof. Bo Han, Department of Computer Science, Hong Kong Baptist University)

    Project Award Information

  • Award Number: RGC GRF 12200725

  • Title: Towards Dynamic Knowledge-aware Federated Learning for Foundation Models

  • Principal Investigator (PI): Prof. Bo Han, Department of Computer Science, Hong Kong Baptist University

    Project Summary

    Foundation Models (FMs) have become pivotal in advancing artificial intelligence, offering a flexible framework that can benefit a wide range of industry sectors. Take healthcare as an example: it cost Hong Kong HK$243.2 billion (8.5% of GDP) in 2021-22, and applying FMs can improve the efficiency of healthcare systems, e.g., by providing informative assistance in analyzing patient records. However, training FMs usually relies on a central server that aggregates massive amounts of data, which increases the risk of privacy leakage and data monopolization. In this context, Federated Learning (FL) emerges as a promising approach: it trains FMs collaboratively and protects privacy by enabling clients to contribute to FM training without sending their data. Recent advancements, e.g., TogetherAI, have shown promising outcomes in this field. However, directly applying existing FL frameworks to train FMs is impeded by the following challenges:

  • How to align the dynamic data requirements of FMs with the static data assumptions of FL? FMs necessitate continuous updates, e.g., clients often receive fresh data that is needed to keep the model's knowledge up to date. However, existing FL paradigms assume static client data and neglect this mismatch.

  • How to deal with data leakage when pre-training FMs in an FL manner? FL fundamentally relies on gradient exchange. However, exchanged gradients can be exploited by adversaries to infer sensitive information, leading to the risk of data leakage.

  • How to deal with imperfect data when fine-tuning FMs in an FL scheme? Data quality plays a crucial role in fine-tuning FMs. However, the data collected by clients may be noisy or even maliciously altered, posing threats to FM performance.

    In summary, in this project we aim to develop a new FL paradigm for FMs in dynamic knowledge environments, which can be further deployed in broader scientific and industrial applications.
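    As a point of reference for the FL setting described in the summary, the sketch below illustrates the basic idea that clients train locally and only share model parameters, never raw data. It is a minimal illustration of generic federated averaging on a toy linear model, not the method proposed in this project. It exchanges weights rather than gradients for simplicity, and all names (local_update, federated_average, run_rounds) as well as the learning rate and round counts are hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Dataset = Sequence[Tuple[Vector, float]]  # (features, target) pairs held by one client


def local_update(weights: Vector, data: Dataset, lr: float = 0.05, epochs: int = 5) -> Vector:
    """Run a few epochs of gradient descent on the client's own data.

    The toy model is linear (y ~ w . x) with mean squared error loss.
    The raw data never leaves this function; only weights are returned.
    """
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / len(data)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def federated_average(client_weights: Sequence[Vector], client_sizes: Sequence[int]) -> Vector:
    """Server step: average client weights, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


def run_rounds(clients: Sequence[Dataset], dim: int, rounds: int = 20) -> Vector:
    """Alternate local training and server aggregation for a fixed number of rounds."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        global_w = federated_average(updates, [len(d) for d in clients])
    return global_w


if __name__ == "__main__":
    # Two clients whose private data both follow y = 2 * x.
    client_a = [([1.0], 2.0), ([2.0], 4.0)]
    client_b = [([3.0], 6.0)]
    print(run_rounds([client_a, client_b], dim=1))
```

    Note that this sketch assumes each client's dataset is fixed and clean and that shared updates are trusted. These are the assumptions that the three challenges above call into question.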

    Research Publications

    The following papers focus on federated foundation model learning:
  • TBD

    The following papers focus on federated foundation model pre-training:
  • TBD

    The following papers focus on federated foundation model fine-tuning:
  • TBD

    Software

  • TBD, [code]

    Education

  • UG Course: TBD

  • PG Course: TBD

  • Tutorial: TBD

  • Undergraduate Research Programme (UGRP): TBD

  • Summer Undergraduate Research Fellowship: TBD

    Collaborators

  • University: TBD

  • Institute: TBD

  • Industry: TBD

    Acknowledgement

    This material is based upon work supported by the RGC under Grant No. 12200725. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the RGC.