

RGC Early Career Scheme

(PI: Dr. Bo Han, Department of Computer Science, Hong Kong Baptist University)

    Project Award Information

  • Award Number: RGC ECS 22200720

  • Title: Trustworthy Deep Learning from Open-set Corrupted Data

  • Principal Investigator (PI): Dr. Bo Han, Department of Computer Science, Hong Kong Baptist University

    Project Summary

    Trustworthy learning from corrupted data is a vital research topic in modern machine learning (i.e., deep learning), since real-world data, such as financial, healthcare and social-network data, are often imperfect and corrupted. However, existing work in trustworthy deep learning (TDL) tends to implicitly assume that corrupted data are closed-set: samples with corrupted labels have true classes that appear in the training data; adversarial examples crafted in the testing phase stay within the set of known classes; and samples in the source domain share the same classes as samples in the target domain. This closed-set assumption underlies existing TDL methods, yet it is too restrictive for many real-world applications. This project aims to address this conundrum by developing models, algorithms and a prototype system for trustworthy deep learning from open-set corrupted data. The outcome of the research could significantly robustify TDL techniques in open-world knowledge discovery and decision-making processes, such as those in personalized medicine, financial engineering and scientific discovery.
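
    The closed-set assumption is easiest to see in label-noise learning, where a standard remedy is forward loss correction with a class-conditional noise transition matrix T (the object studied in several papers below): T is a square K-by-K matrix precisely because every noisy label is assumed to map back to one of the K known classes. The sketch below illustrates this baseline in PyTorch; the model, matrix and data are illustrative placeholders, not the project's actual methods. Open-set corruption breaks this setup because some samples have true classes outside the K known ones, so no K-by-K matrix can describe their noise.

        import torch
        import torch.nn.functional as F

        K = 3  # number of known classes (the closed-set assumption)

        # Class-conditional noise transition matrix: T[i, j] = P(noisy label j | true label i).
        # Rows sum to one. Illustrative values, not estimated from real data.
        T = torch.tensor([[0.8, 0.1, 0.1],
                          [0.1, 0.8, 0.1],
                          [0.1, 0.1, 0.8]])

        def forward_corrected_loss(logits, noisy_labels):
            # Forward correction: push the model's clean-class probabilities
            # through T, then score against the observed noisy labels. This is
            # consistent only if every true class is among the K known ones.
            clean_probs = F.softmax(logits, dim=1)
            noisy_probs = clean_probs @ T
            return F.nll_loss(torch.log(noisy_probs + 1e-12), noisy_labels)

        # Toy usage: a linear model on random features with noisy labels.
        model = torch.nn.Linear(5, K)
        x = torch.randn(8, 5)
        y_noisy = torch.randint(0, K, (8,))
        loss = forward_corrected_loss(model(x), y_noisy)
        loss.backward()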

    Research Publications

    The following papers focus on open-set reliability:
  • confidence scores make instance-dependent label-noise learning possible (ICML'21, Long Oral)

  • instance-dependent label-noise learning under a structural causal model (NeurIPS'21)

  • tackling instance-dependent label noise via a universal probabilistic model (AAAI'21)

  • learning with group noise (AAAI'21)

  • exploiting class activation value for partial-label learning (ICLR'22)

  • fair classification with instance-dependent label noise (CLeaR'22)

  • learning with mixed closed-set and open-set noisy labels (PAMI'22)

  • a holistic view of label noise transition matrix in deep learning and beyond (ICLR'23, Spotlight)

  • latent class-conditional noise model (PAMI'23)

    The following papers focus on open-set robustness (a minimal detection sketch follows this list):
  • probabilistic margins for instance reweighting in adversarial training (NeurIPS'21)

  • maximum mean discrepancy is aware of adversarial attacks (ICML'21)

  • learning diverse-structured networks for adversarial robustness (ICML'21)

  • geometry-aware instance-reweighted adversarial training (ICLR'21, Oral)

  • adversarial robustness through the lens of causality (ICLR'22)

  • understanding and improving graph injection attack by promoting unnoticeability (ICLR'22)

  • the learnability of OOD detection (NeurIPS'22, Outstanding Paper Award)

  • bilateral dependency optimization against model-inversion attacks (KDD'22)
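
    A recurring test-time question in the robustness theme, made precise by the OOD-detection learnability paper above, is whether an input belongs to any of the known classes at all. A common baseline score is the maximum softmax probability (MSP) of Hendrycks and Gimpel (ICLR'17): low top-class confidence suggests an out-of-distribution input. A minimal sketch, assuming PyTorch; the model and threshold are illustrative, and the baseline is a reference point rather than a method from the papers above.

        import torch
        import torch.nn.functional as F

        @torch.no_grad()
        def msp_score(model, x):
            # Maximum softmax probability: the model's top-class confidence,
            # used as an in-distribution score (higher = more likely known-class).
            probs = F.softmax(model(x), dim=1)
            return probs.max(dim=1).values

        # Toy usage: flag low-confidence inputs as open-set
        # ("none of the known classes").
        model = torch.nn.Linear(5, 3)  # placeholder classifier over 3 known classes
        x = torch.randn(8, 5)
        threshold = 0.5                # illustrative; tuned on held-out data in practice
        print(msp_score(model, x) < threshold)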

    The following papers focus on open-set adaptivity (a minimal clustering sketch follows this list):
  • a one-step approach towards few-shot hypothesis adaptation (NeurIPS'21, Spotlight)

  • universal semi-supervised learning (NeurIPS'21)

  • learning to discover novel classes given very limited data (ICLR'22, Spotlight)

  • diversity-enhancing generative network for few-shot hypothesis adaptation (ICML'23)

  • novel class discovery under unreliable sampling (IJCV'23)
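
    A common starting point for the novel class discovery problem studied above is to embed the unlabeled data with a network pre-trained on the known classes and then cluster the embeddings, treating clusters as candidate novel classes. A minimal sketch with scikit-learn; the features and the cluster count are illustrative assumptions, not the pipeline of any paper above.

        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)

        # Placeholder features; in practice, embeddings from a network
        # trained on the labelled known classes.
        known_feats = rng.normal(0.0, 1.0, size=(100, 16))  # known classes
        novel_feats = rng.normal(5.0, 1.0, size=(40, 16))   # unseen classes
        unlabeled = np.vstack([known_feats[:50], novel_feats])

        # Cluster the unlabeled pool; clusters that align poorly with the
        # known classes are candidate novel classes. The cluster count is
        # assumed known here; estimating it is itself a research question.
        kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(unlabeled)
        print(np.bincount(kmeans.labels_))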

    The following papers focus on automating trustworthy deep learning algorithms (a minimal search sketch follows this list):
  • searching to exploit memorization effect in learning from noisy labels (ICML'20)

  • efficient two-stage evolutionary architecture search (ECCV'22)

  • efficient neural architecture search with local intrinsic dimension (AAAI'23)
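
    The automation theme treats the design choices of a trustworthy learner, such as when to start trusting small-loss samples under the memorization effect, as a search problem. Random search is the usual baseline such methods are compared against; the sketch below uses a placeholder objective standing in for training and validating a model on noisy data, and the search space is an illustrative assumption, not that of any paper above.

        import random

        # Illustrative search space; real spaces cover schedules, architectures
        # or augmentation policies for learning with noisy labels.
        space = {
            "lr": [1e-3, 1e-2, 1e-1],
            "warmup_epochs": [0, 5, 10],  # epochs before small-loss selection kicks in
            "dropout": [0.0, 0.1, 0.3],
        }

        def evaluate(cfg):
            # Placeholder objective; a real search would train on the noisy
            # data and return validation accuracy.
            return -abs(cfg["lr"] - 1e-2) - 0.01 * cfg["warmup_epochs"] + cfg["dropout"]

        random.seed(0)
        best_cfg, best_score = None, float("-inf")
        for _ in range(20):  # fixed trial budget
            cfg = {k: random.choice(v) for k, v in space.items()}
            score = evaluate(cfg)
            if score > best_score:
                best_cfg, best_score = cfg, score
        print(best_cfg, best_score)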

    Software

  • confidence scores make instance-dependent label-noise learning possible, [code]

  • instance-dependent label-noise learning under a structural causal model, [code]

  • tackling instance-dependent label noise via a universal probabilistic model, [code]

  • learning with group noise, [code]

  • exploiting class activation value for partial-label learning, [code]

  • fair classification with instance-dependent label noise, [code]

  • a holistic view of label noise transition matrix in deep learning and beyond, [code]

  • latent class-conditional noise model, [code]

  • probabilistic margins for instance reweighting in adversarial training, [code]

  • maximum mean discrepancy is aware of adversarial attacks, [code]

  • learning diverse-structured networks for adversarial robustness, [code]

  • geometry-aware instance-reweighted adversarial training, [code]

  • adversarial robustness through the lens of causality, [code]

  • understanding and improving graph injection attack by promoting unnoticeability, [code]

  • bilateral dependency optimization against model-inversion attacks, [code]

  • a one-step approach towards few-shot hypothesis adaptation, [code]

  • learning to discover novel classes given very limited data, [code]

  • diversity-enhancing generative network for few-shot hypothesis adaptation, [code]

  • novel class discovery under unreliable sampling, [code]

  • searching to exploit memorization effect in learning from noisy labels, [code]

  • efficient two-stage evolutionary architecture search, [code]

  • efficient neural architecture search with local intrinsic dimension, [code]

    Education

  • UG Course: COMP3057 (2021 Autumn, 2022 Autumn), COMP4015 (2020 Autumn)

  • PG Course: COMP7250 (2021 Spring, 2022 Spring), COMP7160 (2021 Autumn, 2022 Autumn), COMP7180 (2022 Autumn)

  • Tutorial: IJCAI'21 Learning with Noisy Supervision, ACML'21 Learning under Noisy Supervision, CIKM'22 Learning and Mining with Noisy Labels

  • Undergraduate Research Programme (UGRP): Yifeng Chen, Xinyue Hu

  • Summer Undergraduate Research Fellowship: Yifeng Chen, Xinyue Hu

    Collaborators

  • University: Carnegie Mellon University, The University of Texas at Austin, The University of Sydney, The University of Melbourne, The University of Tokyo, Mohamed bin Zayed University of Artificial Intelligence, The Chinese University of Hong Kong, Hong Kong University of Science and Technology, Tsinghua University

  • Institute: RIKEN Center for Advanced Intelligence Project, Max Planck Institute for Intelligent Systems

  • Industry: Microsoft Research, Alibaba Research

    Acknowledgement

    This material is based upon work supported by the RGC under Grant No. 22200720. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the RGC.