Companion repo for robustness specification in biomedical foundation models (BFMs)
Please cite the following preprint for reference:

```bibtex
@misc{xian_robustness_2024,
  address = {Rochester, NY},
  type = {{SSRN} {Scholarly} {Paper}},
  title = {Robustness tests for biomedical foundation models should tailor to specification},
  url = {https://papers.ssrn.com/abstract=5013799},
  doi = {10.2139/ssrn.5013799},
  language = {en},
  urldate = {2024},
  publisher = {Social Science Research Network},
  author = {Xian, Patrick and Baker, Noah R. and David, Tom and Cui, Qiming and Holmgren, A. Jay and Bauer, Stefan and Sushil, Madhumita and Abbasi-Asl, Reza},
  month = jan,
  year = {2024},
  keywords = {AI policy, foundation model, health AI, robustness},
}
```
We carried out the search for BFMs using a few existing GitHub repositories, review papers, and direct internet searches. We selected a total of about 50 representative BFMs (mostly published in 2023-2024) from publications and preprints, covering a broad range of biomedical domains. We then extracted the relevant information on the model name, developers, modality, domain, capabilities, and any robustness tests described for each model. The information is gathered here. In the following, we break down the claimed robustness tests conducted for the BFMs. While about a third of the models have no explicit robustness test, a small number have been subject to multiple ones, which is why the percentages below sum to slightly more than 100%.
- 32% None
- 32% Evaluation on multiple existing datasets (including public datasets used for finetuning)
- 16% Evaluation on artificially shifted datasets (including perturbed and synthetic datasets)
- 8% Evaluation on external datasets (datasets not used in development)
- 8% Ablation studies
- 6% Others
"None" indicates that no robustness test was specified. Most claimed robustness tests for the selected BFMs involve evaluating model performance on some set of datasets, which we divide into three types: existing datasets, artificially shifted datasets, and external datasets; a minimal sketch of the artificially-shifted type follows below.
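To make this test type concrete, here is a minimal sketch of an evaluation on an artificially shifted dataset: held-out inputs are perturbed with additive Gaussian noise at increasing severity, and accuracy is tracked across severities. The names `model`, `x_test`, and `y_test` are hypothetical placeholders for a classifier with a scikit-learn-style `predict` method and its evaluation data; they do not refer to anything in this repo or the paper.

```python
import numpy as np

def accuracy(model, x, y):
    """Fraction of correctly predicted labels."""
    return float(np.mean(model.predict(x) == y))

def noise_robustness_curve(model, x_test, y_test, sigmas=(0.0, 0.05, 0.1, 0.2)):
    """Accuracy under additive Gaussian input noise, one entry per severity level."""
    rng = np.random.default_rng(seed=0)  # fixed seed so the perturbation is reproducible
    curve = {}
    for sigma in sigmas:
        # sigma = 0.0 recovers the unperturbed baseline accuracy
        x_shifted = x_test + rng.normal(loc=0.0, scale=sigma, size=x_test.shape)
        curve[sigma] = accuracy(model, x_shifted, y_test)
    return curve

# Usage (hypothetical): curve = noise_robustness_curve(clf, x_test, y_test)
```

A sharp accuracy drop between adjacent severities flags brittleness to this particular shift; in practice, the perturbation should mimic corruptions plausible for the modality (e.g., scanner noise for imaging or token-level edits for clinical text).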
Below, a combination of theoretical and application-oriented resources on robustness is collected. The categorization of robustness follows that provided in the paper.
Robustness in the context of foundation models
- A.I. Robustness: a Human-Centered Perspective on Technological Challenges and Opportunities, ACM Comput. Surv. (2025)
- Robustness at Inference: Towards Explainability, Uncertainty, and Intervenability, CVPR Tutorial (2024)
- Machine Learning Robustness: A Primer, arXiv:2404.00897
- Spurious Correlations in Machine Learning: A Survey, arXiv:2402.12715
- AI Maintenance: A Robustness Perspective, IEEE Computer (2023)
- Foundational Robustness of Foundation Models, NeurIPS Tutorial (2022)
- A scoping review of robustness concepts for machine learning in healthcare, npj Digit. Med. (2025)
- Toward a framework for risk mitigation of potential misuse of artificial intelligence in biomedical research, Nat. Mach. Intell. (2024)
- SoK: Security and Privacy Risks of Medical AI, arXiv:2409.07415
- Ethical and regulatory challenges of large language models in medicine, Lancet Digit. Health (2024)
- Developing robust benchmarks for driving forward AI innovation in healthcare, Nat. Mach. Intell. (2022)
- Shifting machine learning for healthcare from development to deployment and from models to data, Nat. Biomed. Eng. (2022)
- Secure and Robust Machine Learning for Healthcare: A Survey, IEEE Rev. Biomed. Eng. (2020)
- Prompting is a Double-Edged Sword: Improving Worst-Group Robustness of Foundation Models, ICML (2024)
- Improving Group Robustness on Spurious Correlation Requires Preciser Group Inference, ICML (2024)
- Controllable Prompt Tuning For Balancing Group Distributional Robustness, ICML (2024)
- Multigroup Robustness, ICML (2024)
- Change is Hard: A Closer Look at Subpopulation Shift, ICML (2023)
- Improving Out-of-Distribution Robustness via Selective Augmentation, ICML (2022)
- Just Train Twice: Improving Group Robustness without Training Group Information, ICML (2021)
- No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems, NeurIPS (2020)
- Characterizing Data Point Vulnerability as Average-Case Robustness, UAI (2024)
- Characterizing the Impacts of Instances on Robustness, ACL (2023)
- Achievable distributional robustness when the robust risk is only partially identified, NeurIPS (2024)
- Causality-oriented robustness: exploiting general additive interventions, arXiv:2307.10299
- Certified Robustness Against Natural Language Attacks by Causal Intervention, ICML (2022)
- Towards Causal Representation Learning, Proc. IEEE (2021)
- Provable Guarantees on the Robustness of Decision Rules to Causal Interventions, IJCAI (2021)
- A causal view on robustness of neural networks, NeurIPS (2020)
- Invariance, Causality and Robustness, Statist. Sci. (2020)
- Optimizing Robustness and Accuracy in Mixture of Experts: A Dual-Model Approach, arXiv:2502.06832
- Ensemble everything everywhere: Multi-scale aggregation for adversarial robustness, arXiv:2408.05446
- On the Adversarial Robustness of Mixture of Experts, NeurIPS (2022)
- Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness, ICLR (2025)
- Domain-specific or Uncertainty-aware models: Does it really make a difference for biomedical text classification?, ACL BioNLP Workshop (2024)
- Uncertainty-Aware Pre-Trained Foundation Models for Patient Risk Prediction via Gaussian Process, WWW (2024)
- Pathophysiological Features in Electronic Medical Records Sustain Model Performance under Temporal Dataset Shift, AMIA (2024)
- Temporal Robustness against Data Poisoning, NeurIPS (2023)
- Stable clinical risk prediction against distribution shift in electronic health records, Patterns (2023)
- EHR foundation models improve robustness in the presence of temporal distribution shift, Sci. Rep. (2023)
- DeepJoint: Robust Survival Modelling Under Clinical Presence Shift, NeurIPS Workshop (2022)
- Evaluation of domain generalization and adaptation on improving model robustness to temporal dataset shift in clinical medicine, Sci. Rep. (2022)
- Longitudinal Adversarial Attack on Electronic Health Records Data, WWW (2019)
- Current Pathology Foundation Models are unrobust to Medical Center Differences, arXiv:2501.18055
- Distilling foundation models for robust and efficient models in digital pathology, arXiv:2501.16239
- The Impact of Scanner Domain Shift on Deep Learning Performance in Medical Imaging: an Experimental Study, arXiv:2409.04368
- A multi-center study on the adaptability of a shared foundation model for electronic health records, npj Digit. Med. (2024)
- Enhancing Robustness of Foundation Model Representations under Provenance-related Distribution Shifts, NeurIPS DistShift Workshop (2023)
- Prompt injection attacks on vision language models in oncology, Nat. Commun. (2025)
- Medical large language models are susceptible to targeted misinformation attacks, npj Digit. Med. (2024)
- PromptSmooth: Certifying Robustness of Medical Vision-Language Models via Prompt Learning, MICCAI (2024)
- BAPLe: Backdoor Attacks on Medical Foundational Models using Prompt Learning, MICCAI (2024)
- Large Diverse Ensembles for Robust Clinical NLI, SemEval (2024)
- MedFuzz: Exploring the Robustness of Large Language Models in Medical Question Answering, arXiv:2406.06573
- Adversarial Attacks on Large Language Models in Medicine, arXiv:2406.12259
- Poisoning medical knowledge using large language models, Nat. Mach. Intell. (2024)
- Backdoor Attack on Unpaired Medical Image-Text Foundation Models: A Pilot Study on MedCLIP, SaTML (2024)
- Assessing biomedical knowledge robustness in large language models by query-efficient sampling attacks, TMLR (2024)
- Demonstration of an Adversarial Attack Against a Multimodal Vision Language Model for Pathology Imaging, ISBI (2024)
- Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging, Nat. Biomed. Eng. (2023)
- An Auditing Test To Detect Behavioral Shift in Language Models, ICLR (2025)
- Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs, arXiv:2407.15549
- Robust Conversational Agents against Imperceptible Toxicity Triggers, NAACL (2022)
- Evaluating the Robustness of Adverse Drug Event Classification Models using Templates, ACL BioNLP Workshop (2024)
- Incorporating Data Augmentation with Generative Models and Biomedical Knowledge to Enhance Inference Robustness, SemEval (2024)
- Evaluating the Robustness of Biomedical Concept Normalization, NeurIPS TLNLP Workshop (2022)
- Improving the robustness and accuracy of biomedical language models through adversarial training, J. Biomed. Inform. (2022)
- Adversarial attacks in radiology – A systematic review, Eur. J. Radiol. (2023)
- Adversarial attacks and adversarial robustness in computational pathology, Nat. Commun. (2022)
- Advancing diagnostic performance and clinical usability of neural networks via adversarial training and dual batch normalization, Nat. Commun. (2021)
- Adversarial attacks on medical machine learning, Science (2019)
- SURE-VQA: Systematic Understanding of Robustness Evaluation in Medical VQA Tasks, arXiv:2411.19688
- Scalable Drift Monitoring in Medical Imaging AI, arXiv:2410.13174
- The Data Addition Dilemma, arXiv:2408.04154
- Empirical data drift detection experiments on real-world medical imaging data, Nat. Commun. (2024)
- Off-label use of artificial intelligence models in healthcare, Nat. Med. (2024)
- Understanding Liability Risk from Using Health Care Artificial Intelligence Tools, N. Engl. J. Med. (2024)
- Can You Rely on Your Model Evaluation? Improving Model Evaluation with Synthetic Test Data, NeurIPS (2023)
- External validation of AI models in health should be replaced with recurring local validation, Nat. Med. (2023)
- Diagnosing and remediating harmful data shifts for the responsible deployment of clinical AI models, medRxiv:2023.03.26.23286718
- Evaluating Robustness to Dataset Shift via Parametric Robustness Sets, NeurIPS (2022)
- A Fine-Grained Analysis on Distribution Shift, ICLR (2022)
- Mandoline: Model Evaluation under Distribution Shift, ICML (2021)