
bfm-robust

Companion repo for robustness specification in biomedical foundation models (BFMs)

Please cite the following preprint for reference:

@misc{xian_robustness_2024,
    address = {Rochester, NY},
    type = {{SSRN} {Scholarly} {Paper}},
    title = {Robustness tests for biomedical foundation models should tailor to specification},
    url = {https://papers.ssrn.com/abstract=5013799},
    doi = {10.2139/ssrn.5013799},
    language = {en},
    urldate = {2024},
    publisher = {Social Science Research Network},
    author = {Xian, Patrick and Baker, Noah R. and David, Tom and Cui, Qiming and Holmgren, A. Jay and Bauer, Stefan and Sushil, Madhumita and Abbasi-Asl, Reza},
    month = jan,
    year = {2024},
    keywords = {AI policy, foundation model, health AI, robustness},
}

Robustness tests in existing BFMs

We carried out a search for BFMs across existing GitHub repositories, review papers, and the internet. We selected a total of about 50 representative BFMs (mostly published in 2023-2024) from publications and preprints, covering a broad range of biomedical domains. For each model, we then extracted the relevant information on the model name, developers, modality, domain, capabilities, and any robustness tests that have been described. The information is gathered here. Below, we break down the claimed robustness tests conducted for the BFMs. About a third of the models report no explicit robustness test, while a small number of models have been subject to multiple tests.

32%   None
32%   Evaluation on multiple existing datasets (including public datasets used for finetuning)
16%   Evaluation on artificially shifted datasets (including perturbed and synthetic datasets)
8%   Evaluation on external datasets (datasets not used in development)
8%   Ablation studies
6%   Others

None indicates that no robustness test was specified. Most claimed robustness tests in the selected BFMs involve evaluating model performance on some datasets, which we divide into three types: existing datasets, artificially shifted datasets, and external datasets.
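
The breakdown above can be reproduced by tallying the test categories recorded for each surveyed model. The snippet below is a minimal sketch, not part of this repo; it assumes a hypothetical CSV export of the survey table (bfm_survey.csv) with one row per model and a semicolon-separated robustness_tests column. Because some models report multiple tests and others report none, the shares need not sum to 100%.

    # Minimal sketch (assumed file and column names, not part of the repo):
    # tally the claimed robustness-test types across the surveyed BFMs.
    import csv
    from collections import Counter

    def tally_robustness_tests(path="bfm_survey.csv"):
        counts = Counter()
        total = 0
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                total += 1
                tests = [t.strip() for t in row["robustness_tests"].split(";") if t.strip()]
                # A model with no listed test counts toward "None";
                # a model with several tests contributes to multiple categories.
                for test in tests or ["None"]:
                    counts[test] += 1
        return {test: round(100 * n / total) for test, n in counts.most_common()}

    if __name__ == "__main__":
        for test, pct in tally_robustness_tests().items():
            print(f"{pct:>3}%  {test}")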

Robustness categorization and examples

A combination of theoretical and application-oriented resources on robustness is collected below. The robustness categorization follows that provided in the paper.

Surveys, perspectives & tutorials (general domains)

Robustness in the context of foundation models

Surveys, perspectives & tutorials (biomedical domains)

Group robustness

Instance-wise/Individual robustness

Interventional robustness

Aggregated robustness

Uncertainty awareness & Uncertainty-aware robustness

Longitudinal/Temporal robustness

Vendor/Acquisition-shift robustness

Knowledge robustness

Behavioral robustness

Pre-BFM adversarial robustness (language)

Pre-BFM adversarial robustness (vision)

Robustness evaluation & monitoring
