Commit 0ddf884

Author: Tianyi Zhou
Commit message: add code
1 parent b620cef, commit 0ddf884

1 file changed (+3 -3 lines)

README.md (+3 -3)
@@ -35,14 +35,14 @@ A good teacher can adjust the curriculum based on students' learning history. By
### Instructions
- For now, we keep all the DIHCL code in `dihcl.py`. It supports multiple datasets and models. You can add your own options.
- Example scripts to run DIHCL on CIFAR10/100 for training WideResNet-28-10 can be found in `dihcl_cifar.sh`.
-- Three types of DIH metrics in the paper are supported, i.e., the loss, the change of loss, and the flip of prediction correctness. Set or remove `--use_loss_as_feedback` to switch between the former two. The prediction flip is binary feedback, so we automatically use it when `--bandits_alg 'TS'` (Thompson sampling). The discounting factor of the exponential moving average can be set via `--ema_decay 0.9`.
+- Three types of DIH metrics in the paper are supported, i.e., the loss, the change of loss, and the flip of prediction correctness. Add or remove `--use_loss_as_feedback` to switch between the former two. The prediction flip is binary feedback, so we automatically use it when `--bandits_alg 'TS'` (Thompson sampling). The discounting factor of the exponential moving average can be set via `--ema_decay 0.9`.
- A variety of multi-armed bandit algorithms can be used to encourage exploration of less-selected samples (for better estimation of their DIH) when sampling data with large DIH. We currently support `--bandits_alg 'UCB'`, `--bandits_alg 'EXP3'`, and `--bandits_alg 'TS'` (Thompson sampling).
- We apply multiple episodes of training epochs, each following a cosine annealing learning rate schedule that decreases from `--lr_max` to `--lr_min`. The episode boundaries are given as epoch numbers, for example, `--epochs 300 --schedule 0 5 10 15 20 30 40 60 90 140 210 300`.
- DIHCL reduces the selected subset's size over the training episodes, starting from n (the total number of training samples). Control how the size shrinks with, for example, `--k 1.0 --dk 0.1 --mk 0.3`, which starts from a subset of size (k * n) and multiplies it by (1 - dk) each episode until it reaches (mk * n).
-- To further reduce the subset below n in earlier epochs and save more computation, set `--use_centrality` to further prune the DIH-selected subset to a few diverse and representative samples according to samples' centrality (defined on pairwise similarity between samples). Set the corresponding selection ratio and how do you want to change the ratio every episode, for example, `--select_ratio 0.5 --select_ratio_rate 1.1` will further reduce the DIH-selected subset to be half size in the first non-warm starting episode and then multiply the ratio by 1.1 for every future episode until selection_ratio = 1.
+- To further reduce the subset below n in earlier epochs and save more computation, add `--use_centrality` to further prune the DIH-selected subset to a few diverse and representative samples according to samples' centrality (defined on pairwise similarity between samples). Set the corresponding selection ratio and how you want to change the ratio every episode, for example, `--select_ratio 0.5 --select_ratio_rate 1.1` will further reduce the DIH-selected subset to half its size in the first non-warm-starting episode and then multiply this ratio by 1.1 for every future episode until selection_ratio = 1.
- Centrality is an alternative to the facility location function used in the paper to encourage diversity. The latter requires an external submodular maximization library and extra computation compared to the centrality used here. We may add the option of submodular maximization in the future, but centrality performs well enough on most tested tasks.
- Self-supervised learning may help in some scenarios. Two types of self-supervision regularizations are supported, i.e., `--consistency` and `--contrastive`.
-- If one is interested in trying DIHCL on noisy-label learning (though not the focus of the paper), set `--use_noisylabel` and specify the noise type and ratio using `--label_noise_type` and `--label_noise_rate`.
+- If one is interested in trying DIHCL on noisy-label learning (though not the focus of the paper), add `--use_noisylabel` and specify the noise type and ratio using `--label_noise_type` and `--label_noise_rate`.

<b>License</b>\
This project is licensed under the terms of the MIT license.
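
For concreteness, here is a minimal sketch (not the repository's actual code) of how the per-sample DIH estimates described in the diff above can be maintained as an exponential moving average of training feedback; the function name `update_dih` and the toy data are made up for illustration, and the EMA convention shown is simply the usual one.

```python
import numpy as np

def update_dih(dih, feedback, ema_decay=0.9):
    # Discounted running average of per-sample feedback (one common EMA
    # convention; dihcl.py's exact update may differ).
    return ema_decay * dih + (1.0 - ema_decay) * feedback

n = 8                                      # toy training set of 8 samples
dih = np.zeros(n)                          # one DIH estimate per sample
loss = np.random.rand(n)                   # per-sample loss from the current epoch
dih = update_dih(dih, loss)                # loss as feedback (--use_loss_as_feedback)
flip = np.random.randint(0, 2, size=n)     # 0/1 prediction-correctness flips
dih = update_dih(dih, flip.astype(float))  # binary feedback, as used with --bandits_alg 'TS'
print(dih)
```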
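
The exploration encouraged by the bandit options (`--bandits_alg 'UCB'`, `'EXP3'`, `'TS'`) can be pictured with a standard UCB-style score; the exact formula in `dihcl.py` may differ, so treat this only as an illustration of why rarely selected samples get re-examined even if their current DIH estimate is not the largest.

```python
import numpy as np

def ucb_scores(dih, times_selected, round_t, c=1.0):
    # Samples with large DIH are preferred, but rarely selected samples get
    # an exploration bonus so their DIH estimates stay up to date.
    bonus = c * np.sqrt(np.log(round_t + 1.0) / (times_selected + 1e-8))
    return dih + bonus

dih = np.array([0.8, 0.2, 0.5, 0.5])
times_selected = np.array([10.0, 10.0, 1.0, 50.0])
print(ucb_scores(dih, times_selected, round_t=60))
# The third sample gets a large bonus despite its middling DIH estimate.
```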
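
The per-episode cosine annealing between `--lr_max` and `--lr_min` follows the usual half-cosine form; the values below are placeholders for illustration, not the defaults of `dihcl.py`.

```python
import math

def cosine_lr(epoch_in_episode, episode_length, lr_max=0.1, lr_min=0.001):
    # Half-cosine decay from lr_max down to lr_min within one episode.
    progress = epoch_in_episode / max(episode_length - 1, 1)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))

# With --schedule 0 5 10 ..., the first episode spans epochs 0-4:
print([round(cosine_lr(e, 5), 4) for e in range(5)])
# -> [0.1, 0.0855, 0.0505, 0.0155, 0.001]
```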
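
The `--k/--dk/--mk` subset-size schedule is easy to work out by hand; the sketch below assumes n = 50000 (CIFAR-scale) purely for illustration.

```python
def subset_sizes(n, k=1.0, dk=0.1, mk=0.3, episodes=15):
    # Start from k*n samples, shrink by a factor (1 - dk) each episode,
    # and never go below the floor mk*n.
    size = k * n
    sizes = []
    for _ in range(episodes):
        sizes.append(int(size))
        size = max(size * (1.0 - dk), mk * n)
    return sizes

print(subset_sizes(n=50000))
# Starts at 50000, shrinks by 10% per episode, and bottoms out at 15000.
```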
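
The centrality-based pruning behind `--use_centrality`, `--select_ratio`, and `--select_ratio_rate` can be sketched as follows; dot-product similarity and the keep-top-fraction rule are assumptions made only for this example, since the README states only that centrality is defined on pairwise similarity between samples.

```python
import numpy as np

def prune_by_centrality(features, indices, select_ratio):
    # Centrality is taken here as a sample's total similarity to the rest of
    # the subset; keep the most central select_ratio fraction of samples.
    sims = features @ features.T
    centrality = sims.sum(axis=1)
    keep = max(1, int(select_ratio * len(indices)))
    order = np.argsort(-centrality)
    return indices[order[:keep]]

rng = np.random.default_rng(0)
feats = rng.normal(size=(10, 4))     # features of the DIH-selected samples
subset = np.arange(10)               # their indices in the training set
print(prune_by_centrality(feats, subset, select_ratio=0.5))
# With --select_ratio_rate 1.1 the kept fraction then grows each episode:
# 0.5, 0.55, 0.605, ... until it is capped at 1.
```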
