Commit 9e3b9bb

docs: add missing updates from final_polish in ulf

1 parent 1ba64ea commit 9e3b9bb

File tree

6 files changed: +73 −54 lines changed

README.md (+26 −12)
@@ -38,7 +38,7 @@ We provide several variants for each of the components in the unlearning pipelin
 | **Unlearning Methods** | GradAscent, GradDiff, NPO, SimNPO, DPO |
 | **Evaluation Metrics** | Verbatim Probability, Verbatim ROUGE, QA-ROUGE, MIA Attacks, TruthRatio, Model Utility |
 | **Datasets** | MUSE-News (BBC), MUSE-Books (Harry Potter), TOFU (different splits) |
-| **Model Families** | LLaMA 3.2, LLaMA 3.1, LLaMA-2, Phi-3.5, ICLM (from MUSE), Phi-1.5, Gemma |
+| **Model Families** | TOFU: LLaMA-3.2, LLaMA-3.1, LLaMA-2; MUSE: LLaMA-2, ICLM; Additional: Phi-3.5, Phi-1.5, Gemma |

 ---

@@ -48,14 +48,17 @@ We provide several variants for each of the components in the unlearning pipelin
 - [Quickstart](#-quickstart)
 - 🛠️ [Environment Setup](#-environment-setup)
 - 💾 [Data Setup](#-data-setup)
-- 📜 [Running Baseline Experiments](#-running-baseline-experiments)
+- 🔄 [Updated TOFU benchmark](#-updated-tofu-benchmark)
 - 🧪 [Running Experiments](#-running-experiments)
 - 🚀 [Perform Unlearning](#-perform-unlearning)
 - 📊 [Perform an Evaluation](#-perform-an-evaluation)
+- 📜 [Running Baseline Experiments](#-running-baseline-experiments)
 - [How to Add New Components](#-how-to-add-new-components)
 - 📚 [Further Documentation](#-further-documentation)
 - 🔗 [Support & Contributors](#-support--contributors)
-- 📝 [Citation](#-citation)
+- 📝 [Citing this work](#-citing-this-work)
+- 🤝 [Acknowledgments](#-acknowledgments)
+- 📄 [License](#-license)

 ---

@@ -79,6 +82,14 @@ python setup_data.py # populates saves/eval with evaluation results of the uploa

 ---

+### 🔄 Updated TOFU benchmark
+
+We've updated Open-Unlearning's TOFU benchmark target models to use a wider variety of newer architectures with sizes varying from 1B to 8B. These include LLaMA 3.2 1B, LLaMA 3.2 3B, LLaMA 3.1 8B, and the original LLaMA-2 7B from [the old version of TOFU](https://github.com/locuslab/tofu).
+
+For each architecture, we have finetuned models on four different splits of the TOFU dataset: `full`, `retain90`, `retain95`, and `retain99`, for a total of 16 finetuned models. The `full` model serves as the target (the base model for unlearning), and the rest are retain models used as reference points for each forget split. These models are on [HuggingFace](https://huggingface.co/collections/open-unlearning/tofu-new-models-67bcf636334ea81727573a9f0), and the paths to these models can be set in the experiment configs or in command-line overrides.
+
+---
+
 ## 🧪 Running Experiments

 We provide an easily configurable interface for running evaluations by leveraging Hydra configs. For more detailed documentation of aspects like running experiments, commonly overridden arguments, interfacing with configurations, distributed training, and simple finetuning of models, refer to [`docs/experiments.md`](docs/experiments.md).
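For instance, one of the new target models can be plugged straight into the evaluation command below; a minimal sketch, assuming the `open-unlearning/tofu_Llama-3.2-1B-Instruct_full` HuggingFace ID shown in `docs/components.md` further down (the `task_name` is illustrative):

```bash
# Evaluate one of the updated TOFU target models via a command-line override
python src/eval.py --config-name=eval.yaml experiment=eval/tofu/default \
  model=Llama-3.2-1B-Instruct \
  model.model_args.pretrained_model_name_or_path=open-unlearning/tofu_Llama-3.2-1B-Instruct_full \
  task_name=TOFU_TARGET_EVAL
```
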
@@ -107,12 +118,13 @@ python src/eval.py --config-name=eval.yaml experiment=eval/tofu/default \
   task_name=SAMPLE_EVAL
 ```

-- `experiment`-Path to the evaluation configuration [`configs/experiment/eval/tofu/default.yaml`](configs/experiment/eval/tofu/default.yaml).
+- `experiment`- Path to the evaluation configuration [`configs/experiment/eval/tofu/default.yaml`](configs/experiment/eval/tofu/default.yaml).
 - `model`- Sets up the model and tokenizer configs for the `Llama-3.2-1B-Instruct` model.
 - `model.model_args.pretrained_model_name_or_path`- Overrides the default experiment config to evaluate a model from a HuggingFace ID (can use a local model checkpoint path as well).

 For more details about creating and running evaluations, refer to [`docs/evaluation.md`](docs/evaluation.md).

+
 ### 📜 Running Baseline Experiments
 The scripts below execute standard baseline unlearning experiments on the TOFU and MUSE datasets, evaluated using their corresponding benchmarks. The expected results for these are in [`docs/results.md`](docs/results.md).

@@ -130,7 +142,7 @@ Adding a new component (trainer, evaluation metric, benchmark, model, or dataset
 Please feel free to raise a pull request for any new features after setting up the environment in development mode.

 ```bash
-pip install .[flash-attn, dev]
+pip install .[dev]
 ```

 ## 📚 Further Documentation
@@ -152,11 +164,7 @@ Developed and maintained by Vineeth Dorna ([@Dornavineeth](https://github.com/Do

 If you encounter any issues or have questions, feel free to raise an issue in the repository 🛠️.

-## 📝 Citation
-
-This repo is inspired from [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory). We acknowledge the [TOFU](https://github.com/locuslab/tofu) and [MUSE](https://github.com/jaechan-repo/muse_bench) benchmarks, which served as the foundation for our re-implementation.
-
----
+## 📝 Citing this work

 If you use OpenUnlearning in your research, please cite:

@@ -176,7 +184,7 @@ If you use OpenUnlearning in your research, please cite:
 }
 ```
 <details>
-<summary>To cite other benchmarks used from OpenUnlearning</summary>
+<summary>Expand for bibtex to cite other benchmarks used from OpenUnlearning</summary>

 ```bibtex
 @article{shi2024muse,
@@ -188,8 +196,14 @@ If you use OpenUnlearning in your research, please cite:
 ```
 </details>

+---
+
+### 🤝 Acknowledgments
+
+- This repo is inspired by [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory).
+- The [TOFU](https://github.com/locuslab/tofu) and [MUSE](https://github.com/jaechan-repo/muse_bench) benchmarks served as the foundation for our re-implementation.

 ---

-## 📄 License
+### 📄 License
 This project is licensed under the MIT License. See the [`LICENSE`](LICENSE) file for details.

docs/components.md (+8 −3)
@@ -142,10 +142,10 @@ A benchmark aggregates various evaluation metrics into a suite, e.g. TOFU, MUSE

 ## Model

-To add a new model:
+To add a new model architecture:

 ### Implement and register a handler
-For all the models currently supported, HuggingFace's `AutoModelForCausalLM` and `AutoTokenizer` are used, and therefore the user doesn't need to add or register any handler.
+For all the models currently supported, HuggingFace's `AutoModelForCausalLM` and `AutoTokenizer` are used, and therefore the user doesn't need to create or register any handler.

 __Note__: Currently, we do not support loading models modified with LoRA and related variants. If you wish to use such features, please define and register model handlers for this logic in [`src/model`](../src/model) and provide the config info as discussed next.

@@ -233,7 +233,12 @@ defaults: # load pre-defined configs for model, trainer, data format, datasets e
   - override /eval: tofu

 # Now, we have to further modify specific arguments from the defaults imported above
-# This enables to easily run multiple experiments varying hyper paramters, data splits, models etc
+# This enables easily running multiple experiments varying hyperparameters, data splits, models etc.
+
+model:
+  model_args: # use our finetuned target models for the TOFU benchmark task
+    pretrained_model_name_or_path: open-unlearning/tofu_Llama-3.2-1B-Instruct_full
+
 forget_split: forget10
 retain_split: retain90
 retain_logs_path: null
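The same override can be passed on the command line instead of being baked into the experiment config; a minimal sketch, assuming the `unlearn/tofu/default` experiment used elsewhere in these docs (the `task_name` is illustrative):

```bash
# Command-line equivalent of the model/model_args override added above
python src/train.py --config-name=unlearn.yaml experiment=unlearn/tofu/default \
  model.model_args.pretrained_model_name_or_path=open-unlearning/tofu_Llama-3.2-1B-Instruct_full \
  forget_split=forget10 retain_split=retain90 \
  task_name=TOFU_UNLEARN_LLAMA32_1B
```
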

docs/evaluation.md (+6 −6)
@@ -18,9 +18,9 @@ python src/eval.py --config-name=eval.yaml \
   model.model_args.pretrained_model_name_or_path=<LOCAL_MODEL_PATH> \
   task_name=SAMPLE_EVAL
 ```
-- `--config-name=eval.yaml`-sets task to be [`configs/eval.yaml`](../configs/eval.yaml)
-- `experiment=eval/tofu/default`-set experiment to use [`configs/eval/tofu/default.yaml`](../configs/eval/tofu/default.yaml)
-- `model=Llama-3.2-3B-Instruct`-override the default (`Llama-3.2-1B-Instruct`) model config to use [`configs/model/Llama-3.2-3B-Instruct`](../configs/model/Phi-3.5-mini-instruct.yaml).
+- `--config-name=eval.yaml`- sets task to be [`configs/eval.yaml`](../configs/eval.yaml)
+- `experiment=eval/tofu/default`- set experiment to use [`configs/eval/tofu/default.yaml`](../configs/eval/tofu/default.yaml)
+- `model=Llama-3.2-3B-Instruct`- override the default (`Llama-3.2-1B-Instruct`) model config to use [`configs/model/Llama-3.2-3B-Instruct.yaml`](../configs/model/Llama-3.2-3B-Instruct.yaml).


 Run the MUSE-Books benchmark evaluation on a checkpoint of a Phi-3.5 model:
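A plausible shape for that command, assuming the `eval/muse/default` experiment, the `data_split` variable, and the `Phi-3.5-mini-instruct` model config referenced elsewhere in these docs (a sketch, not the exact command from the file):

```bash
# Hypothetical MUSE-Books evaluation of a local Phi-3.5 checkpoint
python src/eval.py experiment=eval/muse/default data_split=Books \
  model=Phi-3.5-mini-instruct \
  model.model_args.pretrained_model_name_or_path=<CHECKPOINT_PATH> \
  task_name=SAMPLE_EVAL
```
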
@@ -80,12 +80,12 @@ def forget_quality(model, **kwargs):
     return {"agg_value": pvalue}

 ```
-- `@unlearning_metric(name="rouge")`-Defines a `rouge` handler.
+- `@unlearning_metric(name="rouge")` - Defines a `rouge` handler.

 #### 2. Register the metric handler
 Register the handler to link the class to the configs via the class name in [`METRIC_REGISTRY`](../src/evals/metrics/__init__.py).

-Example: Registering `rouge` handler
+Example: Registering the `rouge` handler

 ```python
 from evals.metrics.memorization import rouge
@@ -150,7 +150,7 @@ reference_logs:
     forget_truth_ratio: # keys to include from the logs
       access_key: retain # name of the key to access it inside metric

-# since the forget_quality metric depends on another metric, truth ratio
+# since the forget_quality metric depends on another metric (truth ratio)
 pre_compute:
   forget_truth_ratio:
     access_key: forget

docs/experiments.md (+14 −18)
@@ -14,35 +14,31 @@ At the core, three main Hydra configs—`train.yaml` (generic training), `eval.y
 ---

 ### Table of Contents
-- [Configuring and running experiments](#configuring-and-running-experiments)
-  - [Overview](#overview)
-  - [Table of Contents](#table-of-contents)
-  - [Example Commands](#example-commands)
-  - [Commonly Overridden Arguments](#commonly-overridden-arguments)
-    - [Model Settings](#model-settings)
-    - [Trainer Settings](#trainer-settings)
-    - [Data Settings](#data-settings)
-    - [Experiment Settings](#experiment-settings)
-  - [Simple Finetuning](#simple-finetuning)
-  - [Distributed Training](#distributed-training)
+- [Overview](#overview)
+- [Table of Contents](#table-of-contents)
+- [Example Commands](#example-commands)
+- [Commonly Overridden Arguments](#commonly-overridden-arguments)
+  - [Model Settings](#model-settings)
+  - [Trainer Settings](#trainer-settings)
+  - [Data Settings](#data-settings)
+  - [Experiment Settings](#experiment-settings)
+- [Simple Finetuning](#simple-finetuning)
+- [Distributed Training](#distributed-training)

 ---

 ## Example Commands

 ```bash
 ## runs a finetuning using experiment details from configs/finetune/tofu/default.yaml
-python src/train.py --config-name=train.yaml experiment=finetune/tofu/default \
-  task_name=SAMPLE_TRAIN
+python src/train.py --config-name=train.yaml experiment=finetune/tofu/default task_name=SAMPLE_TRAIN

 ## runs an unlearning training using experiment details from configs/unlearn/tofu/default.yaml
-python src/train.py --config-name=unlearn.yaml experiment=unlearn/tofu/default \
-  task_name=SAMPLE_TRAIN
+python src/train.py --config-name=unlearn.yaml experiment=unlearn/tofu/default task_name=SAMPLE_TRAIN


 ## runs an evaluation using experiment details from configs/eval/muse/default.yaml
-python src/eval.py --config-name=eval.yaml experiment=eval/muse/default \
-  task_name=SAMPLE_EVAL
+python src/eval.py --config-name=eval.yaml experiment=eval/muse/default task_name=SAMPLE_EVAL
 ## Note: eval.yaml is the default config set in src/eval.py, so this argument can be omitted

 ## an extensively filled out configuration for an unlearning experiment
@@ -249,5 +245,5 @@ CUDA_VISIBLE_DEVICES=0,1 accelerate launch \
 **Note:** Evaluation runs are designed to work on only a single GPU (this includes running evaluation during training). To run an evaluation job, modify your command to make only one GPU visible (assuming one GPU is enough for inference):

 ```bash
-CUDA_VISIBLE_DEVICES=0 python src/eval.py experiment=eval/muse/default.yaml task_name=DISTRIBUTED_EVAL
+CUDA_VISIBLE_DEVICES=0 python src/eval.py experiment=eval/muse/default.yaml task_name=SAMPLE_EVAL
 ```

docs/hydra.md (+11 −8)
@@ -14,6 +14,7 @@ defaults:
   # , setting up data structure for loading data during unlearning
   - override /eval: muse # loads MUSE evaluation suite from eval/muse.yaml into the eval attribute

+# define variables
 data_split: News
 forget_split: forget
 retain_split: retain1
@@ -54,23 +55,25 @@ trainer:
     # optim: paged_adamw_32bit
     # optim: adamw_torch

-task_name: ???
+task_name: ??? # ??? raises an error if this attribute is not set
 ```
+- **Structure & Attribute Access:** Configs are written in YAML and structured hierarchically like a dictionary. Attributes are accessed using dot notation: in code, `cfg.model.args.learning_rate`; on the command line, `model.args.learning_rate=1e-5`.

-- **Defaults & Overrides:** Base configurations are overridden using the `defaults` list.
+- **Defaults & Overrides:** Config files are included in one another using the `defaults` list and `override` directives.
+
+- **Command-Line Overrides:** Any parameter can be overridden directly from the command line. For instance:
+  ```bash
+  python src/train.py --config-name=unlearn.yaml experiment=unlearn/muse/default \
+    trainer.args.num_train_epochs=50 data_split=Books trainer=SimNPO trainer.method_args.beta=3 \
+    task_name=unlearn_muse_simnpo
+  ```

 - **Package Directives:** The `# @package` directive organizes configurations into namespaces for cleaner composition and specifies the configuration path. At the head of a YAML file, you might see directives like `# @package _global_` or more specific ones such as `# @package eval.muse.metrics.forget_knowmem_ROUGE` which inform Hydra exactly where the configuration parameters should be placed within the final composed config.

   For example, refer to [`configs/eval/muse_metrics/forget_knowmem_ROUGE.yaml`](../configs/eval/muse_metrics/forget_knowmem_ROUGE.yaml)

 - **Variable Substitution:** Variables are defined once and reused using the `${}` syntax:

-- **Command-Line Overrides:** Any parameter can be overridden directly from the command line. For instance:
-  ```bash
-  python src/train.py --config-name=unlearn.yaml experiment=unlearn/muse/default \
-    trainer.args.num_train_epochs=50 data_split=Books trainer=SimNPO trainer.method_args.
-  ```
-

 To understand the structure of an evaluation config and the available parameters for overriding, refer to: [`configs/experiment/examples/tofu_eval.yaml`](../configs/experiment/examples/tofu_eval.yaml).

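Since experiment configs define variables like `data_split` once and interpolate them with `${data_split}` wherever they are needed, a single command-line override updates every field that references the variable; a minimal sketch under that assumption (the `task_name` is illustrative):

```bash
# Overriding data_split once updates every config field that interpolates ${data_split}
python src/eval.py experiment=eval/muse/default data_split=Books task_name=MUSE_BOOKS_EVAL
```
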
docs/results.md (+8 −7)
@@ -4,7 +4,7 @@

 </div>

-The scripts below execute standard baseline unlearning experiments on the TOFU and MUSE datasets, evaluated using their corresponding benchmarks.
+The scripts below execute standard baseline unlearning experiments on the TOFU and MUSE datasets, evaluated using their corresponding benchmarks.
 ```bash
 bash scripts/tofu_unlearn.sh
 bash scripts/muse_unlearn.sh
@@ -27,7 +27,8 @@ __Note:__
 2. NPO in MUSE: for NPO, the MUSE implementation is inconsistent with the [original paper](https://github.com/licong-lin/negative-preference-optimization) as discussed [here](https://github.com/jaechan-repo/muse_bench/issues/2). This inconsistency is carried over into implementations like [SimNPO](https://github.com/OPTML-Group/Unlearn-Simple/issues/5). Here, we use the original NPO implementation with the same loss function expression across datasets.


-### TOFU unlearning on `Llama-2-7b-hf-chat`
+
+### TOFU unlearning on the `Llama-2-7b-hf-chat` architecture

 <div style="overflow-x: auto; max-width: 100%;">
 <table class="dataframe">
@@ -144,7 +145,7 @@ __Note:__
 </div>


-### TOFU unlearning on `Llama-3.2-1B-Instruct`
+### TOFU unlearning on the `Llama-3.2-1B-Instruct` architecture

 <div style="overflow-x: auto; max-width: 100%;">
 <table class="dataframe">
@@ -261,7 +262,7 @@ __Note:__
 </div>


-### MUSE unlearning on `Llama-2-7b-hf`
+### MUSE unlearning on the benchmark's target models

 <div style="overflow-x: auto; max-width: 100%;">
 <table class="dataframe">
@@ -299,11 +300,11 @@ __Note:__
       <th>Retain</th>
       <td>0.33</td>
       <td>0.21</td>
-      <td>0.0</td>
+      <td>0</td>
       <td>0.56</td>
       <td>0.3</td>
       <td>0.14</td>
-      <td>0.0</td>
+      <td>0</td>
       <td>0.69</td>
     </tr>
     <tr>
@@ -355,4 +356,4 @@ __Note:__
     </tr>
   </tbody>
 </table>
-</div>
+</div>
