
TrainingArguments missing 'mp_plugin' #768

Closed · 2 of 4 tasks
eyal-converge opened this issue Jan 27, 2025 · 7 comments · Fixed by #794

Assignees: michaelbenayoun
Labels: bug (Something isn't working)

Comments

@eyal-converge commented Jan 27, 2025

System Info

- I'm using the recommended Hugging Face Neuron Deep Learning AMI 
- Using `trn1.2xlarge`

Who can help?

ping @michaelbenayoun 😉

I'm getting `AttributeError: 'TrainingArguments' object has no attribute 'mp_plugin'` with a simple training script.

train.py:

```python
import os

from datasets import load_from_disk
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from optimum.neuron import NeuronTrainer as Trainer
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    DataCollatorWithPadding,
)

def compute_metrics(eval_pred):
    """Compute metrics for evaluation"""
    predictions, labels = eval_pred
    predictions = predictions.argmax(-1)

    accuracy = accuracy_score(labels, predictions)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, predictions, average="macro", zero_division=0)

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}



print("Loading datasets...")
train_dataset = load_from_disk(x)
eval_dataset = load_from_disk(x)


print("Loading tokenizer and model...")
tokenizer = AutoTokenizer.from_pretrained(
    "x", trust_remote_code=True, token=os.environ["HF_TOKEN"]
)

num_labels = 2
print(f"Number of labels: {num_labels}")

model = AutoModelForSequenceClassification.from_pretrained(
    "x",
    trust_remote_code=True,
    token=os.environ["HF_TOKEN"],
    num_labels=num_labels,
    force_download=True,
)


run_name = f"x"

training_args = TrainingArguments(
    run_name=run_name,
    output_dir="x",  # all argument values redacted
    num_train_epochs="x",
    evaluation_strategy="x",
    logging_strategy="x",
    save_strategy="x",
    save_total_limit="x",
    per_device_train_batch_size="x",
    per_device_eval_batch_size="x",
    learning_rate="x",
    weight_decay="x",
    warmup_ratio="x",
    metric_for_best_model="x",
    greater_is_better="x",
    report_to="x",
)


trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer, padding=True, max_length=20000),
)

trainer.train()
```

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction (minimal, reproducible, runnable)

Looking at the optimum source code, this makes total sense. From optimum/neuron/trainers.py at line 292:

```python
self.accelerator = NeuronAccelerator(
    *args,
    mp_plugin=self.args.mp_plugin,
    zero_1=self.args.zero_1,
    mixed_precision="bf16" if self.args.bf16 else "no",
    autocast_backend=self.args.half_precision_backend,
)
```

Since `self.args` is of type `TrainingArguments`, which doesn't have an `mp_plugin` attribute, this call fails.
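
The root cause can be reproduced in isolation (a minimal sketch; the `output_dir` value is arbitrary):

```python
from transformers import TrainingArguments

# Plain transformers TrainingArguments carries no Neuron-specific fields.
args = TrainingArguments(output_dir="out")
args.mp_plugin  # AttributeError: 'TrainingArguments' object has no attribute 'mp_plugin'
```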

Expected behavior

Training should start, just as it does when using the regular Trainer from the transformers package.

@eyal-converge eyal-converge added the bug Something isn't working label Jan 27, 2025
@michaelbenayoun michaelbenayoun self-assigned this Jan 27, 2025
@eyal-converge
Author

Hi @michaelbenayoun
Any chance you had time to look into it?

@eyal-converge
Author

eyal-converge commented Feb 3, 2025

@michaelbenayoun, your help here would be highly appreciated.

@michaelbenayoun
Member

Hi Eyal,
Let me take a look at it today!

@eyal-converge
Author

@michaelbenayoun any update 🤗 ?

@michaelbenayoun
Member

Indeed, `TrainingArguments` does not have an `mp_plugin` attribute.
You should use `NeuronTrainingArguments` instead: it is a drop-in replacement for `TrainingArguments`, designed to work with the `NeuronTrainer`.
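
Applied to the script above, the change is just the import and the arguments class (a minimal sketch; argument values are placeholders, mirroring the redacted ones in the report):

```python
from optimum.neuron import NeuronTrainer as Trainer
from optimum.neuron import NeuronTrainingArguments as TrainingArguments

# NeuronTrainingArguments extends transformers.TrainingArguments with the
# Neuron-specific fields (mp_plugin, zero_1, ...) that NeuronTrainer reads.
training_args = TrainingArguments(
    run_name="x",
    output_dir="x",
    # ... the remaining arguments stay unchanged ...
)
```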

@eyal-converge
Author

> Indeed, `TrainingArguments` does not have an `mp_plugin` attribute. You should use `NeuronTrainingArguments` instead: it is a drop-in replacement for `TrainingArguments`, designed to work with the `NeuronTrainer`.

Thanks for the reply!
I revisited the optimum-neuron docs; it may be worth fixing it there as well - Link

@julien-c
Member

indeed – feel free to suggest a quick PR to improve the doc!
