Log regression task metrics in multitask model #3648

Open
wants to merge 7 commits into base: master

Conversation

@ntravis22 (Contributor) commented Mar 24, 2025

Per-task metrics were recently added to multitask_model; however, none were included for regression tasks, and we did not check that the metric keys are present, which can throw an error. This PR addresses both concerns.
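
As a rough illustration of the key check (the helper and names here are hypothetical, not the code in this PR), the idea is to collect only the metric keys a task's evaluation actually reports, so a missing key is skipped instead of raising a KeyError:

    # Hypothetical sketch, not Flair internals: keep only the metrics this task reported.
    def collect_present_metrics(task_id, task_scores, expected_keys):
        return {(task_id, key): task_scores[key] for key in expected_keys if key in task_scores}

    # A task that reports accuracy but no regression metrics simply yields fewer entries:
    print(collect_present_metrics("Task_0", {"accuracy": 0.9}, ["accuracy", "mse"]))
    # {('Task_0', 'accuracy'): 0.9}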

@MattGPT-ai (Contributor) left a comment

Are there metrics for regression-type models that we could put into scores, or do those perhaps go into the "classification report", e.g. Pearson, Spearman? In the regression models, results is written as:

            eval_metrics = {
                "loss": eval_loss.item(),
                "mse": metric.mean_squared_error(),
                "mae": metric.mean_absolute_error(),
                "pearson": metric.pearsonr(),
                "spearman": metric.spearmanr(),
            }

So maybe we could either check for the base model class that defines evaluate, or just check for the keys. Then maybe we could write e.g. scores[(task_id, 'mse')]. What do you think?
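
A rough sketch of that suggestion, reusing the keys from the eval_metrics dict above (the helper and variable names are illustrative assumptions, not the exact code in multitask_model):

    # Copy whichever of the four regression metrics a sub-task reported into the
    # shared scores dict, keyed by (task_id, metric_name); classification tasks
    # simply won't have these keys, so nothing is added for them.
    REGRESSION_METRICS = ("mse", "mae", "pearson", "spearman")

    def add_regression_scores(scores, task_id, task_scores):
        for name in REGRESSION_METRICS:
            if name in task_scores:
                scores[(task_id, name)] = task_scores[name]

    # Example with arbitrary values in the shape of eval_metrics above:
    scores = {}
    add_regression_scores(scores, "Task_0", {"loss": 0.12, "mse": 0.03, "mae": 0.10, "pearson": 0.91, "spearman": 0.88})
    # scores -> {('Task_0', 'mse'): 0.03, ('Task_0', 'mae'): 0.1, ('Task_0', 'pearson'): 0.91, ('Task_0', 'spearman'): 0.88}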

@ntravis22 changed the title from "Prevent error if not all metrics are present for a task" to "Log mse for regression tasks (and make sure metrics are present)" on Mar 25, 2025
@ntravis22 (Contributor, Author) commented:

> Are there metrics for regression-type models that we could put into scores, or do those perhaps go into the "classification report", e.g. Pearson, Spearman? In the regression models, results is written as:
>
>             eval_metrics = {
>                 "loss": eval_loss.item(),
>                 "mse": metric.mean_squared_error(),
>                 "mae": metric.mean_absolute_error(),
>                 "pearson": metric.pearsonr(),
>                 "spearman": metric.spearmanr(),
>             }
>
> So maybe we could either check for the base model class that defines evaluate, or just check for the keys. Then maybe we could write e.g. scores[(task_id, 'mse')]. What do you think?

Ok I added mse.

@MattGPT-ai (Contributor) commented:

Oh actually, can we just add all four of the metrics?

@ntravis22 changed the title from "Log mse for regression tasks (and make sure metrics are present)" to "Log regression task metrics in multitask model" on Mar 25, 2025
@ntravis22 (Contributor, Author) commented:

> Oh actually, can we just add all four of the metrics?

Done

@alanakbik (Collaborator) commented:

@ntravis22 @MattGPT-ai Could you paste a script to test this PR?

@ntravis22 (Contributor, Author) commented Apr 1, 2025

@alanakbik Here is a script to test:

from flair.data import Sentence, Corpus
from flair.embeddings import TransformerEmbeddings
from flair.models import TextRegressor
from flair.trainers import ModelTrainer
from flair.trainers.plugins.base import TrainerPlugin
from flair.nn.multitask import make_multitask_model_and_corpus


class MetricPlugin(TrainerPlugin):
    """Test plugin. In practice this could do something like logging metrics to WandB."""

    @TrainerPlugin.hook
    def metric_recorded(self, record):
        print(f"Metric: {record}")

sentences = [Sentence("This is a sentence") for _ in range(100)]
for sentence in sentences:
    sentence.add_label("regression_label", 1.0)
corpus = Corpus(sentences)
model = TextRegressor(TransformerEmbeddings('bert-base-uncased'), label_name="regression_label")

multitask_model, multicorpus = make_multitask_model_and_corpus([(model, corpus)])

trainer = ModelTrainer(multitask_model, multicorpus)
trainer.train('regression_model/', plugins=[MetricPlugin()])

Running this with the changes in this PR, you can see lines printed like:

Metric: MetricRecord(dev/Task_0/mse at step 6, 1743547102.8555)
Metric: MetricRecord(dev/Task_0/mae at step 6, 1743547102.8556)
Metric: MetricRecord(dev/Task_0/pearson at step 6, 1743547102.8556)
Metric: MetricRecord(dev/Task_0/spearman at step 6, 1743547102.8556)
