feat(qa): tune quality QA generation and compare prompts #73

@lipikaramaswamy

Description

Context

From e2e testing:

  • Quality QA questions occasionally ask multiple things in one question, conflating multiple meaning units. This makes the compare step unfairly strict.
  • Quality compare sometimes penalizes generalized answers that preserve core meaning but differ in specificity from the original QA answer.

Scope

  • Prompt-tune meaning unit extraction and/or quality QA generation to produce single-focus questions
  • Guide the quality compare prompt to treat generalized-but-correct answers as matching
  • Measure impact on utility scores across bio and legal datasets

Files

  • src/anonymizer/engine/rewrite/qa_generation.py (QA generation prompts)
  • src/anonymizer/engine/rewrite/evaluate.py (compare prompt)
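The two prompt changes in scope could be expressed as appended guidance rules. This is a minimal sketch only: the constant names, rule wording, and the `with_guidance` helper are hypothetical illustrations, not the actual identifiers in `qa_generation.py` or `evaluate.py`.

```python
# Hypothetical sketch of the prompt tuning described above.
# These names and rule texts are illustrative, not the real templates.

# Rule for QA generation: one meaning unit per question.
SINGLE_FOCUS_RULE = (
    "Each question must ask about exactly one meaning unit. "
    "Never combine two facts into one question; split them into "
    "separate questions instead."
)

# Rule for the compare step: accept generalized-but-correct answers.
GENERALIZATION_RULE = (
    "Treat an answer as a MATCH when it generalizes the reference "
    "answer but preserves its core meaning. Only mark MISMATCH when "
    "the core meaning is lost or contradicted."
)


def with_guidance(base_prompt: str, rule: str) -> str:
    """Append a tuning rule to an existing prompt template."""
    return f"{base_prompt.rstrip()}\n\nAdditional rule:\n{rule}\n"
```

Usage would look like `with_guidance(compare_prompt, GENERALIZATION_RULE)`, keeping the base templates untouched so the rules can be A/B tested against the bio and legal utility scores.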
