SubjECTive-QA: Measuring Subjectivity in Earnings Call Transcripts’ QA Through Six-Dimensional Feature Analysis.
Authors: Huzaifa Pardawala, Siddhant Sukhani, Agam Shah, Veer Kejriwal, Rohan Bhasin, Abhishek Pillai, Dhruv Adha, Tarun Mandapati, Andrew DiBiasio, Sudheer Chava
Access the SubjECTive-QA paper here
Access the dataset on Hugging Face
The SubjECTive-QA dataset is available on Hugging Face. You can load the dataset using the following code:
from datasets import load_dataset
# Replace {SEED} with one of the seeds listed below
dataset = load_dataset("gtfintechlab/subjectiveqa", seed={SEED})
Here, {SEED} is one of the following values:
- 5768
- 78516
- 944601
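For example, to load the version of the dataset generated with seed 5768 (a minimal sketch, passing the seed exactly as in the snippet above):
from datasets import load_dataset
# Load the dataset split generated with one of the three listed seeds
dataset = load_dataset("gtfintechlab/subjectiveqa", seed=5768)
print(dataset)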
Fact-checking is extensively studied in the context of misinformation and disinformation, addressing objective inaccuracies. However, a softer form of misinformation involves responses that are factually correct but lack features such as clarity and relevance. This challenge is prevalent in formal Question-Answer (QA) settings, like press conferences in finance, politics, and sports, where subjective answers can obscure transparency. Despite this, there is a lack of manually annotated datasets for subjective features across multiple dimensions. To address this gap, we introduce SubjECTive-QA, a dataset manually annotated on Earnings Call Transcripts (ECTs) by nine annotators. The dataset includes 2,747 annotated long-form QA pairs across six features: Assertive, Cautious, Optimistic, Specific, Clear, and Relevant.
Benchmarking on our dataset reveals that the Pre-trained Language Model (PLM) RoBERTa-base has similar weighted F1 scores to Llama-3-70b-Chat on features with lower subjectivity, like Relevant and Clear, with a mean difference of 2.17% in their weighted F1 scores. On features with higher subjectivity, like Specific and Assertive, RoBERTa-base significantly outperforms Llama-3-70b-Chat, with a mean difference of 10.01% in weighted F1 scores. Furthermore, generalizing SubjECTive-QA to White House Press Briefings and Gaggles demonstrates broader applicability, with an average weighted F1 score of 65.97%.
The SubjECTive-QA models are also available on Hugging Face. You can run inference with the models using the following code:
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification, AutoConfig
# Load the tokenizer, model, and configuration
# Replace {FEATURE} with one of: ASSERTIVE, CAUTIOUS, OPTIMISTIC, SPECIFIC, CLEAR, or RELEVANT
tokenizer = AutoTokenizer.from_pretrained("gtfintechlab/SubjECTiveQA-{FEATURE}", do_lower_case=True, do_basic_tokenize=True)
model = AutoModelForSequenceClassification.from_pretrained("gtfintechlab/SubjECTiveQA-{FEATURE}", num_labels=3)
config = AutoConfig.from_pretrained("gtfintechlab/SubjECTiveQA-{FEATURE}")
# Initialize the text classification pipeline
classifier = pipeline('text-classification', model=model, tokenizer=tokenizer, config=config, framework="pt")
# Classify the 'FEATURE' attribute in your question-answer pairs
qa_pairs = [
"Question: What are your company's projections for the next quarter? Answer: We anticipate a 10% increase in revenue due to the launch of our new product line.",
"Question: Can you explain the recent decline in stock prices? Answer: Market fluctuations are normal, and we are confident in our long-term strategy."
]
results = classifier(qa_pairs, batch_size=128, truncation="only_first")
print(results)
- LABEL_0: Negatively Demonstrative of 'FEATURE' (0)
- LABEL_1: Neutral Demonstration of 'FEATURE' (1)
- LABEL_2: Positively Demonstrative of 'FEATURE' (2)
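If numeric ratings are more convenient, the pipeline's label strings can be mapped back to the 0/1/2 scale above. A minimal sketch (label_map is an illustrative helper, not part of the released code):
# results is a list of dicts like [{'label': 'LABEL_2', 'score': 0.91}, ...] (scores illustrative)
label_map = {"LABEL_0": 0, "LABEL_1": 1, "LABEL_2": 2}
ratings = [label_map[r["label"]] for r in results]
print(ratings)  # e.g., [2, 1]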
- Access the fine-tuned model with the best hyperparameters for CLEAR on Hugging Face.
- Access the fine-tuned model with the best hyperparameters for RELEVANT on Hugging Face.
- Access the fine-tuned model with the best hyperparameters for CAUTIOUS on Hugging Face.
- Access the fine-tuned model with the best hyperparameters for ASSERTIVE on Hugging Face.
- Access the fine-tuned model with the best hyperparameters for OPTIMISTIC on Hugging Face.
- Access the fine-tuned model with the best hyperparameters for SPECIFIC on Hugging Face.
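To score the same QA pairs on all six features at once, you can loop over the per-feature models. A minimal sketch, assuming the per-feature repositories follow the "gtfintechlab/SubjECTiveQA-{FEATURE}" naming shown above:
from transformers import pipeline

FEATURES = ["ASSERTIVE", "CAUTIOUS", "OPTIMISTIC", "SPECIFIC", "CLEAR", "RELEVANT"]

qa_pairs = [
    "Question: What are your company's projections for the next quarter? Answer: We anticipate a 10% increase in revenue due to the launch of our new product line.",
]

# Build one text-classification pipeline per feature
classifiers = {
    feature: pipeline("text-classification", model=f"gtfintechlab/SubjECTiveQA-{feature}", framework="pt")
    for feature in FEATURES
}

# Score the QA pairs on every feature
all_results = {
    feature: classifier(qa_pairs, batch_size=128, truncation="only_first")
    for feature, classifier in classifiers.items()
}
print(all_results)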
To benchmark different models, use the following commands:
python3 openai_benchmarking.py --model "" --feature "" --api_key "" --max_tokens "" --temperature "" --top_p "" --frequency_penalty ""
python3 togetherai_benchmarking.py --model "" --feature "" --api_key "" --max_tokens "" --temperature "" --top_p "" --top_k "" --repetition_penalty ""
python3 plm_benchmarking.py
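For example, a hypothetical OpenAI benchmarking run might look like this (the model name and hyperparameter values below are illustrative, not prescribed defaults):
python3 openai_benchmarking.py --model "gpt-4o" --feature "CLEAR" --api_key "$OPENAI_API_KEY" --max_tokens "128" --temperature "0.0" --top_p "1.0" --frequency_penalty "0.0"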
The SubjECTive-QA dataset is licensed under the Creative Commons Attribution 4.0 International License.
For any questions or concerns, feel free to reach out to [email protected].
This work has been accepted at the 38th Conference on Neural Information Processing Systems (NeurIPS 2024), Datasets and Benchmarks Track.
@article{pardawala2024subjective,
title={SubjECTive-QA: Measuring Subjectivity in Earnings Call Transcripts' QA Through Six-Dimensional Feature Analysis},
author={Pardawala, Huzaifa and Sukhani, Siddhant and Shah, Agam and Kejriwal, Veer and Pillai, Abhishek and Bhasin, Rohan and DiBiasio, Andrew and Mandapati, Tarun and Adha, Dhruv and Chava, Sudheer},
journal={arXiv preprint arXiv:2410.20651},
year={2024}
}