Title: Integrate Custom Dataset from Databricks for Aspect Sentiment Triplet Extraction
Description
I am trying to integrate a custom dataset stored in Databricks for Aspect Sentiment Triplet Extraction (ASTE) using the pyabsa library. However, I am encountering an error related to dataset loading. Below are the details of my implementation and the issues I am facing.
Code Implementation

```python
from pyabsa import (
    ModelSaveOption,
    DeviceTypeOption,
    DatasetItem,
)
from pyabsa import AspectSentimentTripletExtraction as ASTE
import pandas as pd

if __name__ == "__main__":
    config = ASTE.ASTEConfigManager.get_aste_config_english()
    config.max_seq_len = 120
    config.log_step = -1
    config.pretrained_bert = "bert-base-chinese"
    config.num_epoch = 100
    config.learning_rate = 2e-5
    config.use_amp = True
    config.cache_dataset = True
    config.spacy_model = "zh_core_web_sm"

    # Load dataset from Databricks
    dataset_path = "datasets/atepc_datasets/300.vokols/vokols.test.txt.atepc"
    dataset = "300.vokols"

    trainer = ASTE.ASTETrainer(
        config=config,
        dataset=dataset,
        checkpoint_save_mode=ModelSaveOption.SAVE_MODEL_STATE_DICT,
        auto_device=True,
    )
    triplet_extractor = trainer.load_trained_model()

    examples = [
        "I love this laptop, it is very good.",
        "I hate this laptop, it is very bad.",
        "I like this laptop, it is very good.",
        "I dislike this laptop, it is very bad.",
    ]
    for example in examples:
        prediction = triplet_extractor.predict(example)
        print(prediction)
```
Error Encountered

```
ValueError: Cannot find dataset: 300.vokols, you may need to remove existing integrated_datasets and try again. Please note that if you are using keywords to let findfile search the dataset, you need to save your dataset(s) in integrated_datasets/task_name/dataset_name
```
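For reference, the error message expects datasets under `integrated_datasets/task_name/dataset_name`. A layout for this dataset could look like the following sketch (the `aste_datasets` folder name and the file names are assumptions inferred from the path used in the code above, not confirmed pyabsa conventions):

```
integrated_datasets/
└── aste_datasets/                  # task_name (assumed for the ASTE task)
    └── 300.vokols/                 # dataset_name
        ├── vokols.train.txt.atepc  # file names inferred from the code above
        └── vokols.test.txt.atepc
```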
Issues Faced
- Dataset Loading: how to properly format and load a custom dataset stored in Databricks into the pyabsa library (a tentative sketch follows this list).
- Integration: how to ensure the custom dataset is correctly picked up and used during training.
- Directory Structure: what directory layout pyabsa requires before it will recognize a custom dataset.
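As a starting point for the first two items, here is a minimal sketch of how this might be wired up on Databricks. It assumes the dataset files must be visible on the driver's local filesystem and uses pyabsa's `DatasetItem` to pass explicit paths instead of a bare keyword; the DBFS source path below is hypothetical, and this is not a confirmed fix:

```python
from pyabsa import DatasetItem

# Hypothetical DBFS location: copy the dataset out of DBFS onto local disk
# first, since pyabsa searches the local filesystem (dbutils is available
# in Databricks notebooks only).
# dbutils.fs.cp(
#     "dbfs:/FileStore/datasets/300.vokols/",
#     "file:/tmp/integrated_datasets/aste_datasets/300.vokols/",
#     recurse=True,
# )

# Pass the local path explicitly instead of relying on keyword search.
my_dataset = DatasetItem(
    "300.vokols",
    ["/tmp/integrated_datasets/aste_datasets/300.vokols"],
)

# config, ASTE, and ModelSaveOption as defined in the code above.
trainer = ASTE.ASTETrainer(
    config=config,
    dataset=my_dataset,
    checkpoint_save_mode=ModelSaveOption.SAVE_MODEL_STATE_DICT,
    auto_device=True,
)
```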
Steps to Reproduce
1. Place a custom dataset in Databricks (ensure it is in .atepc format).
2. Run the code above to load the dataset and start training.
3. Observe the dataset-loading error.
Expected Behavior
The custom dataset should be loaded correctly, and the model should train and predict without errors.