Releases: argilla-io/argilla

v1.18.0

25 Oct 16:03
34389fd

🔆 Release highlights

💾 Add metadata properties to Feedback Datasets

You can now filter and sort records in Feedback Datasets in the UI and Python SDK using the metadata included in the records. To do that, you will first need to set up a MetadataProperty in your dataset:

# set up a dataset including metadata properties
dataset = rg.FeedbackDataset(
    fields=[
        rg.TextField(name="prompt"),
        rg.TextField(name="response"),
    ],
    questions=[
        rg.TextQuestion(name="question")
    ],
    metadata_properties=[
        rg.TermsMetadataProperty(name="source"),
        rg.IntegerMetadataProperty(name="response_length", title="Response length")
    ]
)

Learn more about how to define metadata properties and how to add or delete metadata properties in existing datasets.

Argilla will then read the metadata in each record whose key matches the name of a metadata property. Any other metadata present in the record that does not match a metadata property will be saved, but it cannot be used by the filtering and sorting features in the UI or SDK.

# create a record with metadata
record = rg.FeedbackRecord(
    fields={
        "prompt": "Why can camels survive long without water?",
        "response": "Camels use the fat in their humps to keep them filled with energy and hydration for long periods of time."
    },
    metadata={"source": "wikipedia", "response_length": 105, "my_hidden_metadata": "hidden metadata"}
)

Learn more about how to create records with metadata and how to add, modify or delete metadata from existing records.
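
As a rough illustration of the matching rule described above, the plain-Python sketch below splits a record's metadata into the keys that match a declared metadata property and the keys that are only stored. It is not Argilla's implementation, and the helper name split_metadata is hypothetical.

```python
def split_metadata(metadata, property_names):
    """Split record metadata into filterable and stored-only parts.

    Hypothetical helper: keys matching a declared metadata property name
    are usable for filtering/sorting, the rest are only stored.
    """
    names = set(property_names)
    filterable = {k: v for k, v in metadata.items() if k in names}
    stored_only = {k: v for k, v in metadata.items() if k not in names}
    return filterable, stored_only


metadata = {
    "source": "wikipedia",
    "response_length": 105,
    "my_hidden_metadata": "hidden metadata",
}
filterable, stored_only = split_metadata(metadata, ["source", "response_length"])
print(filterable)
print(stored_only)
```

Here only "source" and "response_length" end up in the filterable part, mirroring the record example above.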

🗃️ Filter and sort records using metadata in Feedback Datasets

In the Python SDK, you can filter and sort records based on the Metadata Properties that you set up for your dataset. You can combine multiple filters and sorts. Here is an example of how you could use them:

filtered_records = remote.filter_by(
    metadata_filters=[
        rg.IntegerMetadataFilter(
            name="response_length",
            ge=500, # optional: greater or equal to
            le=1000 # optional: lower or equal to
        ),
        rg.TermsMetadataFilter(
            name="source", 
            values=["wikipedia", "wikihow"]
        )
    ]
).sort_by(
    [
        rg.SortBy(
            field="response_length",
            order="desc" # for descending or "asc" for ascending
        )
    ]
)

In the UI, simply use the Metadata and Sort components to filter and sort records, as demonstrated in the video metadata_filter_ui.mp4.

Read more about filtering and sorting in Feedback Datasets.
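
To make the semantics of the filters above more concrete, here is a minimal plain-Python sketch over a list of metadata dictionaries. It assumes that ge and le are inclusive bounds and that multiple filters are combined with a logical AND; both are assumptions for illustration, and the helper in_range is hypothetical rather than part of the SDK.

```python
records = [
    {"source": "wikipedia", "response_length": 750},
    {"source": "wikihow", "response_length": 980},
    {"source": "blog", "response_length": 600},
    {"source": "wikipedia", "response_length": 120},
]


def in_range(value, ge=None, le=None):
    """Inclusive range check; a bound left as None is treated as open."""
    if ge is not None and value < ge:
        return False
    if le is not None and value > le:
        return False
    return True


# Assumption: both filters must match (logical AND)
selected = [
    r
    for r in records
    if in_range(r["response_length"], ge=500, le=1000)
    and r["source"] in {"wikipedia", "wikihow"}
]

# Equivalent of order="desc" on response_length
selected.sort(key=lambda r: r["response_length"], reverse=True)
print(selected)
```

With this sample data, only the two records from "wikipedia" and "wikihow" with a length between 500 and 1000 remain, longest first.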

⚠️ Breaking change using SQLite as backend in a docker deployment

From version 1.17.0, a new argilla OS user is configured for the provided Docker images. If you are using the Docker deployment and want to upgrade to this version from a version older than v1.17.0, you should change the permissions of the SQLite db file before upgrading. (If you have already upgraded to v1.17.0, this step was already applied; see the v1.17.0 release notes.) You can do it with the following command:

docker exec --user root <argilla_server_container_id> /bin/bash -c 'chmod -R 777 "$ARGILLA_HOME_PATH"'

Note: You can find the docker container id by running:

docker ps  | grep -i argilla-server
713973693fb7   argilla/argilla-server:v1.16.0                "/bin/bash start_arg…"   11 hours ago   Up 7 minutes       0.0.0.0:6900->6900/tcp                           docker-argilla-1

Once the version is upgraded, we recommend restoring proper access permissions on this folder by setting the user and group to the new argilla user:

docker exec --user root <argilla_server_container_id>  /bin/bash -c 'chown -R argilla:argilla "$ARGILLA_HOME_PATH"'
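
If you prefer to script these two steps, the sketch below builds the same commands in Python. It only constructs the argument lists and prints them; the helper names are hypothetical, and the sample row is the docker ps output shown above with its column spacing shortened.

```python
import shlex


def container_id_from_ps_line(line):
    """Return the first column of a `docker ps` row, which is the container ID."""
    return line.split()[0]


def exec_as_root(container_id, shell_command):
    """Build the argv list for running a shell command as root in the container."""
    return [
        "docker", "exec", "--user", "root",
        container_id, "/bin/bash", "-c", shell_command,
    ]


ps_line = (
    '713973693fb7   argilla/argilla-server:v1.16.0   "/bin/bash start_arg…"   '
    "11 hours ago   Up 7 minutes   0.0.0.0:6900->6900/tcp   docker-argilla-1"
)
cid = container_id_from_ps_line(ps_line)

# Step 1 (before upgrading) and step 2 (after upgrading), as described above
before_upgrade = exec_as_root(cid, 'chmod -R 777 "$ARGILLA_HOME_PATH"')
after_upgrade = exec_as_root(cid, 'chown -R argilla:argilla "$ARGILLA_HOME_PATH"')

print(shlex.join(before_upgrade))
print(shlex.join(after_upgrade))
```

Each list can be passed to subprocess.run(argv, check=True) to actually execute the command.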

1.18.0 Changelog

Added

  • New GET /api/v1/datasets/:dataset_id/metadata-properties endpoint for listing dataset metadata properties. (#3813)
  • New POST /api/v1/datasets/:dataset_id/metadata-properties endpoint for creating dataset metadata properties. (#3813)
  • New PATCH /api/v1/metadata-properties/:metadata_property_id endpoint allowing the update of a specific metadata property. (#3952)
  • New DELETE /api/v1/metadata-properties/:metadata_property_id endpoint for deletion of a specific metadata property. (#3911)
  • New GET /api/v1/metadata-properties/:metadata_property_id/metrics endpoint to compute metrics for a specific metadata property. (#3856)
  • New PATCH /api/v1/records/:record_id endpoint to update a record. (#3920)
  • New PATCH /api/v1/datasets/:dataset_id/records endpoint to bulk update the records of a dataset. (#3934)
  • Added missing validations to PATCH /api/v1/questions/:question_id. Now title and description use the same validations used to create questions. (#3967)
  • Added TermsMetadataProperty, IntegerMetadataProperty and FloatMetadataProperty classes to define metadata properties for a FeedbackDataset. (#3818)
  • Added metadata_filters to filter_by method in RemoteFeedbackDataset to filter based on metadata i.e. TermsMetadataFilter, IntegerMetadataFilter, and FloatMetadataFilter. (#3834)
  • Added a validation layer for both metadata_properties and metadata_filters in their schemas and as part of the add_records and filter_by methods, respectively. (#3860)
  • Added sort_by query parameter to listing records endpoints that allows sorting the records by inserted_at, updated_at or metadata property. (#3843)
  • Added add_metadata_property method to both FeedbackDataset and RemoteFeedbackDataset (i.e. FeedbackDataset in Argilla). (#3900)
  • Added fields inserted_at and updated_at in RemoteResponseSchema. (#3822)
  • Added support for sort_by for RemoteFeedbackDataset i.e. a FeedbackDataset uploaded to Argilla. (#3925)
  • Added metadata_properties support for both push_to_huggingface and from_huggingface. (#3947)
  • Added support for updating records (metadata) from the Python SDK. (#3946)
  • Added delete_metadata_properties method to delete metadata properties. (#3932)
  • Added update_metadata_properties method to update metadata_properties. (#3961)
  • Added automatic model card generation through ArgillaTrainer.save (#3857)
  • Added FeedbackDataset TaskTemplateMixin for pre-defined task templates. (#3969)
  • A maximum limit of 50 on the number of options a ranking question can accept. (#3975)
  • New last_activity_at field to FeedbackDataset exposing when the last activity for the associated dataset occurs. (#3992)
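
As a small sketch of how one of the new endpoints could be called directly, the snippet below builds (but does not send) a request for listing the metadata properties of a dataset. The server URL, the API key, the dataset ID and the X-Argilla-Api-Key header name are assumptions for illustration; check your deployment and the API docs before relying on them.

```python
import urllib.request


def metadata_properties_request(api_url, api_key, dataset_id):
    """Build a GET request for /api/v1/datasets/:dataset_id/metadata-properties."""
    url = f"{api_url.rstrip('/')}/api/v1/datasets/{dataset_id}/metadata-properties"
    # Assumption: the server authenticates requests through this header
    headers = {"X-Argilla-Api-Key": api_key}
    return urllib.request.Request(url, headers=headers, method="GET")


req = metadata_properties_request(
    "http://localhost:6900",  # hypothetical server URL
    "my-api-key",  # hypothetical API key
    "00000000-0000-0000-0000-000000000000",  # hypothetical dataset ID
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen(req) would send it to a running server.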

Changed

  • GET /api/v1/datasets/{dataset_id}/records, GET /api/v1/me/datasets/{dataset_id}/records and POST /api/v1/me/datasets/{dataset_id}/records/search endpoints to return the total number of records. (#3848, #3903)
  • Implemented __len__ method for filtered datasets to return the number of records matching the provided filters. (#3916)
  • Increase the default max result window for Elasticsearch created for Feedback datasets. (#3929)
  • Force elastic index refresh after records creation. (#3929)
  • Validate metadata fields for filtering and sorting in the Python SDK. (#3993)
  • Using metadata property name instead of id for indexing data in search engine index. (#3994)

Fixed

  • Fixed response schemas to allow values to be None i.e. when a record is discarded the response.values are set to None. (#3926)

Full Changelog: v1.17.0...v1.18.0

v1.17.0

19 Oct 10:36
34f4486

☀️ Highlights

This release comes with a lot of new goodies and quality improvements. We added model card support for the ArgillaTrainer, worked on the FeedbackDataset task templates and added timestamps to responses. We also fixed a lot of bugs and improved the overall quality of the codebase. Enjoy!

🚨 Breaking change in updating existing Hugging Face Spaces deployments

The quickstart image startup script was changed from /start_quickstart.sh to /home/argilla/start_quickstart.sh, which might cause existing Hugging Face Spaces deployments to malfunction. A fix was added for the Argilla template space via this PR. Alternatively, you can just create a new deployment.

⚠️ Breaking change using SQLite as backend in a docker deployment

From version 1.17.0, a new argilla OS user is configured for the provided Docker images. If you are using the Docker deployment and want to upgrade to this version, you need to run the following command after updating your container and before working with Argilla:

docker exec --user root <argilla_server_container_id> /bin/bash -c 'chown -R argilla:argilla "$ARGILLA_HOME_PATH"'

This changes the owner of the Argilla home path to the new argilla user, which allows it to work with the new containers.

Note: You can find the docker container id by running:

docker ps  | grep -i argilla-server
713973693fb7   argilla/argilla-server:v1.17.0                "/bin/bash start_arg…"   11 hours ago   Up 7 minutes       0.0.0.0:6900->6900/tcp                           docker-argilla-1

💾 ArgillaTrainer Model Card Generation

The ArgillaTrainer now supports automatic model card generation. This means that you can now generate a model card with all the required info for Hugging Face and directly share these models to the hub, as you would expect within the Hugging Face ecosystem. See the docs for more info.

model_card_kwargs = {
    "language": ["en", "es"],
    "license": "Apache-2.0",
    "model_id": "all-MiniLM-L6-v2",
    "dataset_name": "argilla/emotion",
    "tags": ["nlp", "few-shot-learning", "argilla", "setfit"],
    "model_summary": "Small summary of what the model does",
    "model_description": "An extended explanation of the model",
    "model_type": "A 1.3B parameter embedding model fine-tuned on an awesome dataset",
    "finetuned_from": "all-MiniLM-L6-v2",
    "repo": "https://github.com/...",
    "developers": "",
    "shared_by": "",
}

trainer = ArgillaTrainer(
    dataset=dataset,
    task=task,
    framework="setfit",
    framework_kwargs={"model_card_kwargs": model_card_kwargs}
)
trainer.train(output_dir="my_model")
# or get the card as `str` by calling the `generate_model_card` method
argilla_model_card = trainer.generate_model_card("my_model")

🦮 FeedbackDataset Task Templates

The Argilla FeedbackDataset now supports a number of task templates that can be used to quickly create a dataset for specific tasks out of the box. This should help new users get right into the action without having to worry about the dataset structure. We support basic tasks like Text Classification but also allow you to set up complex RAG pipelines. See the docs for more info.

import argilla as rg

ds = rg.FeedbackDataset.for_text_classification(
    labels=["positive", "negative"],
    multi_label=False,
    use_markdown=True,
    guidelines=None,
)
ds
# FeedbackDataset(
#   fields=[TextField(name="text", use_markdown=True)],
#   questions=[LabelQuestion(name="label", labels=["positive", "negative"])]
#   guidelines="<Guidelines for the task>",
# )

⏱️ inserted_at and updated_at are added to responses

What are responses without timestamps? The RemoteResponseSchema now supports inserted_at and updated_at fields. This should help you keep track of when a response was created and updated. Perfect for keeping track of annotator performance within your company.
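
As a minimal sketch of what these timestamps make possible, the snippet below computes the time between creation and last update of each response. The list of dictionaries is a hypothetical stand-in for responses fetched from Argilla, where inserted_at and updated_at are available on the response objects.

```python
from datetime import datetime

# Hypothetical sample data standing in for responses fetched from Argilla
responses = [
    {"user": "annotator-1", "inserted_at": "2023-10-19T10:00:00", "updated_at": "2023-10-19T10:04:30"},
    {"user": "annotator-2", "inserted_at": "2023-10-19T10:01:00", "updated_at": "2023-10-19T10:01:00"},
]


def seconds_between_insert_and_update(response):
    """Return the elapsed seconds between creation and last update of a response."""
    inserted = datetime.fromisoformat(response["inserted_at"])
    updated = datetime.fromisoformat(response["updated_at"])
    return (updated - inserted).total_seconds()


edit_times = {r["user"]: seconds_between_insert_and_update(r) for r in responses}
print(edit_times)
```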

1.17.0

Added

  • Added fields inserted_at and updated_at in RemoteResponseSchema (#3822).
  • Added automatic model card generation through ArgillaTrainer.save (#3857).
  • Added task templates to the FeedbackDataset (#3973).

Changed

  • Updated Dockerfile to use multi stage build (#3221 and #3793).
  • Updated active learning for text classification notebooks to use the most recent small-text version (#3831).
  • Changed argilla dataset name in the active learning for text classification notebooks to be consistent with the default names in the huggingface spaces (#3831).
  • FeedbackDataset API methods have been aligned to be accessible through the several implementations (#3937).
  • The unify_responses method is now supported for remote datasets (#3937).

Fixed

  • Fix field not shown in the order defined in the dataset settings. Closes #3959 (#3984)
  • Updated active learning for text classification notebooks to pass ids of type int to TextClassificationRecord (#3831).
  • Fixed record fields validation that was preventing logging records with optional fields (i.e. required=False) when the field value was None (#3846).
  • Always set pretrained_model_name_or_path attribute as string in ArgillaTrainer (#3914).
  • The inserted_at and updated_at attributes are created using the utcnow factory to avoid unexpected race conditions on timestamp creation (#3945)
  • Fixed configure_dataset_settings when providing the workspace via the arg workspace (#3887).
  • Fixed saving of models trained with ArgillaTrainer with a peft_config parameter (#3795).
  • Fixed backwards compatibility on from_huggingface when loading a FeedbackDataset from the Hugging Face Hub that was previously dumped using another version of Argilla, starting at 1.8.0, when it was first introduced (#3829).
  • Fixed TrainingTaskForQuestionAnswering.__repr__ (#3969)
  • Fixed potential dictionary key-errors in TrainingTask.prepare_for_training_with_*-methods (#3969)
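
The timestamp fix above mentions creating inserted_at and updated_at through a factory. The details of #3945 are not given here, so the sketch below only illustrates the general pattern with a standard-library dataclass: a default factory is called for every new instance, so each object gets its own timestamps. The Response class and the utc_now helper are hypothetical.

```python
import time
from dataclasses import dataclass, field
from datetime import datetime, timezone


def utc_now():
    """Return the current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


@dataclass
class Response:
    # default_factory is called once per instance, not once at class definition
    inserted_at: datetime = field(default_factory=utc_now)
    updated_at: datetime = field(default_factory=utc_now)


first = Response()
time.sleep(0.05)
second = Response()
print(first.inserted_at, second.inserted_at)
```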

Deprecated

  • Function rg.configure_dataset is deprecated in favour of rg.configure_dataset_settings. The former will be removed in version 1.19.0

Full Changelog: v1.16.0...v1.17.0

v1.16.0

18 Sep 10:20
01d59c9

☀️ Highlights

This release comes with an auto save feature for the UI, an enhanced Argilla CLI app, new keyboard shortcuts for the annotation process in the Feedback Dataset and new integrations for the ArgillaTrainer.

💾 Auto save

Argilla UI Feedback Record getting auto saved

Have you ever been writing a long corrected text in a TextField for a completion given by an LLM, only to refresh the page before submitting it? Well, since this release you are covered! The Argilla UI will save the responses given in the annotation form of a FeedbackDataset every few seconds. Annotators can partially annotate one record and then come back to finish the annotation process without losing the previous work.

👨🏻‍💻 More operations directly from the Argilla CLI

Argilla CLI displaying help information

The Argilla CLI has been updated to include an extensive list of new commands, from users and datasets management to training models all from the terminal!

⌨️ New keyboard shortcuts for the Feedback Dataset

Feedback dataset shortcuts

Now, you can seamlessly navigate within the feedback form using just your keyboard. We've extended the functionality of these shortcuts to cover all types of available questions: Label, Multi-label, Ranking, Rating and Text.

QnA, Chat Completion with OpenAI and Sentence Transformers model training now in the ArgillaTrainer

The ArgillaTrainer doesn't stop getting new features and improvements!

1.16.0

Added

  • Added ArgillaTrainer integration with sentence-transformers, allowing fine tuning for sentence similarity (#3739)
  • Added ArgillaTrainer integration with TrainingTask.for_question_answering (#3740)
  • Added Auto save record to save automatically the current record that you are working on (#3541)
  • Added ArgillaTrainer integration with OpenAI, allowing fine tuning for chat completion (#3615)
  • Added workspaces list command to list Argilla workspaces (#3594).
  • Added datasets list command to list Argilla datasets (#3658).
  • Added users create command to create users (#3667).
  • Added whoami command to get current user (#3673).
  • Added users delete command to delete users (#3671).
  • Added users list command to list users (#3688).
  • Added workspaces delete-user command to remove a user from a workspace (#3699).
  • Added workspaces create command to create an Argilla workspace (#3676).
  • Added datasets push-to-hub command to push a FeedbackDataset from Argilla into the HuggingFace Hub (#3685).
  • Added info command to get info about the used Argilla client and server (#3707).
  • Added datasets delete command to delete a FeedbackDataset from Argilla (#3703).
  • Added created_at and updated_at properties to RemoteFeedbackDataset and FilteredRemoteFeedbackDataset (#3709).
  • Added handling of PermissionError when executing a command with a logged in user without enough permissions (#3717).
  • Added workspaces add-user command to add a user to workspace (#3712).
  • Added workspace_id param to GET /api/v1/me/datasets endpoint (#3727).
  • Added workspace_id arg to list_datasets in the Python SDK (#3727).
  • Added argilla script that allows executing the Argilla CLI using the argilla command (#3730).
  • Added server_info function to check the Argilla server information (also accessible via rg.server_info) (#3772).

Changed

  • Move database commands under server group of commands (#3710)
  • server commands only included in the CLI app when server extra requirements are installed (#3710).
  • Updated PUT /api/v1/responses/{response_id} to replace values stored with received values in request (#3711).
  • Display a UserWarning when the user_id in Workspace.add_user and Workspace.delete_user is the ID of a user with the owner role as they don't require explicit permissions (#3716).
  • Rename tasks sub-package to cli (#3723).
  • Changed argilla database command in the CLI to now be accessed via argilla server database, to be deprecated in the upcoming release (#3754).
  • Changed visible_options (of label and multi label selection questions) validation in the backend to check that the provided value is greater than or equal to 3 and less than or equal to the number of provided options (#3773).
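
The visible_options rule in the last item can be sketched as a small check in plain Python. This is an illustration of the rule as described, not the backend code; the function name is hypothetical, and treating None as "not provided" is an assumption.

```python
def validate_visible_options(visible_options, options):
    """Check that 3 <= visible_options <= number of provided options."""
    if visible_options is None:
        # Assumption: None means the setting was not provided
        return None
    if not 3 <= visible_options <= len(options):
        raise ValueError(
            f"visible_options must be between 3 and {len(options)}, got {visible_options}"
        )
    return visible_options


print(validate_visible_options(3, ["a", "b", "c", "d"]))
```

Values below 3 or above the number of options raise a ValueError in this sketch.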

Fixed

  • Fixed remove user modification in text component on clear answers (#3775)
  • Fixed Highlight raw text field in dataset feedback task (#3731)
  • Fixed Field title too long (#3734)
  • Fixed error messages when deleting a DatasetForTextClassification (#3652)
  • Fixed Pending queue pagination problems during data annotation (#3677)
  • Fixed visible_labels default value to be 20 just when visible_labels not provided and len(labels) > 20, otherwise it will either be the provided visible_labels value or None, for LabelQuestion and MultiLabelQuestion (#3702).
  • Fixed DatasetCard generation when RemoteFeedbackDataset contains suggestions (#3718).
  • Add missing draft status in ResponseSchema as now there can be responses with draft status when annotating via the UI (#3749).
  • Fixed searches when queried words are distributed along the record fields (#3759).
  • Fixed Python 3.11 compatibility issue with /api/datasets endpoints due to the TaskType enum replacement in the endpoint URL (#3769).
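
The visible_labels default described in the list above can be expressed as a short function. This is a plain-Python sketch of the stated rule, with a hypothetical function name, not the library code.

```python
def resolve_visible_labels(labels, visible_labels=None, default=20):
    """Return the effective visible_labels value for a label question.

    The default (20) applies only when no value was provided and there are
    more than 20 labels; otherwise the provided value (or None) is kept.
    """
    if visible_labels is None and len(labels) > default:
        return default
    return visible_labels


many_labels = [f"label-{i}" for i in range(25)]
few_labels = ["positive", "negative"]

print(resolve_visible_labels(many_labels))
print(resolve_visible_labels(few_labels))
print(resolve_visible_labels(many_labels, visible_labels=10))
```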

As always, thanks to our amazing contributors

Full Changelog: v1.15.1...v1.16.0

v1.15.1

08 Sep 15:06
12215c0

Changelog 1.15.1

Fixed

  • Fixed Text component content sanitization so that it is applied only to markdown, preventing the text from disappearing (#3738)
  • Fixed Text component: you now need to press Escape to exit the text area (#3733)
  • Fixed SearchEngine was creating the same number of primary shards and replica shards for each FeedbackDataset (#3736).

v1.15.0

31 Aug 11:50
7eb6274

🔆 Highlights

Argilla 1.15.0 comes with an enhanced FeedbackDataset settings page enabling the update of the dataset settings, an integration of the TRL package with the ArgillaTrainer, and continues adding improvements to the Python client for managing FeedbackDatasets.

⚙️ Update FeedbackDataset settings from the UI

Update Feedback Dataset settings from the UI

The FeedbackDataset settings page has been updated and now allows updating the guidelines and some attributes of the fields and questions of the dataset. Did you misspell the title or description of a field or question? Well, you don't have to remove your dataset and create it again anymore! Just go to the settings page and fix it.

🤖 TRL integration with the ArgillaTrainer

ArgillaTrainer code snippet for training reward model with TRL

The famous TRL package for training Transformers with Reinforcement Learning techniques has been integrated with the ArgillaTrainer, which now comes with four new TrainingTask options: SFT, Reward Modeling, PPO and DPO. Each training task expects a formatting function that will return the data in the expected format for training the model.

Check this 🆕 tutorial for training a Reward Model using the Argilla Trainer.

🐍 Filter FeedbackDataset and remove suggestions

Using FeedbackDataset filter method

In the 1.14.0 release we added many improvements for working with remote FeedbackDatasets. In this release, a new filter_by method has been added that allows filtering the records of a dataset from the Python client. For now, records can only be filtered by response_status, but we're planning to add more complex filters in upcoming releases. In addition, new methods have been added to remove the suggestions created for a record.

1.15.0

Added

  • Added the ability to update guidelines and dataset settings for Feedback Datasets directly in the UI (#3489)
  • Added ArgillaTrainer integration with TRL, allowing for easy supervised finetuning, reward modeling, direct preference optimization and proximal policy optimization (#3467)
  • Added formatting_func to ArgillaTrainer for FeedbackDataset datasets to add custom formatting for the data (#3599).
  • Added login function in argilla.client.login to login into an Argilla server and store the credentials locally (#3582).
  • Added login command to login into an Argilla server (#3600).
  • Added logout command to logout from an Argilla server (#3605).
  • Added DELETE /api/v1/suggestions/{suggestion_id} endpoint to delete a suggestion given its ID (#3617).
  • Added DELETE /api/v1/records/{record_id}/suggestions endpoint to delete several suggestions linked to the same record given their IDs (#3617).
  • Added response_status param to GET /api/v1/datasets/{dataset_id}/records to be able to filter by response_status as previously included for GET /api/v1/me/datasets/{dataset_id}/records (#3613).
  • Added list classmethod to ArgillaMixin to be used as FeedbackDataset.list(), also including the workspace to list from as arg (#3619).
  • Added filter_by method in RemoteFeedbackDataset to filter based on response_status (#3610).
  • Added list_workspaces function (to be used as rg.list_workspaces, but Workspace.list is preferred) to list all the workspaces from a user in Argilla (#3641).
  • Added list_datasets function (to be used as rg.list_datasets) to list the TextClassification, TokenClassification, and Text2Text datasets in Argilla (#3638).
  • Added RemoteSuggestionSchema to manage suggestions in Argilla, including the delete method to delete suggestions from Argilla via DELETE /api/v1/suggestions/{suggestion_id} (#3651).
  • Added delete_suggestions to RemoteFeedbackRecord to remove suggestions from Argilla via DELETE /api/v1/records/{record_id}/suggestions (#3651).

Changed

  • Changed the Optional label to a * mark for required questions (#3608)
  • Updated RemoteFeedbackDataset.delete_records to use batch delete records endpoint (#3580).
  • Included allowed_for_roles for some RemoteFeedbackDataset, RemoteFeedbackRecords, and RemoteFeedbackRecord methods that are only allowed for users with roles owner and admin (#3601).
  • Renamed ArgillaToFromMixin to ArgillaMixin (#3619).
  • Move users CLI app under database CLI app (#3593).
  • Move server Enum classes to argilla.server.enums module (#3620).

Fixed

  • Fixed Filter by workspace in breadcrumbs (#3577)
  • Fixed Filter by workspace in datasets table (#3604)
  • Fixed Query search highlight for Text2Text and TextClassification (#3621)
  • Fixed RatingQuestion.values validation to raise a ValidationError when values are out of range i.e. [1, 10] (#3626).
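
The RatingQuestion range check in the last item can be sketched as follows. The changelog says the library raises a ValidationError; this standalone illustration uses a plain ValueError instead, and the function name is hypothetical.

```python
def validate_rating_values(values, low=1, high=10):
    """Ensure every rating value falls inside the inclusive range [low, high]."""
    out_of_range = [v for v in values if not low <= v <= high]
    if out_of_range:
        raise ValueError(
            f"rating values must be in [{low}, {high}], got {out_of_range}"
        )
    return list(values)


print(validate_rating_values([1, 2, 3, 4, 5]))
```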

Removed

  • Removed multi_task_text_token_classification from TaskType as not used (#3640).
  • Removed argilla_id in favor of id from RemoteFeedbackDataset (#3663).
  • Removed fetch_records from RemoteFeedbackDataset as now the records are lazily fetched from Argilla (#3663).
  • Removed push_to_argilla from RemoteFeedbackDataset, as it just works when calling it through a FeedbackDataset locally, as now the updates of the remote datasets are automatically pushed to Argilla (#3663).
  • Removed set_suggestions in favor of update(suggestions=...) for both FeedbackRecord and RemoteFeedbackRecord, as all the updates of any "updateable" attribute of a record will go through update instead (#3663).
  • Remove unused owner attribute for client Dataset data model (#3665)

As always, thanks to our amazing contributors

Full Changelog: v1.14.1...v1.15.0

v1.14.1

16 Aug 14:46
0b0b4f5

Changelog 1.14.1

Fixed

  • Fixed PostgreSQL database not being updated after begin_nested because of missing commit (#3567).

Full Changelog: v1.14.0...v1.14.1

v1.14.0

11 Aug 09:23
7fbbc3f

🔆 Highlights

Argilla 1.14.0 comes packed with improvements to manage Feedback Datasets from the Python client. Here are the most important changes in this version:

Code-snippet with a summary of the new workflows in Argilla

Pushing and pulling a dataset

Pushing a dataset to Argilla will now create a RemoteFeedbackDataset in Argilla. To make changes to your dataset in Argilla you will need to make those updates to the remote dataset. You can do so by either using the dataset returned when using the push_to_argilla() method (as shown in the image above) or by loading the dataset like so:

import argilla as rg
# connect to Argilla
rg.init(api_url="...", api_key="...")
# get the existing dataset in Argilla
remote_dataset = rg.FeedbackDataset.from_argilla(name="my-dataset", workspace="my-workspace")
# add a list of FeedbackRecords to the dataset in Argilla
remote_dataset.add_records(...)

Alternatively, you can make a local copy of the dataset using the pull() method.

local_dataset = remote_dataset.pull()

Note that any changes that you make to this local dataset will not affect the remote dataset in Argilla.

Adding and deleting records

How to add records to an existing dataset in Argilla was demonstrated in the first code snippet in the "Pushing and pulling a dataset" section. This is how you can delete a list of records using that same dataset:

records_to_delete = remote_dataset.records[0:5]
remote_dataset.delete_records(records_to_delete)

Or delete a single record:

record = remote_dataset.records[-1]
record.delete()

Add / update suggestions in existing records

To add and update suggestions in existing records, you can simply use the update() method. For example:

for record in remote_dataset.records:
    record.update(suggestions=...)

Note that adding a suggestion to a question that already has one will overwrite the previous suggestion. To learn more about the format that the suggestions must follow, check our docs.
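
The overwrite behaviour described above can be pictured with a small plain-Python sketch that keeps at most one suggestion per question. The dictionary shape (question_name and value keys) and the helper name are assumptions for illustration; check the docs for the exact format expected by update().

```python
def apply_suggestions(existing, new):
    """Merge suggestions, keeping at most one suggestion per question.

    A new suggestion for a question that already has one replaces the old one.
    """
    by_question = {s["question_name"]: s for s in existing}
    for suggestion in new:
        by_question[suggestion["question_name"]] = suggestion
    return list(by_question.values())


existing = [
    {"question_name": "quality", "value": 3},
    {"question_name": "corrected-text", "value": "First draft"},
]
new = [{"question_name": "quality", "value": 5}]

merged = apply_suggestions(existing, new)
print(merged)
```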

Delete a dataset

You can now easily delete datasets from the Python client. To do that, get the existing dataset as demonstrated in the first section and just use:

remote_dataset.delete()

Create users with workspace assignments

Now you can create a user and directly assign existing workspaces to grant them access.

user = rg.User.create(username="...", first_name="...", password="...", workspaces=["ws1", "ws2"])

Changelog 1.14.0

Added

  • Added PATCH /api/v1/fields/{field_id} endpoint to update the field title and markdown settings (#3421).
  • Added PATCH /api/v1/datasets/{dataset_id} endpoint to update dataset name and guidelines (#3402).
  • Added PATCH /api/v1/questions/{question_id} endpoint to update question title, description and some settings (depending on the type of question) (#3477).
  • Added DELETE /api/v1/records/{record_id} endpoint to remove a record given its ID (#3337).
  • Added pull method in RemoteFeedbackDataset (a FeedbackDataset pushed to Argilla) to pull all the records from it and return it as a local copy as a FeedbackDataset (#3465).
  • Added delete method in RemoteFeedbackDataset (a FeedbackDataset pushed to Argilla) (#3512).
  • Added delete_records method in RemoteFeedbackDataset, and delete method in RemoteFeedbackRecord to delete records from Argilla (#3526).

Changed

  • Improved efficiency of weak labeling when dataset contains vectors (#3444).
  • Added ArgillaDatasetMixin to detach the Argilla-related functionality from the FeedbackDataset (#3427)
  • Moved FeedbackDataset-related pydantic.BaseModel schemas to argilla.client.feedback.schemas instead, to be better structured and more scalable and maintainable (#3427)
  • Update CLI to use database async connection (#3450).
  • Limit rating questions values to the positive range [1, 10] (#3451).
  • Updated POST /api/users endpoint to be able to provide a list of workspace names to which the user should be linked (#3462).
  • Updated Python client User.create method to be able to provide a list of workspace names to which the user should be linked (#3462).
  • Updated GET /api/v1/me/datasets/{dataset_id}/records endpoint to allow getting records matching one of the response statuses provided via query param (#3359).
  • Updated POST /api/v1/me/datasets/{dataset_id}/records endpoint to allow searching records matching one of the response statuses provided via query param (#3359).
  • Updated SearchEngine.search method to allow searching records matching one of the response statuses provided (#3359).
  • After calling FeedbackDataset.push_to_argilla, the methods FeedbackDataset.add_records and FeedbackRecord.set_suggestions will automatically call Argilla with no need of calling push_to_argilla explicitly (#3465).
  • Now calling FeedbackDataset.push_to_huggingface dumps the responses as a List[Dict[str, Any]] instead of Sequence to make it more readable via 🤗datasets (#3539).

Fixed

  • Fixed issue with bool values and default from Jinja2 while generating the HuggingFace DatasetCard from argilla_template.md (#3499).
  • Fixed DatasetConfig.from_yaml which was failing when calling FeedbackDataset.from_huggingface as the UUIDs cannot be deserialized automatically by PyYAML, so UUIDs are neither dumped nor loaded anymore (#3502).
  • Fixed an issue that didn't allow the Argilla server to work behind a proxy (#3543).
  • TextClassificationSettings and TokenClassificationSettings labels are properly parsed to strings both in the Python client and in the backend endpoint (#3495).
  • Fixed PUT /api/v1/datasets/{dataset_id}/publish to check whether at least one field and question has required=True (#3511).
  • Fixed FeedbackDataset.from_huggingface as suggestions were being lost when there were no responses (#3539).
  • Fixed QuestionSchema and FieldSchema not validating name attribute (#3550).

Deprecated

  • After calling FeedbackDataset.push_to_argilla, calling push_to_argilla again won't do anything since the dataset is already pushed to Argilla (#3465).
  • After calling FeedbackDataset.push_to_argilla, calling fetch_records won't do anything since the records are lazily fetched from Argilla (#3465).
  • After calling FeedbackDataset.push_to_argilla, the Argilla ID is no longer stored in the attribute/property argilla_id but in id instead (#3465).

As always, thanks to our amazing contributors

Full Changelog: v1.13.3...v1.14.0

v1.13.3

27 Jul 14:54
d37ea7e

1.13.3

Fixed

  • Fixed ModuleNotFoundError caused because the argilla.utils.telemetry module used in the ArgillaTrainer was importing an optional dependency not installed by default (#3471).
  • Fixed ImportError caused because the argilla.client.feedback.config module was importing pyyaml optional dependency not installed by default (#3471).

Full Changelog: v1.13.2...v1.13.3

v1.13.2

24 Jul 14:08

1.13.2

Fixed

  • The suggestion_type_enum ENUM data type created in PostgreSQL didn't have any value (#3445).

v1.13.1

21 Jul 11:16

1.13.1

Fixed

  • Fix database migration for PostgreSQL (See #3438)

Full Changelog: v1.13.0...v1.13.1