@ArkaSanka ArkaSanka commented Oct 21, 2025

Fix argument handling in oneshot function #1850

Issue Description

The oneshot function signature in oneshot.py was missing several parameters that exist in the underlying dataclasses (DatasetArguments, ModelArguments, RecipeArguments). This caused issues when users tried to use these parameters directly, particularly with:

  • sequential_targets: Conflicts occurred between recipe modifiers and direct parameters
  • preprocessing_func: Raised an error when the dataset was empty
  • pipeline: Was not properly validated against sequential_targets

Changes Made

Parameter Alignment:

  • Updated the oneshot function signature to include all missing parameters from the argument dataclasses
  • Ensured type hints and default values match those defined in the dataclasses
  • Added missing parameters: preprocessing_func, data_collator, raw_kwargs, max_train_samples, pipeline, tracing_ignore, sequential_targets
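
One way to check this kind of alignment is to compare the function signature against the dataclass fields. The following is a minimal sketch, not the code from this PR: DatasetArgs, RecipeArgs, oneshot_sketch, and missing_parameters are hypothetical stand-ins, and the real dataclasses have many more fields.

```python
import inspect
from dataclasses import dataclass, fields
from typing import Callable, List, Optional


# Hypothetical stand-ins for the real argument dataclasses.
@dataclass
class DatasetArgs:
    preprocessing_func: Optional[Callable] = None
    max_train_samples: Optional[int] = None


@dataclass
class RecipeArgs:
    sequential_targets: Optional[List[str]] = None


def oneshot_sketch(
    preprocessing_func: Optional[Callable] = None,
    sequential_targets: Optional[List[str]] = None,
    **kwargs,
):
    """Hypothetical entry point; max_train_samples is deliberately left out."""


def missing_parameters(func, *arg_classes):
    """Return dataclass field names that are not explicit parameters of func."""
    explicit = {
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind not in (param.VAR_POSITIONAL, param.VAR_KEYWORD)
    }
    return sorted(
        f.name for cls in arg_classes for f in fields(cls) if f.name not in explicit
    )


missing = missing_parameters(oneshot_sketch, DatasetArgs, RecipeArgs)
print(missing)
```

A check like this could run in a unit test so that a field added to a dataclass without a matching function parameter is caught early.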

Validation Logic:

  • Added validation to detect conflicting sequential_targets between recipe modifiers and direct parameters
  • Added validation to prevent incompatible pipeline settings with sequential_targets
  • Fixed error message formatting to comply with style guidelines
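
The two validation rules above could look roughly like the sketch below. It is an illustration under assumptions rather than the PR's implementation: validate_sequential_args is a hypothetical helper, recipe_targets stands for whatever value the recipe modifiers carry, and treating any difference between the two values as a conflict is an assumed rule.

```python
from typing import List, Optional, Set, Union

Targets = Optional[Union[str, List[str]]]


def _as_set(value: Targets) -> Optional[Set[str]]:
    """Normalize a single name or a list of names to a set (None stays None)."""
    if value is None:
        return None
    return {value} if isinstance(value, str) else set(value)


def validate_sequential_args(
    sequential_targets: Targets,
    recipe_targets: Targets,
    pipeline: Optional[str],
) -> None:
    """Hypothetical helper: raise ValueError on conflicting settings."""
    direct = _as_set(sequential_targets)
    from_recipe = _as_set(recipe_targets)

    # Assumed rule: values given in both places must agree.
    if direct is not None and from_recipe is not None and direct != from_recipe:
        raise ValueError(
            "sequential_targets passed directly conflict with the recipe: "
            f"{sorted(direct)} != {sorted(from_recipe)}"
        )

    # Rule described in this PR: sequential_targets cannot be combined
    # with the 'independent' pipeline.
    if direct is not None and pipeline == "independent":
        raise ValueError(
            "sequential_targets cannot be used with pipeline='independent'"
        )
```

Raising early, before any model or data is loaded, keeps the error close to the call that caused it.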

Test Improvements:

  • Updated the test fixture in test_api_inputs.py to handle all parameters correctly
  • Added detection for potential parameter conflicts to make tests more robust

Impact

These changes ensure that every parameter defined in the argument dataclasses can be passed directly to the oneshot function. Users can now pass parameters such as sequential_targets and preprocessing_func to oneshot without running into cryptic errors or unexpected behavior. The API is also more consistent with its underlying implementation, which makes it more intuitive to use.

@github-actions

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite. Please only add the label once the PR is code complete and local testing has been performed.

@gemini-code-assist
Contributor

Summary of Changes

Hello @ArkaSanka, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the oneshot function's usability and robustness by aligning its parameter signature with its internal dataclasses and introducing critical validation checks. It ensures that users can pass a wider range of configuration options directly to oneshot without encountering unexpected behavior or cryptic errors, while also preventing operations on empty datasets and incompatible pipeline settings.

Highlights

  • Enhanced oneshot Function Signature: The oneshot function now includes previously missing parameters like preprocessing_func, pipeline, and sequential_targets, aligning its signature with underlying dataclasses for improved usability.
  • Improved Parameter Validation: New validation checks prevent conflicting configurations, specifically disallowing sequential_targets when the pipeline is set to 'independent', ensuring more robust function calls.
  • Empty Dataset Safeguard: A new check has been added to prevent calibration from proceeding with an empty dataset, raising a ValueError for clarity and preventing potential issues.
  • Robust Test Coverage: The test suite has been updated to correctly handle the expanded oneshot parameters and gracefully manage empty dataset scenarios during testing, making tests more reliable.
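
The empty dataset safeguard described in the highlights could be sketched as follows. This is an assumption-laden illustration: ensure_calibration_data is a hypothetical name, and allowing None (no calibration data at all) to pass through is an assumption, not something stated in this PR.

```python
from typing import Optional, Sized


def ensure_calibration_data(dataset: Optional[Sized]) -> None:
    """Hypothetical guard: reject a dataset that exists but has no samples."""
    # Assumption: None means "no calibration requested" and is allowed.
    if dataset is not None and len(dataset) == 0:
        raise ValueError(
            "Calibration dataset is empty; check the dataset split and any "
            "preprocessing or filtering applied to it."
        )
```

Checking the length after preprocessing matters because a filter or tokenization step can drop every sample from a dataset that was not empty to begin with.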

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request enhances the oneshot function by aligning its parameters with the underlying argument dataclasses, which makes the API more complete and intuitive. It also introduces valuable validation for empty datasets and incompatible parameter combinations, such as sequential_targets with an independent pipeline. The tests have been updated accordingly to cover these new parameters and handle potential data-related issues more gracefully. My review focuses on a potential issue in the test suite where preprocessing_func is wrapped in a tuple, which seems to contradict its type hint and the stated goal of the pull request. I have also included a minor style suggestion to improve code maintainability.
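
To illustrate the tuple concern in general terms: a function wrapped in a one-element tuple, often the result of a stray trailing comma, is no longer callable and so no longer matches a Callable type hint. The names below are hypothetical and are not taken from the test file under review.

```python
def tokenize(sample):
    """Hypothetical preprocessing function."""
    return {"text": sample["text"].strip()}


direct = tokenize      # matches a Callable type hint
wrapped = (tokenize,)  # one-element tuple, e.g. from a trailing comma

print(callable(direct))   # the function itself can be called
print(callable(wrapped))  # the tuple cannot

# Unwrapping the tuple recovers the original function.
result = wrapped[0]({"text": "  hello  "})
```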

@ArkaSanka ArkaSanka force-pushed the oneshot-dataset-params branch from 83ebf56 to b882a49 on October 21, 2025 20:12
@ArkaSanka ArkaSanka changed the title from "Add validation for empty dataset and enhance oneshot function parameters" to "[Oneshot] Add validation for empty dataset and enhance oneshot function parameters" on Oct 21, 2025
@kylesayrs kylesayrs self-assigned this Oct 22, 2025
@kylesayrs kylesayrs self-requested a review October 22, 2025 01:59
@dsikka dsikka assigned ArkaSanka and unassigned kylesayrs Oct 22, 2025
@ArkaSanka ArkaSanka force-pushed the oneshot-dataset-params branch from b882a49 to 845aac4 on October 22, 2025 12:55
@ArkaSanka
Author

Hi @kylesayrs, @dsikka, let me know if there are any additional/missing changes to be made.
