Support device-level configuration across all devices #1276


Merged

Conversation

@ibelem (Contributor) commented Apr 8, 2025

Currently, setting `free_dimension_overrides` at the model level restricts models that can otherwise handle dynamic shapes, which impacts the cpu, webgpu, and node.js backends.

This PR changes the logic slightly so that `free_dimension_overrides` is only applied at the device level, under the `webnn` key of `transformers.js_config` in config.json.

Current:

```json
{
  "transformers.js_config": {
    "free_dimension_overrides": {
      "batch_size": 1,
      "num_channels": 3,
      "height": 30,
      "width": 30
    }
  }
}
```

This PR:

```json
{
  "transformers.js_config": {
    "webnn": {
      "free_dimension_overrides": {
        "batch_size": 1,
        "num_channels": 3,
        "height": 30,
        "width": 30
      }
    }
  }
}
```

Current:

```json
{
  "transformers.js_config": {
    "device": "webnn-gpu",
    "dtype": "fp16",
    "free_dimension_overrides": {
      "batch_size": 1,
      "num_channels": 3,
      "height": 30,
      "width": 30
    }
  }
}
```

This PR:

```json
{
  "transformers.js_config": {
    "device": "webnn-gpu",
    "dtype": "fp16",
    "webnn": {
      "free_dimension_overrides": {
        "batch_size": 1,
        "num_channels": 3,
        "height": 30,
        "width": 30
      }
    }
  }
}
```
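The intended scoping can be sketched as a small helper (the function name is hypothetical; the real logic lives in the library's config-resolution code): the overrides are honoured only when they sit under the `webnn` key and the selected device is a WebNN device, so cpu, webgpu, and node.js keep dynamic shapes.

```javascript
// Hypothetical sketch of the behaviour this PR describes: return the
// free_dimension_overrides only for WebNN devices, and only when they are
// nested under the `webnn` key of transformers.js_config.
function getFreeDimensionOverrides(transformersJsConfig, device) {
  if (device?.startsWith('webnn')) {
    return transformersJsConfig?.webnn?.free_dimension_overrides ?? null;
  }
  return null; // non-WebNN devices are unaffected
}

const config = {
  webnn: {
    free_dimension_overrides: { batch_size: 1, num_channels: 3, height: 30, width: 30 },
  },
};

getFreeDimensionOverrides(config, 'webnn-gpu'); // the overrides object
getFreeDimensionOverrides(config, 'webgpu');    // null
```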

Tested and passing using xenova/resnet-50 with an updated local config.js file, without setting `free_dimension_overrides` in the app JavaScript code.

@xenova PTAL

CC @Honry @huningxin

@xenova (Collaborator) commented Apr 9, 2025

Thanks for the PR! I think this is a great start for the feature, and in fact it could be generalized even further to support device-level configuration across all devices.

In other words, something like this:

```jsonc
{
  "transformers.js_config": {
    "device": "webnn-gpu", // Default device
    "device_config": {
      "webnn": {
        "free_dimension_overrides": {
          "batch_size": 1,
          "num_channels": 3,
          "height": 30,
          "width": 30
        }
      },
      "webgpu": {
        "dtype": "fp16" // when user sets device to webgpu, we will use dtype fp16
      }
    }
  }
}
```

Sure, this may look a bit lengthy, but the goal is to have all these configs generated automatically (especially for webnn-compatible models).
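A minimal sketch of how such device-level resolution could work (names hypothetical, not the actual implementation): the entry of `device_config` matching the selected device is merged over the top-level defaults.

```javascript
// Hypothetical sketch: merge device_config[device] over the top-level
// transformers.js_config defaults for the selected device.
function resolveConfig(transformersJsConfig, device) {
  const { device_config = {}, ...defaults } = transformersJsConfig;
  // Map e.g. 'webnn-gpu' / 'webnn-npu' onto the 'webnn' entry (an assumption).
  const key = device.split('-')[0];
  return { ...defaults, ...(device_config[key] ?? {}) };
}

const config = {
  device: 'webnn-gpu',
  device_config: {
    webnn: {
      free_dimension_overrides: { batch_size: 1, num_channels: 3, height: 30, width: 30 },
    },
    webgpu: { dtype: 'fp16' },
  },
};

resolveConfig(config, 'webgpu').dtype; // 'fp16'
```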

What do you think?

@ibelem (Contributor, Author) commented Apr 10, 2025

Hi @xenova, that makes sense to me. The Transformers.js and device-specific configurations would look like:

```js
/**
 * Transformers.js-specific configuration, possibly present in config.json under the key `transformers.js_config`.
 * @typedef {Object} TransformersJSConfig
 * @property {import('./utils/devices.js').DeviceType} [device] The default device to use for the model.
 * @property {Record<import('./utils/devices.js').DeviceType, DeviceConfig>} [device_config] Device-specific configurations.
 */

/**
 * Device-specific configuration options.
 * @typedef {Object} DeviceConfig
 * @property {import('./utils/tensor.js').DataType|Record<import('./utils/dtypes.js').DataType, import('./utils/tensor.js').DataType>} [kv_cache_dtype] The data type of the key-value cache.
 * @property {Record<string, number>} [free_dimension_overrides] Override the free dimensions of the model.
 * See https://onnxruntime.ai/docs/tutorials/web/env-flags-and-session-options.html#freedimensionoverrides
 * for more information.
 * @property {import('./utils/dtypes.js').DataType|Record<string, import('./utils/dtypes.js').DataType>} [dtype] The default data type to use for the model.
 * @property {import('./utils/hub.js').ExternalData|Record<string, import('./utils/hub.js').ExternalData>} [use_external_data_format=false] Whether to load the model using the external data format (used for models >= 2GB in size).
 */
```

or


```js
/**
 * Transformers.js-specific configuration, possibly present in config.json under the key `transformers.js_config`.
 * @typedef {Object} TransformersJSConfig
 * @property {Record<import('./utils/devices.js').DeviceType, DeviceConfig>} [device_config] Device-specific configurations.
 * @property {import('./utils/tensor.js').DataType|Record<import('./utils/dtypes.js').DataType, import('./utils/tensor.js').DataType>} [kv_cache_dtype] The data type of the key-value cache.
 * @property {Record<string, number>} [free_dimension_overrides] Override the free dimensions of the model.
 * See https://onnxruntime.ai/docs/tutorials/web/env-flags-and-session-options.html#freedimensionoverrides
 * for more information.
 * @property {import('./utils/devices.js').DeviceType} [device] The default device to use for the model.
 * @property {import('./utils/dtypes.js').DataType|Record<string, import('./utils/dtypes.js').DataType>} [dtype] The default data type to use for the model.
 * @property {import('./utils/hub.js').ExternalData|Record<string, import('./utils/hub.js').ExternalData>} [use_external_data_format=false] Whether to load the model using the external data format (used for models >= 2GB in size).
 */

/**
 * Device-specific configuration options.
 * @typedef {Object} DeviceConfig
 * @property {import('./utils/tensor.js').DataType|Record<import('./utils/dtypes.js').DataType, import('./utils/tensor.js').DataType>} [kv_cache_dtype] The data type of the key-value cache.
 * @property {Record<string, number>} [free_dimension_overrides] Override the free dimensions of the model.
 * See https://onnxruntime.ai/docs/tutorials/web/env-flags-and-session-options.html#freedimensionoverrides
 * for more information.
 * @property {import('./utils/dtypes.js').DataType|Record<string, import('./utils/dtypes.js').DataType>} [dtype] The default data type to use for the model.
 * @property {import('./utils/hub.js').ExternalData|Record<string, import('./utils/hub.js').ExternalData>} [use_external_data_format=false] Whether to load the model using the external data format (used for models >= 2GB in size).
 */
```

The first option is neater and exclusive; the second keeps the original top-level config fallback in addition to the device-specific configs in 3b115bf (#1276).
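The fallback behaviour of the second option can be sketched as a property lookup (helper name hypothetical): the device-specific config is consulted first, then the original top-level value.

```javascript
// Hypothetical sketch of the second option: device-specific values win,
// otherwise the original top-level transformers.js_config value applies.
function lookup(transformersJsConfig, device, property) {
  const key = device.split('-')[0]; // assumed mapping, e.g. 'webnn-gpu' -> 'webnn'
  const deviceConfig = transformersJsConfig.device_config?.[key];
  return deviceConfig?.[property] ?? transformersJsConfig[property];
}

const config = {
  dtype: 'fp32', // original top-level fallback
  device_config: {
    webgpu: { dtype: 'fp16' },
  },
};

lookup(config, 'webgpu', 'dtype'); // 'fp16' (device-specific)
lookup(config, 'cpu', 'dtype');    // 'fp32' (top-level fallback)
```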

@xenova @Honry PTAL

@ibelem changed the title from "[WebNN] Only allow free_dimension_override on a device level" to "Support device-level configuration across all devices" on Apr 10, 2025
@xenova (Collaborator) left a comment:

LGTM ✅

Will be included in v3.5 release

@ibelem (Contributor, Author) commented Apr 16, 2025

@xenova Thank you for your code improvements, they have been a great source of inspiration for me.

@HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@xenova xenova merged commit 6b51147 into huggingface:main Apr 16, 2025
4 checks passed
@xenova (Collaborator) commented Apr 16, 2025

Next step: updating configs on the HF hub.

We can prioritize some of the more popular models and make automated PRs for models which ordinarily only support static shapes. However, we'll need to decide which shapes to use when a model allows dynamic shapes. That is, it is easy for models like https://huggingface.co/Xenova/slimsam-77-uniform, for which we define values in the preprocessor_config.json (e.g., here), but for models like MODNet, which support dynamic shapes, deciding the size to use is a bit trickier.
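For image models with fixed preprocessing, the overrides could plausibly be derived from the sizes already declared in preprocessor_config.json. A rough sketch, with the caveat that the field handling here is an assumption (real preprocessor configs store `size` as a number, `{height, width}`, or `{shortest_edge}`, among other shapes):

```javascript
// Hypothetical sketch: derive free_dimension_overrides for an image model from
// its preprocessor_config.json. Only the common `size` shapes are handled;
// models without a fixed size (e.g. dynamic-shape models like MODNet) return
// null, since the size to use cannot be decided automatically.
function overridesFromPreprocessor(preprocessorConfig) {
  const size = preprocessorConfig.size ?? {};
  const height = typeof size === 'number' ? size : size.height ?? size.shortest_edge;
  const width = typeof size === 'number' ? size : size.width ?? size.shortest_edge;
  if (height == null || width == null) return null; // undecidable here
  return { batch_size: 1, num_channels: 3, height, width };
}

overridesFromPreprocessor({ size: { height: 224, width: 224 } }); // fixed shapes
overridesFromPreprocessor({});                                    // null
```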
