Support device-level configuration across all devices #1276


Merged

Conversation

@ibelem (Contributor) commented Apr 8, 2025

Currently, setting `free_dimension_overrides` at the model level restricts models that can otherwise handle dynamic shapes, which impacts the cpu, webgpu, and node.js backends.

This PR changes the logic slightly so that `free_dimension_overrides` is only applied at the device level, under the `webnn` key of `transformers.js_config` in config.json.

Current:

```json
{
  "transformers.js_config": {
    "free_dimension_overrides": {
      "batch_size": 1,
      "num_channels": 3,
      "height": 30,
      "width": 30
    }
  }
}
```

This PR:

```json
{
  "transformers.js_config": {
    "webnn": {
      "free_dimension_overrides": {
        "batch_size": 1,
        "num_channels": 3,
        "height": 30,
        "width": 30
      }
    }
  }
}
```

Current:

```json
{
  "transformers.js_config": {
    "device": "webnn-gpu",
    "dtype": "fp16",
    "free_dimension_overrides": {
      "batch_size": 1,
      "num_channels": 3,
      "height": 30,
      "width": 30
    }
  }
}
```

This PR:

```json
{
  "transformers.js_config": {
    "device": "webnn-gpu",
    "dtype": "fp16",
    "webnn": {
      "free_dimension_overrides": {
        "batch_size": 1,
        "num_channels": 3,
        "height": 30,
        "width": 30
      }
    }
  }
}
```
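The intended scoping can be sketched as a small helper (the function name is hypothetical; the real logic lives in the library's config-resolution code): the overrides are honoured only when they sit under the `webnn` key and the selected device is a WebNN device, so cpu, webgpu, and node.js keep dynamic shapes.

```javascript
// Hypothetical sketch of the behaviour this PR describes: return the
// free_dimension_overrides only for WebNN devices, and only when they are
// nested under the `webnn` key of transformers.js_config.
function getFreeDimensionOverrides(transformersJsConfig, device) {
  if (device?.startsWith('webnn')) {
    return transformersJsConfig?.webnn?.free_dimension_overrides ?? null;
  }
  return null; // non-WebNN devices are unaffected
}

const config = {
  webnn: {
    free_dimension_overrides: { batch_size: 1, num_channels: 3, height: 30, width: 30 },
  },
};

getFreeDimensionOverrides(config, 'webnn-gpu'); // the overrides object
getFreeDimensionOverrides(config, 'webgpu');    // null
```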

Tested and passing using xenova/resnet-50 with an updated local config.js file, without setting `free_dimension_overrides` in the app JavaScript code.

@xenova PTAL

CC @Honry @huningxin

@xenova (Collaborator) commented Apr 9, 2025

Thanks for the PR! I think this is a great start for the feature, and in fact it could be generalized even further to support device-level configuration across all devices.

In other words, something like this:

```jsonc
{
  "transformers.js_config": {
    "device": "webnn-gpu", // Default device
    "device_config": {
      "webnn": {
        "free_dimension_overrides": {
          "batch_size": 1,
          "num_channels": 3,
          "height": 30,
          "width": 30
        }
      },
      "webgpu": {
        "dtype": "fp16" // when user sets device to webgpu, we will use dtype fp16
      }
    }
  }
}
```

Sure, this may look a bit lengthy, but the goal is to have all these configs generated automatically (especially for webnn-compatible models).
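A minimal sketch of how such device-level resolution could work (names hypothetical, not the actual implementation): the entry of `device_config` matching the selected device is merged over the top-level defaults.

```javascript
// Hypothetical sketch: merge device_config[device] over the top-level
// transformers.js_config defaults for the selected device.
function resolveConfig(transformersJsConfig, device) {
  const { device_config = {}, ...defaults } = transformersJsConfig;
  // Map e.g. 'webnn-gpu' / 'webnn-npu' onto the 'webnn' entry (an assumption).
  const key = device.split('-')[0];
  return { ...defaults, ...(device_config[key] ?? {}) };
}

const config = {
  device: 'webnn-gpu',
  device_config: {
    webnn: {
      free_dimension_overrides: { batch_size: 1, num_channels: 3, height: 30, width: 30 },
    },
    webgpu: { dtype: 'fp16' },
  },
};

resolveConfig(config, 'webgpu').dtype; // 'fp16'
```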

What do you think?

@ibelem (Contributor, Author) commented Apr 10, 2025

Hi @xenova, that makes sense to me. The Transformers.js and device-specific configurations would look like:

```js
/**
 * Transformers.js-specific configuration, possibly present in config.json under the key `transformers.js_config`.
 * @typedef {Object} TransformersJSConfig
 * @property {import('./utils/devices.js').DeviceType} [device] The default device to use for the model.
 * @property {Record<import('./utils/devices.js').DeviceType, DeviceConfig>} [device_config] Device-specific configurations.
 */

/**
 * Device-specific configuration options.
 * @typedef {Object} DeviceConfig
 * @property {import('./utils/tensor.js').DataType|Record<import('./utils/dtypes.js').DataType, import('./utils/tensor.js').DataType>} [kv_cache_dtype] The data type of the key-value cache.
 * @property {Record<string, number>} [free_dimension_overrides] Override the free dimensions of the model.
 * See https://onnxruntime.ai/docs/tutorials/web/env-flags-and-session-options.html#freedimensionoverrides
 * for more information.
 * @property {import('./utils/dtypes.js').DataType|Record<string, import('./utils/dtypes.js').DataType>} [dtype] The default data type to use for the model.
 * @property {import('./utils/hub.js').ExternalData|Record<string, import('./utils/hub.js').ExternalData>} [use_external_data_format=false] Whether to load the model using the external data format (used for models >= 2GB in size).
 */
```

or


```js
/**
 * Transformers.js-specific configuration, possibly present in config.json under the key `transformers.js_config`.
 * @typedef {Object} TransformersJSConfig
 * @property {Record<import('./utils/devices.js').DeviceType, DeviceConfig>} [device_config] Device-specific configurations.
 * @property {import('./utils/tensor.js').DataType|Record<import('./utils/dtypes.js').DataType, import('./utils/tensor.js').DataType>} [kv_cache_dtype] The data type of the key-value cache.
 * @property {Record<string, number>} [free_dimension_overrides] Override the free dimensions of the model.
 * See https://onnxruntime.ai/docs/tutorials/web/env-flags-and-session-options.html#freedimensionoverrides
 * for more information.
 * @property {import('./utils/devices.js').DeviceType} [device] The default device to use for the model.
 * @property {import('./utils/dtypes.js').DataType|Record<string, import('./utils/dtypes.js').DataType>} [dtype] The default data type to use for the model.
 * @property {import('./utils/hub.js').ExternalData|Record<string, import('./utils/hub.js').ExternalData>} [use_external_data_format=false] Whether to load the model using the external data format (used for models >= 2GB in size).
 */

/**
 * Device-specific configuration options.
 * @typedef {Object} DeviceConfig
 * @property {import('./utils/tensor.js').DataType|Record<import('./utils/dtypes.js').DataType, import('./utils/tensor.js').DataType>} [kv_cache_dtype] The data type of the key-value cache.
 * @property {Record<string, number>} [free_dimension_overrides] Override the free dimensions of the model.
 * See https://onnxruntime.ai/docs/tutorials/web/env-flags-and-session-options.html#freedimensionoverrides
 * for more information.
 * @property {import('./utils/dtypes.js').DataType|Record<string, import('./utils/dtypes.js').DataType>} [dtype] The default data type to use for the model.
 * @property {import('./utils/hub.js').ExternalData|Record<string, import('./utils/hub.js').ExternalData>} [use_external_data_format=false] Whether to load the model using the external data format (used for models >= 2GB in size).
 */
```

The first option is neater and exclusive; the second keeps the original top-level config fallback in addition to the device-specific configs in 3b115bf (#1276).
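The fallback behaviour of the second option can be sketched as a property lookup (helper name hypothetical): the device-specific config is consulted first, then the original top-level value.

```javascript
// Hypothetical sketch of the second option: device-specific values win,
// otherwise the original top-level transformers.js_config value applies.
function lookup(transformersJsConfig, device, property) {
  const key = device.split('-')[0]; // assumed mapping, e.g. 'webnn-gpu' -> 'webnn'
  const deviceConfig = transformersJsConfig.device_config?.[key];
  return deviceConfig?.[property] ?? transformersJsConfig[property];
}

const config = {
  dtype: 'fp32', // original top-level fallback
  device_config: {
    webgpu: { dtype: 'fp16' },
  },
};

lookup(config, 'webgpu', 'dtype'); // 'fp16' (device-specific)
lookup(config, 'cpu', 'dtype');    // 'fp32' (top-level fallback)
```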

@xenova @Honry PTAL

@ibelem changed the title from "[WebNN] Only allow free_dimension_override on a device level" to "Support device-level configuration across all devices" on Apr 10, 2025
@xenova (Collaborator) left a comment:

LGTM ✅

Will be included in v3.5 release

@ibelem (Contributor, Author) commented Apr 16, 2025

@xenova Thank you for your code improvements, they have been a great source of inspiration for me.

@HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@xenova xenova merged commit 6b51147 into huggingface:main Apr 16, 2025
4 checks passed
@xenova (Collaborator) commented Apr 16, 2025

Next step: updating configs on the HF hub.

We can prioritize some of the more popular models and make automated PRs for models which ordinarily only support static shapes. However, we'll need to decide which shapes to use when a model allows dynamic shapes. That is, it is easy for models like https://huggingface.co/Xenova/slimsam-77-uniform, for which we define values in the preprocessor_config.json (e.g., here), but for models like MODNet, which support dynamic shapes, deciding the size to use is a bit trickier.
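For image models with fixed preprocessing, the overrides could plausibly be derived from the sizes already declared in preprocessor_config.json. A rough sketch, with the caveat that the field handling here is an assumption (real preprocessor configs store `size` as a number, `{height, width}`, or `{shortest_edge}`, among other shapes):

```javascript
// Hypothetical sketch: derive free_dimension_overrides for an image model from
// its preprocessor_config.json. Only the common `size` shapes are handled;
// models without a fixed size (e.g. dynamic-shape models like MODNet) return
// null, since the size to use cannot be decided automatically.
function overridesFromPreprocessor(preprocessorConfig) {
  const size = preprocessorConfig.size ?? {};
  const height = typeof size === 'number' ? size : size.height ?? size.shortest_edge;
  const width = typeof size === 'number' ? size : size.width ?? size.shortest_edge;
  if (height == null || width == null) return null; // undecidable here
  return { batch_size: 1, num_channels: 3, height, width };
}

overridesFromPreprocessor({ size: { height: 224, width: 224 } }); // fixed shapes
overridesFromPreprocessor({});                                    // null
```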
