
Conversation

@FranciscoThiesen

No description provided.

@meta-cla

meta-cla bot commented Oct 25, 2025

Hi @FranciscoThiesen!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Oct 25, 2025
@oulgen
Contributor

oulgen commented Oct 25, 2025

kicked off the CI, but you'll probably need to update the requirements file

@FranciscoThiesen
Author

@oulgen do you mind taking a look whenever time permits?

@oulgen
Contributor

oulgen commented Oct 31, 2025

Hey @FranciscoThiesen thank you for implementing a new autotuning algorithm. Could you share some results? Perhaps you can compare to PatternSearch in terms of

  • Convergence time
  • Best perf found

also please make sure the tests and the lint are passing
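A minimal harness along these lines could produce those two numbers per algorithm (sketch only; `toy_search` is a hypothetical stand-in for wrapped MFBO and PatternSearch runs, not the repo's actual API):

```python
import random
import time

def compare_search(algorithms, budget=50):
    """Run each autotuner with the same evaluation budget and record
    wall-clock convergence time and the best performance found."""
    results = {}
    for name, search in algorithms.items():
        start = time.perf_counter()
        best = search(budget)  # assumed to return the best latency found
        results[name] = {"time_s": time.perf_counter() - start, "best": best}
    return results

def toy_search(budget):
    # Stand-in objective; a real run would benchmark compiled configs.
    rng = random.Random(0)
    return min(rng.random() for _ in range(budget))

results = compare_search({"toy": toy_search})
```

In a real comparison the same kernel and config space would be passed to both searches, with the budget fixed so the "best perf found" column is apples-to-apples.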

@oulgen
Contributor

oulgen commented Oct 31, 2025

also please update your new unit test file to be the same style as rest of the test using a class
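The class-based style being asked for looks roughly like this (sketch; the class name, test name, and assertion are placeholders, not the suite's actual contents):

```python
import unittest

class TestMultiFidelityBO(unittest.TestCase):
    """Class-based test matching the style of the rest of the suite."""

    def test_encoder_roundtrip(self):
        # A real test would encode a config and assert it decodes back;
        # this assertion is a placeholder.
        self.assertEqual([(0.0, 1.0)] * 2, [(0.0, 1.0), (0.0, 1.0)])
```

Tests in this style are picked up by `python -m unittest` and by the repo's existing test runner without special-casing.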

@FranciscoThiesen
Author

@oulgen do you have any available GPUs that could be used for convergence analysis + best-perf comparison? I definitely agree that having this is a must to assess how good MFBO is versus the current hill-climbing approach.

I think this will really shine in terms of reducing the total time/resources that the auto-tuner takes, while still finding good solutions.

I see that the CI runs a few of the tests on GPUs and unfortunately I don't have a personal one that I can use for my OS contributions.

@oulgen
Contributor

oulgen commented Oct 31, 2025

> @oulgen do you have any available GPUs that could be used for convergence analysis + best-perf comparison? I definitely agree that having this is a must to assess how good MFBO is versus the current hill-climbing approach.
>
> I think this will really shine in terms of reducing the total time/resources that the auto-tuner takes, while still finding good solutions.
>
> I see that the CI runs a few of the tests on GPUs and unfortunately I don't have a personal one that I can use for my OS contributions.

I can give it a try, but no promises on when.

@FranciscoThiesen FranciscoThiesen changed the title [Not ready for reviews yet, just want to run CI] - Implementing multi-fidelity bayesian search for the auto-tuner Implemented multi-fidelity bayesian search for the auto-tuner Nov 3, 2025
@FranciscoThiesen
Author

@oulgen got GPUs here for running the convergence analysis. Will run it whenever time permits and then share the results.

Contributor

@jansel left a comment:

How well does this work? Can you share some data comparing this to some other search methods in terms of best perf over time tuning?

Contributor

This looks algorithm-specific; let's move all the related files to a subfolder.


from typing import TYPE_CHECKING

import numpy as np
Contributor

Let's use torch to avoid adding a numpy dependency.
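For reference, the numpy calls a GP-based search typically needs have direct torch equivalents, so the dependency can be dropped without restructuring (illustrative mapping only, not the PR's actual code):

```python
import torch

x = torch.tensor([1.0, 2.0, 4.0])

mean = x.mean().item()         # np.mean(x)
best = int(x.argmin())         # np.argmin(x)
logs = torch.log2(x)           # np.log2(x)
stacked = torch.stack([x, x])  # np.stack([x, x])
```

Since Helion already depends on torch, this keeps the encoder free of any extra runtime requirement.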

Args:
    config: The configuration to benchmark.
    fn: A precompiled version of config.
    fidelity: Number of repetitions for benchmarking (default: 50).
Contributor

Let's rename this to `repeat` or `samples`.

)
elif enc_type == "enum":
    # One-hot encoding
    if hasattr(spec, "choices"):
Contributor

When is this false?

from .config_generation import FlatConfig


class ConfigEncoder:
Contributor

Could this share more code with ConfigGeneration?

Comment on lines +47 to +50
if category in {
    Category.BLOCK_SIZE,
    Category.NUM_WARPS,
}:
Contributor

I think it would be better if you used the ConfigFragment directly (add a method for this type of encoding) rather than switching based on category.
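The suggestion above amounts to letting each fragment type own its encoding instead of branching on `Category`. A rough sketch (class and method names here are hypothetical, not Helion's actual API):

```python
import math

class ConfigSpecFragment:
    def encode(self, value):
        raise NotImplementedError

class PowerOfTwoFragment(ConfigSpecFragment):
    def encode(self, value):
        # Power-of-two values (block sizes, num_warps) map to log2 space.
        return [math.log2(float(value))]

class EnumFragment(ConfigSpecFragment):
    def __init__(self, choices):
        self.choices = choices

    def encode(self, value):
        # One-hot encoding over the declared choices.
        return [1.0 if c == value else 0.0 for c in self.choices]

print(PowerOfTwoFragment().encode(64))       # [6.0]
print(EnumFragment(["a", "b"]).encode("b"))  # [0.0, 1.0]
```

With this shape, `ConfigEncoder` just concatenates `spec.encode(value)` for each fragment and never inspects categories itself.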

except (ValueError, IndexError):
    # Default to first choice if value not found
    encoded[enc_start] = 1.0

Contributor

Suggested change:

else:
    raise an error

Comment on lines +106 to +133

def get_bounds(self) -> list[tuple[float, float]]:
    """
    Get bounds for each encoded dimension.

    Returns:
        List of (min, max) tuples for each dimension.
    """
    bounds: list[tuple[float, float]] = []

    for flat_idx, spec in enumerate(self.flat_spec):
        category = spec.category()
        enc_start, enc_end, enc_type = self.encoding_map[flat_idx]

        if enc_type == "numerical":
            if category in {Category.BLOCK_SIZE, Category.NUM_WARPS}:
                # Power-of-2: log2 bounds
                min_val = math.log2(float(spec.low))  # type: ignore[attr-defined]
                max_val = math.log2(float(spec.high))  # type: ignore[attr-defined]
                bounds.append((min_val, max_val))
            else:
                # Other numerical bounds
                bounds.append(
                    (float(spec.low), float(spec.high))  # type: ignore[attr-defined]
                )
        elif enc_type == "enum":
            # One-hot: each dimension is 0 or 1
            num_choices = enc_end - enc_start
            bounds.extend([(0.0, 1.0)] * num_choices)
Contributor

The ConfigSpecFragment already has methods for this.

Comment on lines +5 to +8
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel
from sklearn.gaussian_process.kernels import Matern
Contributor

These should be optional deps.


Labels

CLA Signed This label is managed by the Meta Open Source bot.


3 participants