Add model loader arg to generate_layers_metrics #81

Open: wants to merge 21 commits into main
38 changes: 38 additions & 0 deletions aiu_fms_testing_utils/utils/metrics_utils.py
@@ -3,6 +3,44 @@
import torch.nn as nn


def get_model_prefix(
    model_path,
    shapes_size,
    max_new_tokens=None,
    batch_size=None,
    seq_length=None,
    dtype=None,
    include_shapes=False,
):
    """
    Generate a prefix for a model based on its path and other parameters.

    Args:
        model_path (str): The path to the model, or a Hugging Face model ID.
        shapes_size (int): The size of the shapes array to use in the model.
        max_new_tokens (int, optional): The maximum number of new tokens to use in the model.
        batch_size (int, optional): The batch size to use in the model.
        seq_length (int, optional): The sequence length to use in the model.
        dtype (str, optional): The data type to use in the model.
        include_shapes (bool): Whether to append the shape parameters to the prefix.
    Returns:
        str: A prefix for the model based on its path and other parameters.
    """
    if model_path.count("/") > 1:
        # More than one "/": model_path does not match the HF "org/name" pattern,
        # so keep only the last two path components.
        # E.g.: /home/another-dir/another/ibm-granite/granite-3.3-8b-base
        model_prefix = model_path.split("/")[-2] + "--" + model_path.split("/")[-1]
    else:
        # At most one "/": model_path matches the HF pattern.
        # E.g.: ibm-granite/granite-3.3-8b-base
        model_prefix = model_path.replace("/", "--")

    if shapes_size > 1 or include_shapes:
        # Append the shape parameters so that each shape gets a distinct prefix.
        model_prefix = (
            f"{model_prefix}_max-new-tokens-{max_new_tokens}"
            f"_batch-size-{batch_size}_seq-length-{seq_length}_dtype-{dtype}"
        )

    return model_prefix


def abs_diff_linalg_norm(res_vector):
    """
    Calculates the Euclidean norm (also known as the L2 norm) of a given array res_vector. This is equivalent to finding the square
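The following is a minimal sketch of how `get_model_prefix` from the diff above might behave for the two path styles mentioned in its comments. The function body is copied here so the snippet runs on its own. It assumes that `None` and `False` in the signature are meant as default values rather than annotations, and the shape values (`128`, `1`, `64`, `"fp16"`) are made up for illustration.

```python
def get_model_prefix(
    model_path,
    shapes_size,
    max_new_tokens=None,  # assumption: None/False are defaults, not annotations
    batch_size=None,
    seq_length=None,
    dtype=None,
    include_shapes=False,
):
    # Logic copied from the diff above.
    if model_path.count("/") > 1:
        # Local path: keep only the last two components.
        model_prefix = model_path.split("/")[-2] + "--" + model_path.split("/")[-1]
    else:
        # HF-style "org/name" ID: replace the separator.
        model_prefix = model_path.replace("/", "--")

    if shapes_size > 1 or include_shapes:
        model_prefix = (
            f"{model_prefix}_max-new-tokens-{max_new_tokens}"
            f"_batch-size-{batch_size}_seq-length-{seq_length}_dtype-{dtype}"
        )
    return model_prefix


# HF-style ID and a local path ending in the same two components
# map to the same prefix.
hf_prefix = get_model_prefix("ibm-granite/granite-3.3-8b-base", shapes_size=1)
local_prefix = get_model_prefix(
    "/home/another-dir/another/ibm-granite/granite-3.3-8b-base", shapes_size=1
)

# With more than one shape, the (illustrative) shape parameters are appended.
shaped_prefix = get_model_prefix(
    "ibm-granite/granite-3.3-8b-base",
    shapes_size=2,
    max_new_tokens=128,
    batch_size=1,
    seq_length=64,
    dtype="fp16",
)

print(hf_prefix)
print(local_prefix)
print(shaped_prefix)
```

Two edge cases follow from the string handling. A relative path with a single slash (for example `models/granite`) takes the HF branch, and a local path with a trailing slash yields an empty final component, so callers may want to normalize paths before passing them in.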