Bria 3 2 pipeline #12010
Conversation
- Introduced `BriaTransformer2DModel` and `BriaPipeline` for enhanced image generation capabilities.
- Updated import structures across various modules to include the new Bria components.
- Added utility functions and output classes specific to the Bria pipeline.
- Implemented tests for the Bria pipeline to ensure functionality and output integrity.
…bria_3_2_pipeline
Error During Model loading
Code
import torch
from diffusers import BriaPipeline
pipe = BriaPipeline.from_pretrained("briaai/BRIA-3.2", torch_dtype=torch.bfloat16)
pipe.to(device="cuda")
prompt = "A asian girl with red top and blue jeans"
negative_prompt = "Logo,Watermark,Ugly,Morbid,Extra fingers,Poorly drawn hands,Mutation,Blurry,Extra limbs,Gross proportions,Missing arms,Mutated hands,Long neck,Duplicate,Mutilated,Mutilated hands,Poorly drawn face,Deformed,Bad anatomy,Cloned face,Malformed limbs,Missing legs,Too many fingers"
images = pipe(prompt=prompt, negative_prompt=negative_prompt, height=1024, width=1024).images[0]
images
Logs
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/tmp/ipython-input-4-3624485676.py in <cell line: 0>()
----> 1 pipe = BriaPipeline.from_pretrained("briaai/BRIA-3.2", torch_dtype=torch.bfloat16)
2 pipe.to(device="cuda")
3 prompt = "A asian girl with red top and blue jeans"
4 negative_prompt = "Logo,Watermark,Ugly,Morbid,Extra fingers,Poorly drawn hands,Mutation,Blurry,Extra limbs,Gross proportions,Missing arms,Mutated hands,Long neck,Duplicate,Mutilated,Mutilated hands,Poorly drawn face,Deformed,Bad anatomy,Cloned face,Malformed limbs,Missing legs,Too many fingers"
5
3 frames
/content/diffusers/src/diffusers/pipelines/pipeline_utils.py in download(cls, pretrained_model_name, **kwargs)
1536
1537 if load_components_from_hub and not trust_remote_code:
-> 1538 raise ValueError(
1539 f"The repository for {pretrained_model_name} contains custom code in {'.py, '.join([os.path.join(k, v) for k, v in custom_components.items()])} which must be executed to correctly "
   1540                     f"load the model. You can inspect the repository content at {', '.join([f'https://hf.co/{pretrained_model_name}/{k}/{v}.py' for k, v in custom_components.items()])}.\n"
ValueError: The repository for briaai/BRIA-3.2 contains custom code in transformer/transformer_bria which must be executed to correctly load the model. You can inspect the repository content at https://hf.co/briaai/BRIA-3.2/transformer/transformer_bria.py.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
FIX
To fix it, run:
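The exact snippet from the original comment is not preserved above; the following is only a sketch of the workaround the error message itself suggests, namely passing `trust_remote_code=True` when loading:

```python
import torch
from diffusers import BriaPipeline

# Sketch only: opt in to running the repo's custom code.
# Review the remote code (transformer/transformer_bria.py) before enabling this.
pipe = BriaPipeline.from_pretrained(
    "briaai/BRIA-3.2",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
pipe.to("cuda")
```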
Hey @SahilCarterr, I fixed the Hugging Face model repo and ran the make targets.
…dND class for rotary position embedding, and enhance Timestep and TimestepProjEmbeddings classes. Add utility functions for handling negative prompts and generating original sigmas in pipeline_bria.py.
Hey @galbria, check out the reviews above and edit accordingly.
Hey @galbria, you need to fix some tests.
Error Logs
=================================== FAILURES ===================================
________________ BriaPipelineSlowTests.test_bria_inference_bf16 ________________
self = <tests.pipelines.bria.test_pipeline_bria.BriaPipelineSlowTests testMethod=test_bria_inference_bf16>
def test_bria_inference_bf16(self):
> pipe = self.pipeline_class.from_pretrained(
self.repo_id, torch_dtype=torch.bfloat16, text_encoder=None, tokenizer=None
)
diffusers/tests/pipelines/bria/test_pipeline_bria.py:270:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/local/lib/python3.11/dist-packages/huggingface_hub/utils/_validators.py:114: in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
diffusers/src/diffusers/pipelines/pipeline_utils.py:1093: in from_pretrained
model = pipeline_class(**init_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = BriaPipeline {
"_class_name": "BriaPipeline",
"_diffusers_version": "0.35.0.dev0",
"feature_extractor": [
nu...ansformer": [
"diffusers",
"BriaTransformer2DModel"
],
"vae": [
"diffusers",
"AutoencoderKL"
]
}
transformer = BriaTransformer2DModel(
(pos_embed): EmbedND()
(time_embed): TimestepProjEmbeddings(
(time_proj): Timesteps()
...((2304,), eps=1e-06, elementwise_affine=False)
)
(proj_out): Linear(in_features=2304, out_features=16, bias=True)
)
scheduler = FlowMatchEulerDiscreteScheduler {
"_class_name": "FlowMatchEulerDiscreteScheduler",
"_diffusers_version": "0.35.0....beta_sigmas": false,
"use_dynamic_shifting": true,
"use_exponential_sigmas": false,
"use_karras_sigmas": false
}
vae = AutoencoderKL(
(encoder): Encoder(
(conv_in): Conv2d(3, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
... Conv2d(8, 8, kernel_size=(1, 1), stride=(1, 1))
(post_quant_conv): Conv2d(4, 4, kernel_size=(1, 1), stride=(1, 1))
)
text_encoder = None, tokenizer = None, image_encoder = None
feature_extractor = None
def __init__(
self,
transformer: BriaTransformer2DModel,
scheduler: Union[FlowMatchEulerDiscreteScheduler, KarrasDiffusionSchedulers],
vae: AutoencoderKL,
text_encoder: T5EncoderModel,
tokenizer: T5TokenizerFast,
image_encoder: CLIPVisionModelWithProjection = None,
feature_extractor: CLIPImageProcessor = None,
):
self.register_modules(
vae=vae,
text_encoder=text_encoder,
tokenizer=tokenizer,
transformer=transformer,
scheduler=scheduler,
image_encoder=image_encoder,
feature_extractor=feature_extractor,
)
# TODO - why different than offical flux (-1)
self.vae_scale_factor = (
2 ** (len(self.vae.config.block_out_channels)) if hasattr(self, "vae") and self.vae is not None else 16
)
self.image_processor = VaeImageProcessor(vae_scale_factor=self.vae_scale_factor)
self.default_sample_size = 64 # due to patchify=> 128,128 => res of 1k,1k
# T5 is senstive to precision so we use the precision used for precompute and cast as needed
> self.text_encoder = self.text_encoder.to(dtype=T5_PRECISION)
^^^^^^^^^^^^^^^^^^^^
E AttributeError: 'NoneType' object has no attribute 'to'
diffusers/src/diffusers/pipelines/bria/pipeline_bria.py:189: AttributeError
----------------------------- Captured stderr call -----------------------------
Loading pipeline components...: 0%| | 0/3 [00:00<?, ?it/s]Loaded vae as AutoencoderKL from `vae` subfolder of briaai/BRIA-3.2.
Loading pipeline components...: 33%|███▎ | 1/3 [00:00<00:00, 6.94it/s]Loaded scheduler as FlowMatchEulerDiscreteScheduler from `scheduler` subfolder of briaai/BRIA-3.2.
Loaded transformer as BriaTransformer2DModel from `transformer` subfolder of briaai/BRIA-3.2.
Loading pipeline components...: 100%|██████████| 3/3 [00:02<00:00, 1.30it/s]
_____________________ BriaPipelineSlowTests.test_to_dtype ______________________
self = <tests.pipelines.bria.test_pipeline_bria.BriaPipelineSlowTests testMethod=test_to_dtype>
def test_to_dtype(self):
> components = self.get_dummy_components()
^^^^^^^^^^^^^^^^^^^^^^^^^
E AttributeError: 'BriaPipelineSlowTests' object has no attribute 'get_dummy_components'
diffusers/tests/pipelines/bria/test_pipeline_bria.py:318: AttributeError
_________________ BriaPipelineNightlyTests.test_bria_inference _________________
self = <tests.pipelines.bria.test_pipeline_bria.BriaPipelineNightlyTests testMethod=test_bria_inference>
def test_bria_inference(self):
pipe = BriaPipeline.from_pretrained("briaai/BRIA-3.2", torch_dtype=torch.bfloat16)
pipe.to(torch_device)
prompt = "a close-up of a smiling cat, high quality, realistic"
image = pipe(prompt=prompt, num_inference_steps=5, output_type="np").images[0]
> image_slice = image[0, :10, :10, 0].flatten()
^^^^^^^^^^^^^^^^^^^^^
E IndexError: too many indices for array: array is 3-dimensional, but 4 were indexed
diffusers/tests/pipelines/bria/test_pipeline_bria.py:349: IndexError
----------------------------- Captured stdout call -----------------------------
Using dynamic shift in pipeline with sequence length 4096
----------------------------- Captured stderr call -----------------------------
Loading pipeline components...: 0%| | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s]
Loading checkpoint shards: 33%|███▎ | 1/3 [00:02<00:04, 2.01s/it]
Loading checkpoint shards: 67%|██████▋ | 2/3 [00:04<00:02, 2.01s/it]
Loading checkpoint shards: 100%|██████████| 3/3 [00:04<00:00, 1.55s/it]
Loaded text_encoder as T5EncoderModel from `text_encoder` subfolder of briaai/BRIA-3.2.
Loading pipeline components...: 20%|██ | 1/5 [00:06<00:24, 6.15s/it]You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loaded tokenizer as T5TokenizerFast from `tokenizer` subfolder of briaai/BRIA-3.2.
Loading pipeline components...: 40%|████ | 2/5 [00:06<00:08, 2.70s/it]Loaded vae as AutoencoderKL from `vae` subfolder of briaai/BRIA-3.2.
Loading pipeline components...: 60%|██████ | 3/5 [00:06<00:03, 1.52s/it]Loaded scheduler as FlowMatchEulerDiscreteScheduler from `scheduler` subfolder of briaai/BRIA-3.2.
Loaded transformer as BriaTransformer2DModel from `transformer` subfolder of briaai/BRIA-3.2.
Loading pipeline components...: 100%|██████████| 5/5 [00:08<00:00, 1.72s/it]
100%|██████████| 5/5 [00:08<00:00, 1.77s/it]
=========================== short test summary info ============================
FAILED diffusers/tests/pipelines/bria/test_pipeline_bria.py::BriaPipelineSlowTests::test_bria_inference_bf16 - AttributeError: 'NoneType' object has no attribute 'to'
FAILED diffusers/tests/pipelines/bria/test_pipeline_bria.py::BriaPipelineSlowTests::test_to_dtype - AttributeError: 'BriaPipelineSlowTests' object has no attribute 'get_dummy_...
FAILED diffusers/tests/pipelines/bria/test_pipeline_bria.py::BriaPipelineNightlyTests::test_bria_inference - IndexError: too many indices for array: array is 3-dimensional, but 4 were ...
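For reference, here are hedged sketches of the kind of changes that would address the first and third failures; these are assumptions about the eventual fix, not the actual commits, and the fragments refer to the pipeline `__init__` and test code quoted in the traceback above:

```python
# Failure 1: BriaPipeline.__init__ casts the T5 encoder unconditionally, which breaks
# when the pipeline is loaded with text_encoder=None. A guard avoids the AttributeError:
if self.text_encoder is not None:
    self.text_encoder = self.text_encoder.to(dtype=T5_PRECISION)

# Failure 3: with output_type="np", .images[0] already drops the batch dimension,
# so the slice must use three indices on the (H, W, C) array:
image_slice = image[:10, :10, 0].flatten()
```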
Hey @SahilCarterr, I fixed the tests.
@bot /style
Thank you for the amazing PR @galbria! It's very clean and looks good to merge for the most part. We are working on refactoring and redesigning some of the modeling implementations, so my asks are mostly in line with that. Apologies for the long wait-time to review!
- Updated the GitHub repository link for Bria 3.2.
- Added usage instructions for the gated model access.
- Introduced the BriaTransformerBlock and BriaAttention classes to the model architecture.
- Refactored existing classes to integrate Bria-specific components, including BriaEmbedND and BriaPipeline.
- Updated the pipeline output class to reflect Bria-specific functionality.
- Adjusted test cases to align with the new Bria model structure.
@asomoza @a-r-r-o-w Thank you very much for the CR!!! I've fixed the comments 🤘
Thank you for iterating! I have a few more comments
return _get_projections(attn, hidden_states, encoder_hidden_states)
def get_1d_rotary_pos_embed(
(nit): wondering if this could be imported and used here, or do you have any custom changes?
diffusers/src/diffusers/models/embeddings.py
Line 1112 in 3c0531b
def get_1d_rotary_pos_embed(
We made some custom changes:
freqs_cos = freqs.cos().repeat_interleave(2, dim=1).float()
freqs_sin = freqs.sin().repeat_interleave(2, dim=1).float()
compare to:
freqs_cos = freqs.cos().repeat_interleave(2, dim=1, output_size=freqs.shape[1] * 2).float()
freqs_sin = freqs.sin().repeat_interleave(2, dim=1, output_size=freqs.shape[1] * 2).float()
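A minimal sketch of the difference (assuming a PyTorch version where `repeat_interleave` accepts `output_size`): with a scalar repeat count the two variants are numerically identical, and `output_size` only pre-declares the length of the result.

```python
import torch

# Compare the two variants discussed above.
freqs = torch.randn(4, 8)

cos_plain = freqs.cos().repeat_interleave(2, dim=1).float()
cos_sized = freqs.cos().repeat_interleave(2, dim=1, output_size=freqs.shape[1] * 2).float()

assert torch.equal(cos_plain, cos_sized)  # identical values; only the call signature differs
```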
- Removed outdated inference example from Bria 3.2 documentation.
- Introduced the BriaTransformerBlock class to enhance model architecture.
- Updated attention handling to use `attention_kwargs` instead of `joint_attention_kwargs`.
- Improved import structure in the Bria pipeline to handle optional dependencies.
- Adjusted test cases to reflect changes in model dtype assertions.
Thanks for the fast response 🏎️
What does this PR do?
Implements the Bria 3.2 pipeline (`BriaPipeline`) and `BriaTransformer2DModel`.
issue: here
Before submitting
See the documentation guidelines and the tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.