
Update fork #1

Open · wants to merge 107 commits into base: main

Conversation

@AleHD commented Feb 26, 2025

Opening PR to keep track of upstream changes

Narsil and others added 30 commits February 26, 2025 14:11
* add recommendations for Ascend NPU using flash_attn

* update recommend_message_npu

Co-authored-by: Marc Sun <[email protected]>

---------

Co-authored-by: Marc Sun <[email protected]>
…36395)

* fix: prevent model access error during Optuna hyperparameter tuning

The `transformers.integrations.integration_utils.run_hp_search_optuna` function releases model memory and sets `trainer.model` to `None` after each trial. This causes an `AttributeError` when subsequent `Trainer.train` calls attempt to access the model before reinitialization. The issue only appears when the `fp16_full_eval` or `bf16_full_eval` flags are enabled (a hedged sketch of the guard follows this commit message).

* Update src/transformers/trainer.py

Co-authored-by: Marc Sun <[email protected]>

---------

Co-authored-by: Marc Sun <[email protected]>
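
For context, a minimal sketch of the kind of guard this fix describes, assuming standard `Trainer`/`TrainingArguments` attribute names; the actual change lives in `src/transformers/trainer.py` and may differ:

```python
import torch

def maybe_cast_model_for_full_eval(trainer, args):
    """Hypothetical helper: run_hp_search_optuna sets trainer.model to None
    between trials, so only cast the model when it actually exists."""
    if (args.fp16_full_eval or args.bf16_full_eval) and trainer.model is not None:
        dtype = torch.float16 if args.fp16_full_eval else torch.bfloat16
        trainer.model = trainer.model.to(dtype=dtype)
    return trainer.model
```
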
* move `TestAssistedCandidateGeneratorDifferentTokenizers` into a new testing file

* refactor

* NOTHING. add space to rerun github actions tests

* remove it...

* `UniversalSpeculativeDecodingGenerator`

* Use `UniversalSpeculativeDecodingGenerator` when `generation_config.do_sample=True`

* assistant tokenizes only the target's new suffix

* formatting

* fix code

* fix code

* formatting

* add `TestGenerateWithDifferentModels`

* `TestGenerateWithDifferentModels` parameterize on `do_sample`

* `AssistantVocabMapping` & `AssistantVocabMappingCache`

* formatting

* `AssistantToTargetTranslator`: `get_target_input_ids` & `get_target_logits`

* improve `_get_assistant_to_target_input_ids` & formatting

* renaming

* WIP: debugging `min_new_tokens`

* fix get_target_ids

* `UniversalSpeculativeDecodingGenerator`

* assistant tokenizes only the target's new suffix

* formatting

* fix code

* fix code

* formatting

* `TestGenerateWithDifferentModels` parameterize on `do_sample`

* `AssistantVocabMapping` & `AssistantVocabMappingCache`

* formatting

* `AssistantToTargetTranslator`: `get_target_input_ids` & `get_target_logits`

* improve `_get_assistant_to_target_input_ids` & formatting

* renaming

* WIP: debugging `min_new_tokens`

* fix get_target_ids

* fix device issue

* fix get_assistant_input_ids

* add `TestAssistedCandidateGeneratorDifferentTokenizers`

* formatting

* `AssistantVocabTranslatorCache` refactor & tests

* revert changes in `src/transformers/generation/logits_process.py`

* refactor `AssistedCandidateGenerator`

* refactor `AssistedCandidateGeneratorDifferentTokenizers`

* formatting

* refactor `UniversalSpeculativeDecodingGenerator`

* fix negative value for max_new_tokens

* fix generation length target + attention_mask vs. assistant + attent

* fix device

* fix negative max_new_tokens bug

* fix UAG

* minor

* formatting

* `AssistedCandidateGeneratorDifferentTokenizers` `lookbehind`s init

* resolve conflict & formatting

* rerun CI tests

* remove space...

* remove old code

* fix candidate_input_ids device

* minor

* formatting

* Fix prepare + apply (#7)

* fix prepare + apply

* move to cpu

* simplify suppress_tokens

* fix bugs and refactoring

* device move

* handle self.config.vocab_size > len(target_tokenizer.get_vocab())

* no need to normalize in candidate_generator

* address Nadav's comments + minor

* optimize device move + SuppressTokensLogitsProcessor

* AssistantToTargetTranslator, SuppressTokensLogitsProcessor and tokenizers mapping improvements

* padding size

* padding improvement

* fix and simplify get_target_logits

* renaming in get_target_logits

* minor

* add filter_value and suppress_tokens_id

* style + rename

* remove TODO

* restore original SelectTokensLogitsProcessor with modification

* fix style

* fix _update_past_and_masks and optimize code

* remove assistant_vocab_size arg

* fix attention_mask

* call _prepare_attention_mask also if not has_past_key_values

* handling attention mask for first generation

* comment

* restore test

* remove SelectTokensLogitsProcessor

* _update_past_and_masks implementation for USD

* Add unittests for Universal Assisted generation

* fix style

* update tests

* Remove unused import and fix `test_speculation_depth` test

* exclude special and reserved tokens from tokenizer for UAG

* mv `test_universal_assisted_generation.py` to `generation/test_candidate_generator.py`

* Remove unused imports and fix style using `make style` (#9)

* formatting

* Swap gated `meta-llama/llama-3.2` with `allenai/llama` (#10)

* Fix space sign disagreement (#12)

* default values for AssistantToTargetTranslator fields

* fix space sign

* minor

* fix test + style

* Default values for some fields of assistant to target translator (#11)

* default values for AssistantToTargetTranslator fields

* fix

* add support to empty logit_processors

* Update candidate_generator.py (#15)

fix typo

* BUG fix in _prepare_assistant_input_ids (#14)

* fix _prepare_assistant_input_ids

* target_to_assistant_input_ids

* Update src/transformers/generation/candidate_generator.py

Co-authored-by: Nadav Timor <[email protected]>

---------

Co-authored-by: Nadav Timor <[email protected]>

* typo (`target_to_assistant_input_ids`)

* formatting

* merge upstream/main

* Fix minor review comments (#16)

* Fix: `token_ids.to(torch.int64)` (#18)

* tok ids to `torch.int64` (reference: https://huggingface.co/docs/transformers.js/en/api/tokenizers)

* `LongTensor`

* fix dtype

* `assistant_input_ids.to(dtype=torch.long)`

* Remove unused import from test_candidate_generator.py

* Remove unused import from test_candidate_generator.py

* Remove `numpy` import
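
A hypothetical one-liner illustrating the cast referenced in this commit group; the tensor contents below are made up:

```python
import torch

# Token-id tensors are kept as int64 (torch.long), which embedding lookups expect.
assistant_input_ids = torch.tensor([[101, 2023, 2003]])
assistant_input_ids = assistant_input_ids.to(dtype=torch.long)
```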

* resolve pr comments (#19)

* `AssistantToTargetTranslator` docstring

* (per gante's comment) `filter_value` and `suppress_tokens_id` to class constants

* update `AssistantToTargetTranslator` docstring

* (gante's comment) replace `match-case`

* formatting

* Fix Joao's comments (#21)

* remove threading

* fix logits_processor

* fix test device

* fix style (#23)

* Move atm (#24)

* move AssistantToTargetTranslator

* fixup

* fix logit_processor

* add atm_translator test

* refactor test

* remove threading from test

* add require_torch in tests

* move AssistantVocabTranslatorCache + add tests

* ruff fix

---------

Co-authored-by: jmamou <[email protected]>
Co-authored-by: Gaurav <[email protected]>
Co-authored-by: Gaurav Jain <[email protected]>
Co-authored-by: gauravjain14 <[email protected]>
* fix config

* update

---------

Co-authored-by: Marc Sun <[email protected]>
* clean code

* oups

* fix merge

* yups

* fix if

* now you can play

* fix shape issue

* try non blocking

* fix

* updates

* up

* updates

* fix most of the tests

* update

* update

* small updates

* up

* fix the remaining bug?

* update

* rename when you read from the file

* buffer issues

* current status

* cleanup

* properly allocate dumb memory

* update a small bug

* fix colwise rep issue

* fix keep in float 32 that was keeping everything in float 32

* typo

* more fixes with keep_in_fp32_modules as we used to search on it

* fix ROPE dtype for TP

* remove what's breaking the tests

* updates

* update and fixes

* small cleanup after merging

* allocate 2x to be safe

* style, auto

* update

* yup nit

* fix

* remove slow as fuck torch api :(

* work

* fixup

* update

* bringing the fix back

* fix and update

* fixes

Co-authored-by: Marc Sun <[email protected]>

* updates because some suggestions were wrong 👀

* update?

* fuck this bloated function

* typo

* fix the dumb prefix thing once and for all

* fixes here and there

* updates

* remove prints

* fix strict cases

* style

* properly fix keys on load!

* update

* fix base model prefix issue

* style

* update

* fix all?

* remove 1 print

* fix the final tests

* fixup

* last nits

* fix the detach issue which caused a 2x slowdown

* fixup

* small fixes

* ultra nit

* fix

* fix

---------

Co-authored-by: Marc Sun <[email protected]>
* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

* draft

---------

Co-authored-by: ydshieh <[email protected]>
fix permission

Co-authored-by: ydshieh <[email protected]>
* fix permission

* fix permission

---------

Co-authored-by: ydshieh <[email protected]>
fix permission

Co-authored-by: ydshieh <[email protected]>
* Skip collecting duplicated weight

* format
* test

* docstring

* prepare distributed cache data

* fix cat dim

* test mvp

* add test checks

* like this?

* working test and solution

* nit

* nit

* add shape info
* Lazy import libraries in `src/transformers/image_utils.py`

* `make fixup`

Signed-off-by: Harry Mellor <[email protected]>

* Protect imports

Signed-off-by: Harry Mellor <[email protected]>

---------

Signed-off-by: Harry Mellor <[email protected]>
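
An illustrative sketch of the "protect imports" pattern these commits refer to, assuming the usual transformers availability helper; this is not the exact diff:

```python
from transformers.utils import is_vision_available

# Optional dependencies are imported lazily, so importing image_utils does not
# fail in environments where they are missing.
if is_vision_available():
    import PIL.Image
```
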
* cry

* trigger

---------

Co-authored-by: ydshieh <[email protected]>
* Starting to fix GroundingDinoLoss and GroundingDinoHungarianMatcher

* More updates

* More updates

* fixed: GroundingDinoLoss

* fixed: failing tests

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <[email protected]>

* Update tests/models/grounding_dino/test_modeling_grounding_dino.py

Co-authored-by: amyeroberts <[email protected]>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <[email protected]>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <[email protected]>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <[email protected]>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <[email protected]>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <[email protected]>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <[email protected]>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <[email protected]>

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: amyeroberts <[email protected]>

* Addressed comments

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

Co-authored-by: Sangbum Daniel Choi <[email protected]>

* add: cardinality loss and make box loss as copy from

* change: default for reduction loss is sum

* fix: vectorized generate fake box

* fix copies

* Addressed comments

* addressed comments

* addressed one-hot

* Update tests/models/grounding_dino/test_modeling_grounding_dino.py

Co-authored-by: Sangbum Daniel Choi <[email protected]>

* Addressed comments

* fixed test

* Update src/transformers/models/grounding_dino/modeling_grounding_dino.py

* Update tests/models/grounding_dino/test_modeling_grounding_dino.py

Co-authored-by: Pavel Iakubovskii <[email protected]>

* Starting to fix GroundingDinoLoss and GroundingDinoHungarianMatcher

* More updates

* More updates

* fixed: GroundingDinoLoss

* add: cardinality loss and make box loss as copy from

* fix copies

* Revert "Update tests/models/grounding_dino/test_modeling_grounding_dino.py"

This reverts commit aa74c4c57c430e54cc74c414d6269edb65c73e83.

* [run-slow] groundigdino

* remove nestedtensor

* [run-slow] groundig_dino

* [run-slow] grounding_dino

* [run-slow] grounding_dino

* [run-slow] grounding_dino

* check

* check

* add: encoder intermediate outputs to ImageLoss forward

* add: GroundingDinoForObjectDetectionLoss in the loss directory

* make style

* fix the loss function

* remove class_reduction since sum is the default

* remove class_reduction

* Update src/transformers/loss/loss_grounding_dino.py

Co-authored-by: Pavel Iakubovskii <[email protected]>

* simple fix

* Update src/transformers/loss/loss_grounding_dino.py

Co-authored-by: Pavel Iakubovskii <[email protected]>

* minor fix

* Update src/transformers/loss/loss_for_object_detection.py

---------

Co-authored-by: amyeroberts <[email protected]>
Co-authored-by: Sangbum Daniel Choi <[email protected]>
Co-authored-by: Pavel Iakubovskii <[email protected]>
Co-authored-by: sangbumchoi <[email protected]>
Co-authored-by: ydshieh <[email protected]>
* Fix loading model with mismatched sizes

* trigger tests
* refactor image processor slow got ocr

* add working image processor fast

* fix fast image processor, update doc

* use one big loop for processing patches
* fix

* style

* better allocation

* fix

* fix

* style

* revert disk

* exit

* style

* return if nothing to cache

* dtensor guard

* fix regression

* fix regression

* fix

* fix
* Fix _load_state_dict_into_meta_model with device_map=None

* Update src/transformers/modeling_utils.py
* Check if fixes

* Fix zero3 loading

* Quality

* Fix marc nit

* Add fast tests

* Migrate to integrations.deepspeed rather than modeling_utils

* Style
* fix

* repush

---------

Co-authored-by: ydshieh <[email protected]>
transformers/image_processing_utils.py:41: UserWarning: The following named arguments are not valid for `SamImageProcessor.preprocess` and were ignored: 'point_pad_value'
* fix regression

* fix param

* fix load_state_dict

* style

* better fix for module

* fix tests

* quick fix for now

* rm print
chore: fix message descriptions in arguments and comments
* Fix pipeline-peft interaction

* once again you have committed a debug breakpoint

* Remove extra testing line

* Add a test to check adapter loading

* Correct adapter path

* make fixup

* Remove unnecessary check

* Make check a little more stringent
* Fix edge case for continue_final_message

* lstrip() correctly

* Add regression test

* Add a clearer error message when the final message is not present

* Add a clearer error message when the final message is not present

* Fix massive bug!
Cyrilvallez and others added 30 commits March 12, 2025 13:39
* squash everything together
start to simplify inner logic

Update modeling_utils.py

Update modeling_utils.py

Update modeling_utils.py

Update modeling_utils.py

continue refactor

fix

small fixes

add type hints/docstring

Update modeling_utils.py

remove _fast_init

keep improving

Update modeling_utils.py

Update modeling_utils.py

new first tp loading version

style

fix weird in-place op

trigger CIs

Update modeling_utils.py

much clearer renaming of keys

fix

update

Update test_modeling_common.py

trigger CIs

update

update

style

Update modeling_utils.py

Update modeling_utils.py

Update modeling_utils.py

fix

fast download first prototype

remove old function

remove old functions

Remove unused function and move back _get_tp_registry

fix tp plan registry

simplify

CIs

Update hub.py

Update modeling_utils.py

simplify

simplify renaming logic

remove unused check

add sanity check back (a test depends on it)

Update modeling_utils.py

finalize sound renaming logic

style

add forgotten check

Update modeling_utils.py

add key_mapping keyword

style

Update modeling_utils.py

add comment

minor updates

minor change for clarity

fix small prefix issue and simplify

style

trigger CIs

typo fix

Post rebase fix

post rebase cleanup

simplify tp

typo

oupsi

typo

correctly escape

improvements based on Marc's review

finalize Marc's review comments

 squash everything

* improve

* Update modeling_utils.py

* Update modeling_utils.py

* fix

* Update modeling_utils.py

* Update modeling_utils.py

* style

* Update modeling_utils.py

* simplify

* style

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* Update modeling_utils.py

* fix dtype issue

* Update modeling_utils.py

* style

* remove test that does not make sense

* style

* small fixes

* style

* fix

* cleanup after rebase

* style

* typo

* escape

* tp for task specific top modules

* Update modeling_utils.py

* Update modeling_utils.py

* fix allocation

* CIs

* CIs

* CIs

* improve docstring

* CIs

* Update modeling_utils.py

* fix
* Don't accidentally mutate the base_model_tp_plan

* Co-authored-by: Joao Gante <[email protected]>

* Trigger tests

* Marking grad accum test as slow

* Add a flaky decorator

* Add a flaky decorator

* Use cyril's codeblock

* Don't copy() when it's None

* Use cyril's new codeblock

* make fixup
…fast ones (#36266)

* Add fast image processor class to processors supporting them

* fix test kosmos2
…processors (#36186)

* Remove differences between init and preprocess kwargs in fast image processors

* make modifs got_ocr2

* update gemma3
* refactor siglip2 fast image processor, add unused_kwargs in base fast image processor

* nits

* change unused_kwargs default to None

* update siglip2 fast image proc
* fix fused rescale normalize inconsistencies

* fix siglip2 fast image processor

* refactor kwargs validation and fused normalize rescale

* cleanup kwargs handling in preprocess

* update new procs after refactor
* fix

* switch to ellipsis instead

* Add co-author
Co-authored-by: fxmarty-amd <[email protected]>

* Add co-author second try
Co-authored-by: fxmarty-amd <[email protected]>
* fix wandb hp search unable to resume from sweep_id

* format styles

---------

Co-authored-by: Mohamed Mekkouri <[email protected]>
Co-authored-by: Marc Sun <[email protected]>
* update

* small update

* no spqr quant

* testing

* testing

* test nightly

* gptqmodel

* flute

* fix hadamard

* running tests

* new docker

* fix docker

* run tests

* testing new docker

* new docker

* run tests

* new docker

* run tests

* final test

* update

* update

* run tests

* new docker

* launch tests

* test_docker

* running tests

* add comments

* fixing yml

* revert
…e kwargs (#36207)

Change qwen2VL image processors to have init and call accept the same kwargs

Corrects the type annotation to match actual usage. The variable was typed as
Dict[str, Dict[str, Callable]] but is actually used as Dict[str, Callable]
where keys are attention mechanism names and values are the corresponding
attention functions directly. This change makes the type annotation consistent
with how the dictionary is used in the codebase.
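
A hedged sketch of the corrected annotation; the dictionary name and contents below are placeholders for illustration, not the exact source:

```python
from typing import Callable, Dict

# Each attention mechanism name maps directly to its implementation, so the
# value type is Callable rather than a nested Dict[str, Callable].
ATTENTION_FUNCTIONS: Dict[str, Callable] = {
    "eager": lambda *args, **kwargs: None,  # placeholder
    "sdpa": lambda *args, **kwargs: None,   # placeholder
}
```
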
* Update tensor_parallel.py

* CIs
* chore: fix typos in utils module

* chore: fix typos in utils module

* chore: fix typos in utils module

* chore: fix typos in utils module

* chore: fix typos in utils module

* chore: fix typos in utils module
* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* trigger CIs

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py

* better error messages

* Update test_modeling_utils.py

* Update test_modeling_utils.py
* add gguf support to t5encoder

Signed-off-by: Isotr0py <[email protected]>

* fix

Signed-off-by: Isotr0py <[email protected]>

* remove gguf from model_kwargs

Signed-off-by: Isotr0py <[email protected]>

---------

Signed-off-by: Isotr0py <[email protected]>
* make fixup

* make fixup

* Correct skip decorator

* Add TODOs

* add is_flaky() parentheses
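
For reference, how the flaky marker mentioned above is typically applied in the test suite; `is_flaky` is a decorator factory, hence the parentheses this commit adds:

```python
from transformers.testing_utils import is_flaky

@is_flaky()  # retries the test a few times before reporting a failure
def test_sometimes_nondeterministic():
    assert True
```
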
* add support for fast image processors in add-new-model-like

* fix header not found add-fast-image-processor-cli

* Encourage adding fast image processor

* nit

* start improve doc

* update docs

* make requested modifs
* fix typo when  is on

* tiny

* add test and remove 'text_crops'

* lint
* Make the flaky list a little more general

* Trigger tests

* Make the flaky list a little more general
* Cleanup the regex used for doc preprocessing

* Run tests
* don't gc collect if 1 shard is used

* delete state dict anyways
* Set best_model_checkpoint only when ckpt exists.

Rather than setting it explicitly without checking whether the checkpoint directory even exists, the setting logic now lives inside `_save_checkpoint` and only assigns `best_model_checkpoint` when the checkpoint is actually on disk (see the sketch after this commit's list of changes).

* Added best_global_step to TrainerState.

* Added tests for best_model_checkpoint.

* Fixed hard-coded values in test to prevent failures.

* Added helper func and removed hard-coded best_step.

* Added side effect patch generator for _eval.

* Added evaluate side effect func.

* Removed erroneous patching.

* Fixed minor bug.

* Applied Ruff.

* Fixed Ruff problem in make style.

* Used Trainer.set_initial_training_values.
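
An illustrative-only sketch of the behaviour described above, assuming `TrainerState`-style attribute names; this is a simplification, not the real `_save_checkpoint`:

```python
import os

def _save_checkpoint(trainer, output_dir: str) -> None:
    checkpoint_dir = os.path.join(output_dir, f"checkpoint-{trainer.state.global_step}")
    # ... checkpoint files are written to checkpoint_dir here ...
    # Only record best_model_checkpoint when the directory actually exists on disk.
    if trainer.state.best_metric is not None and os.path.isdir(checkpoint_dir):
        trainer.state.best_model_checkpoint = checkpoint_dir
        trainer.state.best_global_step = trainer.state.global_step
```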