PaliGemma #1636


Merged: 35 commits merged into master on May 21, 2024

Conversation

mattdangerw
Copy link
Member

Initial implementation of PaliGemma for KerasNLP.

divyashreepathihalli and others added 30 commits May 21, 2024 16:43
Co-authored-by: divyashreepathihalli <[email protected]>
* update image size arg throughout paligemma

* update tests
During generate, and in Gemma itself, we scale all text embeddings
by the sqrt of the hidden dim.

We should update our PaliGemmaBackbone to do the same.
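The scaling described above can be sketched as follows. This is a minimal NumPy illustration, not the actual KerasNLP code; `scale_embeddings` is a hypothetical helper name:

```python
import numpy as np

def scale_embeddings(token_embeddings, hidden_dim):
    # Gemma multiplies token embeddings by sqrt(hidden_dim) before the
    # first decoder layer; PaliGemma's text embeddings should match.
    scale = np.asarray(hidden_dim, dtype=token_embeddings.dtype) ** 0.5
    return token_embeddings * scale
```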
* Add cli arguments to speed up conversion

* Formatting

* Update checkpoint conversion to image_classifier
* Add preset

* move location

* address review comments
…ength

* Update pali_gemma_causal_lm_preprocesor.py

* Update pali_gemma_causal_lm_preprocesor.py

* Update pali_gemma_causal_lm_preprocesor.py

* code reformat
* replicated gemma tokenizer

* Different fix

* Similar fix for causal lm

* code reformat

---------

Co-authored-by: Varun Singh <[email protected]>
I am not sure if we want to force the users to pass the image
embeddings themselves, it might be more friendly to allow the raw
image input.

Anyway, since this is unused and untested, and we should check with
the LIT team anyway on how to set this up, let's just remove it for
now.
* Added docstrings for paligemma decoder and backbone

* Add causal lm docstring

* add vit docstring

* add dtype arg

* update docstring

* Line wrapping and small nits

* added vit docstrings

* cleaned up with Matt's comments

---------

Co-authored-by: divyashreepathihalli <[email protected]>
Spell out "zero" not "0" for API consistency.

Remove include_rescaling; the code looked very broken (wrong scale,
output unused).

Fix formatting.
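For reference, a correctly scaled rescaling step (mapping uint8 pixels into [-1, 1]) might look like the sketch below. This is a hypothetical illustration, not the removed code:

```python
import numpy as np

def rescale_images(images):
    # Map uint8 pixel values in [0, 255] into [-1, 1], a range many
    # vision backbones expect. Illustrative only; the PR removed
    # include_rescaling because the in-tree version used the wrong
    # scale and never used its output.
    return images.astype("float32") / 127.5 - 1.0
```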
mattdangerw and others added 5 commits May 21, 2024 16:44
* Add a response_mask input

* Improved attention logic to account for response mask (#87)

* Improved attention logic to account for response mask

* Addressed several comments

* remove vit_num_classes arg from pali_gemma_backbone

* fix backbone test

* Try simplifying the masking code

* Updated tests for thoroughness

* Comments and one fix

* update preset version

* deleted test using unused code path

* Added cast to solve bool issues

* update presets path

* code reformat (modified: keras_nlp/src/models/pali_gemma/pali_gemma_decoder_block_test.py)

* remove changes to backbone args

---------

Co-authored-by: divyashreepathihalli <[email protected]>
Co-authored-by: Matt Watson <[email protected]>

---------

Co-authored-by: Varun Singh <[email protected]>
Co-authored-by: divyashreepathihalli <[email protected]>
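The response-mask attention logic above might be sketched roughly as below, using NumPy. All names are illustrative assumptions, not the KerasNLP implementation: prompt (non-response) tokens attend bidirectionally to each other, response tokens attend causally, and masks are cast to bool to avoid mixed-dtype issues.

```python
import numpy as np

def combined_attention_mask(padding_mask, response_mask):
    # padding_mask, response_mask: (batch, seq) arrays of 0/1.
    batch, seq_len = padding_mask.shape
    # Causal mask: each position attends to itself and earlier positions.
    causal = np.tril(np.ones((seq_len, seq_len), dtype=bool))
    # Prefix (non-response) tokens attend to all other prefix tokens,
    # prefix-LM style.
    prefix = ~response_mask.astype(bool)
    bidirectional = prefix[:, :, None] & prefix[:, None, :]
    mask = causal[None, :, :] | bidirectional
    # Cast padding to bool before combining to avoid int/bool mixing.
    return mask & padding_mask.astype(bool)[:, None, :]
```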
Otherwise the symbol would have mismatched code examples
* More consistent defaults for PaliGemma

In general, we do not copy the hyperparameters of a specific
pre-trained model into the init args. Do the same here for
consistency.

Also, use test models that are as small as possible, so our unit
tests stay reasonably fast.

* add basic and saved model tests

---------

Co-authored-by: divyashreepathihalli <[email protected]>
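In that spirit, a unit-test backbone config might use tiny dimensions. The kwargs below are illustrative guesses, not the exact PaliGemmaBackbone signature:

```python
# Hypothetical small test configuration: tiny dims keep unit tests
# fast, instead of copying the pretrained model's real hyperparameters.
TEST_BACKBONE_KWARGS = {
    "vocabulary_size": 256,
    "image_size": 16,
    "num_layers": 2,
    "num_query_heads": 2,
    "num_key_value_heads": 1,
    "hidden_dim": 8,
    "intermediate_dim": 16,
}
```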
@github-actions github-actions bot added the Gemma Gemma model specific issues label May 21, 2024
@mattdangerw mattdangerw added the kokoro:force-run Runs Tests on GPU label May 21, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label May 21, 2024
@mattdangerw mattdangerw merged commit b2ec380 into master May 21, 2024
19 checks passed
@mattdangerw mattdangerw deleted the paligemma branch August 22, 2024 00:14
6 participants