Tokenizer test #21

Merged 1 commit on Feb 20, 2025

Conversation


@lucylq lucylq commented Feb 15, 2025

TODO:
buck changes

Test Plan

Build

cmake .     -DCMAKE_INSTALL_PREFIX=cmake-out -DTOKENIZERS_BUILD_TEST=ON -Bcmake-out
cmake --build cmake-out -j9 --target install

Test

(executorch) [/data/users/lfq/tokenizers/cmake-out (lfq.tokenizer-test)]$ ctest
Test project /data/users/lfq/tokenizers/cmake-out
    Start 1: test_base64
1/5 Test #1: test_base64 ......................   Passed    0.00 sec
    Start 2: test_llama2c_tokenizer
2/5 Test #2: test_llama2c_tokenizer ...........   Passed    0.00 sec
    Start 3: test_pre_tokenizer
3/5 Test #3: test_pre_tokenizer ...............   Passed    0.73 sec
    Start 4: test_sentencepiece
4/5 Test #4: test_sentencepiece ...............   Passed    0.04 sec
    Start 5: test_tiktoken
5/5 Test #5: test_tiktoken ....................   Passed    3.32 sec

100% tests passed, 0 tests failed out of 5

Total Test time (real) =   4.10 sec
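
The build and test steps above could be wrapped in one script. This is a minimal sketch, not part of the PR: the `BUILD_DIR`, `JOBS`, and `DRY_RUN` variables and the `run` and `build_and_test` helpers are hypothetical names introduced here for illustration. It assumes CMake 3.20 or newer, since it uses `ctest --test-dir` instead of changing into `cmake-out`. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: configure, build, and run the tokenizer tests in one step.
set -eu

BUILD_DIR="${BUILD_DIR:-cmake-out}"  # same directory name as in the test plan
JOBS="${JOBS:-9}"                    # matches -j9 above; adjust per machine
DRY_RUN="${DRY_RUN:-1}"              # hypothetical switch: 1 prints commands only

# Print the command; execute it only when DRY_RUN is 0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

build_and_test() {
  run cmake . -DCMAKE_INSTALL_PREFIX="$BUILD_DIR" -DTOKENIZERS_BUILD_TEST=ON -B"$BUILD_DIR"
  run cmake --build "$BUILD_DIR" -j"$JOBS" --target install
  # --test-dir runs ctest against the build tree without a cd;
  # --output-on-failure prints the log of any failing test.
  run ctest --test-dir "$BUILD_DIR" --output-on-failure
}

build_and_test
```

Running it with `DRY_RUN=0` from the repository root would execute the three commands for real; `ctest -R test_tiktoken` could be used afterwards to rerun a single test by name.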

@facebook-github-bot added the CLA Signed label Feb 15, 2025
@lucylq lucylq marked this pull request as draft February 15, 2025 02:15
@lucylq lucylq marked this pull request as ready for review February 19, 2025 17:31
@facebook-github-bot

@lucylq has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

lucylq added a commit to lucylq/tokenizers that referenced this pull request Feb 19, 2025
Summary: Pull Request resolved: pytorch-labs#21

Test Plan:
## OSS 
Build
```
cmake .     -DCMAKE_INSTALL_PREFIX=cmake-out -DTOKENIZERS_BUILD_TEST=ON -Bcmake-out
cmake --build cmake-out -j9 --target install
```

Test
```
(executorch) [/data/users/lfq/tokenizers/cmake-out (lfq.tokenizer-test)]$ ctest
Test project /data/users/lfq/tokenizers/cmake-out
    Start 1: test_base64
1/5 Test #1: test_base64 ......................   Passed    0.00 sec
    Start 2: test_llama2c_tokenizer
2/5 Test #2: test_llama2c_tokenizer ...........   Passed    0.00 sec
    Start 3: test_pre_tokenizer
3/5 Test #3: test_pre_tokenizer ...............   Passed    0.73 sec
    Start 4: test_sentencepiece
4/5 Test #4: test_sentencepiece ...............   Passed    0.04 sec
    Start 5: test_tiktoken
5/5 Test #5: test_tiktoken ....................   Passed    3.32 sec

100% tests passed, 0 tests failed out of 5

Total Test time (real) =   4.10 sec
```

## Internal
```
 buck2 test fbsource//xplat/pytorch/tokenizers/test:
 buck2 test fbcode//pytorch/tokenizers/test:
```

Differential Revision: D69860352

Pulled By: lucylq
@lucylq lucylq force-pushed the lfq.tokenizer-test branch from cba637e to bf530db Compare February 19, 2025 22:10
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D69860352

lucylq added a commit to lucylq/executorch-1 that referenced this pull request Feb 19, 2025
Summary: Pull Request resolved: pytorch-labs/tokenizers#21


Differential Revision: D69860352

Pulled By: lucylq
@lucylq lucylq requested a review from larryliu0820 February 19, 2025 22:15
Summary:
X-link: pytorch/executorch#8586


Reviewed By: larryliu0820

Differential Revision: D69860352

Pulled By: lucylq
@lucylq lucylq force-pushed the lfq.tokenizer-test branch from bf530db to 9b354ae Compare February 19, 2025 22:47
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D69860352

lucylq added a commit to lucylq/executorch-1 that referenced this pull request Feb 19, 2025
Summary:

X-link: pytorch-labs/tokenizers#21


Reviewed By: larryliu0820

Differential Revision: D69860352

Pulled By: lucylq
@facebook-github-bot facebook-github-bot merged commit 72500a0 into pytorch-labs:main Feb 20, 2025
4 of 6 checks passed
Labels
CLA Signed, fb-exported
3 participants