This repository was archived by the owner on May 8, 2024. It is now read-only.

Commit 1af322d

Develop (#1)
* Remove stock env.
* Remove text-intel-torch.yml
* Add intel_env.yml
* Update quantize_inc_gpt2.py
  - Fix broken instructions.
* Update generate_text.py
  - Add xpu compatibility.
* Add .gitignore
* Remove inference-transformers.png
* Remove E2E_stock-transformers.png
* Remove intel flag from finetune_model.py
* Add kaggle to intel_env.yml
* Update README.md
  - Remove stock references
  - Correct style
  - Redistribute information
  - Add Introduction
  - Add Solution Technical Overview
  - Add Solution Technical Details
  - Add Validated Hardware Details
  - Add How it Works
  - Add Get Started
  - Add Supported Runtime Environment
  - Add Summary and Next Steps
  - Add Appendix
* Remove README.md from data directory
* Update SECURITY.md file
* Update intel_env.yml file
  - Add gperftools to dependencies.
* Remove config_finetuned.yml
* Move prompt.csv to config dir
* Add gpt_generate_text.py file
* Add files to patch transformers to use xpu.
* Added logger to gptj_generate_text.py
* Update README.md
  - Correct styles
  - Fix typos
  - Add sections
  - Format commands
* Fix bfloat16 RuntimeError
* Update intel_env.yml dependencies
* Update transformers_xpu.patch
* Updated license year to 2024
* Remove xpu dependency from intel_env.yml
* Create intel_env_xpu.yml
  - Differs with intel_env.yml by including a intel-extension-for-pytorch version capable of using XPU.
* Add instructions to use XPU to README.md
* Make typo and corrections to instructions for README.md
* Add blank space line at EOF
* Remove inconsistent sentence from README.md
1 parent 8314169 commit 1af322d

20 files changed, +727 -416 lines changed

.gitignore

+143
@@ -0,0 +1,143 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# PyInstaller
+# Usually these files are written by a python script from a template
+# before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+cover/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+.pybuilder/
+target/
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# IPython
+profile_default/
+ipython_config.py
+
+# pyenv
+# For a library or package, you might want to ignore these files since the code is
+# intended to run in multiple environments; otherwise, check them in:
+# .python-version
+
+# pipenv
+# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+# However, in case of collaboration, if having platform-specific dependencies or dependencies
+# having no cross-platform support, pipenv may install dependencies that don't work, or not
+# install all needed dependencies.
+#Pipfile.lock
+
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow
+__pypackages__/
+
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+
+# SageMath parsed files
+*.sage.py
+
+# Environments
+.env
+.venv
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# Spyder project settings
+.spyderproject
+.spyproject
+
+# Rope project settings
+.ropeproject
+
+# mkdocs documentation
+/site
+
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Pyre type checker
+.pyre/
+
+# pytype static type analyzer
+.pytype/
+
+# Cython debug symbols
+cython_debug/
+
+data/
+output/
+saved_models
+nc_workspace
+.vscode
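
The bracket pattern `*.py[cod]` near the top of this file covers three extensions with one rule. A minimal sketch using Python's `fnmatch` module, whose shell-style globbing only approximates gitignore matching (it does not model directory-only patterns such as `build/` or anchored ones such as `/site`). The helper `is_ignored` and the file names are hypothetical, chosen for illustration:

```python
from fnmatch import fnmatchcase

# Three patterns copied from the .gitignore above.
PATTERNS = ["*.py[cod]", "*$py.class", "*.so"]


def is_ignored(filename, patterns=PATTERNS):
    """Return True if the bare file name matches any glob pattern.

    Approximation only: real gitignore rules also handle directories,
    negation and path anchoring, which fnmatch does not.
    """
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


# Hypothetical file names, for illustration only.
for name in ["utils.pyc", "utils.pyo", "utils.pyd", "utils.py", "Foo$py.class"]:
    print(name, is_ignored(name))
```

The character class `[cod]` requires exactly one extra character after `.py`, so plain `.py` sources are not matched by that rule.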

LICENSE

+1 -1
@@ -1,4 +1,4 @@
-Copyright (c) 2023, Intel Corporation
+Copyright (c) 2024, Intel Corporation
 
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are met:

README.md

+360 -327
Large diffs are not rendered by default.

SECURITY.md

+4 -2
@@ -1,5 +1,7 @@
 # Security Policy
-Intel is committed to rapidly addressing security vulnerabilities affecting our customers and providing clear guidance on the solution, impact, severity and mitigation.
+
+Intel is committed to rapidly addressing security vulnerabilities affecting our customers and providing clear guidance on the solution, impact, severity and mitigation.
 
 ## Reporting a Vulnerability
-Please report any security vulnerabilities in this project utilizing the guidelines [here](https://www.intel.com/content/www/us/en/security-center/vulnerability-handling-guidelines.html).
+
+Please report any security vulnerabilities in this project [utilizing the guidelines here](https://www.intel.com/content/www/us/en/security-center/vulnerability-handling-guidelines.html).

assets/E2E_stock-transformers.png

-15.4 KB
Binary file not shown.

assets/inference-transformers.png

-21.1 KB
Binary file not shown.

configs/config_finetuned.yml

-14
This file was deleted.

prompt.csv → configs/prompt.csv

File renamed without changes.

data/README.md

-11
This file was deleted.

env/intel/text-intel-torch.yml

-15
This file was deleted.

env/intel_env.yml

+18
@@ -0,0 +1,18 @@
+name: text_generation_intel
+channels:
+  - intel
+  - conda-forge
+dependencies:
+  - python=3.9
+  - intel-extension-for-pytorch=2.0.100
+  - neural-compressor=2.3.1
+  - numpy=1.24.3
+  - pandas==1.5.3
+  - kaggle==1.5.16
+  - pip=23.3.1
+  - pip:
+    - datasets==2.16.0
+    - accelerate==0.25.0
+    - transformers==4.26.0
+    - optimum[onnxruntime]==1.6.4
+    - onnxruntime==1.16.3

env/intel_env_xpu.yml

+18
@@ -0,0 +1,18 @@
+name: text_generation_xpu_intel
+channels:
+  - intel
+  - conda-forge
+dependencies:
+  - python=3.9
+  - intel-extension-for-pytorch=2.0.120
+  - neural-compressor=2.3.1
+  - numpy=1.24.3
+  - pandas==1.5.3
+  - kaggle==1.5.16
+  - pip=23.3.1
+  - pip:
+    - datasets==2.16.0
+    - accelerate==0.25.0
+    - transformers==4.26.0
+    - optimum[onnxruntime]==1.6.4
+    - onnxruntime==1.16.3
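
The commit message says `intel_env_xpu.yml` differs from `intel_env.yml` by its `intel-extension-for-pytorch` version. A minimal sketch that checks this with Python's `difflib`, using the two file bodies from this commit inlined as strings. The YAML indentation is an assumption (the page does not preserve it), and `changed_lines` is a hypothetical helper name:

```python
import difflib

# File bodies as added in this commit; indentation is assumed
# (conventional conda environment layout).
INTEL_ENV = """\
name: text_generation_intel
channels:
  - intel
  - conda-forge
dependencies:
  - python=3.9
  - intel-extension-for-pytorch=2.0.100
  - neural-compressor=2.3.1
  - numpy=1.24.3
  - pandas==1.5.3
  - kaggle==1.5.16
  - pip=23.3.1
  - pip:
    - datasets==2.16.0
    - accelerate==0.25.0
    - transformers==4.26.0
    - optimum[onnxruntime]==1.6.4
    - onnxruntime==1.16.3
"""

XPU_ENV = """\
name: text_generation_xpu_intel
channels:
  - intel
  - conda-forge
dependencies:
  - python=3.9
  - intel-extension-for-pytorch=2.0.120
  - neural-compressor=2.3.1
  - numpy=1.24.3
  - pandas==1.5.3
  - kaggle==1.5.16
  - pip=23.3.1
  - pip:
    - datasets==2.16.0
    - accelerate==0.25.0
    - transformers==4.26.0
    - optimum[onnxruntime]==1.6.4
    - onnxruntime==1.16.3
"""


def changed_lines(old, new):
    """Return only the added/removed lines of a unified diff."""
    diff = difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


for line in changed_lines(INTEL_ENV, XPU_ENV):
    print(line)
```

Besides the extension version (`2.0.100` versus `2.0.120`), the environment `name` also differs between the two files; every other pinned dependency is identical.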

env/stock/text-stock-torch.yml

-13
This file was deleted.

src/apply_xpu_patch.py

+20
@@ -0,0 +1,20 @@
+# !/usr/bin/env python3
+# -*- coding: utf-8 -*-
+
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: BSD-3-Clause
+
+# pylint: disable=C0415,E0401,R0914
+
+import transformers
+import os
+import subprocess
+
+module_path = transformers.__path__[0]
+# get path for training_args.py
+target_file_path = os.path.join(module_path, "training_args.py")
+
+# apply patch to training_args.py file in the transformers package
+subprocess.run(["patch", target_file_path, "transformers_xpu.patch"])
+
+print("patch applied successfully")
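
As committed, the script prints its success message without inspecting the result of `subprocess.run`, so a failed or missing `patch` command would still report success. The following is a sketch of one possible variant, not part of this commit, that reports success only on a zero exit status. The helper names `build_patch_command` and `run_checked` are hypothetical:

```python
import os
import subprocess
import sys


def build_patch_command(module_path, patch_file="transformers_xpu.patch"):
    """Build the same `patch` invocation the committed script uses."""
    target_file_path = os.path.join(module_path, "training_args.py")
    return ["patch", target_file_path, patch_file]


def run_checked(cmd):
    """Run cmd; return (ok, stderr_text), where ok means exit status 0."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    except FileNotFoundError:
        # The executable itself (for example `patch`) is not installed.
        return False, f"command not found: {cmd[0]}"
    return result.returncode == 0, result.stderr.strip()


# Demonstration with the Python interpreter standing in for `patch`,
# so the sketch runs without transformers or the patch file present.
ok, _ = run_checked([sys.executable, "-c", "import sys; sys.exit(3)"])
print("failing command reported ok:", ok)
```

In the real script, the command would come from `build_patch_command(transformers.__path__[0])`, and the success message would be printed only when `run_checked` returns `True`.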

src/finetune_model.py

+8 -15
@@ -1,7 +1,7 @@
 # !/usr/bin/env python3
 # -*- coding: utf-8 -*-
 
-# Copyright (C) 2023 Intel Corporation
+# Copyright (C) 2024 Intel Corporation
 # SPDX-License-Identifier: BSD-3-Clause
 
 # pylint: disable=C0415,E0401,R0914
@@ -25,7 +25,7 @@
 import torch
 
 from utils import data_load
-
+import intel_extension_for_pytorch as ipex
 
 def main(flags):
 
@@ -49,13 +49,12 @@ def main(flags):
     optimizer = torch.optim.AdamW(model.parameters(), lr=flags.lr)
 
     # use IPEX to optimize model and optimizer for training
-    if flags.intel:
-        import intel_extension_for_pytorch as ipex
-        if flags.bfloat16:
-            model, optimizer = ipex.optimize(
+    if flags.bfloat16:
+        model, optimizer = ipex.optimize(
                 model, optimizer=optimizer, dtype=torch.bfloat16)
-        else:
-            model, optimizer = ipex.optimize(
+        model = model.bfloat16() # Prevents "RuntimeError: mat1 and mat2 must have the same dtype"
+    else:
+        model, optimizer = ipex.optimize(
                 model, optimizer=optimizer, dtype=torch.float32)
 
     lr_scheduler = get_scheduler(
@@ -74,7 +73,7 @@ def main(flags):
         num_train_epochs=flags.num_epochs,
         evaluation_strategy="no",
         save_strategy="no",
-        warmup_steps=10,
+        warmup_steps=10
     )
 
     # Train the model with our dataset for Causal Language Modeling
@@ -127,12 +126,6 @@ def main(flags):
         default=5e-5,
         help='learning rate for training. Defaults to 5e-5.')
 
-    parser.add_argument('--intel',
-                        required=False,
-                        help="use intel pytorch extension to optimize model. Defaults to False.",
-                        action="store_true",
-                        default=False)
-
     parser.add_argument('--bfloat16',
                         required=False,
                         default=False,
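
With the `--intel` option removed, the optimization path depends only on `--bfloat16`. A minimal sketch of the resulting command-line behavior, using only `argparse`. The option name for the learning rate (`--lr`, inferred from `flags.lr`), its `type=float`, the `store_true` action on `--bfloat16`, and the helper names are assumptions, since the hunk is cut off before those details:

```python
import argparse


def build_parser():
    """Hypothetical reduced parser covering the two flags seen in the diff."""
    parser = argparse.ArgumentParser(description="fine-tuning flags (sketch)")
    parser.add_argument('--lr',
                        type=float,  # assumed; not visible in the hunk
                        required=False,
                        default=5e-5,
                        help='learning rate for training. Defaults to 5e-5.')
    parser.add_argument('--bfloat16',
                        required=False,
                        default=False,
                        action="store_true")  # assumed; the hunk is truncated here
    return parser


def select_dtype_name(flags):
    """Mirror the if/else in main(): bfloat16 when requested, else float32."""
    return "bfloat16" if flags.bfloat16 else "float32"


parser = build_parser()
print(select_dtype_name(parser.parse_args([])))
print(select_dtype_name(parser.parse_args(['--bfloat16'])))
```

Under these assumptions, passing the removed `--intel` option makes `argparse` exit with an "unrecognized arguments" error, so existing launch commands that still pass it would need updating.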
