Commit 2015c68

Add Starcoder model support + demo (#225)
* Add support for `gpt_bigcode` models
* Create basic code-completion sample application
* Update sidebar
* Remove debug statement
* Disable 1B model (for now)
* Display progress bars
* Reuse config if not specified
* Update supported_models.py
* Update comment
* Add temperature/sample/topk generation params
* Update sidebar
* Add `gpt_bigcode` to supported models list
* Add code playground example
* Update title
* Cleanup
* Ignore `bigcode/starcoderbase-1b` from tests
* Update transformers.js version for demo
1 parent da67f41 commit 2015c68
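
The commit message mentions temperature, sampling, and top-k generation parameters. As a rough illustration of what such parameters typically control, here is a minimal, library-free sketch of temperature-scaled top-k sampling over raw logits. The function names (`softmax`, `sampleNextToken`) and option names (`temperature`, `topK`, `doSample`) are hypothetical and are not taken from the commit or from the Transformers.js API.

```javascript
// Numerically stable softmax over an array of logits.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Pick the next token id from raw logits (hypothetical helper).
// - doSample = false: greedy decoding (argmax).
// - topK > 0: keep only the K highest-scoring tokens before sampling.
// - temperature (> 0): values below 1 sharpen the distribution, values above 1 flatten it.
// `random` is injectable so the behaviour can be made deterministic.
function sampleNextToken(
  logits,
  { temperature = 1.0, topK = 0, doSample = true } = {},
  random = Math.random
) {
  if (!doSample) {
    return logits.indexOf(Math.max(...logits));
  }

  // Pair each logit with its token id and sort from highest to lowest.
  let candidates = logits.map((logit, id) => ({ id, logit }));
  candidates.sort((a, b) => b.logit - a.logit);
  if (topK > 0) {
    candidates = candidates.slice(0, topK);
  }

  // Apply temperature, then convert to probabilities.
  const probs = softmax(candidates.map((c) => c.logit / temperature));

  // Inverse-CDF sampling: walk the cumulative distribution.
  let r = random();
  for (let i = 0; i < probs.length; i++) {
    r -= probs[i];
    if (r < 0) return candidates[i].id;
  }
  // Guard against floating-point rounding.
  return candidates[candidates.length - 1].id;
}
```

With `topK: 1` the sketch always returns the highest-scoring token regardless of temperature, which is one reason a demo might expose both controls side by side.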

19 files changed: +679 -6 lines changed

Diff for: README.md

+2
@@ -111,6 +111,7 @@ Want to jump straight in? Get started with one of our sample applications/templa
 |-------------------|----------------------------------|-------------------------------|
 | Whisper Web | Speech recognition w/ Whisper | [link](https://github.com/xenova/whisper-web) |
 | Doodle Dash | Real-time sketch-recognition game (see [blog](https://huggingface.co/blog/ml-web-games)) | [link](https://github.com/xenova/doodle-dash) |
+| Code Playground | In-browser code completion website | [link](./examples/code-completion/) |
 | React | Multilingual translation website | [link](./examples/react-translator/) |
 | Browser extension | Text classification extension | [link](./examples/extension/) |
 | Electron | Text classification application | [link](./examples/electron/) |
@@ -261,6 +262,7 @@ You can refine your search by selecting the task you're interested in (e.g., [te
 1. **[FLAN-T5](https://huggingface.co/docs/transformers/model_doc/flan-t5)** (from Google AI) released in the repository [google-research/t5x](https://github.com/google-research/t5x/blob/main/docs/models.md#flan-t5-checkpoints) by Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, and Jason Wei
 1. **[GPT Neo](https://huggingface.co/docs/transformers/model_doc/gpt_neo)** (from EleutherAI) released in the repository [EleutherAI/gpt-neo](https://github.com/EleutherAI/gpt-neo) by Sid Black, Stella Biderman, Leo Gao, Phil Wang and Connor Leahy.
 1. **[GPT-2](https://huggingface.co/docs/transformers/model_doc/gpt2)** (from OpenAI) released with the paper [Language Models are Unsupervised Multitask Learners](https://blog.openai.com/better-language-models/) by Alec Radford*, Jeffrey Wu*, Rewon Child, David Luan, Dario Amodei** and Ilya Sutskever**.
+1. **[GPTBigCode](https://huggingface.co/docs/transformers/model_doc/gpt_bigcode)** (from BigCode) released with the paper [SantaCoder: don't reach for the stars!](https://arxiv.org/abs/2301.03988) by Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, Mayank Mishra, Alex Gu, Manan Dey, Logesh Kumar Umapathi, Carolyn Jane Anderson, Yangtian Zi, Joel Lamy Poirier, Hailey Schoelkopf, Sergey Troshin, Dmitry Abulkhanov, Manuel Romero, Michael Lappert, Francesco De Toni, Bernardo García del Río, Qian Liu, Shamik Bose, Urvashi Bhattacharyya, Terry Yue Zhuo, Ian Yu, Paulo Villegas, Marco Zocca, Sourab Mangrulkar, David Lansky, Huu Nguyen, Danish Contractor, Luis Villa, Jia Li, Dzmitry Bahdanau, Yacine Jernite, Sean Hughes, Daniel Fried, Arjun Guha, Harm de Vries, Leandro von Werra.
 1. **[M2M100](https://huggingface.co/docs/transformers/model_doc/m2m_100)** (from Facebook) released with the paper [Beyond English-Centric Multilingual Machine Translation](https://arxiv.org/abs/2010.11125) by Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin.
 1. **[MarianMT](https://huggingface.co/docs/transformers/model_doc/marian)** Machine translation models trained using [OPUS](http://opus.nlpl.eu/) data by Jörg Tiedemann. The [Marian Framework](https://marian-nmt.github.io/) is being developed by the Microsoft Translator Team.
 1. **[MobileBERT](https://huggingface.co/docs/transformers/model_doc/mobilebert)** (from CMU/Google Brain) released with the paper [MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices](https://arxiv.org/abs/2004.02984) by Zhiqing Sun, Hongkun Yu, Xiaodan Song, Renjie Liu, Yiming Yang, and Denny Zhou.

Diff for: docs/snippets/3_examples.snippet

+1
@@ -4,6 +4,7 @@ Want to jump straight in? Get started with one of our sample applications/templa
 |-------------------|----------------------------------|-------------------------------|
 | Whisper Web | Speech recognition w/ Whisper | [link](https://github.com/xenova/whisper-web) |
 | Doodle Dash | Real-time sketch-recognition game (see [blog](https://huggingface.co/blog/ml-web-games)) | [link](https://github.com/xenova/doodle-dash) |
+| Code Playground | In-browser code completion website | [link](./examples/code-completion/) |
 | React | Multilingual translation website | [link](./examples/react-translator/) |
 | Browser extension | Text classification extension | [link](./examples/extension/) |
 | Electron | Text classification application | [link](./examples/electron/) |

Diff for: docs/snippets/6_supported-models.snippet

+1
@@ -11,6 +11,7 @@
 1. **[FLAN-T5](https://huggingface.co/docs/transformers/model_doc/flan-t5)** (from Google AI) released in the repository [google-research/t5x](https://github.com/google-research/t5x/blob/main/docs/models.md#flan-t5-checkpoints) by Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, and Jason Wei
 1. **[GPT Neo](https://huggingface.co/docs/transformers/model_doc/gpt_neo)** (from EleutherAI) released in the repository [EleutherAI/gpt-neo](https://github.com/EleutherAI/gpt-neo) by Sid Black, Stella Biderman, Leo Gao, Phil Wang and Connor Leahy.
 1. **[GPT-2](https://huggingface.co/docs/transformers/model_doc/gpt2)** (from OpenAI) released with the paper [Language Models are Unsupervised Multitask Learners](https://blog.openai.com/better-language-models/) by Alec Radford*, Jeffrey Wu*, Rewon Child, David Luan, Dario Amodei** and Ilya Sutskever**.
+1. **[GPTBigCode](https://huggingface.co/docs/transformers/model_doc/gpt_bigcode)** (from BigCode) released with the paper [SantaCoder: don't reach for the stars!](https://arxiv.org/abs/2301.03988) by Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, Mayank Mishra, Alex Gu, Manan Dey, Logesh Kumar Umapathi, Carolyn Jane Anderson, Yangtian Zi, Joel Lamy Poirier, Hailey Schoelkopf, Sergey Troshin, Dmitry Abulkhanov, Manuel Romero, Michael Lappert, Francesco De Toni, Bernardo García del Río, Qian Liu, Shamik Bose, Urvashi Bhattacharyya, Terry Yue Zhuo, Ian Yu, Paulo Villegas, Marco Zocca, Sourab Mangrulkar, David Lansky, Huu Nguyen, Danish Contractor, Luis Villa, Jia Li, Dzmitry Bahdanau, Yacine Jernite, Sean Hughes, Daniel Fried, Arjun Guha, Harm de Vries, Leandro von Werra.
 1. **[M2M100](https://huggingface.co/docs/transformers/model_doc/m2m_100)** (from Facebook) released with the paper [Beyond English-Centric Multilingual Machine Translation](https://arxiv.org/abs/2010.11125) by Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin.
 1. **[MarianMT](https://huggingface.co/docs/transformers/model_doc/marian)** Machine translation models trained using [OPUS](http://opus.nlpl.eu/) data by Jörg Tiedemann. The [Marian Framework](https://marian-nmt.github.io/) is being developed by the Microsoft Translator Team.
 1. **[MobileBERT](https://huggingface.co/docs/transformers/model_doc/mobilebert)** (from CMU/Google Brain) released with the paper [MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices](https://arxiv.org/abs/2004.02984) by Zhiqing Sun, Hongkun Yu, Xiaodan Song, Renjie Liu, Yiming Yang, and Denny Zhou.

Diff for: examples/code-completion/.eslintrc.cjs

+16
@@ -0,0 +1,16 @@
+module.exports = {
+  env: { browser: true, es2020: true, 'node': true },
+  extends: [
+    'eslint:recommended',
+    'plugin:react/recommended',
+    'plugin:react/jsx-runtime',
+    'plugin:react-hooks/recommended',
+  ],
+  parserOptions: { ecmaVersion: 'latest', sourceType: 'module' },
+  settings: { react: { version: '18.2' } },
+  plugins: ['react-refresh'],
+  rules: {
+    'react-refresh/only-export-components': 'warn',
+    'react/prop-types': 'off',
+  },
+}

Diff for: examples/code-completion/.gitignore

+24
@@ -0,0 +1,24 @@
+# Logs
+logs
+*.log
+npm-debug.log*
+yarn-debug.log*
+yarn-error.log*
+pnpm-debug.log*
+lerna-debug.log*
+
+node_modules
+dist
+dist-ssr
+*.local
+
+# Editor directories and files
+.vscode/*
+!.vscode/extensions.json
+.idea
+.DS_Store
+*.suo
+*.ntvs*
+*.njsproj
+*.sln
+*.sw?

Diff for: examples/code-completion/index.html

+12
@@ -0,0 +1,12 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <title>Transformers.js - Code completion playground</title>
+  </head>
+  <body>
+    <div id="root"></div>
+    <script type="module" src="/src/main.jsx"></script>
+  </body>
+</html>

Diff for: examples/code-completion/package.json

+34
@@ -0,0 +1,34 @@
+{
+  "name": "code-completion",
+  "private": true,
+  "version": "0.0.0",
+  "type": "module",
+  "scripts": {
+    "dev": "vite",
+    "build": "vite build",
+    "lint": "eslint src --ext js,jsx --report-unused-disable-directives --max-warnings 0",
+    "preview": "vite preview"
+  },
+  "dependencies": {
+    "@xenova/transformers": "^2.4.4",
+    "@monaco-editor/react": "^4.5.1",
+    "react": "^18.2.0",
+    "react-dom": "^18.2.0"
+  },
+  "devDependencies": {
+    "@types/react": "^18.2.15",
+    "@types/react-dom": "^18.2.7",
+    "@vitejs/plugin-react": "^4.0.3",
+    "autoprefixer": "^10.4.14",
+    "eslint": "^8.45.0",
+    "eslint-plugin-react": "^7.32.2",
+    "eslint-plugin-react-hooks": "^4.6.0",
+    "eslint-plugin-react-refresh": "^0.4.3",
+    "postcss": "^8.4.27",
+    "tailwindcss": "^3.3.3",
+    "vite": "^4.4.5"
+  },
+  "overrides": {
+    "protobufjs": "^7.2.4"
+  }
+}

Diff for: examples/code-completion/postcss.config.js

+6
@@ -0,0 +1,6 @@
+export default {
+  plugins: {
+    tailwindcss: {},
+    autoprefixer: {},
+  },
+}

Diff for: examples/code-completion/src/App.css

+34
@@ -0,0 +1,34 @@
+.sidebar {
+  background-color: #181818;
+  color: #CCCCCC;
+}
+
+body{
+  background-color: #1F1F1F;
+  color: white;
+}
+
+.progress-container {
+  position: relative;
+  font-size: 16px;
+  color: white;
+  /* background-color: #e9ecef; */
+  border-radius: 8px;
+  text-align: left;
+  overflow: hidden;
+}
+
+.progress-bar {
+  padding: 2px 4px;
+  z-index: 0;
+  top: 0;
+  width: 1%;
+  height: 100%;
+  overflow: hidden;
+  background-color: #007bff;
+  white-space: nowrap;
+}
+
+.progress-text {
+  z-index: 2;
+}
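
The stylesheet above gives `.progress-bar` a default `width: 1%`, which suggests the width is overridden as model files download ("Display progress bars" in the commit message). The component code is not part of this excerpt, so the following is only a sketch of one way such state could be tracked. It assumes progress events shaped like `{ status, file, progress }` with `progress` in the range 0 to 100; that shape and both helper names (`updateProgressItems`, `progressWidth`) are hypothetical.

```javascript
// Fold one progress event into a list of { file, progress } items.
// Returns a new array so it can be used directly as a React state update.
// The event shape ({ status, file, progress }) is an assumption for illustration.
function updateProgressItems(items, event) {
  switch (event.status) {
    case 'initiate':
      // A new file started downloading: add a bar for it.
      return [...items, { file: event.file, progress: 0 }];
    case 'progress':
      // Update the matching bar, leave the others untouched.
      return items.map((item) =>
        item.file === event.file ? { ...item, progress: event.progress } : item
      );
    case 'done':
      // The file finished: remove its bar.
      return items.filter((item) => item.file !== event.file);
    default:
      return items;
  }
}

// Convert a percentage into a CSS width string for `.progress-bar`.
// Clamped to [1, 100] so the bar never drops below the stylesheet's 1% default.
function progressWidth(percent) {
  const clamped = Math.min(100, Math.max(1, Number(percent) || 0));
  return `${Math.round(clamped)}%`;
}
```

In a component, the result of `progressWidth(item.progress)` could be passed as an inline `style={{ width: ... }}` on the element carrying the `progress-bar` class.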
