Commit 822183c

merveenoyan, BenjaminBossan, osanseviero, adrinjalali, stevhliu authored
Skops announcement blog (#441)
* skops blog
* skops initial commit
* Update skops-library.md Co-authored-by: Benjamin Bossan <[email protected]>
* Update skops-library.md Co-authored-by: Benjamin Bossan <[email protected]>
* Update skops-library.md Co-authored-by: Benjamin Bossan <[email protected]>
* Update skops-library.md Co-authored-by: Benjamin Bossan <[email protected]>
* Update skops-library.md Co-authored-by: Benjamin Bossan <[email protected]>
* Update skops-library.md Co-authored-by: Omar Sanseviero <[email protected]>
* Update skops-library.md Co-authored-by: Omar Sanseviero <[email protected]>
* Update skops-library.md Co-authored-by: Benjamin Bossan <[email protected]>
* addressed comments + misc additions
* Update skops-library.md Co-authored-by: Benjamin Bossan <[email protected]>
* Update skops-library.md Co-authored-by: Benjamin Bossan <[email protected]>
* addressed comments
* addressed comments
* Update skops.md Co-authored-by: Omar Sanseviero <[email protected]>
* Update skops.md Co-authored-by: Omar Sanseviero <[email protected]>
* Update skops.md Co-authored-by: Omar Sanseviero <[email protected]>
* Update skops.md Co-authored-by: Omar Sanseviero <[email protected]>
* Update skops.md Co-authored-by: Omar Sanseviero <[email protected]>
* misc improvements
* Update skops.md Co-authored-by: Adrin Jalali <[email protected]>
* Update skops.md Co-authored-by: Adrin Jalali <[email protected]>
* Update skops.md Co-authored-by: Adrin Jalali <[email protected]>
* Update skops.md Co-authored-by: Benjamin Bossan <[email protected]>
* Update skops.md Co-authored-by: Adrin Jalali <[email protected]>
* changed template docs link and reworded
* Update skops.md Co-authored-by: Steven Liu <[email protected]>
* Update skops.md Co-authored-by: Steven Liu <[email protected]>
* Update skops.md Co-authored-by: Steven Liu <[email protected]>
* Update skops.md Co-authored-by: Steven Liu <[email protected]>
* Update skops.md Co-authored-by: Steven Liu <[email protected]>
* Update skops.md Co-authored-by: Steven Liu <[email protected]>
* Update skops.md Co-authored-by: Steven Liu <[email protected]>
* Update skops.md Co-authored-by: Steven Liu <[email protected]>
* Update skops.md Co-authored-by: Steven Liu <[email protected]>
* Update skops.md Co-authored-by: Steven Liu <[email protected]>
* Update skops.md Co-authored-by: Steven Liu <[email protected]>
* Update skops.md Co-authored-by: Steven Liu <[email protected]>
* Update skops.md Co-authored-by: Steven Liu <[email protected]>
* made config content into a list
* updated date

Co-authored-by: Benjamin Bossan <[email protected]>
Co-authored-by: Omar Sanseviero <[email protected]>
Co-authored-by: Adrin Jalali <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
1 parent 863dc28 commit 822183c

4 files changed: +224 -1 lines changed

_blog.yml

Lines changed: 18 additions & 1 deletion
@@ -1088,6 +1088,7 @@
 tags:
 - rl
 
+
 - local: how-to-train-sentence-transformers
 title: "Train and Fine-Tune Sentence Transformers Models"
 author: espejelomar
@@ -1097,14 +1098,16 @@
 - guide
 - nlp
 
+
 - local: deploy-tfserving-kubernetes
 title: "Deploying 🤗 ViT on Kubernetes with TF Serving"
 author: chansung
 thumbnail: /blog/assets/94_tf_serving_kubernetes/thumb.png
-date: August 15, 2022
+date: August 11, 2022
 tags:
 - guide
 - cv
+
 
 - local: tensorflow-philosophy
 title: "Hugging Face's TensorFlow Philosophy"
@@ -1115,3 +1118,17 @@
 - nlp
 - cv
 - guide
+
+
+
+- local: skops
+title: Introducing Skops
+author: merve
+thumbnail: /blog/assets/94_skops/introducing_skops.png
+date: August 12, 2022
+tags:
+- open-source-collab
+- scikit-learn
+- announcement
+- guide
+

assets/94_skops/introducing_skops.png

949 KB

assets/94_skops/skops_widget.png

78.1 KB

skops.md

Lines changed: 206 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,206 @@
1+
---
2+
title: "Introducing Skops"
3+
thumbnail: /blog/assets/94_skops/introducing_skops.png
4+
---
5+
6+
<h1>
7+
Introducing Skops
8+
</h1>
9+
10+
<div class="blog-metadata">
11+
<small>Published August 12, 2022.</small>
12+
<a target="_blank" class="btn no-underline text-sm mb-5 font-sans" href="https://github.com/huggingface/blog/blob/main/skops.md">
13+
Update on GitHub
14+
</a>
15+
</div>
16+
17+
<div class="author-card">
18+
<a href="/merve">
19+
<img class="avatar avatar-user" src="https://aeiljuispo.cloudimg.io/v7/https://s3.amazonaws.com/moonup/production/uploads/1631694399207-6141a88b3a0ec78603c9e784.png?w=200&h=200&f=face" title="Gravatar">
20+
<div class="bfc">
21+
<code>merve</code>
22+
<span class="fullname">Merve Noyan</span>
23+
</div>
24+
</a>
25+
<a href="/adrin">
26+
<img class="avatar avatar-user" src="https://huggingface.co/avatars/f40271d9ff5ac148aab4c512f8ae6402.svg" title="Gravatar">
27+
<div class="bfc">
28+
<code>adrin</code>
29+
<span class="fullname">Adrin Jalali</span>
30+
</div>
31+
</a>
32+
<a href="/BenjaminB">
33+
<img class="avatar avatar-user" src="https://aeiljuispo.cloudimg.io/v7/https://s3.amazonaws.com/moonup/production/uploads/1656685953025-62bf03d1e80cec527083cd66.jpeg?w=200&h=200&f=face" title="Gravatar">
34+
<div class="bfc">
35+
<code>BenjaminB</code>
36+
<span class="fullname">Benjamin Bossan</span>
37+
</div>
38+
</a>
39+
</div>
40+
41+
## Introducing Skops
42+
43+
At Hugging Face, we are working on tackling various problems in open-source machine learning, including hosting models securely and openly and enabling reproducibility, explainability, and collaboration. We are thrilled to introduce you to our new library: Skops! With Skops, you can host your scikit-learn models on the Hugging Face Hub, create model cards for model documentation, and collaborate with others.
44+
45+
Let's go through an end-to-end example: we first train a model, then see step by step how to use Skops to take a scikit-learn model to production.
46+
47+
```python
# let's import the libraries first
import sklearn
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Load the data and split
X, y = load_breast_cancer(as_frame=True, return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Train the model
model = DecisionTreeClassifier().fit(X_train, y_train)
```
63+
64+
You can use any model filename and serialization method, such as `pickle` or `joblib`. At the moment, our backend uses `joblib` to load the model. `hub_utils.init` creates a local folder at the given path that contains the model file and a configuration file describing the environment the model was trained in. The data and the task passed to `init` help the Hugging Face Hub enable the inference widget on the model page, as well as the discoverability features used to find the model.
65+
66+
```python
from skops import hub_utils
import pickle

# let's save the model
model_path = "example.pkl"
local_repo = "my-awesome-model"
with open(model_path, mode="wb") as f:
    pickle.dump(model, file=f)

# we will now initialize a local repository
hub_utils.init(
    model=model_path,
    requirements=[f"scikit-learn={sklearn.__version__}"],
    dst=local_repo,
    task="tabular-classification",
    data=X_test,
)
```
85+
86+
The repository now contains the serialized model and the configuration file.
87+
The configuration contains the following:
88+
- features of the model,
89+
- the requirements of the model,
90+
- an example input taken from `X_test` that we've passed,
91+
- name of the model file,
92+
- name of the task to be solved here.
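
To make the list above concrete, here is a minimal sketch of inspecting such a configuration file with the standard library. It assumes the file is JSON and named `config.json`. The key names in the stand-in dictionary (`columns`, `environment`, `example_input`, `model_file`, `task`) and its values are hypothetical placeholders for the five items listed above, not the exact schema `skops` writes, so open the generated file in your own repository to see the real layout.

```python
import json
import tempfile
from pathlib import Path


def load_config(repo_dir):
    """Read the JSON configuration stored in a local repository folder."""
    with open(Path(repo_dir) / "config.json", encoding="utf-8") as f:
        return json.load(f)


# Hypothetical stand-in for a generated configuration (key names are assumptions).
stand_in_config = {
    "columns": ["mean radius", "mean texture"],
    "environment": ["scikit-learn=1.1.2"],
    "example_input": {"mean radius": [17.99], "mean texture": [10.38]},
    "model_file": "example.pkl",
    "task": "tabular-classification",
}

with tempfile.TemporaryDirectory() as repo_dir:
    # Write the stand-in so the sketch runs without a real repository.
    (Path(repo_dir) / "config.json").write_text(
        json.dumps(stand_in_config, indent=2), encoding="utf-8"
    )
    config = load_config(repo_dir)

# Print each entry of the configuration on its own line.
for key, value in config.items():
    print(f"{key}: {value}")
```

With a real repository, calling `load_config(local_repo)` on the folder created by `hub_utils.init` should work the same way, as long as the file name assumption holds.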
93+
94+
We will now create the model card. The card should match the expected Hugging Face Hub format: a markdown part and a metadata section, which is a `yaml` section at the top. The keys to the metadata section are defined [here](https://huggingface.co/docs/hub/models-cards#model-card-metadata) and are used for the discoverability of the models.
95+
The content of the model card is determined by a template that has:
96+
- a `yaml` section on top for metadata (e.g. model license, library name, and more),
97+
- a markdown section with free text and sections to be filled (e.g. a simple description of the model).
98+
The following sections are extracted by `skops` to fill in the model card:
99+
- Hyperparameters of the model,
100+
- Interactive diagram of the model,
101+
- For metadata, library name, task identifier (e.g. tabular-classification), and information required by the inference widget are filled.
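
The expected layout, a `yaml` metadata block followed by markdown, can be sketched with plain strings. This is only an illustration of the file format, not how `skops` renders its template. The helper `split_card` and the example values are hypothetical, while `license`, `library_name`, and `tags` are metadata keys used on the Hub.

```python
CARD_TEXT = """---
license: mit
library_name: sklearn
tags:
- tabular-classification
---

# Model description

This is a DecisionTreeClassifier model trained on breast cancer dataset.
"""


def split_card(text):
    """Split a model card into its metadata block and its markdown body."""
    if not text.startswith("---\n"):
        return "", text  # no metadata section present
    # The metadata ends at the next line that contains only '---'.
    metadata, separator, body = text[4:].partition("\n---\n")
    if not separator:
        return "", text  # opening marker without a closing one
    return metadata, body.lstrip("\n")


metadata, body = split_card(CARD_TEXT)
print(metadata)
print(body)
```

A real parser would hand the metadata block to a YAML library. Splitting on the `---` markers is enough here to show where the metadata stops and the free text begins.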
102+
103+
We will walk you through how to programmatically pass information to fill the model card. See the [documentation](https://skops.readthedocs.io/en/latest/model_card.html) on the default template provided by `skops` and its sections to learn what the template expects, and see the [template itself](https://github.com/skops-dev/skops/blob/main/skops/card/default_template.md) to learn what it looks like.
104+
105+
You can create the model card by instantiating the `Card` class from `skops`. During model serialization, the task name and library name are written to the configuration file. This information is also needed in the card's metadata, so you can use the `metadata_from_config` method to extract the metadata from the configuration file and pass it to the card when you create it. You can add information and metadata using `add`.
106+
107+
```python
from pathlib import Path

from skops import card

# create the card, reading the metadata from the config in the local repository
model_card = card.Card(model, metadata=card.metadata_from_config(Path(local_repo)))

limitations = "This model is not ready to be used in production."
model_description = "This is a DecisionTreeClassifier model trained on breast cancer dataset."
model_card_authors = "skops_user"
get_started_code = "import pickle \nwith open(dtc_pkl_filename, 'rb') as file: \n clf = pickle.load(file)"
citation_bibtex = "bibtex\n@inproceedings{...,year={2020}}"

# we can add the information using add
model_card.add(
    citation_bibtex=citation_bibtex,
    get_started_code=get_started_code,
    model_card_authors=model_card_authors,
    limitations=limitations,
    model_description=model_description,
)

# we can set the metadata part directly
model_card.metadata.license = "mit"
```
131+
132+
We will now evaluate the model and add a description of the evaluation method with `add`. Metrics are added with `add_metrics` and are rendered as a table in the model card.
133+
134+
```python
from sklearn.metrics import (
    ConfusionMatrixDisplay,
    confusion_matrix,
    accuracy_score,
    f1_score,
)

# let's make a prediction and evaluate the model
y_pred = model.predict(X_test)
# we can pass metrics using add_metrics and pass details with add
# (the description names the micro average so that it matches the f1_score call below)
model_card.add(eval_method="The model is evaluated using test split, on accuracy and F1 score with micro average.")
model_card.add_metrics(accuracy=accuracy_score(y_test, y_pred))
model_card.add_metrics(**{"f1 score": f1_score(y_test, y_pred, average="micro")})
```
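
The averaging mode changes what the F1 number means. As a sketch of the arithmetic, written in pure Python rather than with `sklearn.metrics`, the example below computes micro- and macro-averaged F1 on a small set of made-up labels. The helper names and the labels are hypothetical. For single-label classification, the micro-averaged F1 equals accuracy, which is worth keeping in mind when reporting both.

```python
def per_class_counts(y_true, y_pred, label):
    """Return (true positives, false positives, false negatives) for one label."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    return tp, fp, fn


def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def f1_macro(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = [f1_from_counts(*per_class_counts(y_true, y_pred, c)) for c in labels]
    return sum(scores) / len(scores)


def f1_micro(y_true, y_pred):
    """F1 computed from counts pooled over all classes."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = [per_class_counts(y_true, y_pred, c) for c in labels]
    tp, fp, fn = (sum(column) for column in zip(*counts))
    return f1_from_counts(tp, fp, fn)


# Made-up labels for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("accuracy:", accuracy)
print("micro F1:", f1_micro(y_true, y_pred))
print("macro F1:", f1_macro(y_true, y_pred))
```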
144+
145+
We can also add any plot of our choice to the card using `add_plot` like below.
146+
147+
```python
import matplotlib.pyplot as plt
from pathlib import Path

# we will create a confusion matrix
cm = confusion_matrix(y_test, y_pred, labels=model.classes_)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=model.classes_)
disp.plot()

# save the plot
plt.savefig(Path(local_repo) / "confusion_matrix.png")

# the plot will be written to the model card under the name confusion_matrix
# we pass the path of the plot itself
model_card.add_plot(confusion_matrix="confusion_matrix.png")
```
162+
163+
Let's save the model card in the local repository. The file name here should be `README.md` since it is what Hugging Face Hub expects.
164+
```python
model_card.save(Path(local_repo) / "README.md")
```
167+
168+
We can now push the repository to the Hugging Face Hub. For this, we will use `push` from `hub_utils`. The Hugging Face Hub requires a token for authentication, so you need to log in with your token first: use `notebook_login` if you're working in a notebook, or `huggingface-cli login` if you're working from the CLI.
169+
170+
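```python
import os

# the access token is read from an environment variable here; the name HF_TOKEN
# is an assumption of this example. If it is unset, token is None and push is
# expected to rely on the credentials stored when you logged in
token = os.environ.get("HF_TOKEN")

# if the repository doesn't exist remotely on the Hugging Face Hub, it will be created when we set create_remote to True
repo_id = "skops-user/my-awesome-model"
hub_utils.push(
    repo_id=repo_id,
    source=local_repo,
    token=token,
    commit_message="pushing files to the repo from the example!",
    create_remote=True,
)
```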
181+
182+
Once we push the model to the Hub, anyone can use it unless the repository is private. You can download the models using `download`. Apart from the model file, the repository contains the model configuration and the environment requirements.
183+
184+
```python
download_repo = "downloaded-model"
hub_utils.download(repo_id=repo_id, dst=download_repo)
```
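
After downloading, the model file can be deserialized with the same library that wrote it. The sketch below assumes the file was saved with `pickle` under the name `example.pkl`, as earlier in this post. To stay runnable without a Hub repository, it first writes a plain dictionary, a hypothetical stand-in for the model, into a temporary folder. Note that unpickling can execute arbitrary code, so only load files from sources you trust.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickled_model(repo_dir, filename="example.pkl"):
    """Load a pickled object from a local repository folder."""
    with open(Path(repo_dir) / filename, mode="rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as download_repo:
    # Hypothetical stand-in for the downloaded model file.
    stand_in_model = {"kind": "DecisionTreeClassifier", "max_depth": None}
    with open(Path(download_repo) / "example.pkl", mode="wb") as f:
        pickle.dump(stand_in_model, file=f)

    loaded = load_pickled_model(download_repo)

print(loaded)
```

With a real download, the loaded object would be the trained estimator, provided a compatible scikit-learn version is installed in the environment.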
188+
189+
The inference widget is enabled on the model page, so predictions can be made directly in the repository.
190+
191+
![Hosted Inference Widget](blog/assets/94_skops/skops_widget.png)
192+
193+
If the requirements of your project have changed, you can use `update_env` to update the environment.
194+
195+
```python
hub_utils.update_env(path=local_repo, requirements=["scikit-learn"])
```
198+
199+
You can see the example repository pushed with the above code [here](https://huggingface.co/scikit-learn/skops-blog-example).
200+
We have prepared two examples to show how to save your models and use model card utilities. You can find them in the resources section below.
201+
202+
203+
## Resources
204+
- [Model card tutorial](https://skops.readthedocs.io/en/latest/auto_examples/plot_model_card.html)
205+
- [hub_utils tutorial](https://skops.readthedocs.io/en/latest/auto_examples/plot_hf_hub.html)
206+
- [skops documentation](https://skops.readthedocs.io/en/latest/modules/classes.html)
