Commit 30d5bcf

Update to Model Builders page for DAALL-7867 (#6)
* Update github pages for DAALL-7867
  Update to fix daal4py conversion of XGBoost model error for non-numerical data used for training.
* GH Pages for DAALL-7867
1 parent c3799f8 commit 30d5bcf

33 files changed

+862
-626
lines changed

daal4py/.buildinfo

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
11
# Sphinx build info version 1
22
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
3-
config: b3f9dd5f9dffbf1afeba6d7177901aa9
3+
config: e1589d1aed29dd35620ebcc85aa22a53
44
tags: 645f666f9bcd5a90fca523b33c5a78b7

daal4py/.nojekyll

Whitespace-only changes.

daal4py/_modules/index.html

Lines changed: 14 additions & 6 deletions
@@ -1,15 +1,20 @@
11
<!DOCTYPE html>
2-
<html class="writer-html5" lang="en" >
2+
<html class="writer-html5" lang="en">
33
<head>
44
<meta charset="utf-8" />
55
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
6-
<title>Overview: module code &mdash; daal4py 2021 documentation</title>
7-
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
8-
<link rel="stylesheet" href="../_static/css/theme.css" type="text/css" />
6+
<title>Overview: module code &mdash; daal4py 2021.1 documentation</title>
7+
<link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
8+
<link rel="stylesheet" type="text/css" href="../_static/css/theme.css" />
9+
<link rel="stylesheet" type="text/css" href="../_static/style.css" />
10+
11+
912
<!--[if lt IE 9]>
1013
<script src="../_static/js/html5shiv.min.js"></script>
1114
<![endif]-->
1215

16+
<script src="../_static/jquery.js"></script>
17+
<script src="../_static/_sphinx_javascript_frameworks_compat.js"></script>
1318
<script data-url_root="../" id="documentation_options" src="../_static/documentation_options.js"></script>
1419
<script src="../_static/doctools.js"></script>
1520
<script src="../_static/sphinx_highlight.js"></script>
@@ -44,6 +49,9 @@
4449
<a href="../contents.html" class="icon icon-home">
4550
daal4py
4651
</a>
52+
<div class="version">
53+
2021
54+
</div>
4755
<div role="search">
4856
<form id="rtd-search-form" class="wy-form" action="../search.html" method="get">
4957
<input type="text" name="q" placeholder="Search docs" aria-label="Search docs" />
@@ -52,7 +60,7 @@
5260
</form>
5361
</div>
5462
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
55-
<p class="caption" role="heading"><span class="caption-text">Contents</span></p>
63+
<p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
5664
<ul>
5765
<li class="toctree-l1"><a class="reference internal" href="../index.html">About daal4py</a></li>
5866
<li class="toctree-l1"><a class="reference internal" href="../data.html">Data</a></li>
@@ -98,7 +106,7 @@ <h1>All modules for which code is available</h1>
98106
<hr/>
99107

100108
<div role="contentinfo">
101-
<p>&#169; Copyright 2023, Intel.</p>
109+
<p>&#169; Copyright Intel.</p>
102110
</div>
103111

104112
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a

daal4py/_sources/algorithms.rst.txt

Lines changed: 78 additions & 79 deletions
Large diffs are not rendered by default.

daal4py/_sources/contents.rst.txt

Lines changed: 24 additions & 4 deletions
@@ -1,10 +1,30 @@
1-
.. _contents::
1+
.. ******************************************************************************
2+
.. * Copyright 2020 Intel Corporation
3+
.. *
4+
.. * Licensed under the Apache License, Version 2.0 (the "License");
5+
.. * you may not use this file except in compliance with the License.
6+
.. * You may obtain a copy of the License at
7+
.. *
8+
.. * http://www.apache.org/licenses/LICENSE-2.0
9+
.. *
10+
.. * Unless required by applicable law or agreed to in writing, software
11+
.. * distributed under the License is distributed on an "AS IS" BASIS,
12+
.. * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
.. * See the License for the specific language governing permissions and
14+
.. * limitations under the License.
15+
.. *******************************************************************************/
216
3-
.. include:: note.rst
17+
.. _contents:
18+
19+
########
20+
Contents
21+
########
422

23+
.. include:: note.rst
24+
525
.. toctree::
626
:maxdepth: 2
7-
:caption: Contents
27+
:caption: Contents:
828

929
About daal4py <index>
1030
Data <data>
@@ -13,4 +33,4 @@
1333
Distributed Mode <scaling>
1434
Streaming Mode <streaming>
1535
Examples <examples>
16-
Scikit-Learn API <sklearn>
36+
Scikit-Learn API <sklearn>

daal4py/_sources/data.rst.txt

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ Input Data
2121
##########
2222

2323
.. include:: note.rst
24-
24+
2525
All array arguments to compute functions and to algorithm constructors can be
2626
provided in different formats. daal4py will automatically do its best to work on
2727
the provided data with minimal overhead, most notably without copying the data.

daal4py/_sources/examples.rst.txt

Lines changed: 58 additions & 58 deletions
Large diffs are not rendered by default.

daal4py/_sources/index.rst.txt

Lines changed: 6 additions & 6 deletions
@@ -21,7 +21,7 @@ Fast, Scalable and Easy Machine Learning With DAAL4PY
2121
#####################################################
2222

2323
.. include:: note.rst
24-
24+
2525
Daal4py makes your Machine Learning algorithms in Python lightning fast and easy to use. It provides
2626
highly configurable Machine Learning kernels, some of which support streaming input data and/or can
2727
be easily and efficiently scaled out to clusters of workstations. Internally it uses Intel(R)
@@ -102,7 +102,7 @@ Last but not least, daal4py allows :ref:`getting input data from streams <stream
102102

103103
oneAPI and GPU support in daal4py
104104
---------------------------------
105-
daal4py oneAPI and GPU support is deprecated. Use `scikit-learn-intelex <https://intel.github.io/scikit-learn-intelex/oneapi-gpu.html#>`_
105+
daal4py oneAPI and GPU support is deprecated. Use `scikit-learn-intelex <https://intel.github.io/scikit-learn-intelex/latest/oneapi-gpu.html#>`_
106106
instead.
107107

108108

@@ -146,11 +146,11 @@ daal4py is available at the `Python Package Index <https://pypi.org/project/daal
146146
on Anaconda Cloud in `Conda Forge channel <https://anaconda.org/conda-forge/daal4py>`_
147147
and in `Intel channel <https://anaconda.org/intel/daal4py>`_.
148148
Sources and build instructions are available in
149-
`daal4py repository <https://github.com/intel/scikit-learn-intelex/tree/master/daal4py>`_.
149+
`daal4py repository <https://github.com/intel/scikit-learn-intelex/tree/main/daal4py>`_.
150150

151151
The daal4py package is available via same distribution channels and platforms as scikit-learn-intelex.
152152
See
153-
`scikit-learn-intelex requirements <https://intel.github.io/scikit-learn-intelex/system-requirements.html>` _
153+
`scikit-learn-intelex requirements <https://intel.github.io/scikit-learn-intelex/latest/system-requirements.html>` _
154154

155155
- Install from PyPI::
156156

@@ -194,11 +194,11 @@ Scikit-Learn API and patching
194194
-----------------------------
195195
.. tip::
196196
We recommend using
197-
the 'scikit-learn-intelex package patching <https://intel.github.io/scikit-learn-intelex/what-is-patching.html>' _ for the scikit-learn patching. daal4py exposes some oneDAL solvers using a scikit-learn compatible API.
197+
the 'scikit-learn-intelex package patching <https://intel.github.io/scikit-learn-intelex/latest/what-is-patching.html>' _ for the scikit-learn patching.
198+
daal4py exposes some oneDAL solvers using a scikit-learn compatible API.
198199

199200
daal4py can furthermore monkey-patch the ``sklearn`` package to use the DAAL
200201
solvers as drop-in replacement without any code change.
201202

202203
Please refer to the section on :ref:`scikit-learn API and patching <sklearn>`
203204
for more details.
204-

daal4py/_sources/model-builders.rst.txt

Lines changed: 61 additions & 16 deletions
@@ -24,20 +24,26 @@ Model Builders for the Gradient Boosting Frameworks
2424

2525
Introduction
2626
------------------
27-
Gradient boosting on decision trees is one of the most accurate and efficient
28-
machine learning algorithms for classification and regression.
29-
The most popular implementations of it are:
27+
Gradient boosting on decision trees is one of the most accurate and efficient
28+
machine learning algorithms for classification and regression.
29+
The most popular implementations of it are:
3030

3131
* XGBoost*
3232
* LightGBM*
3333
* CatBoost*
3434

3535
daal4py Model Builders deliver the accelerated
36-
models inference of those frameworks. The inference is performed by the oneDAL GBT implementation tuned
37-
for the best performance on the Intel(R) Architecture.
36+
models inference of those frameworks. The inference is performed by the oneDAL GBT implementation tuned
37+
for the best performance on the Intel(R) Architecture.
38+
39+
.. note::
40+
41+
Currently, the experimental categorical data support in XGBoost* and LightGBM* is not supported.
42+
For the model conversion to work with daal4py, convert non-numeric data to numeric data
43+
before training and converting the model.
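As a rough illustration of the note above, the sketch below maps a non-numeric column to integer codes before training. The helper ``encode_categories`` and the sample data are hypothetical and are not part of daal4py. In practice an encoder from pandas or scikit-learn may be a better fit. Whatever encoding is chosen, the same mapping would need to be applied to the data passed to ``predict()``.

```python
def encode_categories(values):
    """Map each distinct value to an integer code, in order of first appearance.

    Hypothetical helper for illustration only; not a daal4py API.
    """
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
        encoded.append(codes[value])
    return encoded, codes


# Made-up non-numeric column that would otherwise break model conversion.
colors = ["red", "green", "red", "blue"]
encoded_colors, mapping = encode_categories(colors)

print(encoded_colors)  # [0, 1, 0, 2]
print(mapping)         # {'red': 0, 'green': 1, 'blue': 2}
```

The integer codes would replace the original column in the training data, so that the trained model only ever sees numeric features.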
3844

3945
Conversion
40-
---------
46+
----------
4147
The first step is to convert an already trained model. The
4248
API usage for different frameworks is the same:
4349

@@ -61,37 +67,76 @@ CatBoost::
6167
Classification and Regression Inference
6268
----------------------------------------
6369

64-
The API is the same for classification and regression inference.
65-
Based on the original model passed to the ``convert_model``, ``d4p_prediction`` is either the classification or regression output.
66-
70+
The API is the same for classification and regression inference.
71+
Based on the original model passed to the ``convert_model()``, ``d4p_prediction`` is either the classification or regression output.
72+
6773
::
68-
74+
6975
d4p_prediction = d4p_model.predict(test_data)
7076

7177
Here, the ``predict()`` method of ``d4p_model`` is being used to make predictions on the ``test_data`` dataset.
72-
The ``d4p_prediction`` variable stores the predictions made by the ``predict()`` method.
78+
The ``d4p_prediction`` variable stores the predictions made by the ``predict()`` method.
79+
80+
SHAP Value Calculation for Regression Models
81+
------------------------------------------------------------
82+
83+
SHAP contribution and interaction value calculation are natively supported by models created with daal4py Model Builders.
84+
For these models, the ``predict()`` method takes additional keyword arguments:
85+
86+
::
87+
88+
d4p_model.predict(test_data, pred_contribs=True) # for SHAP contributions
89+
d4p_model.predict(test_data, pred_interactions=True) # for SHAP interactions
90+
91+
The returned prediction has the shape:
92+
93+
* ``(n_rows, n_features + 1)`` for SHAP contributions
94+
* ``(n_rows, n_features + 1, n_features + 1)`` for SHAP interactions
95+
Here, ``n_rows`` is the number of rows (i.e., observations) in
96+
``test_data``, and ``n_features`` is the number of features in the dataset.
97+
98+
The prediction result for SHAP contributions includes a feature attribution value for each feature and a bias term for each observation.
99+
100+
The prediction result for SHAP interactions comprises ``(n_features + 1) x (n_features + 1)`` values for all possible
101+
feature combinations, along with their corresponding bias terms.
102+
103+
.. note:: The shapes of SHAP contributions and interactions are consistent with the XGBoost results.
104+
In contrast, the `SHAP Python package <https://shap.readthedocs.io/en/latest/>`_ drops bias terms, resulting
105+
in SHAP contributions (SHAP interactions) with one fewer column (one fewer column and row) per observation.
73106

74107
Scikit-learn-style Estimators
75-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
108+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
76109

77110
You can also use the scikit-learn-style classes ``GBTDAALClassifier`` and ``GBTDAALRegressor`` to convert and infer your models. For example:
78111

79-
::
112+
::
80113

81114
from daal4py.sklearn.ensemble import GBTDAALRegressor
82115
reg = xgb.XGBRegressor()
83116
reg.fit(X, y)
84117
d4p_predt = GBTDAALRegressor.convert_model(reg).predict(X)
85118

119+
120+
Limitations
121+
------------------
122+
Model Builders support only base inference with prediction and probabilities prediction. The functionality is to be extended.
123+
Therefore, there are the following limitations:
124+
- The categorical features are not supported for conversion and prediction.
125+
- The multioutput models are not supported for conversion and prediction.
126+
- SHAP values can be calculated for regression models only.
127+
128+
86129
Examples
87130
---------------------------------
88131
Model Builders models conversion
89132

90-
- `XGBoost model conversion <https://github.com/intel/scikit-learn-intelex/blob/master/examples/daal4py/model_builders_xgboost.py>`_
91-
- `LightGBM model conversion <https://github.com/intel/scikit-learn-intelex/blob/master/examples/daal4py/model_builders_lightgbm.py>`_
92-
- `CatBoost model conversion <https://github.com/intel/scikit-learn-intelex/blob/master/examples/daal4py/model_builders_catboost.py>`_
133+
- `XGBoost model conversion <https://github.com/intel/scikit-learn-intelex/blob/main/examples/daal4py/model_builders_xgboost.py>`_
134+
- `SHAP value prediction from an XGBoost model <https://github.com/intel/scikit-learn-intelex/blob/main/examples/daal4py/model_builders_xgboost_shap.py>`_
135+
- `LightGBM model conversion <https://github.com/intel/scikit-learn-intelex/blob/main/examples/daal4py/model_builders_lightgbm.py>`_
136+
- `CatBoost model conversion <https://github.com/intel/scikit-learn-intelex/blob/main/examples/daal4py/model_builders_catboost.py>`_
93137

94138
Articles and Blog Posts
95139
---------------------------------
96140

97141
- `Improving the Performance of XGBoost and LightGBM Inference <https://medium.com/intel-analytics-software/improving-the-performance-of-xgboost-and-lightgbm-inference-3b542c03447e>`_
142+

daal4py/_sources/note.rst.txt

Lines changed: 0 additions & 4 deletions
This file was deleted.
