
Commit b2214d8

Add bias to documentation of linear classifiers (#3538)
* Add bias to the score formula for binary classifiers
* Fix multi-class linear trainers
* Fix #3471
* Change back unnecessary changes
1 parent 2e36bd9 commit b2214d8

File tree

3 files changed (+7, -4 lines changed)


src/Microsoft.ML.Mkl.Components/SymSgdClassificationTrainer.cs

Lines changed: 2 additions & 1 deletion
@@ -52,7 +52,8 @@ namespace Microsoft.ML.Trainers
 /// ### Training Algorithm Details
 /// The symbolic stochastic gradient descent is an algorithm that makes its predictions by finding a separating hyperplane.
 /// For instance, with feature values $f0, f1,..., f_{D-1}$, the prediction is given by determining what side of the hyperplane the point falls into.
-/// That is the same as the sign of the feature's weighted sum, i.e. $\sum_{i = 0}^{D-1} (w_i * f_i)$, where $w_0, w_1,..., w_{D-1}$ are the weights computed by the algorithm.
+/// That is the same as the sign of the feature's weighted sum, i.e. $\sum_{i = 0}^{D-1} (w_i * f_i) + b$, where $w_0, w_1,..., w_{D-1}$
+/// are the weights computed by the algorithm, and $b$ is the bias computed by the algorithm.
 ///
 /// While most symbolic stochastic gradient descent algorithms are inherently sequential - at each step, the processing of the current example depends on the parameters learned from previous examples.
 /// This algorithm trains local models in separate threads and probabilistic model cobminer that allows the local models to be combined
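The score formula this commit documents, $\operatorname{sign}\left(\sum_{i = 0}^{D-1} (w_i * f_i) + b\right)$, can be sketched in a few lines. This Python snippet is purely illustrative (the weights, features, and bias values are invented, and it is not ML.NET code):

```python
# Illustrative sketch of the documented score formula: the prediction is
# the sign of sum_i(w_i * f_i) + b. Weights, features, and bias below are
# made-up values, not output of any actual trainer.

def predict(weights, features, bias):
    # Weighted sum of the feature values plus the bias term.
    score = sum(w * f for w, f in zip(weights, features)) + bias
    # The predicted class is the side of the hyperplane the point falls on.
    return 1 if score >= 0 else -1

print(predict([0.5, -1.0], [2.0, 1.0], 0.25))   # score = 0.25 -> 1
print(predict([0.5, -1.0], [2.0, 1.0], -0.5))   # score = -0.5 -> -1
```

Note how the bias term can flip the prediction for a point whose weighted sum alone is zero, which is exactly why the doc fix adds $b$ to the formula.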

src/Microsoft.ML.StandardTrainers/Standard/Online/AveragedPerceptron.cs

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -42,8 +42,9 @@ namespace Microsoft.ML.Trainers
4242
///
4343
/// ### Training Algorithm Details
4444
/// The perceptron is a classification algorithm that makes its predictions by finding a separating hyperplane.
45-
/// For instance, with feature values $f0, f1,..., f_{D-1}$, the prediction is given by determining what side of the hyperplane the point falls into.
46-
/// That is the same as the sign of the feautures' weighted sum, i.e. $\sum_{i = 0}^{D-1} (w_i * f_i)$, where $w_0, w_1,..., w_{D-1}$ are the weights computed by the algorithm.
45+
/// For instance, with feature values $f_0, f_1,..., f_{D-1}$, the prediction is given by determining what side of the hyperplane the point falls into.
46+
/// That is the same as the sign of the feautures' weighted sum, i.e. $\sum_{i = 0}^{D-1} (w_i * f_i) + b$, where $w_0, w_1,..., w_{D-1}$
47+
/// are the weights computed by the algorithm, and $b$ is the bias computed by the algorithm.
4748
///
4849
/// The perceptron is an online algorithm, which means it processes the instances in the training set one at a time.
4950
/// It starts with a set of initial weights (zero, random, or initialized from a previous learner). Then, for each example in the training set, the weighted sum of the features is computed.
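The online update loop described in that doc comment can be sketched as follows. This is a hedged, minimal Python sketch of the classic perceptron rule (zero-initialized weights, made-up toy data, a hypothetical `lr` parameter), not the ML.NET implementation:

```python
# Minimal sketch of the online perceptron update: start from zero weights
# and bias, and nudge the hyperplane whenever sign(w.f + b) mispredicts.
# The data and learning rate are invented for illustration.

def perceptron_train(examples, n_features, epochs=10, lr=1.0):
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for features, label in examples:  # label is +1 or -1
            score = sum(wi * fi for wi, fi in zip(w, features)) + b
            if label * score <= 0:  # misclassified: move toward the example
                w = [wi + lr * label * fi for wi, fi in zip(w, features)]
                b += lr * label
    return w, b

# Toy linearly separable data: positive iff the first feature dominates.
data = [([2.0, 1.0], 1), ([1.0, 2.0], -1), ([3.0, 0.5], 1), ([0.5, 3.0], -1)]
w, b = perceptron_train(data, 2)
print(all(y * (sum(wi * fi for wi, fi in zip(w, f)) + b) > 0
          for f, y in data))  # True: every example ends up on the right side
```

Because the data above is linearly separable, the loop converges to a hyperplane that classifies every training example correctly, matching the convergence behavior the doc comment alludes to.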

src/Microsoft.ML.StandardTrainers/Standard/Online/LinearSvm.cs

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -48,7 +48,8 @@ namespace Microsoft.ML.Trainers
4848
/// Linear [SVM](https://en.wikipedia.org/wiki/Support-vector_machine#Linear_SVM) implements
4949
/// an algorithm that finds a hyperplane in the feature space for binary classification, by solving an [SVM problem](https://en.wikipedia.org/wiki/Support-vector_machine#Computing_the_SVM_classifier).
5050
/// For instance, with feature values $f_0, f_1,..., f_{D-1}$, the prediction is given by determining what side of the hyperplane the point falls into.
51-
/// That is the same as the sign of the feautures' weighted sum, i.e. $\sum_{i = 0}^{D-1} \left(w_i * f_i \right) + b$, where $w_0, w_1,..., w_{D-1}$ and $b$ are the weights and bias computed by the algorithm.
51+
/// That is the same as the sign of the feautures' weighted sum, i.e. $\sum_{i = 0}^{D-1} \left(w_i * f_i \right) + b$, where $w_0, w_1,..., w_{D-1}$
52+
/// are the weights computed by the algorithm, and $b$ is the bias computed by the algorithm.
5253
///
5354
/// This algorithm implemented is the PEGASOS method, which alternates between stochastic gradient descent steps and projection steps,
5455
/// introduced in [this paper](http://ttic.uchicago.edu/~shai/papers/ShalevSiSr07.pdf) by Shalev-Shwartz, Singer and Srebro.
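The "stochastic gradient descent steps and projection steps" that the doc comment attributes to PEGASOS can be sketched as a single update. This is a rough Python sketch under stated assumptions (the regularization constant `lam`, step counter `t`, and all inputs are invented for illustration; the bias is left out of the regularizer, and this is not ML.NET's implementation):

```python
# Rough sketch of one PEGASOS-style step: a stochastic subgradient step on
# the regularized hinge loss lam/2 * ||w||^2 + max(0, 1 - y * (w.f + b)),
# followed by projection of w onto the ball of radius 1/sqrt(lam).
# All hyperparameters and inputs here are illustrative.
import math

def pegasos_step(w, b, features, label, lam, t):
    eta = 1.0 / (lam * t)  # decaying learning rate at step t
    score = sum(wi * fi for wi, fi in zip(w, features)) + b
    if label * score < 1:  # margin violated: hinge subgradient is active
        w = [(1 - eta * lam) * wi + eta * label * fi
             for wi, fi in zip(w, features)]
        b += eta * label
    else:  # margin satisfied: only the regularizer shrinks the weights
        w = [(1 - eta * lam) * wi for wi in w]
    # Projection step: clip w back into the ball ||w|| <= 1/sqrt(lam).
    norm = math.sqrt(sum(wi * wi for wi in w))
    radius = 1.0 / math.sqrt(lam)
    if norm > radius:
        w = [wi * radius / norm for wi in w]
    return w, b

w, b = pegasos_step([0.0, 0.0], 0.0, [1.0, 0.0], 1, lam=0.1, t=1)
print(w, b)  # the gradient step overshoots, so w is clipped to the ball
```

The projection is what distinguishes PEGASOS from plain SGD on the hinge loss: it keeps the iterates inside the ball where the optimum is known to lie, which underpins the method's convergence guarantee.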
