
Commit 61a7240

Merge pull request #198 from abhro/docstring-patch-1
Update docstrings and markdown docs
2 parents: 3ac79af + a4ba165

16 files changed: +297 −285 lines

docs/src/document_strings.md

Lines changed: 22 additions & 6 deletions
@@ -29,23 +29,39 @@ Your document string must include the following components, in order:

implementation. Generally, defer details on the role of
hyperparameters to the "Hyperparameters" section (see below).

- Instructions on *how to import the model type* from MLJ (because a user can
  already inspect the doc-string in the Model Registry, without having loaded
  the code-providing package).

- Instructions on *how to instantiate* with default hyperparameters or with keywords.

- A *Training data* section: explains how to bind a model to data in a machine
  with all possible signatures (e.g., `machine(model, X, y)` but also
  `machine(model, X, y, w)` if, say, weights are supported); the role and
  scitype requirements for each data argument should be itemized.

- Instructions on *how to fit* the machine (in the same section).

- A *Hyperparameters* section (unless there aren't any): an itemized list of the parameters, with defaults given.

- An *Operations* section: each implemented operation (`predict`,
  `predict_mode`, `transform`, `inverse_transform`, etc.) is itemized and
  explained. This should include operations with no data arguments, such as
  `training_losses` and `feature_importances`.

- A *Fitted parameters* section: to explain what is returned by `fitted_params(mach)`
  (the same as `MLJModelInterface.fitted_params(model, fitresult)` - see later)
  with the fields of that named tuple itemized.

- A *Report* section (if `report` is non-empty): to explain what, if anything,
  is included in `report(mach)` (the same as the `report` return value of
  `MLJModelInterface.fit`) with the fields itemized.

- An optional but highly recommended *Examples* section, which includes MLJ
  examples, but which could also include others if the model type also
  implements a second "local" interface, i.e., one defined in the same module. (Note
  that each module referring to a type can declare separate doc-strings, which
  appear concatenated in doc-string queries.)

- A closing *"See also"* sentence which includes a `@ref` link to the raw model type (if you are wrapping one).
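A skeleton satisfying the ordering above might look as follows. Everything here is illustrative: `SomeRegressor`, `SomePackage`, the hyperparameter `lambda`, and the field `coefs` are hypothetical names, not part of MLJ:

```julia
"""
    SomeRegressor

A hypothetical regressor, shown only to illustrate the required docstring layout.

From MLJ, the type can be imported using

    SomeRegressor = @load SomeRegressor pkg=SomePackage

Construct an instance with default hyperparameters using `model = SomeRegressor()`.

# Training data

In MLJ or MLJBase, bind an instance `model` to data with

    mach = machine(model, X, y)

where

- `X`: any table of input features whose columns have `Continuous` scitype
- `y`: any `AbstractVector` with `Continuous` element scitype

Train the machine with `fit!(mach)`.

# Hyperparameters

- `lambda=0.0`: regularization strength; must be non-negative

# Operations

- `predict(mach, Xnew)`: return predictions of the target for new features `Xnew`

# Fitted parameters

The fields of `fitted_params(mach)` are:

- `coefs`: the fitted coefficient vector

See also [`SomeOtherRegressor`](@ref).
"""
```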

docs/src/implementing_a_data_front_end.md

Lines changed: 20 additions & 16 deletions
@@ -84,30 +84,34 @@ Suppose a supervised model type `SomeSupervised` supports sample

weights, leading to two different `fit` signatures, and that it has a
single operation `predict`:

```julia
fit(model::SomeSupervised, verbosity, X, y)
fit(model::SomeSupervised, verbosity, X, y, w)

predict(model::SomeSupervised, fitresult, Xnew)
```

Without a data front-end implemented, suppose `X` is expected to be a
table and `y` a vector, but suppose the core algorithm always converts
`X` to a matrix with features as rows (each record corresponds to
a column in the table). Then a new data front-end might look like
this:

```julia
const MMI = MLJModelInterface

# for fit:
MMI.reformat(::SomeSupervised, X, y) = (MMI.matrix(X)', y)
MMI.reformat(::SomeSupervised, X, y, w) = (MMI.matrix(X)', y, w)
MMI.selectrows(::SomeSupervised, I, Xmatrix, y) =
    (view(Xmatrix, :, I), view(y, I))
MMI.selectrows(::SomeSupervised, I, Xmatrix, y, w) =
    (view(Xmatrix, :, I), view(y, I), view(w, I))

# for predict:
MMI.reformat(::SomeSupervised, X) = (MMI.matrix(X)',)
MMI.selectrows(::SomeSupervised, I, Xmatrix) = (view(Xmatrix, :, I),)
```

With these additions, `fit` and `predict` are refactored, so that `X`
and `Xnew` represent matrices with features as rows.
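The transpose-plus-view pattern used by this front-end can be exercised in plain Julia, independently of MLJModelInterface; the 3×2 data below is made up for illustration:

```julia
# A "table" of 3 observations with 2 features, already in matrix form:
X = [1.0 10.0; 2.0 20.0; 3.0 30.0]

# `reformat` would transpose once, so features become rows:
Xmatrix = X'            # 2×3 adjoint; no data is copied

# `selectrows` then selects observations as *columns*, again without copying:
I = [1, 3]
Xsub = view(Xmatrix, :, I)

@assert size(Xsub) == (2, 2)
@assert Xsub[:, 2] == [3.0, 30.0]   # the third original observation
```

The point of the design choice is that the expensive conversion happens once, while resampling (cross-validation, say) only ever takes cheap views.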

docs/src/quick_start_guide.md

Lines changed: 7 additions & 8 deletions
@@ -99,8 +99,7 @@ Further to the last point, `a::Float64 = 0.5::(_ > 0)` indicates that

the field `a` is a `Float64`, takes `0.5` as its default value, and
expects its value to be positive.

Please see [this issue](https://github.com/JuliaAI/MLJBase.jl/issues/68)
for a known issue and workaround relating to the use of `@mlj_model`
with negative defaults.
@@ -201,7 +200,7 @@ For a classifier, the steps are fairly similar to a regressor, with these differences:

1. `y` will be a categorical vector and you will typically want to use
   the integer encoding of `y` instead of `CategoricalValue`s; use
   `MLJModelInterface.int` for this.
2. You will need to pass the full pool of target labels (not just
   those observed in the training data) and additionally, in the
   `Deterministic` case, the encoding, to make these available to
   `predict`. A simple way to do this is to pass `y[1]` in the

@@ -210,19 +209,19 @@ For a classifier, the steps are fairly similar to a regressor, with these differences:

   method for recovering categorical elements from their integer
   representations (e.g., `d(2)` is the categorical element with `2`
   as encoding).
3. In the case of a *probabilistic* classifier you should pass all
   probabilities simultaneously to the [`UnivariateFinite`](@ref) constructor
   to get an abstract `UnivariateFinite` vector (of type
   `UnivariateFiniteArray`), rather than use comprehension or
   broadcasting to get a vanilla vector. This is for performance
   reasons.

If implementing a classifier, you should probably consult the more
detailed instructions at [The predict method](@ref).

**Examples**:

- GLM's [BinaryClassifier](https://github.com/JuliaAI/MLJModels.jl/blob/3687491b132be8493b6f7a322aedf66008caaab1/src/GLM.jl#L119-L131) (`Probabilistic`)

- LIBSVM's [SVC](https://github.com/JuliaAI/MLJModels.jl/blob/master/src/LIBSVM.jl) (`Deterministic`)
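Point 3 might be realized as in the following sketch, where `SomeClassifier` and `SomePackage.predict_probabilities` are hypothetical names, and `classes` is assumed to be the full categorical pool of target labels stashed in `fitresult` at training time:

```julia
function MMI.predict(model::SomeClassifier, fitresult, Xnew)
    classes, core_fitresult = fitresult
    # matrix of probabilities: one row per observation, one column per class
    probs = SomePackage.predict_probabilities(core_fitresult, Xnew)
    # a single constructor call returns a performant `UnivariateFiniteArray`,
    # instead of broadcasting `UnivariateFinite` over the rows of `probs`:
    return MMI.UnivariateFinite(classes, probs)
end
```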

@@ -349,8 +348,8 @@ MLJModelInterface.metadata_model(YourModel1,

```julia
    output_scitype = MLJModelInterface.Table(MLJModelInterface.Continuous), # for an unsupervised model, what output?
    supports_weights = false, # does the model support sample weights?
    descr = "A short description of your model",
    load_path = "YourPackage.SubModuleContainingModelStructDefinition.YourModel1"
)
```

*Important.* Do not omit the `load_path` specification. Without a

docs/src/summary_of_methods.md

Lines changed: 3 additions & 3 deletions
@@ -43,11 +43,11 @@ Optional, if `SomeSupervisedModel <: Probabilistic`:

```julia
MMI.predict_mode(model::SomeSupervisedModel, fitresult, Xnew) =
    mode.(predict(model, fitresult, Xnew))
MMI.predict_mean(model::SomeSupervisedModel, fitresult, Xnew) =
    mean.(predict(model, fitresult, Xnew))
MMI.predict_median(model::SomeSupervisedModel, fitresult, Xnew) =
    median.(predict(model, fitresult, Xnew))
```
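These fallbacks assume `predict` returns a vector of distribution objects over which `mode`, `mean`, and `median` can broadcast. With Distributions.jl in scope (an assumption; any package defining these generics would do), the idea is:

```julia
using Distributions

# stand-in probabilistic predictions, as `predict` might return them:
yhat = [Normal(1.0, 2.0), Normal(3.0, 1.0)]

mean.(yhat)     # point predictions: [1.0, 3.0]
median.(yhat)   # also [1.0, 3.0], the median of a Normal being its mean
```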

Required, if the model is to be registered (findable by general users):

docs/src/supervised_models.md

Lines changed: 9 additions & 9 deletions
@@ -19,15 +19,15 @@ The following sections were written with `Supervised` models in mind, but also contain

material relevant to general models:

- [Summary of methods](@ref)
- [The form of data for fitting and predicting](@ref)
- [The fit method](@ref)
- [The fitted_params method](@ref)
- [The predict method](@ref)
- [The predict_joint method](@ref)
- [Training losses](@ref)
- [Feature importances](@ref)
- [Trait declarations](@ref)
- [Iterative models and the update! method](@ref)
- [Implementing a data front end](@ref)
- [Supervised models with a transform method](@ref)
- [Models that learn a probability distribution](@ref)

docs/src/the_fit_method.md

Lines changed: 13 additions & 13 deletions
@@ -7,21 +7,21 @@

```julia
MMI.fit(model::SomeSupervisedModel, verbosity, X, y) -> fitresult, cache, report
```

1. `fitresult` is the fitresult in the sense above (which becomes an
   argument for `predict`, discussed below).

2. `report` is a (possibly empty) `NamedTuple`, for example,
   `report=(deviance=..., dof_residual=..., stderror=..., vcov=...)`.
   Any training-related statistics, such as internal estimates of the
   generalization error, and feature rankings, should be returned in
   the `report` tuple. How, or if, these are generated should be
   controlled by hyperparameters (the fields of `model`). Fitted
   parameters, such as the coefficients of a linear model, do not go
   in the report, as they will be extractable from `fitresult` (and
   accessible to MLJ through the `fitted_params` method described below).

3. The value of `cache` can be `nothing`, unless one is also defining
   an `update` method (see below). The Julia type of `cache` is not
   presently restricted.
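As a concrete, if simplistic, illustration of the three return values, a least-squares `fit` might be sketched as below; `SomeSupervisedModel` and the choice of `report` contents are hypothetical:

```julia
using LinearAlgebra: norm

function MMI.fit(model::SomeSupervisedModel, verbosity, X, y)
    Xmatrix = MMI.matrix(X)               # table -> matrix
    coefs = Xmatrix \ y                   # illustrative least-squares fit
    fitresult = coefs                     # later consumed by `predict`
    cache = nothing                       # fine, since no `update` is defined
    report = (training_rmse = norm(Xmatrix * coefs - y) / sqrt(length(y)),)
    return fitresult, cache, report
end
```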

!!! note

docs/src/the_predict_method.md

Lines changed: 14 additions & 15 deletions
@@ -6,8 +6,7 @@ A compulsory `predict` method has the form

```julia
MMI.predict(model::SomeSupervisedModel, fitresult, Xnew) -> yhat
```

Here `Xnew` will have the same form as the `X` passed to `fit`.

Note that while `Xnew` generally consists of multiple observations
(e.g., has multiple rows in the case of a table) it is assumed, in view of
@@ -44,26 +43,26 @@ may look something like this:

```julia
function MMI.fit(model::SomeSupervisedModel, verbosity, X, y)
    yint = MMI.int(y)
    a_target_element = y[1] # a CategoricalValue/String
    decode = MMI.decoder(a_target_element) # can be called on integers

    core_fitresult = SomePackage.fit(X, yint, verbosity=verbosity)

    fitresult = (decode, core_fitresult)
    cache = nothing
    report = nothing
    return fitresult, cache, report
end
```

while a corresponding deterministic `predict` operation might look like this:

```julia
function MMI.predict(model::SomeSupervisedModel, fitresult, Xnew)
    decode, core_fitresult = fitresult
    yhat = SomePackage.predict(core_fitresult, Xnew)
    return decode.(yhat)
end
```

@@ -155,8 +154,8 @@ yhat = MLJModelInterface.UnivariateFinite([:FALSE, :TRUE], probs, augment=true,

The constructor has a lot of options, including passing a dictionary
instead of vectors. See [`CategoricalDistributions.UnivariateFinite`](@ref)
for details.

See
[LinearBinaryClassifier](https://github.com/JuliaAI/MLJModels.jl/blob/master/src/GLM.jl)

docs/src/trait_declarations.md

Lines changed: 1 addition & 2 deletions
@@ -27,8 +27,7 @@ MMI.input_scitype(::Type{<:DecisionTreeClassifier}) = Table(Continuous)

If, instead, columns were allowed to have either: (i) a mixture of `Continuous` and `Missing`
values, or (ii) `Count` (i.e., integer) values, then the declaration would be

```julia
MMI.input_scitype(::Type{<:DecisionTreeClassifier}) = Table(Union{Continuous, Missing}, Count)
```
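A declaration like this is what MLJ consults when checking user-supplied data. With the ScientificTypes package in scope (an assumption, as is the `DecisionTreeClassifier` type itself), the check can be reproduced by hand:

```julia
using ScientificTypes  # provides `scitype`

# a table with one Continuous and one Count column:
X = (x1 = [1.5, 2.5, 3.5], x2 = [1, 2, 3])

# admissible under the broader declaration above:
scitype(X) <: MMI.input_scitype(DecisionTreeClassifier)
```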

docs/src/type_declarations.md

Lines changed: 21 additions & 21 deletions
@@ -8,32 +8,32 @@ import MLJModelInterface

```julia
const MMI = MLJModelInterface

mutable struct RidgeRegressor <: MMI.Deterministic
    lambda::Float64
end
```

Models (which are mutable) should not be given internal constructors.
It is recommended that they be given an external lazy keyword constructor
of the same name. This constructor defines default values for every field,
and optionally corrects invalid field values by calling a `clean!`
method (whose fallback returns an empty message string):

```julia
function MMI.clean!(model::RidgeRegressor)
    warning = ""
    if model.lambda < 0
        warning *= "Need lambda ≥ 0. Resetting lambda=0. "
        model.lambda = 0
    end
    return warning
end

# keyword constructor
function RidgeRegressor(; lambda=0.0)
    model = RidgeRegressor(lambda)
    message = MMI.clean!(model)
    isempty(message) || @warn message
    return model
end
```

@@ -96,8 +96,8 @@ following example:

```julia
@mlj_model mutable struct YourModel <: MMI.Deterministic
    a::Float64 = 0.5::(_ > 0)
    b::String = "svd"::(_ in ("svd","qr"))
end
```

@@ -115,22 +115,22 @@ expects its value to be positive.

You cannot use the `@mlj_model` macro if your model struct has type
parameters.

#### Known issue with `@mlj_model`

Defaults with negative values can trip up the `@mlj_model` macro (see [this
issue](https://github.com/JuliaAI/MLJBase.jl/issues/68)). So,
for example, this does not work:

```julia
@mlj_model mutable struct Bar
    a::Int = -1::(_ > -2)
end
```

But this does:

```julia
@mlj_model mutable struct Bar
    a::Int = (-)(1)::(_ > -2)
end
```
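Assuming the `RidgeRegressor` definitions from the first snippet of this section are in scope, an invalid keyword value is then corrected, with a warning, at construction time:

```julia
model = RidgeRegressor(lambda=-1.0)
# Warning: Need lambda ≥ 0. Resetting lambda=0.

model.lambda == 0.0   # true: the invalid value was reset by `clean!`
```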

docs/src/unsupervised_models.md

Lines changed: 2 additions & 2 deletions
@@ -31,9 +31,9 @@ similar fashion. The main differences are:

  is the same as `transform`, as in
  `MLJModelInterface.inverse_transform(model, fitresult, Xout)`, which:
  - must make sense for any `Xout` for which `scitype(Xout) <:
    output_scitype(SomeSupervisedModel)` (see below); and
  - must return an object `Xin` satisfying `scitype(Xin) <:
    input_scitype(SomeSupervisedModel)`.

For sample implementations, see MLJ's [built-in
transformers](https://github.com/JuliaAI/MLJModels.jl/blob/dev/src/builtins/Transformers.jl)
