@barucden barucden commented May 4, 2024

Fixes #37

This adds support for regression models. However, the regression models that liblinear produces appear to be inaccurate.

For example, the following scikit-learn code:

from sklearn.svm import LinearSVR
import numpy as np

X = np.random.rand(10000, 1)
y = (2 * X)[:, 0]
m = LinearSVR(loss='squared_epsilon_insensitive', dual=False, verbose=1, fit_intercept=False)
m.fit(X, y)

print(m.coef_)

prints

iter  1 act 1.332e+04 pre 1.332e+04 delta 2.000e+00 f 1.332e+04 |g| 1.332e+04 CG   1
[LibLinear][1.99969976]

(that is, the fitted coefficient is very close to the true value of 2)

In contrast, the following Julia code

using LIBLINEAR

X = rand(1, 10000)
y = vec(2 .* X)

m = linear_train(y, X, solver_type=LIBLINEAR.L2R_L2LOSS_SVR, verbose=true)

println(m.w)

prints

init f 3.334e+11 |g| 4.989e+07
iter  1 f 1.462e+11 |g| 1.004e+03 CG   2 step_size 1.00e+00
[7503.24992010474]

with the current PR, so the linear coefficient is completely wrong. As far as I can tell from my investigation, this is what liblinear itself returns. scikit-learn seems to use a different solver than liblinear, but I am not sure that is the only issue.
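As a rough sanity check on what a correct solver should return, the one-feature problem has a closed-form solution when the epsilon-insensitive margin is zero. The sketch below assumes the objective 0.5*w^2 + C*sum((w*x_i - y_i)^2) with C = 1, epsilon = 0, and no bias term (which I believe matches the scikit-learn call above, but may not match the defaults used by linear_train). The helper name `l2_svr_reference` is hypothetical and only serves this illustration.

```python
import random


def l2_svr_reference(xs, ys, C=1.0):
    """Closed-form minimizer of 0.5*w**2 + C*sum((w*x - y)**2) for one feature.

    Assumes epsilon = 0 and no bias term, so the squared epsilon-insensitive
    loss reduces to a plain squared loss.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Setting the derivative w + 2*C*sum((w*x - y)*x) to zero gives:
    return 2.0 * C * sxy / (1.0 + 2.0 * C * sxx)


random.seed(0)  # arbitrary seed, only for reproducibility
xs = [random.random() for _ in range(10000)]
ys = [2.0 * x for x in xs]

w_ref = l2_svr_reference(xs, ys)
print(w_ref)  # slightly below 2 because of the regularization term
```

Under these assumptions the reference value is just under 2, in line with the scikit-learn result, so a coefficient in the thousands cannot be explained by regularization alone.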

Also, linear_predict is not type-stable, because its output type depends on solver_type. For one-class SVM, the output is a pair of Vector{String} and Vector{Float64}. For regression models, it is Vector{Float64} and Vector{Float64} (I made it return the same vector twice). For other models, it is Vector{typeof(labels)} and Vector{Float64}.

Development

Successfully merging this pull request may close these issues.

prediction crashes with illegal index