Commit 6fdce8f

quickstart wda + start unbalanced
1 parent 64693f9 commit 6fdce8f

2 files changed

+144
-6
lines changed

docs/source/quickstart.rst

Lines changed: 144 additions & 4 deletions
@@ -278,7 +278,7 @@ choose the quadratic regularization.
278278
Group Lasso regularization
279279
""""""""""""""""""""""""""
280280

281-
Another regularization that has been used in recent years is the group lasso
281+
Another regularization that has been used in recent years [5]_ is the group lasso
282282
regularization
283283

284284
.. math::
@@ -333,7 +333,7 @@ Another solver is proposed to solve the problem
333333
s.t. \gamma 1 = a; \gamma^T 1= b; \gamma\geq 0
334334
335335
where :math:`\Omega_e` is the entropic regularization. In this case we use a
336-
generalized conditional gradient [7]_ implemented in :any:`ot.opim.gcg` that does not linearize the entropic term and
336+
generalized conditional gradient [7]_ implemented in :any:`ot.optim.gcg` that does not linearize the entropic term and
337337
relies on :any:`ot.sinkhorn` for its iterations.
338338

339339
.. hint::
@@ -421,11 +421,11 @@ Estimating the Wassresein barycenter with free support but fixed weights
421421
corresponds to solving the following optimization problem:
422422

423423
.. math::
424-
\min_\{x_i\} \quad \sum_{k} w_kW(\mu,\mu_k)
424+
\min_{\{x_i\}} \quad \sum_{k} w_kW(\mu,\mu_k)
425425
426426
s.t. \quad \mu=\sum_{i=1}^n a_i\delta_{x_i}
427427
428-
WE provide an alternating solver based on [20]_ in
428+
We provide an alternating solver based on [20]_ in
429429
:any:`ot.lp.free_support_barycenter`. This function minimizes the problem and
430430
returns an optimal support :math:`\{x_i\}` for uniform or given weights
431431
:math:`a`.
@@ -443,13 +443,149 @@ return an optimal support :math:`\{x_i\}` for uniform or given weights
443443
Monge mapping and Domain adaptation
444444
-----------------------------------
445445

446+
The original transport problem investigated by Gaspard Monge sought a
447+
mapping function that maps (or transports) a source distribution onto a target
448+
distribution while minimizing the transport loss. The existence and uniqueness of this
449+
optimal mapping is still an open problem in the general case but has been proven
450+
for smooth distributions by Brenier in his eponymous `theorem
451+
<https://who.rocq.inria.fr/Jean-David.Benamou/demiheure.pdf>`__. We provide in
452+
:any:`ot.da` several solvers for Monge mapping estimation and domain adaptation.
453+
454+
Monge Mapping estimation
455+
^^^^^^^^^^^^^^^^^^^^^^^^
456+
457+
We now discuss several approaches that are implemented in POT to estimate or
458+
approximate a Monge mapping from finite distributions.
459+
460+
First note that when the source and target distributions are assumed to be Gaussian
461+
distributions, there exists a closed-form solution for the mapping, and it is an
462+
affine function [14]_ of the form :math:`T(x)=Ax+b`. In this case we provide the function
463+
:any:`ot.da.OT_mapping_linear` that returns the operator :math:`A` and vector
464+
:math:`b`. Note that if the number of samples is too small, the parameter
465+
:code:`reg` provides a regularization for the covariance matrix estimation.
466+
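As an illustration of the closed-form expression, the following NumPy sketch
computes :math:`A` and :math:`b` from empirical means and covariances. It is
not the code of :any:`ot.da.OT_mapping_linear`: the helper names
``sqrtm_psd`` and ``gaussian_monge_map`` are hypothetical, and the ``reg``
argument only mimics the covariance regularization described above.

```python
import numpy as np


def sqrtm_psd(C):
    """Square root of a symmetric positive semi-definite matrix."""
    w, V = np.linalg.eigh(C)
    w = np.clip(w, 0.0, None)  # guard against tiny negative eigenvalues
    return (V * np.sqrt(w)) @ V.T


def gaussian_monge_map(Xs, Xt, reg=1e-6):
    """Affine map T(x) = A x + b between Gaussians fitted to Xs and Xt.

    Hypothetical helper: `reg` is added to the diagonal of both empirical
    covariances so that they stay invertible with few samples.
    """
    d = Xs.shape[1]
    mu_s, mu_t = Xs.mean(axis=0), Xt.mean(axis=0)
    Cs = np.cov(Xs, rowvar=False) + reg * np.eye(d)
    Ct = np.cov(Xt, rowvar=False) + reg * np.eye(d)
    Cs12 = sqrtm_psd(Cs)
    Cs12_inv = np.linalg.inv(Cs12)
    # A = Cs^{-1/2} (Cs^{1/2} Ct Cs^{1/2})^{1/2} Cs^{-1/2}
    A = Cs12_inv @ sqrtm_psd(Cs12 @ Ct @ Cs12) @ Cs12_inv
    b = mu_t - A @ mu_s
    return A, b, Cs, Ct


rng = np.random.RandomState(0)
Xs = rng.randn(200, 2)
Xt = rng.randn(200, 2) @ np.array([[2.0, 0.5], [0.0, 1.0]]) + np.array([3.0, -1.0])
A, b, Cs, Ct = gaussian_monge_map(Xs, Xt)
Xs_mapped = Xs @ A.T + b  # apply T to every row of Xs
```

By construction the mapped samples have the target mean and :math:`A` sends
the (regularized) source covariance onto the target one, which is the property
that characterizes the Monge map between two Gaussians.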
467+
For a more general mapping estimation we also provide the barycentric mapping
468+
proposed in [6]_. It is implemented in the class :any:`ot.da.EMDTransport` and
469+
other transport-based classes in :any:`ot.da`. Those classes are discussed in more
470+
detail below and follow an interface similar to sklearn classes. Finally, a
471+
method proposed in [8]_ that estimates a continuous mapping approximating the
472+
barycentric mapping is provided in :any:`ot.da.joint_OT_mapping_linear` for
473+
linear mapping and :any:`ot.da.joint_OT_mapping_kernel` for nonlinear mapping.
474+
475+
.. hint::
476+
477+
Linear Monge mapping estimation is illustrated
478+
in the following example:
479+
480+
- :any:`auto_examples/plot_otda_linear_mapping`
481+
482+
Domain adaptation classes
483+
^^^^^^^^^^^^^^^^^^^^^^^^^
484+
485+
The use of OT for domain adaptation (OTDA) was first proposed in [5]_, which also
486+
introduced the group Lasso regularization. The main idea of OTDA is to estimate
487+
a mapping of the samples between source and target distributions, which makes it possible to
488+
transport labeled source samples onto the unlabeled target distribution.
489+
490+
We provide several classes based on :any:`ot.da.BaseTransport` that implement
491+
several OT and mapping estimations. The interface of those classes is similar to
492+
that of classifiers in the sklearn toolbox. At initialization, several parameters (for
493+
instance the regularization parameter) can be set. Then one needs to estimate the
494+
mapping with the function :any:`ot.da.BaseTransport.fit`. Finally, one can map the
495+
samples from source to target with :any:`ot.da.BaseTransport.transform` and
496+
from target to source with :any:`ot.da.BaseTransport.inverse_transform`. Here is
497+
an example for the class :any:`ot.da.EMDTransport`:
498+
499+
.. code::
500+
501+
ot_emd = ot.da.EMDTransport()
502+
ot_emd.fit(Xs=Xs, Xt=Xt)
503+
504+
Mapped_Xs = ot_emd.transform(Xs=Xs)
505+
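To make the barycentric mapping behind these classes concrete, here is a
minimal NumPy sketch. It assumes a transport plan is already available (in POT
it would come from one of the solvers); the toy plan below and the helper name
``barycentric_map`` are illustrative assumptions, not part of the library.

```python
import numpy as np


def barycentric_map(G, Xt):
    """Map each source sample to the weighted mean of the targets it is coupled to.

    G is an (ns, nt) transport plan and Xt an (nt, d) array of target samples.
    """
    row_mass = G.sum(axis=1, keepdims=True)  # mass leaving each source sample
    return (G @ Xt) / row_mass


Xt = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
# Toy plan with uniform source weights 1/2: the first source sample is sent
# entirely to target 1, the second is split evenly between targets 0 and 2.
G = np.array([[0.0, 0.5, 0.0],
              [0.25, 0.0, 0.25]])
Xs_mapped = barycentric_map(G, Xt)
```

The first source sample lands exactly on the target it is coupled to, while
the second lands at the midpoint of its two targets.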
506+
A list
507+
of the provided implementations is given in the following note.
508+
509+
.. note::
510+
511+
Here is a list of the mapping classes inheriting from
512+
:any:`ot.da.BaseTransport`
513+
514+
* :any:`ot.da.EMDTransport` : Barycentric mapping with EMD transport
515+
* :any:`ot.da.SinkhornTransport` : Barycentric mapping with Sinkhorn transport
516+
* :any:`ot.da.SinkhornL1l2Transport` : Barycentric mapping with Sinkhorn +
517+
group Lasso regularization [5]_
518+
* :any:`ot.da.SinkhornLpl1Transport` : Barycentric mapping with Sinkhorn +
519+
non convex group Lasso regularization [5]_
520+
* :any:`ot.da.LinearTransport` : Linear mapping estimation between Gaussians
521+
[14]_
522+
* :any:`ot.da.MappingTransport` : Nonlinear mapping estimation [8]_
523+
524+
.. hint::
525+
526+
Examples of the use of OTDA classes are available here:
527+
528+
- :any:`auto_examples/plot_otda_color_images`
529+
- :any:`auto_examples/plot_otda_mapping`
530+
- :any:`auto_examples/plot_otda_mapping_colors_images`
531+
- :any:`auto_examples/plot_otda_semi_supervised`
446532

447533
Other applications
448534
------------------
449535

536+
We discuss in the following several implementations that have been used and
537+
proposed in the OT and machine learning community.
538+
450539
Wasserstein Discriminant Analysis
451540
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
452541

542+
Wasserstein Discriminant Analysis [11]_ is a generalization of `Fisher Linear Discriminant
543+
Analysis <https://en.wikipedia.org/wiki/Linear_discriminant_analysis>`__ that
544+
allows discrimination between classes that are not linearly separable. It
545+
consists in finding a linear projector optimizing the following criterion:
546+
547+
.. math::
548+
P = \text{arg}\min_P \frac{\sum_i OT_e(\mu_i\#P,\mu_i\#P)}{\sum_{i,j\neq i}
549+
OT_e(\mu_i\#P,\mu_j\#P)}
550+
551+
where :math:`\#` is the push-forward operator, :math:`OT_e` is the entropic OT
552+
loss and :math:`\mu_i` is the
553+
distribution of samples from class :math:`i`. :math:`P` is also constrained to
554+
be in the Stiefel manifold. WDA can be solved in POT using the function
555+
:any:`ot.dr.wda`. It requires :code:`pymanopt` and
556+
:code:`autograd` to be installed for manifold optimization and automatic differentiation
557+
respectively. Note that we also provide the Fisher discriminant estimator in
558+
:any:`ot.dr.fda` for easy comparison.
559+
560+
.. warning::
561+
Note that due to the hard dependency on :code:`pymanopt` and
562+
:code:`autograd`, :any:`ot.dr` is not imported by default. If you want to
563+
use it you have to import it explicitly with :code:`import ot.dr`.
564+
565+
.. hint::
566+
567+
An example of the use of WDA is available in the following example:
568+
569+
- :any:`auto_examples/plot_WDA`
570+
571+
572+
Unbalanced optimal transport
573+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
574+
575+
Unbalanced OT is a relaxation of the original OT problem where the violation of
576+
the constraint on the marginals is added to the objective of the optimization
577+
problem:
578+
579+
.. math::
580+
\min_\gamma \quad \sum_{i,j}\gamma_{i,j}M_{i,j} + reg\cdot\Omega(\gamma) + \alpha KL(\gamma 1, a) + \alpha KL(\gamma^T 1, b)
581+
582+
s.t. \quad \gamma\geq 0
583+
584+
585+
where KL is the Kullback-Leibler divergence. This formulation allows for
586+
computing an approximate mapping between distributions that do not have the same
587+
amount of mass. Interestingly the problem can be solved with a generalization of
588+
the Bregman projections algorithm [10]_.
453589

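The scaling iterations can be sketched in a few lines of NumPy for the
entropic case. This is an illustrative sketch under stated assumptions, not
POT's implementation: the function name ``unbalanced_sinkhorn`` and its
arguments are hypothetical, :math:`\Omega` is taken to be the entropic term,
and a fixed number of iterations replaces a convergence test.

```python
import numpy as np


def unbalanced_sinkhorn(a, b, M, reg, alpha, n_iter=500):
    """Entropic unbalanced OT with KL penalties on both marginals.

    Compared with balanced Sinkhorn, each scaling update is raised to the
    power alpha / (alpha + reg), which softens the marginal constraints.
    """
    K = np.exp(-M / reg)
    fi = alpha / (alpha + reg)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = (a / (K @ v)) ** fi
        v = (b / (K.T @ u)) ** fi
    return u[:, None] * K * v[None, :]


x = np.linspace(0.0, 1.0, 5)
M = (x[:, None] - x[None, :]) ** 2      # squared Euclidean cost
a = np.full(5, 0.2)                     # total mass 1
b = np.full(5, 0.4)                     # total mass 2
G = unbalanced_sinkhorn(a, b, M, reg=0.5, alpha=1.0)

# With equal masses and a large alpha, the marginals are nearly enforced.
G_bal = unbalanced_sinkhorn(a, a, M, reg=0.5, alpha=100.0)
```

A small :code:`alpha` lets the plan deviate from the prescribed marginals,
while a large :code:`alpha` brings the behaviour close to balanced entropic OT
when both distributions carry the same mass.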
454590
Gromov-Wasserstein
455591
^^^^^^^^^^^^^^^^^^
@@ -461,6 +597,10 @@ GPU acceleration
461597
We provide several implementations of our OT solvers in :any:`ot.gpu`. Those
462598
implementations use the :code:`cupy` toolbox.
463599

600+
.. warning::
601+
Note that due to the hard dependency on :code:`cupy`, :any:`ot.gpu` is not
602+
imported by default. If you want to
603+
use it you have to import it explicitly with :code:`import ot.gpu`.
464604

465605

466606
FAQ

docs/source/readme.rst

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -221,8 +221,6 @@ This toolbox has been created and is maintained by
221221

222222
The contributors to this library are
223223

224-
- `Rémi Flamary <http://remi.flamary.com/>`__
225-
- `Nicolas Courty <http://people.irisa.fr/Nicolas.Courty/>`__
226224
- `Alexandre Gramfort <http://alexandre.gramfort.net/>`__
227225
- `Laetitia Chapel <http://people.irisa.fr/Laetitia.Chapel/>`__
228226
- `Michael Perrot <http://perso.univ-st-etienne.fr/pem82055/>`__
