Commit a96caae

author
Hicham Janati
committed
update UOT paragraph in quickstart
1 parent e552320 commit a96caae

1 file changed

+61
-46
lines changed

docs/source/quickstart.rst

Lines changed: 61 additions & 46 deletions
@@ -2,13 +2,13 @@
22
Quick start guide
33
=================
44

5-
In the following we provide some pointers about which functions and classes
5+
In the following we provide some pointers about which functions and classes
66
to use for different problems related to optimal transport (OT) and machine
77
learning. We refer when we can to concrete examples in the documentation that
88
are also available as notebooks on the POT Github.
99

1010
This document is not a tutorial on numerical optimal transport. For this we strongly
11-
recommend reading the very nice book [15]_.
11+
recommend reading the very nice book [15]_.
1212

1313

1414
Optimal transport and Wasserstein distance
@@ -55,8 +55,8 @@ solver is quite efficient and uses sparsity of the solution.
5555
Examples of use for :any:`ot.emd` are available in :
5656

5757
- :any:`auto_examples/plot_OT_2D_samples`
58-
- :any:`auto_examples/plot_OT_1D`
59-
- :any:`auto_examples/plot_OT_L1_vs_L2`
58+
- :any:`auto_examples/plot_OT_1D`
59+
- :any:`auto_examples/plot_OT_L1_vs_L2`
6060

6161

6262
Computing Wasserstein distance
@@ -102,13 +102,13 @@ distance.
102102
An example of use for :any:`ot.emd2` is available in :
103103

104104
- :any:`auto_examples/plot_compute_emd`
105-
105+
106106

107107
Special cases
108108
^^^^^^^^^^^^^
109109

110110
Note that the OT problem and the corresponding Wasserstein distance can in some
111-
special cases be computed very efficiently.
111+
special cases be computed very efficiently.
112112

113113
For instance when the samples are in 1D, then the OT problem can be solved in
114114
:math:`O(n\log(n))` by using a simple sorting. In this case we provide the
@@ -117,13 +117,13 @@ matrix and value. Note that since the solution is very sparse the :code:`sparse`
117117
parameter of :any:`ot.emd_1d` allows for solving and returning the solution for
118118
very large problems. Note that in order to compute directly the :math:`W_p`
119119
Wasserstein distance in 1D we provide the function :any:`ot.wasserstein_1d` that
120-
takes :code:`p` as a parameter.
120+
takes :code:`p` as a parameter.
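
The sorting argument above can be sketched in a few lines of NumPy. This is only an illustration and not POT's implementation: the helper name ``wasserstein_1d_sorted`` is hypothetical, and the sketch assumes both sample sets have the same size and uniform weights, in which case the optimal plan matches the i-th smallest source point to the i-th smallest target point.

```python
import numpy as np


def wasserstein_1d_sorted(x, y, p=1):
    """W_p between two 1D empirical distributions (hypothetical helper).

    Assumes the same number of samples on both sides and uniform weights.
    The cost is dominated by the two sorts, hence O(n log n).
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equally many samples on both sides")
    # Sorted samples are matched in order; average the p-th power of the gaps.
    return np.mean(np.abs(x - y) ** p) ** (1.0 / p)


x = np.array([0.0, 2.0, 1.0])
y = np.array([3.0, 1.0, 2.0])
w1 = wasserstein_1d_sorted(x, y, p=1)
w2 = wasserstein_1d_sorted(x, y, p=2)
```

For weighted samples or sample sets of different sizes, the matching is no longer a simple pairing of sorted points, which is the case handled by the library functions mentioned above.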
121121

122122
Another special case for estimating OT and Monge mapping is between Gaussian
123123
distributions. In this case there exists a closed-form solution given in Remark
124124
2.29 in [15]_ and the Monge mapping is an affine function that can
125125
also be computed from the covariances and means of the source and target
126-
distributions. When the finite sample dataset is assumed to be Gaussian, we provide
126+
distributions. When the finite sample dataset is assumed to be Gaussian, we provide
127127
:any:`ot.da.OT_mapping_linear` that returns the parameters for the Monge
128128
mapping.
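
As an illustration of the Gaussian case, the affine map can be written down directly from the means and covariances using the standard closed form for the Monge map between two Gaussians. The sketch below is not the code of :any:`ot.da.OT_mapping_linear`. The helper names ``sqrtm_psd`` and ``gaussian_monge_map`` are hypothetical, and the covariances are assumed to be symmetric positive definite.

```python
import numpy as np


def sqrtm_psd(S):
    """Square root of a symmetric positive semi-definite matrix."""
    w, V = np.linalg.eigh(S)
    w = np.clip(w, 0.0, None)  # guard against tiny negative eigenvalues
    return (V * np.sqrt(w)) @ V.T


def gaussian_monge_map(ms, Cs, mt, Ct):
    """Return (A, b) such that T(x) = A @ x + b maps N(ms, Cs) onto N(mt, Ct)."""
    Cs_half = sqrtm_psd(Cs)
    Cs_half_inv = np.linalg.inv(Cs_half)
    # A = Cs^{-1/2} (Cs^{1/2} Ct Cs^{1/2})^{1/2} Cs^{-1/2}
    A = Cs_half_inv @ sqrtm_psd(Cs_half @ Ct @ Cs_half) @ Cs_half_inv
    b = mt - A @ ms
    return A, b


ms = np.array([0.0, 0.0])
mt = np.array([1.0, 2.0])
Cs = np.array([[2.0, 0.5], [0.5, 1.0]])
Ct = np.array([[1.0, -0.3], [-0.3, 1.5]])
A, b = gaussian_monge_map(ms, Cs, mt, Ct)
```

With finite samples, the means and covariances would be replaced by empirical estimates. A quick sanity check is that ``A @ Cs @ A.T`` recovers ``Ct``.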
129129

@@ -176,7 +176,7 @@ solution of the resulting optimization problem can be expressed as:
176176
where :math:`u` and :math:`v` are vectors and :math:`K=\exp(-M/\lambda)` where
177177
the :math:`\exp` is taken component-wise. In order to solve the optimization
178178
problem, one can use an alternating projection algorithm called Sinkhorn-Knopp that can be very
179-
efficient for large values of regularization.
179+
efficient for large values of regularization.
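
To make the scaling form concrete, here is a minimal NumPy sketch of the alternating updates of :math:`u` and :math:`v`. It is an illustration under stated assumptions, not the library's code: the function name ``sinkhorn_knopp_sketch``, the iteration cap and the stopping rule are all hypothetical choices, and no numerical stabilization is attempted, so very small regularization values may underflow.

```python
import numpy as np


def sinkhorn_knopp_sketch(a, b, M, reg, n_iter=1000, tol=1e-9):
    """Entropic OT plan diag(u) K diag(v) with K = exp(-M / reg) (sketch)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    K = np.exp(-np.asarray(M, dtype=float) / reg)  # component-wise exponential
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)    # rescale rows to match the marginal a
        v = b / (K.T @ u)  # rescale columns to match the marginal b
        # After the v update the column marginal is exact; check the rows.
        row_marginal = u * (K @ v)
        if np.abs(row_marginal - a).sum() < tol:
            break
    return u[:, None] * K * v[None, :]


a = np.array([0.2, 0.8])
b = np.array([0.5, 0.5])
M = np.array([[0.0, 1.0], [1.0, 0.0]])
gamma = sinkhorn_knopp_sketch(a, b, M, reg=0.5)
```

Each update is a projection onto one of the two marginal constraints, which is why the plan's row and column sums converge to ``a`` and ``b``.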
180180

181181
The Sinkhorn-Knopp algorithm is implemented in :any:`ot.sinkhorn` and
182182
:any:`ot.sinkhorn2` that return respectively the OT matrix and the value of the
@@ -201,12 +201,12 @@ More details about the algorithms used are given in the following note.
201201
+ :code:`method='sinkhorn'` calls :any:`ot.bregman.sinkhorn_knopp` the
202202
classic algorithm [2]_.
203203
+ :code:`method='sinkhorn_stabilized'` calls :any:`ot.bregman.sinkhorn_stabilized` the
204-
log stabilized version of the algorithm [9]_.
204+
log stabilized version of the algorithm [9]_.
205205
+ :code:`method='sinkhorn_epsilon_scaling'` calls
206206
:any:`ot.bregman.sinkhorn_epsilon_scaling` the epsilon scaling version
207-
of the algorithm [9]_.
207+
of the algorithm [9]_.
208208
+ :code:`method='greenkhorn'` calls :any:`ot.bregman.greenkhorn` the
209-
greedy Sinkhorn version of the algorithm [22]_.
209+
greedy Sinkhorn version of the algorithm [22]_.
210210

211211
In addition to all those variants of sinkhorn, we have another
212212
implementation solving the problem in the smooth dual or semi-dual in
@@ -236,7 +236,7 @@ of algorithms in [18]_ [19]_.
236236
Examples of use for :any:`ot.sinkhorn` are available in :
237237

238238
- :any:`auto_examples/plot_OT_2D_samples`
239-
- :any:`auto_examples/plot_OT_1D`
239+
- :any:`auto_examples/plot_OT_1D`
240240
- :any:`auto_examples/plot_OT_1D_smooth`
241241
- :any:`auto_examples/plot_stochastic`
242242

@@ -248,13 +248,13 @@ While entropic OT is the most common and favored in practice, there exist other
248248
kinds of regularization. We provide in POT two specific solvers for other
249249
regularization terms, namely quadratic regularization and group lasso
250250
regularization. We also provide in :any:`ot.optim` two generic solvers that allow solving any
251-
smooth regularization in practice.
251+
smooth regularization in practice.
252252

253253
Quadratic regularization
254254
""""""""""""""""""""""""
255255

256256
The first general regularization term we can solve is the quadratic
257-
regularization of the form
257+
regularization of the form
258258

259259
.. math::
260260
\Omega(\gamma)=\sum_{i,j} \gamma_{i,j}^2
@@ -264,7 +264,7 @@ densifying the OT matrix but it keeps some sort of sparsity that is lost with
264264
entropic regularization as soon as :math:`\lambda>0` [17]_. This problem can be
265265
solved with POT using solvers from :any:`ot.smooth`, more specifically
266266
functions :any:`ot.smooth.smooth_ot_dual` or
267-
:any:`ot.smooth.smooth_ot_semi_dual` with parameter :code:`reg_type='l2'` to
267+
:any:`ot.smooth.smooth_ot_semi_dual` with parameter :code:`reg_type='l2'` to
268268
choose the quadratic regularization.
269269

270270
.. hint::
@@ -300,7 +300,7 @@ gradient algorithm [7]_ in function
300300
.. hint::
301301
Examples of group Lasso regularization are available in :
302302

303-
- :any:`auto_examples/plot_otda_classes`
303+
- :any:`auto_examples/plot_otda_classes`
304304
- :any:`auto_examples/plot_otda_d2`
305305

306306

@@ -311,7 +311,7 @@ Finally we propose in POT generic solvers that can be used to solve any
311311
regularization as long as you can provide a function computing the
312312
regularization and a function computing its gradient (or sub-gradient).
313313

314-
In order to solve
314+
In order to solve
315315

316316
.. math::
317317
\gamma^* = arg\min_\gamma \quad \sum_{i,j}\gamma_{i,j}M_{i,j} + \lambda\Omega(\gamma)
@@ -336,12 +336,12 @@ Another generic solver is proposed to solve the problem
336336
where :math:`\Omega_e` is the entropic regularization. In this case we use a
337337
generalized conditional gradient [7]_ implemented in :any:`ot.optim.gcg` that
338338
does not linearize the entropic term but
339-
relies on :any:`ot.sinkhorn` for its iterations.
339+
relies on :any:`ot.sinkhorn` for its iterations.
340340

341341
.. hint::
342342
An example of the generic solvers is available in :
343343

344-
- :any:`auto_examples/plot_optim_OTreg`
344+
- :any:`auto_examples/plot_optim_OTreg`
345345

346346

347347
Wasserstein Barycenters
@@ -382,7 +382,7 @@ solver :any:`ot.lp.barycenter` that rely on generic LP solvers. By default the
382382
function uses :any:`scipy.optimize.linprog`, but more efficient LP solvers from
383383
cvxopt can be also used by changing parameter :code:`solver`. Note that this problem
384384
requires solving a very large linear program and can be very slow in
385-
practice.
385+
practice.
386386

387387
Similarly to the OT problem, OT barycenters can be computed in the regularized
388388
case. When entropic regularization is used, the problem can be solved with a
@@ -403,11 +403,11 @@ operators. We provide an implementation of this algorithm in function
403403
Examples of Wasserstein (:any:`ot.lp.barycenter`) and regularized Wasserstein
404404
barycenter (:any:`ot.bregman.barycenter`) computation are available in :
405405

406-
- :any:`auto_examples/plot_barycenter_1D`
407-
- :any:`auto_examples/plot_barycenter_lp_vs_entropic`
406+
- :any:`auto_examples/plot_barycenter_1D`
407+
- :any:`auto_examples/plot_barycenter_lp_vs_entropic`
408408

409409
An example of convolutional barycenter
410-
(:any:`ot.bregman.convolutional_barycenter2d`) computation
410+
(:any:`ot.bregman.convolutional_barycenter2d`) computation
411411
for 2D images is available
412412
in :
413413

@@ -451,13 +451,13 @@ optimal mapping is still an open problem in the general case but has been proven
451451
for smooth distributions by Brenier in his eponymous `theorem
452452
<https://who.rocq.inria.fr/Jean-David.Benamou/demiheure.pdf>`__. We provide in
453453
:any:`ot.da` several solvers for smooth Monge mapping estimation and domain
454-
adaptation from discrete distributions.
454+
adaptation from discrete distributions.
455455

456456
Monge Mapping estimation
457457
^^^^^^^^^^^^^^^^^^^^^^^^
458458

459459
We now discuss several approaches that are implemented in POT to estimate or
460-
approximate a Monge mapping from finite distributions.
460+
approximate a Monge mapping from finite distributions.
461461

462462
First note that when the source and target distributions are assumed to be Gaussian
463463
distributions, there exists a closed-form solution for the mapping and it is an
@@ -513,16 +513,16 @@ A list of the provided implementation is given in the following note.
513513

514514
Here is a list of the OT mapping classes inheriting from
515515
:any:`ot.da.BaseTransport`
516-
516+
517517
* :any:`ot.da.EMDTransport` : Barycentric mapping with EMD transport
518518
* :any:`ot.da.SinkhornTransport` : Barycentric mapping with Sinkhorn transport
519519
* :any:`ot.da.SinkhornL1l2Transport` : Barycentric mapping with Sinkhorn +
520520
group Lasso regularization [5]_
521521
* :any:`ot.da.SinkhornLpl1Transport` : Barycentric mapping with Sinkhorn +
522-
non convex group Lasso regularization [5]_
522+
non convex group Lasso regularization [5]_
523523
* :any:`ot.da.LinearTransport` : Linear mapping estimation between Gaussians
524524
[14]_
525-
* :any:`ot.da.MappingTransport` : Nonlinear mapping estimation [8]_
525+
* :any:`ot.da.MappingTransport` : Nonlinear mapping estimation [8]_
526526

527527
.. hint::
528528

@@ -550,7 +550,7 @@ consist in finding a linear projector optimizing the following criterion
550550
.. math::
551551
P = \text{arg}\min_P \frac{\sum_i OT_e(\mu_i\#P,\mu_i\#P)}{\sum_{i,j\neq i}
552552
OT_e(\mu_i\#P,\mu_j\#P)}
553-
553+
554554
where :math:`\#` is the push-forward operator, :math:`OT_e` is the entropic OT
555555
loss and :math:`\mu_i` is the
556556
distribution of samples from class :math:`i`. :math:`P` is also constrained to
@@ -575,10 +575,10 @@ respectively. Note that we also provide the Fisher discriminant estimator in
575575
Unbalanced optimal transport
576576
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
577577

578-
Unbalanced OT is a relaxation of the original OT problem where the violation of
578+
Unbalanced OT is a relaxation of the entropy regularized OT problem where the violation of
579579
the constraint on the marginals is added to the objective of the optimization
580580
problem:
581-
581+
582582
.. math::
583583
\min_\gamma \quad \sum_{i,j}\gamma_{i,j}M_{i,j} + reg\cdot\Omega(\gamma) + \alpha KL(\gamma 1, a) + \alpha KL(\gamma^T 1, b)
584584
@@ -589,9 +589,24 @@ where KL is the Kullback-Leibler divergence. This formulation allows for
589589
computing approximate mapping between distributions that do not have the same
590590
amount of mass. Interestingly the problem can be solved with a generalization of
591591
the Bregman projections algorithm [10]_. We provide a solver for unbalanced OT
592-
in :any:`ot.unbalanced` and more specifically
593-
in function :any:`ot.sinkhorn_unbalanced`. A solver for unbalanced OT barycenter
594-
is available in :any:`ot.barycenter_unbalanced`.
592+
in :any:`ot.unbalanced`. Computing the optimal transport
593+
plan or the transport cost is similar to the balanced case. The Sinkhorn-Knopp
594+
algorithm is implemented in :any:`ot.sinkhorn_unbalanced` and :any:`ot.sinkhorn_unbalanced2`
595+
that return respectively the OT matrix and the value of the
596+
linear term. Note that the regularization parameter :math:`\alpha` in the
597+
equation above is given to those functions with the parameter :code:`reg_m`.
598+
599+
Similarly, unbalanced OT barycenters can be computed using :any:`ot.barycenter_unbalanced`.
600+
601+
.. note::
602+
The main function to solve entropic regularized unbalanced OT is :any:`ot.sinkhorn_unbalanced`.
603+
This function is a wrapper and the parameter :code:`method` helps you select
604+
the actual algorithm used to solve the problem:
605+
606+
+ :code:`method='sinkhorn'` calls :any:`ot.unbalanced.sinkhorn_knopp_unbalanced`
607+
the generalized Sinkhorn algorithm [10]_.
608+
+ :code:`method='sinkhorn_stabilized'` calls :any:`ot.unbalanced.sinkhorn_stabilized_unbalanced`
609+
the log stabilized version of the algorithm [10]_.
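
As a rough illustration of how the generalized Sinkhorn iterations differ from the balanced case, the NumPy sketch below raises each scaling update to the power :math:`\alpha/(\alpha + reg)`, with :math:`\alpha` passed as ``reg_m``. This is a sketch under stated assumptions rather than the code in :any:`ot.unbalanced`: the function name ``sinkhorn_unbalanced_sketch``, the iteration cap and the stopping rule are hypothetical, and no log-domain stabilization is used.

```python
import numpy as np


def sinkhorn_unbalanced_sketch(a, b, M, reg, reg_m, n_iter=1000, tol=1e-9):
    """Entropic unbalanced OT plan with KL marginal penalties (sketch)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    K = np.exp(-np.asarray(M, dtype=float) / reg)
    fi = reg_m / (reg_m + reg)  # exponent that softens the marginal projections
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u_prev, v_prev = u, v
        u = (a / (K @ v)) ** fi
        v = (b / (K.T @ u)) ** fi
        # Stop when the scaling vectors no longer move.
        if max(np.abs(u - u_prev).max(), np.abs(v - v_prev).max()) < tol:
            break
    return u[:, None] * K * v[None, :]


# Marginals with different total mass (1 versus 2).
a_u = np.array([0.5, 0.5])
b_u = np.array([1.0, 0.5, 0.5])
M_u = np.array([[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]])
plan = sinkhorn_unbalanced_sketch(a_u, b_u, M_u, reg=0.5, reg_m=1.0)

# With a very large reg_m and equal masses, the plan approaches the balanced one.
a_bal = np.array([0.3, 0.7])
b_bal = np.array([0.6, 0.4])
M_bal = np.array([[0.0, 1.0], [1.0, 0.0]])
plan_bal = sinkhorn_unbalanced_sketch(a_bal, b_bal, M_bal, reg=0.5, reg_m=1e6)
```

When ``reg_m`` grows, the exponent tends to 1 and the updates reduce to the balanced Sinkhorn-Knopp iterations. Smaller values let the plan's marginals deviate from ``a`` and ``b``.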
595610

596611

597612
.. hint::
@@ -636,7 +651,7 @@ barycenters that can be expressed as
636651
637652
where :math:`Ck` is the distance matrix between samples in distribution
638653
:math:`k`. Note that interestingly the barycenter is defined as a symmetric
639-
positive matrix. We provide a block coordinate optimization procedure in
654+
positive matrix. We provide a block coordinate optimization procedure in
640655
:any:`ot.gromov.gromov_barycenters` and
641656
:any:`ot.gromov.entropic_gromov_barycenters` for non-regularized and regularized
642657
barycenters respectively.
@@ -654,19 +669,19 @@ The implementations of FGW and FGW barycenter is provided in functions
654669
Examples of computation of GW, regularized GW and FGW are available in :
655670

656671
- :any:`auto_examples/plot_gromov`
657-
- :any:`auto_examples/plot_fgw`
672+
- :any:`auto_examples/plot_fgw`
658673

659674
Examples of GW, regularized GW and FGW barycenters are available in :
660675

661676
- :any:`auto_examples/plot_gromov_barycenter`
662-
- :any:`auto_examples/plot_barycenter_fgw`
677+
- :any:`auto_examples/plot_barycenter_fgw`
663678

664679

665680
GPU acceleration
666681
^^^^^^^^^^^^^^^^
667682

668683
We provide several implementations of our OT solvers in :any:`ot.gpu`. Those
669-
implementations use the :code:`cupy` toolbox, which needs to be installed.
684+
implementations use the :code:`cupy` toolbox, which needs to be installed.
670685

671686

672687
.. note::
@@ -701,7 +716,7 @@ FAQ
701716
1. **How to solve a discrete optimal transport problem ?**
702717

703718
The solver for discrete OT is the function :py:mod:`ot.emd` that returns
704-
the OT transport matrix. If you want to solve a regularized OT you can
719+
the OT transport matrix. If you want to solve a regularized OT you can
705720
use :py:mod:`ot.sinkhorn`.
706721

707722

@@ -714,9 +729,9 @@ FAQ
714729
T=ot.emd(a,b,M) # exact linear program
715730
T_reg=ot.sinkhorn(a,b,M,reg) # entropic regularized OT
716731
717-
A more detailed example is available here:
732+
A more detailed example is available here:
718733
:doc:`auto_examples/plot_OT_2D_samples`
719-
734+
720735

721736
2. **pip install POT fails with error : ImportError: No module named Cython.Build**
722737

@@ -726,7 +741,7 @@ FAQ
726741
installing POT.
727742

728743
Note that this problem does not occur when using conda-forge since the packages
729-
there are pre-compiled.
744+
there are pre-compiled.
730745

731746
See `Issue #59 <https://github.com/rflamary/POT/issues/59>`__ for more
732747
details.
@@ -751,7 +766,7 @@ FAQ
751766
In order to limit import time and hard dependencies in POT, we do not import
752767
some sub-modules automatically with :code:`import ot`. In order to use the
753768
acceleration in :any:`ot.gpu` you first need to import it with
754-
:code:`import ot.gpu`.
769+
:code:`import ot.gpu`.
755770

756771
See `Issue #85 <https://github.com/rflamary/POT/issues/85>`__ and :any:`ot.gpu`
757772
for more details.
@@ -763,7 +778,7 @@ References
763778
.. [1] Bonneel, N., Van De Panne, M., Paris, S., & Heidrich, W. (2011,
764779
December). `Displacement interpolation using Lagrangian mass transport
765780
<https://people.csail.mit.edu/sparis/publi/2011/sigasia/Bonneel_11_Displacement_Interpolation.pdf>`__.
766-
In ACM Transactions on Graphics (TOG) (Vol. 30, No. 6, p. 158). ACM.
781+
In ACM Transactions on Graphics (TOG) (Vol. 30, No. 6, p. 158). ACM.
767782
768783
.. [2] Cuturi, M. (2013). `Sinkhorn distances: Lightspeed computation of
769784
optimal transport <https://arxiv.org/pdf/1306.0895.pdf>`__. In Advances
@@ -874,4 +889,4 @@ References
874889
.. [24] Vayer, T., Chapel, L., Flamary, R., Tavenard, R. and Courty, N.
875890
(2019). `Optimal Transport for structured data with application on
876891
graphs <http://proceedings.mlr.press/v97/titouan19a.html>`__ Proceedings
877-
of the 36th International Conference on Machine Learning (ICML).
892+
of the 36th International Conference on Machine Learning (ICML).
