2
2
Quick start guide
3
3
=================
4
4
5
- In the following we provide some pointers about which functions and classes
5
+ In the following we provide some pointers about which functions and classes
6
6
to use for different problems related to optimal transport (OT) and machine
7
7
learning. We refer when we can to concrete examples in the documentation that
8
8
are also available as notebooks on the POT Github.
9
9
10
10
This document is not a tutorial on numerical optimal transport. For this we strongly
11
- recommend reading the very nice book [15]_.
11
+ recommend reading the very nice book [15]_.
12
12
13
13
14
14
Optimal transport and Wasserstein distance
@@ -55,8 +55,8 @@ solver is quite efficient and uses sparsity of the solution.
55
55
Examples of use for :any: `ot.emd ` are available in :
56
56
57
57
- :any: `auto_examples/plot_OT_2D_samples `
58
- - :any: `auto_examples/plot_OT_1D `
59
- - :any: `auto_examples/plot_OT_L1_vs_L2 `
58
+ - :any: `auto_examples/plot_OT_1D `
59
+ - :any: `auto_examples/plot_OT_L1_vs_L2 `
60
60
61
61
62
62
Computing Wasserstein distance
@@ -102,13 +102,13 @@ distance.
102
102
An example of use for :any: `ot.emd2 ` is available in :
103
103
104
104
- :any: `auto_examples/plot_compute_emd `
105
-
105
+
106
106
107
107
Special cases
108
108
^^^^^^^^^^^^^
109
109
110
110
Note that the OT problem and the corresponding Wasserstein distance can in some
111
- special cases be computed very efficiently.
111
+ special cases be computed very efficiently.
112
112
113
113
For instance when the samples are in 1D, then the OT problem can be solved in
114
114
:math:`O(n\log(n))` by simply sorting the samples. In this case we provide the
@@ -117,13 +117,13 @@ matrix and value. Note that since the solution is very sparse the :code:`sparse`
117
117
parameter of :any: `ot.emd_1d ` allows for solving and returning the solution for
118
118
very large problems. Note that in order to compute directly the :math: `W_p`
119
119
Wasserstein distance in 1D we provide the function :any: `ot.wasserstein_1d ` that
120
- takes :code: `p ` as a parameter.
120
+ takes :code: `p ` as a parameter.
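
The sorting argument for the 1D case can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions (same number of samples on both sides, uniform weights); the helper name :code:`wasserstein_1d_sorted` is hypothetical and is not the implementation behind :any:`ot.emd_1d` or :any:`ot.wasserstein_1d`.

```python
def wasserstein_1d_sorted(xs, xt, p=1):
    """W_p between two 1D empirical distributions with equally many,
    uniformly weighted samples (hypothetical helper, illustrative only)."""
    if len(xs) != len(xt):
        raise ValueError("this sketch assumes the same number of samples")
    xs_sorted = sorted(xs)  # O(n log n)
    xt_sorted = sorted(xt)  # O(n log n)
    # For p >= 1 the optimal plan in 1D matches the i-th smallest source
    # sample with the i-th smallest target sample.
    cost = sum(abs(s - t) ** p for s, t in zip(xs_sorted, xt_sorted)) / len(xs)
    return cost ** (1.0 / p)


source = [0.0, 2.0, 1.0]
target = [3.0, 1.0, 2.0]
w1 = wasserstein_1d_sorted(source, target, p=1)
w2 = wasserstein_1d_sorted(source, target, p=2)
```

Here every sorted source sample is moved by exactly one unit, so both values equal 1.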
121
121
122
122
Another special case for estimating OT and Monge mapping is between Gaussian
123
123
distributions. In this case there exists a closed-form solution given in Remark
124
124
2.29 in [15]_ and the Monge mapping is an affine function that can
125
125
also be computed from the covariances and means of the source and target
126
- distributions. In the case when the finite sample dataset is assumed to be Gaussian, we provide
126
+ distributions. In the case when the finite sample dataset is assumed to be Gaussian, we provide
127
127
:any: `ot.da.OT_mapping_linear ` that returns the parameters for the Monge
128
128
mapping.
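
To make the affine structure concrete, here is a minimal sketch restricted to the 1D case, where covariances reduce to standard deviations. The function names are hypothetical and the code does not call :any:`ot.da.OT_mapping_linear`; it only evaluates the well-known 1D closed form.

```python
import math


def gaussian_monge_map_1d(mean_s, std_s, mean_t, std_t):
    """Return (A, b) such that T(x) = A * x + b pushes N(mean_s, std_s^2)
    onto N(mean_t, std_t^2). Hypothetical helper, 1D special case only."""
    A = std_t / std_s
    b = mean_t - A * mean_s
    return A, b


def gaussian_w2_1d(mean_s, std_s, mean_t, std_t):
    """2-Wasserstein distance between two 1D Gaussians (closed form)."""
    return math.sqrt((mean_s - mean_t) ** 2 + (std_s - std_t) ** 2)


# Source N(0, 1) and target N(3, 2^2)
A, b = gaussian_monge_map_1d(0.0, 1.0, 3.0, 2.0)
w2 = gaussian_w2_1d(0.0, 1.0, 3.0, 2.0)
```

With these inputs the mapping is :math:`T(x) = 2x + 3` and the squared distance is :math:`3^2 + 1^2 = 10`.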
129
129
@@ -176,7 +176,7 @@ solution of the resulting optimization problem can be expressed as:
176
176
where :math: `u` and :math: `v` are vectors and :math: `K=\exp (-M/\lambda )` where
177
177
the :math: `\exp ` is taken component-wise. In order to solve the optimization
178
178
problem, one can use an alternating projection algorithm called Sinkhorn-Knopp that can be very
179
- efficient for large values of regularization.
179
+ efficient for large values of regularization.
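
The alternating scaling updates can be sketched in plain Python as follows. This is a didactic sketch, assuming the usual scaling form :math:`\gamma = \text{diag}(u) K \text{diag}(v)` with :math:`K=\exp(-M/\lambda)`; the function name and the fixed iteration count are choices made for this illustration, and the code is not the implementation in :any:`ot.sinkhorn`.

```python
import math


def sinkhorn_knopp_sketch(a, b, M, reg, n_iter=1000):
    """Entropic OT plan via alternating scalings (illustrative sketch)."""
    n, m = len(a), len(b)
    # Gibbs kernel, exponential taken component-wise
    K = [[math.exp(-M[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Rescale rows so that the plan has row marginal a
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so that the plan has column marginal b
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


a = [0.5, 0.5]
b = [0.25, 0.75]
M = [[0.0, 1.0], [1.0, 0.0]]
gamma = sinkhorn_knopp_sketch(a, b, M, reg=0.5)
row_sums = [sum(row) for row in gamma]
col_sums = [sum(gamma[i][j] for i in range(2)) for j in range(2)]
```

After convergence the row and column sums of the plan match :code:`a` and :code:`b` up to numerical precision.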
180
180
181
181
The Sinkhorn-Knopp algorithm is implemented in :any: `ot.sinkhorn ` and
182
182
:any: `ot.sinkhorn2 ` that return respectively the OT matrix and the value of the
@@ -201,12 +201,12 @@ More details about the algorithms used are given in the following note.
201
201
+ :code: `method='sinkhorn' ` calls :any: `ot.bregman.sinkhorn_knopp ` the
202
202
classic algorithm [2 ]_.
203
203
+ :code: `method='sinkhorn_stabilized' ` calls :any: `ot.bregman.sinkhorn_stabilized ` the
204
- log stabilized version of the algorithm [9 ]_.
204
+ log stabilized version of the algorithm [9 ]_.
205
205
+ :code: `method='sinkhorn_epsilon_scaling' ` calls
206
206
:any: `ot.bregman.sinkhorn_epsilon_scaling ` the epsilon scaling version
207
- of the algorithm [9 ]_.
207
+ of the algorithm [9 ]_.
208
208
+ :code: `method='greenkhorn' ` calls :any: `ot.bregman.greenkhorn ` the
209
- greedy Sinkhorn version of the algorithm [22]_.
209
+ greedy Sinkhorn version of the algorithm [22]_.
210
210
211
211
In addition to all those variants of sinkhorn, we have another
212
212
implementation solving the problem in the smooth dual or semi-dual in
@@ -236,7 +236,7 @@ of algorithms in [18]_ [19]_.
236
236
Examples of use for :any: `ot.sinkhorn ` are available in :
237
237
238
238
- :any: `auto_examples/plot_OT_2D_samples `
239
- - :any: `auto_examples/plot_OT_1D `
239
+ - :any: `auto_examples/plot_OT_1D `
240
240
- :any: `auto_examples/plot_OT_1D_smooth `
241
241
- :any: `auto_examples/plot_stochastic `
242
242
@@ -248,13 +248,13 @@ While entropic OT is the most common and favored in practice, there exist other
248
248
kinds of regularization. We provide in POT two specific solvers for other
249
249
regularization terms, namely quadratic regularization and group lasso
250
250
regularization. We also provide in :any:`ot.optim` two generic solvers that can handle any
251
- smooth regularization in practice.
251
+ smooth regularization in practice.
252
252
253
253
Quadratic regularization
254
254
""""""""""""""""""""""""
255
255
256
256
The first general regularization term we can solve is the quadratic
257
- regularization of the form
257
+ regularization of the form
258
258
259
259
.. math ::
260
260
\Omega (\gamma )=\sum _{i,j} \gamma _{i,j}^2
@@ -264,7 +264,7 @@ densifying the OT matrix but it keeps some sort of sparsity that is lost with
264
264
entropic regularization as soon as :math: `\lambda >0 ` [17 ]_. This problem can be
265
265
solved with POT using solvers from :any: `ot.smooth `, more specifically
266
266
functions :any: `ot.smooth.smooth_ot_dual ` or
267
- :any: `ot.smooth.smooth_ot_semi_dual ` with parameter :code: `reg_type='l2' ` to
267
+ :any: `ot.smooth.smooth_ot_semi_dual ` with parameter :code: `reg_type='l2' ` to
268
268
choose the quadratic regularization.
269
269
270
270
.. hint ::
@@ -300,7 +300,7 @@ gradient algorithm [7]_ in function
300
300
.. hint ::
301
301
Examples of group Lasso regularization are available in :
302
302
303
- - :any: `auto_examples/plot_otda_classes `
303
+ - :any: `auto_examples/plot_otda_classes `
304
304
- :any: `auto_examples/plot_otda_d2 `
305
305
306
306
@@ -311,7 +311,7 @@ Finally we propose in POT generic solvers that can be used to solve any
311
311
regularization as long as you can provide a function computing the
312
312
regularization and a function computing its gradient (or sub-gradient).
313
313
314
- In order to solve
314
+ In order to solve
315
315
316
316
.. math ::
317
317
\gamma ^* = arg\min _\gamma \quad \sum _{i,j}\gamma _{i,j}M_{i,j} + \lambda\Omega (\gamma )
@@ -336,12 +336,12 @@ Another generic solver is proposed to solve the problem
336
336
where :math: `\Omega _e` is the entropic regularization. In this case we use a
337
337
generalized conditional gradient [7 ]_ implemented in :any: `ot.optim.gcg ` that
338
338
does not linearize the entropic term but
339
- relies on :any: `ot.sinkhorn ` for its iterations.
339
+ relies on :any: `ot.sinkhorn ` for its iterations.
340
340
341
341
.. hint ::
342
342
An example of the generic solvers is available in:
343
343
344
- - :any: `auto_examples/plot_optim_OTreg `
344
+ - :any: `auto_examples/plot_optim_OTreg `
345
345
346
346
347
347
Wasserstein Barycenters
@@ -382,7 +382,7 @@ solver :any:`ot.lp.barycenter` that rely on generic LP solvers. By default the
382
382
function uses :any: `scipy.optimize.linprog `, but more efficient LP solvers from
383
383
cvxopt can also be used by changing parameter :code:`solver`. Note that this problem
384
384
requires solving a very large linear program and can be very slow in
385
- practice.
385
+ practice.
386
386
387
387
Similarly to the OT problem, OT barycenters can be computed in the regularized
388
388
case. When entropic regularization is used, the problem can be solved with a
@@ -403,11 +403,11 @@ operators. We provide an implementation of this algorithm in function
403
403
Examples of Wasserstein (:any: `ot.lp.barycenter `) and regularized Wasserstein
404
404
barycenter (:any: `ot.bregman.barycenter `) computation are available in :
405
405
406
- - :any: `auto_examples/plot_barycenter_1D `
407
- - :any: `auto_examples/plot_barycenter_lp_vs_entropic `
406
+ - :any: `auto_examples/plot_barycenter_1D `
407
+ - :any: `auto_examples/plot_barycenter_lp_vs_entropic `
408
408
409
409
An example of convolutional barycenter
410
- (:any: `ot.bregman.convolutional_barycenter2d `) computation
410
+ (:any: `ot.bregman.convolutional_barycenter2d `) computation
411
411
for 2D images is available
412
412
in :
413
413
@@ -451,13 +451,13 @@ optimal mapping is still an open problem in the general case but has been proven
451
451
for smooth distributions by Brenier in his eponymous `theorem
452
452
<https://who.rocq.inria.fr/Jean-David.Benamou/demiheure.pdf> `__. We provide in
453
453
:any: `ot.da ` several solvers for smooth Monge mapping estimation and domain
454
- adaptation from discrete distributions.
454
+ adaptation from discrete distributions.
455
455
456
456
Monge Mapping estimation
457
457
^^^^^^^^^^^^^^^^^^^^^^^^
458
458
459
459
We now discuss several approaches that are implemented in POT to estimate or
460
- approximate a Monge mapping from finite distributions.
460
+ approximate a Monge mapping from finite distributions.
461
461
462
462
First note that when the source and target distributions are supposed to be Gaussian
463
463
distributions, there exists a closed-form solution for the mapping and it is an
@@ -513,16 +513,16 @@ A list of the provided implementation is given in the following note.
513
513
514
514
Here is a list of the OT mapping classes inheriting from
515
515
:any: `ot.da.BaseTransport `
516
-
516
+
517
517
* :any: `ot.da.EMDTransport ` : Barycentric mapping with EMD transport
518
518
* :any: `ot.da.SinkhornTransport ` : Barycentric mapping with Sinkhorn transport
519
519
* :any: `ot.da.SinkhornL1l2Transport ` : Barycentric mapping with Sinkhorn +
520
520
group Lasso regularization [5 ]_
521
521
* :any: `ot.da.SinkhornLpl1Transport ` : Barycentric mapping with Sinkhorn +
522
- non convex group Lasso regularization [5 ]_
522
+ non convex group Lasso regularization [5 ]_
523
523
* :any: `ot.da.LinearTransport ` : Linear mapping estimation between Gaussians
524
524
[14 ]_
525
- * :any: `ot.da.MappingTransport ` : Nonlinear mapping estimation [8 ]_
525
+ * :any: `ot.da.MappingTransport ` : Nonlinear mapping estimation [8 ]_
526
526
527
527
.. hint ::
528
528
@@ -550,7 +550,7 @@ consist in finding a linear projector optimizing the following criterion
550
550
.. math ::
551
551
P = \text {arg}\min _P \frac {\sum _i OT_e(\mu _i\# P,\mu _i\# P)}{\sum _{i,j\neq i}
552
552
OT_e(\mu _i\# P,\mu _j\# P)}
553
-
553
+
554
554
where :math: `\#` is the push-forward operator, :math: `OT_e` is the entropic OT
555
555
loss and :math: `\mu _i` is the
556
556
distribution of samples from class :math: `i`. :math: `P` is also constrained to
@@ -575,10 +575,10 @@ respectively. Note that we also provide the Fisher discriminant estimator in
575
575
Unbalanced optimal transport
576
576
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
577
577
578
- Unbalanced OT is a relaxation of the original OT problem where the violation of
578
+ Unbalanced OT is a relaxation of the entropy regularized OT problem where the violation of
579
579
the constraint on the marginals is added to the objective of the optimization
580
580
problem:
581
-
581
+
582
582
.. math ::
583
583
\min _\gamma \quad \sum _{i,j}\gamma _{i,j}M_{i,j} + reg\cdot\Omega (\gamma ) + \alpha KL(\gamma 1 , a) + \alpha KL(\gamma ^T 1 , b)
584
584
@@ -589,9 +589,24 @@ where KL is the Kullback-Leibler divergence. This formulation allows for
589
589
computing approximate mapping between distributions that do not have the same
590
590
amount of mass. Interestingly the problem can be solved with a generalization of
591
591
the Bregman projections algorithm [10 ]_. We provide a solver for unbalanced OT
592
- in :any: `ot.unbalanced ` and more specifically
593
- in function :any: `ot.sinkhorn_unbalanced `. A solver for unbalanced OT barycenter
594
- is available in :any: `ot.barycenter_unbalanced `.
592
+ in :any: `ot.unbalanced `. Computing the optimal transport
593
+ plan or the transport cost is similar to the balanced case. The Sinkhorn-Knopp
594
+ algorithm is implemented in :any: `ot.sinkhorn_unbalanced ` and :any: `ot.sinkhorn_unbalanced2 `
595
+ that return respectively the OT matrix and the value of the
596
+ linear term. Note that the regularization parameter :math: `\alpha ` in the
597
+ equation above is given to those functions with the parameter :code: `reg_m `.
598
+
599
+ Similarly, unbalanced OT barycenters can be computed using :any:`ot.barycenter_unbalanced`.
600
+
601
+ .. note ::
602
+ The main function to solve entropic regularized unbalanced OT is :any:`ot.sinkhorn_unbalanced`.
603
+ This function is a wrapper and the parameter :code:`method` helps you select
604
+ the actual algorithm used to solve the problem:
605
+
606
+ + :code: `method='sinkhorn' ` calls :any: `ot.unbalanced.sinkhorn_knopp_unbalanced `
607
+ the generalized Sinkhorn algorithm [10 ]_.
608
+ + :code: `method='sinkhorn_stabilized' ` calls :any: `ot.unbalanced.sinkhorn_stabilized_unbalanced `
609
+ the log stabilized version of the algorithm [10 ]_.
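
The generalized Sinkhorn iterations can be sketched in plain Python as follows. This is a sketch under stated assumptions: it follows the scaling iterations described in [10]_ for KL marginal penalties, where each scaling update is raised to the power :code:`reg_m / (reg_m + reg)`; the function name is hypothetical and the code is not the implementation in :any:`ot.unbalanced`.

```python
import math


def sinkhorn_unbalanced_sketch(a, b, M, reg, reg_m, n_iter=2000):
    """Entropic unbalanced OT plan with KL marginal penalties
    (illustrative sketch of the generalized scaling iterations)."""
    n, m = len(a), len(b)
    K = [[math.exp(-M[i][j] / reg) for j in range(m)] for i in range(n)]
    # Exponent that softens the marginal constraints; it tends to 1
    # (balanced Sinkhorn) when reg_m grows.
    fi = reg_m / (reg_m + reg)
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        u = [(a[i] / sum(K[i][j] * v[j] for j in range(m))) ** fi
             for i in range(n)]
        v = [(b[j] / sum(K[i][j] * u[i] for i in range(n))) ** fi
             for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


M = [[0.0, 1.0], [1.0, 0.0]]

# Equal masses and a large marginal penalty: close to the balanced solution
a = [0.5, 0.5]
b = [0.25, 0.75]
plan_strict = sinkhorn_unbalanced_sketch(a, b, M, reg=0.5, reg_m=1e4)
row_sums_strict = [sum(row) for row in plan_strict]

# Different total masses and a moderate penalty: marginals are only approximated
plan_relaxed = sinkhorn_unbalanced_sketch([1.0, 1.0], b, M, reg=0.5, reg_m=1.0)
total_mass_relaxed = sum(sum(row) for row in plan_relaxed)
```

With a large :code:`reg_m` the plan nearly satisfies the marginal constraints, while a smaller value trades marginal violation against transport cost.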
595
610
596
611
597
612
.. hint ::
@@ -636,7 +651,7 @@ barycenters that can be expressed as
636
651
637
652
where :math: `Ck` is the distance matrix between samples in distribution
638
653
:math: `k`. Note that interestingly the barycenter is defined as a symmetric
639
- positive matrix. We provide a block coordinate optimization procedure in
654
+ positive matrix. We provide a block coordinate optimization procedure in
640
655
:any: `ot.gromov.gromov_barycenters ` and
641
656
:any: `ot.gromov.entropic_gromov_barycenters ` for non-regularized and regularized
642
657
barycenters respectively.
@@ -654,19 +669,19 @@ The implementations of FGW and FGW barycenter is provided in functions
654
669
Examples of computation of GW, regularized GW and FGW are available in:
655
670
656
671
- :any: `auto_examples/plot_gromov `
657
- - :any: `auto_examples/plot_fgw `
672
+ - :any: `auto_examples/plot_fgw `
658
673
659
674
Examples of GW, regularized GW and FGW barycenters are available in :
660
675
661
676
- :any: `auto_examples/plot_gromov_barycenter `
662
- - :any: `auto_examples/plot_barycenter_fgw `
677
+ - :any: `auto_examples/plot_barycenter_fgw `
663
678
664
679
665
680
GPU acceleration
666
681
^^^^^^^^^^^^^^^^
667
682
668
683
We provide several implementations of our OT solvers in :any:`ot.gpu`. Those
669
- implementations use the :code:`cupy` toolbox, which needs to be installed.
684
+ implementations use the :code:`cupy` toolbox, which needs to be installed.
670
685
671
686
672
687
.. note ::
701
716
1. **How to solve a discrete optimal transport problem?**
702
717
703
718
The solver for discrete OT is the function :py:mod: `ot.emd ` that returns
704
- the OT transport matrix. If you want to solve a regularized OT problem you can
719
+ the OT transport matrix. If you want to solve a regularized OT problem you can
705
720
use :py:mod: `ot.sinkhorn `.
706
721
707
722
714
729
T = ot.emd(a, b, M)  # exact linear program
715
730
T_reg = ot.sinkhorn(a, b, M, reg)  # entropic regularized OT
716
731
717
- A more detailed example is available in:
732
+ A more detailed example is available in:
718
733
:doc: `auto_examples/plot_OT_2D_samples `
719
-
734
+
720
735
721
736
2. **pip install POT fails with error: ImportError: No module named Cython.Build**
722
737
726
741
installing POT.
727
742
728
743
Note that this problem does not occur when using conda-forge since the packages
729
- there are pre-compiled.
744
+ there are pre-compiled.
730
745
731
746
See `Issue #59 <https://github.com/rflamary/POT/issues/59 >`__ for more
732
747
details.
751
766
In order to limit import time and hard dependencies in POT, we do not import
752
767
some sub-modules automatically with :code:`import ot`. In order to use the
753
768
acceleration in :any:`ot.gpu` you first need to import it with
754
- :code: `import ot.gpu `.
769
+ :code: `import ot.gpu `.
755
770
756
771
See `Issue #85 <https://github.com/rflamary/POT/issues/85 >`__ and :any: `ot.gpu `
757
772
for more details.
@@ -763,7 +778,7 @@ References
763
778
.. [1 ] Bonneel, N., Van De Panne, M., Paris, S., & Heidrich, W. (2011,
764
779
December). `Displacement interpolation using Lagrangian mass transport
765
780
<https://people.csail.mit.edu/sparis/publi/2011/sigasia/Bonneel_11_Displacement_Interpolation.pdf> `__.
766
- In ACM Transactions on Graphics (TOG) (Vol. 30, No. 6, p. 158). ACM.
781
+ In ACM Transactions on Graphics (TOG) (Vol. 30, No. 6, p. 158). ACM.
767
782
768
783
.. [2 ] Cuturi, M. (2013). `Sinkhorn distances: Lightspeed computation of
769
784
optimal transport <https://arxiv.org/pdf/1306.0895.pdf> `__. In Advances
@@ -874,4 +889,4 @@ References
874
889
.. [24 ] Vayer, T., Chapel, L., Flamary, R., Tavenard, R. and Courty, N.
875
890
(2019). `Optimal Transport for structured data with application on
876
891
graphs <http://proceedings.mlr.press/v97/titouan19a.html> `__ Proceedings
877
- of the 36th International Conference on Machine Learning (ICML).
892
+ of the 36th International Conference on Machine Learning (ICML).