@@ -278,7 +278,7 @@ choose the quadratic regularization.
Group Lasso regularization
""""""""""""""""""""""""""

- Another regularization that has been used in recent years is the group lasso
+ Another regularization that has been used in recent years [5]_ is the group lasso
regularization

.. math::
@@ -333,7 +333,7 @@ Another solver is proposed to solve the problem
    s.t. \gamma 1 = a; \gamma^T 1 = b; \gamma \geq 0

where :math:`\Omega_e` is the entropic regularization. In this case we use a
- generalized conditional gradient [7]_ implemented in :any:`ot.opim.gcg` that does not linearize the entropic term and
+ generalized conditional gradient [7]_ implemented in :any:`ot.optim.gcg` that does not linearize the entropic term and
relies on :any:`ot.sinkhorn` for its iterations.
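
A minimal usage sketch (an editor's illustration, not part of the original documentation), assuming :any:`ot.optim.gcg` takes the entropic weight ``reg1`` and the weight ``reg2`` of a differentiable regularizer given by ``f`` and its gradient ``df``:

.. code::

    import numpy as np
    import ot

    # Two uniform histograms and a normalized squared-Euclidean cost matrix
    n = 50
    a = ot.unif(n)
    b = ot.unif(n)
    xs = np.random.randn(n, 2)
    xt = np.random.randn(n, 2) + 2
    M = ot.dist(xs, xt)
    M /= M.max()

    # Quadratic regularizer f(G) = 0.5 * ||G||_F^2 and its gradient
    def f(G):
        return 0.5 * np.sum(G ** 2)

    def df(G):
        return G

    # Entropic term handled exactly by Sinkhorn iterations (reg1),
    # the regularizer f is linearized at each iteration (reg2)
    G = ot.optim.gcg(a, b, M, reg1=1e-1, reg2=1e-1, f=f, df=df)
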

.. hint::
@@ -421,11 +421,11 @@ Estimating the Wasserstein barycenter with free support but fixed weights
corresponds to solving the following optimization problem:

.. math::
-     \min_\{x_i\} \quad \sum_{k} w_k W(\mu,\mu_k)
+     \min_{\{x_i\}} \quad \sum_{k} w_k W(\mu,\mu_k)

    s.t. \quad \mu = \sum_{i=1}^n a_i \delta_{x_i}

- WE provide an alternating solver based on [20]_ in
+ We provide an alternating solver based on [20]_ in
:any:`ot.lp.free_support_barycenter`. This function minimizes the problem and
returns an optimal support :math:`\{x_i\}` for uniform or given weights
:math:`a`.
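
A hedged sketch of how this alternating solver can be called, assuming :any:`ot.lp.free_support_barycenter` takes a list of sample locations, a list of sample weights and an initial support (an editor's example, not from the original text):

.. code::

    import numpy as np
    import ot

    # Two empirical distributions in 2D with uniform weights
    n = 100
    X1 = np.random.randn(n, 2)
    X2 = np.random.randn(n, 2) + np.array([4.0, 0.0])

    measures_locations = [X1, X2]
    measures_weights = [ot.unif(n), ot.unif(n)]

    # Initial support of the barycenter: k free points with uniform weights a
    k = 20
    X_init = np.random.randn(k, 2)
    a = ot.unif(k)

    # Alternating minimization over the support locations
    X_bary = ot.lp.free_support_barycenter(measures_locations, measures_weights,
                                           X_init, a)
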
@@ -443,13 +443,149 @@ return an optimal support :math:`\{x_i\}` for uniform or given weights
Monge mapping and Domain adaptation
-----------------------------------

+ The original transport problem investigated by Gaspard Monge sought a
+ mapping function that maps (or transports) samples between a source and a target
+ distribution while minimizing the transport loss. The existence and uniqueness of this
+ optimal mapping is still an open problem in the general case, but it has been proven
+ for smooth distributions by Brenier in his eponymous `theorem
+ <https://who.rocq.inria.fr/Jean-David.Benamou/demiheure.pdf>`__. We provide in
+ :any:`ot.da` several solvers for Monge mapping estimation and domain adaptation.
+
+ Monge Mapping estimation
+ ^^^^^^^^^^^^^^^^^^^^^^^^
+
+ We now discuss several approaches that are implemented in POT to estimate or
+ approximate a Monge mapping from finite distributions.
+
+ First note that when the source and target distributions are assumed to be Gaussian
+ distributions, there exists a closed-form solution for the mapping and it is an
+ affine function [14]_ of the form :math:`T(x)=Ax+b`. In this case we provide the function
+ :any:`ot.da.OT_mapping_linear` that returns the operator :math:`A` and vector
+ :math:`b`. Note that if the number of samples is too small there is a parameter
+ :code:`reg` that provides a regularization for the covariance matrix estimation.
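
A minimal sketch of the closed-form Gaussian case (an editor's illustration, assuming :any:`ot.da.OT_mapping_linear` returns the affine map parameters :math:`A` and :math:`b`):

.. code::

    import numpy as np
    import ot

    # Samples from two Gaussian-like distributions
    rng = np.random.RandomState(0)
    n = 200
    xs = rng.randn(n, 2)
    xt = rng.randn(n, 2).dot(np.array([[2.0, 0.5], [0.5, 1.0]])) + np.array([4.0, 4.0])

    # Closed-form linear Monge mapping T(x) = Ax + b between Gaussians [14]_
    A, b = ot.da.OT_mapping_linear(xs, xt, reg=1e-6)

    # Transport the source samples with the estimated affine map (row vectors)
    xst = xs.dot(A) + b
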
466
+
467
+ For a more general mapping estimation we also provide the barycentric mapping
468
+ proposed in [6 ]_ . It is implemented in the class :any: `ot.da.EMDTransport ` and
469
+ other transport based classes in :any: `ot.da ` . Those classes are discussed more
470
+ in the following but follow an interface similar to sklearn classes. Finally a
471
+ method proposed in [8 ]_ that estimate a continuous mapping approximating the
472
+ barycentric mapping is provided in :any: `ot.da.joint_OT_mapping_linear ` for
473
+ linear mapping and :any: `ot.da.joint_OT_mapping_kernel ` for non linear mapping.
474
+
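
For the mapping estimation method of [8]_, here is a hedged sketch using the class interface :any:`ot.da.MappingTransport` listed further below; the parameter names ``kernel``, ``mu``, ``eta`` and ``bias`` are assumptions of this editor's example:

.. code::

    import numpy as np
    import ot

    rng = np.random.RandomState(0)
    Xs = rng.randn(100, 2)
    Xt = rng.randn(100, 2) + np.array([3.0, 3.0])

    # Joint estimation of the OT plan and of a linear mapping [8]_
    ot_map = ot.da.MappingTransport(kernel="linear", mu=1e0, eta=1e-8, bias=True)
    ot_map.fit(Xs=Xs, Xt=Xt)

    # The estimated mapping can transport new (out-of-sample) source points
    Xs_new = rng.randn(10, 2)
    Xs_new_mapped = ot_map.transform(Xs=Xs_new)
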
+ .. hint::
+
+     An example of linear Monge mapping estimation is available
+     in the following example:
+
+     - :any:`auto_examples/plot_otda_linear_mapping`
+
+ Domain adaptation classes
+ ^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The use of OT for domain adaptation (OTDA) was first proposed in [5]_, which also
+ introduced the group Lasso regularization. The main idea of OTDA is to estimate
+ a mapping of the samples between the source and target distributions that allows one to
+ transport labeled source samples onto the target distribution, which has no labels.
+
+ We provide several classes based on :any:`ot.da.BaseTransport` that provide
+ several OT and mapping estimations. The interface of those classes is similar to
+ classifiers in the sklearn toolbox. At initialization, several parameters (for
+ instance the regularization parameter) can be set. Then one needs to estimate the
+ mapping with the function :any:`ot.da.BaseTransport.fit`. Finally one can map the
+ samples from source to target with :any:`ot.da.BaseTransport.transform` and
+ from target to source with :any:`ot.da.BaseTransport.inverse_transform`. Here is
+ an example for the class :any:`ot.da.EMDTransport`:
+
+ .. code::
+
+     ot_emd = ot.da.EMDTransport()
+     ot_emd.fit(Xs=Xs, Xt=Xt)
+
+     Mapped_Xs = ot_emd.transform(Xs=Xs)
+
+ A list of the provided implementations is given in the following note.
+
+ .. note::
+
+     Here is a list of the mapping classes inheriting from
+     :any:`ot.da.BaseTransport`:
+
+     * :any:`ot.da.EMDTransport`: Barycentric mapping with EMD transport
+     * :any:`ot.da.SinkhornTransport`: Barycentric mapping with Sinkhorn transport
+     * :any:`ot.da.SinkhornL1l2Transport`: Barycentric mapping with Sinkhorn +
+       group Lasso regularization [5]_
+     * :any:`ot.da.SinkhornLpl1Transport`: Barycentric mapping with Sinkhorn +
+       non convex group Lasso regularization [5]_
+     * :any:`ot.da.LinearTransport`: Linear mapping estimation between Gaussians
+       [14]_
+     * :any:`ot.da.MappingTransport`: Nonlinear mapping estimation [8]_
+
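
The group-lasso classes use the source class labels at fit time. A minimal sketch (editor's example, assuming the usual ``reg_e``/``reg_cl`` parameter names for the entropic and class regularization weights):

.. code::

    import numpy as np
    import ot

    rng = np.random.RandomState(0)
    Xs = rng.randn(100, 2)
    ys = rng.randint(0, 2, 100)                  # source class labels
    Xt = rng.randn(100, 2) + np.array([3.0, 0.0])

    # Sinkhorn transport with non convex group lasso regularization [5]_
    # (labels are only needed on the source side)
    ot_lpl1 = ot.da.SinkhornLpl1Transport(reg_e=1e-1, reg_cl=1e0)
    ot_lpl1.fit(Xs=Xs, ys=ys, Xt=Xt)

    Xs_mapped = ot_lpl1.transform(Xs=Xs)
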
+ .. hint::
+
+     Examples of the use of the OTDA classes are available in the following examples:
+
+     - :any:`auto_examples/plot_otda_color_images`
+     - :any:`auto_examples/plot_otda_mapping`
+     - :any:`auto_examples/plot_otda_mapping_colors_images`
+     - :any:`auto_examples/plot_otda_semi_supervised`

Other applications
------------------

+ We discuss in the following several implementations that have been used and
+ proposed in the OT and machine learning community.
+
Wasserstein Discriminant Analysis
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

+ Wasserstein Discriminant Analysis [11]_ is a generalization of `Fisher Linear Discriminant
+ Analysis <https://en.wikipedia.org/wiki/Linear_discriminant_analysis>`__ that
+ allows discrimination between classes that are not linearly separable. It
+ consists in finding a linear projector optimizing the following criterion
+
+ .. math::
+     P = \text{arg}\min_P \frac{\sum_i OT_e(\mu_i\#P, \mu_i\#P)}{\sum_{i,j\neq i}
+     OT_e(\mu_i\#P, \mu_j\#P)}
+
+ where :math:`\#` is the push-forward operator, :math:`OT_e` is the entropic OT
+ loss and :math:`\mu_i` is the
+ distribution of samples from class :math:`i`. :math:`P` is also constrained to
+ be in the Stiefel manifold. WDA can be solved in POT using the function
+ :any:`ot.dr.wda`. It requires the :code:`pymanopt` and
+ :code:`autograd` packages for manifold optimization and automatic differentiation,
+ respectively. Note that we also provide the Fisher discriminant estimator in
+ :any:`ot.dr.fda` for easy comparison.
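
A hedged sketch of calling WDA, assuming :any:`ot.dr.wda` returns the projection matrix and a projection function as in the :any:`auto_examples/plot_WDA` example (editor's illustration):

.. code::

    import numpy as np
    import ot
    import ot.dr   # not imported by default (depends on pymanopt and autograd)

    rng = np.random.RandomState(0)
    n = 100
    # Two classes that differ only in the first two of ten dimensions
    xs = np.vstack((rng.randn(n, 10),
                    rng.randn(n, 10) + np.array([2.0] * 2 + [0.0] * 8)))
    ys = np.hstack((np.zeros(n), np.ones(n)))

    # Project onto p = 2 dimensions with entropic regularization reg
    P, proj = ot.dr.wda(xs, ys, p=2, reg=1e0, k=10, maxiter=30)

    xs_proj = proj(xs)   # samples projected on the learned subspace
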
+
+ .. warning::
+     Note that due to the hard dependency on :code:`pymanopt` and
+     :code:`autograd`, :any:`ot.dr` is not imported by default. If you want to
+     use it you have to specifically import it with :code:`import ot.dr`.
+
+ .. hint::
+
+     An example of the use of WDA is available in the following example:
+
+     - :any:`auto_examples/plot_WDA`
+
+
+ Unbalanced optimal transport
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ Unbalanced OT is a relaxation of the original OT problem where the violation of
+ the constraint on the marginals is added to the objective of the optimization
+ problem:
+
+ .. math::
+     \min_\gamma \quad \sum_{i,j}\gamma_{i,j}M_{i,j} + reg\cdot\Omega(\gamma) + \alpha KL(\gamma 1, a) + \alpha KL(\gamma^T 1, b)
+
+     s.t. \quad \gamma\geq 0
+
+
+ where KL is the Kullback-Leibler divergence. This formulation allows for
+ computing an approximate mapping between distributions that do not have the same
+ amount of mass. Interestingly, the problem can be solved with a generalization of
+ the Bregman projections algorithm [10]_.
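
To make the generalized Bregman/Sinkhorn scaling iterations of [10]_ concrete, here is an editor's plain NumPy sketch for the entropic unbalanced problem above; all variable names and parameter values are illustrative:

.. code::

    import numpy as np
    import ot

    n = 50
    a = ot.unif(n)
    b = 2 * ot.unif(n)           # marginals with different total mass
    M = ot.dist(np.random.randn(n, 2), np.random.randn(n, 2))
    M /= M.max()

    reg, alpha = 1e-1, 1e0       # entropic and marginal-relaxation weights
    K = np.exp(-M / reg)
    fi = alpha / (alpha + reg)   # exponent coming from the KL marginal penalties

    u, v = np.ones(n), np.ones(n)
    for _ in range(1000):
        # generalized Sinkhorn/Bregman scaling updates [10]_
        u = (a / K.dot(v)) ** fi
        v = (b / K.T.dot(u)) ** fi

    gamma = u[:, None] * K * v[None, :]   # approximate unbalanced transport plan
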

Gromov-Wasserstein
^^^^^^^^^^^^^^^^^^
@@ -461,6 +597,10 @@ GPU acceleration
We provide several implementations of our OT solvers in :any:`ot.gpu`. Those
implementations use the :code:`cupy` toolbox.

+ .. warning::
+     Note that due to the hard dependency on :code:`cupy`, :any:`ot.gpu` is not
+     imported by default. If you want to
+     use it you have to specifically import it with :code:`import ot.gpu`.
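
An editor's sketch of the intended usage, assuming :any:`ot.gpu` mirrors the CPU solvers' signatures (for instance a ``sinkhorn`` function), as in older POT releases:

.. code::

    import numpy as np
    import ot
    import ot.gpu   # requires cupy; not imported by default

    n = 1000
    a = ot.unif(n)
    b = ot.unif(n)
    M = ot.dist(np.random.randn(n, 2), np.random.randn(n, 2))

    # Same call pattern as ot.sinkhorn, but the computation runs on the GPU
    G = ot.gpu.sinkhorn(a, b, M, 1e-1)
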
FAQ