diff --git a/README.md b/README.md
index fcdb4fe..3321072 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,14 @@
 # img2img
 Seminar project
 Unsupervised Image-to-Image translation using GANs
+## Table of Contents
+
+* [Installation/Usage for Contributors](#installationusage-for-contributors)
+* [Implementations](#implementations)
+  + [Pix2Pix](#pix2pix)
+  + [CycleGAN](#cyclegan)
+  + [UNIT](#unit)
+  + [TUNIT](#tunit)
 ## Installation/Usage for contributors
 
 ```bash
@@ -16,8 +24,103 @@ Seminar project Unsupervised Image-to-Image translation using GANs
 $ pip install -e ".[dev]"
 ```
 ## Implementations
+### Pix2Pix
+_Image-to-Image Translation with Conditional Adversarial Networks_
+
+#### Authors
+Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros
+
+#### Abstract
+We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
+
+[[Paper]](https://arxiv.org/abs/1611.07004) [[Code]](src/img2img/models/pix2pix)
+
+<p align="center">
+    <img src="assets/pix2pix_architecture.png"/>
+</p>
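+
+To make the learned loss concrete, here is a minimal PyTorch-style sketch of the generator objective described above: an adversarial term from a conditional discriminator plus a weighted L1 reconstruction term. It is illustrative only — the tiny conv layers and the names `generator`, `discriminator`, and `LAMBDA_L1` are placeholders, not this repository's actual U-Net generator and PatchGAN discriminator.
+
+```python
+import torch
+import torch.nn as nn
+
+# Stand-in networks: real pix2pix uses a U-Net generator and a PatchGAN
+# discriminator that scores (condition, image) pairs.
+generator = nn.Conv2d(3, 3, kernel_size=3, padding=1)
+discriminator = nn.Conv2d(6, 1, kernel_size=3, padding=1)
+
+bce = nn.BCEWithLogitsLoss()
+l1 = nn.L1Loss()
+LAMBDA_L1 = 100.0  # L1 weight; 100 is the value used in the paper
+
+def generator_loss(condition, real_image):
+    fake_image = generator(condition)
+    # The discriminator is conditional: it sees the input image too,
+    # concatenated with the (here: generated) output along the channels.
+    pred_fake = discriminator(torch.cat([condition, fake_image], dim=1))
+    adversarial = bce(pred_fake, torch.ones_like(pred_fake))
+    reconstruction = l1(fake_image, real_image)
+    return adversarial + LAMBDA_L1 * reconstruction
+
+# Random tensors standing in for a batch of (label map, photo) pairs.
+condition = torch.randn(1, 3, 256, 256)
+real_image = torch.randn(1, 3, 256, 256)
+print(generator_loss(condition, real_image).item())
+```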
+
+#### Run Example
+```bash
+$ cd data/
+$ bash download_pix2pix_dataset.sh facades
+$ cd ../src/img2img/models/pix2pix
+$ python3 pix2pix.py --dataset_name facades
+```
+
+<p align="center">
+    <img src="assets/pix2pix.png"/>
+</p>
+<p align="center">
+    Rows from top to bottom: (1) the condition given to the generator, (2) the image
+    generated from the condition, (3) the true image corresponding to the condition.
+</p>
+
-### pix2pix
 ### CycleGAN
+_Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks_
+
+#### Authors
+Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros
+
+#### Abstract
+Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training data will not be available. We present an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples. Our goal is to learn a mapping G:X→Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss. Because this mapping is highly under-constrained, we couple it with an inverse mapping F:Y→X and introduce a cycle consistency loss to push F(G(X))≈X (and vice versa). Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, etc. Quantitative comparisons against several prior methods demonstrate the superiority of our approach.
+
+[[Paper]](https://arxiv.org/abs/1703.10593) [[Code]](src/img2img/models/cyclegan)
+
+<p align="center">
+    <img src="assets/cyclegan.png"/>
+</p>
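+
+The cycle consistency loss reduces to a few lines: translate, translate back, and penalize the round trip with L1. The sketch below is illustrative only — `G`, `F`, and `LAMBDA_CYC` are placeholders rather than this repository's implementation, and the adversarial losses on G and F are omitted for brevity.
+
+```python
+import torch
+import torch.nn as nn
+
+# Stand-in mappings: the real model uses two ResNet-based generators.
+G = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # G: X -> Y
+F = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # F: Y -> X
+
+l1 = nn.L1Loss()
+LAMBDA_CYC = 10.0  # cycle weight; 10 is the value used in the paper
+
+def cycle_consistency_loss(real_x, real_y):
+    forward = l1(F(G(real_x)), real_x)   # F(G(x)) should recover x
+    backward = l1(G(F(real_y)), real_y)  # G(F(y)) should recover y
+    return LAMBDA_CYC * (forward + backward)
+
+x = torch.randn(1, 3, 256, 256)  # e.g. a Monet painting
+y = torch.randn(1, 3, 256, 256)  # e.g. a photo
+print(cycle_consistency_loss(x, y).item())
+```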
+
+#### Run Example
+```bash
+$ cd data/
+$ bash download_cyclegan_dataset.sh monet2photo
+$ cd ../src/img2img/models/cyclegan
+$ python3 cyclegan.py --dataset_name monet2photo
+```
+
+<p align="center">
+    <img src="assets/cyclegan (1).png"/>
+</p>
+<p align="center">
+    Monet to photo translations.
+</p>
+
 ### UNIT
-### TUNIT
\ No newline at end of file
+_Unsupervised Image-to-Image Translation Networks_
+
+#### Authors
+Ming-Yu Liu, Thomas Breuel, Jan Kautz
+
+#### Abstract
+Unsupervised image-to-image translation aims at learning a joint distribution of images in different domains by using images from the marginal distributions in individual domains. Since there exists an infinite set of joint distributions that can arrive at the given marginal distributions, one could infer nothing about the joint distribution from the marginal distributions without additional assumptions. To address the problem, we make a shared-latent space assumption and propose an unsupervised image-to-image translation framework based on Coupled GANs. We compare the proposed framework with competing approaches and present high quality image translation results on various challenging unsupervised image translation tasks, including street scene image translation, animal image translation, and face image translation. We also apply the proposed framework to domain adaptation and achieve state-of-the-art performance on benchmark datasets. Code and additional results are available at [this https URL](https://github.com/mingyuliutw/unit).
+
+[[Paper]](https://arxiv.org/abs/1703.00848) [[Code]](src/img2img/models/unit)
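+
+The shared-latent space assumption is easiest to see as code: each domain gets an encoder/decoder pair, both encoders map into one common latent space, and translation is just encoding with one domain's encoder and decoding with the other's. The sketch below is illustrative only — the single conv layers stand in for UNIT's VAE-GAN encoders and decoders with shared high-level layers.
+
+```python
+import torch
+import torch.nn as nn
+
+# Stand-in encoder/decoder pairs for the two domains.
+E1 = nn.Conv2d(3, 8, kernel_size=3, padding=1)  # encoder for domain X1
+E2 = nn.Conv2d(3, 8, kernel_size=3, padding=1)  # encoder for domain X2
+G1 = nn.Conv2d(8, 3, kernel_size=3, padding=1)  # decoder into domain X1
+G2 = nn.Conv2d(8, 3, kernel_size=3, padding=1)  # decoder into domain X2
+
+def translate_1_to_2(x1):
+    z = E1(x1)    # shared latent code: E1 and E2 target the same space
+    return G2(z)  # decoding the same code in the other domain translates
+
+def reconstruct_1(x1):
+    return G1(E1(x1))  # within-domain reconstruction (the VAE part)
+
+x1 = torch.randn(1, 3, 256, 256)
+print(translate_1_to_2(x1).shape)  # -> torch.Size([1, 3, 256, 256])
+print(reconstruct_1(x1).shape)     # -> torch.Size([1, 3, 256, 256])
+```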
+
+#### Run Example
+```bash
+$ cd data/
+$ bash download_cyclegan_dataset.sh apple2orange
+$ cd ../src/img2img/models/unit
+$ python3 unit.py --dataset_name apple2orange
+```
+
+### TUNIT
+_Rethinking the Truly Unsupervised Image-to-Image Translation_
+
+#### Authors
+Kyungjune Baek, Yunjey Choi, Youngjung Uh, Jaejun Yoo, Hyunjung Shim
+
+#### Abstract
+Every recent image-to-image translation model inherently requires either image-level (i.e. input-output pairs) or set-level (i.e. domain labels) supervision. However, even set-level supervision can be a severe bottleneck for data collection in practice. In this paper, we tackle image-to-image translation in a fully unsupervised setting, i.e., neither paired images nor domain labels. To this end, we propose a truly unsupervised image-to-image translation model (TUNIT) that simultaneously learns to separate image domains and translates input images into the estimated domains. Experimental results show that our model achieves comparable or even better performance than the set-level supervised model trained with full labels, generalizes well on various datasets, and is robust against the choice of hyperparameters (e.g. the preset number of pseudo domains). Furthermore, TUNIT can be easily extended to semi-supervised learning with a few labeled data.
+
+[[Paper]](https://arxiv.org/abs/2006.06500) [[Code]](src/img2img/models/tunit)
+
+#### Run Example
+```bash
+$ cd data/
+$ bash download_cyclegan_dataset.sh apple2orange
+$ cd ../src/img2img/models/tunit
+$ python3 tunit.py --dataset_name apple2orange
+```
diff --git a/assets/cyclegan (1).png b/assets/cyclegan (1).png
new file mode 100644
index 0000000..2c45f24
Binary files /dev/null and b/assets/cyclegan (1).png differ
diff --git a/assets/cyclegan.png b/assets/cyclegan.png
new file mode 100644
index 0000000..2a02b89
Binary files /dev/null and b/assets/cyclegan.png differ
diff --git a/assets/pix2pix.png b/assets/pix2pix.png
new file mode 100644
index 0000000..5debdb7
Binary files /dev/null and b/assets/pix2pix.png differ
diff --git a/assets/pix2pix_architecture.png b/assets/pix2pix_architecture.png
new file mode 100644
index 0000000..32cd524
Binary files /dev/null and b/assets/pix2pix_architecture.png differ