| 0 | # Painting Outside the Box: Image Outpainting with GANs | |
| 1 | ||
| 2 | We designed and implemented a pipeline for **image outpainting** as our final project for Stanford's CS 230 (Deep Learning) in Spring 2018. | |
| 3 | ||
| 4 |  | |
| 5 | ||
| 6 | ## Quick Links | |
| 7 | ||
| 8 | * Paper: [https://arxiv.org/abs/1808.08483](https://arxiv.org/abs/1808.08483) | |
| 9 | * Poster: [image-outpainting/poster/msabini-gili__image-outpainting-poster.pdf](https://github.com/ShinyCode/image-outpainting/blob/master/poster/msabini-gili__image-outpainting-poster.pdf) | |
| 10 | * Training animation: [http://marksabini.com/cs230-outpainting](http://marksabini.com/cs230-outpainting) | |
| 11 | ||
| 12 | ## Environment Setup | |
| 13 | ||
| 14 | These instructions assume that the current working directory `[cwd]` is `[repo]/src`. We no longer have access to our specific instances, so we unfortunately cannot help with system-specific troubleshooting. However, the instructions below are to the best of our memory, and should be *very close* to what is needed: | |
| 15 | ||
| 16 | * If desired, use an AWS instance. We used `p2.xlarge`, and the CS 230 website has a good setup guide here: https://cs230-stanford.github.io/aws-starter-guide.html. We also recommend reserving a static IP for AWS instances to make `ssh` and `scp` easier. | |
| 17 | ||
| 18 | ||
| 19 | * Install the necessary dependencies. The main ones are TensorFlow, OpenCV (`cv2`), NumPy, SciPy, Matplotlib, imageio, and Pillow (`PIL`). | |
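| | ||
| | The exact versions we used are lost to time; as a sketch, assuming a TensorFlow 1.x-era environment (on a GPU instance, `tensorflow-gpu` was the TF1-era package to install instead): | |
| | ||
| | ``` | |
| | >> pip install "tensorflow<2" opencv-python numpy scipy pillow matplotlib imageio | |
| | ``` | |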
| 20 | ||
| 21 | * Clone the repo: `>> git clone https://github.com/ShinyCode/image-outpainting.git` | |
| 22 | ||
| 23 | * Download the [raw dataset](http://data.csail.mit.edu/places/places365/val_256.tar) from the Places365 website to `[cwd]/raw`. Create the `places_resized` and `places` directories, which the utilities below expect to exist. From the Python shell, run the following to resize the data and load it into an `.npy` file: | |
| 24 | ||
| 25 | ```python | |
| 26 | >> import util | |
| 27 | >> IN_PATH, RSZ_PATH, OUT_PATH = 'raw', 'places_resized', 'places/all_images.npy' | |
| 28 | >> util.resize_images(IN_PATH, RSZ_PATH) | |
| 29 | >> util.compile_images(RSZ_PATH, OUT_PATH) | |
| 30 | ``` | |
| 31 | ||
| 32 | * Before the data can be used for training, it needs to be separated into training and validation splits. There isn't a function to do this, but the code below should be close to what is needed: | |
| 33 | ||
| 34 | ```python | |
| 35 | >> import numpy as np | |
| 36 | >> data = np.load('places/all_images.npy') | |
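| | >> # The Places365 val_256 split contains 36,500 images; hold out 100 as a test set | |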
| 37 | >> idx_test = np.random.choice(36500, 100, replace=False) | |
| 38 | >> idx_train = list(set(range(36500)) - set(idx_test)) | |
| 39 | >> imgs_train = data[idx_train] | |
| 40 | >> imgs_test = data[idx_test] | |
| 41 | >> np.savez('places/places_128.npz', imgs_train=imgs_train, imgs_test=imgs_test, idx_train=idx_train, idx_test=idx_test) | |
| 42 | ``` | |
| 43 | ||
| 44 | Strictly speaking, saving the indices (`idx_train`, `idx_test`) isn't necessary, but it will help later to correlate results with the original images. | |
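| | ||
| | For instance, a held-out index can be mapped back to its source file. A minimal sketch, assuming the resized images are still in `places_resized` (whose sorted listing fixed the index order): | |
| | ||
| | ```python | |
| | >> import os, numpy as np | |
| | >> d = np.load('places/places_128.npz') | |
| | >> filenames = sorted(os.listdir('places_resized'))  # same ordering util.compile_images used | |
| | >> print(filenames[d['idx_test'][0]])  # original file behind the first test image | |
| | ``` | |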
| 45 | ||
| 46 | ## Training the Model | |
| 47 | ||
| 48 | * Create the folders `[cwd]/output` and `[cwd]/output/models` (checkpoints are written under `models/`, which the script does not create for you). Whenever you start a new run, *these folders should be empty*. Otherwise, the code will abort to avoid clobbering a previous run. To avoid this, simply rename the directory containing your previous run: | |
| 49 | `>> mv [cwd]/output [cwd]/someOtherName` | |
| 50 | * If running the code for long periods of time, it's recommended to use **tmux** or **screen**. | |
| 51 | * Change any hyperparameters at the top of `train.py`, although it is recommended to leave the file paths alone. | |
| 52 | * Run the code as follows: `>> ./run.sh` (or `>> ./run_ld.sh` to train the variant with both global and local discriminators). This wraps the Python code and saves the console output to `[cwd]/output/out`. | |
| 53 | * The code can be interrupted if needed, but it is recommended to allow it to finish, as it: | |
| 54 | * Saves results on the test set every `INTV_PRINT` iterations to `[cwd]/output/`. | |
| 55 | * Saves the model every `INTV_SAVE` iterations to `[cwd]/output/models/`. | |
| 56 | * Saves the loss every `INTV_SAVE` iterations to `[cwd]/output/loss.npz`. | |
| 57 | * Performs postprocessing on the final batch of test images and saves the results. | |
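| | ||
| | If a run is cut short, the losses logged so far can still be plotted with the same helper the training script uses. A minimal sketch, run from `[repo]/src`: | |
| | ||
| | ```python | |
| | >> import util | |
| | >> util.plot_loss('output/loss.npz', 'MSE Loss During Training', 'output/loss_plot.png') | |
| | ``` | |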
| 58 | ||
| 59 | ## Using the Model | |
| 60 | ||
| 61 | * Training can be restarted from a checkpoint by running `>> python train.py [ITER]`, where `ITER` corresponds to the iteration associated with the desired model checkpoint (e.g., `>> python train.py 1000` resumes from `output/models/model1000.ckpt`). | |
| 62 | * The file `[cwd]/gen.py` performs image outpainting and expands the size of the input image. Once you have a model, you can run `gen.py` from the terminal by invoking: | |
| 63 | `>> python gen.py [model_PATH] [in_PATH] [out_PATH]` | |
| 64 | Here, `model_PATH` is the path to the model, `in_PATH` is the path to the input image, and `out_PATH` is the path where the output image will be saved. | |
| 65 | * The file `[cwd]/test.py` is slightly different from `gen.py`: it first masks out the sides of the input image and then outpaints them, so that the output image size is the same as the input image size. This is useful for computing metrics, as the input image functions as the ground truth. It is invoked analogously to `gen.py`: | |
| 66 | `>> python test.py [model_PATH] [in_PATH] [out_PATH]` | |
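| | ||
| | Since `test.py` preserves the original size, the result can then be scored against the ground truth with the helper in `util.py`. A minimal sketch (file names are hypothetical; both images must be 128x128): | |
| | ||
| | ```python | |
| | >> import util | |
| | >> util.compute_RMSE('ground_truth.png', 'outpainted.png') | |
| | ``` | |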
| 67 | ||
| 68 | ## Gallery | |
| 69 | ||
| 70 | Here are some of our results, taken directly from our poster! | |
| 71 | ||
| 72 | ### Main Results | |
| 73 | ||
| 74 |  | |
| 75 | ||
| 76 | ### Recursive Outpainting | |
| 77 | ||
| 78 |  |
| 0 | ## Introduction | |
| 1 | ||
| 2 | Add information here about the project's functionality, usage scenarios, and input/output parameters. | |
| 3 | ||
| 4 | You can describe the function, usage, and parameters of the project. | |
| 0 | { | |
| 1 | "cells": [ | |
| 2 | { | |
| 3 | "cell_type": "markdown", | |
| 4 | "metadata": {}, | |
| 5 | "source": [ | |
| 6 | "## 1. 项目介绍\n", | |
| 7 | "\n", | |
| 8 | " - 项目是由模块组成、有特定功能的程序。它能够满足用户的直接使用需求,例如[古诗词生成器](https://momodel.cn/explore/5bfb634e1afd943c623dd9cf?type=app&tab=1)、[风格迁移](https://momodel.cn/explore/5bfb634e1afd943c623dd9cf?type=app&tab=1)等。\n", | |
| 9 | " - 开发项目过程中你可以导入数据集,也可以通过每个 cell 上方工具栏的`<+>`直接插入[模块](https://momodel.cn/modules)和代码块。\n", | |
| 10 | " - 你可以将开发好的项目进行[部署](https://momodel.cn/docs/#/zh-cn/%E5%BC%80%E5%8F%91%E5%92%8C%E9%83%A8%E7%BD%B2%E4%B8%80%E4%B8%AA%E9%A1%B9%E7%9B%AE),项目部署成功并选择正式版本发布后会展示在“项目”页面,用户可以在线使用,也可以通过 API 调用。\n", | |
| 11 | "\n", | |
| 12 | " - 项目目录结构:\n", | |
| 13 | "\n", | |
| 14 | " - ```results```*-----结果的文件存放地(如果你运行 job,务必将运行结果指定在此目录)*\n", | |
| 15 | " - ```_OVERVIEW.md``` *-----项目的相关介绍*\n", | |
| 16 | " - ```_README.md```*-----说明文档*\n", | |
| 17 | " - ```app_spec.yml```*-----定义项目的输入输出,为部署服务*\n", | |
| 18 | " - ```coding_here.ipynb```*-----输入并运行代码*" | |
| 19 | ] | |
| 20 | }, | |
| 21 | { | |
| 22 | "cell_type": "markdown", | |
| 23 | "metadata": {}, | |
| 24 | "source": [ | |
| 25 | "\n", | |
| 26 | "## 2. 开发环境简介\n", | |
| 27 | "\n", | |
| 28 | "你当前所在的页面 Notebook 是一个内嵌 JupyterLab 的在线类 IDE 编程环境,开发过程中可以使用页面右侧的 API 文档进行快速查询。Notebook 有以下主要功能:\n", | |
| 29 | "\n", | |
| 30 | "- [调用数据集、模块和代码块资源](https://momodel.cn/docs/#/zh-cn/%E5%A6%82%E4%BD%95%E5%AF%BC%E5%85%A5%E5%B9%B6%E4%BD%BF%E7%94%A8%E6%A8%A1%E5%9D%97%E5%92%8C%E6%95%B0%E6%8D%AE%E9%9B%86)\n", | |
| 31 | "- [多人代码协作](https://momodel.cn/docs/#/zh-cn/%E5%9C%A8Mo%E8%BF%90%E8%A1%8C%E4%BD%A0%E7%9A%84%E7%AC%AC%E4%B8%80%E6%AE%B5%E4%BB%A3%E7%A0%81?id=_7-%e4%bd%a0%e5%8f%af%e4%bb%a5%e9%82%80%e8%af%b7%e5%a5%bd%e5%8f%8b%e8%bf%9b%e8%a1%8c%e5%8d%8f%e4%bd%9c)\n", | |
| 32 | "- [在 GPU 资源上训练机器学习模型](https://momodel.cn/docs/#/zh-cn/%E5%9C%A8GPU%E6%88%96CPU%E8%B5%84%E6%BA%90%E4%B8%8A%E8%AE%AD%E7%BB%83%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0%E6%A8%A1%E5%9E%8B)\n", | |
| 33 | "- [简单部署](https://momodel.cn/docs/#/zh-cn/%E5%BC%80%E5%8F%91%E5%92%8C%E9%83%A8%E7%BD%B2%E4%B8%80%E4%B8%AA%E9%A1%B9%E7%9B%AE)\n", | |
| 34 | "\n", | |
| 35 | "快来动手试试吧!点击左侧工具栏的新建文件图标即可选择你需要的文件类型。\n", | |
| 36 | "\n", | |
| 37 | "<img src='https://imgbed.momodel.cn/006tNc79gy1g61agfcv23j31c30u0789.jpg' width=100% height=100%>\n", | |
| 38 | "\n", | |
| 39 | "\n", | |
| 40 | "\n", | |
| 41 | "左侧和右侧工具栏都可根据使用需要进行收合。\n", | |
| 42 | "<img src='https://imgbed.momodel.cn/collapse_tab.2019-09-06 11_07_44.gif' width=100% height=100%>" | |
| 43 | ] | |
| 44 | }, | |
| 45 | { | |
| 46 | "cell_type": "markdown", | |
| 47 | "metadata": {}, | |
| 48 | "source": [ | |
| 49 | "## 3. 快捷键与代码补全\n", | |
| 50 | "Mo Notebook 已完全采用 Jupyter Notebook 的原生快捷键,并且支持 `tab` 代码补全。\n", | |
| 51 | "\n", | |
| 52 | "运行代码:`shift` + `enter` 或者 `shift` + `return`" | |
| 53 | ] | |
| 54 | }, | |
| 55 | { | |
| 56 | "cell_type": "markdown", | |
| 57 | "metadata": {}, | |
| 58 | "source": [ | |
| 59 | "## 4. 常用指令介绍\n", | |
| 60 | "\n", | |
| 61 | "- 解压上传后的文件\n", | |
| 62 | "\n", | |
| 63 | "在 cell 中输入并运行以下命令:\n", | |
| 64 | "```!7zx file_name.zip```\n", | |
| 65 | "\n", | |
| 66 | "- 查看所有包(package)\n", | |
| 67 | "\n", | |
| 68 | "`!pip list --format=columns`\n", | |
| 69 | "\n", | |
| 70 | "- 检查是否已有某个包\n", | |
| 71 | "\n", | |
| 72 | "`!pip show package_name`\n", | |
| 73 | "\n", | |
| 74 | "- 安装缺失的包\n", | |
| 75 | "\n", | |
| 76 | "`!pip install package_name`\n", | |
| 77 | "\n", | |
| 78 | "- 更新已有的包\n", | |
| 79 | "\n", | |
| 80 | "`!pip install package_name --upgrade`\n", | |
| 81 | "\n", | |
| 82 | "\n", | |
| 83 | "- 使用包\n", | |
| 84 | "\n", | |
| 85 | "`import package_name`\n", | |
| 86 | "\n", | |
| 87 | "- 显示当前目录下的档案及目录\n", | |
| 88 | "\n", | |
| 89 | "`ls`\n", | |
| 90 | "\n", | |
| 91 | "- 使用引入的数据集\n", | |
| 92 | "\n", | |
| 93 | "数据集被引入后存放在 datasets 文件夹下,注意,这个文件夹是只读的,不可修改。如果需要修改,可在 Notebook 中使用\n", | |
| 94 | "\n", | |
| 95 | "`!cp -R ./datasets/<imported_dataset_dir> ./<your_folder>`\n", | |
| 96 | "\n", | |
| 97 | "指令将其复制到其他文件夹后再编辑,对于引入的数据集中的 zip 文件,可使用\n", | |
| 98 | "\n", | |
| 99 | "`!7zx ./datasets/<imported_dataset_dir>/<XXX.zip> ./<your_folder>`\n", | |
| 100 | "\n", | |
| 101 | "指令解压缩到其他文件夹后使用" | |
| 102 | ] | |
| 103 | }, | |
| 104 | { | |
| 105 | "cell_type": "markdown", | |
| 106 | "metadata": {}, | |
| 107 | "source": [ | |
| 108 | "## 5. 其他可参考资源\n", | |
| 109 | "\n", | |
| 110 | "- [帮助文档](https://momodel.cn/docs/#/):基本页面介绍和常见问题都可以在里面找到\n", | |
| 111 | "- [平台功能教程](https://momodel.cn/classroom/class/5c5696cd1afd9458d456bf54):通过图文结合的 Notebook 详细介绍开发环境基本功能和操作\n", | |
| 112 | "- [从 Python 到人工智能](https://momodel.cn/classroom/course/detail?&id=60f02c635076ff487bce4c6f):超易入门的 Python 课程\n", | |
| 113 | "- [吴恩达机器学习](https://momodel.cn/classroom/class/5c5696191afd94720cc94533):机器学习经典课程\n", | |
| 114 | "- [李宏毅机器学习](https://momodel.cn/classroom/class/5d63dde21afd9461419f5ebf):中文世界最好的机器学习课程\n", | |
| 115 | "- [机器学习实战](https://momodel.cn/classroom/class/60af61b6f955c61c2cddfcb5):通过实操指引完成独立的模型,掌握相应的机器学习知识\n", | |
| 116 | "- [深度学习实战](https://momodel.cn/classroom/class/5c680b311afd943a9f70901b):通过实操指引完成独立的模型,掌握相应的深度学习知识\n", | |
| 117 | "- [模块开发](https://momodel.cn/modules):关于模型训练、开发与部署的高阶教程" | |
| 118 | ] | |
| 119 | } | |
| 120 | ], | |
| 121 | "metadata": { | |
| 122 | "kernelspec": { | |
| 123 | "display_name": "Python 3", | |
| 124 | "language": "python", | |
| 125 | "name": "python3" | |
| 126 | }, | |
| 127 | "language_info": { | |
| 128 | "codemirror_mode": { | |
| 129 | "name": "ipython", | |
| 130 | "version": 3 | |
| 131 | }, | |
| 132 | "file_extension": ".py", | |
| 133 | "mimetype": "text/x-python", | |
| 134 | "name": "python", | |
| 135 | "nbconvert_exporter": "python", | |
| 136 | "pygments_lexer": "ipython3", | |
| 137 | "version": "3.5.2" | |
| 138 | }, | |
| 139 | "pycharm": { | |
| 140 | "stem_cell": { | |
| 141 | "cell_type": "raw", | |
| 142 | "source": [], | |
| 143 | "metadata": { | |
| 144 | "collapsed": false | |
| 145 | } | |
| 146 | } | |
| 147 | } | |
| 148 | }, | |
| 149 | "nbformat": 4, | |
| 150 | "nbformat_minor": 2 | |
| 151 | } |
| 0 | ||
| 1 | { | |
| 2 | "cells": [ | |
| 3 | { | |
| 4 | "cell_type": "code", | |
| 5 | "execution_count": null, | |
| 6 | "metadata": {}, | |
| 7 | "outputs": [], | |
| 8 | "source": [ | |
| 9 | "print('Hello Mo!')" | |
| 10 | ] | |
| 11 | } | |
| 12 | ], | |
| 13 | "metadata": { | |
| 14 | "kernelspec": { | |
| 15 | "display_name": "Python 3", | |
| 16 | "language": "python", | |
| 17 | "name": "python3" | |
| 18 | }, | |
| 19 | "language_info": { | |
| 20 | "codemirror_mode": { | |
| 21 | "name": "ipython", | |
| 22 | "version": 3 | |
| 23 | }, | |
| 24 | "file_extension": ".py", | |
| 25 | "mimetype": "text/x-python", | |
| 26 | "name": "python", | |
| 27 | "nbconvert_exporter": "python", | |
| 28 | "pygments_lexer": "ipython3", | |
| 29 | "version": "3.5.2" | |
| 30 | } | |
| 31 | }, | |
| 32 | "nbformat": 4, | |
| 33 | "nbformat_minor": 2 | |
| 34 | } | |
| 0 | Indices (0-indexed) of the 100 images held out from training. | |
| 1 | [ 113 509 242 280 533 638 644 698 751 832 10989 16008 | |
| 2 | 13473 6659 20401 24841 26378 8103 11730 8363 16512 6736 27666 30287 | |
| 3 | 6685 30696 16591 8424 26689 21078 27971 7202 6615 36150 9681 13137 | |
| 4 | 1598 9726 4825 2864 1346 21784 4159 13270 19239 9844 16056 2822 | |
| 5 | 15792 19837 5198 19980 30042 36491 15648 20315 3604 8020 1108 18235 | |
| 6 | 16373 25717 32200 10547 6786 31384 33999 25763 20226 9447 4573 5938 | |
| 7 | 1837 25121 17611 32751 28158 29381 13090 32210 17027 30171 12001 16240 | |
| 8 | 22205 11808 20113 10682 33338 24015 15154 10449 11373 8736 26320 4095 | |
| 9 | 13855 23504 2004 33307] |
| 0 | # proj: image-outpainting | |
| 1 | # file: figs.py | |
| 2 | # authors: Mark Sabini, Gili Rusak | |
| 3 | # desc: Collection of utilities for generating figures. | |
| 4 | # ------------------------------------------------------------- | |
| 5 | import numpy as np | |
| 6 | from PIL import Image | |
| 7 | import matplotlib.pyplot as plt | |
| 8 | import util | |
| 9 | ||
| 10 | IMAGE_SZ = 128 | |
| 11 | ||
| 12 | def resize(in_PATH, out_PATH): | |
| 13 | img = Image.open(in_PATH).convert('RGB') | |
| 14 | img_scale = img.resize((IMAGE_SZ, IMAGE_SZ), Image.ANTIALIAS) | |
| 15 | img_scale.save(out_PATH, format='PNG') | |
| 16 | ||
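| | # Masks out the left and right quarters of the image with its mean pixel value, | |
| | # mirroring the masking the outpainting pipeline applies during preprocessing. | |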
| 17 | def mask(in_PATH, out_PATH): | |
| 18 | img = np.array(Image.open(in_PATH).convert('RGB')) | |
| 19 | pix_avg = np.mean(img) | |
| 20 | img[:, :int(2 * IMAGE_SZ / 8), :] = img[:, int(-2 * IMAGE_SZ / 8):, :] = pix_avg | |
| 21 | img = Image.fromarray(img.astype(np.uint8), 'RGB') | |
| 22 | img.save(out_PATH, format='PNG') |
| 0 | # proj: image-outpainting | |
| 1 | # file: gen.py | |
| 2 | # authors: Mark Sabini, Gili Rusak | |
| 3 | # desc: Script for generating new images. Pads the image, creates | |
| 4 | # a mask, feeds it through the network, and postprocesses. | |
| 5 | # ------------------------------------------------------------- | |
| 6 | import tensorflow as tf | |
| 7 | import numpy as np | |
| 8 | from PIL import Image | |
| 9 | import model | |
| 10 | import util | |
| 11 | import os | |
| 12 | import sys | |
| 13 | ||
| 14 | if len(sys.argv) != 4: | |
| 15 | print('Usage: python gen.py [model_PATH] [in_PATH] [out_PATH]') | |
| 16 | exit() | |
| 17 | ||
| 18 | _, model_PATH, in_PATH, out_PATH = sys.argv | |
| 19 | ||
| 20 | tf.reset_default_graph() | |
| 21 | ||
| 22 | IMAGE_SZ = 128 | |
| 23 | ||
| 24 | img = np.array(Image.open(in_PATH).convert('RGB')) | |
| 25 | img_p = util.preprocess_images_gen(img / 255.0) | |
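| | # img_p is the input padded on both sides with the mean pixel value, plus a fourth | |
| | # channel masking the new side strips that the generator must fill in. | |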
| 26 | ||
| 27 | G_Z = tf.placeholder(tf.float32, shape=[1, img_p.shape[1], img_p.shape[2], 4], name='G_Z') | |
| 28 | G_sample = model.generator(G_Z) | |
| 29 | ||
| 30 | saver = tf.train.Saver() | |
| 31 | ||
| 32 | with tf.Session() as sess: | |
| 33 | saver.restore(sess, model_PATH) | |
| 34 | output, = sess.run([G_sample], feed_dict={G_Z: img_p}) | |
| 35 | output = util.norm_image(output[0]) | |
| 36 | output_p = util.postprocess_images_gen(img, output, blend=True) | |
| 37 | img_o = Image.fromarray(output_p, 'RGB') | |
| 38 | img_o.save(out_PATH, format='PNG') |
| 0 | # proj: image-outpainting | |
| 1 | # file: model.py | |
| 2 | # authors: Mark Sabini, Gili Rusak | |
| 3 | # desc: Model for outpainting on 128x128 images with only | |
| 4 | # a global discriminator. | |
| 5 | # ------------------------------------------------------------- | |
| 6 | import tensorflow as tf | |
| 7 | ||
| 8 | print('Imported model (for Places365, 128x128 images)') | |
| 9 | ||
| 10 | def generator(z): | |
| 11 | with tf.variable_scope('G', reuse=tf.AUTO_REUSE): | |
| 12 | conv1 = tf.layers.conv2d( | |
| 13 | inputs=z, | |
| 14 | filters=64, | |
| 15 | kernel_size=[5, 5], | |
| 16 | strides=(1, 1), | |
| 17 | padding="same", | |
| 18 | activation=tf.nn.relu) | |
| 19 | ||
| 20 | conv2 = tf.layers.conv2d( | |
| 21 | inputs=conv1, | |
| 22 | filters=128, | |
| 23 | kernel_size=[3, 3], | |
| 24 | strides=(2, 2), | |
| 25 | padding="same", | |
| 26 | activation=tf.nn.relu) | |
| 27 | ||
| 28 | conv3 = tf.layers.conv2d( | |
| 29 | inputs=conv2, | |
| 30 | filters=256, | |
| 31 | kernel_size=[3, 3], | |
| 32 | strides=(1, 1), | |
| 33 | padding="same", | |
| 34 | activation=tf.nn.relu) | |
| 35 | ||
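| | # Dilated convolutions (rates 2, 4, 8 below) grow the receptive field without | |
| | # further downsampling, letting the generator draw on distant image context. | |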
| 36 | conv4 = tf.layers.conv2d( | |
| 37 | inputs=conv3, | |
| 38 | filters=256, | |
| 39 | kernel_size=[3, 3], | |
| 40 | strides=(1, 1), | |
| 41 | dilation_rate=(2, 2), | |
| 42 | padding="same", | |
| 43 | activation=tf.nn.relu) | |
| 44 | ||
| 45 | conv5 = tf.layers.conv2d( | |
| 46 | inputs=conv4, | |
| 47 | filters=256, | |
| 48 | kernel_size=[3, 3], | |
| 49 | strides=(1, 1), | |
| 50 | dilation_rate=(4, 4), | |
| 51 | padding="same", | |
| 52 | activation=tf.nn.relu) | |
| 53 | ||
| 54 | conv5_p = tf.layers.conv2d( | |
| 55 | inputs=conv5, | |
| 56 | filters=256, | |
| 57 | kernel_size=[3, 3], | |
| 58 | strides=(1, 1), | |
| 59 | dilation_rate=(8, 8), | |
| 60 | padding="same", | |
| 61 | activation=tf.nn.relu) | |
| 62 | ||
| 63 | conv6 = tf.layers.conv2d( | |
| 64 | inputs=conv5_p, | |
| 65 | filters=256, | |
| 66 | kernel_size=[3, 3], | |
| 67 | strides=(1, 1), | |
| 68 | padding="same", | |
| 69 | activation=tf.nn.relu) | |
| 70 | ||
| 71 | deconv7 = tf.layers.conv2d_transpose( | |
| 72 | inputs=conv6, | |
| 73 | filters=128, | |
| 74 | kernel_size=[4, 4], | |
| 75 | strides=(2, 2), | |
| 76 | padding="same", | |
| 77 | activation=tf.nn.relu) | |
| 78 | ||
| 79 | conv8 = tf.layers.conv2d( | |
| 80 | inputs=deconv7, | |
| 81 | filters=64, | |
| 82 | kernel_size=[3, 3], | |
| 83 | strides=(1, 1), | |
| 84 | padding="same", | |
| 85 | activation=tf.nn.relu) | |
| 86 | ||
| 87 | out = tf.layers.conv2d( | |
| 88 | inputs=conv8, | |
| 89 | filters=3, | |
| 90 | kernel_size=[3, 3], | |
| 91 | strides=(1, 1), | |
| 92 | padding="same", | |
| 93 | activation=tf.sigmoid) | |
| 94 | ||
| 95 | return out | |
| 96 | ||
| 97 | def global_discriminator(x): | |
| 98 | with tf.variable_scope('DG', reuse=tf.AUTO_REUSE): | |
| 99 | conv1 = tf.layers.conv2d( | |
| 100 | inputs=x, | |
| 101 | filters=32, | |
| 102 | kernel_size=[5, 5], | |
| 103 | strides=(2, 2), | |
| 104 | padding="same", | |
| 105 | activation=tf.nn.relu) | |
| 106 | ||
| 107 | conv2 = tf.layers.conv2d( | |
| 108 | inputs=conv1, | |
| 109 | filters=64, | |
| 110 | kernel_size=[5, 5], | |
| 111 | strides=(2, 2), | |
| 112 | padding="same", | |
| 113 | activation=tf.nn.relu) | |
| 114 | ||
| 115 | conv3 = tf.layers.conv2d( | |
| 116 | inputs=conv2, | |
| 117 | filters=64, | |
| 118 | kernel_size=[5, 5], | |
| 119 | strides=(2, 2), | |
| 120 | padding="same", | |
| 121 | activation=tf.nn.relu) | |
| 122 | ||
| 123 | conv4 = tf.layers.conv2d( | |
| 124 | inputs=conv3, | |
| 125 | filters=64, | |
| 126 | kernel_size=[5, 5], | |
| 127 | strides=(2, 2), | |
| 128 | padding="same", | |
| 129 | activation=tf.nn.relu) | |
| 130 | ||
| 131 | conv5 = tf.layers.conv2d( | |
| 132 | inputs=conv4, | |
| 133 | filters=64, | |
| 134 | kernel_size=[5, 5], | |
| 135 | strides=(2, 2), | |
| 136 | padding="same", | |
| 137 | activation=tf.nn.relu) | |
| 138 | ||
| 139 | conv5_flat = tf.layers.flatten( | |
| 140 | inputs=conv5) | |
| 141 | ||
| 142 | dense6 = tf.layers.dense( | |
| 143 | inputs=conv5_flat, | |
| 144 | units=512, | |
| 145 | activation=tf.nn.relu) | |
| 146 | ||
| 147 | return dense6 | |
| 148 | ||
| 149 | def concatenator(global_x): | |
| 150 | with tf.variable_scope('C', reuse=tf.AUTO_REUSE): | |
| 151 | dense1 = tf.layers.dense( | |
| 152 | inputs=global_x, | |
| 153 | units=1, | |
| 154 | activation=tf.sigmoid) | |
| 155 | ||
| 156 | return dense1 |
| 0 | # proj: image-outpainting | |
| 1 | # file: model_ld.py | |
| 2 | # authors: Mark Sabini, Gili Rusak | |
| 3 | # desc: Model for outpainting on 128x128 images with both | |
| 4 | # global and local discriminators. | |
| 5 | # ------------------------------------------------------------- | |
| 6 | import tensorflow as tf | |
| 7 | ||
| 8 | print('Imported model_ld (for Places365, 128x128 images with local discriminator)') | |
| 9 | ||
| 10 | def generator(z): | |
| 11 | with tf.variable_scope('G', reuse=tf.AUTO_REUSE): | |
| 12 | conv1 = tf.layers.conv2d( | |
| 13 | inputs=z, | |
| 14 | filters=64, | |
| 15 | kernel_size=[5, 5], | |
| 16 | strides=(1, 1), | |
| 17 | padding="same", | |
| 18 | activation=tf.nn.relu) | |
| 19 | ||
| 20 | conv2 = tf.layers.conv2d( | |
| 21 | inputs=conv1, | |
| 22 | filters=128, | |
| 23 | kernel_size=[3, 3], | |
| 24 | strides=(2, 2), | |
| 25 | padding="same", | |
| 26 | activation=tf.nn.relu) | |
| 27 | ||
| 28 | conv3 = tf.layers.conv2d( | |
| 29 | inputs=conv2, | |
| 30 | filters=256, | |
| 31 | kernel_size=[3, 3], | |
| 32 | strides=(1, 1), | |
| 33 | padding="same", | |
| 34 | activation=tf.nn.relu) | |
| 35 | ||
| 36 | conv4 = tf.layers.conv2d( | |
| 37 | inputs=conv3, | |
| 38 | filters=256, | |
| 39 | kernel_size=[3, 3], | |
| 40 | strides=(1, 1), | |
| 41 | dilation_rate=(2, 2), | |
| 42 | padding="same", | |
| 43 | activation=tf.nn.relu) | |
| 44 | ||
| 45 | conv5 = tf.layers.conv2d( | |
| 46 | inputs=conv4, | |
| 47 | filters=256, | |
| 48 | kernel_size=[3, 3], | |
| 49 | strides=(1, 1), | |
| 50 | dilation_rate=(4, 4), | |
| 51 | padding="same", | |
| 52 | activation=tf.nn.relu) | |
| 53 | ||
| 54 | conv5_p = tf.layers.conv2d( | |
| 55 | inputs=conv5, | |
| 56 | filters=256, | |
| 57 | kernel_size=[3, 3], | |
| 58 | strides=(1, 1), | |
| 59 | dilation_rate=(8, 8), | |
| 60 | padding="same", | |
| 61 | activation=tf.nn.relu) | |
| 62 | ||
| 63 | conv6 = tf.layers.conv2d( | |
| 64 | inputs=conv5_p, | |
| 65 | filters=256, | |
| 66 | kernel_size=[3, 3], | |
| 67 | strides=(1, 1), | |
| 68 | padding="same", | |
| 69 | activation=tf.nn.relu) | |
| 70 | ||
| 71 | deconv7 = tf.layers.conv2d_transpose( | |
| 72 | inputs=conv6, | |
| 73 | filters=128, | |
| 74 | kernel_size=[4, 4], | |
| 75 | strides=(2, 2), | |
| 76 | padding="same", | |
| 77 | activation=tf.nn.relu) | |
| 78 | ||
| 79 | conv8 = tf.layers.conv2d( | |
| 80 | inputs=deconv7, | |
| 81 | filters=64, | |
| 82 | kernel_size=[3, 3], | |
| 83 | strides=(1, 1), | |
| 84 | padding="same", | |
| 85 | activation=tf.nn.relu) | |
| 86 | ||
| 87 | out = tf.layers.conv2d( | |
| 88 | inputs=conv8, | |
| 89 | filters=3, | |
| 90 | kernel_size=[3, 3], | |
| 91 | strides=(1, 1), | |
| 92 | padding="same", | |
| 93 | activation=tf.sigmoid) | |
| 94 | ||
| 95 | return out | |
| 96 | ||
| 97 | def global_discriminator(x): | |
| 98 | with tf.variable_scope('DG', reuse=tf.AUTO_REUSE): | |
| 99 | conv1 = tf.layers.conv2d( | |
| 100 | inputs=x, | |
| 101 | filters=32, | |
| 102 | kernel_size=[5, 5], | |
| 103 | strides=(2, 2), | |
| 104 | padding="same", | |
| 105 | activation=tf.nn.relu) | |
| 106 | ||
| 107 | conv2 = tf.layers.conv2d( | |
| 108 | inputs=conv1, | |
| 109 | filters=64, | |
| 110 | kernel_size=[5, 5], | |
| 111 | strides=(2, 2), | |
| 112 | padding="same", | |
| 113 | activation=tf.nn.relu) | |
| 114 | ||
| 115 | conv3 = tf.layers.conv2d( | |
| 116 | inputs=conv2, | |
| 117 | filters=64, | |
| 118 | kernel_size=[5, 5], | |
| 119 | strides=(2, 2), | |
| 120 | padding="same", | |
| 121 | activation=tf.nn.relu) | |
| 122 | ||
| 123 | conv4 = tf.layers.conv2d( | |
| 124 | inputs=conv3, | |
| 125 | filters=64, | |
| 126 | kernel_size=[5, 5], | |
| 127 | strides=(2, 2), | |
| 128 | padding="same", | |
| 129 | activation=tf.nn.relu) | |
| 130 | ||
| 131 | conv5 = tf.layers.conv2d( | |
| 132 | inputs=conv4, | |
| 133 | filters=64, | |
| 134 | kernel_size=[5, 5], | |
| 135 | strides=(2, 2), | |
| 136 | padding="same", | |
| 137 | activation=tf.nn.relu) | |
| 138 | ||
| 139 | conv5_flat = tf.layers.flatten( | |
| 140 | inputs=conv5) | |
| 141 | ||
| 142 | dense6 = tf.layers.dense( | |
| 143 | inputs=conv5_flat, | |
| 144 | units=512, | |
| 145 | activation=tf.nn.relu) | |
| 146 | ||
| 147 | return dense6 | |
| 148 | ||
| 149 | def local_discriminator(x): | |
| 150 | with tf.variable_scope('DL', reuse=tf.AUTO_REUSE): | |
| 151 | conv1 = tf.layers.conv2d( | |
| 152 | inputs=x, | |
| 153 | filters=32, | |
| 154 | kernel_size=[5, 5], | |
| 155 | strides=(2, 2), | |
| 156 | padding="same", | |
| 157 | activation=tf.nn.relu) | |
| 158 | ||
| 159 | conv2 = tf.layers.conv2d( | |
| 160 | inputs=conv1, | |
| 161 | filters=64, | |
| 162 | kernel_size=[5, 5], | |
| 163 | strides=(2, 2), | |
| 164 | padding="same", | |
| 165 | activation=tf.nn.relu) | |
| 166 | ||
| 167 | conv3 = tf.layers.conv2d( | |
| 168 | inputs=conv2, | |
| 169 | filters=64, | |
| 170 | kernel_size=[5, 5], | |
| 171 | strides=(2, 2), | |
| 172 | padding="same", | |
| 173 | activation=tf.nn.relu) | |
| 174 | ||
| 175 | conv4 = tf.layers.conv2d( | |
| 176 | inputs=conv3, | |
| 177 | filters=64, | |
| 178 | kernel_size=[5, 5], | |
| 179 | strides=(2, 2), | |
| 180 | padding="same", | |
| 181 | activation=tf.nn.relu) | |
| 182 | ||
| 183 | conv4_flat = tf.layers.flatten( | |
| 184 | inputs=conv4) | |
| 185 | ||
| 186 | dense5 = tf.layers.dense( | |
| 187 | inputs=conv4_flat, | |
| 188 | units=512, | |
| 189 | activation=tf.nn.relu) | |
| 190 | ||
| 191 | return dense5 | |
| 192 | ||
| 193 | def concatenator(global_x, local_x_left, local_x_right): | |
| 194 | with tf.variable_scope('C', reuse=tf.AUTO_REUSE): | |
| 195 | dense1 = tf.layers.dense( | |
| 196 | inputs=tf.concat([global_x, local_x_left, local_x_right], axis=-1), | |
| 197 | units=1, | |
| 198 | activation=tf.sigmoid) | |
| 199 | ||
| 200 | return dense1 |
| 0 | #!/usr/bin/env bash | |
| 1 | ||
| 2 | # Runs train.py and saves the console output to output/out | |
| 3 | stdbuf -i0 -o0 -e0 python -u train.py | tee output/out |
| 0 | #!/usr/bin/env bash | |
| 1 | ||
| 2 | # Runs train_ld.py and saves the console output to output/out | |
| 3 | stdbuf -i0 -o0 -e0 python -u train_ld.py | tee output/out |
| 0 | # proj: image-outpainting | |
| 1 | # file: test.py | |
| 2 | # authors: Mark Sabini, Gili Rusak | |
| 3 | # desc: Script for simulating the training pipeline. Masks out | |
| 4 | # the sides of an image, feeds it through the network, and | |
| 5 | # compares the network output to the original image. | |
| 6 | # ------------------------------------------------------------- | |
| 7 | import tensorflow as tf | |
| 8 | import numpy as np | |
| 9 | from PIL import Image | |
| 10 | import model | |
| 11 | import util | |
| 12 | import os | |
| 13 | import sys | |
| 14 | ||
| 15 | if len(sys.argv) != 4: | |
| 16 | print('Usage: python test.py [model_PATH] [in_PATH] [out_PATH]') | |
| 17 | exit() | |
| 18 | ||
| 19 | _, model_PATH, in_PATH, out_PATH = sys.argv | |
| 20 | ||
| 21 | tf.reset_default_graph() | |
| 22 | ||
| 23 | IMAGE_SZ = 128 | |
| 24 | ||
| 25 | img = np.array(Image.open(in_PATH).convert('RGB'))[np.newaxis] / 255.0 | |
| 26 | img_p = util.preprocess_images_outpainting(img) | |
| 27 | ||
| 28 | G_Z = tf.placeholder(tf.float32, shape=[None, IMAGE_SZ, IMAGE_SZ, 4], name='G_Z') | |
| 29 | G_sample = model.generator(G_Z) | |
| 30 | ||
| 31 | saver = tf.train.Saver() | |
| 32 | ||
| 33 | with tf.Session() as sess: | |
| 34 | saver.restore(sess, model_PATH) | |
| 35 | output, = sess.run([G_sample], feed_dict={G_Z: img_p}) | |
| 36 | util.save_image(output[0], out_PATH) |
| 0 | # proj: image-outpainting | |
| 1 | # file: train.py | |
| 2 | # authors: Mark Sabini, Gili Rusak | |
| 3 | # desc: Train the model specified in model.py, which only | |
| 4 | # uses a global discriminator. | |
| 5 | # ------------------------------------------------------------- | |
| 6 | import tensorflow as tf | |
| 7 | import numpy as np | |
| 8 | from PIL import Image | |
| 9 | import model | |
| 10 | import util | |
| 11 | import os | |
| 12 | import sys | |
| 13 | ||
| 14 | tf.reset_default_graph() | |
| 15 | ||
| 16 | # Places365 Training Hyperparameters | |
| 17 | BATCH_SZ = 16 | |
| 18 | VERBOSE = False | |
| 19 | EPSILON = 1e-9 | |
| 20 | IMAGE_SZ = 128 | |
| 21 | OUT_DIR = 'output' | |
| 22 | MODEL_DIR = os.path.join(OUT_DIR, 'models') | |
| 23 | INFO_PATH = os.path.join(OUT_DIR, 'run.txt') | |
| 24 | N_TEST = 10 | |
| 25 | N_ITERS = 227500 | |
| 26 | N_ITERS_P1 = 40950 # How many iterations to train in phase 1 | |
| 27 | N_ITERS_P2 = 4550 # How many iterations to train in phase 2 | |
| 28 | INTV_PRINT = 200 # How often to print | |
| 29 | INTV_SAVE = 1000 # How often to save the model | |
| 30 | ALPHA = 0.0004 | |
| 31 | ||
| 32 | ''' | |
| 33 | # City Training Hyperparameters | |
| 34 | BATCH_SZ = 1 | |
| 35 | VERBOSE = False | |
| 36 | EPSILON = 1e-9 | |
| 37 | IMAGE_SZ = 128 | |
| 38 | OUT_DIR = 'output' | |
| 39 | MODEL_DIR = os.path.join(OUT_DIR, 'models') | |
| 40 | INFO_PATH = os.path.join(OUT_DIR, 'run.txt') | |
| 41 | N_TEST = 1 | |
| 42 | N_ITERS = 5000 | |
| 43 | N_ITERS_P1 = 1000 # How many iterations to train in phase 1 | |
| 44 | N_ITERS_P2 = 400 # How many iterations to train in phase 2 | |
| 45 | INTV_PRINT = 50 # How often to print | |
| 46 | INTV_SAVE = 10000 # How often to save the model | |
| 47 | ALPHA = 0.0004 | |
| 48 | ''' | |
| 49 | ||
| 50 | # Check that we don't clobber a pre-existing run | |
| 51 | if len(sys.argv) < 2 and os.path.isdir(OUT_DIR) and len(os.listdir(OUT_DIR)) > 2: | |
| 52 | print('Warning, OUT_DIR already exists. Aborting.') | |
| 53 | exit() | |
| 54 | ||
| 55 | # Load in a model if specified as the second argument. | |
| 56 | start_iter = 0 | |
| 57 | model_filename = None | |
| 58 | if len(sys.argv) >= 2: | |
| 59 | start_iter = int(sys.argv[1]) | |
| 60 | model_filename = os.path.join(MODEL_DIR, 'model%d.ckpt' % start_iter) | |
| 61 | ||
| 62 | # Generator code | |
| 63 | G_Z = tf.placeholder(tf.float32, shape=[None, IMAGE_SZ, IMAGE_SZ, 4], name='G_Z') | |
| 64 | DG_X = tf.placeholder(tf.float32, shape=[None, IMAGE_SZ, IMAGE_SZ, 3], name='DG_X') | |
| 65 | ||
| 66 | # Load Places365 data | |
| 67 | data = np.load('places/places_128.npz') | |
| 68 | imgs = data['imgs_train'] # Originally from http://data.csail.mit.edu/places/places365/val_256.tar | |
| 69 | imgs_p = util.preprocess_images_outpainting(imgs) | |
| 70 | ||
| 71 | test_imgs = data['imgs_test'] | |
| 72 | test_imgs_p = util.preprocess_images_outpainting(test_imgs) | |
| 73 | ||
| 74 | test_img = test_imgs[:N_TEST] | |
| 75 | test_img_p = test_imgs_p[:N_TEST] | |
| 76 | ||
| 77 | train_img = imgs[4, np.newaxis] | |
| 78 | train_img_p = imgs_p[4, np.newaxis] | |
| 79 | ||
| 80 | ''' | |
| 81 | # Load city image data | |
| 82 | imgs = util.load_city_image() | |
| 83 | imgs_p = util.preprocess_images_outpainting(imgs) | |
| 84 | ||
| 85 | test_imgs = util.load_city_image() | |
| 86 | test_imgs_p = util.preprocess_images_outpainting(test_imgs) | |
| 87 | ||
| 88 | test_img = test_imgs | |
| 89 | test_img_p = test_imgs_p | |
| 90 | ||
| 91 | train_img = imgs | |
| 92 | train_img_p = imgs_p | |
| 93 | ''' | |
| 94 | ||
| 95 | # Write training and testing sample ground truths as reference | |
| 96 | util.save_image(train_img[0], os.path.join(OUT_DIR, 'train_img.png')) | |
| 97 | for i_test in range(N_TEST): | |
| 98 | util.save_image(test_imgs[i_test], os.path.join(OUT_DIR, 'test_img_%d.png' % i_test)) | |
| 99 | ||
| 100 | G_sample = model.generator(G_Z) | |
| 101 | vars_G = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='G') | |
| 102 | ||
| 103 | C_real = model.concatenator(model.global_discriminator(DG_X)) | |
| 104 | C_fake = model.concatenator(model.global_discriminator(G_sample)) | |
| 105 | vars_DG = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='DG') | |
| 106 | vars_C = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='C') | |
| 107 | ||
| 108 | C_loss = -tf.reduce_mean(tf.log(tf.maximum(C_real, EPSILON)) + tf.log(tf.maximum(1. - C_fake, EPSILON))) | |
| 109 | G_MSE_loss = tf.losses.mean_squared_error(G_sample, DG_X, weights=tf.expand_dims(G_Z[:,:,:,3], -1)) # TODO: MULTIPLY with mask. Actually see if we want to remove this. | |
| 110 | G_loss = G_MSE_loss - ALPHA * tf.reduce_mean(tf.log(tf.maximum(C_fake, EPSILON))) | |
| 111 | ||
| 112 | C_solver = tf.train.AdamOptimizer().minimize(C_loss, var_list=(vars_DG + vars_C)) | |
| 113 | G_solver = tf.train.AdamOptimizer().minimize(G_loss, var_list=vars_G) | |
| 114 | G_MSE_solver = tf.train.AdamOptimizer().minimize(G_MSE_loss, var_list=vars_G) | |
| 115 | ||
| 116 | train_MSE_loss = [] | |
| 117 | dev_MSE_loss = [] | |
| 118 | ||
| 119 | last_output_PATH = [None] * N_TEST | |
| 120 | ||
| 121 | assert N_ITERS > N_ITERS_P1 + N_ITERS_P2 | |
| 122 | ||
| 123 | # Saver to save the session | |
| 124 | saver = tf.train.Saver() | |
| 125 | ||
| 126 | with tf.Session() as sess: | |
| 127 | if model_filename is None: | |
| 128 | sess.run(tf.global_variables_initializer()) | |
| 129 | else: | |
| 130 | saver.restore(sess, model_filename) | |
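| | # Three-phase schedule: Phase 1 trains the generator on MSE alone, Phase 2 | |
| | # trains only the discriminator, and Phase 3 trains both adversarially. | |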
| 131 | for i in range(start_iter, N_ITERS + 1): | |
| 132 | batch, batch_p = util.sample_random_minibatch(imgs, imgs_p, BATCH_SZ) | |
| 133 | G_sample_ = None | |
| 134 | C_loss_curr, G_loss_curr, G_MSE_loss_curr = None, None, None | |
| 135 | if i < N_ITERS_P1: # Stage 1 - Train Generator Only | |
| 136 | if i == 0: | |
| 137 | print('------------------> Beginning Phase 1...') | |
| 138 | _, G_MSE_loss_curr, G_sample_ = sess.run([G_MSE_solver, G_MSE_loss, G_sample], feed_dict={DG_X: batch, G_Z: batch_p}) | |
| 139 | elif i < N_ITERS_P1 + N_ITERS_P2: # Stage 2 - Train Discriminator Only | |
| 140 | if i == N_ITERS_P1: | |
| 141 | print('------------------> Beginning Phase 2...') | |
| 142 | _, C_loss_curr, C_real_, C_fake_ = sess.run([C_solver, C_loss, C_real, C_fake], feed_dict={DG_X: batch, G_Z: batch_p}) | |
| 143 | if VERBOSE: | |
| 144 | print((i, C_loss_curr, np.min(C_real_), np.max(C_real_), np.min(C_fake_), np.max(C_fake_))) | |
| 145 | else: # Stage 3 - Train both Generator and Discriminator | |
| 146 | if i == N_ITERS_P1 + N_ITERS_P2: | |
| 147 | print('------------------> Beginning Phase 3...') | |
| 148 | _, C_loss_curr, C_real_, C_fake_ = sess.run([C_solver, C_loss, C_real, C_fake], feed_dict={DG_X: batch, G_Z: batch_p}) | |
| 149 | if VERBOSE: | |
| 150 | print((i, C_loss_curr, 'D', np.min(C_real_), np.max(C_real_), np.min(C_fake_), np.max(C_fake_))) | |
| 151 | _, G_loss_curr, G_MSE_loss_curr, G_sample_, C_fake_ = sess.run([G_solver, G_loss, G_MSE_loss, G_sample, C_fake], feed_dict={DG_X: batch, G_Z: batch_p}) | |
| 152 | if VERBOSE: | |
| 153 | print((i, G_loss_curr, 'G', np.min(C_fake_), np.max(C_fake_))) | |
| 154 | ||
| 155 | # Periodically test the generator on held-out images | |
| 156 | if i % INTV_PRINT == 0: | |
| 157 | G_MSE_loss_curr_dev = None | |
| 158 | if G_sample_ is not None: | |
| 159 | # Print out the dev image | |
| 160 | output, G_MSE_loss_curr_dev = sess.run([G_sample, G_MSE_loss], feed_dict={DG_X: test_img, G_Z: test_img_p}) | |
| 161 | for i_test in range(N_TEST): | |
| 162 | util.save_image(output[i_test], os.path.join(OUT_DIR, 'dev_%d_%d.png' % (i_test, i))) | |
| 163 | last_output_PATH[i_test] = os.path.join(OUT_DIR, 'dev_%d_%d.png' % (i_test, i)) | |
| 164 | # Also save the train image | |
| 165 | output, = sess.run([G_sample], feed_dict={DG_X: train_img, G_Z: train_img_p}) | |
| 166 | util.save_image(output[0], os.path.join(OUT_DIR, 'train%d.png' % i)) | |
| 167 | print('Iteration [%d/%d]:' % (i, N_ITERS)) | |
| 168 | if G_MSE_loss_curr is not None: | |
| 169 | print('\tG_MSE_loss (train) = %f' % G_MSE_loss_curr) | |
| 170 | if G_MSE_loss_curr_dev is not None: | |
| 171 | print('\tG_MSE_loss (dev) = %f' % G_MSE_loss_curr_dev) | |
| 172 | if G_loss_curr is not None: | |
| 173 | print('\tG_loss = %f' % G_loss_curr) | |
| 174 | if C_loss_curr is not None: | |
| 175 | print('\tC_loss = %f' % C_loss_curr) | |
| 176 | ||
| 177 | # Keep track of losses for logging | |
| 178 | if G_MSE_loss_curr is not None: | |
| 179 | train_MSE_loss.append([i, G_MSE_loss_curr]) | |
| 180 | if G_MSE_loss_curr_dev is not None: | |
| 181 | dev_MSE_loss.append([i, G_MSE_loss_curr_dev]) | |
| 182 | ||
| 183 | # Save the model every so often | |
| 184 | if i % INTV_SAVE == 0: | |
| 185 | save_path = saver.save(sess, os.path.join(MODEL_DIR, 'model%d.ckpt' % i)) | |
| 186 | print('Model saved in path: %s' % save_path) | |
| 187 | ||
| 188 | # Save the loss every so often | |
| 189 | if i % INTV_SAVE == 0: | |
| 190 | np.savez(os.path.join(OUT_DIR, 'loss.npz'), train_MSE_loss=np.array(train_MSE_loss), dev_MSE_loss=np.array(dev_MSE_loss)) | |
| 191 | ||
| 192 | # Save the loss | |
| 193 | np.savez(os.path.join(OUT_DIR, 'loss.npz'), train_MSE_loss=np.array(train_MSE_loss), dev_MSE_loss=np.array(dev_MSE_loss)) | |
| 194 | # Save the final blended output, and make a graph of the loss. | |
| 195 | util.plot_loss(os.path.join(OUT_DIR, 'loss.npz'), 'MSE Loss During Training', os.path.join(OUT_DIR, 'loss_plot.png')) | |
| 196 | for i_test in range(N_TEST): | |
| 197 | util.postprocess_images_outpainting(os.path.join(OUT_DIR, 'test_img_%d.png' % i_test), last_output_PATH[i_test], os.path.join(OUT_DIR, 'out_paste_%d.png' % i_test), blend=False) | |
| 198 | util.postprocess_images_outpainting(os.path.join(OUT_DIR, 'test_img_%d.png' % i_test), last_output_PATH[i_test], os.path.join(OUT_DIR, 'out_blend_%d.png' % i_test), blend=True) |
| 0 | # proj: image-outpainting | |
| 1 | # file: train_ld.py | |
| 2 | # authors: Mark Sabini, Gili Rusak | |
| 3 | # desc: Train the model specified in model_ld.py, which | |
| 4 | # uses both global and local discriminators. | |
| 5 | # ------------------------------------------------------------- | |
| 6 | import tensorflow as tf | |
| 7 | import numpy as np | |
| 8 | from PIL import Image | |
| 9 | import model_ld as model | |
| 10 | import util | |
| 11 | import os | |
| 12 | import sys | |
| 13 | ||
| 14 | tf.reset_default_graph() | |
| 15 | ||
| 16 | # Places365 Training Hyperparameters | |
| 17 | BATCH_SZ = 16 | |
| 18 | VERBOSE = False | |
| 19 | EPSILON = 1e-9 | |
| 20 | IMAGE_SZ = 128 | |
| 21 | OUT_DIR = 'output' | |
| 22 | MODEL_DIR = os.path.join(OUT_DIR, 'models') | |
| 23 | INFO_PATH = os.path.join(OUT_DIR, 'run.txt') | |
| 24 | N_TEST = 10 | |
| 25 | N_ITERS = 64000 | |
| 26 | N_ITERS_P1 = 20000 # How many iterations to train in phase 1 | |
| 27 | N_ITERS_P2 = 4000 # How many iterations to train in phase 2 | |
| 28 | INTV_PRINT = 200 # How often to print | |
| 29 | INTV_SAVE = 1000 # How often to save the model | |
| 30 | ALPHA = 0.0004 | |
| 31 | ||
| 32 | ''' | |
| 33 | # City Training Hyperparameters | |
| 34 | BATCH_SZ = 1 | |
| 35 | VERBOSE = False | |
| 36 | EPSILON = 1e-9 | |
| 37 | IMAGE_SZ = 128 | |
| 38 | OUT_DIR = 'output' | |
| 39 | MODEL_DIR = os.path.join(OUT_DIR, 'models') | |
| 40 | INFO_PATH = os.path.join(OUT_DIR, 'run.txt') | |
| 41 | N_TEST = 1 | |
| 42 | N_ITERS = 5000 | |
| 43 | N_ITERS_P1 = 1000 # How many iterations to train in phase 1 | |
| 44 | N_ITERS_P2 = 400 # How many iterations to train in phase 2 | |
| 45 | INTV_PRINT = 50 # How often to print | |
| 46 | INTV_SAVE = 10000 # How often to save the model | |
| 47 | ALPHA = 0.0004 | |
| 48 | ''' | |
| 49 | ||
| 50 | # Check that we don't clobber a pre-existing run | |
| 51 | if len(sys.argv) < 2 and os.path.isdir(OUT_DIR) and len(os.listdir(OUT_DIR)) > 2: | |
| 52 | print('Warning, OUT_DIR already exists. Aborting.') | |
| 53 | exit() | |
| 54 | ||
| 55 | # Load in a model if specified as the second argument. | |
| 56 | start_iter = 0 | |
| 57 | model_filename = None | |
| 58 | if len(sys.argv) >= 2: | |
| 59 | start_iter = int(sys.argv[1]) | |
| 60 | model_filename = os.path.join(MODEL_DIR, 'model%d.ckpt' % start_iter) | |
| 61 | ||
| 62 | # Generator code | |
| 63 | G_Z = tf.placeholder(tf.float32, shape=[None, IMAGE_SZ, IMAGE_SZ, 4], name='G_Z') | |
| 64 | DG_X = tf.placeholder(tf.float32, shape=[None, IMAGE_SZ, IMAGE_SZ, 3], name='DG_X') | |
| 65 | ||
| 66 | # Load Places365 data | |
| 67 | data = np.load('places/places_128.npz') | |
| 68 | imgs = data['imgs_train'] # Originally from http://data.csail.mit.edu/places/places365/val_256.tar | |
| 69 | imgs_p = util.preprocess_images_outpainting(imgs) | |
| 70 | ||
| 71 | test_imgs = data['imgs_test'] | |
| 72 | test_imgs_p = util.preprocess_images_outpainting(test_imgs) | |
| 73 | ||
| 74 | test_img = test_imgs[:N_TEST] | |
| 75 | test_img_p = test_imgs_p[:N_TEST] | |
| 76 | ||
| 77 | train_img = imgs[4, np.newaxis] | |
| 78 | train_img_p = imgs_p[4, np.newaxis] | |
| 79 | ||
| 80 | ''' | |
| 81 | # Load city image data | |
| 82 | imgs = util.load_city_image() | |
| 83 | imgs_p = util.preprocess_images_outpainting(imgs) | |
| 84 | ||
| 85 | test_imgs = util.load_city_image() | |
| 86 | test_imgs_p = util.preprocess_images_outpainting(test_imgs) | |
| 87 | ||
| 88 | test_img = test_imgs | |
| 89 | test_img_p = test_imgs_p | |
| 90 | ||
| 91 | train_img = imgs | |
| 92 | train_img_p = imgs_p | |
| 93 | ''' | |
| 94 | ||
| 95 | # Write training and testing sample ground truths as reference | |
| 96 | util.save_image(train_img[0], os.path.join(OUT_DIR, 'train_img.png')) | |
| 97 | for i_test in range(N_TEST): | |
| 98 | util.save_image(test_imgs[i_test], os.path.join(OUT_DIR, 'test_img_%d.png' % i_test)) | |
| 99 | ||
| 100 | G_sample = model.generator(G_Z) | |
| 101 | vars_G = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='G') | |
| 102 | ||
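| | # The local discriminator (shared weights) is applied to each 64-pixel-wide half | |
| | # of the image: the left half as-is, and the right half mirrored so both sides | |
| | # share one orientation. | |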
| 103 | C_real = model.concatenator(model.global_discriminator(DG_X), model.local_discriminator(DG_X[:, :, :IMAGE_SZ // 2, :]), model.local_discriminator(tf.reverse(DG_X[:, :, -IMAGE_SZ // 2:, :], axis=[2]))) | |
| 104 | C_fake = model.concatenator(model.global_discriminator(G_sample), model.local_discriminator(G_sample[:, :, :IMAGE_SZ // 2, :]), model.local_discriminator(tf.reverse(G_sample[:, :, -IMAGE_SZ // 2:, :], axis=[2]))) | |
| 105 | vars_DG = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='DG') | |
| 106 | vars_DL = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='DL') | |
| 107 | vars_C = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='C') | |
| 108 | ||
| 109 | C_loss = -tf.reduce_mean(tf.log(tf.maximum(C_real, EPSILON)) + tf.log(tf.maximum(1. - C_fake, EPSILON))) | |
| 110 | G_MSE_loss = tf.losses.mean_squared_error(G_sample, DG_X, weights=tf.expand_dims(G_Z[:,:,:,3], -1)) # TODO: MULTIPLY with mask. Actually see if we want to remove this. | |
| 111 | G_loss = G_MSE_loss - ALPHA * tf.reduce_mean(tf.log(tf.maximum(C_fake, EPSILON))) | |
| 112 | ||
| 113 | C_solver = tf.train.AdamOptimizer().minimize(C_loss, var_list=(vars_DG + vars_DL + vars_C)) | |
| 114 | G_solver = tf.train.AdamOptimizer().minimize(G_loss, var_list=vars_G) | |
| 115 | G_MSE_solver = tf.train.AdamOptimizer().minimize(G_MSE_loss, var_list=vars_G) | |
| 116 | ||
| 117 | train_MSE_loss = [] | |
| 118 | dev_MSE_loss = [] | |
| 119 | ||
| 120 | last_output_PATH = [None] * N_TEST | |
| 121 | ||
| 122 | assert N_ITERS > N_ITERS_P1 + N_ITERS_P2 | |
| 123 | ||
| 124 | # Saver to save the session | |
| 125 | saver = tf.train.Saver() | |
| 126 | ||
| 127 | with tf.Session() as sess: | |
| 128 | if model_filename is None: | |
| 129 | sess.run(tf.global_variables_initializer()) | |
| 130 | else: | |
| 131 | saver.restore(sess, model_filename) | |
| 132 | for i in range(start_iter, N_ITERS + 1): | |
| 133 | batch, batch_p = util.sample_random_minibatch(imgs, imgs_p, BATCH_SZ) | |
| 134 | G_sample_ = None | |
| 135 | C_loss_curr, G_loss_curr, G_MSE_loss_curr = None, None, None | |
| 136 | if i < N_ITERS_P1: # Stage 1 - Train Generator Only | |
| 137 | if i == 0: | |
| 138 | print('------------------> Beginning Phase 1...') | |
| 139 | _, G_MSE_loss_curr, G_sample_ = sess.run([G_MSE_solver, G_MSE_loss, G_sample], feed_dict={DG_X: batch, G_Z: batch_p}) | |
| 140 | elif i < N_ITERS_P1 + N_ITERS_P2: # Stage 2 - Train Discriminator Only | |
| 141 | if i == N_ITERS_P1: | |
| 142 | print('------------------> Beginning Phase 2...') | |
| 143 | _, C_loss_curr, C_real_, C_fake_ = sess.run([C_solver, C_loss, C_real, C_fake], feed_dict={DG_X: batch, G_Z: batch_p}) | |
| 144 | if VERBOSE: | |
| 145 | print((i, C_loss_curr, np.min(C_real_), np.max(C_real_), np.min(C_fake_), np.max(C_fake_))) | |
| 146 | else: # Stage 3 - Train both Generator and Discriminator | |
| 147 | if i == N_ITERS_P1 + N_ITERS_P2: | |
| 148 | print('------------------> Beginning Phase 3...') | |
| 149 | _, C_loss_curr, C_real_, C_fake_ = sess.run([C_solver, C_loss, C_real, C_fake], feed_dict={DG_X: batch, G_Z: batch_p}) | |
| 150 | if VERBOSE: | |
| 151 | print((i, C_loss_curr, 'D', np.min(C_real_), np.max(C_real_), np.min(C_fake_), np.max(C_fake_))) | |
| 152 | _, G_loss_curr, G_MSE_loss_curr, G_sample_, C_fake_ = sess.run([G_solver, G_loss, G_MSE_loss, G_sample, C_fake], feed_dict={DG_X: batch, G_Z: batch_p}) | |
| 153 | if VERBOSE: | |
| 154 | print((i, G_loss_curr, 'G', np.min(C_fake_), np.max(C_fake_))) | |
| 155 | ||
| 156 | # Periodically test the generator on held-out images | |
| 157 | if i % INTV_PRINT == 0: | |
| 158 | G_MSE_loss_curr_dev = None | |
| 159 | if G_sample_ is not None: | |
| 160 | # Print out the dev image | |
| 161 | output, G_MSE_loss_curr_dev = sess.run([G_sample, G_MSE_loss], feed_dict={DG_X: test_img, G_Z: test_img_p}) | |
| 162 | for i_test in range(N_TEST): | |
| 163 | util.save_image(output[i_test], os.path.join(OUT_DIR, 'dev_%d_%d.png' % (i_test, i))) | |
| 164 | last_output_PATH[i_test] = os.path.join(OUT_DIR, 'dev_%d_%d.png' % (i_test, i)) | |
| 165 | # Also save the train image | |
| 166 | output, = sess.run([G_sample], feed_dict={DG_X: train_img, G_Z: train_img_p}) | |
| 167 | util.save_image(output[0], os.path.join(OUT_DIR, 'train%d.png' % i)) | |
| 168 | print('Iteration [%d/%d]:' % (i, N_ITERS)) | |
| 169 | if G_MSE_loss_curr is not None: | |
| 170 | print('\tG_MSE_loss (train) = %f' % G_MSE_loss_curr) | |
| 171 | if G_MSE_loss_curr_dev is not None: | |
| 172 | print('\tG_MSE_loss (dev) = %f' % G_MSE_loss_curr_dev) | |
| 173 | if G_loss_curr is not None: | |
| 174 | print('\tG_loss = %f' % G_loss_curr) | |
| 175 | if C_loss_curr is not None: | |
| 176 | print('\tC_loss = %f' % C_loss_curr) | |
| 177 | ||
| 178 | # Keep track of losses for logging | |
| 179 | if G_MSE_loss_curr is not None: | |
| 180 | train_MSE_loss.append([i, G_MSE_loss_curr]) | |
| 181 | if G_MSE_loss_curr_dev is not None: | |
| 182 | dev_MSE_loss.append([i, G_MSE_loss_curr_dev]) | |
| 183 | ||
| 184 | # Save the model every so often | |
| 185 | if i % INTV_SAVE == 0 and i > 0: | |
| 186 | save_path = saver.save(sess, os.path.join(MODEL_DIR, 'model%d.ckpt' % i)) | |
| 187 | print('Model saved in path: %s' % save_path) | |
| 188 | ||
| 189 | # Save the loss every so often | |
| 190 | if i % INTV_SAVE == 0: | |
| 191 | np.savez(os.path.join(OUT_DIR, 'loss.npz'), train_MSE_loss=np.array(train_MSE_loss), dev_MSE_loss=np.array(dev_MSE_loss)) | |
| 192 | ||
| 193 | # Save the loss | |
| 194 | np.savez(os.path.join(OUT_DIR, 'loss.npz'), train_MSE_loss=np.array(train_MSE_loss), dev_MSE_loss=np.array(dev_MSE_loss)) | |
| 195 | # Save the final blended output, and make a graph of the loss. | |
| 196 | util.plot_loss(os.path.join(OUT_DIR, 'loss.npz'), 'MSE Loss During Training', os.path.join(OUT_DIR, 'loss_plot.png')) | |
| 197 | for i_test in range(N_TEST): | |
| 198 | util.postprocess_images_outpainting(os.path.join(OUT_DIR, 'test_img_%d.png' % i_test), last_output_PATH[i_test], os.path.join(OUT_DIR, 'out_paste_%d.png' % i_test), blend=False) | |
| 199 | util.postprocess_images_outpainting(os.path.join(OUT_DIR, 'test_img_%d.png' % i_test), last_output_PATH[i_test], os.path.join(OUT_DIR, 'out_blend_%d.png' % i_test), blend=True) |
| 0 | # proj: image-outpainting | |
| 1 | # file: util.py | |
| 2 | # authors: Mark Sabini, Gili Rusak | |
| 3 | # desc: Various utility functions for all sorts of things. | |
| 4 | # ------------------------------------------------------------- | |
| 5 | import numpy as np | |
| 6 | from PIL import Image | |
| 7 | import scipy.misc | |
| 8 | import matplotlib.pyplot as plt | |
| 9 | import cv2 | |
| 10 | import os | |
| 11 | import re | |
| 12 | import imageio | |
| 13 | ||
| 14 | IMAGE_SZ = 128 # Should be a power of 2 | |
| 15 | ||
| 16 | # Loads the city image. | |
| 17 | # Returns: normalized numpy array of size (1, IMAGE_SZ, IMAGE_SZ, 3) | |
| 18 | def load_city_image(): | |
| 19 | im = Image.open('images/city_128.png').convert('RGB') | |
| 20 | width, height = im.size | |
| 21 | left = (width - IMAGE_SZ) / 2 | |
| 22 | top = (height - IMAGE_SZ) / 2 | |
| 23 | im = im.crop((left, top, left + IMAGE_SZ, top + IMAGE_SZ)) | |
| 24 | pix = np.array(im) | |
| 25 | assert pix.shape == (IMAGE_SZ, IMAGE_SZ, 3) | |
| 26 | return pix[np.newaxis] / 255.0 # Need to normalize images to [0, 1] | |
| 27 | ||
| 28 | # Loads multiple images from a directory. | |
| 29 | # Returns: normalized numpy array of size (m, IMAGE_SZ, IMAGE_SZ, 3) | |
| 30 | def load_images(in_PATH, verbose=False): | |
| 31 | imgs = [] | |
| 32 | for filename in sorted(os.listdir(in_PATH)): | |
| 33 | if verbose: | |
| 34 | print('Processing %s' % filename) | |
| 35 | full_filename = os.path.join(os.path.abspath(in_PATH), filename) | |
| 36 | img = Image.open(full_filename).convert('RGB') | |
| 37 | pix = np.array(img) | |
| 38 | pix_norm = pix / 255.0 | |
| 39 | imgs.append(pix_norm) | |
| 40 | return np.array(imgs) | |
| 41 | ||
| 42 | # Reads in all the images in a directory and saves them to an .npy file. | |
| 43 | def compile_images(in_PATH, out_PATH): | |
| 44 | imgs = load_images(in_PATH, verbose=True) | |
| 45 | np.save(out_PATH, imgs) | |
| 46 | ||
| 47 | # Masks and preprocesses an (m, IMAGE_SZ, IMAGE_SZ, 3) batch of images for image outpainting. | |
| 48 | # Returns: numpy array of size (m, IMAGE_SZ, IMAGE_SZ, 4) | |
| 49 | def preprocess_images_outpainting(imgs, crop=True): | |
| 50 | m = imgs.shape[0] | |
| 51 | imgs = np.array(imgs, copy=True) | |
| 52 | pix_avg = np.mean(imgs, axis=(1, 2, 3)) | |
| 53 | if crop: | |
| 54 | imgs[:, :, :int(2 * IMAGE_SZ / 8), :] = imgs[:, :, int(-2 * IMAGE_SZ / 8):, :] = pix_avg[:, np.newaxis, np.newaxis, np.newaxis] | |
| 55 | mask = np.zeros((m, IMAGE_SZ, IMAGE_SZ, 1)) | |
| 56 | mask[:, :, :int(2 * IMAGE_SZ / 8), :] = mask[:, :, int(-2 * IMAGE_SZ / 8):, :] = 1.0 | |
| 57 | imgs_p = np.concatenate((imgs, mask), axis=3) | |
| 58 | return imgs_p | |
| 59 | ||
| 60 | # Expands and preprocesses a single (h, w, 3) image for image outpainting. | |
| 61 | # Returns: numpy array of size (h, w + 2 * dw, 4) | |
| 62 | def preprocess_images_gen(img): | |
| 63 | img = np.array(img, copy=True) | |
| 64 | pix_avg = np.mean(img) | |
| 65 | dw = int(2 * IMAGE_SZ / 8) # Amount that will be outpainted on each side | |
| 66 | img_expand = np.ones((img.shape[0], img.shape[1] + 2 * dw, img.shape[2])) * pix_avg | |
| 67 | img_expand[:, dw:-dw, :] = img | |
| 68 | mask = np.zeros((img_expand.shape[0], img_expand.shape[1], 1)) | |
| 69 | mask[:, :int(2 * IMAGE_SZ / 8), :] = mask[:, int(-2 * IMAGE_SZ / 8):, :] = 1.0 | |
| 70 | img_p = np.concatenate((img_expand, mask), axis=2) | |
| 71 | return img_p[np.newaxis] | |
| 72 | ||
| 73 | # Renormalizes an image to [0, 255]. | |
| 74 | def norm_image(img_r): | |
| 75 | img_norm = (img_r * 255.0).astype(np.uint8) | |
| 76 | return img_norm | |
| 77 | ||
| 78 | # Visualize an image. | |
| 79 | def vis_image(img_r, mode='RGB'): | |
| 80 | img_norm = norm_image(img_r) | |
| 81 | img = Image.fromarray(img_norm, mode) | |
| 82 | img.show() | |
| 83 | ||
| 84 | # Save an image as a .png file. | |
| 85 | def save_image(img_r, name, mode='RGB'): | |
| 86 | img_norm = norm_image(img_r) | |
| 87 | img = Image.fromarray(img_norm, mode) | |
| 88 | img.save(name, format='PNG') | |
| 89 | ||
| 90 | # Sample a random minibatch from data. | |
| 91 | # Returns: Two numpy arrays, representing examples and their corresponding | |
| 92 | # preprocessed arrays. | |
| 93 | def sample_random_minibatch(data, data_p, m): | |
| 94 | indices = np.random.randint(0, data.shape[0], m) | |
| 95 | return data[indices], data_p[indices] | |
| 96 | ||
| 97 | # Plots the loss and saves the plot. | |
| 98 | def plot_loss(loss_filename, title, out_filename): | |
| 99 | loss = np.load(loss_filename) | |
| 100 | assert 'train_MSE_loss' in loss and 'dev_MSE_loss' in loss | |
| 101 | train_MSE_loss = loss['train_MSE_loss'] | |
| 102 | dev_MSE_loss = loss['dev_MSE_loss'] # TODO: Deal with dev_MSE_loss not changing during Phase 2 | |
| 103 | label_train, = plt.plot(train_MSE_loss[:, 0], train_MSE_loss[:, 1], label='Training MSE loss') | |
| 104 | label_dev, = plt.plot(dev_MSE_loss[:, 0], dev_MSE_loss[:, 1], label='Dev MSE loss') | |
| 105 | plt.legend(handles=[label_train, label_dev]) | |
| 106 | plt.xlabel('Iteration') | |
| 107 | plt.ylabel('MSE Loss') | |
| 108 | plt.title(title) | |
| 109 | plt.savefig(out_filename) | |
| 110 | plt.clf() | |
| 111 | ||
| 112 | # Plots the loss and saves the plot, but fancier. | |
| 113 | def plot_loss2(loss_filename, title, out_filename): | |
| 114 | loss = np.load(loss_filename) | |
| 115 | itrain_MSE_loss, train_MSE_loss = loss['itrain_MSE_loss'], loss['train_MSE_loss'] | |
| 116 | idev_MSE_loss, dev_MSE_loss = loss['idev_MSE_loss'], loss['dev_MSE_loss'] | |
| 117 | iG_loss, G_loss = loss['iG_loss'], loss['G_loss'] | |
| 118 | iD_loss, D_loss = loss['iD_loss'], loss['D_loss'] | |
| 119 | label_train, = plt.plot(itrain_MSE_loss, train_MSE_loss, label='Training MSE loss') | |
| 120 | label_dev, = plt.plot(idev_MSE_loss, dev_MSE_loss, label='Dev MSE loss') | |
| 121 | label_G, = plt.plot(iG_loss, G_loss, label='Generator loss') | |
| 122 | label_D, = plt.plot(iD_loss, D_loss, label='Discriminator loss') | |
| 123 | plt.legend(handles=[label_train, label_dev, label_G, label_D]) | |
| 124 | plt.xlabel('Iteration') | |
| 125 | plt.ylabel('Loss') | |
| 126 | plt.title(title) | |
| 127 | plt.savefig(out_filename) | |
| 128 | plt.clf() | |
| 129 | ||
| 130 | # Use seamless cloning to improve the generator's output. | |
| 131 | def postprocess_images_outpainting(img_PATH, img_o_PATH, out_PATH, blend=False): # img and img_o are (IMAGE_SZ, IMAGE_SZ, 3) images on disk | |
| 132 | src = cv2.imread(img_PATH)[:, int(2 * IMAGE_SZ / 8):-int(2 * IMAGE_SZ / 8), :] | |
| 133 | dst = cv2.imread(img_o_PATH) | |
| 134 | if blend: | |
| 135 | mask = np.ones(src.shape, src.dtype) * 255 | |
| 136 | center = (int(IMAGE_SZ / 2) - 1, int(IMAGE_SZ / 2) - 1) | |
| 137 | out = cv2.seamlessClone(src, dst, mask, center, cv2.NORMAL_CLONE) | |
| 138 | else: | |
| 139 | out = dst.copy() | |
| 140 | out[:, int(2 * IMAGE_SZ / 8):-int(2 * IMAGE_SZ / 8), :] = src | |
| 141 | cv2.imwrite(out_PATH, out) | |
| 142 | ||
| 143 | # Use seamless cloning to improve the generator's output. | |
| 144 | def postprocess_images_gen(img, img_o, blend=False): | |
| 145 | src = img[:, :, ::-1].copy() | |
| 146 | dst = img_o[:, :, ::-1].copy() | |
| 147 | if blend: | |
| 148 | mask = np.ones(src.shape, src.dtype) * 255 | |
| 149 | center = (int(dst.shape[1] / 2) - 1, int(dst.shape[0] / 2) - 1) | |
| 150 | out = cv2.seamlessClone(src, dst, mask, center, cv2.NORMAL_CLONE) | |
| 151 | else: | |
| 152 | out = dst.copy() | |
| 153 | out[:, int(2 * IMAGE_SZ / 8):-int(2 * IMAGE_SZ / 8), :] = src | |
| 154 | return out[:, :, ::-1].copy() | |
| 155 | ||
| 156 | # Crop and resize all the images in a directory. | |
| 157 | def resize_images(src_PATH, dst_PATH): | |
| 158 | for filename in os.listdir(src_PATH): | |
| 159 | print('Processing %s' % filename) | |
| 160 | full_filename = os.path.join(os.path.abspath(src_PATH), filename) | |
| 161 | img_raw = Image.open(full_filename).convert('RGB') | |
| 162 | w, h = img_raw.size | |
| 163 | if w <= h: | |
| 164 | dim = w | |
| 165 | y_start = int((h - dim) / 2) | |
| 166 | img_crop = img_raw.crop(box=(0, y_start, dim, y_start + dim)) | |
| 167 | else: # w > h | |
| 168 | dim = h | |
| 169 | x_start = int((w - dim) / 2) | |
| 170 | img_crop = img_raw.crop(box=(x_start, 0, x_start + dim, dim)) | |
| 171 | img_scale = img_crop.resize((IMAGE_SZ, IMAGE_SZ), Image.ANTIALIAS) | |
| 172 | full_outfilename = os.path.join(os.path.abspath(dst_PATH), filename) | |
| 173 | img_scale.save(full_outfilename, format='PNG') | |
| 174 | ||
| 175 | # Parse the output of train.py to extract the various losses. | |
| 176 | def parse_log(in_PATH, out_PATH): | |
| 177 | data = [] | |
| 178 | curr_list = [] | |
| 179 | with open(in_PATH, 'r') as fp: | |
| 180 | for i, line in enumerate(fp): | |
| 181 | if i == 0: | |
| 182 | continue | |
| 183 | line = line.strip() | |
| 184 | if line.startswith('----'): | |
| 185 | continue | |
| 186 | elif line.startswith('Model'): | |
| 187 | continue | |
| 188 | elif line.startswith('Iteration'): | |
| 189 | if len(curr_list): | |
| 190 | data.append(curr_list) | |
| 191 | curr_list = [] | |
| 192 | curr_list.append(line) | |
| 193 | else: | |
| 194 | curr_list.append(line) | |
| 195 | if len(curr_list): | |
| 196 | data.append(curr_list) | |
| 197 | G_MSE_train, G_MSE_dev, G, C = None, None, None, None | |
| 198 | G_MSE_train_s, G_MSE_dev_s, G_s, C_s = [], [], [], [] | |
| 199 | G_MSE_train_is, G_MSE_dev_is, G_is, C_is = [], [], [], [] | |
| 200 | def extract_loss(str): | |
| 201 | return float(re.findall('= ([\d, .]+)', str)[0]) | |
| 202 | for entry in data: | |
| 203 | i = int(re.findall('\[(\d+)/', entry[0])[0]) | |
| 204 | if len(entry) == 3: # Phase 1 | |
| 205 | G_MSE_train = extract_loss(entry[1]) | |
| 206 | G_MSE_dev = extract_loss(entry[2]) | |
| 207 | elif len(entry) == 2: # Phase 2 | |
| 208 | C = extract_loss(entry[1]) | |
| 209 | elif len(entry) == 5: # Phase 3 | |
| 210 | G_MSE_train = extract_loss(entry[1]) | |
| 211 | G_MSE_dev = extract_loss(entry[2]) | |
| 212 | G = extract_loss(entry[3]) | |
| 213 | C = extract_loss(entry[4]) | |
| 214 | if G_MSE_train is not None: | |
| 215 | G_MSE_train_s.append(G_MSE_train) | |
| 216 | G_MSE_train_is.append(i) | |
| 217 | if G_MSE_dev is not None: | |
| 218 | G_MSE_dev_s.append(G_MSE_dev) | |
| 219 | G_MSE_dev_is.append(i) | |
| 220 | if G is not None: | |
| 221 | G_s.append(G) | |
| 222 | G_is.append(i) | |
| 223 | if C is not None: | |
| 224 | C_s.append(C) | |
| 225 | C_is.append(i) | |
| 226 | G_MSE_train_sm = np.array(G_MSE_train_s) | |
| 227 | G_MSE_dev_sm = np.array(G_MSE_dev_s) | |
| 228 | G_sm = np.array(G_s) | |
| 229 | C_sm = np.array(C_s) | |
| 230 | G_MSE_train_ism = np.array(G_MSE_train_is) | |
| 231 | G_MSE_dev_ism = np.array(G_MSE_dev_is) | |
| 232 | G_ism = np.array(G_is) | |
| 233 | C_ism = np.array(C_is) | |
| 234 | np.savez(out_PATH, train_MSE_loss=G_MSE_train_sm, dev_MSE_loss=G_MSE_dev_sm, G_loss=G_sm, D_loss=C_sm, | |
| 235 | itrain_MSE_loss=G_MSE_train_ism, idev_MSE_loss=G_MSE_dev_ism, iG_loss=G_ism, iD_loss=C_ism) | |
| 236 | ||
| 237 | # Smoothes the MSE loss in the output loss file to make plotting easier. | |
| 238 | def smooth_MSE_loss(loss_file, window_size, outfile): | |
| 239 | losses = np.load(loss_file) | |
| 240 | train = losses['train_MSE_loss'] | |
| 241 | dev = losses['dev_MSE_loss'] | |
| 242 | num_train = train.shape[0] | |
| 243 | new_train_list = [] | |
| 244 | for i in range(0, num_train, window_size): | |
| 245 | window_avg = np.sum(train[i:i+window_size, 1]) / float(window_size) | |
| 246 | window_avg_val = np.sum(train[i:i+window_size, 0]) / float(window_size) | |
| 247 | new_train_list.append([window_avg_val, window_avg]) | |
| 248 | np_train = np.array(new_train_list[:-2]) | |
| 249 | np.savez(outfile, train_MSE_loss=np_train, dev_MSE_loss=dev) | |
| 250 | ||
| 251 | # Create a GIF to enable visualization of generator outputs over the course of training. | |
| 252 | def create_GIF(in_PATH, prefix, out_PATH): | |
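| | # Covers the Places365 run: N_ITERS = 227,500 with a frame saved every INTV_PRINT = 200 iterations. | |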
| 253 | indices = range(0, 227401, 200) | |
| 254 | images = [] | |
| 255 | for index in indices: | |
| 256 | full_filename = os.path.join(os.path.abspath(in_PATH), prefix + str(index) + '.png') | |
| 257 | try: | |
| 258 | images.append(imageio.imread(full_filename)) | |
| 259 | except FileNotFoundError: # not every index in the range has a saved frame | |
| 260 | continue | |
| 261 | images = images[:50] + images[50::10] + [images[-1]] | |
| 262 | imageio.mimwrite(out_PATH, images, loop=1, duration=0.1) | |
| 263 | ||
| 264 | # Compute the RMSE between a ground truth and outpainted image. | |
| 265 | def compute_RMSE(image_gt_PATH, image_o_PATH): | |
| 266 | im_gt = np.array(Image.open(image_gt_PATH).convert('RGB')).astype(np.float64) | |
| 267 | im_o = np.array(Image.open(image_o_PATH).convert('RGB')).astype(np.float64) | |
| 268 | assert im_gt.shape == (128, 128, 3) | |
| 269 | assert im_o.shape == (128, 128, 3) | |
| 270 | M = np.ones((128, 128, 3)) | |
| 271 | M[:, 32:96, :] = 0 | |
| 272 | num_pixels = 128 * 64 * 3 | |
| 273 | return np.sqrt(np.sum(((im_gt - im_o) * M) ** 2) / num_pixels) |