Implementation of the pix2pix model on satellite images

Overview

This repo shows how to implement and use the pix2pix GAN model for image-to-image translation. The model is demonstrated on satellite images, where the goal is to convert satellite images into map images.


The Model

The pix2pix model is composed of a generator and a discriminator. The generator converts the original image into a new image that resembles the target image; in our case, it converts a satellite image into a street map image. The discriminator's goal is to detect which images are generated and which are actual target images. In this way the generator and discriminator compete with each other, resulting in a model that learns the mapping from the input satellite images to the street map images.


Generator architecture:

The input image is fed into the generator, which is a U-Net convolutional model. The U-Net is composed of an encoder and a decoder with skip connections between them. The U-Net architecture is described in the following image:

[Image: U-Net architecture]

The encoder is composed of several convolution layers that shrink the input image down to its basic features. The decoder then reconstructs an image at the original size using transposed convolution layers. A skip connection links each encoder layer to the corresponding decoder layer in order to preserve more information from the original image. The idea behind this architecture is intuitive: we want to transform a satellite image into a street map image, so the output is a different image that keeps the basic structure of the input. The U-Net encoder-decoder allows us to achieve that.
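The encoder, decoder, and skip connections described above can be sketched as a very small U-Net in PyTorch. This is a minimal illustration, not the repo's actual generator: the class names (`Down`, `Up`, `TinyUNet`), the depth of three levels, the channel widths, and the `Tanh` output are all assumptions chosen to keep the example short.

```python
import torch
import torch.nn as nn


class Down(nn.Module):
    """Encoder step: a strided convolution that halves height and width."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
        )

    def forward(self, x):
        return self.block(x)


class Up(nn.Module):
    """Decoder step: a transposed convolution that doubles height and width."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.block(x)


class TinyUNet(nn.Module):
    """Hypothetical three-level U-Net; sizes are illustrative assumptions."""

    def __init__(self, channels=3, base=16):
        super().__init__()
        self.down1 = Down(channels, base)
        self.down2 = Down(base, base * 2)
        self.down3 = Down(base * 2, base * 4)
        self.up1 = Up(base * 4, base * 2)
        # Decoder inputs are wider because skip features are concatenated.
        self.up2 = Up(base * 4, base)
        self.final = nn.Sequential(
            nn.ConvTranspose2d(base * 2, channels, kernel_size=4, stride=2, padding=1),
            nn.Tanh(),  # outputs in [-1, 1]
        )

    def forward(self, x):
        d1 = self.down1(x)   # H/2
        d2 = self.down2(d1)  # H/4
        d3 = self.down3(d2)  # H/8 (bottleneck)
        u1 = self.up1(d3)    # back to H/4
        u2 = self.up2(torch.cat([u1, d2], dim=1))      # skip from d2, back to H/2
        return self.final(torch.cat([u2, d1], dim=1))  # skip from d1, back to H


generated = TinyUNet()(torch.randn(2, 3, 64, 64))
print(generated.shape)
```

The skip connections are the two `torch.cat` calls: each decoder level sees both the upsampled features and the encoder features of the same resolution, which is what lets the output keep the layout of the input.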


Discriminator architecture:

The discriminator receives an image and shrinks it with several convolution layers, each of which reduces the image to a smaller size. The output is a smaller image, in our case 30x30x1. Each output pixel summarizes one patch of the input as a value between 0 and 1, which represents the probability that the patch comes from a real target image. This method of splitting the image into patches in order to decide whether it is real or fake is called "PatchGAN". Judging patches gives better results than reducing the whole image to a single output, as in the original GAN.
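A PatchGAN discriminator of this kind might look like the sketch below. The layer configuration, the 256x256 input size, and the choice to concatenate the source image with the candidate image (as in the original pix2pix paper) are assumptions, not details confirmed by this repo. The network returns raw scores (logits); applying a sigmoid turns them into the 0 to 1 probabilities described above.

```python
import torch
import torch.nn as nn


class PatchDiscriminator(nn.Module):
    """Hypothetical PatchGAN: maps a 256x256 image pair to a 30x30 score map."""

    def __init__(self, in_channels=6):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=4, stride=2, padding=1),  # 256 -> 128
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),          # 128 -> 64
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1),         # 64 -> 32
            nn.BatchNorm2d(256),
            nn.LeakyReLU(0.2),
            nn.Conv2d(256, 512, kernel_size=4, stride=1, padding=1),         # 32 -> 31
            nn.BatchNorm2d(512),
            nn.LeakyReLU(0.2),
            nn.Conv2d(512, 1, kernel_size=4, stride=1, padding=1),           # 31 -> 30
        )

    def forward(self, source, candidate):
        # Condition on the source image by stacking it with the candidate.
        return self.net(torch.cat([source, candidate], dim=1))


disc = PatchDiscriminator()
logits = disc(torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256))
patch_probs = torch.sigmoid(logits)  # one probability per image patch
print(patch_probs.shape)
```

Each of the 30x30 outputs depends only on a limited window of the input, so the real-or-fake decision is made patch by patch rather than once for the whole image.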


The Loss Function

We use two losses: one for the generator and one for the discriminator.

The generator loss has two jobs. On the one hand, it tries to "fool" the discriminator into predicting that the generated image is real; on the other hand, it pushes the output image to be close to the target image. Therefore, the first part of the loss is a binary cross-entropy loss on the discriminator output for the generated images, with labels of 1. This part is responsible for "tricking" the discriminator. The second part is an L1 loss, which makes the output similar to the target.
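As a rough sketch, the two-part generator loss could be written as follows. Two things here are assumptions rather than details from this repo: the discriminator is taken to return raw logits, so `nn.BCEWithLogitsLoss` (which applies the sigmoid internally) stands in for plain binary cross-entropy, and the L1 term is weighted by `l1_weight=100.0`, the value used in the original pix2pix paper.

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()  # sigmoid + binary cross-entropy in one call
l1 = nn.L1Loss()


def generator_loss(disc_fake_logits, generated, target, l1_weight=100.0):
    """Adversarial term (labels of 1) plus a weighted L1 reconstruction term."""
    # Reward the generator when the discriminator scores its output as real.
    adversarial = bce(disc_fake_logits, torch.ones_like(disc_fake_logits))
    # Keep the generated image close to the target image.
    reconstruction = l1(generated, target)
    return adversarial + l1_weight * reconstruction


# Usage with placeholder tensors in place of real model outputs.
fake_logits = torch.zeros(1, 1, 30, 30)
generated = torch.zeros(1, 3, 256, 256)
target = torch.zeros(1, 3, 256, 256)
print(generator_loss(fake_logits, generated, target).item())
```

The weight on the L1 term controls the trade-off: a larger value favors outputs that match the target pixel by pixel, while a smaller value gives the adversarial term more influence.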

The discriminator loss also combines two parts. The first part pushes the discriminator to predict values close to 1 for all images that come from the true targets, and the second part pushes it to predict values close to 0 for all images that come from the generator. Both parts use binary cross-entropy loss.
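The discriminator loss can be sketched in the same style. As an assumption, the discriminator outputs are treated as raw logits and `nn.BCEWithLogitsLoss` is used in place of plain binary cross-entropy; the function name and the unweighted sum of the two parts are illustrative choices.

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()  # sigmoid + binary cross-entropy in one call


def discriminator_loss(disc_real_logits, disc_fake_logits):
    """Real patches are labeled 1, generated patches are labeled 0."""
    real_loss = bce(disc_real_logits, torch.ones_like(disc_real_logits))
    fake_loss = bce(disc_fake_logits, torch.zeros_like(disc_fake_logits))
    return real_loss + fake_loss


# A confident, correct discriminator gets a loss close to zero.
real_logits = torch.full((1, 1, 30, 30), 10.0)
fake_logits = torch.full((1, 1, 30, 30), -10.0)
print(discriminator_loss(real_logits, fake_logits).item())
```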


Data Preparation

The dataset contains combined images, each holding a satellite image and its corresponding street map image. We split each combined image into two: the input image (the satellite image) and the target image (the street map image). We then load the images into a PyTorch DataLoader to make training more efficient. This is what a random input and target image look like:

[Image: a random input image and its target image]
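The splitting step can be sketched as follows. The example assumes that the two images sit side by side with the satellite image in the left half and the street map in the right half; the source does not state the layout, so swap the halves if your data differs. `split_pair` and `PairedImages` are hypothetical names, and random tensors stand in for images loaded from disk.

```python
import torch
from torch.utils.data import DataLoader, Dataset


def split_pair(combined):
    """Split a (C, H, 2W) tensor into left and right halves of shape (C, H, W)."""
    width = combined.shape[-1] // 2
    return combined[..., :width], combined[..., width:]


class PairedImages(Dataset):
    """Wraps combined images and yields (input, target) pairs."""

    def __init__(self, combined_images):
        self.combined_images = combined_images

    def __len__(self):
        return len(self.combined_images)

    def __getitem__(self, index):
        # Assumed layout: left half is the satellite image, right half the map.
        return split_pair(self.combined_images[index])


# Random data in place of real files: 8 combined images of size 64x128.
combined = torch.rand(8, 3, 64, 128)
loader = DataLoader(PairedImages(combined), batch_size=4, shuffle=True)

satellite, street_map = next(iter(loader))
print(satellite.shape, street_map.shape)
```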


Results

We feed the data into the models and run the training loop.
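One iteration of such a training loop might look like the sketch below, which alternates a discriminator update and a generator update. Everything specific here is an assumption for illustration: the one-layer stand-in networks, the Adam settings (taken from the original pix2pix paper), the `l1_weight` of 100, and the use of `nn.BCEWithLogitsLoss` on raw discriminator scores.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny stand-in networks so the example runs quickly; real models are deeper.
generator = nn.Sequential(nn.Conv2d(3, 3, kernel_size=3, padding=1), nn.Tanh())
discriminator = nn.Sequential(nn.Conv2d(6, 1, kernel_size=4, stride=2, padding=1))

bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))


def train_step(source, target, l1_weight=100.0):
    """Run one discriminator update followed by one generator update."""
    fake = generator(source)

    # Discriminator: real pairs labeled 1, generated pairs labeled 0.
    real_logits = discriminator(torch.cat([source, target], dim=1))
    # detach() keeps this loss from sending gradients into the generator.
    fake_logits = discriminator(torch.cat([source, fake.detach()], dim=1))
    d_loss = bce(real_logits, torch.ones_like(real_logits)) + bce(
        fake_logits, torch.zeros_like(fake_logits)
    )
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator: fool the updated discriminator and stay close to the target.
    fake_logits = discriminator(torch.cat([source, fake], dim=1))
    g_loss = bce(fake_logits, torch.ones_like(fake_logits)) + l1_weight * l1(fake, target)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

    return d_loss.item(), g_loss.item()


d_loss, g_loss = train_step(torch.rand(2, 3, 16, 16), torch.rand(2, 3, 16, 16))
print(d_loss, g_loss)
```

In a full run, `train_step` would be called for every batch from the DataLoader, and the whole pass repeated for each epoch.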

After 100 epochs, we get results that are very similar to the target images. All of the following examples are taken from the test dataset, which the model was not trained on.

Here are some of the results:

[Images: three example results]

Summary

The model worked well and was able to generate images that are very similar to the target images. It also generalized well to the test set.
