Pre-Trained Image Processing Transformer (IPT)

Overview

Pre-Trained Image Processing Transformer (IPT)

By Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao. [arXiv]

We study the low-level computer vision task (such as denoising, super-resolution and deraining) and develop a new pre-trained model, namely, image processing transformer (IPT). We present to utilize the well-known ImageNet benchmark for generating a large amount of corrupted image pairs. The IPT model is trained on these images with multi-heads and multi-tails. The pre-trained model can therefore efficiently employed on desired task after fine-tuning. With only one pre-trained model, IPT outperforms the current state-of-the-art methods on various low-level benchmarks.

MindSpore Code

Requirements

  • python 3
  • pytorch == 1.4.0
  • torchvision

Dataset

The benchmark datasets can be downloaded as follows:

For super-resolution:

Set5, Set14, B100, Urban100.

For denoising:

CBSD68, Urban100.

For deraining:

Rain100L.

The result images are converted into YCbCr color space. The PSNR is evaluated on the Y channel only.

Script Description

This is the inference script of IPT, you can following steps to finish the test of image processing tasks, like SR, denoise and derain, via the corresponding pretrained models.

Script Parameter

For details about hyperparameters, see option.py.

Evaluation

Pretrained models

The pretrained models are available in google drive

Evaluation Process

Inference example: For SR x2,x3,x4:

python main.py --dir_data $DATA_PATH --pretrain $MODEL_PATH --data_test Set5+Set14+B100+Urban100 --scale $SCALE

For Denoise 30,50:

python main.py --dir_data $DATA_PATH --pretrain $MODEL_PATH --data_test CBSD68+Urban100 --scale 1 --denoise --sigma $NOISY_LEVEL

For derain:

python main.py --dir_data $DATA_PATH --pretrain $MODEL_PATH --scale 1 --derain

Results

  • Detailed results on image super-resolution task.
Method Scale Set5 Set14 B100 Urban100
VDSR X2 37.53 33.05 31.90 30.77
EDSR X2 38.11 33.92 32.32 32.93
RCAN X2 38.27 34.12 32.41 33.34
RDN X2 38.24 34.01 32.34 32.89
OISR-RK3 X2 38.21 33.94 32.36 33.03
RNAN X2 38.17 33.87 32.32 32.73
SAN X2 38.31 34.07 32.42 33.1
HAN X2 38.27 34.16 32.41 33.35
IGNN X2 38.24 34.07 32.41 33.23
IPT (ours) X2 38.37 34.43 32.48 33.76
Method Scale Set5 Set14 B100 Urban100
VDSR X3 33.67 29.78 28.83 27.14
EDSR X3 34.65 30.52 29.25 28.80
RCAN X3 34.74 30.65 29.32 29.09
RDN X3 34.71 30.57 29.26 28.80
OISR-RK3 X3 34.72 30.57 29.29 28.95
RNAN X3 34.66 30.52 29.26 28.75
SAN X3 34.75 30.59 29.33 28.93
HAN X3 34.75 30.67 29.32 29.10
IGNN X3 34.72 30.66 29.31 29.03
IPT (ours) X3 34.81 30.85 29.38 29.49
Method Scale Set5 Set14 B100 Urban100
VDSR X4 31.35 28.02 27.29 25.18
EDSR X4 32.46 28.80 27.71 26.64
RCAN X4 32.63 28.87 27.77 26.82
SAN X4 32.64 28.92 27.78 26.79
RDN X4 32.47 28.81 27.72 26.61
OISR-RK3 X4 32.53 28.86 27.75 26.79
RNAN X4 32.49 28.83 27.72 26.61
HAN X4 32.64 28.90 27.80 26.85
IGNN X4 32.57 28.85 27.77 26.84
IPT (ours) X4 32.64 29.01 27.82 27.26
  • Super-resolution result

  • Denoising result

  • Derain result

Citation

@misc{chen2020pre,
      title={Pre-Trained Image Processing Transformer}, 
      author={Chen, Hanting and Wang, Yunhe and Guo, Tianyu and Xu, Chang and Deng, Yiping and Liu, Zhenhua and Ma, Siwei and Xu, Chunjing and Xu, Chao and Gao, Wen},
      year={2021},
      eprint={2012.00364},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

Owner
HUAWEI Noah's Ark Lab
Working with and contributing to the open source community in data mining, artificial intelligence, and related fields.
HUAWEI Noah's Ark Lab
Tutorial repo for an end-to-end Data Science project

End-to-end Data Science project This is the repo with the notebooks, code, and additional material used in the ITI's workshop. The goal of the session

Deena Gergis 127 Dec 30, 2022
Novel Instances Mining with Pseudo-Margin Evaluation for Few-Shot Object Detection

Novel Instances Mining with Pseudo-Margin Evaluation for Few-Shot Object Detection (NimPme) The official implementation of Novel Instances Mining with

12 Sep 08, 2022
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

Microsoft 8.4k Jan 01, 2023
Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression", TIP 2020

Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multil

Xuefeng 5 Jan 15, 2022
Adversarial Autoencoders

Adversarial Autoencoders (with Pytorch) Dependencies argparse time torch torchvision numpy itertools matplotlib Create Datasets python create_datasets

Felipe Ducau 188 Jan 01, 2023
Pytorch implementations of Bayes By Backprop, MC Dropout, SGLD, the Local Reparametrization Trick, KF-Laplace, SG-HMC and more

Bayesian Neural Networks Pytorch implementations for the following approximate inference methods: Bayes by Backprop Bayes by Backprop + Local Reparame

1.4k Jan 07, 2023
Keras Implementation of Neural Style Transfer from the paper "A Neural Algorithm of Artistic Style"

Neural Style Transfer & Neural Doodles Implementation of Neural Style Transfer from the paper A Neural Algorithm of Artistic Style in Keras 2.0+ INetw

Somshubra Majumdar 2.2k Dec 31, 2022
This is the repo for the paper `SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization'. (published in Bioinformatics'21)

SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization This is the code for our paper ``SumGNN: Multi-typed Drug

Yue Yu 58 Dec 21, 2022
Emotional conditioned music generation using transformer-based model.

This is the official repository of EMOPIA: A Multi-Modal Pop Piano Dataset For Emotion Recognition and Emotion-based Music Generation. The paper has b

hung anna 96 Nov 09, 2022
RM Operation can equivalently convert ResNet to VGG, which is better for pruning; and can help RepVGG perform better when the depth is large.

RM Operation can equivalently convert ResNet to VGG, which is better for pruning; and can help RepVGG perform better when the depth is large.

184 Jan 04, 2023
OpenL3: Open-source deep audio and image embeddings

OpenL3 OpenL3 is an open-source Python library for computing deep audio and image embeddings. Please refer to the documentation for detailed instructi

Music and Audio Research Laboratory - NYU 326 Jan 02, 2023
Hl classification bc - A Network-Based High-Level Data Classification Algorithm Using Betweenness Centrality

A Network-Based High-Level Data Classification Algorithm Using Betweenness Centr

Esteban Vilca 3 Dec 01, 2022
Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.

Unsupervised Phone and Word Segmentation using Vector-Quantized Neural Networks Overview Unsupervised phone and word segmentation on speech data is pe

Herman Kamper 13 Dec 11, 2022
Official PyTorch implementation of "Adversarial Reciprocal Points Learning for Open Set Recognition"

Adversarial Reciprocal Points Learning for Open Set Recognition Official PyTorch implementation of "Adversarial Reciprocal Points Learning for Open Se

Guangyao Chen 78 Dec 28, 2022
IsoGCN code for ICLR2021

IsoGCN The official implementation of IsoGCN, presented in the ICLR2021 paper Isometric Transformation Invariant and Equivariant Graph Convolutional N

horiem 39 Nov 25, 2022
End-To-End Crowdsourcing

End-To-End Crowdsourcing Comparison of traditional crowdsourcing approaches to a state-of-the-art end-to-end crowdsourcing approach LTNet on sentiment

Andreas Koch 1 Mar 06, 2022
Use CLIP to represent video for Retrieval Task

A Straightforward Framework For Video Retrieval Using CLIP This repository contains the basic code for feature extraction and replication of results.

Jesus Andres Portillo Quintero 54 Dec 22, 2022
Objax Apache-2Objax (🥉19 · ⭐ 580) - Objax is a machine learning framework that provides an Object.. Apache-2 jax

Objax Tutorials | Install | Documentation | Philosophy This is not an officially supported Google product. Objax is an open source machine learning fr

Google 729 Jan 02, 2023
I tried to apply the CAM algorithm to YOLOv4 and it worked.

YOLOV4:You Only Look Once目标检测模型在pytorch当中的实现 2021年2月7日更新: 加入letterbox_image的选项,关闭letterbox_image后网络的map得到大幅度提升。 目录 性能情况 Performance 实现的内容 Achievement

55 Dec 05, 2022
Code for "Learning Graph Cellular Automata"

Learning Graph Cellular Automata This code implements the experiments from the NeurIPS 2021 paper: "Learning Graph Cellular Automata" Daniele Grattaro

Daniele Grattarola 37 Oct 26, 2022