No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency

Last update: Dec 30, 2022

Overview

This repository contains the implementation for the paper:

No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency (WACV 2022) Video

Create Environment

This code was trained and tested on Ubuntu 16.04 using Anaconda, Python 3.6.6, and PyTorch 1.8.0. To set up the environment, run:

    conda env create -f environment.yml

After installing the virtual environment, you should be able to run

    python -c "import torch; print(torch.__version__)"

in the terminal and see 1.8.0.
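
The same check can be scripted. The following is a minimal sketch, not part of the repository: the helper names `base_version` and `installed_torch_version` are made up for illustration. It assumes that some PyTorch builds report a local suffix (for example `1.8.0+cu111`), so it compares only the part before the `+`.

```python
import importlib

EXPECTED_TORCH = "1.8.0"  # version stated in this README


def base_version(version):
    """Strip a local build suffix such as '+cu111' from a version string."""
    return version.split("+", 1)[0]


def installed_torch_version():
    """Return torch.__version__, or None if PyTorch cannot be imported."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return None
    return str(torch.__version__)


if __name__ == "__main__":
    found = installed_torch_version()
    if found is None:
        print("PyTorch is not installed in this environment")
    elif base_version(found) == EXPECTED_TORCH:
        print("OK: PyTorch", found)
    else:
        print("Expected PyTorch", EXPECTED_TORCH, "but found", found)
```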

Datasets

In this work we use 7 datasets for evaluation (LIVE, CSIQ, TID2013, KADID10K, CLIVE, KonIQ, LIVEFB).

To start training, please make sure to follow the correct folder structure for each of the aforementioned datasets, as provided below:

LIVE

live
    |--fastfading
    |    |  ...     
    |--blur
    |    |  ... 
    |--jp2k
    |    |  ...     
    |--jpeg
    |    |  ...     
    |--wn
    |    |  ...     
    |--refimgs
    |    |  ...     
    |--dmos.mat
    |--dmos_realigned.mat
    |--refnames_all.mat
    |--readme.txt

CSIQ

csiq
    |--dst_imgs_all
    |    |--1600.AWGN.1.png
    |    |  ... (you need to put all the distorted images here)
    |--src_imgs
    |    |--1600.png
    |    |  ...
    |--csiq.DMOS.xlsx
    |--csiq_label.txt

TID2013

tid2013
    |--distorted_images
    |--reference_images
    |--mos.txt
    |--mos_std.txt
    |--mos_with_names.txt
    |--readme

KADID10K

kadid10k
    |--distorted_images
    |    |--I01_01_01.png
    |    |  ...    
    |--reference_images
    |    |--I01.png
    |    |  ...    
    |--dmos.csv
    |--mv.sh.save
    |--mvv.sh

CLIVE

clive
    |--Data
    |    |--I01_01_01.png
    |    |  ...    
    |--Images
    |    |--I01.png
    |    |  ...    
    |--ChallengeDB_release
    |    |--README.txt
    |--dmos.csv
    |--mv.sh.save
    |--mvv.sh

KonIQ

fblive
   |--1024x768
   |    |  992920521.jpg 
   |    |  ... (all the images should be here)     
   |--koniq10k_scores_and_distributions.csv

LIVEFB

fblive
   |--FLIVE
   |    |  AVA__149.jpg    
   |    |  ... (all the images should be here)     
   |--labels_image.csv
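
Before training, it may help to confirm that a dataset folder matches the layout above. The sketch below is not part of the repository: `EXPECTED_ENTRIES` and `missing_entries` are hypothetical names, the dictionary keys are labels chosen for this example, and the entries are copied from the trees above for a subset of the datasets. Which of these files the training code actually reads is not stated here, so treat a reported gap as a hint rather than an error.

```python
from pathlib import Path

# Top-level entries per dataset, copied from the folder trees above.
# The dictionary keys are labels chosen for this sketch only.
EXPECTED_ENTRIES = {
    "LIVE": ["fastfading", "blur", "jp2k", "jpeg", "wn", "refimgs",
             "dmos.mat", "dmos_realigned.mat", "refnames_all.mat"],
    "CSIQ": ["dst_imgs_all", "src_imgs", "csiq.DMOS.xlsx", "csiq_label.txt"],
    "TID2013": ["distorted_images", "reference_images", "mos_with_names.txt"],
    "KADID10K": ["distorted_images", "reference_images", "dmos.csv"],
    "KonIQ": ["1024x768", "koniq10k_scores_and_distributions.csv"],
    "LIVEFB": ["FLIVE", "labels_image.csv"],
}


def missing_entries(dataset, root):
    """Return the expected entries that do not exist under `root`."""
    base = Path(root)
    return [name for name in EXPECTED_ENTRIES[dataset]
            if not (base / name).exists()]


if __name__ == "__main__":
    # Hypothetical path; replace with wherever the dataset was extracted.
    print(missing_entries("LIVEFB", "/path/to/fblive"))
```

An empty list means every listed entry was found under the given root.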

Training

The training scripts are provided in run.sh. Please change the paths accordingly. Please note that to achieve the same performance, the parameters should match the ones in the run.sh file.

Pretrained models

The pretrained models are provided here.

Acknowledgement

Parts of this code are borrowed from HyperIQA and DETR.

FAQs

What is the difference between self-consistency and ensembling? Will self-consistency increase the inference time?

In ensembling methods, we need several models (with different initializations) and ensemble their results during training and testing. In our self-consistency model, we instead enforce a single network to produce consistent outputs during training when its input undergoes different transformations. At test time, our self-consistency model has the same inference time and number of parameters as the model without self-consistency. In other words, we do not add any new parameters to the network, and inference is unaffected.

What is the difference between self-consistency and augmentation?

In augmentation, we augment an input and send it to one network, so although the network becomes robust to different augmentations, it never has the chance to enforce the same output for different versions of an input at the same time. In our self-consistency approach, we force the network to produce a similar output for an image under a different transformation (in our case, horizontal flipping), which leads to more robust performance. Please also note that we still use augmentation during training, so our model benefits from both augmentation and self-consistency. Also, please see Fig. 1 in the main paper, where we show that models that use augmentation alone are sensitive to simple transformations.
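
To make the distinction concrete, here is a toy sketch of a consistency penalty in plain Python. It is an illustration under simplifying assumptions, not the loss implemented in this repository: images are nested lists, the "models" are hypothetical scalar functions, and the penalty is the mean absolute difference between the prediction for an image and for its horizontally flipped copy. The key point is that both versions of the same input are evaluated together, which plain augmentation never does.

```python
def hflip(image):
    """Horizontally flip an image given as a list of pixel rows."""
    return [row[::-1] for row in image]


def self_consistency_penalty(model, images):
    """Mean |model(x) - model(hflip(x))| over a batch of images."""
    total = 0.0
    for image in images:
        total += abs(model(image) - model(hflip(image)))
    return total / len(images)


def symmetric_model(image):
    """Hypothetical model whose output ignores left-right layout."""
    return sum(sum(row) for row in image)


def left_weighted_model(image):
    """Hypothetical model that weights pixels on the left more heavily."""
    return sum(value * (len(row) - col)
               for row in image
               for col, value in enumerate(row))


if __name__ == "__main__":
    batch = [[[1, 2, 3], [4, 5, 6]]]
    print(self_consistency_penalty(symmetric_model, batch))
    print(self_consistency_penalty(left_weighted_model, batch))
```

In training, a term of this kind would be added to the quality regression loss; at test time only the single forward pass remains, which is why inference cost does not change.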

Why is the relative ranking loss applied only to the samples with the highest and lowest quality scores? Why not apply it to all the samples?

1) We did not see a significant improvement from applying our ranking loss to all the samples within each batch compared to using only the extreme cases. 2) Considering more samples leads to more gradient back-propagation and therefore more computation, which slows training.
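
The idea of restricting the ranking term to the extremes of a batch can be sketched as follows. This is a simplified triplet-style hinge written for illustration and should not be read as the exact formulation used in the paper or in this code: the function name `extreme_ranking_loss` is hypothetical, and taking the margins from gaps in the ground-truth scores is an assumption of this sketch.

```python
def extreme_ranking_loss(pred, target):
    """Triplet-style hinge on the best, second-best, worst, and
    second-worst samples of a batch, ranked by ground-truth score."""
    if len(pred) != len(target) or len(target) < 3:
        raise ValueError("need matching lists with at least 3 samples")

    order = sorted(range(len(target)), key=lambda i: target[i])
    i_min, i_min2 = order[0], order[1]
    i_max2, i_max = order[-2], order[-1]

    def dist(a, b):
        # Distance between two predicted scores.
        return abs(pred[a] - pred[b])

    # Margins taken from ground-truth gaps (an assumption of this sketch).
    margin_hi = target[i_max2] - target[i_min]
    margin_lo = target[i_max] - target[i_min2]

    # The best sample should sit closer to the second best than to the worst.
    loss_hi = max(0.0, dist(i_max, i_max2) - dist(i_max, i_min) + margin_hi)
    # The worst sample should sit closer to the second worst than to the best.
    loss_lo = max(0.0, dist(i_min, i_min2) - dist(i_min, i_max) + margin_lo)
    return loss_hi + loss_lo


if __name__ == "__main__":
    scores = [1.0, 2.0, 3.0, 4.0]
    print(extreme_ranking_loss(scores, scores))                # correct ranking
    print(extreme_ranking_loss([2.5, 2.5, 2.5, 2.5], scores))  # collapsed predictions
```

Only four samples per batch contribute to this term regardless of batch size, which reflects the computational argument in point 2 above.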

Citation

If you find this work useful for your research, please cite our paper:

@InProceedings{golestaneh2021no,
  title={No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency},
  author={Golestaneh, S Alireza and Dadsetan, Saba and Kitani, Kris M},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={3209--3218},
  year={2022}
}

If you have any questions about our work, please do not hesitate to contact [email protected]

Owner

Alireza Golestaneh
