source code of “Visual Saliency Transformer” (ICCV2021)

Related tags

Deep LearningVST
Overview

Visual Saliency Transformer (VST)

source code for our ICCV 2021 paper “Visual Saliency Transformer” by Nian Liu, Ni Zhang, Kaiyuan Wan, Junwei Han, and Ling Shao.

created by Ni Zhang, email: [email protected]

avatar

Requirement

  1. Pytorch 1.6.0
  2. Torchvison 0.7.0

RGB VST for RGB Salient Object Detection

Data Preparation

Training Set

We use the training set of DUTS to train our VST for RGB SOD. Besides, we follow Egnet to generate contour maps of DUTS trainset for training. You can directly download the generated contour maps DUTS-TR-Contour from [baidu pan fetch code: ow76 | Google drive] and put it into RGB_VST/Data folder.

Testing Set

We use the testing set of DUTS, ECSSD, HKU-IS, PASCAL-S, DUT-O, and SOD to test our VST. After Downloading, put them into RGB_VST/Data folder.

Your RGB_VST/Data folder should look like this:

-- Data
   |-- DUTS
   |   |-- DUTS-TR
   |   |-- | DUTS-TR-Image
   |   |-- | DUTS-TR-Mask
   |   |-- | DUTS-TR-Contour
   |   |-- DUTS-TE
   |   |-- | DUTS-TE-Image
   |   |-- | DUTS-TE-Mask
   |-- ECSSD
   |   |--images
   |   |--GT
   ...

Training, Testing, and Evaluation

  1. cd RGB_VST
  2. Download the pretrained T2T-ViT_t-14 model [baidu pan fetch code: 2u34 | Google drive] and put it into pretrained_model/ folder.
  3. Run python train_test_eval.py --Training True --Testing True --Evaluation True for training, testing, and evaluation. The predictions will be in preds/ folder and the evaluation results will be in result.txt file.

Testing on Our Pretrained RGB VST Model

  1. cd RGB_VST
  2. Download our pretrained RGB_VST.pth[baidu pan fetch code: pe54 | Google drive] and then put it in checkpoint/ folder.
  3. Run python train_test_eval.py --Testing True --Evaluation True for testing and evaluation. The predictions will be in preds/ folder and the evaluation results will be in result.txt file.

Our saliency maps can be downloaded from [baidu pan fetch code: 92t0 | Google drive].

SOTA Saliency Maps for Comparison

The saliency maps of the state-of-the-art methods in our paper can be downloaded from [baidu pan fetch code: de4k | Google drive].

RGB-D VST for RGB-D Salient Object Detection

Data Preparation

Training Set

We use 1,485 images from NJUD, 700 images from NLPR, and 800 images from DUTLF-Depth to train our VST for RGB-D SOD. Besides, we follow Egnet to generate corresponding contour maps for training. You can directly download the whole training set from here [baidu pan fetch code: 7vsw | Google drive] and put it into RGBD_VST/Data folder.

Testing Set

NJUD [baidu pan fetch code: 7mrn | Google drive]
NLPR [baidu pan fetch code: tqqm | Google drive]
DUTLF-Depth [baidu pan fetch code: 9jac | Google drive]
STERE [baidu pan fetch code: 93hl | Google drive]
LFSD [baidu pan fetch code: l2g4 | Google drive]
RGBD135 [baidu pan fetch code: apzb | Google drive]
SSD [baidu pan fetch code: j3v0 | Google drive]
SIP [baidu pan fetch code: q0j5 | Google drive]
ReDWeb-S

After Downloading, put them into RGBD_VST/Data folder.

Your RGBD_VST/Data folder should look like this:

-- Data
   |-- NJUD
   |   |-- trainset
   |   |-- | RGB
   |   |-- | depth
   |   |-- | GT
   |   |-- | contour
   |   |-- testset
   |   |-- | RGB
   |   |-- | depth
   |   |-- | GT
   |-- STERE
   |   |-- RGB
   |   |-- depth
   |   |-- GT
   ...

Training, Testing, and Evaluation

  1. cd RGBD_VST
  2. Download the pretrained T2T-ViT_t-14 model [baidu pan fetch code: 2u34 | Google drive] and put it into pretrained_model/ folder.
  3. Run python train_test_eval.py --Training True --Testing True --Evaluation True for training, testing, and evaluation. The predictions will be in preds/ folder and the evaluation results will be in result.txt file.

Testing on Our Pretrained RGB-D VST Model

  1. cd RGBD_VST
  2. Download our pretrained RGBD_VST.pth[baidu pan fetch code: zt0v | Google drive] and then put it in checkpoint/ folder.
  3. Run python train_test_eval.py --Testing True --Evaluation True for testing and evaluation. The predictions will be in preds/ folder and the evaluation results will be in result.txt file.

Our saliency maps can be downloaded from [baidu pan fetch code: jovk | Google drive].

SOTA Saliency Maps for Comparison

The saliency maps of the state-of-the-art methods in our paper can be downloaded from [baidu pan fetch code: i1we | Google drive].

Acknowledgement

We thank the authors of Egnet for providing codes of generating contour maps. We also thank Zhao Zhang for providing the efficient evaluation tool.

Citation

If you think our work is helpful, please cite

@inproceedings{liu2021VST, 
  title={Visual Saliency Transformer}, 
  author={Liu, Nian and Zhang, Ni and Han, Junwei and Shao, Ling},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}
Owner
Ni Zhang PhD student
Torch-mutable-modules - Use in-place and assignment operations on PyTorch module parameters with support for autograd

Torch Mutable Modules Use in-place and assignment operations on PyTorch module p

Kento Nishi 7 Jun 06, 2022
Machine-in-the-Loop Rewriting for Creative Image Captioning

Machine-in-the-Loop Rewriting for Creative Image Captioning Data Annotated sources of data used in the paper: Data Source URL Mohammed et al. Link Gor

Vishakh P 6 Jul 24, 2022
Course on computational design, non-linear optimization, and dynamics of soft systems at UIUC.

Computational Design and Dynamics of Soft Systems · This is a repository that contains the source code for generating the lecture notes, handouts, exe

Tejaswin Parthasarathy 4 Jul 21, 2022
An implementation of the WHATWG URL Standard in JavaScript

whatwg-url whatwg-url is a full implementation of the WHATWG URL Standard. It can be used standalone, but it also exposes a lot of the internal algori

314 Dec 28, 2022
Efficient Online Bayesian Inference for Neural Bandits

Efficient Online Bayesian Inference for Neural Bandits By Gerardo Durán-Martín, Aleyna Kara, and Kevin Murphy AISTATS 2022.

Probabilistic machine learning 49 Dec 27, 2022
Sarus implementation of classical ML models. The models are implemented using the Keras API of tensorflow 2. Vizualization are implemented and can be seen in tensorboard.

Sarus published models Sarus implementation of classical ML models. The models are implemented using the Keras API of tensorflow 2. Vizualization are

Sarus Technologies 39 Aug 19, 2022
Weakly Supervised End-to-End Learning (NeurIPS 2021)

WeaSEL: Weakly Supervised End-to-end Learning This is a PyTorch-Lightning-based framework, based on our End-to-End Weak Supervision paper (NeurIPS 202

Auton Lab, Carnegie Mellon University 131 Jan 06, 2023
The official implementation of CVPR 2021 Paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.

Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation This repository is the official implementation of CVPR 2021 paper:

9 Nov 14, 2022
A pytorch-version implementation codes of paper: "BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation"

BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation A pytorch-version implementation

11 Oct 08, 2022
OMAMO: orthology-based model organism selection

OMAMO: orthology-based model organism selection OMAMO is a tool that suggests the best model organism to study a biological process based on orthologo

Dessimoz Lab 5 Apr 22, 2022
A programming language written with python

Kaoft A programming language written with python How to use A simple Hello World: c="Hello World" c Output: "Hello World" Operators: a=12

1 Jan 24, 2022
LLVM-based compiler for LightGBM gradient-boosted trees. Speeds up prediction by ≥10x.

LLVM-based compiler for LightGBM gradient-boosted trees. Speeds up prediction by ≥10x.

Simon Boehm 183 Jan 02, 2023
ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs

(Comet-) ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs Paper Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jeff Da, Keisuke Sa

AI2 152 Dec 27, 2022
This is the paddle code for SeBoW(Self-Born wiring for neural trees), a kind of neural tree born form a large search space

SeBoW: Self-Born Wiring for neural trees(PaddlePaddle version) This is the paddle code for SeBoW(Self-Born wiring for neural trees), a kind of neural

HollyLee 13 Dec 08, 2022
FeTaQA: Free-form Table Question Answering

FeTaQA: Free-form Table Question Answering FeTaQA is a Free-form Table Question Answering dataset with 10K Wikipedia-based {table, question, free-form

Language, Information, and Learning at Yale 40 Dec 13, 2022
This is the pytorch code for the paper Curious Representation Learning for Embodied Intelligence.

Curious Representation Learning for Embodied Intelligence This is the pytorch code for the paper Curious Representation Learning for Embodied Intellig

19 Oct 19, 2022
PyTorch Code for "Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning"

Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning [Project Page] [Paper] Wenlong Huang1, Igor Mordatch2, Pieter Abbeel1,

Wenlong Huang 40 Nov 22, 2022
Reduce end to end training time from days to hours (or hours to minutes), and energy requirements/costs by an order of magnitude using coresets and data selection.

COResets and Data Subset selection Reduce end to end training time from days to hours (or hours to minutes), and energy requirements/costs by an order

decile-team 244 Jan 09, 2023
Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks

Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks (SDPoint) This repository contains the cod

Jason Kuen 17 Jul 04, 2022