Code for the CVPR 2021 paper "Triple-cooperative Video Shadow Detection"

Last update: Oct 04, 2022

Related tags

Overview

Triple-cooperative Video Shadow Detection

Code and dataset for the CVPR 2021 paper "Triple-cooperative Video Shadow Detection"[arXiv link] [official link].
by Zhihao Chen¹, Liang Wan¹, Lei Zhu², Jia Shen¹, Huazhu Fu³, Wennan Liu⁴, and Jing Qin⁵
¹College of Intelligence and Computing, Tianjin University
²Department of Applied Mathematics and Theoretical Physics, University of Cambridge
³Inception Institute of Artiﬁcial Intelligence, UAE
⁴Academy of Medical Engineering and Translational Medicine, Tianjin University
⁵The Hong Kong Polytechnic University

News: In 2021.4.7, We first release the code of TVSD and ViSha dataset.

Citation

@inproceedings{chen21TVSD,
     author = {Chen, Zhihao and Wan, Liang and Zhu, Lei and Shen, Jia and Fu, Huazhu and Liu, Wennan and Qin, Jing},
     title = {Triple-cooperative Video Shadow Detection},
     booktitle = {CVPR},
     year = {2021}
}

Dataset

ViSha dataset is available at ViSha Homepage

Requirement

Python 3.6
PyTorch 1.3.1
torchvision
numpy
tqdm
PIL
math
time
datatime
argparse
apex (alternative, fp16 for save memory and speedup)

Training

Modify the data path on ./config.py
Modify the pretrained backbone path on ./networks/resnext_modify/config.py
Run by python train.py and model will be saved in ./models/TVSD

The pretrained ResNeXt model is ported from the official torch version, using the convertor provided by clcarwin. You can directly download the pretrained model ported by us.

Testing

Modify the data path on ./config.py
Make sure you have a snapshot in ./models/TVSD (Tips: You can download the trained model which is reported in our paper at BaiduNetdisk(pw: 8p5h) or Google Drive)
Run by python infer.py to generate predicted masks
Run by python evaluate.py to evaluate the generated results

Results in ViSha testing set

As mentioned in our paper, since there is no CNN-based method for video shadow detection, we make comparison against 12 state-of-the-art methods for relevant tasks, including BDRAR[1], DSD[2], MTMT[3] (single-image shadow detection), FPN[4], PSPNet[5] (single-image semantic segmentation), DSS[6], R^3 Net[7] (single-image saliency detection), PDBM[8], MAG[9] (video saliency detection), COSNet[10], FEELVOS[11], STM[12] (object object segmentation)
[1]L. Zhu, Z. Deng, X. Hu, C.-W. Fu, X. Xu, J. Qin, and P.-A. Heng. Bidirectional feature pyramid network with recurrent attention residual modules for shadow detection. In ECCV, pages 121–136, 2018.
[2]Q. Zheng, X. Qiao, Y. Cao, and R.W. Lau. Distraction-aware shadow detection. In CVPR, pages 5167–5176, 2019.
[3]Z. Chen, L. Zhu, L. Wan, S. Wang, W. Feng, and P.-A. Heng. A multi-task mean teacher for semi-supervised shadow detection. In CVPR, pages 5611–5620, 2020.
[4]T.-Y. Lin, P. Doll´ar, R. Girshick, K. He, B. Hariharan, and S.Belongie. Feature pyramid networks for object detection. In CVPR, pages 2117–2125, 2017.
[5]H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia. Pyramid scene parsing network. In CVPR, pages 2881–2890, 2017.
[6]Q. Hou, M. Cheng, X. Hu, A. Borji, Z. Tu, and P. Torr. Deeply supervised salient object detection with short connections. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(4):815–828, 2019.
[7]Z. Deng, X. Hu, L. Zhu, X. Xu, J. Qin, G. Han, and P.-A. Heng. R3net: Recurrent residual reﬁnement network for saliency detection. In IJCAI, pages 684–690. AAAI Press, 2018.
[8]H. Song, W. Wang, S. Zhao, J. Shen, and K.-M. Lam. Pyramid dilated deeper convlstm for video salient object detection. In ECCV, pages 715–731, 2018.
[9]H. Li, G. Chen, G. Li, and Y. Yu. Motion guided attention for video salient object detection. In ICCV, pages 7274–7283, 2019.
[10]X. Lu, W. Wang, C. Ma, J. Shen, L. Shao, and F. Porikli. See more, know more: Unsupervised video object segmentation with co-attention siamese networks. In CVPR, pages 3623–3632, 2019.
[11]P. Voigtlaender, Y. Chai, F. Schroff, H. Adam, B. Leibe, and L.-C. Chen. Feelvos: Fast end-to-end embedding learning for video object segmentation. In CVPR, June 2019.
[12]S.W. Oh, J.-Y. Lee, N. Xu, and S.J. Kim. Video object segmentation using space-time memory networks. In ICCV, pages 9226–9235, 2019.

We evaluate those methods and our TVSD in ViSha testing set and release all results in BaiduNetdisk(pw: ritw) or Google Drive

Code for the CVPR 2021 paper "Triple-cooperative Video Shadow Detection"

Related tags

Overview

Triple-cooperative Video Shadow Detection

News: In 2021.4.7, We first release the code of TVSD and ViSha dataset.

Citation

Dataset

Requirement

Training

Testing

Results in ViSha testing set

Owner

Zhihao Chen

The codebase for our paper "Generative Occupancy Fields for 3D Surface-Aware Image Synthesis" (NeurIPS 2021)

The official repository for paper ''Domain Generalization for Vision-based Driving Trajectory Generation'' submitted to ICRA 2022

Pytorch implementation AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks

LAVT: Language-Aware Vision Transformer for Referring Image Segmentation

[ICCV2021] 3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds

An efficient PyTorch implementation of the winning entry of the 2017 VQA Challenge.

PyTorch implementations for our SIGGRAPH 2021 paper: Editable Free-viewpoint Video Using a Layered Neural Representation.

Unofficial implementation of "Coordinate Attention for Efficient Mobile Network Design"

ICON: Implicit Clothed humans Obtained from Normals

PyTorch version repo for CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

VLGrammar: Grounded Grammar Induction of Vision and Language

Using CNN to mimic the driver based on training data from Torcs

Object-aware Contrastive Learning for Debiased Scene Representation

DeepOBS: A Deep Learning Optimizer Benchmark Suite

Official and maintained implementation of the paper "OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data" [BMVC 2021].

Code and dataset for ACL2018 paper "Exploiting Document Knowledge for Aspect-level Sentiment Classification"

An official implementation of MobileStyleGAN in PyTorch

This repository contains the code used for the implementation of the paper "Probabilistic Regression with HuberDistributions"

Semantic Segmentation of images using PixelLib with help of Pascalvoc dataset trained with Deeplabv3+ framework.

JFB: Jacobian-Free Backpropagation for Implicit Models