This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection', CVPR 2019.

Last update: Aug 19, 2022

Overview

Code-and-Dataset-for-CapSal

This project provides the code and datasets for 'CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection', CVPR 2019. Paper link

Our code is built on the Mask R-CNN implementation in TensorFlow and Keras. First install Mask R-CNN by following its instructions or INSTALL.md.

COCO-CapSal Dataset

The COCO-CapSal dataset provides the saliency ground truth and the image captions for each image. It contains 5265 images for training and 1459 for validation. The annotations can be downloaded from BaiduYun or GoogleDrive. The 'capsal' folder contains the images, the ground truth maps, and the captions (JSON file) for both the training and validation sets.
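As a rough illustration of how the contents of the 'capsal' folder might be read, the sketch below pairs each image with the ground truth map that shares its file stem and loads the caption JSON. The directory layout, the file names, and the helper names `pair_images_with_masks` and `load_captions` are assumptions made for this example, not part of the released code, and the JSON schema is deliberately left uninterpreted.

```python
import json
from pathlib import Path


def pair_images_with_masks(image_dir, mask_dir):
    """Match each image to the ground truth map with the same file stem.

    Hypothetical helper: the real folder layout may differ.
    """
    masks = {p.stem: p for p in Path(mask_dir).iterdir() if p.is_file()}
    pairs = []
    for img in sorted(Path(image_dir).iterdir()):
        # Keep only images that have a matching ground truth map.
        if img.is_file() and img.stem in masks:
            pairs.append((img, masks[img.stem]))
    return pairs


def load_captions(json_path):
    """Load the caption annotations without assuming a particular schema."""
    with open(json_path, "r", encoding="utf-8") as f:
        return json.load(f)
```

If the training split were stored as, say, an image folder and a ground truth folder, the number of pairs returned by `pair_images_with_masks` could be compared against the 5265 training images stated above as a quick sanity check on the download.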

Evaluation

To test the CapSal model, first download the trained model from BaiduYun or Google and put it under ./model. Then run test_capsal.py to obtain the saliency maps for the different datasets. The saliency maps are available at Google or BaiduYun.

Train

Run 'train.py'.

Citation

    @InProceedings{Zhang_2019_CVPR,
            author = {Zhang, Lu and Zhang, Jianming and Lin, Zhe and Lu, Huchuan and He, You},
            title = {CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection},
            booktitle = {CVPR},
            year = {2019}}

Owner

lu zhang
