Code for our NeurIPS 2021 paper Mining the Benefits of Two-stage and One-stage HOI Detection

Related tags

Deep LearningCDN
Overview

CDN

Code for our NeurIPS 2021 paper "Mining the Benefits of Two-stage and One-stage HOI Detection".

Contributed by Aixi Zhang*, Yue Liao*, Si Liu, Miao Lu, Yongliang Wang, Chen Gao and Xiaobo Li.

Installation

Installl the dependencies.

pip install -r requirements.txt

Data preparation

HICO-DET

HICO-DET dataset can be downloaded here. After finishing downloading, unpack the tarball (hico_20160224_det.tar.gz) to the data directory.

Instead of using the original annotations files, we use the annotation files provided by the PPDM authors. The annotation files can be downloaded from here. The downloaded annotation files have to be placed as follows.

data
 └─ hico_20160224_det
     |─ annotations
     |   |─ trainval_hico.json
     |   |─ test_hico.json
     |   └─ corre_hico.npy
     :

V-COCO

First clone the repository of V-COCO from here, and then follow the instruction to generate the file instances_vcoco_all_2014.json. Next, download the prior file prior.pickle from here. Place the files and make directories as follows.

qpic
 |─ data
 │   └─ v-coco
 |       |─ data
 |       |   |─ instances_vcoco_all_2014.json
 |       |   :
 |       |─ prior.pickle
 |       |─ images
 |       |   |─ train2014
 |       |   |   |─ COCO_train2014_000000000009.jpg
 |       |   |   :
 |       |   └─ val2014
 |       |       |─ COCO_val2014_000000000042.jpg
 |       |       :
 |       |─ annotations
 :       :

For our implementation, the annotation file have to be converted to the HOIA format. The conversion can be conducted as follows.

PYTHONPATH=data/v-coco \
        python convert_vcoco_annotations.py \
        --load_path data/v-coco/data \
        --prior_path data/v-coco/prior.pickle \
        --save_path data/v-coco/annotations

Note that only Python2 can be used for this conversion because vsrl_utils.py in the v-coco repository shows a error with Python3.

V-COCO annotations with the HOIA format, corre_vcoco.npy, test_vcoco.json, and trainval_vcoco.json will be generated to annotations directory.

Pre-trained model

Download the pretrained model of DETR detector for ResNet50, and put it to the params directory.

python convert_parameters.py \
        --load_path params/detr-r50-e632da11.pth \
        --save_path params/detr-r50-pre-2stage-q64.pth \
        --num_queries 64

python convert_parameters.py \
        --load_path params/detr-r50-e632da11.pth \
        --save_path params/detr-r50-pre-2stage.pth \
        --dataset vcoco

Training

After the preparation, you can start training with the following commands. The whole training is split into two steps: CDN base model training and dynamic re-weighting training. The trainings of CDN-S for HICO-DET and V-COCO are shown as follows.

HICO-DET

python -m torch.distributed.launch \
        --nproc_per_node=8 \
        --use_env \
        main.py \
        --pretrained params/detr-r50-pre-2stage-q64.pth \
        --output_dir logs \
        --dataset_file hico \
        --hoi_path data/hico_20160224_det \
        --num_obj_classes 80 \
        --num_verb_classes 117 \
        --backbone resnet50 \
        --num_queries 64 \
        --dec_layers_hopd 3 \
        --dec_layers_interaction 3 \
        --epochs 90 \
        --lr_drop 60 \
        --use_nms_filter

python -m torch.distributed.launch \
        --nproc_per_node=8 \
        --use_env \
        main.py \
        --pretrained logs/checkpoint_last.pth \
        --output_dir logs/ \
        --dataset_file hico \
        --hoi_path data/hico_20160224_det \
        --num_obj_classes 80 \
        --num_verb_classes 117 \
        --backbone resnet50 \
        --num_queries 64 \
        --dec_layers_hopd 3 \
        --dec_layers_interaction 3 \
        --epochs 10 \
        --freeze_mode 1 \
        --obj_reweight \
        --verb_reweight \
        --use_nms_filter

V-COCO

python -m torch.distributed.launch \
        --nproc_per_node=8 \
        --use_env \
        main.py \
        --pretrained params/detr-r50-pre-2stage.pth \
        --output_dir logs \
        --dataset_file vcoco \
        --hoi_path data/v-coco \
        --num_obj_classes 81 \
        --num_verb_classes 29 \
        --backbone resnet50 \
        --num_queries 100 \
        --dec_layers_hopd 3 \
        --dec_layers_interaction 3 \
        --epochs 90 \
        --lr_drop 60 \
        --use_nms_filter

python -m torch.distributed.launch \
        --nproc_per_node=8 \
        --use_env \
        main.py \
        --pretrained logs/checkpoint_last.pth \
        --output_dir logs/ \
        --dataset_file vcoco \
        --hoi_path data/v-coco \
        --num_obj_classes 81 \
        --num_verb_classes 29 \
        --backbone resnet50 \
        --num_queries 100 \
        --dec_layers_hopd 3 \
        --dec_layers_interaction 3 \
        --epochs 10 \
        --freeze_mode 1 \
        --verb_reweight \
        --use_nms_filter

Evaluation

HICO-DET

You can conduct the evaluation with trained parameters for HICO-DET as follows.

python -m torch.distributed.launch \
        --nproc_per_node=8 \
        --use_env \
        main.py \
        --pretrained pretrained/hico_cdn_s.pth \
        --dataset_file hico \
        --hoi_path data/hico_20160224_det \
        --num_obj_classes 80 \
        --num_verb_classes 117 \
        --backbone resnet50 \
        --num_queries 64 \
        --dec_layers_hopd 3 \
        --dec_layers_interaction 3 \
        --eval \
        --use_nms_filter

V-COCO

For the official evaluation of V-COCO, a pickle file of detection results have to be generated. You can generate the file and then evaluate it as follows.

python generate_vcoco_official.py \
        --param_path pretrained/vcoco_cdn_s.pth \
        --save_path vcoco.pickle \
        --hoi_path data/v-coco \
        --dec_layers_hopd 3 \
        --dec_layers_interaction 3 \
        --use_nms_filter

cd data/v-coco
python vsrl_eval.py vcoco.pickle

Results

HICO-DET

Full (D) Rare (D) Non-rare (D) Full(KO) Rare (KO) Non-rare (KO) Download
CDN-S (R50) 31.44 27.39 32.64 34.09 29.63 35.42 model
CDN-B (R50) 31.78 27.55 33.05 34.53 29.73 35.96 model
CDN-L (R101) 32.07 27.19 33.53 34.79 29.48 36.38 model

D: Default, KO: Known object

V-COCO

Scenario 1 Scenario 2 Download
CDN-S (R50) 61.68 63.77 model
CDN-B (R50) 62.29 64.42 model
CDN-L (R101) 63.91 65.89 model

Citation

Please consider citing our paper if it helps your research.

@article{zhang2021mining,
  title={Mining the Benefits of Two-stage and One-stage HOI Detection},
  author={Zhang, Aixi and Liao, Yue and Liu, Si and Lu, Miao and Wang, Yongliang and Gao, Chen and Li, Xiaobo},
  journal={arXiv preprint arXiv:2108.05077},
  year={2021}
}

License

CDN is released under the MIT license. See LICENSE for additional details.

Acknowledge

Some of the codes are built upon PPDM, DETR and QPIC. Thanks them for their great works!

Compact Bilinear Pooling for PyTorch

Compact Bilinear Pooling for PyTorch. This repository has a pure Python implementation of Compact Bilinear Pooling and Count Sketch for PyTorch. This

Grégoire Payen de La Garanderie 234 Dec 07, 2022
Selene is a Python library and command line interface for training deep neural networks from biological sequence data such as genomes.

Selene is a Python library and command line interface for training deep neural networks from biological sequence data such as genomes.

Troyanskaya Laboratory 323 Jan 01, 2023
This repository is based on Ultralytics/yolov5, with adjustments to enable polygon prediction boxes.

Polygon-Yolov5 This repository is based on Ultralytics/yolov5, with adjustments to enable polygon prediction boxes. Section I. Description The codes a

xinzelee 226 Jan 05, 2023
A Confidence-based Iterative Solver of Depths and Surface Normals for Deep Multi-view Stereo

idn-solver Paper | Project Page This repository contains the code release of our ICCV 2021 paper: A Confidence-based Iterative Solver of Depths and Su

zhaowang 43 Nov 17, 2022
AI-UPV at IberLEF-2021 DETOXIS task: Toxicity Detection in Immigration-Related Web News Comments Using Transformers and Statistical Models

AI-UPV at IberLEF-2021 DETOXIS task: Toxicity Detection in Immigration-Related Web News Comments Using Transformers and Statistical Models Description

Angel de Paula 0 Jun 08, 2022
Improving Factual Consistency of Abstractive Text Summarization

Improving Factual Consistency of Abstractive Text Summarization We provide the code for the papers: "Entity-level Factual Consistency of Abstractive T

61 Nov 27, 2022
Using contrastive learning and OpenAI's CLIP to find good embeddings for images with lossy transformations

Creating Robust Representations from Pre-Trained Image Encoders using Contrastive Learning Sriram Ravula, Georgios Smyrnis This is the code for our pr

Sriram Ravula 26 Dec 10, 2022
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Apache MXNet (incubating) for Deep Learning Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to m

The Apache Software Foundation 20.2k Jan 08, 2023
The code for our paper "AutoSF: Searching Scoring Functions for Knowledge Graph Embedding"

AutoSF The code for our paper "AutoSF: Searching Scoring Functions for Knowledge Graph Embedding" and this paper has been accepted by ICDE2020. News:

AutoML Research 64 Dec 17, 2022
Efficient training of deep recommenders on cloud.

HybridBackend Introduction HybridBackend is a training framework for deep recommenders which bridges the gap between evolving cloud infrastructure and

Alibaba 111 Dec 23, 2022
Single cell current best practices tutorial case study for the paper:Luecken and Theis, "Current best practices in single-cell RNA-seq analysis: a tutorial"

Scripts for "Current best-practices in single-cell RNA-seq: a tutorial" This repository is complementary to the publication: M.D. Luecken, F.J. Theis,

Theis Lab 968 Dec 28, 2022
ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation

ST++ This is the official PyTorch implementation of our paper: ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation. Lihe Ya

Lihe Yang 147 Jan 03, 2023
Iran Open Source Hackathon

Iran Open Source Hackathon is an open-source hackathon (duh) with the aim of encouraging participation in open-source contribution amongst Iranian dev

OSS Hackathon 121 Dec 25, 2022
OpenDILab Multi-Agent Environment

Go-Bigger: Multi-Agent Decision Intelligence Environment GoBigger Doc (中文版) Ongoing 2021.11.13 We are holding a competition —— Go-Bigger: Multi-Agent

OpenDILab 441 Jan 05, 2023
Code for "FPS-Net: A convolutional fusion network for large-scale LiDAR point cloud segmentation".

FPS-Net Code for "FPS-Net: A convolutional fusion network for large-scale LiDAR point cloud segmentation", accepted by ISPRS journal of Photogrammetry

15 Nov 30, 2022
Tensor-based approaches for fMRI classification

tensor-fmri Using tensor-based approaches to classify fMRI data from StarPLUS. Citation If you use any code in this repository, please cite the follow

4 Sep 07, 2022
Rapid experimentation and scaling of deep learning models on molecular and crystal graphs.

LitMatter A template for rapid experimentation and scaling deep learning models on molecular and crystal graphs. How to use Clone this repository and

Nathan Frey 32 Dec 06, 2022
Generalized Matrix Means for Semi-Supervised Learning with Multilayer Graphs

Generalized Matrix Means for Semi-Supervised Learning with Multilayer Graphs MATLAB implementation of the paper: P. Mercado, F. Tudisco, and M. Hein,

Pedro Mercado 6 May 26, 2022
Code for paper [ACE: Ally Complementary Experts for Solving Long-Tailed Recognition in One-Shot] (ICCV 2021, oral))

ACE: Ally Complementary Experts for Solving Long-Tailed Recognition in One-Shot This repository is the official PyTorch implementation of ICCV-21 pape

Jiarui 21 May 09, 2022