Codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing

Overview

Contrast and Mix (CoMix)

The repository contains the codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing part of Advances in Neural Information Processing Systems (NeurIPS) 2021.

Aadarsh Sahoo1, Rutav Shah1, Rameswar Panda2, Kate Saenko2,3, Abir Das1

1 IIT Kharagpur, 2 MIT-IBM Watson AI Lab, 3 Boston University

[Paper] [Project Page]

 

Fig. Temporal Contrastive Learning with Background Mixing and Target Pseudo-labels. Temporal contrastive loss (left) contrasts a single temporally augmented positive (same video, different speed) per anchor against rest of the videos in a mini-batch as negatives. Incorporating background mixing (middle) provides additional positives per anchor possessing same action semantics with a different background alleviating background shift across domains. Incorporating target pseudo-labels (right) additionally enhances the discriminabilty by contrasting the target videos with the same pseudo-label as positives against rest of the videos as negatives.

 

Preparing the Environment

Conda

Please use the comix_environment.yml file to create the conda environment comix as:

conda env create -f comix_environment.yml

Pip

Please use the requirements.txt file to install all the required dependencies as:

pip install -r requirements.txt

Data Directory Structure

All the datasets should be stored in the folder ./data following the convention ./data/ and it must be passed as an argument to base_dir=./data/ .

UCF - HMDB

For ucf_hmdb dataset with base_dir=./data/ucf_hmdb the structure would be as follows:

.
├── ...
├── data
│   ├── ucf_hmdb
│   │   ├── ucf_videos
|   |   |   ├── 
   
    
|   |   |   |   ├── 
    
     
|   |   |   |   ├── 
     
      
|   |   |   |   ├── ...
|   |   |   ├── 
      
       
|   |   |   ├── ...
│   │   ├── hmdb_videos
|   |   ├── ucf_BG
|   |   └── hmdb_BG
│   └──
└──

      
     
    
   
Jester

For Jester dataset with base_dir=./data/jester the structure would be as follows

.
├── ...
├── data
│   ├── jester
|   |   ├── jester_videos
|   |   |   ├── 
   
    
|   |   |   |   ├── 
    
     
|   |   |   |   ├── 
     
      
|   |   |   |   ├── ...
|   |   |   ├── 
      
       
|   |   |   ├── ...
|   |   ├── jester_BG
|   |   |   ├── 
       
         | | | | ├── 
        
          | | | ├── ... └── └── └── 
        
       
      
     
    
   
Epic-Kitchens

For Epic Kitchens dataset with base_dir=./data/epic_kitchens the structure would be as follows (we follow the same structure as in the original dataset) :

.
├── ...
├── data
│   ├── epic_kitchens
|   |   ├── epic_kitchens_videos
|   |   |   ├── train
|   |   |   |   ├── D1
|   |   |   |   |   ├── 
   
    
|   |   |   |   |   |   ├── 
    
     
|   |   |   |   |   |   ├── 
     
      
|   |   |   |   |   |   ├── ...
|   |   |   |   |   ├── 
      
       
|   |   |   |   |   ├── ...
|   |   |   |   ├── D2
|   |   |   |   └── D3
|   |   |   └── test
└── └── └── epic_kitchens_BG

      
     
    
   

For using datasets stored in some other directories, please pass the parameter base_dir accordingly.

Background Extraction using Temporal Median Filtering

Please refer to the folder ./background_extraction for the codes to extract backgrounds using temporal median filtering.

Data

All the required split files are provided inside the directory ./video_splits.

The official download links for the datasets used for this paper are: [UCF-101] [HMDB-51] [Jester] [Epic Kitchens]

Training CoMix

Here are some of the sample and recomended commands to train CoMix for the transfer task of:

UCF -> HMDB from UCF-HMDB dataset:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --manual_seed 1 --dataset_name UCF-HMDB --src_dataset UCF --tgt_dataset HMDB --batch_size 8 --model_root ./checkpoints_ucf_hmdb --save_in_steps 500 --log_in_steps 50 --eval_in_steps 50 --pseudo_threshold 0.7 --warmstart_models True --num_iter_warmstart 4000 --num_iter_adapt 10000 --learning_rate 0.01 --learning_rate_ws 0.01 --lambda_bgm 0.1 --lambda_tpl 0.01 --base_dir ./data/ucf_hmdb

S -> T from Jester dataset:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --manual_seed 1 --dataset_name Jester --src_dataset S --tgt_dataset T --batch_size 8 --model_root ./checkpoints_jester --save_in_steps 500 --log_in_steps 50 --eval_in_steps 50 --pseudo_threshold 0.7 --warmstart_models True --num_iter_warmstart 4000 --num_iter_adapt 10000 --learning_rate 0.01 --learning_rate_ws 0.01 --lambda_bgm 0.1 --lambda_tpl 0.1 --base_dir ./data/jester

D1 -> D2 from Epic-Kitchens dataset:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --manual_seed 1 --dataset_name Epic-Kitchens --src_dataset D1 --tgt_dataset D2 --batch_size 8 --model_root ./checkpoints_epic_d1_d2 --save_in_steps 500 --log_in_steps 50 --eval_in_steps 50 --pseudo_threshold 0.7 --warmstart_models True --num_iter_warmstart 4000 --num_iter_adapt 10000 --learning_rate 0.01 --learning_rate_ws 0.01 --lambda_bgm 0.01 --lambda_tpl 0.01 --base_dir ./data/epic_kitchens

For detailed description regarding the arguments, use:

python main.py --help

Citing CoMix

If you use codes in this repository, consider citing CoMix. Thanks!

@article{sahoo2021contrast,
  title={Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing},
  author={Sahoo, Aadarsh and Shah, Rutav and Panda, Rameswar and Saenko, Kate and Das, Abir},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}
Owner
Computer Vision and Intelligence Research (CVIR)
The Computer Vision and Intelligence Research (CVIR) group is part of the Department of Computer Science and Engineering at IIT Kharagpur.
Computer Vision and Intelligence Research (CVIR)
Our CIKM21 Paper "Incorporating Query Reformulating Behavior into Web Search Evaluation"

Reformulation-Aware-Metrics Introduction This codebase contains source-code of the Python-based implementation of our CIKM 2021 paper. Chen, Jia, et a

xuanyuan14 5 Mar 05, 2022
DSAC* for Visual Camera Re-Localization (RGB or RGB-D)

DSAC* for Visual Camera Re-Localization (RGB or RGB-D) Introduction Installation Data Structure Supported Datasets 7Scenes 12Scenes Cambridge Landmark

Visual Learning Lab 143 Dec 22, 2022
3.8% and 18.3% on CIFAR-10 and CIFAR-100

Wide Residual Networks This code was used for experiments with Wide Residual Networks (BMVC 2016) http://arxiv.org/abs/1605.07146 by Sergey Zagoruyko

Sergey Zagoruyko 1.2k Dec 29, 2022
Matlab Python Heuristic Battery Opt - SMOP conversion and manual conversion

SMOP is Small Matlab and Octave to Python compiler. SMOP translates matlab to py

Tom Xu 1 Jan 12, 2022
Fuzzy Overclustering (FOC)

Fuzzy Overclustering (FOC) In real-world datasets, we need consistent annotations between annotators to give a certain ground-truth label. However, in

2 Nov 08, 2022
Code release for NeX: Real-time View Synthesis with Neural Basis Expansion

NeX: Real-time View Synthesis with Neural Basis Expansion Project Page | Video | Paper | COLAB | Shiny Dataset We present NeX, a new approach to novel

536 Dec 20, 2022
Automated image registration. Registrationimation was too much of a mouthful.

alignimation Automated image registration. Registrationimation was too much of a mouthful. This repo contains the code used for my blog post Alignimat

Ethan Rosenthal 9 Oct 13, 2022
SOTA easy to use PyTorch-based DL training library

Easily train or fine-tune SOTA computer vision models from one training repository. SuperGradients Introduction Welcome to SuperGradients, a free open

619 Jan 03, 2023
Improving 3D Object Detection with Channel-wise Transformer

"Improving 3D Object Detection with Channel-wise Transformer" Thanks for the OpenPCDet, this implementation of the CT3D is mainly based on the pcdet v

Hualian Sheng 107 Dec 20, 2022
This repo contains the official code of our work SAM-SLR which won the CVPR 2021 Challenge on Large Scale Signer Independent Isolated Sign Language Recognition.

Skeleton Aware Multi-modal Sign Language Recognition By Songyao Jiang, Bin Sun, Lichen Wang, Yue Bai, Kunpeng Li and Yun Fu. Smile Lab @ Northeastern

Isen (Songyao Jiang) 128 Dec 08, 2022
Improving Object Detection by Estimating Bounding Box Quality Accurately

Improving Object Detection by Estimating Bounding Box Quality Accurately Abstrac

2 Apr 14, 2022
Research code for CVPR 2021 paper "End-to-End Human Pose and Mesh Reconstruction with Transformers"

MeshTransformer ✨ This is our research code of End-to-End Human Pose and Mesh Reconstruction with Transformers. MEsh TRansfOrmer is a simple yet effec

Microsoft 473 Dec 31, 2022
Code from the paper "High-Performance Brain-to-Text Communication via Handwriting"

High-Performance Brain-to-Text Communication via Handwriting Overview This repo is associated with this manuscript, preprint and dataset. The code can

Francis R. Willett 306 Jan 03, 2023
PyGRANSO: A PyTorch-enabled port of GRANSO with auto-differentiation

PyGRANSO PyGRANSO: A PyTorch-enabled port of GRANSO with auto-differentiation Please check https://ncvx.org/PyGRANSO for detailed instructions (introd

SUN Group @ UMN 26 Nov 16, 2022
Real-time Neural Representation Fusion for Robust Volumetric Mapping

NeuralBlox: Real-Time Neural Representation Fusion for Robust Volumetric Mapping Paper | Supplementary This repository contains the implementation of

ETHZ ASL 106 Dec 24, 2022
Logistic Bandit experiments. Official code for the paper "Jointly Efficient and Optimal Algorithms for Logistic Bandits".

Code for the paper Jointly Efficient and Optimal Algorithms for Logistic Bandits, by Louis Faury, Marc Abeille, Clément Calauzènes and Kwang-Sun Jun.

Faury Louis 1 Jan 22, 2022
The implementation of PEMP in paper "Prior-Enhanced Few-Shot Segmentation with Meta-Prototypes"

Prior-Enhanced network with Meta-Prototypes (PEMP) This is the PyTorch implementation of PEMP. Overview of PEMP Meta-Prototypes & Adaptive Prototypes

Jianwei ZHANG 8 Oct 14, 2021
[CVPR'22] COAP: Learning Compositional Occupancy of People

COAP: Compositional Articulated Occupancy of People Paper | Video | Project Page This is the official implementation of the CVPR 2022 paper COAP: Lear

Marko Mihajlovic 111 Dec 11, 2022
Deep Halftoning with Reversible Binary Pattern

Deep Halftoning with Reversible Binary Pattern ICCV Paper | Project Website | BibTex Overview Existing halftoning algorithms usually drop colors and f

Menghan Xia 17 Nov 22, 2022