Pytorch implementation of paper "Learning Co-segmentation by Segment Swapping for Retrieval and Discovery"

Overview

SegSwap

Pytorch implementation of paper "Learning Co-segmentation by Segment Swapping for Retrieval and Discovery"

[PDF] [Project page]

teaser

teaser

If our project is helpful for your research, please consider citing :

@article{shen2021learning,
  title={Learning Co-segmentation by Segment Swapping for Retrieval and Discovery},
  author={Shen, Xi and Efros, Alexei A and Joulin, Armand and Aubry, Mathieu},
  journal={arXiv},
  year={2021}

Table of Content

1. Installation

1.1. Dependencies

Our model can be learnt on a a single GPU Tesla-V100-16GB. The code has been tested in Pytorch 1.7.1 + cuda 10.2

Other dependencies can be installed via (tqdm, kornia, opencv-python, scipy) :

bash requirement.sh

1.2. Pre-trained MocoV2-resnet50 + cross-transformer (~300M)

Quick download :

cd model/pretrained
bash download_model.sh

2. Training Data Generation

2.1. Download COCO (~20G)

This command will download coco2017 training set + annotations (~20G).

cd data/COCO2017/download_coco.sh
bash download_coco.sh

2.2. Image Pairs with One Repeated Object

2.2.1 Generating 100k pairs (~18G)

This command will generate 100k image pairs with one repeated object.

cd data/
python generate_1obj.py --out-dir pairs_1obj_100k 

2.2.1 Examples of image pairs

Source Blended Obj + Background Stylised Source Stylised Background

2.2.2 Visualizing correspondences and masks of the generated pairs

This command will generate 10 pairs and visualize correspondences and masks of the pairs.

cd data/
bash vis_pair.sh

These pairs can be illustrated via vis10_1obj/vis.html

2.3. Image Pairs with Two Repeated Object

2.3.1 Generating 100k pairs (~18G)

This command will generate 100k image pairs with one repeated object.

cd data/
python generate_2obj.py --out-dir pairs_2obj_100k 

2.3.1 Examples of image pairs

Source Blended Obj + Background Stylised Source Stylised Background

2.3.2 Visualizing correspondences and masks of the generated pairs

This command will generate 10 pairs and visualize correspondences and masks of the pairs.

cd data/
bash vis_pair.sh

These pairs can be illustrated via vis10_2obj/vis.html

3. Evaluation

3.1 One-shot Art Detail Detection on Brueghel Dataset

3.1.1 Visual results: top-3 retrieved images

teaser

3.1.2 Data

Brueghel dataset has been uploaded in this repo

3.1.3 Quantitative results

The following command conduct evaluation on Brueghel with pre-trained cross-transformer:

cd evalBrueghel
python evalBrueghel.py --out-coarse out_brueghel.json --resume-pth ../model/hard_mining_neg5.pth --label-pth ../data/Brueghel/brueghelTest.json

Note that this command will save the features of Brueghel(~10G).

3.2 Place Recognition on Tokyo247 Dataset

3.2.1 Visual results: top-3 retrieved images

teaser

3.2.2 Data

Download Tokyo247 from its project page

Download the top-100 results used by patchVlad(~1G).

The data needs to be organised:

./SegSwap/data/Tokyo247
                    ├── query/
                        ├── 247query_subset_v2/
                    ├── database/
...

./SegSwap/evalTokyo
                    ├── top100_patchVlad.npy

3.2.3 Quantitative results

The following command conduct evaluation on Tokyo247 with pre-trained cross-transformer:

cd evalTokyo
python evalTokyo.py --qry-dir ../data/Tokyo247/query/247query_subset_v2 --db-dir ../data/Tokyo247/database --resume-pth ../model/hard_mining_neg5.pth

3.3 Place Recognition on Pitts30K Dataset

3.3.1 Visual results: top-3 retrieved images

teaser

3.3.2 Data

Download Pittsburgh dataset from its project page

Download the top-100 results used by patchVlad (~4G).

The data needs to be organised:

./SegSwap/data/Pitts
                ├── queries_real/
...

./SegSwap/evalPitts
                    ├── top100_patchVlad.npy

3.3.3 Quantitative results

The following command conduct evaluation on Pittsburgh30K with pre-trained cross-transformer:

cd evalPitts
python evalPitts.py --qry-dir ../data/Pitts/queries_real --db-dir ../data/Pitts --resume-pth ../model/hard_mining_neg5.pth

3.4 Discovery on Internet Dataset

3.4.1 Visual results

teaser

3.4.2 Data

Download Internet dataset from its project page

We provide a script to quickly download and preprocess the data (~400M):

cd data/Internet
bash download_int.sh

The data needs to be organised:

./SegSwap/data/Internet
                ├── Airplane100
                    ├── GroundTruth                
                ├── Horse100
                    ├── GroundTruth                
                ├── Car100
                    ├── GroundTruth                                

3.4.3 Quantitative results

The following commands conduct evaluation on Internet with pre-trained cross-transformer

cd evalInt
bash run_pair_480p.sh
bash run_best_only_cycle.sh

4. Training

Stage 1: standard training

Supposing that the generated pairs are saved in ./SegSwap/data/pairs_1obj_100k and ./SegSwap/data/pairs_2obj_100k.

Training command can be found in ./SegSwap/train/run.sh.

Note that this command should be able to be launched on a single GPU with 16G memory.

cd train
bash run.sh

Stage 2: hard mining

In train/run_hardmining.sh, replacing --resume-pth by the model trained in the 1st stage, than running:

cd train
bash run_hardmining.sh

5. Acknowledgement

We appreciate helps from :

Part of code is borrowed from our previous projects: ArtMiner and Watermark

6. ChangeLog

  • 21/10/21, model, evaluation + training released

7. License

This code is distributed under an MIT LICENSE.

Note that our code depends on other libraries, including Kornia, Pytorch, and uses datasets which each have their own respective licenses that must also be followed.

Owner
xshen
Ph.D, Computer Vision, Deep Learning.
xshen
Code for the paper: On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations

Non-Parametric Prior Actor-Critic (N-PPAC) This repository contains the code for On Pathologies in KL-Regularized Reinforcement Learning from Expert D

Cong Lu 5 May 13, 2022
Repo for "TableParser: Automatic Table Parsing with Weak Supervision from Spreadsheets" at [email protected]

TableParser Repo for "TableParser: Automatic Table Parsing with Weak Supervision from Spreadsheets" at DS3 Lab 11 Dec 13, 2022

Reproduce results and replicate training fo T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization)

T-Zero This repository serves primarily as codebase and instructions for training, evaluation and inference of T0. T0 is the model developed in Multit

BigScience Workshop 253 Dec 27, 2022
NovelD: A Simple yet Effective Exploration Criterion

NovelD: A Simple yet Effective Exploration Criterion Intro This is an implementation of the method proposed in NovelD: A Simple yet Effective Explorat

29 Dec 05, 2022
Deepface is a lightweight face recognition and facial attribute analysis (age, gender, emotion and race) framework for python

deepface Deepface is a lightweight face recognition and facial attribute analysis (age, gender, emotion and race) framework for python. It is a hybrid

Kushal Shingote 2 Feb 10, 2022
An experiment on the performance of homemade Q-learning AIs in Agar.io depending on their state representation and available actions

Agar.io_Q-Learning_AI An experiment on the performance of homemade Q-learning AIs in Agar.io depending on their state representation and available act

1 Jun 09, 2022
This repository contains code accompanying the paper "An End-to-End Chinese Text Normalization Model based on Rule-Guided Flat-Lattice Transformer"

FlatTN This repository contains code accompanying the paper "An End-to-End Chinese Text Normalization Model based on Rule-Guided Flat-Lattice Transfor

THUHCSI 74 Nov 28, 2022
Implementation for the paper SMPLicit: Topology-aware Generative Model for Clothed People (CVPR 2021)

SMPLicit: Topology-aware Generative Model for Clothed People [Project] [arXiv] License Software Copyright License for non-commercial scientific resear

Enric Corona 225 Dec 13, 2022
An imperfect information game is a type of game with asymmetric information

DecisionHoldem An imperfect information game is a type of game with asymmetric information. Compared with perfect information game, imperfect informat

Decision AI 25 Dec 23, 2022
NAACL'2021: Factual Probing Is [MASK]: Learning vs. Learning to Recall

OptiPrompt This is the PyTorch implementation of the paper Factual Probing Is [MASK]: Learning vs. Learning to Recall. We propose OptiPrompt, a simple

Princeton Natural Language Processing 150 Dec 20, 2022
Tensorflow solution of NER task Using BiLSTM-CRF model with Google BERT Fine-tuning And private Server services

Tensorflow solution of NER task Using BiLSTM-CRF model with Google BERT Fine-tuning

MaCan 4.2k Dec 29, 2022
PyTorch implementation of Interpretable Explanations of Black Boxes by Meaningful Perturbation

PyTorch implementation of Interpretable Explanations of Black Boxes by Meaningful Perturbation The paper: https://arxiv.org/abs/1704.03296 What makes

Jacob Gildenblat 322 Dec 17, 2022
DeepMReye: magnetic resonance-based eye tracking using deep neural networks

DeepMReye: magnetic resonance-based eye tracking using deep neural networks

73 Dec 21, 2022
[CVPR 2021] "The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models" Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang

The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models Codes for this paper The Lottery Tickets Hypo

VITA 59 Dec 28, 2022
This repository is for DSA and CP scripts for reference.

dsa-script-collections This Repo is the collection of DSA and CP scripts for reference. Contents Python Bubble Sort Insertion Sort Merge Sort Quick So

Aditya Kumar Pandey 9 Nov 22, 2022
An educational tool to introduce AI planning concepts using mobile manipulator robots.

JEDAI Explains Decision-Making AI Virtual Machine Image The recommended way of using JEDAI is to use pre-configured Virtual Machine image that is avai

Autonomous Agents and Intelligent Robots 13 Nov 15, 2022
Interactive dimensionality reduction for large datasets

BlosSOM 🌼 BlosSOM is a graphical environment for running semi-supervised dimensionality reduction with EmbedSOM. You can use it to explore multidimen

19 Dec 14, 2022
BridgeGAN - Tensorflow implementation of Bridging the Gap between Label- and Reference-based Synthesis in Multi-attribute Image-to-Image Translation.

Bridging the Gap between Label- and Reference based Synthesis(ICCV 2021) Tensorflow implementation of Bridging the Gap between Label- and Reference-ba

huangqiusheng 8 Jul 13, 2022
Code for KDD'20 "An Efficient Neighborhood-based Interaction Model for Recommendation on Heterogeneous Graph"

Heterogeneous INteract and aggreGatE (GraphHINGE) This is a pytorch implementation of GraphHINGE model. This is the experiment code in the following w

Jinjiarui 69 Nov 24, 2022
Adversarial Autoencoders

Adversarial Autoencoders (with Pytorch) Dependencies argparse time torch torchvision numpy itertools matplotlib Create Datasets python create_datasets

Felipe Ducau 188 Jan 01, 2023