Pytorch implementation of paper "Learning Co-segmentation by Segment Swapping for Retrieval and Discovery"

Overview

SegSwap

Pytorch implementation of paper "Learning Co-segmentation by Segment Swapping for Retrieval and Discovery"

[PDF] [Project page]

teaser

teaser

If our project is helpful for your research, please consider citing :

@article{shen2021learning,
  title={Learning Co-segmentation by Segment Swapping for Retrieval and Discovery},
  author={Shen, Xi and Efros, Alexei A and Joulin, Armand and Aubry, Mathieu},
  journal={arXiv},
  year={2021}

Table of Content

1. Installation

1.1. Dependencies

Our model can be learnt on a a single GPU Tesla-V100-16GB. The code has been tested in Pytorch 1.7.1 + cuda 10.2

Other dependencies can be installed via (tqdm, kornia, opencv-python, scipy) :

bash requirement.sh

1.2. Pre-trained MocoV2-resnet50 + cross-transformer (~300M)

Quick download :

cd model/pretrained
bash download_model.sh

2. Training Data Generation

2.1. Download COCO (~20G)

This command will download coco2017 training set + annotations (~20G).

cd data/COCO2017/download_coco.sh
bash download_coco.sh

2.2. Image Pairs with One Repeated Object

2.2.1 Generating 100k pairs (~18G)

This command will generate 100k image pairs with one repeated object.

cd data/
python generate_1obj.py --out-dir pairs_1obj_100k 

2.2.1 Examples of image pairs

Source Blended Obj + Background Stylised Source Stylised Background

2.2.2 Visualizing correspondences and masks of the generated pairs

This command will generate 10 pairs and visualize correspondences and masks of the pairs.

cd data/
bash vis_pair.sh

These pairs can be illustrated via vis10_1obj/vis.html

2.3. Image Pairs with Two Repeated Object

2.3.1 Generating 100k pairs (~18G)

This command will generate 100k image pairs with one repeated object.

cd data/
python generate_2obj.py --out-dir pairs_2obj_100k 

2.3.1 Examples of image pairs

Source Blended Obj + Background Stylised Source Stylised Background

2.3.2 Visualizing correspondences and masks of the generated pairs

This command will generate 10 pairs and visualize correspondences and masks of the pairs.

cd data/
bash vis_pair.sh

These pairs can be illustrated via vis10_2obj/vis.html

3. Evaluation

3.1 One-shot Art Detail Detection on Brueghel Dataset

3.1.1 Visual results: top-3 retrieved images

teaser

3.1.2 Data

Brueghel dataset has been uploaded in this repo

3.1.3 Quantitative results

The following command conduct evaluation on Brueghel with pre-trained cross-transformer:

cd evalBrueghel
python evalBrueghel.py --out-coarse out_brueghel.json --resume-pth ../model/hard_mining_neg5.pth --label-pth ../data/Brueghel/brueghelTest.json

Note that this command will save the features of Brueghel(~10G).

3.2 Place Recognition on Tokyo247 Dataset

3.2.1 Visual results: top-3 retrieved images

teaser

3.2.2 Data

Download Tokyo247 from its project page

Download the top-100 results used by patchVlad(~1G).

The data needs to be organised:

./SegSwap/data/Tokyo247
                    ├── query/
                        ├── 247query_subset_v2/
                    ├── database/
...

./SegSwap/evalTokyo
                    ├── top100_patchVlad.npy

3.2.3 Quantitative results

The following command conduct evaluation on Tokyo247 with pre-trained cross-transformer:

cd evalTokyo
python evalTokyo.py --qry-dir ../data/Tokyo247/query/247query_subset_v2 --db-dir ../data/Tokyo247/database --resume-pth ../model/hard_mining_neg5.pth

3.3 Place Recognition on Pitts30K Dataset

3.3.1 Visual results: top-3 retrieved images

teaser

3.3.2 Data

Download Pittsburgh dataset from its project page

Download the top-100 results used by patchVlad (~4G).

The data needs to be organised:

./SegSwap/data/Pitts
                ├── queries_real/
...

./SegSwap/evalPitts
                    ├── top100_patchVlad.npy

3.3.3 Quantitative results

The following command conduct evaluation on Pittsburgh30K with pre-trained cross-transformer:

cd evalPitts
python evalPitts.py --qry-dir ../data/Pitts/queries_real --db-dir ../data/Pitts --resume-pth ../model/hard_mining_neg5.pth

3.4 Discovery on Internet Dataset

3.4.1 Visual results

teaser

3.4.2 Data

Download Internet dataset from its project page

We provide a script to quickly download and preprocess the data (~400M):

cd data/Internet
bash download_int.sh

The data needs to be organised:

./SegSwap/data/Internet
                ├── Airplane100
                    ├── GroundTruth                
                ├── Horse100
                    ├── GroundTruth                
                ├── Car100
                    ├── GroundTruth                                

3.4.3 Quantitative results

The following commands conduct evaluation on Internet with pre-trained cross-transformer

cd evalInt
bash run_pair_480p.sh
bash run_best_only_cycle.sh

4. Training

Stage 1: standard training

Supposing that the generated pairs are saved in ./SegSwap/data/pairs_1obj_100k and ./SegSwap/data/pairs_2obj_100k.

Training command can be found in ./SegSwap/train/run.sh.

Note that this command should be able to be launched on a single GPU with 16G memory.

cd train
bash run.sh

Stage 2: hard mining

In train/run_hardmining.sh, replacing --resume-pth by the model trained in the 1st stage, than running:

cd train
bash run_hardmining.sh

5. Acknowledgement

We appreciate helps from :

Part of code is borrowed from our previous projects: ArtMiner and Watermark

6. ChangeLog

  • 21/10/21, model, evaluation + training released

7. License

This code is distributed under an MIT LICENSE.

Note that our code depends on other libraries, including Kornia, Pytorch, and uses datasets which each have their own respective licenses that must also be followed.

Owner
xshen
Ph.D, Computer Vision, Deep Learning.
xshen
competitions-v2

Codabench (formerly Codalab Competitions v2) Installation $ cp .env_sample .env $ docker-compose up -d $ docker-compose exec django ./manage.py migrat

CodaLab 21 Dec 02, 2022
Machine learning, in numpy

numpy-ml Ever wish you had an inefficient but somewhat legible collection of machine learning algorithms implemented exclusively in NumPy? No? Install

David Bourgin 11.6k Dec 30, 2022
The code for our paper CrossFormer: A Versatile Vision Transformer Based on Cross-scale Attention.

CrossFormer This repository is the code for our paper CrossFormer: A Versatile Vision Transformer Based on Cross-scale Attention. Introduction Existin

cheerss 238 Jan 06, 2023
Free like Freedom

This is all very much a work in progress! More to come! ( We're working on it though! Stay tuned!) Installation Open an Anaconda Prompt (in Windows, o

2.3k Jan 04, 2023
Implementations of LSTM: A Search Space Odyssey variants and their training results on the PTB dataset.

An LSTM Odyssey Code for training variants of "LSTM: A Search Space Odyssey" on Fomoro. Check out the blog post. Training Install TensorFlow. Clone th

Fomoro AI 95 Apr 13, 2022
Single-Stage Instance Shadow Detection with Bidirectional Relation Learning (CVPR 2021 Oral)

Single-Stage Instance Shadow Detection with Bidirectional Relation Learning (CVPR 2021 Oral) Tianyu Wang*, Xiaowei Hu*, Chi-Wing Fu, and Pheng-Ann Hen

Steve Wong 51 Oct 20, 2022
sequitur is a library that lets you create and train an autoencoder for sequential data in just two lines of code

sequitur sequitur is a library that lets you create and train an autoencoder for sequential data in just two lines of code. It implements three differ

Jonathan Shobrook 305 Dec 21, 2022
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

WebDataset WebDataset is a PyTorch Dataset (IterableDataset) implementation providing efficient access to datasets stored in POSIX tar archives and us

1.1k Jan 08, 2023
Аналитика доходности инвестиционного портфеля в Тинькофф брокере

Аналитика доходности инвестиционного портфеля Тиньков Видео на YouTube Для работы скрипта нужно установить три переменных окружения: export TINKOFF_TO

Alexey Goloburdin 64 Dec 17, 2022
Software & Hardware to do multi color printing with Sharpies

3D Print Colorizer is a combination of 3D printed parts and a Cura plugin which allows anyone with an Ender 3 like 3D printer to produce multi colored

343 Jan 06, 2023
Official PyTorch implementation of the paper "TEMOS: Generating diverse human motions from textual descriptions"

TEMOS: TExt to MOtionS Generating diverse human motions from textual descriptions Description Official PyTorch implementation of the paper "TEMOS: Gen

Mathis Petrovich 187 Dec 27, 2022
x-transformers-paddle 2.x version

x-transformers-paddle x-transformers-paddle 2.x version paddle 2.x版本 https://github.com/lucidrains/x-transformers 。 requirements paddlepaddle-gpu==2.2

yujun 7 Dec 08, 2022
Empirical Study of Transformers for Source Code & A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code

Transformers for variable misuse, function naming and code completion tasks The official PyTorch implementation of: Empirical Study of Transformers fo

Bayesian Methods Research Group 56 Nov 15, 2022
A user-friendly research and development tool built to standardize RL competency assessment for custom agents and environments.

Built with ❤️ by Sam Showalter Contents Overview Installation Dependencies Usage Scripts Standard Execution Environment Development Environment Benchm

SRI-AIC 1 Nov 18, 2021
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Paper | Blog OFA is a unified multimodal pretrained model that unifies modalities (i.e., cross-modality, vision, language) and tasks (e.g., image gene

OFA Sys 1.4k Jan 08, 2023
Simple Python application to transform Serial data into OSC messages

SerialToOSC-Bridge Simple Python application to transform Serial data into OSC messages. The current purpose is to be a compatibility layer between ha

Division of Applied Acoustics at Chalmers University of Technology 3 Jun 03, 2021
In-Place Activated BatchNorm for Memory-Optimized Training of DNNs

In-Place Activated BatchNorm In-Place Activated BatchNorm for Memory-Optimized Training of DNNs In-Place Activated BatchNorm (InPlace-ABN) is a novel

1.3k Dec 29, 2022
Efficient Speech Processing Tookit for Automatic Speaker Recognition

Sugar Efficient Speech Processing Tookit for Automatic Speaker Recognition | HuggingFace | What's New EfficientTDNN: Efficient Architecture Search for

WangRui 14 Sep 14, 2022
RodoSol-ALPR Dataset

RodoSol-ALPR Dataset This dataset, called RodoSol-ALPR dataset, contains 20,000 images captured by static cameras located at pay tolls owned by the Ro

Rayson Laroca 45 Dec 15, 2022
Code for the paper "Training GANs with Stronger Augmentations via Contrastive Discriminator" (ICLR 2021)

Training GANs with Stronger Augmentations via Contrastive Discriminator (ICLR 2021) This repository contains the code for reproducing the paper: Train

Jongheon Jeong 174 Dec 29, 2022