Code of the paper "Part Detector Discovery in Deep Convolutional Neural Networks" by Marcel Simon, Erik Rodner and Joachim Denzler

Overview

Part Detector Discovery

This is the code used in our paper "Part Detector Discovery in Deep Convolutional Neural Networks" by Marcel Simon, Erik Rodner and Joachim Denzler published at ACCV 2014. If you would like to refer to this work, please cite the corresponding paper

@inproceedings{Simon14:PDD,
  author = {Marcel Simon and Erik Rodner and Joachim Denzler},
  booktitle = {Asian Conference on Computer Vision (ACCV)},
  title = {Part Detector Discovery in Deep Convolutional Neural Networks},
  year = {2014},
}

The following steps will guide you through the usage of the code.

1. Python Environment

Setup a python environment, preferably a virtual environment using e. g. virtual_env. The requirements file might install more than you need.

virtualenv pyhton-env && pip install -r requirements.txt

2. DeCAF Installation

Build and install decaf into this environment

source python-env/bin/activate
cd decaf-tools/decaf/
python setup.py build
python setup.py install

3. Pre-Trained ImageNet Model

Get the decaf ImageNet model:

cd decaf-tools/models/
bash get_model.sh

You now might need to adjust the path to the decaf model in decaf-tools/extract_grad_map.py, line 75!

4. Gradient Map Calculation

Now you can calculate the gradient maps using the following command. For a single image, use decaf-tools/extract_grad_map.py :

usage: extract_grad_map.py [-h] [--layers LAYERS [LAYERS ...]] [--limit LIMIT]
                           [--channel_limit CHANNEL_LIMIT]
                           [--images pattern [pattern ...]] [--outdir OUTDIR]

Calculate the gradient maps for an image.

optional arguments:
  -h, --help            show this help message and exit
  --layers LAYERS [LAYERS ...]
  --limit LIMIT         When calculating the gradient of the class scores,
                        calculate the gradient for the output elements with the
                        [limit] highest probabilities.
  --channel_limit CHANNEL_LIMIT
                        Sets the number of channels per layer you want to
                        calculate the gradient of.
  --images pattern [pattern ...]
			Absolute image path to the image. You can use wildcards.
  --outdir OUTDIR

For a list of absolute image paths call this script this way:

python extract_grad_map.py --images $(cat /path/to/imagelist.txt) --limit 1 --channel_limit 256 --layers probs pool5 --outdir /path/to/output/

The gradient maps are stored as Matlab .mat file and as png. In addition to these, the script also generates A html file to view the gradient maps and the input image. The gradient map is placed in the directory outdir/images'_parent_dir/image_filename/*. Be aware that approx. 45 MiB of storage is required per input image. For the whole CUB200-2011 dataset this means a total storage size of approx 800 GiB!

5. Part Localization

Apply the part localization using GMM fitting or maximum finding. Have a look in the part_localization folder for that. Open calcCUBPartLocs.m and adjust the paths. Now simply run calcCUBPartLocs(). This will create a file which has the same format as the part_locs.txt file of the CUB200-2011 dataset. You can use it for part-based classification.

6. Classification

We also provide the classification framework to use these part localizations and feature extraction with DeCAF. Go to the folder classification and open partEstimationDeepLearing.m. Have a look at line 40 and adjust the path such that it points to the correct file. Open settings.m and adjust the paths. Next, open settings.m and adjust the paths to liblinear and the virtual python environment. Now you can execute for example:

init
recRate = experimentParts('cub200_2011',200, struct('descriptor','plain','preprocessing_useMask','none','preprocessing_cropToBoundingbox',0), struct('partSelection',[1 2 3 9 14],'bothSymmetricParts',0,'descriptor','plain','trainPartLocation','est','preprocessing_relativePartSize',1.0/8,'preprocessing_cropToBoundingbox',0))

This will evaluate the classification performance on the standard train-test-split using the estimated part locations. Experiment parts has four parameters. The first one tell the function which dataset to use. You want to keep 'cub200_2011' here.

The second one is the number of classes to use, 3, 14 and 200 is supported here. Next is the setup for the global feature extraction. The only important setting is preprocessing_cropToBoundingbox. A value of 0 will tell the function not to use the ground truth bounding box during testing. You should leave the other two options as shown here.

The last one is the setup for the part features. You can select here, which parts you want to use and if you want to extract features from both symmetric parts, if both are visible. Since the part detector discovery associates some parts with the same channel, the location prediction will be the same for these. In this case, only select the parts which have unique channels here. In the example, the part 1, 2, 3, 9 and 14 are associated with different channels.

'trainPartLocation' tells the function, if grount-truth ('gt') or estimated ('est') part locations should be used for training. Since the discovered part detectors do not necessarily relate to semantic parts, 'est' usually is the better option here.

'preprocessing_relativePartSize' adjusts the size of patches, that are extracted at the estimated part locations. Please have a look at the paper for more information.

For the remaining options, you should keep everything as it is.

Acknowledgements

The classification framework is an extension of the excellent fine-grained recognition framework by Christoph Göring, Erik Rodner, Alexander Freytag and Joachim Denzler. You can find their project at https://github.com/cvjena/finegrained-cvpr2014.

Our work is based on DeCAF, a framework for convolutional neural networks. You can find the repository of the corresponding project at https://github.com/UCB-ICSI-Vision-Group/decaf-release/ .

License

Part Detector Discovery Framework by Marcel Simon, Erik Rodner and Joachim Denzler is licensed under the non-commercial license Creative Commons Attribution 4.0 International License. For usage beyond the scope of this license, please contact Marcel Simon.

You might also like...
Code for reproducing our analysis in the paper titled: Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency
Code for reproducing our analysis in the paper titled: Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency

Image Crop Analysis This is a repo for the code used for reproducing our Image Crop Analysis paper as shared on our blog post. If you plan to use this

Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning".

ERICA Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive L

Code for the ICML 2021 paper
Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation", Haoxiang Wang, Han Zhao, Bo Li.

Bridging Multi-Task Learning and Meta-Learning Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Trainin

Data and Code for ACL 2021 Paper
Data and Code for ACL 2021 Paper "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning"

Introduction Code and data for ACL 2021 Paper "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning". We cons

Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"

This repository contains code for the following two papers: VisualBERT: A Simple and Performant Baseline for Vision and Language (arxiv) with a short

Open source repository for the code accompanying the paper 'Non-Rigid Neural Radiance Fields Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video'.
Open source repository for the code accompanying the paper 'Non-Rigid Neural Radiance Fields Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video'.

Non-Rigid Neural Radiance Fields This is the official repository for the project "Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synt

Code for the Shortformer model, from the paper by Ofir Press, Noah A. Smith and Mike Lewis.

Shortformer This repository contains the code and the final checkpoint of the Shortformer model. This file explains how to run our experiments on the

Open source code for Paper
Open source code for Paper "A Co-Interactive Transformer for Joint Slot Filling and Intent Detection"

A Co-Interactive Transformer for Joint Slot Filling and Intent Detection This repository contains the PyTorch implementation of the paper: A Co-Intera

A code repository associated with the paper A Benchmark for Rough Sketch Cleanup by Chuan Yan, David Vanderhaeghe, and Yotam Gingold from SIGGRAPH Asia 2020.

A Benchmark for Rough Sketch Cleanup This is the code repository associated with the paper A Benchmark for Rough Sketch Cleanup by Chuan Yan, David Va

Releases(v1.0)
Owner
Computer Vision Group Jena
Computer Vision Group Jena
Julia package for multiway (inverse) covariance estimation.

TensorGraphicalModels TensorGraphicalModels.jl is a suite of Julia tools for estimating high-dimensional multiway (tensor-variate) covariance and inve

Wayne Wang 3 Sep 23, 2022
🛠️ Tools for Transformers compression using Lightning ⚡

Bert-squeeze is a repository aiming to provide code to reduce the size of Transformer-based models or decrease their latency at inference time.

Jules Belveze 66 Dec 11, 2022
PaRT: Parallel Learning for Robust and Transparent AI

PaRT: Parallel Learning for Robust and Transparent AI This repository contains the code for PaRT, an algorithm for training a base network on multiple

Mahsa 0 May 02, 2022
[AAAI 2021] EMLight: Lighting Estimation via Spherical Distribution Approximation and [ICCV 2021] Sparse Needlets for Lighting Estimation with Spherical Transport Loss

EMLight: Lighting Estimation via Spherical Distribution Approximation (AAAI 2021) Update 12/2021: We release our Virtual Object Relighting (VOR) Datas

Fangneng Zhan 144 Jan 06, 2023
Nvdiffrast - Modular Primitives for High-Performance Differentiable Rendering

Nvdiffrast – Modular Primitives for High-Performance Differentiable Rendering Modular Primitives for High-Performance Differentiable Rendering Samuli

NVIDIA Research Projects 675 Jan 06, 2023
YOLOX-RMPOLY

本算法为适应robomaster比赛,而改动自矩形识别的yolox算法。 基于旷视科技YOLOX,实现对不规则四边形的目标检测 TODO 修改onnx推理模型 更改/添加标注: 1.yolox/models/yolox_polyhead.py: 1.1继承yolox/models/yolo_

3 Feb 25, 2022
Official re-implementation of the Calibrated Adversarial Refinement model described in the paper Calibrated Adversarial Refinement for Stochastic Semantic Segmentation

Official re-implementation of the Calibrated Adversarial Refinement model described in the paper Calibrated Adversarial Refinement for Stochastic Semantic Segmentation

Elias Kassapis 31 Nov 22, 2022
An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models.

An optimization and data collection toolbox for convenient and fast prototyping of computationally expensive models. Hyperactive: is very easy to lear

Simon Blanke 422 Jan 04, 2023
Official Code For TDEER: An Efficient Translating Decoding Schema for Joint Extraction of Entities and Relations (EMNLP2021)

TDEER 🦌 🦒 Official Code For TDEER: An Efficient Translating Decoding Schema for Joint Extraction of Entities and Relations (EMNLP2021) Overview TDEE

33 Dec 23, 2022
Open-source code for Generic Grouping Network (GGN, CVPR 2022)

Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity Pytorch implementation for "Open-World Instance Segmen

Meta Research 99 Dec 06, 2022
Colar: Effective and Efficient Online Action Detection by Consulting Exemplars, CVPR 2022.

Colar: Effective and Efficient Online Action Detection by Consulting Exemplars This repository is the official implementation of Colar. In this work,

LeYang 246 Dec 13, 2022
Lightweight Python library for adding real-time object tracking to any detector.

Norfair is a customizable lightweight Python library for real-time 2D object tracking. Using Norfair, you can add tracking capabilities to any detecto

Tryolabs 1.7k Jan 05, 2023
Pytorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks

flownet2-pytorch Pytorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks. Multiple GPU training is supported, a

NVIDIA Corporation 2.8k Dec 27, 2022
[EMNLP 2021] MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity Representations

MuVER This repo contains the code and pre-trained model for our EMNLP 2021 paper: MuVER: Improving First-Stage Entity Retrieval with Multi-View Entity

24 May 30, 2022
Official code repository for the EMNLP 2021 paper

Integrating Visuospatial, Linguistic and Commonsense Structure into Story Visualization PyTorch code for the EMNLP 2021 paper "Integrating Visuospatia

Adyasha Maharana 23 Dec 19, 2022
Self-supervised learning algorithms provide a way to train Deep Neural Networks in an unsupervised way using contrastive losses

Self-supervised learning Self-supervised learning algorithms provide a way to train Deep Neural Networks in an unsupervised way using contrastive loss

Arijit Das 2 Mar 26, 2022
This repository is an unoffical PyTorch implementation of Medical segmentation in 3D and 2D.

Pytorch Medical Segmentation Read Chinese Introduction:Here! Recent Updates 2021.1.8 The train and test codes are released. 2021.2.6 A bug in dice was

EasyCV-Ellis 618 Dec 27, 2022
A Peer-to-peer Platform for Secure, Privacy-preserving, Decentralized Data Science

PyGrid is a peer-to-peer network of data owners and data scientists who can collectively train AI models using PySyft. PyGrid is also the central serv

OpenMined 615 Jan 03, 2023
Code implementing "Improving Deep Learning Interpretability by Saliency Guided Training"

Saliency Guided Training Code implementing "Improving Deep Learning Interpretability by Saliency Guided Training" by Aya Abdelsalam Ismail, Hector Cor

8 Sep 22, 2022
10th place solution for Google Smartphone Decimeter Challenge at kaggle.

Under refactoring 10th place solution for Google Smartphone Decimeter Challenge at kaggle. Google Smartphone Decimeter Challenge Global Navigation Sat

12 Oct 25, 2022