Feature extraction made simple with torchextractor

Last update: Oct 31, 2022

Overview

`torchextractor`: PyTorch Intermediate Feature Extraction

Introduction

Too often, model definitions get remorselessly copy-pasted just because the forward function does not return what the person expects. You provide module names and torchextractor takes care of the extraction for you. It has never been easier to extract features, add an extra loss, or plug another head into a network. Let us know what amazing things you build with torchextractor!

Installation

pip install torchextractor  # stable
pip install git+https://github.com/antoinebrl/torchextractor.git  # latest

Requirements:

Python >= 3.6
torch >= 1.4.0

Usage

import torch
import torchvision
import torchextractor as tx

model = torchvision.models.resnet18(pretrained=True)
model = tx.Extractor(model, ["layer1", "layer2", "layer3", "layer4"])
dummy_input = torch.rand(7, 3, 224, 224)
model_output, features = model(dummy_input)
feature_shapes = {name: f.shape for name, f in features.items()}
print(feature_shapes)

# {
#   'layer1': torch.Size([7, 64, 56, 56]),
#   'layer2': torch.Size([7, 128, 28, 28]),
#   'layer3': torch.Size([7, 256, 14, 14]),
#   'layer4': torch.Size([7, 512, 7, 7]),
# }
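
The introduction mentions adding an extra loss or plugging another head into a network. A minimal sketch of that idea follows, assuming a random tensor as a stand-in for `features["layer4"]` (batch of 7, 512 channels, 7x7 map). The names `aux_head`, `num_classes`, and `targets` are hypothetical and only serve the illustration; nothing here is part of torchextractor itself.

```python
import torch
import torch.nn as nn

# Stand-in for features["layer4"]: batch of 7, 512 channels, 7x7 spatial map.
# Random values are used so the sketch runs without downloading a model.
layer4_features = torch.rand(7, 512, 7, 7)

# Hypothetical auxiliary head: global average pooling, then a linear classifier.
num_classes = 10  # assumed value, for illustration only
aux_head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),      # (7, 512, 7, 7) -> (7, 512, 1, 1)
    nn.Flatten(),                 # (7, 512, 1, 1) -> (7, 512)
    nn.Linear(512, num_classes),  # (7, 512) -> (7, num_classes)
)

aux_logits = aux_head(layer4_features)

# Extra loss term computed on the intermediate features (random targets here).
targets = torch.randint(0, num_classes, (7,))
aux_loss = nn.functional.cross_entropy(aux_logits, targets)

print(aux_logits.shape)
```

In a real training loop, `aux_loss` would typically be added to the main loss with some weighting before calling `backward()`.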

See more examples

Read the documentation

FAQ

• How do I know the names of the modules?

You can print all module names like this:

tx.list_module_names(model)

# OR

for name, module in model.named_modules():
    print(name)

• Why do some operations not get listed?

It is not possible to add hooks to operations that are not defined as modules. Therefore, F.relu cannot be captured but nn.ReLU() can.
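
The difference can be seen with plain PyTorch. The sketch below assumes a made-up `TinyNet` with one `nn.ReLU` module and one functional `F.relu` call; only the former shows up in `named_modules()` and can receive a forward hook. The helper `save_output` is hypothetical and is not torchextractor's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyNet(nn.Module):
    """Toy network mixing a ReLU module with a functional ReLU call."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 4)
        self.act = nn.ReLU()  # a module: listed and hookable
        self.fc2 = nn.Linear(4, 2)

    def forward(self, x):
        x = self.act(self.fc1(x))
        return F.relu(self.fc2(x))  # a function call: invisible to hooks


net = TinyNet()

# The root module has an empty name, so it is skipped here.
module_names = [name for name, _ in net.named_modules() if name]
print(module_names)  # ['fc1', 'act', 'fc2']

captured = {}


def save_output(name):
    """Build a forward hook that stores the module output under `name`."""
    def hook(module, inputs, output):
        captured[name] = output
    return hook


handle = net.act.register_forward_hook(save_output("act"))
net(torch.rand(3, 4))
handle.remove()  # detach the hook once it is no longer needed

print(captured["act"].shape)
```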

• How can I avoid listing all relevant modules?

You can specify a custom filtering function to hook the relevant modules:

# Hook everything !
module_filter_fn = lambda module, name: True

# Capture of all modules inside first layer
module_filter_fn = lambda module, name: name.startswith("layer1")

# Focus on all convolutions
module_filter_fn = lambda module, name: isinstance(module, torch.nn.Conv2d)

model = tx.Extractor(model, module_filter_fn=module_filter_fn)
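
To preview which modules a filter would select before building the extractor, the same predicate can be applied to `named_modules()` by hand. This is a sketch under assumptions: `select_names` is a hypothetical helper, the toy model is invented for the example, and skipping the unnamed root module is a choice made here, not necessarily what torchextractor does internally.

```python
import torch.nn as nn

# Toy model with two named blocks, standing in for a real backbone.
toy = nn.Sequential()
toy.add_module("layer1", nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU()))
toy.add_module("layer2", nn.Sequential(nn.Conv2d(8, 16, 3), nn.ReLU()))


def select_names(model, module_filter_fn):
    """Return the names of submodules accepted by a (module, name) predicate."""
    return [
        name
        for name, module in model.named_modules()
        if name and module_filter_fn(module, name)  # skip the unnamed root
    ]


# Everything inside the first block (the block itself included).
in_layer1 = select_names(toy, lambda module, name: name.startswith("layer1"))

# Convolutions only.
convs = select_names(toy, lambda module, name: isinstance(module, nn.Conv2d))

print(in_layer1)  # ['layer1', 'layer1.0', 'layer1.1']
print(convs)      # ['layer1.0', 'layer2.0']
```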

• Is it compatible with ONNX?

tx.Extractor is compatible with ONNX! This means you can also access intermediate feature maps after the export.

Pro-tip: name the output nodes by using output_names when calling torch.onnx.export.

• Is it compatible with TorchScript?

Not yet, but we are working on it. Support for compiling the hooks registered on a module was only added in PyTorch v1.8.0.

• "One more thing!" 😉

By default we capture the latest output of the relevant modules, but you can specify your own custom operations.

For example, to accumulate features over 10 forward passes you can do the following:

import torch
import torchvision
import torchextractor as tx

model = torchvision.models.resnet18(pretrained=True)

def capture_fn(module, input, output, module_name, feature_maps):
    if module_name not in feature_maps:
        feature_maps[module_name] = []
    feature_maps[module_name].append(output)

extractor = tx.Extractor(model, ["layer3", "layer4"], capture_fn=capture_fn)

for epoch in range(20):
    # Accumulate features over 10 forward passes
    for _ in range(10):
        x = torch.rand(7, 3, 224, 224)
        model(x)
    feature_maps = extractor.collect()

    # Do your stuff here

    # Discard collected elements
    extractor.clear_placeholder()
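
One thing you might do with the accumulated features is merge the per-pass outputs into a single tensor per module. The sketch below assumes the collected dictionary maps each module name to a list of 10 outputs, as the list-appending capture function above would produce; random tensors stand in for the real features so that it runs without a model.

```python
import torch

# Stand-in for the collected features: 10 forward passes, batch of 7 each.
feature_maps = {
    "layer3": [torch.rand(7, 256, 14, 14) for _ in range(10)],
    "layer4": [torch.rand(7, 512, 7, 7) for _ in range(10)],
}

# Concatenate the per-pass outputs along the batch dimension.
stacked = {
    name: torch.cat(outputs, dim=0)
    for name, outputs in feature_maps.items()
}

for name, tensor in stacked.items():
    print(name, tuple(tensor.shape))
# layer3 (70, 256, 14, 14)
# layer4 (70, 512, 7, 7)
```

Outputs captured during a forward pass keep a reference to the autograd graph, so if gradients are not needed, calling `output.detach()` inside the capture function can reduce memory use.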

Contributing

All feedback and contributions are welcome. Feel free to report an issue or to create a pull request!

If you want to get hands-on:

(Fork and) clone the repo.
Create a virtual environment: virtualenv -p python3 .venv && source .venv/bin/activate
Install dependencies: pip install -r requirements.txt && pip install -r requirements-dev.txt
Hook auto-formatting tools: pre-commit install
Hack as much as you want!
Run tests: python -m unittest discover -vs ./tests/
Share your work and create a pull request.

To build the documentation:

cd docs
pip install -r requirements.txt
make html

Comments

Only extracting part of the intermediate feature with DataParallel

Hi @antoinebrl,

I am using torch.nn.DataParallel on a 2-GPU machine with a batch size of N. Data parallel training splits the input batch into 2 pieces sequentially and sends them to the GPUs.

When using torchextractor to obtain the intermediate feature, the input data size and the output size are both N as expected, but the feature size becomes N/2. Does this mean we only extract the features of one GPU? I'm not sure because I didn't find an exact match.

Can you please explain why this happens? Maybe the normal behavior is returning features from all GPUs or from a specified one?

A minimal example to reproduce:

import torch
import torchvision
import torchextractor as tx

model = torchvision.models.resnet18(pretrained=True)
model_gpu = torch.nn.DataParallel(torchvision.models.resnet18(pretrained=True))
model_gpu.cuda()

model = tx.Extractor(model, ["layer1"])
model_gpu = tx.Extractor(model_gpu, ["module.layer1"])
dummy_input = torch.rand(8, 3, 224, 224)
_, features = model(dummy_input)
_, features_gpu = model_gpu(dummy_input)
feature_shapes = {name: f.shape for name, f in features.items()}
print(feature_shapes)
feature_shapes_gpu = {name: f.shape for name, f in features_gpu.items()}
print(feature_shapes_gpu)

# {'layer1': torch.Size([8, 64, 56, 56])}
# {'module.layer1': torch.Size([4, 64, 56, 56])}
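
One possible reading of this output, not confirmed in this thread: if the hook fires once per replica and the capture keeps only the latest output (the default described in the FAQ), each replica overwrites the previous entry and only one half-batch survives. The CPU-only sketch below simulates this with `torch.chunk`; `capture_latest` and `capture_all` are hypothetical stand-ins, not torchextractor functions, and the tensor stands in for the output of `layer1`.

```python
import torch

# Stand-in for the layer1 output of the full batch (N = 8).
full_output = torch.rand(8, 64, 56, 56)
num_replicas = 2  # assumed number of GPUs

latest = {}
accumulated = {}


def capture_latest(name, output):
    """Keep only the most recent output: earlier replicas are overwritten."""
    latest[name] = output


def capture_all(name, output):
    """Keep every output so that no replica is lost."""
    accumulated.setdefault(name, []).append(output)


# Each replica sees one slice of the batch, split along dimension 0.
for replica_output in torch.chunk(full_output, num_replicas, dim=0):
    capture_latest("module.layer1", replica_output)
    capture_all("module.layer1", replica_output)

print(latest["module.layer1"].shape[0])  # 4, half of the batch
merged = torch.cat(accumulated["module.layer1"], dim=0)
print(merged.shape[0])  # 8, the full batch
```

On real GPUs the replicas run concurrently and their outputs live on different devices, so the order of the collected pieces is not guaranteed and they would need to be moved to a common device before concatenation.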

opened by wydwww 5

Releases(v0.3.0)

v0.3.0(Mar 7, 2021)

v0.2.0(Mar 6, 2021)

Owner

Antoine Broyelle
