Source code of AAAI 2022 paper "Towards End-to-End Image Compression and Analysis with Transformers".

Last update: Dec 21, 2022

Overview

Towards End-to-End Image Compression and Analysis with Transformers


Usage

The code runs with Python 3.7, PyTorch 1.8.1, timm 0.4.9, and CompressAI 1.1.4.

Data preparation

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure follows the standard layout for torchvision's datasets.ImageFolder: training data is expected in the train folder and validation data in the val folder, each with one subfolder per class:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
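Before launching a long run, it can help to confirm that the extracted data matches the layout above. The following is a minimal sketch using only the Python standard library; the function name summarize_imagefolder is hypothetical and is not part of this repository.

```python
from pathlib import Path


def summarize_imagefolder(root):
    """Return {split: {class_name: file_count}} for the train/ and val/ splits.

    Hypothetical helper (not part of this repository). It only checks the
    folder layout; it does not open or validate the image files.
    """
    root = Path(root)
    summary = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split folder: {split_dir}")
        classes = {}
        # Each immediate subfolder of a split is treated as one class.
        for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            classes[class_dir.name] = sum(
                1 for f in class_dir.iterdir() if f.is_file()
            )
        if not classes:
            raise ValueError(f"no class folders found in {split_dir}")
        summary[split] = classes
    return summary
```

Calling summarize_imagefolder("/path/to/imagenet/") returns a nested dictionary of per-class file counts, or raises if a split folder is missing or has no class subfolders.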

Pretrained model

The ./pretrained_model directory provides the pretrained model without compression.

Test

Please adjust --data-path and run sh test.sh:

python main.py --eval --resume ./pretrain_s/checkpoint.pth --model pretrained_model --data-path /path/to/imagenet/ --output_dir ./eval

The ./pretrain_s/checkpoint.pth can be downloaded from Baidu Netdisk, with access code aaai.

Train

Please adjust --data-path and run sh train.sh:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model pretrained_model --no-model-ema --clip-grad 1.0 --batch-size 128 --num_workers 16 --data-path /path/to/imagenet/ --output_dir ./ckp_pretrain

Full model

The ./full_model directory provides the full model with compression.

Test

Please adjust --data-path and --resume, then run sh test.sh:

python main.py --eval --resume ./ckp_s_q1/checkpoint.pth --model full_model --no-pretrained --data-path /path/to/imagenet/ --output_dir ./eval

The ./ckp_s_q1/checkpoint.pth, ./ckp_s_q2/checkpoint.pth and ./ckp_s_q3/checkpoint.pth can be downloaded from Baidu Netdisk, with access code aaai.
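To evaluate all three checkpoints in turn, the test command can be generated per quality level. This is only a sketch: it assumes the command shown above applies unchanged to each checkpoint apart from the --resume path, which this README does not state explicitly. The function build_eval_command is hypothetical; the flags and paths are copied from the text above.

```python
import shlex


def build_eval_command(quality, data_path, output_dir="./eval"):
    """Build the evaluation command for one checkpoint as an argument list.

    Hypothetical helper (not part of this repository). Only the --resume
    path changes between quality levels; that is an assumption.
    """
    if quality not in (1, 2, 3):
        raise ValueError(f"unsupported quality level: {quality!r}")
    return [
        "python", "main.py", "--eval",
        "--resume", f"./ckp_s_q{quality}/checkpoint.pth",
        "--model", "full_model",
        "--no-pretrained",
        "--data-path", data_path,
        "--output_dir", output_dir,
    ]


# Print one shell-ready command per checkpoint without running anything.
for q in (1, 2, 3):
    cmd = build_eval_command(q, "/path/to/imagenet/")
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The argument list can be passed to subprocess.run if the commands should be executed from Python rather than copied into a shell.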

Train

Download ./pretrain_s/checkpoint.pth from Baidu Netdisk (access code aaai), then adjust --data-path and --quality. The available quality levels and their alpha and beta values are:

quality	alpha	beta
1	0.1	0.001
2	0.3	0.003
3	0.6	0.006
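For scripts that need these values, the table above can be mirrored as a small lookup. This is a sketch only: the names QUALITY_SETTINGS and settings_for_quality are hypothetical, and how the repository's main.py actually maps --quality to alpha and beta is not shown in this README.

```python
# Values copied from the quality table above; the names are hypothetical.
QUALITY_SETTINGS = {
    1: {"alpha": 0.1, "beta": 0.001},
    2: {"alpha": 0.3, "beta": 0.003},
    3: {"alpha": 0.6, "beta": 0.006},
}


def settings_for_quality(quality):
    """Return the alpha/beta pair listed for a quality level (1, 2, or 3)."""
    try:
        # Return a copy so callers cannot mutate the shared table.
        return dict(QUALITY_SETTINGS[quality])
    except KeyError:
        raise ValueError(f"unsupported quality level: {quality!r}") from None
```

In every listed row, beta equals alpha divided by 100, and both values increase with the quality level.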

Run sh train.sh:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model full_model --batch-size 128 --num_workers 16 --clip-grad 1.0 --quality 1 --data-path /path/to/imagenet/ --output_dir ./ckp_full

Citation

@InProceedings{Bai2022AAAI,
  title={Towards End-to-End Image Compression and Analysis with Transformers},
  author={Bai, Yuanchao and Yang, Xu and Liu, Xianming and Jiang, Junjun and Wang, Yaowei and Ji, Xiangyang and Gao, Wen},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2022}
}
