Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting

Last update: Dec 18, 2022

Overview

This is the official PyTorch implementation of "Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting".

The paper, submitted to TIP, is an extension of the previous arXiv paper.

This project aims to:

- provide a baseline for pedestrian attribute recognition;
- provide two new datasets, RAPzs and PETAzs, that follow the zero-shot pedestrian identity setting;
- provide a general training pipeline for pedestrian attribute recognition and multi-label classification tasks.
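
The zero-shot pedestrian identity setting can be illustrated with a small sketch: images are grouped by pedestrian identity, and whole identities (not individual images) are assigned to the training or test split, so that no identity appears in both. The function name `split_by_identity`, the `(image_name, identity)` pair format, and the `test_ratio` parameter are assumptions made for illustration. They are not taken from this repository, and the procedure used to build PETAzs and RAPzs in the paper may apply additional criteria.

```python
import random
from collections import defaultdict


def split_by_identity(samples, test_ratio=0.2, seed=0):
    """Split (image_name, identity) pairs so that no identity is shared.

    Hypothetical helper for illustration only; not part of this repository.
    """
    # Group image names by the pedestrian identity they belong to.
    by_identity = defaultdict(list)
    for image_name, identity in samples:
        by_identity[identity].append(image_name)

    # Shuffle identities (not images) with a fixed seed for reproducibility.
    identities = sorted(by_identity)
    random.Random(seed).shuffle(identities)

    # Reserve a fraction of the identities, with all their images, for testing.
    n_test = max(1, round(len(identities) * test_ratio))
    test_ids = set(identities[:n_test])

    train = [img for pid in identities if pid not in test_ids
             for img in by_identity[pid]]
    test = [img for pid in identities if pid in test_ids
            for img in by_identity[pid]]
    return train, test, test_ids


# Toy data: 20 images of 5 hypothetical pedestrians.
samples = [(f"img_{i}.png", f"person_{i % 5}") for i in range(20)]
train_images, test_images, test_identities = split_by_identity(samples)
```

Splitting at the identity level is what distinguishes this setting from a random image-level split, where images of the same pedestrian can end up in both training and test sets.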

This project provides:

- DDP training, which is mainly used for multi-label classification.
- Training on all attributes and testing only on the "selected" attributes. The remaining attributes are excluded from evaluation because their proportion of positive samples is below a threshold, such as 0.01.
  1. For PETA and PETAzs, 35 of the 105 attributes are selected for performance evaluation.
  2. For RAPv1, 51 of the 92 attributes are selected for performance evaluation.
  3. For RAPv2 and RAPzs, 54 and 53 of the 152 attributes, respectively, are selected for performance evaluation.
  4. For PA100k, all attributes are selected for performance evaluation.
  - However, training on all attributes does not bring a consistent performance improvement across datasets.
- An EMA (exponential moving average) model.
- Transformer-based models, such as swin-transformer (with a large performance improvement) and vit.
- A convenient dataset info file, such as dataset_all.pkl.
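
The selection rule for evaluation attributes can be sketched as follows: an attribute is kept when its share of positive samples reaches the threshold. This is a minimal illustration in plain Python. The function name `select_attributes`, the row-per-image label layout, and the toy attribute names are assumptions, not the repository's own code, and the actual selected attribute lists are the ones defined for each dataset.

```python
def select_attributes(labels, attr_names, min_pos_ratio=0.01):
    """Return the attributes whose positive-sample ratio meets the threshold.

    labels: list of rows, one per image; each row holds one 0/1 value per attribute.
    Hypothetical helper for illustration only.
    """
    n_samples = len(labels)
    selected = []
    for j, name in enumerate(attr_names):
        # Fraction of images in which attribute j is annotated as present.
        pos_ratio = sum(row[j] for row in labels) / n_samples
        if pos_ratio >= min_pos_ratio:
            selected.append(name)
    return selected


# Toy example: 200 images and 3 made-up attributes.
attr_names = ["hat", "backpack", "umbrella"]
labels = [[1 if i % 2 == 0 else 0,   # "hat": 50% positive
           1 if i % 10 == 0 else 0,  # "backpack": 10% positive
           1 if i == 0 else 0]       # "umbrella": 0.5% positive
          for i in range(200)]
selected = select_attributes(labels, attr_names)
```

In this toy example only "umbrella" falls below the 0.01 threshold, so it would still be used for training but would be left out of the evaluation.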

Dataset Info

PETA: Pedestrian Attribute Recognition At Far Distance [Paper][Project]
PA100K [Paper][Github]
RAP : A Richly Annotated Dataset for Pedestrian Attribute Recognition
- v1 [Paper][Project]
- v2 [Paper][Project]
PETAzs & RAPzs : Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting [Paper][Project]

Performance

Pedestrian Attribute Recognition

Datasets	Models	mA	Acc	Prec	Rec	F1
PA100k	resnet50	80.21	79.15	87.79	87.01	87.40
--	resnet50*	79.85	79.13	89.45	85.40	87.38
--	resnet50 + EMA	81.97	80.20	88.06	88.17	88.11
--	bninception	79.13	78.19	87.42	86.21	86.81
--	TresnetM	74.46	68.72	79.82	80.71	80.26
--	swin_s	82.19	80.35	87.85	88.51	88.18
--	vit_s	79.40	77.61	86.41	86.22	86.32
--	vit_b	81.01	79.38	87.60	87.49	87.55
PETA	resnet50	83.96	78.65	87.08	85.62	86.35
PETAzs	resnet50	71.43	58.69	74.41	69.82	72.04
RAPv1	resnet50	79.27	67.98	80.19	79.71	79.95
RAPv2	resnet50	78.52	66.09	77.20	80.23	78.68
RAPzs	resnet50	71.76	64.83	78.75	76.60	77.66

The resnet50* model is trained using the weighting function proposed by Tan (AAAI 2020).
Performance on PETAzs and RAPzs is based on the first version of PETAzs and RAPzs, as described in the paper.
Experiments are conducted with an input size of (256, 192), so there may be minor differences from the results in the paper.
The reported performance is reached at the first drop of the learning rate; we take the model at that point as the best model.
Pretrained models are now available on Google Drive.
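
For readers unfamiliar with the column names in the table, the sketch below computes the five metrics under the definitions commonly used in the pedestrian attribute recognition literature: a label-based mean accuracy (per attribute, the average of the accuracy on positive and on negative samples) and instance-based Acc, Prec, Rec, and F1 (computed per image, then averaged). It is assumed here that the table follows these definitions. The function `par_metrics` is a plain-Python illustration, not the repository's evaluation code.

```python
def par_metrics(gt, pred, eps=1e-20):
    """gt, pred: lists of rows (one per image) of 0/1 values (one per attribute).

    Illustrative helper; eps only guards against division by zero.
    """
    n_samples, n_attrs = len(gt), len(gt[0])

    # Label-based mean accuracy: per attribute, average the accuracy
    # on positive samples and the accuracy on negative samples.
    per_attr = []
    for j in range(n_attrs):
        tp = sum(g[j] == 1 and p[j] == 1 for g, p in zip(gt, pred))
        tn = sum(g[j] == 0 and p[j] == 0 for g, p in zip(gt, pred))
        n_pos = sum(g[j] for g in gt)
        n_neg = n_samples - n_pos
        per_attr.append((tp / (n_pos + eps) + tn / (n_neg + eps)) / 2)
    mean_acc = sum(per_attr) / n_attrs

    # Instance-based metrics: computed per image, then averaged over images.
    acc = prec = rec = 0.0
    for g, p in zip(gt, pred):
        inter = sum(a and b for a, b in zip(g, p))  # attributes in both sets
        union = sum(a or b for a, b in zip(g, p))   # attributes in either set
        acc += inter / (union + eps)
        prec += inter / (sum(p) + eps)
        rec += inter / (sum(g) + eps)
    acc, prec, rec = acc / n_samples, prec / n_samples, rec / n_samples
    f1 = 2 * prec * rec / (prec + rec + eps)
    return {"mA": mean_acc, "Acc": acc, "Prec": prec, "Rec": rec, "F1": f1}


# Two images, three attributes.
gt = [[1, 0, 1], [0, 1, 0]]
pred = [[1, 0, 0], [0, 1, 1]]
metrics = par_metrics(gt, pred)
```

The predictions passed in are already binarized; how the model outputs are thresholded is a separate choice that this sketch does not cover.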

Multi-label Classification

Datasets	Models	mAP	CP	CR	CF1	OP	OR	OF1
COCO	resnet101	82.75	84.17	72.07	77.65	85.16	75.47	80.02
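
The per-class and overall columns above can be illustrated with a short sketch, assuming the definitions that are standard in multi-label classification: CP and CR average precision and recall over classes, OP and OR pool the counts over all classes, and CF1 and OF1 are the corresponding harmonic means. mAP is omitted because it needs ranked scores rather than binary predictions. The function `multilabel_metrics` is illustrative and is not the repository's evaluation code.

```python
def multilabel_metrics(gt, pred):
    """gt, pred: lists of rows (one per image) of 0/1 values (one per class)."""
    n_classes = len(gt[0])
    tp = [0] * n_classes      # correctly predicted positives per class
    n_pred = [0] * n_classes  # predicted positives per class
    n_gt = [0] * n_classes    # ground-truth positives per class
    for g, p in zip(gt, pred):
        for c in range(n_classes):
            tp[c] += g[c] and p[c]
            n_pred[c] += p[c]
            n_gt[c] += g[c]

    def ratio(num, den):
        # Treat an empty denominator as a score of zero.
        return num / den if den else 0.0

    def f1(p, r):
        return ratio(2 * p * r, p + r)

    # Per-class: compute precision/recall for each class, then average.
    cp = sum(ratio(tp[c], n_pred[c]) for c in range(n_classes)) / n_classes
    cr = sum(ratio(tp[c], n_gt[c]) for c in range(n_classes)) / n_classes
    # Overall: pool the counts over all classes first.
    op = ratio(sum(tp), sum(n_pred))
    o_r = ratio(sum(tp), sum(n_gt))
    return {"CP": cp, "CR": cr, "CF1": f1(cp, cr),
            "OP": op, "OR": o_r, "OF1": f1(op, o_r)}


# Four images, two classes.
gt = [[1, 0], [1, 1], [0, 1], [1, 0]]
pred = [[1, 0], [1, 0], [0, 1], [0, 1]]
scores = multilabel_metrics(gt, pred)
```

Per-class averages weight every class equally, while the overall scores are dominated by frequent classes, which is why both groups are usually reported together.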

Pretrained Models

Dependencies

python 3.7
pytorch 1.7.0
torchvision 0.8.2
cuda 10.1

Get Started

Run git clone https://github.com/valencebond/Rethinking_of_PAR.git
Create a directory to download the datasets above.
```
cd Rethinking_of_PAR
mkdir data
```

Prepare the datasets so that they have the following structure:

${project_dir}/data
    PETA
        images/
        PETA.mat
        dataset_all.pkl
        dataset_zs_run0.pkl
    PA100k
        data/
        dataset_all.pkl
    RAP
        RAP_dataset/
        RAP_annotation/
        dataset_all.pkl
    RAP2
        RAP_dataset/
        RAP_annotation/
        dataset_zs_run0.pkl
    COCO14
        train2014/
        val2014/
        ml_anno/
            category.json
            coco14_train_anno.pkl
            coco14_val_anno.pkl
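
Before training, it can help to confirm that the layout above is in place. The following sketch lists the entries that are missing under a data directory. The `EXPECTED` mapping is copied from the structure shown above, while the helper `missing_entries` and its parameters are hypothetical conveniences that are not part of this repository.

```python
from pathlib import Path

# Relative paths copied from the layout above; trim to the datasets you use.
EXPECTED = {
    "PETA": ["images", "PETA.mat", "dataset_all.pkl", "dataset_zs_run0.pkl"],
    "PA100k": ["data", "dataset_all.pkl"],
    "RAP": ["RAP_dataset", "RAP_annotation", "dataset_all.pkl"],
    "RAP2": ["RAP_dataset", "RAP_annotation", "dataset_zs_run0.pkl"],
    "COCO14": ["train2014", "val2014", "ml_anno/category.json",
               "ml_anno/coco14_train_anno.pkl", "ml_anno/coco14_val_anno.pkl"],
}


def missing_entries(data_root, datasets=None):
    """Return the expected files/directories that do not exist under data_root.

    Hypothetical helper for illustration only.
    """
    root = Path(data_root)
    names = datasets if datasets is not None else list(EXPECTED)
    return [(Path(name) / entry).as_posix()
            for name in names
            for entry in EXPECTED[name]
            if not (root / name / entry).exists()]


if __name__ == "__main__":
    # Example: check only PA100k under ./data (run from the project directory).
    for path in missing_entries("data", datasets=["PA100k"]):
        print("missing:", path)
```

Passing a `datasets` list keeps the check limited to the datasets that were actually downloaded.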

Train the baseline based on resnet50:
```
sh train.sh
```

Acknowledgements

The code is based on the repositories of Dangwei Li and Houjing Huang. Thanks to them for releasing their code.

Citation

If you use this method or this code in your research, please cite as:

@article{jia2021rethinking,
  title={Rethinking of Pedestrian Attribute Recognition: A Reliable Evaluation under Zero-Shot Pedestrian Identity Setting},
  author={Jia, Jian and Huang, Houjing and Chen, Xiaotang and Huang, Kaiqi},
  journal={arXiv preprint arXiv:2107.03576},
  year={2021}
}
