πŸ† The 1st Place Submission to AICity Challenge 2021 Natural Language-Based Vehicle Retrieval Track (Alibaba-UTS submission)

Overview

AI City 2021: Connecting Language and Vision for Natural Language-Based Vehicle Retrieval

πŸ† The 1st Place Submission to AICity Challenge 2021 Natural Language-Based Vehicle Retrieval Track (Alibaba-UTS submission)

framework

We have two codebases. For the final submission, we conduct the feature ensemble, where features are from two codebases.

Part One is at here: https://github.com/ShuaiBai623/AIC2021-T5-CLV

Part Two is at here: https://github.com/layumi/NLP-AICity2021

Prepare

  • Preprocess the dataset to prepare frames, motion maps, NLP augmentation

scripts/extract_vdo_frms.py is a Python script that is used to extract frames.

scripts/get_motion_maps.py is a Python script that is used to get motion maps.

scripts/deal_nlpaug.py is a Python script that is used for NLP augmentation.

  • Download the pretrained models of Part One to checkpoints. The checkpoints can be found here. The best score of a single model on TestA is 0.1927 from motion_effb3_NOCLS_nlpaug_320.pth.

The directory structures in data and checkpoints are as follows:

.
β”œβ”€β”€ checkpoints
β”‚   β”œβ”€β”€ motion_effb2_1CLS_nlpaug_288.pth
β”‚   β”œβ”€β”€ motion_effb3_NOCLS_nlpaug_320.pth
β”‚   β”œβ”€β”€ motion_SE_3CLS_nonlpaug_288.pth
β”‚   β”œβ”€β”€ motion_SE_NOCLS_nlpaug_288.pth
β”‚   └── motion_SE_NOCLS_nonlpaug_288.pth
└── data
    β”œβ”€β”€ AIC21_Track5_NL_Retrieval
    β”‚   β”œβ”€β”€ train
    β”‚   └── validation
    β”œβ”€β”€ motion_map 
    β”œβ”€β”€ test-queries.json
    β”œβ”€β”€ test-queries_nlpaug.json    ## NLP augmentation (Refer to scripts/deal_nlpaug.py)
    β”œβ”€β”€ test-tracks.json
    β”œβ”€β”€ train.json
    β”œβ”€β”€ train_nlpaug.json
    β”œβ”€β”€ train-tracks.json
    β”œβ”€β”€ train-tracks_nlpaug.json    ## NLP augmentation (Refer to scripts/deal_nlpaug.py)
    β”œβ”€β”€ val.json
    └── val_nlpaug.json             ## NLP augmentation (Refer to scripts/deal_nlpaug.py)

Part One

  • Modify the data paths in config.py

Train

The configuration files are in configs.

CUDA_VISIBLE_DEVICES=0,1,2,3 python -u main.py --name your_experiment_name --config your_config_file |tee log

Test

Change the RESTORE_FROM in your configuration file.

python -u test.py --config your_config_file

Extract the visual and text embeddings. The extracted embeddings can be found here.

python -u test.py --config configs/motion_effb2_1CLS_nlpaug_288.yaml
python -u test.py --config configs/motion_SE_NOCLS_nlpaug_288.yaml
python -u test.py --config configs/motion_effb2_1CLS_nlpaug_288.yaml
python -u test.py --config configs/motion_SE_3CLS_nonlpaug_288.yaml
python -u test.py --config configs/motion_SE_NOCLS_nonlpaug_288.yaml

Part Two

Link

Submission

During the inference, we average all the frame features of the target in each track as track features, the embeddings of text descriptions are also averaged as the query features. The cosine distance is used for ranking as the final result.

  • Reproduce the best submission. ALL extracted embeddings are in the folder output:
python scripts/get_submit.py

Friend Links:

This repository is for the preprint "A generative nonparametric Bayesian model for whole genomes"

BEAR Overview This repository contains code associated with the preprint A generative nonparametric Bayesian model for whole genomes (2021), which pro

Debora Marks Lab 10 Sep 18, 2022
Build upon neural radiance fields to create a scene-specific implicit 3D semantic representation, Semantic-NeRF

Semantic-NeRF: Semantic Neural Radiance Fields Project Page | Video | Paper | Data In-Place Scene Labelling and Understanding with Implicit Scene Repr

Shuaifeng Zhi 243 Jan 07, 2023
Code image classification of MNIST dataset using different architectures: simple linear NN, autoencoder, and highway network

Deep Learning for image classification pip install -r http://webia.lip6.fr/~baskiotisn/requirements-amal.txt Train an autoencoder python3 train_auto

Hector Kohler 0 Mar 30, 2022
Learning Time-Critical Responses for Interactive Character Control

Learning Time-Critical Responses for Interactive Character Control Abstract This code implements the paper Learning Time-Critical Responses for Intera

Movement Research Lab 227 Dec 31, 2022
Use unsupervised and supervised learning to predict stocks

AIAlpha: Multilayer neural network architecture for stock return prediction This project is meant to be an advanced implementation of stacked neural n

Vivek Palaniappan 1.5k Dec 26, 2022
Official PyTorch implementation for paper "Efficient Two-Stage Detection of Human–Object Interactions with a Novel Unary–Pairwise Transformer"

UPT: Unary–Pairwise Transformers This repository contains the official PyTorch implementation for the paper Frederic Z. Zhang, Dylan Campbell and Step

Frederic Zhang 109 Dec 20, 2022
Dynamic wallpaper generator.

Wiki β€’ About β€’ Installation About This project is a dynamic wallpaper changer. It waits untill you turn on the music, downloads album cover if it's po

3 Sep 18, 2021
PyTorch Implementation of Region Similarity Representation Learning (ReSim)

ReSim This repository provides the PyTorch implementation of Region Similarity Representation Learning (ReSim) described in this paper: @Article{xiao2

Tete Xiao 74 Jan 03, 2023
The implementation of the algorithm in the paper "Safe Deep Semi-Supervised Learning for Unseen-Class Unlabeled Data" published in ICML 2020.

DS3L This is the code for paper "Safe Deep Semi-Supervised Learning for Unseen-Class Unlabeled Data" published in ICML 2020. Setups The code is implem

Guolz 36 Oct 19, 2022
MDMM - Learning multi-domain multi-modality I2I translation

Multi-Domain Multi-Modality I2I translation Pytorch implementation of multi-modality I2I translation for multi-domains. The project is an extension to

Hsin-Ying Lee 107 Nov 04, 2022
68 keypoint annotations for COFW test data

68 keypoint annotations for COFW test data This repository contains manually annotated 68 keypoints for COFW test data (original annotation of CFOW da

31 Dec 06, 2022
LETR: Line Segment Detection Using Transformers without Edges

LETR: Line Segment Detection Using Transformers without Edges Introduction This repository contains the official code and pretrained models for Line S

mlpc-ucsd 157 Jan 06, 2023
RTSeg: Real-time Semantic Segmentation Comparative Study

Real-time Semantic Segmentation Comparative Study The repository contains the official TensorFlow code used in our papers: RTSEG: REAL-TIME SEMANTIC S

Mennatullah Siam 592 Nov 18, 2022
Summary Explorer is a tool to visually explore the state-of-the-art in text summarization.

Summary Explorer Summary Explorer is a tool to visually inspect the summaries from several state-of-the-art neural summarization models across multipl

Webis 42 Aug 14, 2022
StarGAN v2 - Official PyTorch Implementation (CVPR 2020)

StarGAN v2 - Official PyTorch Implementation StarGAN v2: Diverse Image Synthesis for Multiple Domains Yunjey Choi*, Youngjung Uh*, Jaejun Yoo*, Jung-W

Clova AI Research 3.1k Jan 09, 2023
Face2webtoon - Despite its importance, there are few previous works applying I2I translation to webtoon.

Despite its importance, there are few previous works applying I2I translation to webtoon. I collected dataset from naver webtoon μ—°μ• ν˜λͺ… and tried to transfer human faces to webtoon domain.

μ΄μƒμœ€ 64 Oct 19, 2022
ML course - EPFL Machine Learning Course, Fall 2021

EPFL Machine Learning Course CS-433 Machine Learning Course, Fall 2021 Repository for all lecture notes, labs and projects - resources, code templates

EPFL Machine Learning and OptimizationΒ Laboratory 1k Jan 04, 2023
Finite Element Analysis

FElupe - Finite Element Analysis FElupe is a Python 3.6+ finite element analysis package focussing on the formulation and numerical solution of nonlin

Andreas D. 20 Jan 09, 2023
Official repository of "DeepMIH: Deep Invertible Network for Multiple Image Hiding", TPAMI 2022.

DeepMIH: Deep Invertible Network for Multiple Image Hiding (TPAMI 2022) This repo is the official code for DeepMIH: Deep Invertible Network for Multip

Junpeng Jing 67 Nov 22, 2022
A collection of Jupyter notebooks to play with NVIDIA's StyleGAN3 and OpenAI's CLIP for a text-based guided image generation.

StyleGAN3 CLIP-based guidance StyleGAN3 + CLIP StyleGAN3 + inversion + CLIP This repo is a collection of Jupyter notebooks made to easily play with St

Eugenio Herrera 176 Dec 30, 2022