Google Landmark Recognition and Retrieval 2021 Solutions

Last update: Nov 25, 2022

This repository contains my solutions and code for the Google Landmark Recognition 2021 and Google Landmark Retrieval 2021 competitions (both finished in the top 100).

Brief Summary

My solution builds on the latest modeling approach from the previous competition, plus strong post-processing based on re-ranking and side models such as detectors. I used a single RTX 3080, EfficientNet-B0, and only the competition data for training.

Model and loss function

As a base, I used the same model and loss as the winning team of the previous competition. With only a single RTX 3080, I didn't have enough time to experiment with it or change it much. The only things I managed to test were a Subcenter ArcMarginProduct as the last block of the model and the ArcFaceLossAdaptiveMargin loss function, which was used by the 2nd place team the previous year. Both gave a significant score boost (around 4% on CV and 5% on LB).
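To make the two additions more concrete, here is a minimal sketch in plain Python of how a sub-center ArcFace head scores a single embedding. It is not the code from this repository: the function names, the margin of 0.3 and the scale of 30 are assumptions chosen for illustration, and the rule for picking per-class margins in the adaptive variant is not shown because the text above does not describe it.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def subcenter_cosines(embedding, class_centers):
    """Return one cosine per class: the best match among that class's sub-centers.

    class_centers[c][k] is sub-center k of class c (hypothetical layout).
    """
    emb = _normalize(embedding)
    cosines = []
    for centers in class_centers:
        sims = [sum(a * b for a, b in zip(emb, _normalize(w))) for w in centers]
        cosines.append(max(sims))  # sub-center pooling: keep the closest center
    return cosines


def arcface_logits(cosines, target, margin=0.3, scale=30.0):
    """Add an angular margin to the target class, then scale all logits.

    margin and scale are illustrative defaults, not values from the source.
    An adaptive-margin loss would pass a different margin for each class.
    """
    logits = []
    for cls, cos_theta in enumerate(cosines):
        if cls == target:
            theta = math.acos(max(-1.0, min(1.0, cos_theta)))
            cos_theta = math.cos(theta + margin)
        logits.append(scale * cos_theta)
    return logits
```

Real implementations work on batches of tensors and also guard the case where the angle plus the margin exceeds pi; this sketch leaves both out to keep the idea visible.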

Setting up the training and validation

Optimizing and scheduling

Optimizer - Ranger (lr=0.003)
Scheduler - CosineAnnealingLR (T_max=12) + 1 epoch Warm-Up
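As a rough illustration of what this schedule looks like per epoch, the sketch below computes the learning rate with the standard closed-form cosine annealing formula after a one-epoch warm-up. The linear shape of the warm-up, the zero minimum learning rate, and holding the rate at its minimum once T_max is reached are all assumptions; the text above does not say how the 13 scheduled epochs map onto the 15 training epochs.

```python
import math

BASE_LR = 0.003     # from the optimizer setting above
T_MAX = 12          # from the scheduler setting above
WARMUP_EPOCHS = 1   # from the scheduler setting above


def lr_at_epoch(epoch, base_lr=BASE_LR, t_max=T_MAX,
                warmup_epochs=WARMUP_EPOCHS, eta_min=0.0):
    """Learning rate for a zero-based epoch index."""
    if epoch < warmup_epochs:
        # Assumption: linear ramp that reaches base_lr right after warm-up.
        return base_lr * (epoch + 1) / (warmup_epochs + 1)
    # Assumption: stay at eta_min once the cosine cycle has finished.
    t = min(epoch - warmup_epochs, t_max)
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * t / t_max)) / 2


# One value per training epoch, useful for plotting or sanity checks.
schedule = [lr_at_epoch(epoch) for epoch in range(15)]
```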

Training stages

I got the best performance by training for 15 epochs in 5 stages:

(1-3) - Resize to image size, Horizontal Flip
(4-6) - Resize to bigger image size, Random Crop to image size, Horizontal Flip
(7-9) - Resize to bigger image size, Random Crop to image size, Horizontal Flip, Coarse Dropout with one big square (CutMix)
(10-12) - Resize to bigger image size, Random Crop to image size, Horizontal Flip, FMix, CutMix, MixUp
(13-15) - Resize to bigger image size, Random Crop to image size, Horizontal Flip
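One simple way to drive such a staged pipeline is a lookup from epoch number to a list of augmentation names, sketched below. The string labels are placeholders rather than calls into a specific augmentation library, and appending normalization to every stage reflects an assumption about where it sits in the pipeline.

```python
# Epoch ranges are 1-based and inclusive, matching the list above.
STAGES = [
    (range(1, 4), ["Resize", "HorizontalFlip"]),
    (range(4, 7), ["ResizeBigger", "RandomCrop", "HorizontalFlip"]),
    (range(7, 10), ["ResizeBigger", "RandomCrop", "HorizontalFlip",
                    "CoarseDropout"]),
    (range(10, 13), ["ResizeBigger", "RandomCrop", "HorizontalFlip",
                     "FMix", "CutMix", "MixUp"]),
    (range(13, 16), ["ResizeBigger", "RandomCrop", "HorizontalFlip"]),
]


def augmentations_for_epoch(epoch):
    """Return the augmentation labels for a 1-based epoch number."""
    for epochs, augmentations in STAGES:
        if epoch in epochs:
            # Assumption: normalization is applied last in every stage.
            return augmentations + ["Normalize"]
    raise ValueError(f"epoch {epoch} is outside the 15-epoch schedule")
```

A training loop could call `augmentations_for_epoch(epoch)` at the start of each epoch and rebuild the data loader only when the returned list changes.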

I used the default normalization in all epochs.

Validation scheme

Because of limited hardware, this was my first competition where I couldn't use K-fold validation. However, CV was stable and correlated well with LB in the previous competitions, so I used a simple stratified train-test split with a 0.8/0.2 ratio. I also oversampled every class with fewer than 5 samples up to 5.
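The split and the oversampling can be sketched with the standard library alone, as below. The function names and the `(image_id, label)` tuple layout are hypothetical, and the order of the two steps is an assumption: the usage example splits first and oversamples only the training part so that duplicated images cannot end up in validation, which the text above does not confirm.

```python
import random
from collections import defaultdict


def _group_by_label(samples):
    """Group (image_id, label) pairs into {label: [image_id, ...]}."""
    groups = defaultdict(list)
    for image_id, label in samples:
        groups[label].append(image_id)
    return groups


def stratified_split(samples, val_fraction=0.2, seed=0):
    """Split every class separately so both parts keep the class balance."""
    rng = random.Random(seed)
    train, val = [], []
    for label, ids in _group_by_label(samples).items():
        ids = ids[:]
        rng.shuffle(ids)
        n_val = int(round(len(ids) * val_fraction))
        val.extend((i, label) for i in ids[:n_val])
        train.extend((i, label) for i in ids[n_val:])
    return train, val


def oversample_to_min(samples, min_per_class=5, seed=0):
    """Repeat random images of rare classes until each has min_per_class."""
    rng = random.Random(seed)
    result = []
    for label, ids in _group_by_label(samples).items():
        missing = max(0, min_per_class - len(ids))
        extra = [rng.choice(ids) for _ in range(missing)]
        result.extend((i, label) for i in ids + extra)
    return result


# Usage with toy data: 10 images of class "a", 2 images of class "b".
toy = [(f"a{i}", "a") for i in range(10)] + [("b0", "b"), ("b1", "b")]
toy_train, toy_val = stratified_split(toy)
toy_train = oversample_to_min(toy_train)
```

Very small classes contribute no validation images under this rounding, which is one reason a plain stratified split is only a rough proxy on long-tailed landmark data.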

Inference and Post-Processing:

Change a class to non-landmark if it was predicted more than 20 times.
Use a pretrained YOLOv5 to detect non-landmark images. All classes are used, and boxes with confidence < 0.5 are dropped. If the total area of the boxes is greater than total_image_area / 2.7, the sample is marked as non-landmark. I also tried using YOLOv5 to clean the train dataset, but that only lowered the score.
Tuned post-processing from this paper, based on the cosine similarity of train and test images to non-landmark ones.
Use a higher image size when extracting embeddings at inference.
Also use the public train dataset as external data for extracting embeddings.
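The first two rules above are simple enough to sketch in a few lines. The code below is an illustration under stated assumptions, not the repository's implementation: the function names, the use of `None` as the non-landmark marker, the `(x1, y1, x2, y2, confidence)` box layout, and summing box areas without correcting for overlaps are all hypothetical choices. The thresholds 20, 0.5 and 2.7 come from the text.

```python
from collections import Counter

NON_LANDMARK = None  # hypothetical marker for "no landmark predicted"


def suppress_frequent_classes(predictions, max_count=20):
    """Replace classes predicted more than max_count times with NON_LANDMARK."""
    counts = Counter(p for p in predictions if p is not NON_LANDMARK)
    return [
        NON_LANDMARK if p is not NON_LANDMARK and counts[p] > max_count else p
        for p in predictions
    ]


def is_non_landmark_by_boxes(boxes, image_width, image_height,
                             min_conf=0.5, area_divisor=2.7):
    """Flag an image when confident detector boxes cover too much of it.

    boxes: iterable of (x1, y1, x2, y2, confidence) in pixels, for example
    the output of an object detector run over all of its classes.
    """
    total_area = 0.0
    for x1, y1, x2, y2, confidence in boxes:
        if confidence < min_conf:
            continue  # low-confidence boxes are dropped
        # Assumption: overlapping boxes are simply counted twice.
        total_area += max(0.0, x2 - x1) * max(0.0, y2 - y1)
    return total_area > (image_width * image_height) / area_divisor
```

For a 100 x 100 image the area threshold is roughly 3704 pixels, so a single confident 70 x 70 box (4900 pixels) is enough to mark the image as non-landmark, while a 50 x 50 box is not.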

Didn't work for me

Knowledge Distillation
ResNet architectures (on average they were worse than EfficientNets)
Adding an external non-landmark class to training, taken from the 2019 test dataset
Training a binary non-landmark classifier

Transfer learning on the full dataset and label smoothing should be useful here, but I didn't have time to test them.

Owner

Vadim Timakin
