Code for 'Single Image 3D Shape Retrieval via Cross-Modal Instance and Category Contrastive Learning', ICCV 2021

Last update: Nov 17, 2022

Overview

CMIC-Retrieval

Introduction

In this work, we tackle the problem of single image-based 3D shape retrieval (IBSR), where we seek the best-matching shape in a shape repository for a given 2D image. Most existing works learn to embed 2D images and 3D shapes into a common feature space and perform metric learning using a triplet loss. Inspired by the success of recent contrastive learning works on self-supervised representation learning, we propose a novel IBSR pipeline leveraging contrastive learning. We note that adopting such cross-modal contrastive learning between 2D images and 3D shapes for IBSR tasks is non-trivial and challenging: contrastive learning requires very strong data augmentation in the constructed positive pairs to learn feature invariance, whereas traditional metric learning works do not have this requirement. Moreover, object shape and appearance are entangled in 2D query images, making the learning task more difficult than contrasting single-modal data. To mitigate these challenges, we propose to use multi-view grayscale rendered images of the 3D shapes as the shape representation. We then introduce a strong data augmentation technique based on color transfer, which can significantly but naturally change the appearance of the query image, effectively satisfying the needs of contrastive learning. Finally, we propose to incorporate a novel category-level contrastive loss that helps distinguish similar objects from different categories, in addition to the classic instance-level contrastive loss. Our experiments demonstrate that our approach achieves the best performance on all three popular IBSR benchmarks, Pix3D, Stanford Cars, and Comp Cars, outperforming the previous state of the art by 4% to 15% in retrieval accuracy.
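The introduction names an instance-level and a category-level contrastive loss but does not give their formulas. The sketch below is a generic InfoNCE-style illustration in plain Python, not the implementation shipped in this repository. The function names, the temperature value, and the supervised-contrastive form of the category term are assumptions made for this example.

```python
import math


def l2_normalize(v):
    # Scale a vector to unit length (zero vectors are left unchanged).
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def log_softmax(logits):
    # Numerically stable log-softmax over one row of similarity scores.
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def instance_contrastive_loss(image_emb, shape_emb, temperature=0.1):
    # Assumed setup: image i and shape i form the positive pair;
    # every other shape in the batch acts as a negative.
    img = [l2_normalize(v) for v in image_emb]
    shp = [l2_normalize(v) for v in shape_emb]
    total = 0.0
    for i, query in enumerate(img):
        logits = [dot(query, s) / temperature for s in shp]
        total -= log_softmax(logits)[i]
    return total / len(img)


def category_contrastive_loss(image_emb, shape_emb, labels, temperature=0.1):
    # Assumed supervised-contrastive variant: every shape that shares the
    # query image's category label counts as a positive.
    img = [l2_normalize(v) for v in image_emb]
    shp = [l2_normalize(v) for v in shape_emb]
    total = 0.0
    for i, query in enumerate(img):
        log_probs = log_softmax([dot(query, s) / temperature for s in shp])
        positives = [j for j, lab in enumerate(labels) if lab == labels[i]]
        total -= sum(log_probs[j] for j in positives) / len(positives)
    return total / len(img)


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for image and rendered-shape features.
    images = [[1.0, 0.0], [0.0, 1.0]]
    shapes = [[1.0, 0.0], [0.0, 1.0]]
    print(instance_contrastive_loss(images, shapes))
    print(category_contrastive_loss(images, shapes, labels=["chair", "sofa"]))
```

In this toy setup the instance loss is small when each image embedding points at its own shape and grows when the pairing is swapped. When every item has a distinct category label, the category term reduces to the instance term.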

About this repository

This repository provides data, pre-trained models and code.

Citations

@inProceedings{lin2021cmic,
	title={Single Image 3D Shape Retrieval via Cross-Modal Instance and Category Contrastive Learning},
	author={Lin, Ming-Xian and Yang, Jie and Wang, He and Lai, Yu-Kun and Jia, Rongfei and Zhao, Binqiang and Gao, Lin},
	year={2021},
	booktitle={International Conference on Computer Vision (ICCV)}
}

Updates

[Oct 1, 2021] Preliminary version of data and code released. More code and data are coming soon; please follow our updates.
