Includes PyTorch -> Keras model porting code for ConvNeXt family of models with fine-tuning and inference notebooks.

Overview

ConvNeXt-TF

This repository provides TensorFlow / Keras implementations of different ConvNeXt [1] variants. It also provides the TensorFlow / Keras models that have been populated with the original ConvNeXt pre-trained weights available from [2]. These models are not blackbox SavedModels i.e., they can be fully expanded into tf.keras.Model objects and one can call all the utility functions on them (example: .summary()).

As of today, all the TensorFlow / Keras variants of the models listed here are available in this repository except for the isotropic ones. This list includes the ImageNet-1k as well as ImageNet-21k models.

Refer to the "Using the models" section to get started. Additionally, here's a related blog post that jots down my experience.

Conversion

TensorFlow / Keras implementations are available in models/convnext_tf.py. Conversion utilities are in convert.py.

Models

The converted models are available on TF-Hub.

There should be a total of 15 different models each having two variants: classifier and feature extractor. You can load any model and get started like so:

import tensorflow as tf

model_gcs_path = "gs://tfhub-modules/sayakpaul/convnext_tiny_1k_224/1/uncompressed"
model = tf.keras.models.load_model(model_gcs_path)
print(model.summary(expand_nested=True))

The model names are interpreted as follows:

  • convnext_large_21k_1k_384: This means that the model was first pre-trained on the ImageNet-21k dataset and was then fine-tuned on the ImageNet-1k dataset. Resolution used during pre-training and fine-tuning: 384x384. large denotes the topology of the underlying model.
  • convnext_large_1k_224: Means that the model was pre-trained on the ImageNet-1k dataset with a resolution of 224x224.

Results

Results are on ImageNet-1k validation set (top-1 accuracy).

name original [email protected] keras [email protected]
convnext_tiny_1k_224 82.1 81.312
convnext_small_1k_224 83.1 82.392
convnext_base_1k_224 83.8 83.28
convnext_base_1k_384 85.1 84.876
convnext_large_1k_224 84.3 83.844
convnext_large_1k_384 85.5 85.376
convnext_base_21k_1k_224 85.8 85.364
convnext_base_21k_1k_384 86.8 86.79
convnext_large_21k_1k_224 86.6 86.36
convnext_large_21k_1k_384 87.5 87.504
convnext_xlarge_21k_1k_224 87.0 86.732
convnext_xlarge_21k_1k_384 87.8 87.68

Differences in the results are primarily because of the differences in the library implementations especially how image resizing is implemented in PyTorch and TensorFlow. Results can be verified with the code in i1k_eval. Logs are available at this URL.

Using the models

Pre-trained models:

Randomly initialized models:

from models.convnext_tf import get_convnext_model

convnext_tiny = get_convnext_model()
print(convnext_tiny.summary(expand_nested=True))

To view different model configurations, refer here.

Upcoming (contributions welcome)

  • Align layer initializers (useful if someone wanted to train the models from scratch)
  • Allow the models to accept arbitrary shapes (useful for downstream tasks)
  • Convert the isotropic models as well
  • Fine-tuning notebook (thanks to awsaf49)
  • Off-the-shelf-classification notebook
  • Publish models on TF-Hub

References

[1] ConvNeXt paper: https://arxiv.org/abs/2201.03545

[2] Official ConvNeXt code: https://github.com/facebookresearch/ConvNeXt

Acknowledgements

Owner
Sayak Paul
ML Engineer at @carted | One PR at a time
Sayak Paul
Parasite: a tool allowing you to compress and decompress files, to reduce their size

🦠 Parasite 🦠 Parasite is a tool written in Python3 allowing you to "compress" any file, reducing its size. ⭐ Features ⭐ + Fast + Good optimization,

Billy 30 Nov 25, 2022
MMDetection3D is an open source object detection toolbox based on PyTorch

MMDetection3D is an open source object detection toolbox based on PyTorch, towards the next-generation platform for general 3D detection. It is a part of the OpenMMLab project developed by MMLab.

OpenMMLab 3.2k Jan 05, 2023
Auto-updating data to assist in investment to NEPSE

Symbol Ratios Summary Sector LTP Undervalued Bonus % MEGA Strong Commercial Banks 368 5 10 JBBL Strong Development Banks 568 5 10 SIFC Strong Finance

Amit Chaudhary 16 Nov 01, 2022
[ICRA2021] Reconstructing Interactive 3D Scene by Panoptic Mapping and CAD Model Alignment

Interactive Scene Reconstruction Project Page | Paper This repository contains the implementation of our ICRA2021 paper Reconstructing Interactive 3D

97 Dec 28, 2022
Planner_backend - Academic planner application designed for students and counselors.

Planner (backend) Academic planner application designed for students and advisors.

2 Dec 31, 2021
Resources for the Ki testnet challenge

Ki Testnet Challenge This repository hosts ki-testnet-challenge. A set of scripts and resources to be used for the Ki Testnet Challenge What is the te

Ki Foundation 23 Aug 08, 2022
Neural Turing Machine (NTM) & Differentiable Neural Computer (DNC) with pytorch & visdom

Neural Turing Machine (NTM) & Differentiable Neural Computer (DNC) with pytorch & visdom Sample on-line plotting while training(avg loss)/testing(writ

Jingwei Zhang 269 Nov 15, 2022
A-ESRGAN aims to provide better super-resolution images by using multi-scale attention U-net discriminators.

A-ESRGAN: Training Real-World Blind Super-Resolution with Attention-based U-net Discriminators The authors are hidden for the purpose of double blind

77 Dec 16, 2022
A PyTorch library for Vision Transformers

VFormer A PyTorch library for Vision Transformers Getting Started Read the contributing guidelines in CONTRIBUTING.rst to learn how to start contribut

Society for Artificial Intelligence and Deep Learning 142 Nov 28, 2022
A Python Package For System Identification Using NARMAX Models

SysIdentPy is a Python module for System Identification using NARMAX models built on top of numpy and is distributed under the 3-Clause BSD license. N

Wilson Rocha 175 Dec 25, 2022
GraPE is a Rust/Python library for high-performance Graph Processing and Embedding.

GraPE GraPE (Graph Processing and Embedding) is a fast graph processing and embedding library, designed to scale with big graphs and to run on both of

AnacletoLab 194 Dec 29, 2022
A deep learning tabular classification architecture inspired by TabTransformer with integrated gated multilayer perceptron.

The GatedTabTransformer. A deep learning tabular classification architecture inspired by TabTransformer with integrated gated multilayer perceptron. C

Radi Cho 60 Dec 15, 2022
ServiceX Transformer that converts flat ROOT ntuples into columnwise data

ServiceX_Uproot_Transformer ServiceX Transformer that converts flat ROOT ntuples into columnwise data Usage You can invoke the transformer from the co

Vis 0 Jan 20, 2022
Deep Unsupervised 3D SfM Face Reconstruction Based on Massive Landmark Bundle Adjustment.

(ACMMM 2021 Oral) SfM Face Reconstruction Based on Massive Landmark Bundle Adjustment This repository shows two tasks: Face landmark detection and Fac

BoomStar 51 Dec 13, 2022
Implements VQGAN+CLIP for image and video generation, and style transfers, based on text and image prompts. Emphasis on ease-of-use, documentation, and smooth video creation.

VQGAN-CLIP-GENERATOR Overview This is a package (with available notebook) for running VQGAN+CLIP locally, with a focus on ease of use, good documentat

Ryan Hamilton 98 Dec 30, 2022
WebUAV-3M: A Benchmark Unveiling the Power of Million-Scale Deep UAV Tracking

WebUAV-3M: A Benchmark Unveiling the Power of Million-Scale Deep UAV Tracking [Paper Link] Abstract In this work, we contribute a new million-scale Un

25 Jan 01, 2023
Code for the paper "SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness" (NeurIPS 2021)

SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness (NeurIPS2021) This repository contains code for the paper "Smo

Jongheon Jeong 17 Dec 27, 2022
Deep metric learning methods implemented in Chainer

Deep Metric Learning Implementation of several methods for deep metric learning in Chainer v4.2.0. Proxy-NCA: No Fuss Distance Metric Learning using P

ronekko 156 Nov 28, 2022
Code of Puregaze: Purifying gaze feature for generalizable gaze estimation, AAAI 2022.

PureGaze: Purifying Gaze Feature for Generalizable Gaze Estimation Description Our work is accpeted by AAAI 2022. Picture: We propose a domain-general

39 Dec 05, 2022
[NeurIPS-2021] Slow Learning and Fast Inference: Efficient Graph Similarity Computation via Knowledge Distillation

Efficient Graph Similarity Computation - (EGSC) This repo contains the source code and dataset for our paper: Slow Learning and Fast Inference: Effici

23 Nov 11, 2022