SwinTransformerV2-TensorFlow

A TensorFlow implementation of SwinTransformerV2 by Microsoft Research Asia, based on their official implementation of SwinTransformerV1 and their paper on V2.

Paper on Version 2 (18/11/2021): [arXiv]

Paper on Version 1 (17/08/2021): [arXiv]

Features:

TensorFlow 2 implementation of versions 1 and 2 of the SwinTransformer, a state-of-the-art backbone for many contemporary computer vision tasks. A brief overview of the architectural changes made in version 2:

A residual post-norm configuration replaces the previous pre-norm configuration, meant to improve training stability in larger models.
A scaled cosine attention replaces the dot-product attention of V1, with a learnable scalar.
A continuous log-spaced relative position bias is used instead of the previous parametric table approach. This is implemented here as a small MLP network applied to log-transformed relative coordinates.
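
The attention and position-bias changes listed above can be illustrated with a minimal NumPy sketch. It is not the code in this repository: the function names, the `logit_scale` parameter, and the clamp at 100 are assumptions chosen for illustration. The coordinate transform follows the sign(x) * log(1 + |x|) form given in the V2 paper, and the MLP that would map these coordinates to a per-head bias is omitted.

```python
import numpy as np


def scaled_cosine_attention(q, k, v, logit_scale, bias=None):
    """Scaled cosine attention for a single head (illustrative only).

    q, k, v: arrays of shape (tokens, dim).
    logit_scale: log of the learnable scalar (hypothetical parameter name).
    bias: optional (tokens, tokens) relative position bias.
    """
    # L2-normalise rows so that q @ k.T holds pairwise cosine similarities.
    q = q / (np.linalg.norm(q, axis=-1, keepdims=True) + 1e-12)
    k = k / (np.linalg.norm(k, axis=-1, keepdims=True) + 1e-12)
    # Assumed clamp: keep the multiplier at or below 100.
    scale = np.exp(min(logit_scale, np.log(100.0)))
    logits = scale * (q @ k.T)
    if bias is not None:
        logits = logits + bias
    # Numerically stable softmax over the key axis.
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


def log_spaced_coords(window_size):
    """Log-transformed relative coordinates for a square window.

    Returns an array of shape (2*window_size - 1, 2*window_size - 1, 2)
    holding the transformed (dy, dx) offset for every relative position.
    """
    offsets = np.arange(-(window_size - 1), window_size, dtype=np.float64)
    dy, dx = np.meshgrid(offsets, offsets, indexing="ij")
    coords = np.stack([dy, dx], axis=-1)
    # sign(x) * log(1 + |x|): compresses large offsets, keeps 0 at 0.
    return np.sign(coords) * np.log1p(np.abs(coords))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, k, v = (rng.normal(size=(4, 8)) for _ in range(3))
    out, weights = scaled_cosine_attention(q, k, v, logit_scale=np.log(10.0))
    print(out.shape, weights.sum(axis=-1))
    print(log_spaced_coords(3).shape)
```

Because the similarities are cosines, the logits are bounded by the scalar regardless of the magnitude of `q` and `k`. The log transform shrinks the range of coordinate values the bias network sees when the window size changes.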

Requirements:

numpy==1.21.4
tensorflow==2.7.0
tensorflow_addons==0.15.0

Getting started

Currently writing up.

License

This project is licensed under the MIT license.

Citation

@article{liu2021Swin,
  title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
  author={Liu, Ze and Lin, Yutong and Cao, Yue and Hu, Han and Wei, Yixuan and Zhang, Zheng and Lin, Stephen and Guo, Baining},
  journal={arXiv preprint arXiv:2103.14030},
  year={2021}
}

Owner

Phan Nguyen
