A curated list of awesome resources combining Transformers with Neural Architecture Search

Overview

Awesome Transformer Architecture Search

To keep track of the large number of recent papers that look at the intersection of Transformers and Neural Architecture Search (NAS), we have created this awesome list of curated papers and resources, inspired by awesome-autodl, awesome-architecture-search, and awesome-computer-vision. Papers are divided into the following categories:

  1. General Transformer search
  2. Domain-specific, applied Transformer search (divided into Vision, NLP, ASR)
  3. Insights on Transformer components or searchable parameters
  4. Transformer Surveys

This repository is maintained by the AutoML Group Freiburg. Please feel free to open a pull request or an issue to add papers.

General Transformer Search

| Title | Venue | Group |
|---|---|---|
| UniNet: Unified Architecture Search with Convolutions, Transformer and MLP | arxiv [Oct'21] | SenseTime |
| Analyzing and Mitigating Interference in Neural Architecture Search | arxiv [Aug'21] | Tsinghua, MSR |
| BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search | ICCV'21 | Sun Yat-sen University |
| Memory-Efficient Differentiable Transformer Architecture Search | ACL-IJCNLP'21 | MSR, Peking University |
| Finding Fast Transformers: One-Shot Neural Architecture Search by Component Composition | arxiv [Aug'20] | Google Research |
| AutoTrans: Automating Transformer Design via Reinforced Architecture Search | arxiv [Sep'20] | Fudan University |
| NAT: Neural Architecture Transformer for Accurate and Compact Architectures | NeurIPS'19 | Tencent AI |
| The Evolved Transformer | ICML'19 | Google Brain |

Domain Specific Transformer Search

Vision

| Title | Venue | Group |
|---|---|---|
| AutoFormer: Searching Transformers for Visual Recognition | ICCV'21 | MSR |
| GLiT: Neural Architecture Search for Global and Local Image Transformer | ICCV'21 | University of Sydney |
| Searching for Efficient Multi-Stage Vision Transformers | ICCV'21 workshop | MIT |
| HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers | CVPR'21 | Bytedance Inc. |
| Vision Transformer Architecture Search | arxiv [June'21] | SenseTime, Tsinghua University |

Natural Language Processing

| Title | Venue | Group |
|---|---|---|
| AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models | ACL'21 | Huawei Noah’s Ark Lab |
| NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search | KDD'21 | MSR, Tsinghua University |
| AutoBERT-Zero: Evolving the BERT backbone from scratch | arxiv [July'21] | Huawei Noah’s Ark Lab |
| HAT: Hardware-Aware Transformers for Efficient Natural Language Processing | ACL'20 | MIT |

Automatic Speech Recognition

| Title | Venue | Group |
|---|---|---|
| LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search | ICASSP'21 | MSR |
| Darts-Conformer: Towards Efficient Gradient-Based Neural Architecture Search For End-to-End ASR | arxiv [Aug'21] | NPU, Xi'an |
| Improved Conformer-based End-to-End Speech Recognition Using Neural Architecture Search | arxiv [April'21] | Chinese Academy of Sciences |
| Evolved Speech-Transformer: Applying Neural Architecture Search to End-to-End Automatic Speech Recognition | INTERSPEECH'20 | VUNO Inc. |

Insights on Transformer components and interesting papers

| Title | Venue | Group |
|---|---|---|
| Patches Are All You Need? | ICLR'22 under review | - |
| Swin Transformer: Hierarchical Vision Transformer using Shifted Windows | ICCV'21 best paper | MSR |
| Rethinking Spatial Dimensions of Vision Transformers | ICCV'21 | NAVER AI |
| What makes for hierarchical vision transformers | arxiv [Sept'21] | HUST |
| AutoAttend: Automated Attention Representation Search | ICML'21 | Tsinghua University |
| Rethinking Attention with Performers | ICLR'21 Oral | Google |
| LambdaNetworks: Modeling long-range Interactions without Attention | ICLR'21 | Google Research |
| HyperGrid Transformers | ICLR'21 | Google Research |
| LocalViT: Bringing Locality to Vision Transformers | arxiv [April'21] | ETH Zurich |
| NASABN: A Neural Architecture Search Framework for Attention-Based Networks | IJCNN'20 | Chinese Academy of Sciences |
| Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned | ACL'19 | Yandex |

Transformer Surveys

| Title | Venue | Group |
|---|---|---|
| Transformers in Vision: A Survey | arxiv [Oct'21] | MBZ University of AI |
| Efficient Transformers: A Survey | arxiv [Sept'21] | Google Research |

Misc resources
