Experiments with differentiable stacks and queues in PyTorch

Related tags

Deep LearningStackNN
Overview

Please use stacknn-core instead!


StackNN

This project implements differentiable stacks and queues in PyTorch. The data structures are implemented in such a way that it should be easy to integrate them into your own models. For example, to construct a differentiable stack and perform a push:

from StackNN.structs import Stack
stack = Stack(BATCH_SIZE, STACK_VECTOR_SIZE)
read_vectors = stack(value_vectors, pop_strengths, push_strengths)

For examples of more complex use cases of this library, refer to the industrial-stacknns repository.

All the code in this repository is associated with the paper Context-Free Transductions with Neural Stacks, which appeared at the Analyzing and Interpreting Neural Networks for NLP workshop at EMNLP 2018. Refer to our paper for more theoretical background on differentiable data structures.

Running a demo

Check example.ipynb for the most up-to-date demo code.

There are several experiment configurations pre-defined in configs.py. To train a model on one of these configs, do:

python run.py CONFIG_NAME

For example, to train a model on the string reversal task:

python run.py final_reverse_config

In addition to the experiment configuration argument, run.py takes several flags:

  • --model: Model type (BufferedModel or VanillaModel)
  • --controller: Controller type (LinearSimpleStructController, LSTMSimpleStructController, etc.)
  • --struct: Struct type (Stack, NullStruct, etc.)
  • --savepath: Path for saving a trained model
  • --loadpath: Path for loading a model

Documentation

You can find auto-generated documentation here.

Contributing

This project is managed by Computational Linguistics at Yale. We welcome contributions from outside in the form of pull requests. Please report any bugs in the GitHub issues tracker. If you are a Yale student interested in joining our lab, please contact Bob Frank.

Citations

If you use this codebase in your research, please cite the associated paper:

@inproceedings{hao-etal-2018-context,
    title = "Context-Free Transductions with Neural Stacks",
    author = "Hao, Yiding  and
      Merrill, William  and
      Angluin, Dana  and
      Frank, Robert  and
      Amsel, Noah  and
      Benz, Andrew  and
      Mendelsohn, Simon",
    booktitle = "Proceedings of the 2018 {EMNLP} Workshop {B}lackbox{NLP}: Analyzing and Interpreting Neural Networks for {NLP}",
    month = nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/W18-5433",
    pages = "306--315",
    abstract = "This paper analyzes the behavior of stack-augmented recurrent neural network (RNN) models. Due to the architectural similarity between stack RNNs and pushdown transducers, we train stack RNN models on a number of tasks, including string reversal, context-free language modelling, and cumulative XOR evaluation. Examining the behavior of our networks, we show that stack-augmented RNNs can discover intuitive stack-based strategies for solving our tasks. However, stack RNNs are more difficult to train than classical architectures such as LSTMs. Rather than employ stack-based strategies, more complex stack-augmented networks often find approximate solutions by using the stack as unstructured memory.",
}

Dependencies

The core implementation of the data structures is stable in Python 2 and 3. The specific tasks that we have implemented require Python 2.7. We use PyTorch version 0.4.1, with the following additional dependencies:

  • numpy
  • scipy (for data processing)
  • matplotlib (for visualization)
  • nltk

Using pip or conda should suffice for installing most of these dependencies. To get the right command for installing PyTorch, refer to the installation widget on the PyTorch website.

Models

A model is a pairing of a controller network with a neural data structure. There are two kinds of models:

  • models.VanillaModel is a simple controller-data structure network. This means there will be one step of computation per input.
  • models.BufferedModel adds input and output buffers to the vanilla model. This allows the network to run for extra computation steps.

To use a model, call model.forward() on every input and model.init_controller() whenever you want to reset the stack between inputs. You can find example training logic in the tasks package.

Data structures

  • structs.Stack implements the differentiable stack data structure.
  • structs.Queue implements the differentiable queue data structure.

The buffered models use read-only and write-only versions of the differentiable queue for their input and output buffers.

Tasks

The Task class defines specific tasks that models can be trained on. Below are some formal language tasks that we have explored using stack models.

String reversal

The ReverseTask trains a feed-forward controller network to do string reversal. The code generates 800 random binary strings which the network must reverse in a sequence-to-sequence fashion:

Input:   1 1 0 1 # # # #
Label:   # # # # 1 0 1 1

By 10 epochs, the model tends to achieve 100% accuracy. The config for this task is called final_reverse_config.

Context-free language modelling

CFGTask can be used to train a context-free language model. Many interesting questions probing linguistic structure can be reduced to special cases of this general task. For example, the task can be used to model a language of balanced parentheses. The configuration for the parentheses task is final_dyck_config.

Evaluation tasks

We also have a class for evaluation tasks. These are tasks where output i can be succintly expressed as some function of inputs 0, .., i. Some applications of this are evaluation of parity and reverse polish boolean formulae.

Real datasets

The data folder contains several real datasets that the stack can be trained on. We should implement a task for reading in these datasets.

Owner
Will Merrill
NLP x linguistics x theory w/ AllenNLP.
Will Merrill
Official source code of paper 'IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo'

IterMVS official source code of paper 'IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo' Introduction IterMVS is a novel lear

Fangjinhua Wang 127 Jan 04, 2023
一个运行在 𝐞𝐥𝐞𝐜𝐕𝟐𝐏 或 𝐪𝐢𝐧𝐠𝐥𝐨𝐧𝐠 等定时面板的签到项目

定时面板上的签到盒 一个运行在 𝐞𝐥𝐞𝐜𝐕𝟐𝐏 或 𝐪𝐢𝐧𝐠𝐥𝐨𝐧𝐠 等定时面板的签到项目 𝐞𝐥𝐞𝐜𝐕𝟐𝐏 𝐪𝐢𝐧𝐠𝐥𝐨𝐧𝐠 特别声明 本仓库发布的脚本及其中涉及的任何解锁和解密分析脚本,仅用于测试和学习研究,禁止用于商业用途,不能保证其合

Leon 1.1k Dec 30, 2022
A Pytorch Implementation of ClariNet

ClariNet A Pytorch Implementation of ClariNet (Mel Spectrogram -- Waveform) Requirements PyTorch 0.4.1 & python 3.6 & Librosa Examples Step 1. Downlo

Sungwon Kim 286 Sep 15, 2022
Plotting points that lie on the intersection of the given curves using gradient descent.

Plotting intersection of curves using gradient descent Webapp Link --- What's the app about Why this app Plotting functions and their intersection. A

Divakar Verma 2 Jan 09, 2022
The FIRST GANs-based omics-to-omics translation framework

OmiTrans Please also have a look at our multi-omics multi-task DL freamwork 👀 : OmiEmbed The FIRST GANs-based omics-to-omics translation framework Xi

Xiaoyu Zhang 6 Dec 14, 2022
Code Impementation for "Mold into a Graph: Efficient Bayesian Optimization over Mixed Spaces"

Code Impementation for "Mold into a Graph: Efficient Bayesian Optimization over Mixed Spaces" This repo contains the implementation of GEBO algorithm.

Jaeyeon Ahn 2 Mar 22, 2022
[NeurIPS'20] Multiscale Deep Equilibrium Models

Multiscale Deep Equilibrium Models 💥 💥 💥 💥 This repo is deprecated and we will soon stop actively maintaining it, as a more up-to-date (and simple

CMU Locus Lab 221 Dec 26, 2022
This repository contains an implementation of the Permutohedral Attention Module in Pytorch

Permutohedral_attention_module This repository contains an implementation of the Permutohedral Attention Module

Samuel JOUTARD 26 Nov 27, 2022
TensorFlow implementation of ENet, trained on the Cityscapes dataset.

segmentation TensorFlow implementation of ENet (https://arxiv.org/pdf/1606.02147.pdf) based on the official Torch implementation (https://github.com/e

Fredrik Gustafsson 248 Dec 16, 2022
This is the official implement of paper "ActionCLIP: A New Paradigm for Action Recognition"

This is an official pytorch implementation of ActionCLIP: A New Paradigm for Video Action Recognition [arXiv] Overview Content Prerequisites Data Prep

268 Jan 09, 2023
Official PyTorch implementation of "Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning" (AAAI 2021)

Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning Official PyTorch implementation of "Proxy Synthesis: Learning with Synthetic

NAVER/LINE Vision 30 Dec 06, 2022
基于Paddle框架的fcanet复现

fcanet-Paddle 基于Paddle框架的fcanet复现 fcanet 本项目基于paddlepaddle框架复现fcanet,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待 参考项目: frazerlin-fcanet 数据准备 本项目已挂

QuanHao Guo 7 Mar 07, 2022
A framework for multi-step probabilistic time-series/demand forecasting models

JointDemandForecasting.py A framework for multi-step probabilistic time-series/demand forecasting models File stucture JointDemandForecasting contains

Stanford Intelligent Systems Laboratory 3 Sep 28, 2022
Pytorch Implementation of "Diagonal Attention and Style-based GAN for Content-Style disentanglement in image generation and translation" (ICCV 2021)

DiagonalGAN Official Pytorch Implementation of "Diagonal Attention and Style-based GAN for Content-Style Disentanglement in Image Generation and Trans

32 Dec 06, 2022
A bare-bones TensorFlow framework for Bayesian deep learning and Gaussian process approximation

Aboleth A bare-bones TensorFlow framework for Bayesian deep learning and Gaussian process approximation [1] with stochastic gradient variational Bayes

Gradient Institute 127 Dec 12, 2022
A novel Engagement Detection with Multi-Task Training (ED-MTT) system

A novel Engagement Detection with Multi-Task Training (ED-MTT) system which minimizes MSE and triplet loss together to determine the engagement level of students in an e-learning environment.

Onur Çopur 12 Nov 11, 2022
A curated list of awesome game datasets, and tools to artificial intelligence in games

🎮 Awesome Game Datasets In computer science, Artificial Intelligence (AI) is intelligence demonstrated by machines. Its definition, AI research as th

Leonardo Mauro 454 Jan 03, 2023
Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd.

Head Detector Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd. The head_detection mod

Ramana Sundararaman 76 Dec 06, 2022
My freqtrade strategies

My freqtrade-strategies Hi there! This is repo for my freqtrade-strategies. My name is Ilya Zelenchuk, I'm a lecturer at the SPbU university (https://

171 Dec 05, 2022
Torch-based tool for quantizing high-dimensional vectors using additive codebooks

Trainable multi-codebook quantization This repository implements a utility for use with PyTorch, and ideally GPUs, for training an efficient quantizer

Daniel Povey 41 Jan 07, 2023