Tzer: TVM Implementation of "Coverage-Guided Tensor Compiler Fuzzing with Joint IR-Pass Mutation" (OOPSLA'22).

Coverage-Guided Tensor Compiler Fuzzing with Joint IR-Pass Mutation

This is the source code repo for "Coverage-Guided Tensor Compiler Fuzzing with Joint IR-Pass Mutation" (Conditionally accepted by OOPSLA'22).

Artifact

Please check here for detailed documentation of the artifact prepared for OOPSLA'22.

Reproduce Bugs

At submission time, Tzer had detected 40 bugs in TVM, of which 30 were confirmed and 24 fixed (merged into the latest branch). Due to the anonymous review policy of OOPSLA, links to the actual bug reports will be provided after the review process.

We aim for strong reproducibility. To reproduce all bugs, a single click on Open In Colab in your browser is all you need. Because some bugs are triggered only under complex GPU settings, we have collected them in a Google Colab environment to minimize the hardware and software effort (no GPU required, just a browser).

Quick Start

You can easily start using Tzer with docker.

docker run --rm -it tzerbot/oopsla

# Inside the image.
cd tzer
python3 src/main_tir.py --fuzz-time 10     --report-folder ten-minute-fuzz
#                       run for 10 min.    bugs in folder `ten-minute-fuzz`

Successful installation looks like:

Report folder contents
  • cov_by_time.txt: a CSV file whose columns are the time (in seconds) and the edge coverage;
  • ${BUG_TYPE}_${BUG_ID}.error_message.txt: error message snapshot of failures;
  • ${BUG_TYPE}_${BUG_ID}.ctx: context data to reproduce bugs (stored as a pickle; see config.py);
  • meta.txt: metadata including the git version of TVM and the experiment time;
  • tir_by_time.pickle: generated <F, P> (i.e., TIR and Passes) files (if TIR_REC=1 is set);
  • valid_seed_new_cov_count.txt: number of generated valid tests with new coverage.
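
As a rough illustration of how cov_by_time.txt could be post-processed, the sketch below parses two-column CSV text into (time, coverage) pairs. It assumes each row is `time_in_seconds,edge_coverage` with no other columns; the helper names `read_cov_by_time` and `final_coverage` are hypothetical and not part of Tzer, and the sample numbers are made up.

```python
import csv
import io


def read_cov_by_time(text):
    """Parse rows of `time_in_seconds,edge_coverage` into (float, int) pairs."""
    points = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        try:
            points.append((float(row[0]), int(float(row[1]))))
        except ValueError:
            continue  # skip a header line, if the file happens to have one
    return points


def final_coverage(points):
    """Return the coverage of the last sample, or 0 if there are no samples."""
    return points[-1][1] if points else 0


# Made-up sample data, not real Tzer output.
sample = "0.5,120\n1.7,180\n3.2,181\n"
points = read_cov_by_time(sample)
print(points, final_coverage(points))
```

For a real run, the text would come from something like `open("ten-minute-fuzz/cov_by_time.txt").read()`.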
Main command-line options

Command-line options (appended to the command):

  • --fuzz-time: Time budget of fuzzing (in minutes);
  • --tolerance: Parameter $N_{max}$ in the paper (controls the interleaving of IR and pass mutation);
  • --report-folder: Path to store results (e.g., coverage trend);

Environment variables that control algorithm options (prepended to the command):

  • PASS=1 to enable pass mutation;
  • NO_SEEDS=1 to disable initial seeds (start from an empty function);
  • NO_COV=1 to disable the coverage feedback;
  • TIR_REC=1 to record generated TIR files (for evaluating the non-coverage version);
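
When runs are launched from a script rather than typed by hand, the options and environment variables above can be combined programmatically. The following is a minimal sketch, assuming it is executed from the tzer directory; `build_fuzz_command` is a hypothetical helper, not part of Tzer, and it uses only the flags and variables listed above.

```python
import os


def build_fuzz_command(fuzz_minutes, report_folder, tolerance=None, **switches):
    """Return (argv, env) for one run of src/main_tir.py.

    `switches` maps environment variable names such as PASS or NO_COV
    to booleans; enabled ones are exported as "1".
    """
    argv = [
        "python3", "src/main_tir.py",
        "--fuzz-time", str(fuzz_minutes),
        "--report-folder", report_folder,
    ]
    if tolerance is not None:
        argv += ["--tolerance", str(tolerance)]
    env = dict(os.environ)  # copy, so the caller's environment is untouched
    for name, enabled in switches.items():
        if enabled:
            env[name] = "1"
    return argv, env


argv, env = build_fuzz_command(10, "ten-minute-fuzz", tolerance=4, PASS=True)
print(" ".join(argv))
```

The pair can then be handed to `subprocess.run(argv, env=env, check=True)` to start the fuzzer.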

Reproduce ablation study

# (1): General IR Mutation (No Coverage)*
TVM_HOME=$TVM_NO_COV_HOME PYTHONPATH=$TVM_HOME/python TIR_REC=1 NO_COV=1 python3 src/main_tir.py --fuzz-time 240 --report-folder ablation-1
python3 src/get_cov.py --folders ablation-1 # Evaluate samples on instrumented TVM to get coverage results.

# (2): (1) + Coverage Guidance
python3 src/main_tir.py --fuzz-time 240 --report-folder ablation-2

# (3): (2) + Domain-Specific IR Mutation
LOW=1 python3 src/main_tir.py --fuzz-time 240 --report-folder ablation-3

# (4): (3) + Random Pass Mutation
PASS=1 RANDOM_PASS=1 LOW=1 python3 src/main_tir.py --fuzz-time 240 --report-folder ablation-4

# (5): (3) + Evolutionary IR-Pass Mutation
# aka: Best Tzer! Please use this command if you want to compare Tzer with your own system~
PASS=1 LOW=1 python3 src/main_tir.py --fuzz-time 240 --report-folder ablation-5 --tolerance 4

Note that fuzzing is performance-sensitive: to obtain reliable results, the evaluation should be conducted in a "clean" environment (e.g., close as many irrelevant processes as possible). To determine how "clean" your environment is, you can log the load average of your Linux system. The expected load average is around 1 or lower (as in our experiments).
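
One way to check the load average before starting a long run is Python's standard `os.getloadavg()`, which returns the 1-, 5-, and 15-minute averages on Unix-like systems. The sketch below is only an illustration: the helper name `is_clean` is hypothetical, and the threshold of 1.0 is taken from the guidance above.

```python
import os


def is_clean(load_averages, threshold=1.0):
    """Return True if the 1-minute load average is at or below `threshold`."""
    one_minute = load_averages[0]
    return one_minute <= threshold


loads = os.getloadavg()  # (1 min, 5 min, 15 min); Unix-like systems only
print("load average:", loads)
if not is_clean(loads):
    print("warning: system is busy; coverage results may be unreliable")
```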

Installation

Expected requirements
  • Hardware: 8GB RAM; 256GB storage; x86 CPU; good network connection to GitHub
  • Software: Linux (tested under Manjaro and Ubuntu 20.04; other Linux distributions should also work); Docker (for the Docker installation)

We provide 3 methods for installing Tzer:

Docker Hub (Recommended, Out-of-the-box!)

Run Tzer directly in a pre-built container image! Make sure you have Docker installed.

docker run --rm -it tzerbot/oopsla

Docker Build (10~20 min., for customized development)

Build Tzer in a Docker environment! Make sure you have Docker installed.

  1. git clone https://github.com/Tzer-AnonBot/tzer.git && cd tzer
  2. docker build --tag tzer-oopsla:eval .
  3. docker run --rm -it tzer-oopsla:eval

Manual Build (20~30 min., for customized dev. and native performance)

Build Tzer natively on your Linux system:

Prepare dependencies:

# Arch Linux / Manjaro
sudo pacman -Syy
sudo pacman -S compiler-rt llvm llvm-libs clang cmake git python3
# Ubuntu
sudo apt update
sudo apt install -y libfuzzer-12-dev # If this fails, try "libfuzzer-11-dev", "-10-dev", ...
sudo apt install -y clang cmake git python3

Build TVM and Tzer:

git clone https://github.com/Tzer-AnonBot/tzer.git
cd tzer/tvm_cov_patch

# Build TVM with instrumentation
bash ./build_tvm.sh # If this fails, check the script for step-by-step instructions
cd .. # back to the tzer repository root (the paths below are relative to it)
# On success:
# TVM with coverage is installed under `tvm_cov_patch/tvm`
# TVM without coverage is installed under `tvm_cov_patch/tvm-no-cov`

# Install Python dependencies
python3 -m pip install -r requirements.txt

# Set up TVM_HOME and PYTHONPATH env var before using TVM and Tzer.
export TVM_HOME=$(realpath tvm_cov_patch/tvm)
export TVM_NO_COV_HOME=$(realpath tvm_cov_patch/tvm-no-cov)
export PYTHONPATH=$TVM_HOME/python

Extend Tzer

We implemented many reusable functionalities for future and open research! To implement another coverage-guided fuzzing algorithm for TVM, first install TVM with memcov by applying tvm_cov_patch/memcov4tvm.patch to TVM (see tvm_cov_patch/build_tvm.sh). You can then get the current coverage of TVM by:

from tvm.contrib import coverage

print(coverage.get_now())   # Number of CFG edges visited so far
print(coverage.get_total()) # Total number of CFG edges

coverage.push() # Store the current coverage snapshot on a stack and reset it to empty (useful in multi-process scenarios)
coverage.pop()  # Merge the top snapshot from the stack.

Using the push-pop combo: sometimes the target program crashes, but we don't want the failure to affect the fuzzer. You can therefore set up a "safe guard" as follows:

  1. push: save the current snapshot and reset the coverage hitmap;
  2. spawn a sub-process to compile the target IR & passes with TVM;
  3. pop: merge the snapshot of the sub-process with the last stored snapshot (top of the stack) to get the complete coverage.

The latency of the combo is about 1 ms thanks to bit-level optimization.
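
The three steps above can be sketched as a small wrapper. This is a minimal illustration under stated assumptions, not Tzer's actual implementation: `run_guarded` and `RecordingCoverage` are hypothetical names, the child is launched with `subprocess.run` purely for demonstration, and whether coverage from a separately launched (rather than forked) process is visible to the parent depends on how memcov shares its hitmap, which the text above does not specify. In real use, `cov` would be the `tvm.contrib.coverage` module.

```python
import subprocess
import sys


def run_guarded(cov, argv, timeout=None):
    """Run `argv` in a child process between cov.push() and cov.pop().

    Returns the child's exit code; a non-zero or negative value means
    the child failed or was killed by a signal.
    """
    cov.push()  # step 1: save the current snapshot and reset the hitmap
    try:
        # step 2: do the risky work in a separate process so that a crash
        # there cannot take down the fuzzer itself
        return subprocess.run(argv, timeout=timeout).returncode
    finally:
        cov.pop()  # step 3: merge back, even if the child timed out


class RecordingCoverage:
    """Hypothetical stand-in that only records calls (not the real module)."""

    def __init__(self):
        self.calls = []

    def push(self):
        self.calls.append("push")

    def pop(self):
        self.calls.append("pop")


cov = RecordingCoverage()
# A child that exits with an error stands in for a failing compilation.
code = run_guarded(cov, [sys.executable, "-c", "import sys; sys.exit(3)"])
print(code, cov.calls)
```

Placing `pop` in a `finally` clause keeps the snapshot stack balanced even when the child times out or cannot be started.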

Cite Us

Please cite our paper if you find our contributions helpful. :-)

@inproceedings{tzer-2022,
  title={Coverage-Guided Tensor Compiler Fuzzing with Joint IR-Pass Mutation},
  author={Liu, Jiawei and Wei, Yuxiang and Yang, Sen and Deng, Yinlin and Zhang, Lingming},
  booktitle={Proceedings of the ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications},
  year={2022}
}