ParmeSan: Sanitizer-guided Greybox Fuzzing

Related tags

Deep Learningparmesan
Overview

ParmeSan: Sanitizer-guided Greybox Fuzzing

License

ParmeSan is a sanitizer-guided greybox fuzzer based on Angora.

Published Work

USENIX Security 2020: ParmeSan: Sanitizer-guided Greybox Fuzzing.

The paper can be found here: ParmeSan: Sanitizer-guided Greybox Fuzzing

Building ParmeSan

See the instructions for Angora.

Basically run the following scripts to install the dependencies and build ParmeSan:

build/install_rust.sh
PREFIX=/path/to/install/llvm build/install_llvm.sh
build/install_tools.sh
build/build.sh

ParmeSan also builds a tool bin/llvm-diff-parmesan, which can be used for target acquisition.

Building a target

First build your program into a bitcode file using clang (e.g., base64.bc). Then build your target in the same way, but with your selected sanitizer enabled. To get a single bitcode file for larger projects, the easiest solution is to use gllvm.

# Build the bitcode files for target acquisition
USE_FAST=1 $(pwd)/bin/angora-clang -emit-llvm -o base64.fast.bc -c base64.bc
USE_FAST=1 $(pwd)/bin/angora-clang -fsanitize=address -emit-llvm -o base64.fast.asan.bc -c base64.bc
# Build the actual binaries to be fuzzed
USE_FAST=1 $(pwd)/bin/angora-clang -o base64.fast -c base64.bc
USE_TRACK=1 $(pwd)/bin/angora-clang -o base64.track -c base64.bc

Then acquire the targets using:

bin/llvm-diff-parmesan -json base64.fast.bc base64.fast.asan.bc

This will output a file targets.json, which you provide to ParmeSan with the -c flag.

For example:

$(pwd)/bin/fuzzer -c ./targets.json -i in -o out -t ./base64.track -- ./base64.fast -d @@

Options

ParmeSan's SanOpt option can speed up the fuzzing process by dynamically switching over to a sanitized binary only once the fuzzer reaches one of the targets specified in the targets.json file.

Enable using the -s [SANITIZED_BIN] option.

Build the sanitized binary in the following way:

USE_FAST=1 $(pwd)/bin/angora-clang -fsanitize=address -o base64.asan.fast -c base64.bc

Targets input file

The targets input file consisit of a JSON file with the following format:

{
  "targets":  [1,2,3,4],
  "edges":   [[1,2], [2,3]],
  "callsite_dominators": {"1": [3,4,5]}
}

Where the targets denote the identify of the cmp instruction to target (i.e., the id assigned by the __angora_trace_cmp() calls) and edges is the overlay graph of cmp ids (i.e., which cmps are connected to each other). The edges filed can be empty, since ParmeSan will add newly discovered edges automatically, but note that the performance will be better if you provide the static CFG.

It is also possible to run ParmeSan in pure directed mode (-D option), meaning that it will only consider new seeds if the seed triggers coverage that is on a direct path to one of the specified targets. Note that this requires a somewhat complete static CFG to work (an incomplete CFG might contain no paths to the targets at all, which would mean that no new coverage will be considered at all).

ParmeSan Screenshot

How to get started

Have a look at BUILD_TARGET.md for a step-by-step tutorial on how to get started fuzzing with ParmeSan.

FAQ

  • Q: I get a warning like ==1561377==WARNING: DataFlowSanitizer: call to uninstrumented function gettext when running the (track) instrumented program.
  • A: In many cases you can ignore this, but it will lose the taint (meaning worse performance). You need to add the function to the abilist (e.g., llvm_mode/dfsan_rt/dfsan/done_abilist.txt) and add a custom DFSan wrapper (in llvm_mode/dfsan_rt/dfsan/dfsan_custom.cc). See the Angora documentation for more info.
  • Q: I get an compiler error when building the track binary.
  • A: ParmeSan/ Angora uses DFSan for dynamic data-flow analysis. In certain cases building target applications can be a bit tricky (especially in the case of C++ targets). Make sure to disable as much inline assembly as possible and make sure that you link the correct libraries/ llvm libc++. Some programs also do weird stuff like an indirect call to a vararg function. This is not supported by DFSan at the moment, so the easy solution is to patch out these calls, or do something like indirect call promotion.
  • Q: llvm-diff-parmesan generates too many targets!
  • A: You can do target pruning using the scripts in tools/ (in particular tools/prune.py) or use ASAP to generate a target bitcode file with fewer sanitizer targets.

Docker image

You can also get the pre-built docker image of ParmeSan.

docker pull vusec/parmesan
docker run --rm -it vusec/parmesan
# In the container you can build objdump
/parmesan/misc/build_objdump.sh
Owner
VUSec
VUSec
This is the repository of our article published on MDPI Entropy "Feature Selection for Recommender Systems with Quantum Computing".

Collaborative-driven Quantum Feature Selection This repository was developed by Riccardo Nembrini, PhD student at Politecnico di Milano. See the websi

Quantum Computing Lab @ Politecnico di Milano 10 Apr 21, 2022
TinyML Cookbook, published by Packt

TinyML Cookbook This is the code repository for TinyML Cookbook, published by Packt. Author: Gian Marco Iodice Publisher: Packt About the book This bo

Packt 93 Dec 29, 2022
A toolkit for controlling Euro Truck Simulator 2 with python to develop self-driving algorithms.

europilot Overview Europilot is an open source project that leverages the popular Euro Truck Simulator(ETS2) to develop self-driving algorithms. A con

1.4k Jan 04, 2023
Fine-grained Post-training for Improving Retrieval-based Dialogue Systems - NAACL 2021

Fine-grained Post-training for Multi-turn Response Selection Implements the model described in the following paper Fine-grained Post-training for Impr

Janghoon Han 83 Dec 20, 2022
시각 장애인을 위한 스마트 지팡이에 활용될 딥러닝 모델 (DL Model Repo)

SmartCane-DL-Model Smart Cane using semantic segmentation 참고한 Github repositoy 🔗 https://github.com/JunHyeok96/Road-Segmentation.git 데이터셋 🔗 https://

반드시 졸업한다 (Team Just Graduate) 4 Dec 03, 2021
Implementation of ETSformer, state of the art time-series Transformer, in Pytorch

ETSformer - Pytorch Implementation of ETSformer, state of the art time-series Transformer, in Pytorch Install $ pip install etsformer-pytorch Usage im

Phil Wang 121 Dec 30, 2022
这个开源项目主要是对经典的时间序列预测算法论文进行复现,模型主要参考自GluonTS,框架主要参考自Informer

Time Series Research with Torch 这个开源项目主要是对经典的时间序列预测算法论文进行复现,模型主要参考自GluonTS,框架主要参考自Informer。 建立原因 相较于mxnet和TF,Torch框架中的神经网络层需要提前指定输入维度: # 建立线性层 TensorF

Chi Zhang 85 Dec 29, 2022
An example project demonstrating how the Autonomous Learning Library can be used to build new reinforcement learning agents.

About This repository shows how Autonomous Learning Library can be used to build new reinforcement learning agents. In particular, it contains a model

Chris Nota 5 Aug 30, 2022
Bare bones use-case for deploying a containerized web app (built in streamlit) on AWS.

Containerized Streamlit web app This repository is featured in a 3-part series on Deploying web apps with Streamlit, Docker, and AWS. Checkout the blo

Collin Prather 62 Jan 02, 2023
CVPR2021 Workshop - HDRUNet: Single Image HDR Reconstruction with Denoising and Dequantization.

HDRUNet [Paper Link] HDRUNet: Single Image HDR Reconstruction with Denoising and Dequantization By Xiangyu Chen, Yihao Liu, Zhengwen Zhang, Yu Qiao an

XyChen 105 Dec 20, 2022
Pytorch implementation of Learning with Opponent-Learning Awareness

Pytorch implementation of Learning with Opponent-Learning Awareness using DiCE

Alexis David Jacq 82 Sep 15, 2022
Repository for the paper "Exploring the Sensory Spaces of English Perceptual Verbs in Natural Language Data"

Sensory Spaces of English Perceptual Verbs This repository contains the code and collocational data described in the paper "Exploring the Sensory Spac

David Peng 0 Sep 07, 2021
Quantum-enhanced transformer neural network

Example of a Quantum-enhanced transformer neural network Get the code: git clone https://github.com/rdisipio/qtransformer.git cd qtransformer Create

Riccardo Di Sipio 61 Nov 08, 2022
Tooling for converting STAC metadata to ODC data model

手语识别 0、使用到的模型 (1). openpose,作者:CMU-Perceptual-Computing-Lab https://github.com/CMU-Perceptual-Computing-Lab/openpose (2). 图像分类classification,作者:Bubbl

Open Data Cube 65 Dec 20, 2022
FMA: A Dataset For Music Analysis

FMA: A Dataset For Music Analysis Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson. International Society for Music Information

Michaël Defferrard 1.8k Dec 29, 2022
CVPRW 2021: How to calibrate your event camera

E2Calib: How to Calibrate Your Event Camera This repository contains code that implements video reconstruction from event data for calibration as desc

Robotics and Perception Group 104 Nov 16, 2022
Train robotic agents to learn pick and place with deep learning for vision-based manipulation in PyBullet.

Ravens is a collection of simulated tasks in PyBullet for learning vision-based robotic manipulation, with emphasis on pick and place. It features a Gym-like API with 10 tabletop rearrangement tasks,

Google Research 367 Jan 09, 2023
RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving

RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving (AAAI2021). RTS3D is efficiency and accuracy s

71 Nov 29, 2022
Official Implementation of DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation

DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation [Arxiv] [Paper] As acquiring pixel-wise an

Lukas Hoyer 305 Dec 29, 2022
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning

This is a release of our VIMPAC paper to illustrate the implementations. The pretrained checkpoints and scripts will be soon open-sourced in HuggingFace transformers.

Hao Tan 74 Dec 03, 2022