A tool that automatically creates fuzzing harnesses based on a library

Overview

AutoHarness

Created by Akshat Parikh

What is this tool?

AutoHarness is a tool that automatically generates fuzzing harnesses for you. This idea stems from a concurrent problem in fuzzing codebases today: large codebases have thousands of functions and pieces of code that can be embedded fairly deep into the library. It is very hard or sometimes even impossible for smart fuzzers to reach that codepath. Even for large fuzzing projects such as oss-fuzz, there are still parts of the codebase that are not covered in fuzzing. Hence, this program tries to alleviate this problem in some capacity as well as provide a tool that security researchers can use to initially test a code base. This program only supports code bases which are coded in C and C++.

Setup/Demonstration

This program utilizes llvm and clang for libfuzzer, Codeql for finding functions, and python for the general program. This program was tested on Ubuntu 20.04 with llvm 12 and python 3. Here is the initial setup.

sudo apt-get update;
sudo apt-get install python3 python3-pip llvm-12* clang-12 git;
pip3 install pandas lief subprocess os argparse ast;

Follow the installation procedure for Codeql on https://github.com/github/codeql. Make sure to install the CLI tools and the libraries. For my testing, I have stored both the tools and libraries under one folder. Finally, clone this repository or download a release. Here is the program's output after running on nginx with the multiple argument mode set. This is the command I used.

python3 harness.py -L /home/akshat/nginx-1.21.0/objs/ -C /home/akshat/codeql-h/ -M 1 -O /home/akshat/autoharness/ -D nginx -G 1 -Y 1 -F "-I /home/akshat/nginx-1.21.0/objs -I /home/akshat/nginx-1.21.0/src/core -I /home/akshat/nginx-1.21.0/src/event -I /home/akshat/nginx-1.21.0/src/http -I /home/akshat/nginx-1.21.0/src/mail -I /home/akshat/nginx-1.21.0/src/misc -I /home/akshat/nginx-1.21.0/src/os -I /home/akshat/nginx-1.21.0/src/stream -I /home/akshat/nginx-1.21.0/src/os/unix" -X ngx_config.h,ngx_core.h

Results: image It is definitely possible to raise the success by further debugging the compilation and adding more header files and more. Note the nginx project does not have any shared objects after compiling. However, this program does have a feature that can convert PIE executables into shared libraries.

Planned Features (in order of progress)

  1. Struct Fuzzing

The current way implemented in the program to fuzz functions with multiple arguments is by using fuzzing data provider. There are some improvements to make in this integration; however, I believe I can incorporate this feature with data structures. A problem which I come across when coding this is with codeql and nested structs. It becomes especially hard without writing multiple queries which vary for every function. In short, this feature needs more work. I was also thinking about a simple solution using protobufs.

  1. Implementation Based Harness Creation

Using codeql, it is possible to use to generate a control flow graph that maps how the parameters in a function are initialized. Using that information, we can create a better harness. Another way is to look for implementations for the function that exist in the library and use that information to make an educated guess on an implementation of the function as a harness. The problems I currently have with this are generating the control flow graphs with codeql.

  1. Parallelized fuzzing/False Positive Detection

I can create a simple program that runs all the harnesses and picks up on any of the common false positives using ASAN. Also, I can create a new interface that runs all the harnesses at once and displays their statistics.

Contribution/Bugs

If you find any bugs with this program, please create an issue. I will try to come up with a fix. Also, if you have any ideas on any new features or how to implement performance upgrades or the current planned features, please create a pull request or an issue with the tag (contribution).

PSA

This tool generates some false positives. Please first analyze the crashes and see if it is valid bug or if it is just an implementation bug. Also, you can enable the debug mode if some functions are not compiling. This will help you understand if there are some header files that you are missing or any linkage issues. If the project you are working on does not have shared libraries but an executable, make sure to compile the executable in PIE form so that this program can convert it into a shared library.

References

  1. https://lief.quarkslab.com/doc/latest/tutorials/08_elf_bin2lib.html
Comments
  • Fuzzers never compile

    Fuzzers never compile

    Hello, I made it to the final step in your README:

    python3 harness.py -L /root/nginx/ -C /root/codeql/ -M 1 -O /root/autoharness/ -D nginx -G 1 -Y 1 -F "-I /root/nginx/objs -I /root/nginx/src/core -I /root/nginx/src/event -I /root/nginx/src/http -I /root/nginx/src/mail -I /root/nginx/src/misc -I /root/nginx/src/os -I /root/nginx/src/stream -I /root/nginx/src/os/unix" -X ngx_config.h,ngx_core.h

    and it appears to be doing something:

    Compiling query plan for /root/codeql/multiargfunc.ql.
    [1/1 comp 19.4s] Compiled /root/codeql/multiargfunc.ql.
    Starting evaluation of codeql-cpp/multiargfunc.ql.
    [1/1 eval 4.9s] Evaluation done; writing results to /root/autoharness/multiarg.bqrs.
    Shutting down query evaluator.
    

    however, it never builds any fuzzers, this is all that appears on screen:

                      f             g                                                  t
    0             _Exit          void                                              [int]
    1        __asprintf           int    [const char *__restrict__, char **__restrict__]
    2    __asprintf_chk           int  [int, const char *__restrict__, char **__restr...
    3        __bswap_16    __uint16_t                                       [__uint16_t]
    4        __bswap_32    __uint32_t                                       [__uint32_t]
    ..              ...           ...                                                ...
    812         waitpid       __pid_t                              [int, __pid_t, int *]
    813        wcstombs        size_t  [size_t, char *__restrict__, const wchar_t *__...
    814          wctomb           int                                  [char *, wchar_t]
    815           write       ssize_t                        [int, size_t, const void *]
    816          zError  const char *                                              [int]
    
    [817 rows x 3 columns]****
    

    Am using Ubuntu clang version 12.0.0-3ubuntu1~21.04.2 and have installed all of the requirements as mentioned in the README. Any tips would be much appreciated.

    opened by geeknik 11
  • Could not resolve type ...

    Could not resolve type ...

    Hi, The following command failed to generate harness.

    python3 harness.py -L /local/codeql-home/nginx-1.21.1/objs/ -C /local/codeql-home/codeql/ -M 1 -O /local/nginx/autoharness/ -D nginx -G 1 -Y 1 -F "-I /local/codeql-home/nginx-1.21.1/objs -I /local/codeql-home/nginx-1.21.1/src/core -I /local/codeql-home/nginx-1.21.1/src/event -I /local/codeql-home/nginx-1.21.1/src/http -I /local/codeql-home/nginx-1.21.1/src/mail -I /local/codeql-home/nginx-1.21.1/src/misc -I /local/codeql-home/nginx-1.21.1/src/os -I /local/codeql-home/nginx-1.21.1/src/stream -I /local/codeql-home/nginx-1.21.1/src/os/unix" -X ngx_config.h,ngx_core.h
    

    Error messages:

    Compiling query plan for /local/codeql-home/codeql/multiargfunc.ql.
    ERROR: Could not resolve module cpp. There should probably be a qlpack.yml file declaring dependencies in /local/codeql-home/codeql or an enclosing directory. (/local/codeql-home/codeql/multiargfunc.ql:1,8-11)
    ERROR: Could not resolve type Type (/local/codeql-home/codeql/multiargfunc.ql:3,1-5)
    ERROR: Could not resolve type Parameter (/local/codeql-home/codeql/multiargfunc.ql:3,30-39)
    ERROR: Could not resolve type PointerType (/local/codeql-home/codeql/multiargfunc.ql:6,40-51)
    ERROR: Could not resolve type Type (/local/codeql-home/codeql/multiargfunc.ql:9,1-5)
    ERROR: Could not resolve type Parameter (/local/codeql-home/codeql/multiargfunc.ql:9,27-36)
    ERROR: Could not resolve type PointerType (/local/codeql-home/codeql/multiargfunc.ql:10,65-76)
    ERROR: Could not resolve type Function (/local/codeql-home/codeql/multiargfunc.ql:13,6-14)
    ERROR: Could not resolve type Type (/local/codeql-home/codeql/multiargfunc.ql:13,18-22)
    ERROR: Could not resolve type Parameter (/local/codeql-home/codeql/multiargfunc.ql:14,18-27)
    ERROR: Could not resolve type Struct (/local/codeql-home/codeql/multiargfunc.ql:14,91-97)
    ERROR: 'result' cannot be used in this context (/local/codeql-home/codeql/multiargfunc.ql:4,3-9)
    ERROR: 'result' cannot be used in this context (/local/codeql-home/codeql/multiargfunc.ql:6,3-9)
    ERROR: 'result' cannot be used in this context (/local/codeql-home/codeql/multiargfunc.ql:10,3-9)
    ERROR: 'result' cannot be used in this context (/local/codeql-home/codeql/multiargfunc.ql:10,47-53)
    
    opened by JerryWang304 10
  • Master

    Master

    Lots of changes, some that I can remember:

    • Factoring out code to create bash commands into command_builder.py.
    • Change from readelf to nm to only harness dynamic, defined, exported function symbols.
    • Ignore functions with void * or array arguments. Arrays may be supported later.
    • Get parameters in the right order.
    • Handle const.

    Note: only tested using mode=1 and detect=1 on libsodium, with other settings you'll likely still get lots of errors but let's fix those by creating and resolving issues.

    opened by Jelle-Nauta 0
  • Support array parameters

    Support array parameters

    Functions with array parameters are currently ignored because they would lead to code like:

    auto data1= provider.ConsumeIntegral<char[16]>();
    

    Proposal: handle array-parameters as a special case, e.g. generating code like:

    char data1[16];
    for (size_t idx = 0; idx < 16; ++idx) {
        data1[idx] = provider.ConsumeIntegral<char>();
    }
    

    There may be a more elegant way to do this, but I don't see it at the moment.

    opened by Jelle-Nauta 0
  • Create test suite

    Create test suite

    Currently there are many different combinations of settings (mode, detect) and library-properties (exported or not, defined or not, function or other symbols, etc.) and many of these probably lead to errors, e.g. through uncompilable harness code.

    Proposal: create a minimal set of libraries with a comprehensive set of symbols and properties, and a test workflow to verify that autoharness works - or find bugs that can then be addressed.

    opened by Jelle-Nauta 0
Releases(1.0)
  • 1.0(Jul 10, 2021)

    Initial Release of AutoHarness -added executable to shared object functionality -added automatic header detection or function reconstruction -added automatic fuzzing harness creation for one argument and multiple arguments

    Source code(tar.gz)
    Source code(zip)
MobaXterm-GenKey

MobaXterm-GenKey 你懂的!! 本地启动 需要安装Python3!!!

malaohu 328 Dec 29, 2022
A StarkNet project template based on a Pythonic environment

StarkNet Project Template This is an opinionated StarkNet project template. It is based around the Python's ecosystem and best practices. tox to manag

Francesco Ceccon 5 Apr 21, 2022
Tugas kelompok Struktur Data

Binary-Tree Tugas kelompok Struktur Data Silahkan jika ingin mengubah tipe data pada operasi binary tree *Boleh juga semua program kelompok bisa disat

Usmar manalu 2 Nov 28, 2022
Streamlit apps done following data professor's course on YouTube

streamlit-twelve-apps Streamlit apps done following data professor's course on YouTube Español Curso de apps de data science hecho por Data Professor

Federico Bravin 1 Jan 10, 2022
Gerador de dafaces

🎴 DefaceGenerator Obs: esse script foi criado com a intenção de ajudar pessoas iniciantes no hacking que ainda não conseguem criar suas próprias defa

LordShinigami 3 Jan 09, 2022
A python API act as Control Center to control your Clevo Laptop via wmi on windows.

ClevoPyControlCenter A python API act as Control Center to control your Clevo Laptop via wmi on windows. Usage # pip3 install pymi from clevo_wmi impo

3 Sep 19, 2022
:snake: Complete C99 parser in pure Python

pycparser v2.20 Contents 1 Introduction 1.1 What is pycparser? 1.2 What is it good for? 1.3 Which version of C does pycparser support? 1.4 What gramma

Eli Bendersky 2.8k Dec 29, 2022
A Python script to delete movies with a certain tag after a certain amount of days.

radarr_autodelete Simple script, which deletes movies with a specific tag after a certain amount of days Pip Packages pip3 install pyarr python-dotenv

7 Dec 06, 2022
A Python script to convert your favorite TV series into an Anki deck.

Ankiniser A Python3.8 script to convert your favorite TV series into an Anki deck. How to install? Download the script with git or download it manualy

37 Nov 03, 2022
Architecture example simulator

SCADA architecture Example of a SCADA-like console application, used to serve as a minimal example of a standard architecture of an IIoT system. Insta

1 Nov 06, 2021
Buildium-to-stessa - Automation to assist in converting Buildium transactions into Stessa format

Buildium Transactions - Stessa Transactions There is currently no third-party i

Austin Comstock 4 Apr 17, 2022
AMTIO aka All My Tools in One

AMTIO AMTIO aka All My Tools In One. I plan to put a bunch of my tools in this one repo since im too lazy to make one big tool. Installation git clone

osintcat 3 Jul 29, 2021
A program that makes all 47 textures of Optifine CTM only using 2 textures

A program that makes all 47 textures of Optifine CTM only using 2 textures

1 Jan 22, 2022
A python module for DeSo

DeSo.py A python package for DeSo. Developed by ItsAditya Run pip install deso to install the module! Examples of How To Use DeSo.py Getting $DeSo pri

ItsAditya 0 Jun 30, 2022
a simple functional programming language compiler written in python

Functional Programming Language A compiler for my small functional language. Written in python with SLY lexer/parser generator library. Requirements p

Ashkan Laei 3 Nov 05, 2021
A curated collection of Amazing Python scripts from Basics to Advance with automation task scripts

📑 Introduction A curated collection of Amazing Python scripts from Basics to Advance with automation task scripts. This is your Personal space to fin

Amitesh kumar mishra 1 Jan 22, 2022
ChieriBot,词云API版,用于统计群友说过的怪话

wordCloud_API 词云API版,用于统计群友说过的怪话,基于wordCloud 消息储存在mysql数据库中.数据表结构见table.sql 为啥要做成API:这玩意太吃性能了,如果和Bot放在同一个服务器,可能会影响到bot的正常运行 你服务器性能够用的话就当我在放屁 依赖包 pip i

chinosk 7 Mar 20, 2022
A refresher for PowerBI Desktop documents

PowerBI_Refresher-NPP Informació Per executar el programa s'ha de tenir instalat el python versio 3 o mes. Requeriments a requirements.txt. El fitxer

Nil Pujol 1 May 02, 2022
Um pequeno painel de consulta grátis.

[PAINEL-DE-CONSULTA 3.8(BETA)] · Confira meu canal do YouTube. Clique aqui! Nota: Próxima Atualização será a última com coisas novas, o resto será par

276 Jan 05, 2023
OWASP Foundation Web Respository

WWWGrep OWASP Foundation Web Respository Author: Mark Deen & Aditi Mohan Introduction WWWGrep is a rapid search “grepping” mechanism that examines HTM

OWASP 34 Jun 15, 2022