Find target hash collisions for Apple's NeuralHash perceptual hash function.💣

Overview

neural-hash-collider

Find target hash collisions for Apple's NeuralHash perceptual hash function.

For example, starting from a picture of this cat, we can find an adversarial image that has the same hash as the picture of the dog in this post:

python collide.py --image cat.jpg --target 59a34eabe31910abfb06f308

[Images: cat image with NeuralHash 59a34eabe31910abfb06f308; dog image with NeuralHash 59a34eabe31910abfb06f308]

We can confirm the hash collision using nnhash.py from AsuharietYgvar/AppleNeuralHash2ONNX:

$ python nnhash.py dog.png
59a34eabe31910abfb06f308
$ python nnhash.py adv.png
59a34eabe31910abfb06f308

How it works

NeuralHash is a perceptual hash function that uses a neural network. Images are resized to 360x360 and passed through a neural network to produce a 128-dimensional feature vector. Then, the vector is projected onto R^96 using a 128x96 "seed" matrix. Finally, to produce a 96-bit hash, the 96-dimensional vector is thresholded: negative entries turn into a 0 bit, and non-negative entries turn into a 1 bit.
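The last two steps (projection and thresholding) can be sketched in a few lines of plain Python. This is a minimal illustration rather than the code from this repository: the function name hash_from_features is hypothetical, the neural network that produces the feature vector is left out, and the bit order (the first projected entry becomes the most significant bit) is an assumption.

```python
def hash_from_features(features, seed):
    """Project a feature vector with the seed matrix, threshold, return hex.

    features: sequence of d floats (d = 128 for NeuralHash).
    seed: d rows of n floats each (n = 96 for NeuralHash).
    Assumption: the first projected entry is the most significant bit.
    """
    n_bits = len(seed[0])
    # Row vector times matrix: projected[j] = sum_i features[i] * seed[i][j]
    projected = [
        sum(f * row[j] for f, row in zip(features, seed))
        for j in range(n_bits)
    ]
    # Hard threshold: negative -> 0 bit, non-negative -> 1 bit
    bits = "".join("1" if v >= 0 else "0" for v in projected)
    # Zero-pad so that leading 0 bits are not dropped from the hex string
    return format(int(bits, 2), "0{}x".format(n_bits // 4))
```

With a 128-entry feature vector and a 128x96 seed matrix, this returns a 24-character hex string of the same shape as the hashes shown above.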

This entire process, except for the thresholding, is differentiable, so we can use gradient descent to find hash collisions. This is an instance of a well-known property of neural networks: they are vulnerable to adversarial examples.

We can define a loss that captures how close an image is to a given target hash: this loss is basically just the NeuralHash algorithm as described above, but with the final "hard" thresholding step tweaked so that it is "soft" (in particular, differentiable). Exactly how this is done (choices of activation functions, parameters, etc.) can affect convergence, so it can require some experimentation. After choosing the loss function, we can follow the standard method to find adversarial examples for neural networks: gradient descent.
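As a rough sketch of the idea, the toy example below replaces the hard threshold with a softplus penalty and runs plain gradient descent until the thresholded bits match a target. Everything here is an assumption made for illustration: the "network" is a fixed 4x3 linear map W instead of the real model and seed matrix, softplus is only one possible soft relaxation, and the names (soft_loss, find_collision, lr, max_steps) are hypothetical and not taken from collide.py.

```python
import math

# Toy stand-in for "network + seed projection": maps a 4-dimensional
# "image" to 3 pre-threshold values, one per hash bit (an assumption).
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [1.0, 1.0, 1.0]]

def project(x):
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

def hard_bits(z):
    # The non-differentiable step: negative -> 0, non-negative -> 1
    return [1 if v >= 0 else 0 for v in z]

def soft_loss(z, target_bits):
    # softplus(-s * z), with s = +1 for a target 1 bit and -1 for a 0 bit;
    # small when z already has the right sign with some margin.
    total = 0.0
    for v, b in zip(z, target_bits):
        s = 1.0 if b else -1.0
        total += math.log1p(math.exp(-s * v))
    return total

def loss_grad(x, target_bits):
    z = project(x)
    # d/dz softplus(-s * z) = -s * sigmoid(-s * z) = -s / (1 + exp(s * z))
    gz = []
    for v, b in zip(z, target_bits):
        s = 1.0 if b else -1.0
        gz.append(-s / (1.0 + math.exp(s * v)))
    # Chain rule through the linear map: grad_x = W @ gz
    return [sum(W[i][j] * gz[j] for j in range(len(gz))) for i in range(len(x))]

def find_collision(x, target_bits, lr=0.5, max_steps=1000):
    x = list(x)
    for _ in range(max_steps):
        if hard_bits(project(x)) == target_bits:
            break  # the hard hash now matches the target
        g = loss_grad(x, target_bits)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

start = [1.0, 1.0, 1.0, 1.0]           # hashes to bits 1, 1, 1
adv = find_collision(start, [0, 1, 0])  # search for bits 0, 1, 0
```

The real attack differs in scale, not in shape: the gradient flows through the full network to the image pixels instead of through a small matrix.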

Details

The implementation currently does an alternating projections style attack to find an adversarial example that has the intended hash and also looks similar to the original. See collide.py for the full details. The implementation uses two different loss functions: one measures the distance to the target hash, and the other measures the quality of the perturbation (l2 norm + total variation). We first optimize for a collision, focusing only on matching the target hash. Once we find a collision, we alternate between minimizing the perturbation and ensuring that the hash value does not change. The attack has a number of parameters; run python collide.py --help or refer to the code for a full list. Tweaking these parameters can make a big difference in convergence time and the quality of the output.
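To make the perturbation-quality term concrete, here is a small sketch of an l2 norm plus total variation cost for a single-channel perturbation stored as a list of rows. It is an illustration under assumptions, not the definition used in collide.py: the anisotropic (absolute-difference) form of total variation, the way the two terms are combined, and the default tv_weight value are all chosen here for simplicity.

```python
import math

def l2_norm(delta):
    # Overall size of the perturbation
    return math.sqrt(sum(v * v for row in delta for v in row))

def total_variation(delta):
    # Sum of absolute differences between vertically and horizontally
    # adjacent pixels; high for noisy perturbations, low for smooth ones.
    h, w = len(delta), len(delta[0])
    vertical = sum(abs(delta[i + 1][j] - delta[i][j])
                   for i in range(h - 1) for j in range(w))
    horizontal = sum(abs(delta[i][j + 1] - delta[i][j])
                     for i in range(h) for j in range(w - 1))
    return vertical + horizontal

def perturbation_cost(delta, tv_weight=1e-3):
    # Hypothetical weighting; the actual weights are parameters of the attack
    return l2_norm(delta) + tv_weight * total_variation(delta)

smooth = [[1.0, 1.0], [1.0, 1.0]]
noisy = [[1.0, -1.0], [-1.0, 1.0]]
# Both have the same l2 norm, but only the noisy one is penalized by TV.
```

The total variation term is what separates a smooth, low-visibility perturbation from high-frequency noise of the same magnitude.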

The implementation also supports a flag --blur [sigma] that blurs the perturbation on every step of the search. This can slow down or break convergence, but on some examples, it can be helpful for getting results that look more natural and less like glitch art.
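For intuition about what blurring the perturbation does, the sketch below applies a separable Gaussian blur to a single-channel perturbation stored as a list of rows. It is a standalone illustration and an assumption about the general technique, not the blur used by the repository: the kernel radius of about 3 sigma, the clamped edge handling, and all function names are choices made here.

```python
import math

def gaussian_kernel(sigma, radius=None):
    # Normalized 1-D Gaussian weights; the default radius covers about 3 sigma
    if radius is None:
        radius = max(1, int(3 * sigma + 0.5))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(img, kernel):
    # Convolve every row with the kernel, clamping indices at the edges
    r = len(kernel) // 2
    w = len(img[0])
    return [
        [sum(kernel[k + r] * row[min(max(j + k, 0), w - 1)]
             for k in range(-r, r + 1))
         for j in range(w)]
        for row in img
    ]

def transpose(img):
    return [list(col) for col in zip(*img)]

def gaussian_blur(img, sigma):
    # A 2-D Gaussian is separable: blur the rows, then blur the columns
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))

# A single-pixel spike is spread over its neighbours after blurring.
spike = [[0.0] * 9 for _ in range(9)]
spike[4][4] = 1.0
blurred = gaussian_blur(spike, sigma=1.0)
```

Smoothing the perturbation removes high-frequency detail on every step, which is consistent with the trade-off described above: more natural-looking output, at the cost of making the target hash harder to reach.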

Examples

Reproducing the Lena/Barbara result from this post:

The first image above is the original Lena image. The second was produced with --target a426dae78cc63799d01adc32 to collide with Barbara. The third was produced with the additional argument --blur 1.0. The fourth is the original Barbara image. Checking their hashes:

$ python nnhash.py lena.png
32dac883f7b91bbf45a48296
$ python nnhash.py lena-adv.png
a426dae78cc63799d01adc32
$ python nnhash.py lena-adv-blur-1.0.png
a426dae78cc63799d01adc32
$ python nnhash.py barbara.png
a426dae78cc63799d01adc32

Reproducing the Picard/Sidious result from this post:

The first image above is the original Picard image. The second was produced with --target e34b2da852103c3c0828fbd1 --tv-weight 3e-4 to collide with Sidious. The third was produced with the additional argument --blur 0.5. The fourth is the original Sidious image. Checking their hashes:

$ python nnhash.py picard.png
73fae120ad3191075efd5580
$ python nnhash.py picard-adv.png
e34b2da852103c3c0828fbd1
$ python nnhash.py picard-adv-blur-0.5.png
e34b2da852103c3c0828fbd1
$ python nnhash.py sidious.png
e34b2da852103c3c0828fbd1
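When comparing many hashes, it can be convenient to check them programmatically instead of by eye. The helper below is a hypothetical convenience function (it is not part of this repository or of nnhash.py) that counts the number of differing bits between two hex-encoded hashes; a result of 0 means the two images collide.

```python
def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two hex-encoded hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")

# Identical hashes, as in the collisions above, are at distance 0.
same = hamming_distance("a426dae78cc63799d01adc32", "a426dae78cc63799d01adc32")
```

A small nonzero distance during a search indicates that only a few bits of the hash remain to be flipped.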

Prerequisites

  • Get Apple's NeuralHash model following the instructions in AsuharietYgvar/AppleNeuralHash2ONNX and either put all the files in this directory or supply the --model / --seed arguments
  • Install Python dependencies: pip install -r requirements.txt

Usage

Run python collide.py --image [path to image] --target [target hash] to generate a hash collision. Run python collide.py --help to see all the options, including tunable parameters such as the learning rate.

Limitations

The code in this repository is intended to be a demonstration, and perhaps a starting point for other exploration. Tweaking the implementation (choice of loss function, choice of parameters, etc.) might produce much better results than this code currently achieves.

Owner
Anish Athalye
grad student @mit-pdos