An Unsupervised Graph-based Toolbox for Fraud Detection

Overview



Building GitHub Downloads Pypi version

An Unsupervised Graph-based Toolbox for Fraud Detection

Introduction: UGFraud is an unsupervised graph-based fraud detection toolbox that integrates several state-of-the-art graph-based fraud detection algorithms. It can be applied to bipartite graphs (e.g., user-product graph), and it can estimate the suspiciousness of both nodes and edges. The implemented models can be found here.

The toolbox incorporates the Markov Random Field (MRF)-based algorithm, dense-block detection-based algorithm, and SVD-based algorithm. For MRF-based algorithms, the users only need the graph structure and the prior suspicious score of the nodes as the input. For other algorithms, the graph structure is the only input.

Meanwhile, we have a deep graph-based fraud detection toolbox which implements state-of-the-art graph neural network-based fraud detectors.

We welcome contributions on adding new fraud detectors and extending the features of the toolbox. Some of the planned features are listed in TODO list.

If you use the toolbox in your project, please cite the paper below and the algorithms you used :

@inproceedings{dou2020robust,
  title={Robust Spammer Detection by Nash Reinforcement Learning},
  author={Dou, Yingtong and Ma, Guixiang and Yu, Philip S and Xie, Sihong},
  booktitle={Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
  year={2020}
}

Useful Resources

Table of Contents

Installation

You can install UGFraud from pypi:

pip install UGFraud

or download and install from github:

git clone https://github.com/safe-graph/UGFraud.git
cd UGFraud
python setup.py install

Dataset

The demo data is not the intact data (rating and date information are missing). The rating information is only used in ZooBP demo. If you need the intact date to play demo, please email [email protected] to download the intact data from Yelp Spam Review Dataset. The metadata.gz file in /UGFraud/Yelp_Data/YelpChi includes:

  • user_id: 38063 number of users
  • product_id: 201 number of products
  • rating: from 1.0 (low) to 5.0 (high)
  • label: -1 is not spam, 1 is spam
  • date: data creation time

User Guide

Running the example code

You can find the implemented models in /UGFraud/Demo directory. For example, you can run fBox using:

python eval_fBox.py 

Running on your datasets

Have a look at the /UGFraud/Demo/data_to_network_graph.py to convert your data into the networkx graph.

In order to use your own data, you have to provide the following information at least:

  • a dict of dict:
'user_id':{
        'product_id':
                {
                'label': 1
                }
  • a dict of prior

You can use dict_to networkx(graph_dict) function from /Utils/helper.py file to convert your graph_dict into a networkx graph. For more details, please see data_to_network_graph.py.

The structure of code

The /UGFraud repository is organized as follows:

  • Demo/ contains the implemented models and the corresponding example code;
  • Detector/ contains the basic models;
  • Yelp_Data/ contains the necessary dataset files;
  • Utils/ contains the every help functions.

Implemented Models

Model Paper Venue Reference
SpEagle Collective Opinion Spam Detection: Bridging Review Networks and Metadata KDD 2015 BibTex
GANG GANG: Detecting Fraudulent Users in Online Social Networks via Guilt-by-Association on Directed Graph ICDM 2017 BibTex
fBox Spotting Suspicious Link Behavior with fBox: An Adversarial Perspective ICDM 2014 BibTex
Fraudar FRAUDAR: Bounding Graph Fraud in the Face of Camouflage KDD 2016 BibTex
ZooBP ZooBP: Belief Propagation for Heterogeneous Networks VLDB 2017 BibTex
SVD Singular value decomposition and least squares solutions - BibTex
Prior Evaluating suspicioueness based on prior information - -

Model Comparison

Model Application Graph Type Model Type
SpEagle Review Spam Tripartite MRF
GANG Social Sybil Bipartite MRF
fBox Social Fraudster Bipartite SVD
Fraudar Social Fraudster Bipartite Dense-block
ZooBP E-commerce Fraud Tripartite MRF
SVD Dimension Reduction Bipartite SVD

TODO List

  • Homogeneous graph implementation

How to Contribute

You are welcomed to contribute to this open-source toolbox. Currently, you can create issues or send email to [email protected] for inquiry.

You might also like...
OBBDetection: an oriented object detection toolbox modified from MMdetection
OBBDetection: an oriented object detection toolbox modified from MMdetection

OBBDetection note: If you have questions or good suggestions, feel free to propose issues and contact me. introduction OBBDetection is an oriented obj

A Python Library for Graph Outlier Detection (Anomaly Detection)
A Python Library for Graph Outlier Detection (Anomaly Detection)

PyGOD is a Python library for graph outlier detection (anomaly detection). This exciting yet challenging field has many key applications, e.g., detect

This is an open-source toolkit for Heterogeneous Graph Neural Network(OpenHGNN) based on DGL [Deep Graph Library] and PyTorch.

This is an open-source toolkit for Heterogeneous Graph Neural Network(OpenHGNN) based on DGL [Deep Graph Library] and PyTorch.

This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning].

CG3 This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning]. R

A semantic segmentation toolbox based on PyTorch

Introduction vedaseg is an open source semantic segmentation toolbox based on PyTorch. Features Modular Design We decompose the semantic segmentation

mbrl-lib is a toolbox for facilitating development of Model-Based Reinforcement Learning algorithms.
mbrl-lib is a toolbox for facilitating development of Model-Based Reinforcement Learning algorithms.

mbrl-lib is a toolbox for facilitating development of Model-Based Reinforcement Learning algorithms. It provides easily interchangeable modeling and planning components, and a set of utility functions that allow writing model-based RL algorithms with only a few lines of code.

Deep learning toolbox based on PyTorch for hyperspectral data classification.
Deep learning toolbox based on PyTorch for hyperspectral data classification.

Deep learning toolbox based on PyTorch for hyperspectral data classification.

Paddle-Adversarial-Toolbox (PAT) is a Python library for Deep Learning Security based on PaddlePaddle.

Paddle-Adversarial-Toolbox Paddle-Adversarial-Toolbox (PAT) is a Python library for Deep Learning Security based on PaddlePaddle. Model Zoo Common FGS

MMFlow is an open source optical flow toolbox based on PyTorch
MMFlow is an open source optical flow toolbox based on PyTorch

Documentation: https://mmflow.readthedocs.io/ Introduction English | 简体中文 MMFlow is an open source optical flow toolbox based on PyTorch. It is a part

Comments
  •  cannot import name 'Detector' most likely due to a circular import

    cannot import name 'Detector' most likely due to a circular import

    Performing a simple import as outlined in testing.py

    import sys
    import os
    __file__ = "~/env/lib/python3.8/site-packages/UGFraud"
    sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
    from UGFraud.Demo.eval_fBox import *
    

    However, this produces the below error:

    ---------------------------------------------------------------------------
    ImportError                               Traceback (most recent call last)
    ~/env/lib/python3.8/site-packages/UGFraud in <module>
          3 __file__ = "~/env/lib/python3.8/site-packages/UGFraud"
          4 sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
    ----> 5 from UGFraud.Demo.eval_fBox import *
    
    ~/miniconda3/lib/python3.8/site-packages/UGFraud/__init__.py in <module>
          1 # -*- coding: utf-8 -*-
          2 
    ----> 3 from . import Detector
          4 from . import Utils
          5 
    
    ImportError: cannot import name 'Detector' from partially initialized module 'UGFraud' (most likely due to a circular import) (~/miniconda3/lib/python3.8/site-packages/UGFraud/__init__.py)
    
    opened by ragyibrahim 1
Releases(v0.1.0)
Owner
SafeGraph
Towards Secure Machine Learning on Graph Data
SafeGraph
RipsNet: a general architecture for fast and robust estimation of the persistent homology of point clouds

RipsNet: a general architecture for fast and robust estimation of the persistent homology of point clouds This repository contains the code asscoiated

Felix Hensel 14 Dec 12, 2022
Council-GAN - Implementation for our paper Breaking the Cycle - Colleagues are all you need (CVPR 2020)

Council-GAN Implementation of our paper Breaking the Cycle - Colleagues are all you need (CVPR 2020) Paper Ori Nizan , Ayellet Tal, Breaking the Cycle

ori nizan 260 Nov 16, 2022
Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

OFA Sys 1.4k Jan 08, 2023
A TensorFlow implementation of Neural Program Synthesis from Diverse Demonstration Videos

ViZDoom http://vizdoom.cs.put.edu.pl ViZDoom allows developing AI bots that play Doom using only the visual information (the screen buffer). It is pri

Hyeonwoo Noh 1 Aug 19, 2020
Open Source Differentiable Computer Vision Library for PyTorch

Kornia is a differentiable computer vision library for PyTorch. It consists of a set of routines and differentiable modules to solve generic computer

kornia 7.6k Jan 04, 2023
Athena is the only tool that you will ever need to optimize your portfolio.

Athena Portfolio optimization is the process of selecting the best portfolio (asset distribution), out of the set of all portfolios being considered,

Indrajit 1 Mar 25, 2022
Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models (published in ICLR2018)

Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models Pouya Samangouei*, Maya Kabkab*, Rama Chellappa [*: authors co

Maya Kabkab 212 Dec 07, 2022
Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021)

MusCaps: Generating Captions for Music Audio Ilaria Manco1 2, Emmanouil Benetos1, Elio Quinton2, Gyorgy Fazekas1 1 Queen Mary University of London, 2

Ilaria Manco 57 Dec 07, 2022
harmonic-percussive-residual separation algorithm wrapped as a VST3 plugin (iPlug2)

Harmonic-percussive-residual separation plug-in This work is a study on the plausibility of a sines-transients-noise decomposition inspired algorithm

Derp Learning 9 Sep 01, 2022
A short and easy PyTorch implementation of E(n) Equivariant Graph Neural Networks

Simple implementation of Equivariant GNN A short implementation of E(n) Equivariant Graph Neural Networks for HOMO energy prediction. Just 50 lines of

Arsenii Senya Ashukha 97 Dec 23, 2022
Applying curriculum to meta-learning for few shot classification

Curriculum Meta-Learning for Few-shot Classification We propose an adaptation of the curriculum training framework, applicable to state-of-the-art met

Stergiadis Manos 3 Oct 25, 2022
Code for Environment Inference for Invariant Learning (ICML 2020 UDL Workshop Paper)

Environment Inference for Invariant Learning This code accompanies the paper Environment Inference for Invariant Learning, which appears at ICML 2021.

Elliot Creager 40 Dec 09, 2022
Weakly Supervised Posture Mining with Reverse Cross-entropy for Fine-grained Classification

Fine-grainedImageClassification Weakly Supervised Posture Mining with Reverse Cross-entropy for Fine-grained Classification We trained model here: lin

ZhenchaoTang 14 Oct 21, 2022
Deep learning with dynamic computation graphs in TensorFlow

TensorFlow Fold TensorFlow Fold is a library for creating TensorFlow models that consume structured data, where the structure of the computation graph

1.8k Dec 28, 2022
Code for SALT: Stackelberg Adversarial Regularization, EMNLP 2021.

SALT: Stackelberg Adversarial Regularization Code for Adversarial Regularization as Stackelberg Game: An Unrolled Optimization Approach, EMNLP 2021. R

Simiao Zuo 10 Jan 10, 2022
Official repository of the AAAI'2022 paper "Contrast and Generation Make BART a Good Dialogue Emotion Recognizer"

CoG-BART Contrast and Generation Make BART a Good Dialogue Emotion Recognizer Quick Start: To run the model on test sets of four datasets, Download th

39 Dec 24, 2022
meProp: Sparsified Back Propagation for Accelerated Deep Learning

meProp The codes were used for the paper meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting (ICML 2017) [pdf]

LancoPKU 107 Nov 18, 2022
A set of tests for evaluating large-scale algorithms for Wasserstein-2 transport maps computation.

Continuous Wasserstein-2 Benchmark This is the official Python implementation of the NeurIPS 2021 paper Do Neural Optimal Transport Solvers Work? A Co

Alexander 22 Dec 12, 2022
Implementation of Neural Style Transfer in Pytorch

PytorchNeuralStyleTransfer Code to run Neural Style Transfer from our paper Image Style Transfer Using Convolutional Neural Networks. Also includes co

Leon Gatys 396 Dec 01, 2022
Code and datasets for TPAMI 2021

SkeletonNet This repository constains the codes and ShapeNetV1-Surface-Skeleton,ShapNetV1-SkeletalVolume and 2d image datasets ShapeNetRendering. Plea

34 Aug 15, 2022