An Unsupervised Graph-based Toolbox for Fraud Detection

Last update: Dec 11, 2022

Introduction: UGFraud is an unsupervised graph-based fraud detection toolbox that integrates several state-of-the-art graph-based fraud detection algorithms. It can be applied to bipartite graphs (e.g., a user-product graph) and can estimate the suspiciousness of both nodes and edges. The implemented models are listed under Implemented Models.

The toolbox incorporates Markov Random Field (MRF)-based, dense-block detection-based, and SVD-based algorithms. The MRF-based algorithms need only the graph structure and the prior suspiciousness scores of the nodes as input. The other algorithms need only the graph structure.

We also have a deep graph-based fraud detection toolbox that implements state-of-the-art graph neural network-based fraud detectors.

We welcome contributions that add new fraud detectors or extend the features of the toolbox. Some of the planned features are listed in the TODO List.

If you use the toolbox in your project, please cite the paper below and the algorithms you used:

@inproceedings{dou2020robust,
  title={Robust Spammer Detection by Nash Reinforcement Learning},
  author={Dou, Yingtong and Ma, Guixiang and Yu, Philip S and Xie, Sihong},
  booktitle={Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
  year={2020}
}

Table of Contents

Installation
User Guide
Implemented Models
Model Comparison
TODO List
How to Contribute

Installation

You can install UGFraud from PyPI:

pip install UGFraud

or download and install it from GitHub:

git clone https://github.com/safe-graph/UGFraud.git
cd UGFraud
python setup.py install
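
After installing, one way to confirm that the package is visible to the current interpreter is to query its distribution metadata. The sketch below uses only the standard library and assumes the distribution is registered under the name UGFraud (the name used in the pip command above); installed_version is a hypothetical helper introduced for illustration, not part of the toolbox.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "UGFraud") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version())
```

A result of None usually means the package was installed into a different environment than the one running the interpreter.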

Dataset

The demo data is incomplete: the rating and date information are missing. The rating information is used only in the ZooBP demo. If you need the complete data to run the demos, please email [email protected] to download it from the Yelp Spam Review Dataset. The metadata.gz file in /UGFraud/Yelp_Data/YelpChi includes:

user_id: user identifier (38,063 users)
product_id: product identifier (201 products)
rating: from 1.0 (low) to 5.0 (high)
label: -1 is not spam, 1 is spam
date: data creation time

User Guide

Running the example code

You can find the implemented models in /UGFraud/Demo directory. For example, you can run fBox using:

python eval_fBox.py

Running on your datasets

See /UGFraud/Demo/data_to_network_graph.py for how to convert your data into a networkx graph.

To use your own data, you must provide at least the following:

a dict of dicts that maps each user ID to its product IDs and their edge attributes:

{
    'user_id': {
        'product_id': {
            'label': 1
        }
    }
}

a dict of priors

You can use the dict_to_networkx(graph_dict) function from /Utils/helper.py to convert your graph_dict into a networkx graph. For more details, please see data_to_network_graph.py.
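
As a rough illustration of the two inputs described above, the sketch below builds a small nested dict, flattens it into edge triples, and checks that every node has a prior score. It uses plain Python rather than the toolbox's own helper. The node IDs, the prior values, and the functions flatten_edges and missing_priors are all hypothetical, and the exact prior format UGFraud expects may differ, so treat data_to_network_graph.py as the authority.

```python
from typing import Dict, List, Tuple

# Nested dict in the shape described above:
# user_id -> product_id -> edge attributes (here only 'label').
graph_dict = {
    "u1": {"p1": {"label": 1}, "p2": {"label": -1}},
    "u2": {"p1": {"label": -1}},
}

# Hypothetical prior suspiciousness scores, one per node (assumed format).
priors = {"u1": 0.8, "u2": 0.2, "p1": 0.5, "p2": 0.5}


def flatten_edges(graph: Dict[str, Dict[str, dict]]) -> List[Tuple[str, str, dict]]:
    """Turn the nested dict into (user, product, attributes) triples."""
    return [
        (user, product, dict(attrs))
        for user, products in graph.items()
        for product, attrs in products.items()
    ]


def missing_priors(graph: Dict[str, Dict[str, dict]], prior_scores: Dict[str, float]) -> List[str]:
    """Return the nodes that appear in the graph but have no prior score."""
    nodes = set(graph)
    for products in graph.values():
        nodes.update(products)
    return sorted(nodes - set(prior_scores))


edges = flatten_edges(graph_dict)
print(len(edges), "edges; nodes without a prior:", missing_priors(graph_dict, priors))
```

Checking for missing priors before running an MRF-based model is a cheap way to catch ID mismatches between the graph and the prior dict.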

The structure of code

The /UGFraud repository is organized as follows:

Demo/ contains the implemented models and the corresponding example code;
Detector/ contains the basic models;
Yelp_Data/ contains the necessary dataset files;
Utils/ contains the helper functions.

Implemented Models

Model	Paper	Venue	Reference
SpEagle	Collective Opinion Spam Detection: Bridging Review Networks and Metadata	KDD 2015	BibTex
GANG	GANG: Detecting Fraudulent Users in Online Social Networks via Guilt-by-Association on Directed Graph	ICDM 2017	BibTex
fBox	Spotting Suspicious Link Behavior with fBox: An Adversarial Perspective	ICDM 2014	BibTex
Fraudar	FRAUDAR: Bounding Graph Fraud in the Face of Camouflage	KDD 2016	BibTex
ZooBP	ZooBP: Belief Propagation for Heterogeneous Networks	VLDB 2017	BibTex
SVD	Singular value decomposition and least squares solutions	-	BibTex
Prior	Evaluating suspiciousness based on prior information	-	-

Model Comparison

Model	Application	Graph Type	Model Type
SpEagle	Review Spam	Tripartite	MRF
GANG	Social Sybil	Bipartite	MRF
fBox	Social Fraudster	Bipartite	SVD
Fraudar	Social Fraudster	Bipartite	Dense-block
ZooBP	E-commerce Fraud	Tripartite	MRF
SVD	Dimension Reduction	Bipartite	SVD
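
To give a feel for the dense-block model type in the table above, here is a minimal sketch of unweighted greedy peeling: remove the lowest-degree node repeatedly and keep the densest node set seen. This is a simplified textbook technique, not Fraudar's camouflage-resistant weighted metric and not the implementation shipped in Detector/. The function name greedy_dense_block and the toy edge list are assumptions, and the sketch assumes user and product IDs never collide.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def greedy_dense_block(edges: Iterable[Tuple[str, str]]) -> Tuple[Set[str], float]:
    """Greedy peeling: drop the lowest-degree node each step and return the
    node set with the highest density (edges / nodes) seen along the way."""
    adj: Dict[str, Set[str]] = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    remaining = set(adj)
    if not remaining:
        return set(), 0.0

    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    best_nodes = set(remaining)
    best_density = n_edges / len(remaining)

    while len(remaining) > 1:
        # Lowest degree first; ties are broken by node name for determinism.
        node = min(remaining, key=lambda n: (len(adj[n]), n))
        n_edges -= len(adj[node])
        for nbr in adj[node]:
            adj[nbr].discard(node)
        remaining.remove(node)

        density = n_edges / len(remaining)
        if density > best_density:
            best_density = density
            best_nodes = set(remaining)

    return best_nodes, best_density


# Toy graph: three users all rate the same two products (a dense block),
# plus two unrelated sparse edges.
toy_edges = [(u, p) for u in ("u1", "u2", "u3") for p in ("p1", "p2")]
toy_edges += [("u4", "p3"), ("u5", "p4")]

block, density = greedy_dense_block(toy_edges)
print(sorted(block), density)
```

On the toy graph the peeling discards the sparse edges first, so the reported block is the group of users that all rate the same products.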

TODO List

Homogeneous graph implementation

How to Contribute

You are welcome to contribute to this open-source toolbox. Currently, you can create issues or send an email to [email protected] with inquiries.

Comments

cannot import name 'Detector' most likely due to a circular import

I performed a simple import as outlined in testing.py:

import sys
import os
__file__ = "~/env/lib/python3.8/site-packages/UGFraud"
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
from UGFraud.Demo.eval_fBox import *

However, this produces the error below:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
~/env/lib/python3.8/site-packages/UGFraud in <module>
      3 __file__ = "~/env/lib/python3.8/site-packages/UGFraud"
      4 sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
----> 5 from UGFraud.Demo.eval_fBox import *

~/miniconda3/lib/python3.8/site-packages/UGFraud/__init__.py in <module>
      1 # -*- coding: utf-8 -*-
      2 
----> 3 from . import Detector
      4 from . import Utils
      5 

ImportError: cannot import name 'Detector' from partially initialized module 'UGFraud' (most likely due to a circular import) (~/miniconda3/lib/python3.8/site-packages/UGFraud/__init__.py)

opened by ragyibrahim

Releases(v0.1.0)

v0.1.0(Jul 31, 2020)
Implement seven algorithms

Unify the input format with json and networkx

Publish the package to pypi

Owner

SafeGraph
