A PyTorch implementation of "Say No to the Discrimination: Learning Fair Graph Neural Networks with Limited Sensitive Attribute Information" (WSDM 2021)

Last update: Jan 04, 2023

Overview

FairGNN

A PyTorch implementation of "Say No to the Discrimination: Learning Fair Graph Neural Networks with Limited Sensitive Attribute Information" (WSDM 2021). [paper]

Abstract

Graph neural networks (GNNs) have shown great power in modeling graph structured data. However, similar to other machine learning models, GNNs may make predictions biased on protected sensitive attributes, e.g., skin color, gender, and nationality. Because machine learning algorithms including GNNs are trained to faithfully reflect the distribution of the training data which often contains historical bias towards sensitive attributes. In addition, the discrimination in GNNs can be magnified by graph structures and the message-passing mechanism. As a result, the applications of GNNs in sensitive domains such as crime rate prediction would be largely limited. Though extensive studies of fair classification have been conducted on i.i.d data, methods to address the problem of discrimination on non-i.i.d data are rather limited. Furthermore, the practical scenario of sparse annotations in sensitive attributes is rarely considered in existing works. Therefore, we study the novel and important problem of learning fair GNNs with limited sensitive attribute information. FairGNN is proposed to eliminate the bias of GNNs whilst maintaining high node classification accuracy by leveraging graph structures and limited sensitive information. Our theoretical analysis shows that FairGNN can ensure the fairness of GNNs under mild conditions given limited nodes with known sensitive attributes. Extensive experiments on real-world datasets also demonstrate the effectiveness of FairGNN in debiasing and keeping high accuracy.

Requirements

torch==1.2.0
DGL=0.4.3

Run the code

After installation, you can clone this repository

git clone https://github.com/EnyanDai/FariGNN.git
cd FairGNN/src
python train_fairGNN.py \
        --seed=42 \
        --epochs=2000 \
        --model=GCN \
        --sens_number=200 \
        --dataset=pokec_z \
        --num-hidden=128 \
        --acc=0.69 \
        --roc=0.76 \
        --alpha=100 \
        --beta=1

Model Selection

During the training phase, we will select the best epoch based on the performance on the validation set. More speciafically, the selection rules are:

We only care about the epochs that the accuracy and roc socre of the FairGNN on the validation set are higher than the thresholds (defined by --acc and --roc).
We will select the epoch whose summation of parity and equal opportunity is the smallest.

Data Set

Pokec_z and Pokec_n are stored in dataset\pokec as region_job.xxx and region_job_2.xxx, respectively. They are sampled from soc_Pokec.

@inproceedings{takac2012data,
  title={Data analysis in public social networks},
  author={Takac, Lubos and Zabovsky, Michal},
  booktitle={International scientific conference and international workshop present day trends of innovations},
  volume={1},
  number={6},
  year={2012}

NBA is stored in dataset\NBA as nba.xxx It is collected with through the Twitter social network and the players' information on Kaggle

Reproduce the results

All the hyper-parameters settings are included in src\scripts folder.

To reproduce the performance reported in the paper, you can run the bash files in folder src\scripts.

bash scripts/pokec_z/train_fairGCN.sh

Cite

If you find this repo to be useful, please cite our paper. Thank you.

@inproceedings{dai2021say,
  title={Say No to the Discrimination: Learning Fair Graph Neural Networks with Limited Sensitive Attribute Information},
  author={Dai, Enyan and Wang, Suhang},
  booktitle={Proceedings of the 14th ACM International Conference on Web Search and Data Mining},
  pages={680--688},
  year={2021}
}

A PyTorch implementation of "Say No to the Discrimination: Learning Fair Graph Neural Networks with Limited Sensitive Attribute Information" (WSDM 2021)

Related tags

Overview

FairGNN

Abstract

Requirements

Run the code

Model Selection

Data Set

Reproduce the results

Cite

Owner

Incorporating User Micro-behaviors and Item Knowledge 59 60 3 into Multi-task Learning for Session-based Recommendation

Bert4rec for news Recommendation

Code for MB-GMN, SIGIR 2021

RecList is an open source library providing behavioral, "black-box" testing for recommender systems.

RetaGNN: Relational Temporal Attentive Graph Neural Networks for Holistic Sequential Recommendation

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

A tensorflow implementation of the RecoGCN model in a CIKM'19 paper, titled with "Relation-Aware Graph Convolutional Networks for Agent-Initiated Social E-Commerce Recommendation".

Global Context Enhanced Social Recommendation with Hierarchical Graph Neural Networks

Reinforcement Knowledge Graph Reasoning for Explainable Recommendation

Mutual Fund Recommender System. Tailor for fund transactions.

Spark-movie-lens - An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset

Cross Domain Recommendation via Bi-directional Transfer Graph Collaborative Filtering Networks

Beyond Clicks: Modeling Multi-Relational Item Graph for Session-Based Target Behavior Prediction

EXEMPLO DE SISTEMA ESPECIALISTA PARA RECOMENDAR SERIADOS EM PYTHON

The source code for "Global Context Enhanced Graph Neural Network for Session-based Recommendation".

reXmeX is recommender system evaluation metric library.

A library of Recommender Systems

Detecting Beneficial Feature Interactions for Recommender Systems, AAAI 2021

Recommendation System to recommend top books from the dataset

Code for ICML2019 Paper "Compositional Invariance Constraints for Graph Embeddings"