The implementation of FOLD-R++ algorithm

Overview

FOLD-R-PP

The implementation of FOLD-R++ algorithm. The target of FOLD-R++ algorithm is to learn an answer set program for a classification task.

Installation

Prerequisites

FOLD-R++ is developed with only python3. Numpy is the only dependency:

python3 -m pip install numpy

Instruction

Data preparation

The FOLD-R++ algorithm takes tabular data as input, the first line for the tabular data should be the feature names of each column. The FOLD-R++ does not need encoding for training. It can deal with numeric, categorical, and even mixed type features (one column contains categorical and numeric values) directly. But, the numeric features should be specified before loading data, otherwise they would be dealt like categorical features (only literals with = and != would be generated).

There are many UCI datasets can be found in the data directory, and the code pieces of data preparation should be added to datasets.py.

For example, the UCI breast-w dataset can be loaded with the following code:

columns = ['clump_thickness', 'cell_size_uniformity', 'cell_shape_uniformity', 'marginal_adhesion',
'single_epi_cell_size', 'bare_nuclei', 'bland_chromatin', 'normal_nucleoli', 'mitoses']
nums = columns
data, num_idx, columns = load_data('data/breastw/breastw.csv', attrs=columns, label=['label'], numerics=nums, pos='benign')

columns lists all the features needed, nums lists all the numeric features, label implies the feature name of the label, pos indicates the positive value of the label.

Training

The FOLD-R++ algorithm generates an explainable model that is represented with an answer set program for classification tasks. Here's an training example for breast-w dataset:

X_train, Y_train = split_xy(data_train)
X_pos, X_neg = split_X_by_Y(X_train, Y_train)
rules1 = foldrpp(X_pos, X_neg, [])

We have got a rule set rules1 in a nested intermediate representation. Flatten and decode the nested rules to answer set program:

fr1 = flatten(rules1)
rule_set = decode_rules(fr1, attrs)
for r in rule_set:
    print(r)

The training process can be started with: python3 main.py

An answer set program that is compatible with s(CASP) is generated as below.

% breastw dataset (699, 10).
% the answer set program generated by foldr++:

label(X,'benign'):- bare_nuclei(X,'?').
label(X,'benign'):- bland_chromatin(X,N6), N6=<4.0,
		    clump_thickness(X,N0), N0=<6.0,  
                    bare_nuclei(X,N5), N5=<1.0, not ab7(X).   
label(X,'benign'):- cell_size_uniformity(X,N1), N1=<2.0,
		    not ab3(X), not ab5(X), not ab6(X).  
label(X,'benign'):- cell_size_uniformity(X,N1), N1=<4.0,
		    bare_nuclei(X,N5), N5=<3.0,
		    clump_thickness(X,N0), N0=<3.0, not ab8(X).  
ab2(X):- clump_thickness(X,N0), N0=<1.0.  
ab3(X):- bare_nuclei(X,N5), N5>5.0, not ab2(X).  
ab4(X):- cell_shape_uniformity(X,N2), N2=<1.0.  
ab5(X):- clump_thickness(X,N0), N0>7.0, not ab4(X).  
ab6(X):- bare_nuclei(X,N5), N5>4.0, single_epi_cell_size(X,N4), N4=<1.0.  
ab7(X):- marginal_adhesion(X,N3), N3>4.0.  
ab8(X):- marginal_adhesion(X,N3), N3>6.0.  

% foldr++ costs:  0:00:00.027710  post: 0:00:00.000127
% acc 0.95 p 0.96 r 0.9697 f1 0.9648 

Testing in Python

The testing data X_test, a set of testing data, can be predicted with the predict function in Python.

Y_test_hat = predict(rules1, X_test)

The classify function can also be used to classify a single data.

y_test_hat = classify(rules1, x_test)

Justification by using s(CASP)

Classification and justification can be conducted with s(CASP), but the data also need to be converted into predicate format. The decode_test_data function can be used for generating predicates for testing data.

data_pred = decode_test_data(data_test, attrs)
for p in data_pred:
    print(p)

Here is an example of generated testing data predicates along with the answer set program for acute dataset:

% acute dataset (120, 7) 
% the answer set program generated by foldr++:

ab2(X):- a5(X,'no'), a1(X,N0), N0>37.9.
label(X,'yes'):- not a4(X,'no'), not ab2(X).

% foldr++ costs:  0:00:00.001990  post: 0:00:00.000040
% acc 1.0 p 1.0 r 1.0 f1 1.0 

id(1).
a1(1,37.2).
a2(1,'no').
a3(1,'yes').
a4(1,'no').
a5(1,'no').
a6(1,'no').

id(2).
a1(2,38.1).
a2(2,'no').
a3(2,'yes').
a4(2,'yes').
a5(2,'no').
a6(2,'yes').

id(3).
a1(3,37.5).
a2(3,'no').
a3(3,'no').
a4(3,'yes').
a5(3,'yes').
a6(3,'yes').

s(CASP)

All the resources of s(CASP) can be found at https://gitlab.software.imdea.org/ciao-lang/sCASP.

Citation

@misc{wang2021foldr,
      title={FOLD-R++: A Toolset for Automated Inductive Learning of Default Theories from Mixed Data}, 
      author={Huaduo Wang and Gopal Gupta},
      year={2021},
      eprint={2110.07843},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
Datasets for new state-of-the-art challenge in disentanglement learning

High resolution disentanglement datasets This repository contains the Falcor3D and Isaac3D datasets, which present a state-of-the-art challenge for co

NVIDIA Research Projects 37 May 26, 2022
Pytorch Implementation for (STANet+ and STANet)

Pytorch Implementation for (STANet+ and STANet) V2-Weakly Supervised Visual-Auditory Saliency Detection with Multigranularity Perception (arxiv), pdf:

GuotaoWang 14 Nov 29, 2022
A denoising diffusion probabilistic model synthesises galaxies that are qualitatively and physically indistinguishable from the real thing.

Realistic galaxy simulation via score-based generative models Official code for 'Realistic galaxy simulation via score-based generative models'. We us

Michael Smith 32 Dec 20, 2022
https://sites.google.com/cornell.edu/recsys2021tutorial

Counterfactual Learning and Evaluation for Recommender Systems (RecSys'21 Tutorial) Materials for "Counterfactual Learning and Evaluation for Recommen

yuta-saito 45 Nov 10, 2022
Exploit ILP to learn symmetry breaking constraints of ASP programs.

ILP Symmetry Breaking Overview This project aims to exploit inductive logic programming to lift symmetry breaking constraints of ASP programs. Given a

Research Group Production Systems 1 Apr 13, 2022
The PASS dataset: pretrained models and how to get the data - PASS: Pictures without humAns for Self-Supervised Pretraining

The PASS dataset: pretrained models and how to get the data - PASS: Pictures without humAns for Self-Supervised Pretraining

Yuki M. Asano 249 Dec 22, 2022
Code for Paper "Evidential Softmax for Sparse MultimodalDistributions in Deep Generative Models"

Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models Abstract Many applications of generative models rely on the marginali

Stanford Intelligent Systems Laboratory 9 Jun 06, 2022
[ICCV 2021] Our work presents a novel neural rendering approach that can efficiently reconstruct geometric and neural radiance fields for view synthesis.

MVSNeRF Project page | Paper This repository contains a pytorch lightning implementation for the ICCV 2021 paper: MVSNeRF: Fast Generalizable Radiance

Anpei Chen 529 Dec 30, 2022
Source code for Acorn, the precision farming rover by Twisted Fields

Acorn precision farming rover This is the software repository for Acorn, the precision farming rover by Twisted Fields. For more information see twist

Twisted Fields 198 Jan 02, 2023
A PyTorch Implementation of Neural IMage Assessment

NIMA: Neural IMage Assessment This is a PyTorch implementation of the paper NIMA: Neural IMage Assessment (accepted at IEEE Transactions on Image Proc

yunxiaos 418 Dec 29, 2022
This repository contains the source code of our work on designing efficient CNNs for computer vision

Efficient networks for Computer Vision This repo contains source code of our work on designing efficient networks for different computer vision tasks:

Sachin Mehta 386 Nov 26, 2022
A python library for time-series smoothing and outlier detection in a vectorized way.

tsmoothie A python library for time-series smoothing and outlier detection in a vectorized way. Overview tsmoothie computes, in a fast and efficient w

Marco Cerliani 517 Dec 28, 2022
[RSS 2021] An End-to-End Differentiable Framework for Contact-Aware Robot Design

DiffHand This repository contains the implementation for the paper An End-to-End Differentiable Framework for Contact-Aware Robot Design (RSS 2021). I

Jie Xu 60 Jan 04, 2023
A PyTorch Implementation of "SINE: Scalable Incomplete Network Embedding" (ICDM 2018).

Scalable Incomplete Network Embedding ⠀⠀ A PyTorch implementation of Scalable Incomplete Network Embedding (ICDM 2018). Abstract Attributed network em

Benedek Rozemberczki 69 Sep 22, 2022
This project implements "virtual speed" from heart rate monito

ANT+ Virtual Stride Based Speed and Distance Monitor Overview This project imple

2 May 20, 2022
project page for VinVL

VinVL: Revisiting Visual Representations in Vision-Language Models Updates 02/28/2021: Project page built. Introduction This repository is the project

308 Jan 09, 2023
Library to enable Bayesian active learning in your research or labeling work.

Bayesian Active Learning (BaaL) BaaL is an active learning library developed at ElementAI. This repository contains techniques and reusable components

ElementAI 687 Dec 25, 2022
Geometric Deep Learning Extension Library for PyTorch

Documentation | Paper | Colab Notebooks | External Resources | OGB Examples PyTorch Geometric (PyG) is a geometric deep learning extension library for

Matthias Fey 16.5k Jan 08, 2023
Morphable Detector for Object Detection on Demand

Morphable Detector for Object Detection on Demand (ICCV 2021) PyTorch implementation of the paper Morphable Detector for Object Detection on Demand. I

9 Feb 23, 2022
ArcaneGAN by Alex Spirin

ArcaneGAN by Alex Spirin

Alex 617 Dec 28, 2022