This repository contains code used to audit the stability of personality predictions made by two algorithmic hiring systems

Overview

Stability Audit

This repository contains code used to audit the stability of personality predictions made by two algorithmic hiring systems, Humantic AI and Crystal. This codebase supports the 2021 manuscript entitled "External Stability Auditing to Test the Validity of Personality Prediction in AI Hiring," authored by Alene K. Rhea, Kelsey Markey, Lauren D'Arinzo, Hilke Schellmann, Mona Sloane, Paul Squires, and Julia Stoyanovich.

Code

The Jupyter notebook analysis.ipynb reads in the survey and system output data, and performs all stability analysis. The notebook begins with a demographic summarization, and then estimates stability metrics for each facet experiment as described in the manuscript.

Spearman's rank correlation is used to measure rank-order stability, two-tailed Wilcoxon signed rank testing is used to measure locational stability, and normalized L1 distance is used to measure total change across each facet. Medians of each facet treatment are estimated as well. Results are saved to the results directory, organized by metric and by system (Humantic AI and Crystal). Subgroup analysis is performed for rank-order stability and total change. Highlighting is employed to indicate correlations below 0.95 and 0.90, and Wilcoxon p-values below the Bonferroni and Benjamini-Hochberg corrected thresholds. Scatterplots are produced to compare the outputs from each pair of facet treatments. Boxplots illustrate total change. Boxplots comparing relevant subgroup analysis for each facet are produced as well.

Data

Survey

Anonymized survey results are saved in data/survey.csv. Columns described in the table below.

Column Type Description Values
Participant_ID str Unique ID used to identify participant. "ID2" - "ID101" (missing IDs indicate potential subjects were screened out of participation)
gender str Participant gender, as reported in the survey. Pre-processed to mask rare responses in order to preserve anonymity. ["Male" "Female" "Other Gender"]
race str Participant race, as reported in the survey. Pre-processed to mask rare responses in order to preserve anonymity. Empty entries indicates participants declined to self-identify their race in the survey. ["Asian" "White" "Other Race" NaN]
birth_country str Participant birth country, as reported in the survey. Pre-processed to mask rare responses in order to preserve anonymity. Empty entries indicates participants declined to provide their birth country in the survey. ["China" "India" "USA" "Other Country" NaN]
primary_language str Primary language of participant, as reported in the survey. ["English" "Other Langauge"]
resume bool Boolean flag indicating whether participant provided a resume in the survey. ["True" "False"]
linkedin bool Boolean flag indicating whether participant provided a LinkedIn in the survey. ["True" "False"]
twitter bool Boolean flag indicating whether participant provided a public Twitter handle in the survey. ["True" "False"]
linkedin_in_orig_resume bool Boolean flag indicating whether participant included a reference to their LinkedIn in the resume they submitted. Empty entries indicate participants did not submit a resume. ["True" "False" NaN]
orig_embed_type str Description of the method by which the participant referenced their LinkedIn in their submitted resume. Empty entries indicate participant did not submit a resume containing a reference to LinkedIn. ["Full url hyperlinked" "Full url not hyperlinked" "Text hyperlinked" "Other not hyperlinked" NaN]
orig_file_type str Filetype of the resume submitted by the participant. Empty entries indicate participants did not submit a resume. ["pdf" "docx" "txt" NaN]

Humantic AI and Crystal Output

Output from Humantic AI and Crystal is saved in the data directory. Each run is saved as a CSV and is named with its Run ID. Tables 3 and 4 in the manuscript (reproduced below) provide details of each run. Each file contains one row for each submitted input. Participant_ID provides a unique key, and output_success is a Boolean flag indicating that the system successfully produced output from the given input. Wherever output_success is true, there will be numeric predictions for each trait. Crystal results contain predictions for DiSC traits, and Humantic AI results contain predictions for DiSC traits and Big Five traits.

Run ID System Description Run Dates
HRo1 Humantic AI Original Resume 11/23/2020 - 01/14/2021
HRi1 Humantic AI De-Identified Resume 03/20/2021 - 03/28/2021
HRi2 Humantic AI De-Identified Resume 04/20/2021 - 04/28/2021
HRi3 Humantic AI De-Identified Resume 04/20/2021 - 04/28/2021
HRd1 Humantic AI DOCX Resume 03/20/2021 - 03/28/2021
HRu1 Humantic AI URL-Embedded Resume 04/09/2021 - 04/11/2021
HL1 Humantic AI LinkedIn 11/23/2020 - 01/14/2021
HL2 Humantic AI LinkedIn 08/10/2021 - 08/11/2021
HT1 Humantic AI Twitter 11/23/2020 - 01/14/2021
HT2 Humantic AI Twitter 08/10/2021 - 08/11/2021
CRr1 Crystal Raw Text Resume 03/31/2021 - 04/02/2021
CRr2 Crystal Raw Text Resume 05/01/2021 - 05/03/2021
CRr3 Crystal Raw Text Resume 05/01/2021 - 05/03/2021
CRp1 Crystal PDF Resume 11/23/2020 - 01/14/2021
CL1 Crystal LinkedIn 11/23/2020 - 01/14/2021
CL2 Crystal LinkedIn 09/13/2020 - 09/16/2021
Owner
Data, Responsibly
responsible data management: platform and tools
Data, Responsibly
Fortuitous Forgetting in Connectionist Networks

Fortuitous Forgetting in Connectionist Networks Introduction This repository includes reference code for the paper Fortuitous Forgetting in Connection

Hattie Zhou 14 Nov 26, 2022
Code release for NeRF (Neural Radiance Fields)

NeRF: Neural Radiance Fields Project Page | Video | Paper | Data Tensorflow implementation of optimizing a neural representation for a single scene an

6.5k Jan 01, 2023
Neural network for digit classification powered by cuda

cuda_nn_mnist Neural network library for digit classification powered by cuda Resources The library was built to work with MNIST dataset. python-mnist

Nikita Ardashev 1 Dec 20, 2021
Learning 3D Part Assembly from a Single Image

Learning 3D Part Assembly from a Single Image This repository contains a PyTorch implementation of the paper: Learning 3D Part Assembly from A Single

18 Dec 21, 2022
Anchor-free Oriented Proposal Generator for Object Detection

Anchor-free Oriented Proposal Generator for Object Detection Gong Cheng, Jiabao Wang, Ke Li, Xingxing Xie, Chunbo Lang, Yanqing Yao, Junwei Han, Intro

jbwang1997 56 Nov 15, 2022
Repo for our ICML21 paper Unsupervised Learning of Visual 3D Keypoints for Control

Unsupervised Learning of Visual 3D Keypoints for Control [Project Website] [Paper] Boyuan Chen1, Pieter Abbeel1, Deepak Pathak2 1UC Berkeley 2Carnegie

Boyuan Chen 34 Jul 22, 2022
A model which classifies reviews as positive or negative.

SentiMent Analysis In this project I built a model to classify movie reviews fromn the IMDB dataset of 50K reviews. WordtoVec : Neural networks only w

Rishabh Bali 2 Feb 09, 2022
InvTorch: memory-efficient models with invertible functions

InvTorch: Memory-Efficient Invertible Functions This module extends the functionality of torch.utils.checkpoint.checkpoint to work with invertible fun

Modar M. Alfadly 12 May 12, 2022
Geometric Vector Perceptrons --- a rotation-equivariant GNN for learning from biomolecular structure

Geometric Vector Perceptron Implementation of equivariant GVP-GNNs as described in Learning from Protein Structure with Geometric Vector Perceptrons b

Dror Lab 142 Dec 29, 2022
Source code for Fathony, Sahu, Willmott, & Kolter, "Multiplicative Filter Networks", ICLR 2021.

Multiplicative Filter Networks This repository contains a PyTorch MFN implementation and code to perform & reproduce experiments from the ICLR 2021 pa

Bosch Research 66 Jan 04, 2023
imbalanced-DL: Deep Imbalanced Learning in Python

imbalanced-DL: Deep Imbalanced Learning in Python Overview imbalanced-DL (imported as imbalanceddl) is a Python package designed to make deep imbalanc

NTUCSIE CLLab 19 Dec 28, 2022
Deep Reinforcement Learning for Keras.

Deep Reinforcement Learning for Keras What is it? keras-rl implements some state-of-the art deep reinforcement learning algorithms in Python and seaml

Keras-RL 0 Dec 15, 2022
Command-line tool for downloading and extending the RedCaps dataset.

RedCaps Downloader This repository provides the official command-line tool for downloading and extending the RedCaps dataset. Users can seamlessly dow

RedCaps dataset 33 Dec 14, 2022
Most popular metrics used to evaluate object detection algorithms.

Most popular metrics used to evaluate object detection algorithms.

Rafael Padilla 4.4k Dec 25, 2022
For auto aligning, cropping, and scaling HR and LR images for training image based neural networks

ImgAlign For auto aligning, cropping, and scaling HR and LR images for training image based neural networks Usage Make sure OpenCV is installed, 'pip

15 Dec 04, 2022
Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting

QAConv Interpretable and Generalizable Person Re-Identification with Query-Adaptive Convolution and Temporal Lifting This PyTorch code is proposed in

Shengcai Liao 166 Dec 28, 2022
YKKDetector For Python

YKKDetector OpenCVを利用した機械学習データをもとに、VRChatのスクリーンショットなどからYKKさん(もとい「幽狐族のお姉様」)を検出できるソフトウェアです。 マニュアル こちらから実行環境のセットアップから解説する詳細なマニュアルをご覧いただけます。 ライセンス 本ソフトウェア

あんふぃとらいと 5 Dec 07, 2021
A Python Reconnection Tool for alt:V

altv-reconnect What? It invokes a reconnect in the altV Client Dev Console. You get to determine when your local client should reconnect when developi

8 Jun 30, 2022
D2Go is a toolkit for efficient deep learning

D2Go D2Go is a production ready software system from FacebookResearch, which supports end-to-end model training and deployment for mobile platforms. W

Facebook Research 744 Jan 04, 2023
This repository contains the code for TACL2021 paper: SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization

SummaC: Summary Consistency Detection This repository contains the code for TACL2021 paper: SummaC: Re-Visiting NLI-based Models for Inconsistency Det

Philippe Laban 24 Jan 03, 2023