Linear programming solver for paper-reviewer matching and mind-matching

Last update: Jul 05, 2022

Overview

Paper-Reviewer Matcher

A Python package for paper-reviewer matching based on topic modeling and linear programming. The algorithm is implemented based on this article. The package assigns papers to reviewers under constraints by solving a linear programming problem: it minimizes the global distance between papers and reviewers in topic space (the topic model can be, e.g., principal component analysis or Latent Semantic Analysis (LSA)).

Here is a diagram of the problem setup and how we solve it.
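
The assignment described above can be written as a small linear program. The formulation below is a sketch under assumed notation, not the package's exact model: d_{ij} is the topic distance between paper i and reviewer j, x_{ij} indicates whether paper i is assigned to reviewer j, k is the number of reviewers per paper, and c is the maximum load per reviewer. None of these symbols appear in the original, and the actual constraints in the package may differ.

```latex
\begin{aligned}
\min_{x} \quad & \sum_{i=1}^{P} \sum_{j=1}^{R} d_{ij}\, x_{ij} \\
\text{subject to} \quad
  & \sum_{j=1}^{R} x_{ij} = k   && \text{for every paper } i, \\
  & \sum_{i=1}^{P} x_{ij} \le c && \text{for every reviewer } j, \\
  & 0 \le x_{ij} \le 1          && \text{for every pair } (i, j).
\end{aligned}
```

In this form the constraints have the structure of a transportation problem, so with integer k and c a vertex solution of the relaxation takes only the values 0 and 1.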

Mind-Match Command Line

Mind-Match is a session we run at the Cognitive Computational Neuroscience (CCN) conference. We use a combination of topic modeling and linear programming to solve the optimal matching problem. To run the example Mind-Match algorithm on a sample of 500 people, clone the repository and run the following

python mindmatch.py data/mindmatch_example.csv --n_match=6 --n_trim=50

in the root of this repo. This should produce a matching output file, output_match.csv, in the same location. However, as the number of people grows, this script takes quite a long time to run. To make it run faster, we pre-cluster people into groups before running the mind-matching. Below is an example script for pre-clustering and mind-matching on all data:

python mindmatch_cluster.py data/mindmatch_example.csv --n_match=6 --n_trim=50 --n_clusters=4
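
To get a feel for why pre-clustering helps, it can be useful to count candidate pairs. The sketch below assumes one decision variable per ordered pair of people and evenly sized clusters. The function names are hypothetical and are not part of mindmatch_cluster.py, and the real problem size also depends on trimming.

```python
def n_pair_variables(n_people: int) -> int:
    """Ordered pairs (i, j) with i != j: one candidate variable each (assumption)."""
    return n_people * (n_people - 1)


def n_pair_variables_clustered(n_people: int, n_clusters: int) -> int:
    """Same count when matching only happens inside evenly split clusters."""
    base, extra = divmod(n_people, n_clusters)
    # The first `extra` clusters get one more person than the rest.
    sizes = [base + 1] * extra + [base] * (n_clusters - extra)
    return sum(n_pair_variables(size) for size in sizes)


full = n_pair_variables(500)                    # all 500 people in one problem
clustered = n_pair_variables_clustered(500, 4)  # four clusters of 125 people
print(full, clustered)  # prints "249500 62000"
```

Under these assumptions, four clusters cut the number of candidate pairs to roughly a quarter, at the cost of never matching people across clusters.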

Example script for the conferences

Here, I include recent scripts for the CCN conference, including our Mind Matching session.

ccn_mind_matching_2019.py contains the script for the Mind Matching session (matching scientists to scientists) at the CCN conference
ccn_paper_reviewer_matching.py contains the script for matching publications to reviewers for the CCN conference; see the example CSV files in the data folder

The code computes a topic-distance matrix between incoming papers and reviewers (for ccn_paper_reviewer_matching.py) or between people and people (for ccn_mind_matching_2019.py). We trim the matrix so that the problem is not too big to solve using or-tools. It then solves a linear programming problem to assign the best matches, which minimize the global distance between papers and reviewers. After that, we produce output that can be used by the organizers of the CCN conference: pairs of papers and reviewers, or a mind-matching schedule between people at the conference. You can see how it works below.
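
The first two steps (building a distance matrix and trimming it) can be illustrated with a minimal, standard-library-only sketch. It is not the code from the scripts above. The cosine distance, the function names, and the `n_keep` parameter are assumptions made for illustration; in particular, `n_keep` is one plausible reading of trimming (keep each row's closest candidates) and may not match the semantics of --n_trim. The solving step is left to the linear programming solver.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two topic vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an all-zero vector as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(papers, reviewers):
    """Rows are papers, columns are reviewers."""
    return [[cosine_distance(p, r) for r in reviewers] for p in papers]


def trim(dist, n_keep):
    """Keep each row's n_keep smallest distances; mark the rest as None.

    Pairs marked None would simply get no variable in the optimization.
    """
    trimmed = []
    for row in dist:
        order = sorted(range(len(row)), key=row.__getitem__)
        keep = set(order[:n_keep])
        trimmed.append([d if j in keep else None for j, d in enumerate(row)])
    return trimmed


# Toy topic vectors (made up for this example).
papers = [[1.0, 0.0], [0.0, 1.0]]
reviewers = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]

dist = distance_matrix(papers, reviewers)
trimmed = trim(dist, n_keep=2)
for row in trimmed:
    print(row)
```

In this toy case, each paper keeps its two closest reviewers, and the most distant reviewer in each row is dropped from consideration.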

Dependencies

Use pip to install the dependencies

pip install -r requirements.txt

Please see Stackoverflow if you have a problem installing or-tools on macOS. You can use pip to install protobuf before installing or-tools

pip install protobuf==3.0.0b4
pip install ortools

For Python 3.6,

pip install --user --upgrade ortools

Citations

If you use Paper-Reviewer Matcher in your work or conference, please cite us as follows

@misc{achakulvisut2018,
    author = {Achakulvisut, Titipat and Acuna, Daniel E. and Kording, Konrad},
    title = {Paper-Reviewer Matcher},
    year = {2018},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/titipata/paper-reviewer-matcher}},
    commit = {9d346ee008e2789d34034c2b330b6ba483537674}
}

Members

Daniel Acuna (original author)
Titipat Achakulvisut (refactor)
Konrad Kording

Owner

Titipat Achakulvisut
