Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model

Last update: Dec 07, 2022

About

This repository contains the code to replicate the synthetic experiment in the paper "Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model" by Haruka Kiyohara, Yuta Saito, Tatsuya Matsuhiro, Yusuke Narita, Nobuyuki Shimizu, and Yasuo Yamamoto, which was accepted at WSDM 2022.

If you find this code useful in your research, please cite:

@inproceedings{kiyohara2022doubly,
  author = {Kiyohara, Haruka and Saito, Yuta and Matsuhiro, Tatsuya and Narita, Yusuke and Shimizu, Nobuyuki and Yamamoto, Yasuo},
  title = {Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model},
  booktitle = {Proceedings of the 15th International Conference on Web Search and Data Mining},
  pages = {xxx--xxx},
  year = {2022},
}

Dependencies

This repository supports Python 3.7 or newer.

numpy==1.20.0
pandas==1.2.1
scikit-learn==0.24.1
matplotlib==3.4.3
obp==0.5.2
hydra-core==1.0.6
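
Because the versions above are pinned exactly, it can help to confirm that the active environment matches them before running the experiments. The following is a minimal sketch, not part of the repository: the helper name `find_mismatches` and the `PINNED` dictionary are hypothetical, and it relies on `importlib.metadata`, which is only available in the standard library from Python 3.8 onward.

```python
from importlib.metadata import PackageNotFoundError, version

# Pinned versions copied from the dependency list above.
PINNED = {
    "numpy": "1.20.0",
    "pandas": "1.2.1",
    "scikit-learn": "0.24.1",
    "matplotlib": "3.4.3",
    "obp": "0.5.2",
    "hydra-core": "1.0.6",
}


def find_mismatches(pinned, get_version=version):
    """Return {package: (expected, installed)} for every package whose
    installed version differs from the pinned one. `installed` is None
    when the package is not installed at all."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every listed package is installed at the pinned version.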

Note that the proposed Cascade-DR estimator is implemented in Open Bandit Pipeline (obp.ope.SlateCascadeDoublyRobust).
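
For intuition about what a doubly robust estimator under the cascade assumption computes, here is a minimal pure-Python sketch. It is not the obp implementation and does not use the obp API. The function names, argument names, and data layout are hypothetical, and the formula in the comments is our reading of the general cascade doubly robust form, so please check it against the paper before relying on it. The sketch assumes that the per-position importance weights and the baseline (Q-function) estimates are already computed, and that each reward is already multiplied by any position weight.

```python
def cascade_dr_single(rewards, step_weights, q_hat_logged, q_hat_expected):
    """Doubly robust estimate for one logged slate (illustrative sketch).

    rewards[k]        : observed reward at position k
    step_weights[k]   : pi_e(a_k | x, a_1..a_{k-1}) / pi_b(a_k | x, a_1..a_{k-1})
    q_hat_logged[k]   : baseline estimate at the logged action of position k
    q_hat_expected[k] : the same baseline averaged over pi_e's choice at position k

    Computes sum_k [ w_{1:k} * (r_k - q_k) + w_{1:k-1} * v_k ],
    where w_{1:k} is the product of the first k step weights and w_{1:0} = 1.
    """
    estimate = 0.0
    w_prev = 1.0  # cumulative weight up to the previous position
    for r, rho, q, v in zip(rewards, step_weights, q_hat_logged, q_hat_expected):
        w = w_prev * rho  # cumulative weight up to the current position
        estimate += w * (r - q) + w_prev * v
        w_prev = w
    return estimate


def cascade_dr(slates):
    """Average the per-slate estimates over all logged slates.

    `slates` is a list of (rewards, step_weights, q_hat_logged, q_hat_expected).
    """
    return sum(cascade_dr_single(*slate) for slate in slates) / len(slates)
```

When every baseline term is zero, the sketch reduces to an importance-weighted sum of rewards that uses only the weights of the positions up to and including the one being scored.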

Running the code

To conduct the synthetic experiment, run the following commands.

(i) Run OPE simulations with varying data size and a fixed slate size.

python src/main.py setting=n_rounds

(ii), (iii) Run OPE simulations with varying slate size and varying evaluation policy similarity, with a fixed data size.

python src/main.py

Once the code has finished executing, you can find the results (squared_error.csv, relative_ee.csv, configuration.csv) in the ./logs/ directory. Lower values are better for both squared error and relative estimation error (relative-ee).
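
As a reading aid for the two metrics, the sketch below uses their usual definitions: squared error is the squared gap between an estimate and the ground-truth policy value, and relative-ee is the absolute gap divided by the ground-truth value. The function names are hypothetical, and the repository's exact computation is not shown here, so treat this as an assumption rather than the code that produced the CSV files.

```python
def squared_error(estimated_value, ground_truth_value):
    """Squared gap between the estimate and the ground-truth policy value."""
    return (estimated_value - ground_truth_value) ** 2


def relative_ee(estimated_value, ground_truth_value):
    """Absolute gap relative to the ground-truth policy value.

    Undefined when the ground-truth value is zero.
    """
    return abs((estimated_value - ground_truth_value) / ground_truth_value)
```

Both metrics are zero for a perfect estimate, which is why lower is better.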

Visualize the results

To visualize the results, run the following command. Make sure that you have run both experiments above (python src/main.py setting=n_rounds and python src/main.py) before visualizing the results.

python src/visualize.py

You will then find the following figures (slate size (standard/cascade/independent).png, evaluation policy similarity (standard/cascade/independent).png, data size (standard/cascade/independent).png) in the ./logs/ directory. Lower values are better for the relative-MSE (y-axis).

Figure grid (images not reproduced here): the columns are the reward structures (Standard, Cascade, Independent), and the rows are varying data size (n), varying slate size (L), and varying evaluation policy similarity (λ).
