ICLR 2022 Paper submission trend analysis

Last update: Dec 06, 2022

Related tags

Data Analysis ICLR2022-OpenReviewData

Overview

Visualize ICLR 2022 OpenReview Data

ICLR 2022 Paper submission analysis from https://openreview.net/group?id=ICLR.cc/2022/Conference

Requirements

pip install wordcloud nltk pandas imageio selenium tqdm

Download the required NLTK data packages:

import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('wordnet')
nltk.download('stopwords')

If calling webdriver.Edge('msedgedriver.exe') raises an error, try the following:

- Delete msedgedriver.exe, since the bundled copy may only work on my computer (Windows).
- Install Microsoft Edge (Chromium). To confirm that it is installed, go to edge://settings/help in the browser and verify that the version is 75 or later.
- Download Microsoft Edge Driver: go to edge://settings/help to find your Edge version, then open the Microsoft Edge Driver downloads page and download the driver that matches that version number.

From https://stackoverflow.com/questions/63529124/how-to-open-up-microsoft-edge-using-selenium-and-python
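The driver setup above can be sketched roughly as follows. This is a minimal sketch, not the code used by crawl_paperlist.py: the helper names `find_driver` and `make_edge_driver` are hypothetical, and it assumes Selenium 4, where the driver path is passed through a `Service` object rather than as a positional argument. Recent Selenium 4 releases can usually locate a matching driver on their own when no path is given, but that depends on the installed version.

```python
from pathlib import Path


def find_driver(candidates=("msedgedriver.exe", "msedgedriver")):
    """Return the absolute path of the first driver file that exists, else None."""
    for name in candidates:
        path = Path(name)
        if path.is_file():
            return str(path.resolve())
    return None


def make_edge_driver(driver_path=None):
    """Start Edge, using an explicit driver binary when one is supplied."""
    # Imported lazily so find_driver() works even without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.edge.service import Service

    if driver_path is None:
        # Assumption: the installed Selenium can resolve the driver itself.
        return webdriver.Edge()
    return webdriver.Edge(service=Service(executable_path=driver_path))
```

A typical call would be `driver = make_edge_driver(find_driver())`, followed by `driver.quit()` once crawling is done.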

Crawl Data

Run crawl_paperlist.py to crawl the list of papers (~0.5h).

Paper List (3,407 submissions in total)

crawl_paperlist.py only crawls 3,000 papers, although there are 3,407 submissions in total. The full paper list is as follows:

Visualization

Keywords Frequency

The 50 most common keywords (case-insensitive) and their frequencies:
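A case-insensitive keyword count of this kind could be computed along the following lines. This is only a sketch: the function name `top_keywords` and the sample data are made up for illustration, and counting each keyword at most once per paper is an assumption about how duplicates should be handled.

```python
from collections import Counter


def top_keywords(papers_keywords, n=50):
    """Return the n most common keywords, ignoring case and surrounding spaces."""
    counts = Counter()
    for keywords in papers_keywords:
        # A set ensures a keyword is counted at most once per paper.
        counts.update({kw.strip().lower() for kw in keywords if kw.strip()})
    return counts.most_common(n)


# Made-up sample: one keyword list per submission.
sample = [
    ["Deep Learning", "reinforcement learning"],
    ["deep learning", "Graph Neural Network"],
    ["Reinforcement Learning", "deep learning "],
]
print(top_keywords(sample, n=2))
```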

Keywords Cloud

The word cloud built from the submission keywords shows hot topics such as deep learning, reinforcement learning, representation learning, and graph neural networks.
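One way to produce such a cloud with the wordcloud package from the requirements is sketched below. The helper names, image size, and the output file name keywords_cloud.png are assumptions for illustration, not taken from the repository.

```python
from collections import Counter


def keyword_frequencies(keyword_lists):
    """Map each lower-cased keyword phrase to the number of times it appears."""
    counts = Counter()
    for keywords in keyword_lists:
        counts.update(kw.strip().lower() for kw in keywords if kw.strip())
    return dict(counts)


def save_keyword_cloud(frequencies, path="keywords_cloud.png"):
    """Render the frequencies as a word cloud image and write it to `path`."""
    # Imported lazily so the counting helper works without wordcloud installed.
    from wordcloud import WordCloud

    cloud = WordCloud(width=1200, height=600, background_color="white")
    # Passing frequencies directly keeps multi-word phrases such as
    # "deep learning" intact instead of splitting them into single words.
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(path)
    return path


# Made-up sample: one keyword list per submission.
sample = [["Deep Learning", "Graph Neural Network"], ["deep learning"]]
freqs = keyword_frequencies(sample)
```

Calling `save_keyword_cloud(freqs)` would then write the image to disk.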

Title Keywords Frequency

The 50 most common title keywords (case-insensitive) and their frequencies:
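Title keywords have to be extracted from free text first. The sketch below uses a plain regular expression and a tiny hand-written stop-word set instead of the NLTK tokenizer, stop-word corpus, and lemmatizer listed in the requirements, so it is a simplified stand-in rather than the repository's method. All names are hypothetical. Without lemmatization, forms such as "learn" and "learning" are counted separately.

```python
import re
from collections import Counter

# Tiny illustrative stop-word set (an assumption); the NLTK 'stopwords'
# corpus is far more complete.
STOP_WORDS = frozenset(
    {"a", "an", "and", "for", "in", "of", "on", "the", "to", "via", "with"}
)


def title_keywords(title, stop_words=STOP_WORDS):
    """Split a title into lower-cased tokens, keeping hyphenated words whole."""
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", title.lower())
    return [t for t in tokens if t not in stop_words and len(t) > 1]


def top_title_keywords(titles, n=50):
    """Return the n most common title keywords with their counts."""
    counts = Counter()
    for title in titles:
        counts.update(title_keywords(title))
    return counts.most_common(n)


# Made-up sample titles.
titles = [
    "Learning to Learn with Graph Neural Networks",
    "Self-Supervised Learning of Graph Representations",
]
print(top_title_keywords(titles, n=2))
```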

Title Keywords Cloud

The word cloud built from the keywords in submission titles:

Acknowledgment

Inspired by this repo: https://github.com/evanzd/ICLR2021-OpenReviewData

Owner

Jintang Li
