This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".

Last update: Jan 11, 2022

Related tags

Deep Learning text-representation

Overview

Introduction

This code is the implementation of the paper "Coherence-Based Distributed Document Representation Learning for Scientific Documents".

If you find this code useful, please cite the following paper:

@article{tan2022coherence,
  title = {Coherence-Based Distributed Document Representation Learning for Scientific Documents},
  author = {Tan, Shicheng and Zhao, Shu and Zhang, Yanping},
  journal = {arXiv},
  year = {2022},
  type = {Journal Article}
}

Run

Set up the environment (see requirements.txt).
Download the data: Link: https://pan.baidu.com/s/1EEJk0_P55Ov5ReXsmyVZPA Password: rkh0
Run the model: python _av_CTE.py

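Guide to running on the information retrieval data

Data processing (4 files): use "...data helper-IR.py" to produce 3 pieces of data (the intermediate file from raw-data processing, the corpus of that intermediate file, and the constructed dataset), then use "_aj_get dataset corpus.py" to obtain the corpus of the constructed dataset.
Word embedding training (4 files): use "_ak_get word embedding.py" to train on the 2 corpora from the first step, which yields 2 vocabularies and 2 word embedding files. For GloVe, the ".txt" suffix must be removed.
Run "_al_em-avg.py" 5 times to get 5 results: avg-word2vec, avg-word2vec(globe), avg-glove, avg-glove(globe), random embedding.
Run "_ac_tf-idf.py" to get one distance matrix and 1 result. The matrix is used by the CTE method.
Run the files corresponding to LDA, doc2vec, BM25, LSI, GPT2, XLNet, GPT, Transformer-XL, and XLM once each to get 9 results.
Run "_ah_WMD.py" 4 times to get 4 results: WMD-word2vec, WMD-word2vec(globe), WMD-glove, WMD-glove(globe).
Run "_at_BERT.py" 2 times to get 2 results: BERT-Large uncased, BERT-Large uncased(wwm).
Run "_at_ELMo.py" 2 times to get 2 results: ELMo-Original(5.5B), ELMo-Original(5.5B, concatenated).
Run "_av_CTE.py" 13 times to get 13 results, based on 13 base word embeddings such as random embedding.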
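
For orientation, here is a minimal sketch of what an averaged word embedding baseline such as the avg-word2vec and avg-glove runs typically computes. It is not taken from "_al_em-avg.py": the function names, the toy 2-dimensional vectors, and the use of cosine similarity for ranking are all assumptions made for illustration.

```python
import math


def average_embedding(tokens, embeddings, dim):
    """Mean of the vectors of known tokens; zero vector if no token is known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(query_tokens, documents, embeddings, dim):
    """Return document indices sorted by decreasing similarity to the query."""
    query_vec = average_embedding(query_tokens, embeddings, dim)
    scores = [
        cosine_similarity(query_vec, average_embedding(doc, embeddings, dim))
        for doc in documents
    ]
    return sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)


# Toy embeddings, made up for illustration only (hypothetical values).
toy_embeddings = {
    "graph": [1.0, 0.0],
    "network": [0.8, 0.2],
    "protein": [0.0, 1.0],
    "cell": [0.1, 0.9],
}
toy_documents = [["protein", "cell"], ["graph", "network"]]

ranking = rank_documents(["graph"], toy_documents, toy_embeddings, dim=2)
print(ranking)
```

In this toy setup the query "graph" ranks the second document first, because its averaged vector points in almost the same direction as the query vector. In practice the vectors would come from the word embedding files trained in the earlier step rather than from a hand-written dictionary.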

Owner

tsc
