HeCo

This repo contains the source code for the KDD 2021 paper "Self-supervised Heterogeneous Graph Neural Network with Co-contrastive Learning".
Paper Link: https://arxiv.org/abs/2105.09111

Environment Settings

python==3.8.5
scipy==1.5.4
torch==1.7.0
numpy==1.19.2
scikit_learn==0.24.2

GPU: GeForce RTX 2080 Ti
CPU: Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz

Usage

First, go into ./code. Then use the following command to run our model:

python main.py acm --gpu=0

Here, "acm" can be replaced by "dblp", "aminer" or "freebase".

Some tips on parameters

We suggest carefully selecting “pos_num” (set in ./data/pos.py), which controls the threshold on the number of positives for every node. This is very important to the final results. Of course, more effective ways to select positives are welcome.
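To make the role of “pos_num” concrete, here is a minimal sketch of threshold-based positive selection. It is not the code in ./data/pos.py: the function name `select_positives`, the input format (one dict of meta-path counts per node), and the rule that a node always counts as its own positive are all assumptions made for illustration.

```python
from typing import Dict, List


def select_positives(path_counts: List[Dict[int, int]], pos_num: int) -> List[List[int]]:
    """Keep at most `pos_num` positives per node (hypothetical helper).

    path_counts[i][j] is assumed to be the number of meta-path instances
    connecting node i and node j.
    """
    positives = []
    for i, counts in enumerate(path_counts):
        # Rank candidates by descending meta-path count; break ties by
        # node id so the result is deterministic.
        ranked = sorted(
            (j for j in counts if j != i and counts[j] > 0),
            key=lambda j: (-counts[j], j),
        )
        # Assumption: the node itself is always a positive; the remaining
        # slots, up to pos_num in total, go to the best-connected nodes.
        positives.append([i] + ranked[: max(pos_num - 1, 0)])
    return positives


# Toy example: 3 nodes with made-up meta-path counts.
toy_counts = [
    {1: 3, 2: 1},
    {0: 3},
    {0: 1, 1: 2},
]
toy_positives = select_positives(toy_counts, pos_num=2)
```

A larger “pos_num” admits more weakly connected nodes as positives, while a smaller one keeps only the strongest connections, which is why the value is worth tuning per dataset.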
In ./code/utils/params.py, besides "lr" and "patience", careful tuning of dropout and tau is also worthwhile.
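As a rough illustration of why tau matters, the sketch below computes a temperature-scaled contrastive loss for a single node from a list of similarity scores. It is a simplified stand-in, not the loss implemented in this repo; the function name `contrastive_loss` and the toy similarity values are assumptions.

```python
import math
from typing import List


def contrastive_loss(sims: List[float], pos_idx: List[int], tau: float) -> float:
    """-log( sum over positives of exp(s / tau) / sum over all of exp(s / tau) )."""
    exps = [math.exp(s / tau) for s in sims]
    pos_mass = sum(exps[j] for j in pos_idx)
    return -math.log(pos_mass / sum(exps))


# Made-up similarities between one node and three candidates;
# candidate 0 is the only positive.
sims = [0.9, 0.2, 0.1]
loss_smooth = contrastive_loss(sims, pos_idx=[0], tau=1.0)
loss_sharp = contrastive_loss(sims, pos_idx=[0], tau=0.1)
```

With the same similarities, a smaller tau sharpens the softmax and changes how strongly the closest negatives are penalized, so results can be sensitive to it.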
In our experiments, we assign original features only to nodes of the target type, and one-hot features to nodes of the other types. This is because most of the datasets used provide features only for target nodes in their original version. We therefore believe that if high-quality features for the other node types are provided, the overall results will improve a lot. The AMiner dataset is an example: it has no original features, so nodes of every type are assigned one-hot features. In other words, every node has the same quality of features, and in this case our HeCo is far ahead of the other baselines. So we strongly suggest that if you have high-quality features for the other node types, you try them!
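The feature assignment described above can be sketched as follows. This is an illustrative example using NumPy, not the repo's data loader; `build_features` and the toy shapes are hypothetical.

```python
import numpy as np


def build_features(num_nodes, original=None):
    """Return the given features if they exist, else one-hot (identity) features."""
    if original is not None:
        return np.asarray(original, dtype=np.float32)
    # One-hot encoding: row i has a single 1 in column i.
    return np.eye(num_nodes, dtype=np.float32)


# Toy graph: 3 target nodes with 2-dimensional original features,
# and 4 nodes of another type that come without features.
target_feats = build_features(3, original=[[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
other_feats = build_features(4)
```

If you do have real features for the other node types, passing them as `original` replaces the one-hot fallback.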

Cite

Contact

If you have any questions, please feel free to contact me at [email protected]


Owner

Nian Liu
