Code for the paper "Combining Textual Features for the Detection of Hateful and Offensive Language"

Last update: Aug 04, 2022

Overview

The repository provides the source code for the paper "Combining Textual Features for the Detection of Hateful and Offensive Language" submitted to HASOC 2021 English Subtask 1A.

Publication

Installation (requires Python >= 3.6)

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Download the 'resources.zip' file from https://drive.google.com/file/d/1X88cMrLVpAcJd5Z4Gg6MfTLclIuGF-d6/view?usp=sharing
Extract the contents of 'resources.zip'.
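
The extraction step can also be done from Python. The following is a minimal sketch, not part of the repository: the helper name extract_resources is hypothetical, and it assumes the archive has already been downloaded and that it unpacks into a 'resources' folder (as the default "bert_model_dir": "resources/hatebert" suggests).

```python
import zipfile
from pathlib import Path


def extract_resources(archive_path="resources.zip", target_dir="."):
    """Extract the archive into target_dir and return the member names.

    Hypothetical helper; the layout inside the archive is an assumption.
    """
    archive = Path(archive_path)
    if not archive.is_file():
        raise FileNotFoundError(f"{archive} not found; download it first")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
        return zf.namelist()
```

Calling extract_resources() from the repository root should leave the extracted files next to main.py, which is where the relative paths in config.json appear to point.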

Training and Evaluation on HASOC datasets (2019, 2020, 2021)

Execute the following command to train and evaluate the model. The evaluation results are saved under the folder 'results'.

python main.py -c config.json

Optimizing Hyperparameters

The "config.json" file contains hyperparameters that can be changed to train different variants of the model.

{
  "base_dir": "",
  "batch_size": 64,
  "epochs": 20,
  "epoch_patience": 5,
  "bert_model_dir": "resources/hatebert",
  "monitor": "loss",
  "tweet_text_seq_len": 80,
  "tweet_text_char_len": 128,
  "char_size": 29,
  "max_learning_rate": 0.001,
  "end_learning_rate": 0.0000001,
  "rnn_type": "lstm",
  "rnn_layer_size": 200,
  "text_models": ["char_emb", "bert", "hate_words"],
  "normalize_text": true,
  "dataset_year": "2021",
  "optimizer": "adam",
  "text_use_attention": false,
  "oversample": true,
  "feature_normalization_layer_size": 512,
  "min_feature_normalization_layer_size": 64
}
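
To train a variant without editing config.json by hand, one option is to derive a new configuration file from it. The sketch below is an illustration under assumptions, not code from the repository: write_variant is a hypothetical helper, and it only accepts keys that already exist in the base file so that a misspelled hyperparameter is caught early.

```python
import json
from pathlib import Path


def write_variant(base_path, out_path, **overrides):
    """Copy the base config, apply the overrides, and write it to out_path.

    Hypothetical helper; returns the resulting config dict.
    """
    config = json.loads(Path(base_path).read_text(encoding="utf-8"))
    # Reject keys that are not in the base config (likely typos).
    unknown = set(overrides) - set(config)
    if unknown:
        raise KeyError(f"unknown hyperparameters: {sorted(unknown)}")
    config.update(overrides)
    Path(out_path).write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

For example, write_variant("config.json", "config_gru.json", rnn_type="gru", dataset_year="2020") produces a file that can then be passed to the training command as python main.py -c config_gru.json.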

bert_model_dir

"bert_model_dir": "resources/hatebert"
     OR
"bert_model_dir": "resources/bert-base"

dataset_year

"dataset_year": "2019"
	OR
"dataset_year": "2020"
	OR
"dataset_year": "2021"

text_models

"text_models": ["hate_words"]
	OR
"text_models": ["bert"]
	OR
"text_models": ["char_emb"]
	OR
"text_models": ["char_emb", "bert", "hate_words"]

rnn_type

"rnn_type": "lstm"
	OR
"rnn_type": "gru"
	OR
"rnn_type": "bi-gru"
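
The options listed above can be combined into a grid of model variants. The following sketch enumerates every combination of the values shown in this README. The names OPTIONS and iter_variants are hypothetical, and whether main.py accepts other values or other subsets of text_models is not stated here, so only the documented values are included. Some combinations may be redundant in practice, for example a different bert_model_dir when "bert" is not among the text models.

```python
from itertools import product

# Values exactly as listed in this README; other values may or may not work.
OPTIONS = {
    "bert_model_dir": ["resources/hatebert", "resources/bert-base"],
    "dataset_year": ["2019", "2020", "2021"],
    "text_models": [
        ["hate_words"],
        ["bert"],
        ["char_emb"],
        ["char_emb", "bert", "hate_words"],
    ],
    "rnn_type": ["lstm", "gru", "bi-gru"],
}


def iter_variants(options=OPTIONS):
    """Yield one override dict per combination of the listed option values."""
    keys = list(options)
    for values in product(*(options[k] for k in keys)):
        yield dict(zip(keys, values))
```

Each yielded dict holds one value per option and can be merged into the base configuration before writing it out as a separate JSON file for a training run.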

Owner

Sherzod Hakimov
