Implementation of the paper All Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training

Related tags

Deep Learningsemco
Overview

SemCo

The official pytorch implementation of the paper All Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training (appearing in CVPR2021)

SemCo Conceptual Diagram

Install Dependencies

  • Create a new environment and install dependencies using pip install -r requirements.txt
  • Install apex to enable automatic mixed precision training (AMP).
git clone https://github.com/NVIDIA/apex
cd apex
python setup.py install --cpp_ext --cuda_ext

Note: Installing apex is optional, if you don't want to implement amp, you can simply pass --no_amp command line argument to the launcher.

Dataset

We use a standard directory structure for all our datasets to enable running the code on any dataset of choice without the need to edit the dataloaders. The datasets directory follow the below structure (only shown for cifar100 but is the same for all other datasets):

datasets
└───cifar100
   └───train
       │   <image1>
       │   <image2>
       │   ...
   └───test
       │   <image1-test>
       │   <image2-test>
       │   ...
   └───labels
       │   labels_train.feather
       │   labels_test.feather

An example of the above directory structure for cifar100 can be found here.

To preprocess a generic dataset into the above format, you can refer to utils/utils.py for several examples.

To configure the datasets directory path, you can either set the environment variable SEMCO_DATA_PATH or pass a command line argument --dataset-path to the launcher. (e.g. export SEMCO_DATA_PATH=/home/data). Note that this path references the parent datasets directory which contains the different sub directories for the individual datasets (e.g. cifar100, mini-imagenet, etc.)

Label Semantics Embeddings

SemCo expects a prior representation of all class labels via a semantic embedding for each class name. In our experiments, we use embeddings obtained from ConceptNet knowledge graph which contains a total of ~550K term embeddings. SemCo uses a matching criteria to find the best embedding for each of the class labels. Alternatively, you can use class attributes as the prior (like we did for CUB200 dataset), so you can build your own semantic dictionary.

To run experiments, please download the semantic embedding file here and set the path to the downloaded file either via SEMCO_WV_PATH environment variable or --word-vec-path command line argument. (e.g. export SEMCO_WV_PATH=/home/inas0003/data/numberbatch-en-19.08_128D.dict.pkl

Defining the Splits

For each of the experiments, you will need to specify to the launcher 4 command line arguments:

  • --dataset-name: denoting the dataset directory name (e.g. cifar100)
  • --train-split-pickle: path to pickle file with training split
  • --valid-split-pickle: (optional) path to pickle file with validation/test split (by default contains all the files in the test folder)
  • --classes-pickle: (optional) path to pickle file with list of class names

To obtain the three pickle files for any dataset, you can use generate_tst_pkls.py script specifying the dataset name and the number of instances per label and optionally a random seed. Example as follows:

python generate_tst_pkls.py --dataset-name cifar100 --instances-per-label 10 --random-seed 000 --output-path splits

The above will generate a train split with 10 images per class using a random seed of 000 together with the class names and the validation split containing all the files placed in the test folder. This can be tweaked by editing the python script.

Training the model

To train the model on cifar100 with 40 labeled samples, you can run the script:

    $ python launch_semco.py --dataset-name cifar100 --train-split-pickle splits/cifar100_labelled_data_40_seed123.pkl --model_backbone=wres --wres-k=2

or without amp

    $ python launch_semco.py --dataset-name cifar100 --train-split-pickle splits/cifar100_labelled_data_40_seed123.pkl --model_backbone=wres --wres-k=2 --no_amp

Similary to train the model on mini_imagenet with 400 labeled samples, you can run the script:

    $  python launch_semco.py --dataset-name mini_imagenet --train-split-pickle testing/mini_imagenet_labelled_data_40_seed456.pkl --model_backbone=resnet18 --im-size=84 --cropsize=84 
A project that uses optical flow and machine learning to detect aimhacking in video clips.

waldo-anticheat A project that aims to use optical flow and machine learning to visually detect cheating or hacking in video clips from fps games. Che

waldo.vision 542 Dec 03, 2022
Simulation of self-focusing of laser beams in condensed media

What is it? Program for scientific research, which allows to simulate the phenomenon of self-focusing of different laser beams (including Gaussian, ri

Evgeny Vasilyev 13 Dec 24, 2022
MMdet2-based reposity about lightweight detection model: Nanodet, PicoDet.

Lightweight-Detection-and-KD MMdet2-based reposity about lightweight detection model: Nanodet, PicoDet. This repo also includes detection knowledge di

Egqawkq 12 Jan 05, 2023
Code for "Continuous-Time Meta-Learning with Forward Mode Differentiation" (ICLR 2022)

Continuous-Time Meta-Learning with Forward Mode Differentiation ICLR 2022 (Spotlight) - Installation - Example - Citation This repository contains the

Tristan Deleu 25 Oct 20, 2022
Parameter Efficient Deep Probabilistic Forecasting

PEDPF Parameter Efficient Deep Probabilistic Forecasting (PEDPF) is a repository containing code to run experiments for several deep learning based pr

Olivier Sprangers 10 Jun 13, 2022
Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities

ORB-SLAM2 Authors: Raul Mur-Artal, Juan D. Tardos, J. M. M. Montiel and Dorian Galvez-Lopez (DBoW2) 13 Jan 2017: OpenCV 3 and Eigen 3.3 are now suppor

Raul Mur-Artal 7.8k Dec 30, 2022
Small little script to scrape, parse and check for active tor nodes. Can be used as proxies.

TorScrape TorScrape is a small but useful script made in python that scrapes a website for active tor nodes, parse the html and then save the nodes in

5 Dec 04, 2022
Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network Paddle-PANet 目录 结果对比 论文介绍 快速安装 结果对比 CTW1500 Method Backbone Fine

7 Aug 08, 2022
Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.

Learning Opinion Summarizers by Selecting Informative Reviews This repository contains the codebase and the dataset for the corresponding EMNLP 2021

Arthur Bražinskas 39 Jan 01, 2023
LeafSnap replicated using deep neural networks to test accuracy compared to traditional computer vision methods.

Deep-Leafsnap Convolutional Neural Networks have become largely popular in image tasks such as image classification recently largely due to to Krizhev

Sujith Vishwajith 48 Nov 27, 2022
Playable Video Generation

Playable Video Generation Playable Video Generation Willi Menapace, Stéphane Lathuilière, Sergey Tulyakov, Aliaksandr Siarohin, Elisa Ricci Paper: ArX

Willi Menapace 136 Dec 31, 2022
Boundary-preserving Mask R-CNN (ECCV 2020)

BMaskR-CNN This code is developed on Detectron2 Boundary-preserving Mask R-CNN ECCV 2020 Tianheng Cheng, Xinggang Wang, Lichao Huang, Wenyu Liu Video

Hust Visual Learning Team 178 Nov 28, 2022
🔅 Shapash makes Machine Learning models transparent and understandable by everyone

🎉 What's new ? Version New Feature Description Tutorial 1.6.x Explainability Quality Metrics To help increase confidence in explainability methods, y

MAIF 2.1k Dec 27, 2022
FedCV: A Federated Learning Framework for Diverse Computer Vision Tasks

FedCV: A Federated Learning Framework for Diverse Computer Vision Tasks Image Classification Dataset: Google Landmark, COCO, ImageNet Model: Efficient

FedML-AI 62 Dec 10, 2022
fklearn: Functional Machine Learning

fklearn: Functional Machine Learning fklearn uses functional programming principles to make it easier to solve real problems with Machine Learning. Th

nubank 1.4k Dec 07, 2022
Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CVPR 2021)

Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CAC) Xin Lai*, Zhuotao Tian*, Li Jiang, Shu Liu, Hengshuang Zhao, Li

Jia Research Lab 137 Dec 14, 2022
Dynamic Multi-scale Filters for Semantic Segmentation (DMNet ICCV'2019)

Dynamic Multi-scale Filters for Semantic Segmentation (DMNet ICCV'2019) Introduction Official implementation of Dynamic Multi-scale Filters for Semant

23 Oct 21, 2022
Official Pytorch Implementation of: "ImageNet-21K Pretraining for the Masses"(2021) paper

ImageNet-21K Pretraining for the Masses Paper | Pretrained models Official PyTorch Implementation Tal Ridnik, Emanuel Ben-Baruch, Asaf Noy, Lihi Zelni

574 Jan 02, 2023
Cortex-compatible model server for Python and TensorFlow

Nucleus model server Nucleus is a model server for TensorFlow and generic Python models. It is compatible with Cortex clusters, Kubernetes clusters, a

Cortex Labs 14 Nov 27, 2022
Generate Cartoon Images using Generative Adversarial Network

AvatarGAN ✨ Generate Cartoon Images using DC-GAN Deep Convolutional GAN is a generative adversarial network architecture. It uses a couple of guidelin

Aakash Jhawar 50 Dec 29, 2022