IEEE Winter Conference on Applications of Computer Vision 2022 Accepted

Overview

SSKT(Accepted WACV2022)

Concept map

concept

Dataset

  • Image dataset
    • CIFAR10 (torchvision)
    • CIFAR100 (torchvision)
    • STL10 (torchvision)
    • Pascal VOC (torchvision)
    • ImageNet(I) (torchvision)
    • Places365(P)
  • Video dataset

Pre-trained models

  • Imagenet
    • we used the pre-trained model in torchvision.
    • using resnet18, 50
  • Places365

Option

  • isSource
    • Single Source Transfer Module
    • Transfer Module X, Only using auxiliary layer
  • transfer_module
    • Single Source Transfer Module
  • multi_source
    • multiple task transfer learning

Training

  • 2D PreLeKT
 python main.py --model resnet20  --source_arch resnet50 --sourceKind places365 --result /raid/video_data/output/PreLeKT --dataset stl10 --lr 0.1 --wd 5e-4 --epochs 200 --classifier_loss_method ce --auxiliary_loss_method kd --isSource --multi_source --transfer_module
  • 3D PreLeKT
 python main.py --root_path /raid/video_data/ucf101/ --video_path frames --annotation_path ucf101_01.json  --result_path /raid/video_data/output/PreLeKT --n_classes 400 --n_finetune_classes 101 --model resnet --model_depth 18 --resnet_shortcut A --batch_size 128 --n_threads 4 --pretrain_path /nvadmin/Pretrained_model/resnet-18-kinetics.pth --ft_begin_index 4 --dataset ucf101 --isSource --transfer_module --multi_source

Experiment

Comparison with other knowledge transfer methods.

  • For a further analysis of SSKT, we compared its performance with those of typical knowledge transfer methods, namely KD[1] and DML[3]
  • For KD, the details for learning were set the same as in [1], and for DML, training was performed in the same way as in [3].
  • In the case of 3D-CNN-based action classification[2], both learning from scratch and fine tuning results were included
Tt Model KD DML SSKT(Ts)
CIFAR10 ResNet20 91.75±0.24 92.37±0.15 92.46±0.15 (P+I)
CIFAR10 ResNet32 92.61±0.31 93.26±0.21 93.38±0.02 (P+I)
CIFAR100 ResNet20 68.66±0.24 69.48±0.05 68.63±0.12 (I)
CIFAR100 ResNet32 70.5±0.05 71.9±0.03 70.94±0.36 (P+I)
STL10 ResNet20 77.67±1.41 78.23±1.23 84.56±0.35 (P+I)
STL10 ResNet32 76.07±0.67 77.14±1.64 83.68±0.28 (I)
VOC ResNet18 64.11±0.18 39.89±0.07 76.42±0.06 (P+I)
VOC ResNet34 64.57±0.12 39.97±0.16 77.02±0.02 (P+I)
VOC ResNet50 62.39±0.6 39.65±0.03 77.1±0.14 (P+I)
UCF101 3D ResNet18(scratch) - 13.8 52.19(P+I)
UCF101 3D ResNet18(fine-tuning) - 83.95 84.58 (P)
HMDB51 3D ResNet18(scratch) - 3.01 17.91 (P+I)
HMDB51 3D ResNet18(fine-tuning) - 56.44 57.82 (P)

The performance comparison with MAXL[4], another auxiliary learning-based transfer learning method

  • The difference between the learning scheduler in MAXL and in our experiment is whether cosine annealing scheduler and focal loss are used or not.
  • In VGG16, SSKT showed better performance in all settings. In ResNet20, we also showed better performance in our settings than MAXL in all settings.
Tt Model MAXL (ψ[i]) SSKT (Ts, Loss ) Ts Model
CIFAR10 VGG16 93.49±0.05 (5) 94.1±0.1 (I, F) VGG16
CIFAR10 VGG16 - 94.22±0.02 (I, CE) VGG16
CIFAR10 ResNet20 91.56±0.16 (10) 91.48±0.03 (I, F) VGG16
CIFAR10 ResNet20 - 92.46±0.15 (P+I, CE) ResNet50, ResNet50

Citation

If you use SSKD in your research, please consider citing:

@InProceedings{SSKD_2022_WACV,
author = {Seungbum Hong, Jihun Yoon, and Min-Kook Choi},
title = {Self-Supervised Knowledge Transfer via Loosely Supervised Auxiliary Tasks},
booktitle = {In The IEEE Winter Conference on Applications of Computer Vision (WACV)},
month = {January},
year = {2022}
}

References

This repository is an open-source implementation of the ICRA 2021 paper: Locus: LiDAR-based Place Recognition using Spatiotemporal Higher-Order Pooling.

Locus This repository is an open-source implementation of the ICRA 2021 paper: Locus: LiDAR-based Place Recognition using Spatiotemporal Higher-Order

Robotics and Autonomous Systems Group 96 Dec 15, 2022
Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity

[ICLR 2022] Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity by Shiwei Liu, Tianlong Chen, Zahra Atashgahi, Xiaohan Chen, Ghada Sokar, Elen

VITA 18 Dec 31, 2022
A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities

MPT A Multi-modal Perception Tracker (MPT) for speaker tracking using both audio and visual modalities. Implementation for our AAAI 2022 paper: Multi-

yidiLi 4 May 08, 2022
Addition of pseudotorsion caclulation eta, theta, eta', and theta' to barnaba package

Addition to Original Barnaba Code: This is modified version of Barnaba package to calculate RNA pseudotorsion angles eta, theta, eta', and theta'. Ple

Mandar Kulkarni 1 Jan 11, 2022
Virtual Dance Reality Stage is a feature that offers you to share a stage with another user virtually.

Virtual Dance Reality Stage is a feature that offers you to share a stage with another user virtually. It uses the concept of Image Background Removal using DeepLab Architecture (based on Semantic Se

Devashi Choudhary 5 Aug 24, 2022
Semantic Segmentation for Aerial Imagery using Convolutional Neural Network

This repo has been deprecated because whole things are re-implemented by using Chainer and I did refactoring for many codes. So please check this newe

Shunta Saito 27 Sep 23, 2022
A task-agnostic vision-language architecture as a step towards General Purpose Vision

Towards General Purpose Vision Systems By Tanmay Gupta, Amita Kamath, Aniruddha Kembhavi, and Derek Hoiem Overview Welcome to the official code base f

AI2 79 Dec 23, 2022
Algebraic effect handlers in Python

PyEffect: Algebraic effects in Python What IDK. Usage effects.handle(operation, handlers=None) effects.set_handler(effect, handler) Supported effects

Greg Werbin 5 Dec 27, 2021
Title: Heart-Failure-Classification

This Notebook is based off an open source dataset available on where I have created models to classify patients who can potentially witness heart failure on the basis of various parameters. The best

Akarsh Singh 2 Sep 13, 2022
Learning to Prompt for Vision-Language Models.

CoOp Paper: Learning to Prompt for Vision-Language Models Authors: Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu CoOp (Context Optimization)

Kaiyang 679 Jan 04, 2023
ECLARE: Extreme Classification with Label Graph Correlations

ECLARE ECLARE: Extreme Classification with Label Graph Correlations @InProceedings{Mittal21b, author = "Mittal, A. and Sachdeva, N. and Agrawal

Extreme Classification 35 Nov 06, 2022
Code for the IJCAI 2021 paper "Structure Guided Lane Detection"

SGNet Project for the IJCAI 2021 paper "Structure Guided Lane Detection" Abstract Recently, lane detection has made great progress with the rapid deve

Jinming Su 27 Dec 08, 2022
Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, CVPR 2018

Learning Pixel-level Semantic Affinity with Image-level Supervision This code is deprecated. Please see https://github.com/jiwoon-ahn/irn instead. Int

Jiwoon Ahn 337 Dec 15, 2022
High-Resolution 3D Human Digitization from A Single Image.

PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization (CVPR 2020) News: [2020/06/15] Demo with Google Colab (i

Meta Research 8.4k Dec 29, 2022
Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning

Here is deepparse. Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning. Use deepparse to Use the pr

GRAAL/GRAIL 192 Dec 20, 2022
Parallel and High-Fidelity Text-to-Lip Generation; AAAI 2022 ; Official code

Parallel and High-Fidelity Text-to-Lip Generation This repository is the official PyTorch implementation of our AAAI-2022 paper, in which we propose P

Zhying 77 Dec 21, 2022
Tensorflow implementation of our method: "Triangle Graph Interest Network for Click-through Rate Prediction".

TGIN Tensorflow implementation of our method: "Triangle Graph Interest Network for Click-through Rate Prediction". Files in the folder dataset/ electr

Alibaba 21 Dec 21, 2022
Official implementation of the paper Momentum Capsule Networks (MoCapsNet)

Momentum Capsule Network Official implementation of the paper Momentum Capsule Networks (MoCapsNet). Abstract Capsule networks are a class of neural n

8 Oct 20, 2022
Pytorch implementation of the paper: "A Unified Framework for Separating Superimposed Images", in CVPR 2020.

Deep Adversarial Decomposition PDF | Supp | 1min-DemoVideo Pytorch implementation of the paper: "Deep Adversarial Decomposition: A Unified Framework f

Zhengxia Zou 72 Dec 18, 2022
Telegram chatbot created with deep learning model (LSTM) and telebot library.

Telegram chatbot Telegram chatbot created with deep learning model (LSTM) and telebot library. Description This program will allow you to create very

1 Jan 04, 2022