Supervised Sliding Window Smoothing Loss Function Based on MS-TCN for Video Segmentation

Last update: Aug 03, 2022

Overview

SSWS-loss_function_based_on_MS-TCN


Abstract

As more and more videos are uploaded to the internet, video analysis has become one of the most important applications in many fields. Current video analysis methods fall into two kinds: weakly supervised video action segmentation and supervised video action segmentation. The former uses a sliding window or a Markov model, while the latter uses the TCN model. In this paper, we introduce the Supervised Sliding Window Smoothing Loss Function (SSWS) into the TCN baseline as a complement to TMSE, the smoothing loss function of MS-TCN. In this method, three discriminant frames are selected from the video prediction sequence and combined into an adaptive sliding window that selectively smooths the whole prediction sequence. In particular, the penalty is doubled when the window slides to a position where the category is wrong. Compared with TMSE, our method effectively increases the receptive field of the smoothing loss function. In addition, the proposed supervised loss function penalizes only error frames. Experiments show that, compared with the TMSE smoothing loss function of MS-TCN, SSWS yields significant improvements on three datasets: 50Salads, GTEA, and Breakfast.
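For context, the TMSE baseline that SSWS complements can be sketched as follows. This is a minimal, dependency-free illustration of the truncated mean squared error described in the MS-TCN paper, not the code of this repository and not SSWS itself (the abstract does not specify the window selection in enough detail to reproduce it). The function names `log_softmax` and `tmse_loss`, the threshold default `tau=4.0`, and the normalization over the `T-1` adjacent frame pairs are assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one frame's class logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def tmse_loss(frame_logits, tau=4.0):
    """Truncated MSE over adjacent frames (illustrative sketch).

    frame_logits: list of T frames, each a list of C class logits.
    tau: truncation threshold on the absolute log-probability difference
         (assumed default; treat it as a hyperparameter).
    """
    log_probs = [log_softmax(frame) for frame in frame_logits]
    num_frames = len(log_probs)
    if num_frames < 2:
        return 0.0  # no adjacent pair to compare
    num_classes = len(log_probs[0])

    total = 0.0
    for t in range(1, num_frames):
        for c in range(num_classes):
            # Difference between consecutive frames for class c, clipped at tau
            delta = min(abs(log_probs[t][c] - log_probs[t - 1][c]), tau)
            total += delta * delta
    # Assumption: average over the T-1 frame pairs and C classes
    return total / ((num_frames - 1) * num_classes)


# A constant prediction incurs no penalty; an abrupt class flip is
# penalized, but each term is capped at tau**2.
steady = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
flip = [[10.0, 0.0], [0.0, 10.0]]
print(tmse_loss(steady), tmse_loss(flip))
```

Note that this loss compares only adjacent frames, so each term sees a window of two frames. That limitation is what the abstract refers to when it says SSWS increases the receptive field of the smoothing loss. In a deep-learning framework the previous frame's log-probabilities would normally be excluded from gradient computation, a detail this plain-Python sketch does not model.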
