PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Last update: Dec 18, 2022

Overview

Daft-Exprt - PyTorch Implementation

The validation logs up to 70K of synthesized mel and alignment are shown below (VCTK_val_p237-088).

Quickstart

In the commands below, DATASET refers to the name of a dataset, such as VCTK.

Dependencies

You can install the Python dependencies with

pip3 install -r requirements.txt

A Dockerfile is also provided for Docker users.

Inference

You have to download the pretrained models and put them in output/ckpt/DATASET/.

For multi-speaker TTS, run

python3 synthesize.py --text "YOUR_DESIRED_TEXT" --speaker_id SPEAKER_ID --restore_step RESTORE_STEP --mode single --dataset DATASET --ref_audio REF_AUDIO

to synthesize speech in the style of the reference audio at REF_AUDIO. The dictionary of learned speakers can be found at preprocessed_data/VCTK/speakers.json, and the generated utterances will be written to output/result/.
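To pick a valid SPEAKER_ID, it can help to list the speakers in that dictionary first. The snippet below is a minimal sketch, assuming speakers.json is a JSON object that maps speaker names (e.g. "p225") to integer indices; the function name `load_speaker_ids` is hypothetical and not part of this repository.

```python
import json


def load_speaker_ids(path):
    """Read a speakers.json file and return its name -> index mapping.

    Assumption: the file holds a single JSON object such as
    {"p225": 0, "p226": 1, ...}. Adjust if your file differs.
    """
    with open(path, encoding="utf-8") as f:
        speakers = json.load(f)
    if not isinstance(speakers, dict):
        raise ValueError("expected a JSON object mapping speaker names to indices")
    return speakers


# Example usage (path taken from the README text above):
# speakers = load_speaker_ids("preprocessed_data/VCTK/speakers.json")
# print(sorted(speakers)[:5])
```

Any key of the returned mapping would then be a candidate value for --speaker_id, provided the assumption about the file layout holds.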

Batch Inference

Batch inference is also supported. Try

python3 synthesize.py --source preprocessed_data/DATASET/val.txt --restore_step RESTORE_STEP --mode batch --dataset DATASET

to synthesize all utterances in preprocessed_data/DATASET/val.txt, each using itself as the reference audio.

Controllability

The pitch, volume, and speaking rate of the synthesized utterances can be controlled by specifying the desired pitch, energy, and duration ratios. For example, one can shorten durations by 20% (a faster speaking rate) and decrease the volume by 20% with

python3 synthesize.py --text "YOUR_DESIRED_TEXT" --speaker_id SPEAKER_ID --restore_step RESTORE_STEP --mode single --dataset DATASET --ref_audio REF_AUDIO --duration_control 0.8 --energy_control 0.8
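To make the effect of a duration ratio concrete, here is a minimal sketch. It assumes the ratio simply multiplies per-phoneme frame counts before rounding, as is common in FastSpeech2-style code; the helper names and the example durations are hypothetical and this is not the repository's actual implementation.

```python
def scale_durations(durations, duration_control=1.0):
    """Scale per-phoneme frame counts by a ratio and round to whole frames."""
    if duration_control <= 0:
        raise ValueError("duration_control must be positive")
    return [max(0, round(d * duration_control)) for d in durations]


def speaking_rate_change(duration_control):
    """Relative change in speaking rate implied by a duration ratio."""
    return 1.0 / duration_control - 1.0


durations = [5, 10, 20]                    # hypothetical frame counts per phoneme
scaled = scale_durations(durations, 0.8)   # each phoneme gets 20% fewer frames
rate_change = speaking_rate_change(0.8)    # about +0.25, i.e. roughly 25% faster
```

Under this assumption, a ratio of 0.8 shortens the utterance to 80% of its length, which corresponds to speech that is about 25% faster.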

Training

Datasets

The supported datasets are

VCTK: The CSTR VCTK Corpus includes speech data uttered by 110 English speakers (multi-speaker TTS) with various accents. Each speaker reads out about 400 sentences, which were selected from a newspaper, the rainbow passage and an elicitation paragraph used for the speech accent archive.
Any multi-speaker TTS dataset (e.g., LibriTTS) can be added by following the VCTK setup.

Preprocessing

For multi-speaker TTS with an external speaker embedder, download the ResCNN Softmax+Triplet pretrained model of philipperemy's DeepSpeaker for the speaker embedding and place it in ./deepspeaker/pretrained_models/.
Run
```
python3 prepare_align.py --dataset DATASET
```
for some preparations.

For forced alignment, Montreal Forced Aligner (MFA) is used to obtain the alignments between the utterances and the phoneme sequences. Pre-extracted alignments for the datasets are provided here. Unzip the files into preprocessed_data/DATASET/TextGrid/. Alternatively, you can run the aligner yourself.

After that, run the preprocessing script by
```
python3 preprocess.py --dataset DATASET
```

Training

Train your model with

python3 train.py --dataset DATASET

TensorBoard

Use

tensorboard --logdir output/log

to serve TensorBoard on your localhost. The loss curves, synthesized mel-spectrograms, and audio samples are shown.

Implementation Issues

RangeParameterPredictor is built with a BiLSTM rather than a single linear layer with softplus() activation (the latter is nevertheless implemented and named 'range_param_predictor_paper' in GaussianUpsampling).
Use a batch size of 32 instead of 48 due to memory issues.
Use log duration instead of raw duration.
Follow FastSpeech2 for the preprocessing of pitch and energy.
Two options are available for speaker embedding in the multi-speaker TTS setting: training a speaker embedder from scratch or using philipperemy's pre-trained DeepSpeaker model (as STYLER did). You can toggle between them in the config ('none' or 'DeepSpeaker').
DeepSpeaker shows clear separation among speakers on the VCTK dataset. The following figure shows the t-SNE plot of the extracted speaker embeddings.

For the vocoder, HiFi-GAN and MelGAN are supported.
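The log-duration choice listed above can be illustrated with a short sketch. It assumes the log(d + 1) convention often seen in FastSpeech2-style code, where the +1 offset keeps zero-length phonemes finite; the source does not confirm that this repository uses the same offset, and the function names are hypothetical.

```python
import math


def to_log_duration(durations):
    """Map integer frame counts to log-domain regression targets.

    Assumption: targets are log(d + 1), so a duration of 0 maps to 0.0.
    """
    return [math.log(d + 1) for d in durations]


def from_log_duration(log_durations):
    """Invert to_log_duration and round back to non-negative frame counts."""
    return [max(0, round(math.exp(x) - 1)) for x in log_durations]


targets = to_log_duration([0, 3, 12])      # values a duration predictor would regress
recovered = from_log_duration(targets)     # round trip back to frame counts
```

Regressing in the log domain compresses the long tail of very long phonemes, which is the usual motivation for this choice.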

Citation

@misc{lee2021daft_exprt,
  author = {Lee, Keon},
  title = {Daft-Exprt},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/keonlee9420/Daft-Exprt}}
}

References

keonlee9420's WaveGrad2 for GaussianUpsampling and RangeParameterPredictor
keonlee9420's STYLER for the (domain) adversarial training of SpeakerClassifier
keonlee9420's StyleSpeech for reference audio interface
FiLM: Visual Reasoning with a General Conditioning Layer
TADAM: Task dependent adaptive metric for improved few-shot learning

Comments

Is the pretrained model correct?
I've downloaded the file @ https://drive.google.com/drive/folders/1rmeW24lrCg_qwPkVI0D4bRNq4HSv_4rE but can't manage to get the model to load, on either the master branch or the v1.0.0 zip file.

I get the following error on v1.0.0

Unexpected key(s) in state_dict: "speaker_emb.bias". size mismatch for speaker_emb.weight: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 128]).

or master:

RuntimeError: Error(s) in loading state_dict for DaftExprt: size mismatch for speaker_emb.weight: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([5, 128]).

(both with speaker embedding set to "none", with DeepSpeaker I get a different error during preprocessing). Both errors suggest the model is expecting a speaker embedding of size 128, but it's loading an embedding of size 512 (ignoring the first dim n_speakers, I can fix that if needed).
opened by nmfisher 5
FileNotFoundError: [Errno 2] No such file or directory: './preprocessed_data/VCTK/spker_embed/p317-spker_embed.npy'

Method to Reproduce Error: python3 synthesize.py --text "YOUR_DESIRED_TEXT" --restore_step 900000 --mode single --dataset VCTK --ref_audio ../my-voice-analysis/record.wav --speaker_id p317

Error: FileNotFoundError: [Errno 2] No such file or directory: './preprocessed_data/VCTK/spker_embed/p317-spker_embed.npy'

opened by anish-rajan 2
Unseen Speaker Synthesis
Hello! Would it be possible to synthesize speech for speakers not seen during training, i.e., by loading the unseen speaker's embedding at inference time?

I think by changing this particular line in synthesize.py:

spker_embeds = np.load(os.path.join( preprocess_config["path"]["preprocessed_path"], "spker_embed", "{}-spker_embed.npy".format(args.speaker_id), )) if load_spker_embed else None

However, I'm not sure whether this is wise or whether it would result in poor quality. Have you tried this?
opened by migi-gon 2
Error when inference
Hello! Thanks for the amazing work! I'm running into an error during inference:

FileNotFoundError: [Errno 2] No such file or directory: './preprocessed_data/VCTK/spker_embed/p317-spker_embed.npy'

What am I missing? Thanks!
opened by godspirit00 2
Question about film?

https://github.com/keonlee9420/Daft-Exprt/blob/e8e7a646c73e45332004c570df1f2c367698e42a/model/blocks.py#L62

This should be gammas * x + betas, right?

That is what https://github.com/ethanjperez/film/blob/fe43ddf8a22b339dcca2efa07091ce9d498955cf/vr/models/filmed_net.py#L26 does.

One more question: do you have any style-changed samples?

opened by azraelkuan 2
Bump tensorflow from 2.5.0 to 2.5.1
Bumps tensorflow from 2.5.0 to 2.5.1.

Release notes

Sourced from tensorflow's releases.

TensorFlow 2.5.1

Release 2.5.1

This release introduces several vulnerability fixes:

Fixes a heap out of bounds access in sparse reduction operations (CVE-2021-37635)

Fixes a floating point exception in SparseDenseCwiseDiv (CVE-2021-37636)

Fixes a null pointer dereference in CompressElement (CVE-2021-37637)

Fixes a null pointer dereference in RaggedTensorToTensor (CVE-2021-37638)

Fixes a null pointer dereference and a heap OOB read arising from operations restoring tensors (CVE-2021-37639)

Fixes an integer division by 0 in sparse reshaping (CVE-2021-37640)

Fixes a division by 0 in ResourceScatterDiv (CVE-2021-37642)

Fixes a heap OOB in RaggedGather (CVE-2021-37641)

Fixes a std::abort raised from TensorListReserve (CVE-2021-37644)

Fixes a null pointer dereference in MatrixDiagPartOp (CVE-2021-37643)

Fixes an integer overflow due to conversion to unsigned (CVE-2021-37645)

Fixes a bad allocation error in StringNGrams caused by integer conversion (CVE-2021-37646)

Fixes a null pointer dereference in SparseTensorSliceDataset (CVE-2021-37647)

Fixes an incorrect validation of SaveV2 inputs (CVE-2021-37648)

Fixes a null pointer dereference in UncompressElement (CVE-2021-37649)

Fixes a segfault and a heap buffer overflow in {Experimental,}DatasetToTFRecord (CVE-2021-37650)

Fixes a heap buffer overflow in FractionalAvgPoolGrad (CVE-2021-37651)

Fixes a use after free in boosted trees creation (CVE-2021-37652)

Fixes a division by 0 in ResourceGather (CVE-2021-37653)

Fixes a heap OOB and a CHECK fail in ResourceGather (CVE-2021-37654)

Fixes a heap OOB in ResourceScatterUpdate (CVE-2021-37655)

Fixes an undefined behavior arising from reference binding to nullptr in RaggedTensorToSparse (CVE-2021-37656)

Fixes an undefined behavior arising from reference binding to nullptr in MatrixDiagV* ops (CVE-2021-37657)

Fixes an undefined behavior arising from reference binding to nullptr in MatrixSetDiagV* ops (CVE-2021-37658)

Fixes an undefined behavior arising from reference binding to nullptr and heap OOB in binary cwise ops (CVE-2021-37659)

Fixes a division by 0 in inplace operations (CVE-2021-37660)

Fixes a crash caused by integer conversion to unsigned (CVE-2021-37661)

Fixes an undefined behavior arising from reference binding to nullptr in boosted trees (CVE-2021-37662)

Fixes a heap OOB in boosted trees (CVE-2021-37664)

Fixes vulnerabilities arising from incomplete validation in QuantizeV2 (CVE-2021-37663)

Fixes vulnerabilities arising from incomplete validation in MKL requantization (CVE-2021-37665)

Fixes an undefined behavior arising from reference binding to nullptr in RaggedTensorToVariant (CVE-2021-37666)

Fixes an undefined behavior arising from reference binding to nullptr in unicode encoding (CVE-2021-37667)

Fixes an FPE in tf.raw_ops.UnravelIndex (CVE-2021-37668)

Fixes a crash in NMS ops caused by integer conversion to unsigned (CVE-2021-37669)

Fixes a heap OOB in UpperBound and LowerBound (CVE-2021-37670)

Fixes an undefined behavior arising from reference binding to nullptr in map operations (CVE-2021-37671)

Fixes a heap OOB in SdcaOptimizerV2 (CVE-2021-37672)

Fixes a CHECK-fail in MapStage (CVE-2021-37673)

Fixes a vulnerability arising from incomplete validation in MaxPoolGrad (CVE-2021-37674)

Fixes an undefined behavior arising from reference binding to nullptr in shape inference (CVE-2021-37676)

Fixes a division by 0 in most convolution operators (CVE-2021-37675)

Fixes vulnerabilities arising from missing validation in shape inference for Dequantize (CVE-2021-37677)

Fixes an arbitrary code execution due to YAML deserialization (CVE-2021-37678)

Fixes a heap OOB in nested tf.map_fn with RaggedTensors (CVE-2021-37679)

... (truncated)

Commits

8222c1c Merge pull request #51381 from tensorflow/mm-fix-r2.5-build

d584260 Disable broken/flaky test

f6c6ce3 Merge pull request #51367 from tensorflow-jenkins/version-numbers-2.5.1-17468

3ca7812 Update version numbers to 2.5.1

4fdf683 Merge pull request #51361 from tensorflow/mm-update-relnotes-on-r2.5

05fc01a Put CVE numbers for fixes in parentheses

bee1dc4 Update release notes for the new patch release

47beb4c Merge pull request #50597 from kruglov-dmitry/v2.5.0-sync-abseil-cmake-bazel

6f39597 Merge pull request #49383 from ashahab/abin-load-segfault-r2.5

0539b34 Merge pull request #48979 from liufengdb/r2.5-cherrypick

dependencies
opened by dependabot[bot] 0
Single Speaker TTS

How can I perform prosody transfer for a single speaker? What value should be given for the speaker ID, and is any change required in the config file? Could an example be provided?

opened by anupama-deo 0

RuntimeError: The size of tensor a (65) must match the size of tensor b (10) at non-singleton dimension 1

Hello, I want to train the model using the python train.py --dataset VCTK command, but I get the following error:

```
Number of Daft-Exprt Parameters: 20603604
Removing weight norm...
Training:   0%|          | 0/900000 [00:00<?, ?it/s
Traceback (most recent call last):
  File "train.py", line 190, in <module>
    main(args, configs)
  File "train.py", line 85, in main
    output = model(*(batch[2:]))
  File "/home/prosody_control/daftenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/prosody_control/daftenv/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/prosody_control/daftenv/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/prosody_control/daftenv/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/home/prosody_control/daftenv/lib/python3.6/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/prosody_control/daftenv/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/home/prosody_control/daftenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/prosody_control/Daft-Exprt-main/model/DaftExprt.py", line 98, in forward
    p_control, e_control, d_control, src_masks, ref_mel_lens, ref_max_mel_len, ref_mel_masks, src_lens
  File "/home/prosody_control/daftenv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/prosody_control/Daft-Exprt-main/model/modules.py", line 539, in forward
    s_input = p_embed + e_embed + d_embed + encoder_outputs
RuntimeError: The size of tensor a (65) must match the size of tensor b (10) at non-singleton dimension 1

Training:   0%|          | 1/900000 [00:04<1071:57:08,  4.29s/it]
Epoch 1:   0%|          | 0/674 [00:04<?, ?it/s]
```

opened by niasalva 0

Releases(v1.0.1)

v1.0.1 (Oct 15, 2021)
v1.0.0 (Aug 11, 2021)

Owner

Keon Lee

