Build an Amazon SageMaker Pipeline to Transform Raw Texts to A Knowledge Graph

Overview

Build an Amazon SageMaker Pipeline to Transform Raw Texts to A Knowledge Graph

This repository provides a pipeline to create a knowledge graph from raw texts. The pipeline concatenate major steps including:

  • Data processing: transform labeled text data to the Subject-Predicate-Object (SPO) format
  • Training: use a RNN-based algorithm to train an AI model to predict SPOs from given texts
  • Create a Neptune database: if the training metric (F1-Score) passes the threshold, create a Neptune database
  • Batch Transform: use the model trained in the Training step to do inferences on the test data
  • Bulk load: transform the inference results to the format which can be recognized by the bulkload function of Neptune, and load the transformed data to the Neptune database.

Prerequisites

  • Create an AWS account or use an existing AWS account.
  • Create a SageMaker Notebook instance. When you set up the notebook instance, you need to pay attention to following configurations:
    1. IAM role: you should attach policies of AmazonSageMakerFullAccess, IAMFullAccess, AmazonS3FullAccess, AmazonSNSFullAccess and NeptuneFullAccess to the IAM role.
    2. Network: in order to access the Neptune database created in the pipeline, a VPC is required to run the notebook.

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.

Owner
AWS Samples
AWS Samples
Unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners

Unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners This repository is built upon BEiT, thanks very much! Now, we on

Zhiliang Peng 2.3k Jan 04, 2023
Chainer implementation of recent GAN variants

Chainer-GAN-lib This repository collects chainer implementation of state-of-the-art GAN algorithms. These codes are evaluated with the inception score

399 Oct 23, 2022
GANfolk: Using AI to create portraits of fictional people to sell as NFTs

GANfolk are AI-generated renderings of fictional people. Each image in the collection was created by a pair of Generative Adversarial Networks (GANs) with names and backstories also created with AI.

Robert A. Gonsalves 32 Dec 02, 2022
Management Dashboard for Torchserve

Torchserve Dashboard Torchserve Dashboard using Streamlit Related blog post Usage Additional Requirement: torchserve (recommended:v0.5.2) Simply run:

Ceyda Cinarel 103 Dec 10, 2022
YOLTv5 rapidly detects objects in arbitrarily large aerial or satellite images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks

YOLTv5 rapidly detects objects in arbitrarily large aerial or satellite images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks.

Adam Van Etten 145 Jan 01, 2023
BasicNeuralNetwork - This project looks over the basic structure of a neural network and how machine learning training algorithms work

BasicNeuralNetwork - This project looks over the basic structure of a neural network and how machine learning training algorithms work. For this project, I used the sigmoid function as an activation

Manas Bommakanti 1 Jan 22, 2022
repro_eval is a collection of measures to evaluate the reproducibility/replicability of system-oriented IR experiments

repro_eval repro_eval is a collection of measures to evaluate the reproducibility/replicability of system-oriented IR experiments. The measures were d

IR Group at Technische Hochschule Köln 9 May 25, 2022
This repository collects 100 papers related to negative sampling methods.

Negative-Sampling-Paper This repository collects 100 papers related to negative sampling methods, covering multiple research fields such as Recommenda

RUCAIBox 119 Dec 29, 2022
Unofficial PyTorch implementation of Fastformer based on paper "Fastformer: Additive Attention Can Be All You Need"."

Fastformer-PyTorch Unofficial PyTorch implementation of Fastformer based on paper Fastformer: Additive Attention Can Be All You Need. Usage : import t

Hong-Jia Chen 126 Dec 06, 2022
Relative Human dataset, CVPR 2022

Relative Human (RH) contains multi-person in-the-wild RGB images with rich human annotations, including: Depth layers (DLs): relative depth relationsh

Yu Sun 112 Dec 02, 2022
Official implementation of our paper "LLA: Loss-aware Label Assignment for Dense Pedestrian Detection" in Pytorch.

LLA: Loss-aware Label Assignment for Dense Pedestrian Detection This project provides an implementation for "LLA: Loss-aware Label Assignment for Dens

35 Dec 06, 2022
A generalist algorithm for cell and nucleus segmentation.

Cellpose | A generalist algorithm for cell and nucleus segmentation. Cellpose was written by Carsen Stringer and Marius Pachitariu. To learn about Cel

MouseLand 733 Dec 29, 2022
Point cloud processing tool library.

Point Cloud ToolBox This point cloud processing tool library can be used to process point clouds, 3d meshes, and voxels. Environment python 3.7.5 Dep

ZhangXinyun 40 Dec 09, 2022
ObjectDetNet is an easy, flexible, open-source object detection framework

Getting started with the ObjectDetNet ObjectDetNet is an easy, flexible, open-source object detection framework which allows you to easily train, resu

5 Aug 25, 2020
Embracing Single Stride 3D Object Detector with Sparse Transformer

SST: Single-stride Sparse Transformer This is the official implementation of paper: Embracing Single Stride 3D Object Detector with Sparse Transformer

TuSimple 385 Dec 28, 2022
GPT-Code-Clippy (GPT-CC) is an open source version of GitHub Copilot

GPT-Code-Clippy (GPT-CC) is an open source version of GitHub Copilot, a language model -- based on GPT-3, called GPT-Codex -- that is fine-tuned on publicly available code from GitHub.

2.3k Jan 09, 2023
Husein pet projects in here!

project-suka-suka Husein pet projects in here! List of projects mysejahtera-density. Generate resolution points using meshgrid and request each points

HUSEIN ZOLKEPLI 47 Dec 09, 2022
Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection

Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection Main requirements torch = 1.0 torchvision = 0.2.0 Python 3 Environm

15 Apr 04, 2022
Res2Net for Instance segmentation and Object detection using MaskRCNN

Res2Net for Instance segmentation and Object detection using MaskRCNN Since the MaskRCNN-benchmark of facebook is deprecated, we suggest to use our mm

Res2Net Applications 55 Oct 30, 2022
SeqAttack: a framework for adversarial attacks on token classification models

A framework for adversarial attacks against token classification models

Walter 23 Nov 25, 2022