Software to help automate collecting crowdsourced annotations using Mechanical Turk.

Overview

Video Crowdsourcing

Software to help automate collecting crowdsourced annotations using Mechanical Turk.

The goal of this project is to enable crowdsourced collection of annotations on video data. This was built to collect skill annotations on medium length snippets of video (1-2 minutes), but was built with flexibility in mind so researchers can adapt the code to fit their needs.


How it Works

Videos from a YouTube playlist are used to programatically build surveys, including a "qualification" survey to verify responses. These surveys are sent to Mechanical Turk to create HITs for crowd workers. Once on Mechanical Turk, this software includes tools to manage payments to workers who do and do not pass the qualification questions. Finally, all responses from the workers can be collected in one place.


Instructions

1) Install requirements

You will need:

  • Access to a command line (terminal)
  • Download of this respository
    • git clone https://github.com/mpeven/Video_Crowdsourcing.git
  • Python
    • Note: this can be done easily using Conda to install Python and required libraries
  • Installation of required libraries
    • If using conda: conda install -c conda-forge --file requirements.txt
    • If using pip: pip install -r requirements.txt

2) Run Command Line Interface (CLI)

The CLI can be run with python main.py and should guide you through the rest of steps outlined below. Refer to this README if more details are needed.

3) Upload videos

  1. Upload videos to YouTube
    • Go to https://studio.youtube.com/ and click 'Create' to upload videos
    • Make sure videos are published and do not have 'Draft' status
    • IMPORTANT: Make sure videos are listed as Unlisted or Public (Private YouTube videos can't be seen in the survey)
  2. Create YouTube playlists for qualification videos and survey (un-annotated) videos
    • Once the videos are uploaded, create these two playlists and move them into the correct playlist
  3. Put title of the YouTube playlists in the SURVEY section of the config file

4) Create surveys

  1. Get access to YouTube Data API
    • Instructions here: link
    • IMPORTANT: Make sure you set "Application type" as Desktop app when you are on the page "Create OAuth client ID"
    • Download the JSON file of the OAuth client secrets and remember the path for the next step
  2. Fill out the needed sections of the config file
    • YOUTUBE section: oauth client secrets json file location
    • SURVEY section: number of videos per survey
  3. Create surveys using the option in the CLI
  4. Verify the survey is correct by opening the sample survey in a web browser

5) MTURK steps

  1. Create an AWS account
    • Instructions here: link
    • Put the access keys in the config file
  2. Create a Mechanical Turk Account
  3. Create a Mechanical Turk "Sandbox" Account for testing
  4. Upload sandbox-mode HITs using CLI
  5. Upload live HITs using CLI
  6. Periodically check on status and manage payments

Authors

  • Michael Peven (main contact - mpeven@jhu.edu)
  • Tingwen Guo

This work builds upon previous work done by Anand Malpani and Colin Lea


Acknowledgements

We would like to thank the following for support and funding:

  • Swaroop Vedula
  • Gregory Hager
  • Science of Learning Institute
Owner
Mike Peven
Mike Peven
Networkx with neo4j back-end

Dump networkx graph into nodes/relations TSV from neo4jnx.tsv import graph_to_tsv g = pklload('indranet_dir_graph.pkl') graph_to_tsv(g, 'docker/nodes.

Benjamin M. Gyori 1 Oct 27, 2021
Simple script to export contacts from telegram into vCard file

Telegram Contacts Exporter Simple script to export contacts from telegram into vCard file Getting Started Prerequisites You must to put your Telegram

Pere Antoni 1 Oct 17, 2021
ULID implementation for Python

What is this? This is a port of the original JavaScript ULID implementation to Python. A ULID is a universally unique lexicographically sortable ident

Martin Domke 158 Jan 04, 2023
Generates a random prnt.sc link and display image.

Generates a random prnt.sc link and display image.

Emirhan 3 Oct 08, 2021
EVE-NG tools, A Utility to make operations with EVE-NG more friendly.

EVE-NG tools, A Utility to make operations with EVE-NG more friendly. Also it support different snapshot operations with same style as Libvirt/KVM

Bassem Aly 8 Jan 05, 2023
It is a tool that looks for a specific username in social networks

It is a tool that looks for a specific username in social networks

MasterBurnt 6 Oct 07, 2022
A collection of tools for biomedical research assay analysis in Python.

waltlabtools A collection of tools for biomedical research assay analysis in Python. Key Features Analysis for assays such as digital ELISA, including

Tyler Dougan 1 Apr 18, 2022
API Rate Limit Decorator

ratelimit APIs are a very common way to interact with web services. As the need to consume data grows, so does the number of API calls necessary to re

Tomas Basham 575 Jan 05, 2023
Import the module and create an object of the class LocalVariable.

LocalVariable Import the module and create an object of the class LocalVariable. Call the save method with the name and the value of a variable as arg

Sajedur Rahman Fiad 2 Dec 14, 2022
Extract XML from the OS X dictionaries.

Extract XML from the OS X dictionaries.

Joshua Olson 13 Dec 11, 2022
Check subdomains for Open S3 buckets

SuBuket v1.0 Check subdomains for Open S3 buckets Coded by kaiz3n Basically, this tool makes use of another tool (sublist3r) to fetch subdomains, and

kaiz3n 4 Dec 29, 2021
About Library for extract infomation from thai personal identity card.

ThaiPersonalCardExtract Library for extract infomation from thai personal identity card. imprement from easyocr and tesseract New Feature v1.3.2 🎁 In

ggafiled 26 Nov 15, 2022
Course-parsing - Parsing Course Info for NIT Kurukshetra

Parsing Course Info for NIT Kurukshetra Overview This repository houses code for

Saksham Mittal 3 Feb 03, 2022
Rabbito is a mini tool to find serialized objects in input values

Rabbito-ObjectFinder Rabbito is a mini tool to find serialized objects in input values What does Rabbito do Rabbito has the main object finding Serial

7 Dec 13, 2021
SH-PUBLIC is a python based cloning script. You can clone unlimited UID facebook accounts by using this tool.

SH-PUBLIC is a python based cloning script. You can clone unlimited UID facebook accounts by using this tool. This tool works on any Android devices without root.

(Md. Tanvir Ahmed) 5 Mar 09, 2022
The Black shade analyser and comparison tool.

diff-shades The Black shade analyser and comparison tool. AKA Richard's personal take at a better black-primer (by stealing ideas from mypy-primer) :p

Richard Si 10 Apr 29, 2022
A simple example for calling C++ functions in Python by `ctypes`.

ctypes-example A simple example for calling C++ functions in Python by ctypes. Features call C++ function int bar(int* value, char* msg) with argumene

Yusu Pan 3 Nov 23, 2022
Playing with python imports and inducing those pesky errors.

super-duper-python-imports In this repository we are playing with python imports and inducing those pesky ImportErrors. File Organization project │

James Kelsey 2 Oct 14, 2021
Report Bobcat Status to Google Sheets

bobcat-status-reporter Report Bobcat Status to Google Sheets Why? I recently relocated my miner from my root into the attic. Bobcat recommends operati

Jasmit Tarang 3 Sep 22, 2021
A script to parse and display buy_tag and sell_reason for freqtrade backtesting trades

freqtrade-buyreasons A script to parse and display buy_tag and sell_reason for freqtrade backtesting trades Usage Copy the buy_reasons.py script into

Robert Davey 31 Jan 01, 2023