Convert Lecture Videos to PDF

Overview

Convert Lecture Videos to PDF

Description

Want to go through lecture videos faster without missing any information? Wish you can read the lecture video instead of watching it? Now you can! With this python application, you can convert lecture videos to a PDF file! The PDF file will contain a screenshot of lecture slides presented in the video, along with a transcription of your instructor explaining those lecture slide. It can also handle instructors making annotations on their lecture slides and mild amounts of PowerPoint animations.

Table of Contents

  • Walkthrough
  • Getting Started
  • Tweeking the Application
  • Next steps
  • Usage
  • Credits
  • License

Walkthrough of this project

Users will need to download a video file of their lecture. For instance, the video file might look like this:

Users will also need a copy of the video's subtitles.

After running the command line tool, they will get a PDF that looks like this:

where each page contains an image of the lecture video, and a transcription of the instructor explaining about that slide.

Getting Started

  1. Ensure Python3 and Pip is installed on your machine

  2. Next, install package dependencies by running:

    pip3 install -r requirements.txt

  3. Now, run:

    python3 src/main.py tests/videos/input_1.mp4 -s tests/subtitles/subtitles_1.vtt -o output.pdf

    to generate a PDF of this lecture video with these subtitles

  4. The generated PDF will be saved as output.pdf

Tweeking the Application

This application uses computer vision with OpenCV to detect when the instructor has moved on to the next PowerPoint slide, detect animations, etc.

You can adjust the sensitivity to video frame changes in the src/video_segment_finder.py file. You can also visualize how well the application detect transitions and animations via the src/plot.py tool.

Next Steps

  • Automatically generate subtitles
  • Wrap project into a web app?

Usage

Please note that this project is used for educational purposes and is not intended to be used commercially. We are not liable for any damages/changes done by this project.

Credits

Emilio Kartono, who made the entire project.

License

This project is protected under the GNU licence. Please refer to the LICENSE.txt for more information.

Owner
Emilio Kartono
Computer Science Graduate from the University of Toronto
Emilio Kartono
Excalibur: A web interface to extract tabular data from PDFs

Excalibur: A web interface to extract tabular data from PDFs Excalibur is a web interface to extract tabular data from PDFs, written in Python 3! It i

1.2k Jan 04, 2023
DietPDF aims at reducing PDF file size while not degrading quality nor losing metadata

DietPDF aims at reducing PDF file size while not degrading quality nor losing metadata

Frédéric BISSON 6 Jul 27, 2022
Camelot is a Python library that makes it easy for anyone to extract tables from PDF files

Camelot: PDF Table Extraction for Humans Camelot is a Python library that makes it easy for anyone to extract tables from PDF files! Note: You can als

Atlan Technologies Pvt Ltd 3.3k Jan 06, 2023
Python PDF Parser (Not actively maintained). Check out pdfminer.six.

PDFMiner PDFMiner is a text extraction tool for PDF documents. Warning: As of 2020, PDFMiner is not actively maintained. The code still works, but thi

Yusuke Shinyama 4.9k Jan 04, 2023
A Python tool to generate a static HTML file that represents the internal structure of a PDF file

PDFSyntax A Python tool to generate a static HTML file that represents the internal structure of a PDF file At some point the low-level functions deve

Martin D. 394 Dec 30, 2022
Python lib for Simple PDF text extraction

Python lib for Simple PDF text extraction

Jason Alan Palmer 651 Jan 01, 2023
JoplinPdf2Images - Converts a PDF to images in Joplin and adds it to the specified note as a printout

joplinPdf2Images Converts a PDF to images in Joplin and adds it to the specified

Morten Haahr Kristensen 2 Apr 20, 2022
Scans pdfs for links written in plaintext and checks if they are active or returns an error code.

Scans pdfs for links written in plaintext and checks if they are active or returns an error code. It then generates a report of its findings. Extract references (pdf, url, doi, arxiv) and metadata fr

Marshal Miller 22 Nov 21, 2022
minipdf is a package for creating simple, single-page PDF documents.

minipdf minipdf is a package for creating simple, single-page PDF documents. Installation You can install the development version from GitHub with: #

mikefc 41 Dec 19, 2022
Converting Html files to pdf using python script, pdfkit module and wkhtmltopdf.

Html-to-pdf-pdfkit-wkhtml- This repository has code for converting local html files and online html resources into pdf. It is an python script which u

Hemachandran P 1 Nov 09, 2021
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted. ocrmypdf

8k Jan 08, 2023
Convert Lecture Videos to PDF

Convert Lecture Videos to PDF Description Want to go through lecture videos faster without missing any information? Wish you can read the lecture vide

Emilio Kartono 20 Nov 25, 2022
CLI tool to generate pdf invoices written in python

invoicepy CLI invoice tool, store and print invoices as pdf. save companies and customers for later use. installation pip install invoicepy config co

Adam Wojtczak 9 Aug 01, 2022
A python library for extracting text from PDFs without losing the formatting of the PDF content.

Multilingual PDF to Text Install Package from Pypi Install it using pip. pip install multilingual-pdf2text The library uses Tesseract which can be ins

Shahrukh Khan 49 Nov 07, 2022
Convert MD files to PDF automatically (with CSS) 📄🚀

MD2PDF Action Convert MD files to PDF automatically (with CSS)! Converts a pattern described set of markdown files and converts them to pdf whilst app

Will Fantom 1 Feb 09, 2022
Simple python tool created for downloading PDF.

PDFdownloader Usage Open PDF in full-screen mode Run scan.exe Enter how many pages you want to scan Focus PDF After scanning is done, run merge.exe En

5 Oct 27, 2021
Python script that split PDF files.

Automatic PDF Splitter This script can create new single-page PDFs files from multipaged PDFs. Requirements Python 3.0+ # Debian distros sudo apt-get

Leandro Padula 5 Apr 02, 2022
Convert PDF to AudioBook and Audio Speech to PDF

In this Python project, we will build a GUI-based PDF to Audio and Audio to PDF converter using the Tkinter, OS, path, pyttsx3, SpeechRecognition, PyPDF4, and Pydub libraries and the messagebox modul

RISHABH MISHRA 1 Feb 13, 2022
A backend for mdbook in Python for generating PDF based on Chrome DevTools Protocol.

mdbook-pdf A backend for mdbook written in Python for generating PDF based on Chrome DevTools Protocol. Python library dependency Usage Put mdbook-pdf

Hollow Man 49 Dec 27, 2022
pystitcher stitches your PDF files together, generating nice customizable bookmarks for you using a declarative markdown file as input

pystitcher pystitcher stitches your PDF files together, generating nice customizable bookmarks for you using a declarative input in the form of a mark

Nemo 387 Dec 10, 2022