Generate a repository with mirror links for DriveDroid app

Last update: Nov 19, 2022

Overview

DriveDroid Repository Generator

Generates a repository for DriveDroid, an app that lets you boot a PC from ISO files stored on your Android phone.

See also the official scraper written in JavaScript.

Try Already Built Repo

Add one of the following links to the image repositories in the DriveDroid app:

https://dd.hexed.pw

https://raw.githubusercontent.com/flameshikari/ddrg/master/repo/repo.json

Requirements
Usage
How to Make a Scraper
Misc
Roadmap
Credits
License

Requirements

Python 3.6+ with the packages listed in requirements.txt.

I recommend creating a venv and installing the packages there.

Usage

python ./src/main.py [-i dir] [-o dir] [-g]

-i dir where dir is a directory with distro scrapers (./src/distros is default).

-o dir where dir is a directory where the built repo will be saved (./build is default).

-g will generate a webpage to present the content of repo.json.

-h prints the help message.
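The options above can be sketched with argparse. This is a minimal illustration, not the code in ./src/main.py: the function name build_parser and the attribute names input_dir, output_dir, and generate_page are assumptions; only the flags and their defaults come from the text above.

```python
import argparse


def build_parser():
    """Build a parser for the flags described above (names are illustrative)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("-i", dest="input_dir", metavar="dir",
                        default="./src/distros",
                        help="directory with distro scrapers")
    parser.add_argument("-o", dest="output_dir", metavar="dir",
                        default="./build",
                        help="directory where the built repo is saved")
    parser.add_argument("-g", dest="generate_page", action="store_true",
                        help="generate a webpage presenting repo.json")
    return parser


# Parse an example command line instead of sys.argv.
args = build_parser().parse_args(["-o", "/tmp/repo", "-g"])
print(args.input_dir, args.output_dir, args.generate_page)
```

argparse adds -h automatically, which matches the behavior described above.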

How to Make a Scraper

Create a folder in ./src/distros with the following structure:

distro_name
├── info.toml
├── logo.png
└── scraper.py

If distro_name starts with an underscore (e.g. _disabled), the folder is ignored.
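The underscore rule could be implemented roughly as follows. This is a sketch under the assumption that every non-underscore subdirectory counts as a distro; the function name find_distro_dirs is hypothetical and not taken from the project.

```python
from pathlib import Path


def find_distro_dirs(distros_root):
    """Return distro folders under distros_root, skipping '_'-prefixed ones."""
    root = Path(distros_root)
    return sorted(
        entry for entry in root.iterdir()
        # Plain files are not scrapers; folders such as _disabled are ignored.
        if entry.is_dir() and not entry.name.startswith("_")
    )
```

Calling find_distro_dirs("./src/distros") would then list every enabled scraper folder in alphabetical order.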

Let's take a look at each file.

`info.toml`

info.toml contains a distro name and a link to the official website. Arch Linux info.toml example:

name = "Arch Linux" # name of distro
url  = "https://example.com" # official site

If info.toml is missing or a value isn't provided, fallback values are used. For Arch Linux, the fallback values are:

name = "arch" # distro folder name as value, also used in url
url  = "https://distrowatch.com/table.php?distribution=arch"
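The fallback rule can be illustrated as below. This is a sketch, not the project's code: resolve_info is a hypothetical helper that takes the folder name plus whatever was parsed from info.toml (or nothing), and it assumes an empty value is treated the same as a missing one.

```python
def resolve_info(folder_name, info=None):
    """Fill in missing name/url values using the documented fallbacks."""
    info = info or {}
    return {
        # Fall back to the distro folder name.
        "name": info.get("name") or folder_name,
        # Fall back to the DistroWatch page for the folder name.
        "url": info.get("url")
        or f"https://distrowatch.com/table.php?distribution={folder_name}",
    }


print(resolve_info("arch"))
print(resolve_info("arch", {"name": "Arch Linux"}))
```

In the second call only url falls back, because name was provided.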

`logo.png`

Should be 128x128 px with a transparent background.

If logo.png is missing, a fallback logo is used.

`scraper.py`

A scraper can be written as you like, as long as it returns the desired values.

It must return a list of tuples; each tuple contains iso_url, iso_arch, iso_size, and iso_version, in that order.

The Arch Linux scraper returns the following values:

[
  (
    'https://mirror.yandex.ru/archlinux/iso/2021.05.01/archlinux-2021.05.01-x86_64.iso',
    'x86_64',
    792014848,
    '2021.05.01'
  ),
  (
    'https://mirror.yandex.ru/archlinux/iso/2021.06.01/archlinux-2021.06.01-x86_64.iso',
    'x86_64',
    811937792,
    '2021.06.01'
  ),
  (
    'https://mirror.yandex.ru/archlinux/iso/2021.07.01/archlinux-2021.07.01-x86_64.iso',
    'x86_64',
    817180672,
    '2021.07.01'
  ),
  (
    'https://mirror.yandex.ru/archlinux/iso/archboot/2020.07/archlinux-2020.07-1-archboot-network.iso',
    'x86_64',
    516947968,
    '2020.07'
  ),
  (
    'https://mirror.yandex.ru/archlinux/iso/archboot/2020.07/archlinux-2020.07-1-archboot.iso',
    'x86_64',
    1280491520,
    '2020.07'
  )
]
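When writing a new scraper, it can help to check the result shape before building the repo. The checker below is a hypothetical helper, not part of the project; it assumes, based on the example above, that iso_size is a positive integer and the other three fields are non-empty strings.

```python
def validate_entries(entries):
    """Raise ValueError if an entry does not match the documented tuple shape."""
    for index, entry in enumerate(entries):
        if not isinstance(entry, tuple) or len(entry) != 4:
            raise ValueError(f"entry {index}: expected a 4-tuple")
        iso_url, iso_arch, iso_size, iso_version = entry
        if not (isinstance(iso_url, str)
                and iso_url.startswith(("http://", "https://"))):
            raise ValueError(f"entry {index}: iso_url must be an http(s) URL")
        if not isinstance(iso_arch, str) or not iso_arch:
            raise ValueError(f"entry {index}: iso_arch must be a non-empty string")
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(iso_size, bool) or not isinstance(iso_size, int) or iso_size <= 0:
            raise ValueError(f"entry {index}: iso_size must be a positive integer")
        if not isinstance(iso_version, str) or not iso_version:
            raise ValueError(f"entry {index}: iso_version must be a non-empty string")
    return entries
```

A scraper under development could end with return validate_entries(array) to fail early on a malformed tuple.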

A scraper starts with from public import *, which brings the following into its namespace:

bs (short for BeautifulSoup)
json
re
requests

It also provides these functions:

get_afh_url(iso_url): returns a download link for a file hosted on AndroidFileHost; iso_url must look like https://androidfilehost.com/?fid=8889791610682936459
get_iso_arch(iso_url): returns the processor architecture of the ISO at iso_url
get_iso_size(iso_url): returns the size in bytes of the file at iso_url
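How get_iso_arch works internally is not described here. As a rough illustration only, an architecture could be guessed from the file name like this; guess_iso_arch, its pattern table, and the x86_64 default are all assumptions made for this sketch, not the project's implementation.

```python
import re

# Hypothetical pattern table: first match wins.
_ARCH_PATTERNS = [
    ("x86_64", re.compile(r"x86[_-]64|amd64", re.IGNORECASE)),
    ("aarch64", re.compile(r"aarch64|arm64", re.IGNORECASE)),
    ("i386", re.compile(r"i[3-6]86", re.IGNORECASE)),
]


def guess_iso_arch(iso_url, default="x86_64"):
    """Guess the architecture from the last path segment of iso_url."""
    filename = iso_url.rsplit("/", 1)[-1]
    for arch, pattern in _ARCH_PATTERNS:
        if pattern.search(filename):
            return arch
    # Assumption: file names without an architecture marker use the default.
    return default
```

The default mirrors the Arch Linux example above, where the archboot ISO has no architecture in its name but is reported as x86_64.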

Arch Linux scraper.py example:

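from public import *  # noqa


def init():
    """Return a list of (iso_url, iso_arch, iso_size, iso_version) tuples."""

    array = []
    base_urls = [
        "https://mirror.yandex.ru/archlinux/iso/latest",
        "https://mirror.yandex.ru/archlinux/iso/archboot/latest"
    ]

    for base_url in base_urls:

        # Parse the directory listing served by the mirror.
        html = bs(requests.get(base_url).text, "html.parser")

        # Keep only links whose href ends with .iso (raw string avoids an invalid escape).
        for filename in html.find_all("a", {"href": re.compile(r"^.*\.iso$")}):

            iso_url = f"{base_url}/{filename['href']}"
            iso_arch = get_iso_arch(iso_url)
            iso_size = get_iso_size(iso_url)
            # The version is the first dotted number after a hyphen, e.g. 2021.07.01.
            iso_version = re.search(r"-(\d+\.\d+(\.\d+)?)", iso_url).group(1)

            array.append((iso_url, iso_arch, iso_size, iso_version))

    return array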

Misc

Here's an nginx snippet for when you self-host the repository together with the website and want DriveDroid to receive repo.json when it requests just the hostname. Place it in the server section of your config:

location = / {
  if ($http_user_agent ~* 'okhttp') {
    rewrite ^/(.*)$ /repo.json break;
  }
}

Roadmap

Option to generate a webpage
Add a mechanism to retry scraping if a network error occurs
Option to select mirrors (mainly uses mirrors based in Russia)
Package this project perhaps
Probably make the code better

Credits

afh-dl by kade-robertson
Yandex.Disk direct links by DokPub

License

MIT License

Owner

Evgeny
