A github actions + python code to extract URLs to code repositories to put into standard form, starting with github

Overview

repo_link_extractor

A github actions + python code to extract URLs to code repositories to put into standard form, starting with github

---- NOTE: JUST STARTED ONLY AN IDEA TO COME BACK TO ----

Summary

first minimum viable product goal

The first minimum viable product goal will be to harvest from https://github.com/softwareunderground/awesome-open-geoscience/blob/main/README.md all the github repositories URLs such that the form returned is https://github.com/ + "username" + "repository name" and then add them to the "repos" key in an existing JSON in a form like this: https://github.com/softwareunderground/open_geosciene_code_projects_viz/blob/main/_explore/input_lists.json , which is summarized below:

{
    "memberOrgs": [
        "softwareunderground"
    ],
    "orgs": [
        "agile-geoscience",
        "softwareunderground"
    ],
    "repos": [
        "ahotovec/redpy",
        "whamlyn/auralib"
    ]
}

2nd intermediate product goal

  • Fires from a GitHubAction

3rd intermediate product goal

Eventual product goal

  • works for public, internal, and private GitHub URLs
  • Works for GitHub, GitLab, BitBucket, and other code repository URLS & APIs
  • Keeps track of harvest date, source file name, source file URL & code platform & domain in an intermediate file.

Related Projects

This is referenced on an issue here: https://github.com/softwareunderground/open_geosciene_code_projects_viz/issues/23

Potential Useful Bits

regular expression (https:\/\/github.com\/)\w+(\/)\w+ seems like a good starting point for the extraction of Github URLs.

GitHub Actions Structure Tentative:

  • download README file
  • replace old README file with new
  • extract all links matching a regular expression
  • sort & take out duplicates
  • make into JSON with domain, URL, org or username, repository name, source file name, source file link, and date of harvests
  • pull out org or username & repository name from above and put into appropriate key of the file JSON if not already there in either org or repo keys.

How to Integrate into https://github.com/softwareunderground/open_geosciene_code_projects_viz ??????

Options:

  1. Put all of the code here into the repository: https://github.com/softwareunderground/open_geosciene_code_projects_viz
  2. Call the code here from https://github.com/softwareunderground/open_geosciene_code_projects_viz
If calling the code....
  • (1) add the script to read the README to MASTER.sh as the first step
  • (2) set master.sh to be callled by GitHub actions
  • (3) when triggered the github actions does the entirity of the github actions in this repo, including calling the python scripts as its first step.
  • (4) latter steps include setting up the environnment and calling all the python scripts that the master.sh bash script calls. The code would need to be called by either a GitHub Action on (pull request, push, manual, or cron job) or by trigger after the call to refresh the
Owner
Justin Gosses
Machine-Learning | Data Visualization | Geoscience | NASA |
Justin Gosses
A Discord webhook spammer made in Python

A Python made Discord webhook spammer usually used for token loggers to spam them/delete them original by cattyn changes listed below.

2 Jan 12, 2022
Discord Mass Report script that uses multiple tokens

Discord-Mass-Report Discord Mass Report script that uses multiple tokens, full credits to https://github.com/hoki0/Discord-mass-report who made it in

cChimney 4 Jun 08, 2022
SickNerd aims to slowly enumerate Google Dorks via the googlesearch API then requests found pages for metadata

CLI tool for making Google Dorking a passive recon experience. With the ability to fetch and filter dorks from GHDB.

Jake Wnuk 21 Jan 02, 2023
E-Commerce Telegram Bot for UCA Students

ucaStudentStore To buy from and sell to other students Features Register the first time, after that you will always be recognised You can login either

Shukur Sabzaliev 5 Jun 26, 2022
My telegram bot to download Instagram Profiles

Instagram Profile Get for Telegram My telegram bot to download Instagram Profiles First you have to get a telegrm bot api key from @BotFather Then you

Ali Yoonesi 2 Sep 22, 2022
Azure free vpn for students only! (Self hosted/No sketchy services/Fast and free)

Azpn-Azure-Free-VPN Azure free vpn for students only! (Self hosted/No sketchy services/Fast and free) This is an alternative secure way of accessing f

Harishankar Kumar 6 Mar 19, 2022
3X Fast Telethon Based Bot

📺 YouTube Song Downloader Bot For Telegram 🔮 3X Fast Telethon Based Bot ⚜ Easy To Deploy 🤗

@Dk_king_offcial 1 Dec 09, 2021
⛑ REDCap API interface in Python

REDCap API in Python Description Supports structured data extraction for REDCap projects. The API module d3b_redcap_api.redcap.REDCapStudy can be logi

D3b 1 Nov 21, 2022
Pixiv 爬虫,使用 Python 实现。支持批量下载、上传到图床。

用 Python 实现的 Pixiv 爬虫,支持批量下载和上传。 随机图片 API: https://loliapi.ml/ Deploy Github Action 集成部署 建议使用本方法部署,相较于本地部署,无需搭建环境,全程在线上完成。并且使用国外服务器下载、上传,网络更加通畅。 Fork

18 Feb 26, 2022
Analyzed the data of VISA applicants to build a predictive model to facilitate the process of VISA approvals.

Analyzed the data of Visa applicants, built a predictive model to facilitate the process of visa approvals, and based on important factors that significantly influence the Visa status recommended a s

Jesus 1 Jan 08, 2022
A Discord bot written in Python that can be used to control event management on a server.

Event Management Discord Bot A Discord bot written in Python that can be used to control event management on a Discord server. Made originally for GDS

Suvaditya Mukherjee 2 Dec 07, 2021
Cookiecutter templates for Serverless applications using AWS SAM and the Rust programming language.

Cookiecutter SAM template for Lambda functions in Rust This is a Cookiecutter template to create a serverless application based on the Serverless Appl

AWS Samples 24 Nov 11, 2022
✨ A Telegram mirror/leech bot By SparkXcloud Group ✨

SparkXcloud-Gdrive-MirrorBot SparkXcloud-Gdrive-MirrorBot is a multipurpose Telegram Bot writen in Python for mirroring files on the Internet to our b

119 Oct 23, 2022
A twitter bot that simply replies with a beautiful screenshot of the tweet, powered by beautify.dhravya.dev

Poet this! Replies with a beautiful screenshot of the tweet, powered by poet.so Installation git clone https://github.com/dhravya/poet-this.git cd po

Dhravya Shah 30 Dec 04, 2022
a Disqus alternative

Isso – a commenting server similar to Disqus Isso – Ich schrei sonst – is a lightweight commenting server written in Python and JavaScript. It aims to

Martin Zimmermann 4.7k Jan 02, 2023
An Advanced Python Playing Card Module that makes creating playing card games simple and easy!

playingcards.py An Advanced Python Playing Card Module that makes creating playing card games simple and easy! Features Easy to Understand Class Objec

Blake Potvin 5 Aug 30, 2022
Slack->DynamDB->Some applications

slack-event-subscriptions About The Project Do you want to get simple attendance checks? If you are using Slack, participants can just react on a spec

UpstageAI 26 May 28, 2022
Discord Token Nuker With Python

Discord token nuker a.k.a A$$Fvcker Setup For installing the requirements do this: pip install -r requirements.txt To start the Token nuker run this

PR3C14D0 8 Sep 22, 2022
Simple web-based hcaptcha bypass

Hcaptcha-Bypass !!! If you found this useful, please click the STAR button !!! Simple web-based hcaptcha bypass Just a demonstration right now, and yo

Kieronia 4 Oct 06, 2021
Pincer-ext-commands - A simple, lightweight package for pincer prefixed commands

pincer.ext.commands A reimagining of pincer's command system and bot system. Ins

Vincent 2 Jan 11, 2022