Python script to download (TCR) genes from IMGT/GENE-DB

Related tags

DownloaderIMGTgeneDL
Overview

IMGTgeneDL

0.1.0

Jamie Heather | CCR @ MGH | 2021

This script provides an alternative way to access TCR and IG genes stored in IMGT/GENE-DB. It's primarily designed for downloading human/mouse TCRs, but it's readily adaptable to other species/loci.

Usage

This script is tested on python >= 3.6, not requiring any non-standard packages.

Download specific gene types

The primary way this script is intended to be used is to tell it the species, loci, and sequence types that you want to download. This will be downloaded and saved to a file named with the date the IMGT release used, and details of the combination of parameters searched for (unless overriden with the -o / --out_path flag).

Species

The script must be run on single species at a time, given via the -s / --species flag as a full genus species with a '+' symbol in place of the space. E.g.:

  • -s Homo+sapiens
  • -s Mus+musculus

Note that it doesn't seem that the IMGT URL interface will accept either genus or species alone, and it can be particular about formatting so maintaining proper case is advised. However it seems to download sub-species (e.g.\ searching for Mus+musculus will return all covered strains).

Loci

This script is currently configured to download the four common TCR loci:

  • A / TRA / alpha
  • B / TRB / beta
  • G / TRG / gamma
  • D / TRD / delta

These must be provided to the script using the -L / --loci flag, giving it the desired loci as a string of characters, e.g. -L AB to just download alpha and beta sequences, or -L G to just download gamma. Alternatively -L TR will simply download all four chains' sequences (equivalent to -L ABGD).

Sequence types

This script is designed to help in the aid in the analysis of typical expressed repertoires, and thus is configured by default to download the relevant parts of the loci that end up involved in expressed transcripts. The download of each is achieved using specific flags:

  • -l / --get_l: download leader sequences
  • -v / --get_v: download V sequences
  • -d / --get_d: download D sequences
  • -j / --get_j: download J sequences
  • -c / --get_c: download constant region sequences

Note that these can be combined, e.g. -vdj will just download the V, D, and J gene sequences. Alternatively users can apply the -r / --get_all_regions flag to just download all of these regions (equivalent to -lvdjc).

Examples

The following is the basic command to download all relevant human sequences for all chains:

python3 IMGTgeneDL.py -s Homo+sapiens -L TR -r

While this is a command to just download delta chain J genes from mice:

python3 -i IMGTgeneDL.py -s Mus+musculus -j -L D

Download whole database

If no locus and gene type flags are used, or if the -a / --get_all flag is used, then the script will just download the whole of GENE-DB - all species, all genes, all loci. By default this downloads the ungapped nucleotide file, with all pseudogenes, and saves this to a file named with the date and the IMGT release used. This can be changed using the following flags:

  • -gap / --gapped: downloads the gapped FASTA instead of the ungapped
  • -ifp / --in_frame_p: downloads the 'inframeP' FASTA instead of the 'allP'
  • -o / --out_path: as above, sets the path to a specific file if you don't wish to use the automatic names or save in the same directory

Notes

Downloaded regions

The architecture of the TCR loci differs a little between the genes and across species, which the IMGT nomenclature has specific terms to cope with. However the URL based searching this script does requires provision exact exon names for each species. This script assumes generic defaults, but these can be overriden by providing specific details in the tab-delimited region-overrides.tsv file. This allows users to override which fields are downloaded, or download additional fields, by adding an entry for the relevant gene/species combination and filling in the final 'Field(s)' comma-delimited field with the IMGT labels to be downloaded.

The most relevant place this comes in to place is in the constant regions, which have differing numbers and names of exons. The relevant differences for these loci/species are that exon 4 of the alpha and delta chains is an UTR, while gamma chains lack a fourth exon and have duplicated exon 2 variants. If users wish to run the script to download specific sequences including constant regions for species other than humans or mice they will need to edit this document appropriately first.

The other default IMGT labels downloaded are:

  • L-PART1+L-PART2 for leader sequences
  • V-/D-/J-REGION for V/D/J genes

IMGT FASTA headers

The IMGT header FASTA fields (as reported in the output of GENE-DB) are:

The FASTA header contains 15 fields separated by '|':

1. IMGT/LIGM-DB accession number(s)
2. IMGT gene and allele name
3. species
4. IMGT allele functionality
5. exon(s), region name(s), or extracted label(s)
6. start and end positions in the IMGT/LIGM-DB accession number(s)
7. number of nucleotides in the IMGT/LIGM-DB accession number(s)
8. codon start, or 'NR' (not relevant) for non coding labels
9. +n: number of nucleotides (nt) added in 5' compared to the corresponding label extracted from IMGT/LIGM-DB
10. +n or -n: number of nucleotides (nt) added or removed in 3' compared to the corresponding label extracted from IMGT/LIGM-DB
11. +n, -n, and/or nS: number of added, deleted, and/or substituted nucleotides to correct sequencing errors, or 'not corrected' if non corrected sequencing errors
12. number of amino acids (AA): this field indicates that the sequence is in amino acids
13. number of characters in the sequence: nt (or AA)+IMGT gaps=total
14. partial (if it is)
15. reverse complementary (if it is)
Disclaimer

I am not affiliated with IMGT, and this tool is only shared as a way to increase the utility of their platform. Please TCR responsibly.

Owner
Jamie Heather
Postdoc research working in cancer immunology at MGH.
Jamie Heather
Itchio Downloader Tool with python

Itchio Downloader Tool Install pip install git+https://github.com/emersont1/itchio Download All Games in library from account python -m itchio.downloa

Peter Taylor 69 Dec 05, 2022
Download all your URI Online Judge source codes and upload to GitHub with simple steps.

URI-Code-Downloader Download all your URI Online Judge source codes and upload to GitHub with simple steps. Prerequisites Python 3.x Installing Downlo

Luan Simões 9 Mar 23, 2022
Download any video from YouTube playlists

youtube-dl Download any videos from YouTube playlists. Requirements Python 3 BeautifulSoup4 PyQt PyQtWebEngine pytube pyyoutube python-decouple Usage

Antonio Fortin 1 Oct 26, 2021
📺 YouTube Song Downloader Bot For Telegram 🔮

📺 YouTube Song Downloader Bot For Telegram 🔮 Powerd By TamilBots.

Tamil Bots 146 Dec 31, 2022
DYA ( Ditch YouTube API ) is a package created to power the user with YouTube Data API functionality without any API Key

Ditch YouTubeAPI (BETA) DYA ( Ditch YouTube API ) is a package created to power the user with YouTube Data API functionality without any API Key Detai

Sougata Jana 23 Dec 22, 2022
Script that allows to download portable installers of different versions Adobe software for macOS

What is this and for what This is a script that allows you to download portable installers of programs from Adobe for macOS with different versions. T

715 Jan 06, 2023
Python module to download all media from a GoFile gallery.

GoFile Downloader Setup First of all, clone this repository : ~$ git clone https://github.com/quatrecentquatre-404/gofile-downloader Second, oh wait..

Quatrecentquatre 61 Jan 01, 2023
This Program helps you download songs from the Spotify track's link you give in.

Spotify-Downloader-GUI This Program helps you download songs from the Spotify track's link you give in. It uses yt-dlp to download songs from Youtube.

Harish 12 Jun 14, 2022
TikTok downloader video without watermark from Telegram bot

⬇️ How to download video from Tik Tok via telegram bot? Send a link to the video from tik tok to our telegram bot and it will send you a video without

1 Mar 04, 2022
Can automatically download mods from a Curseforge modpack

Curseforge-Modpack-Downloader A Python script which automatically downloads mods from a Curseforge modpack. Installing Dependencies ⚠ Make sure you ha

Rayr 1 Sep 20, 2022
this is udemy course downloader, before a start you know how to get access token.

udemy_downloader this is udemy course downloader, before a start you know how to get access token. To get the access_token on Google Chrome (once on U

OkUgur 18 Dec 04, 2022
Python module to donwload all Pixiv artworks of a user using it's user ID.

Python module to donwload all Pixiv artworks of a user using it's user ID. You need a PHPSESSID token to export NSFW.

Quatrecentquatre 1 Jan 27, 2022
Python-Youtube-Downloader - An Open Source Python Youtube Downloader

Python-Youtube-Downloader Hello There This Is An Open Source Python Youtube Down

Flex Tools 3 Jun 14, 2022
bing image downloader app used to download bulk images for a specific search term created using streamlit and bing_image_downloader python packages

bing image downloader app bing image downloader app is used to download bulk images for a specific search term. bing image downloader app gets the sea

Siva Prakash 8 Apr 05, 2022
Heroic-gogdl - GOG Downloading module for Heroic Games Launcher

heroic-gogdl GOG download module for Heroic Games Launcher Purpose This will tak

Paweł Lidwin 36 Dec 23, 2022
A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube Videos.

A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube Videos.

pytube 7.9k Jan 02, 2023
⚙️ A CLI tool that can download songs from youtube.

⚙️ Music Downloader Music Downloader is a tool that can download songs from Youtube. Installation Base requirements: Python 3.7+ If you have Python 3.

matjs 4 Nov 03, 2021
Download all posts and comments in a subreddit

subreddit downloader This subreddit downloader downloads all posts and comments in a subreddit For a tutorial to use this program please follow this m

Guneet 6 Dec 16, 2022
A bot to download songs from YouTube to telegram.

Song-Downloader-Bot A BOT TO DOWNLOAD SONGS FROM YOUTUBE. Mandatory variables API_ID - Get It From my.telegram.org API_HASH - Get It From my.telegram.

Ashik Muhammed 38 Dec 11, 2022