Internationalized Domain Names for Python (IDNA 2008 and UTS #46)

Overview

Internationalized Domain Names in Applications (IDNA)

Support for the Internationalised Domain Names in Applications (IDNA) protocol as specified in RFC 5891. This is the latest version of the protocol and is sometimes referred to as “IDNA 2008”.

This library also provides support for Unicode Technical Standard 46, Unicode IDNA Compatibility Processing.

This acts as a suitable replacement for the “encodings.idna” module that comes with the Python standard library, but which only supports the older superseded IDNA specification (RFC 3490).

Basic functions are simply executed:

>>> import idna
>>> idna.encode('ドメイン.テスト')
b'xn--eckwd4c7c.xn--zckzah'
>>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
ドメイン.テスト

Installation

To install this library, you can use pip:

$ pip install idna

Alternatively, you can install the package using the bundled setup script:

$ python setup.py install

Usage

For typical usage, the encode and decode functions will take a domain name argument and perform a conversion to A-labels or U-labels respectively.

>>> import idna
>>> idna.encode('ドメイン.テスト')
b'xn--eckwd4c7c.xn--zckzah'
>>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
ドメイン.テスト

You may use the codec encoding and decoding methods using the idna.codec module:

>>> import idna.codec
>>> print('домен.испытание'.encode('idna'))
b'xn--d1acufc.xn--80akhbyknj4f'
>>> print(b'xn--d1acufc.xn--80akhbyknj4f'.decode('idna'))
домен.испытание

Conversions can be applied at a per-label basis using the ulabel or alabel functions if necessary:

>>> idna.alabel('测试')
b'xn--0zwm56d'

Compatibility Mapping (UTS #46)

As described in RFC 5895, the IDNA specification does not normalize input from different potential ways a user may input a domain name. This functionality, known as a “mapping”, is considered by the specification to be a local user-interface issue distinct from IDNA conversion functionality.

This library provides one such mapping, that was developed by the Unicode Consortium. Known as Unicode IDNA Compatibility Processing, it provides for both a regular mapping for typical applications, as well as a transitional mapping to help migrate from older IDNA 2003 applications.

For example, “Königsgäßchen” is not a permissible label as LATIN CAPITAL LETTER K is not allowed (nor are capital letters in general). UTS 46 will convert this into lower case prior to applying the IDNA conversion.

>>> import idna
>>> idna.encode('Königsgäßchen')
...
idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of 'Königsgäßchen' not allowed
>>> idna.encode('Königsgäßchen', uts46=True)
b'xn--knigsgchen-b4a3dun'
>>> print(idna.decode('xn--knigsgchen-b4a3dun'))
königsgäßchen

Transitional processing provides conversions to help transition from the older 2003 standard to the current standard. For example, in the original IDNA specification, the LATIN SMALL LETTER SHARP S (ß) was converted into two LATIN SMALL LETTER S (ss), whereas in the current IDNA specification this conversion is not performed.

>>> idna.encode('Königsgäßchen', uts46=True, transitional=True)
'xn--knigsgsschen-lcb0w'

Implementors should use transitional processing with caution, only in rare cases where conversion from legacy labels to current labels must be performed (i.e. IDNA implementations that pre-date 2008). For typical applications that just need to convert labels, transitional processing is unlikely to be beneficial and could produce unexpected incompatible results.

encodings.idna Compatibility

Function calls from the Python built-in encodings.idna module are mapped to their IDNA 2008 equivalents using the idna.compat module. Simply substitute the import clause in your code to refer to the new module name.

Exceptions

All errors raised during the conversion following the specification should raise an exception derived from the idna.IDNAError base class.

More specific exceptions that may be generated as idna.IDNABidiError when the error reflects an illegal combination of left-to-right and right-to-left characters in a label; idna.InvalidCodepoint when a specific codepoint is an illegal character in an IDN label (i.e. INVALID); and idna.InvalidCodepointContext when the codepoint is illegal based on its positional context (i.e. it is CONTEXTO or CONTEXTJ but the contextual requirements are not satisfied.)

Building and Diagnostics

The IDNA and UTS 46 functionality relies upon pre-calculated lookup tables for performance. These tables are derived from computing against eligibility criteria in the respective standards. These tables are computed using the command-line script tools/idna-data.

This tool will fetch relevant codepoint data from the Unicode repository and perform the required calculations to identify eligibility. There are three main modes:

  • idna-data make-libdata. Generates idnadata.py and uts46data.py, the pre-calculated lookup tables using for IDNA and UTS 46 conversions. Implementors who wish to track this library against a different Unicode version may use this tool to manually generate a different version of the idnadata.py and uts46data.py files.
  • idna-data make-table. Generate a table of the IDNA disposition (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC 5892 and the pre-computed tables published by IANA.
  • idna-data U+0061. Prints debugging output on the various properties associated with an individual Unicode codepoint (in this case, U+0061), that are used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging or analysis.

The tool accepts a number of arguments, described using idna-data -h. Most notably, the --version argument allows the specification of the version of Unicode to use in computing the table data. For example, idna-data --version 9.0.0 make-libdata will generate library data against Unicode 9.0.0.

Additional Notes

  • Packages. The latest tagged release version is published in the Python Package Index.
  • Version support. This library supports Python 3.5 and higher. As this library serves as a low-level toolkit for a variety of applications, many of which strive for broad compatibility with older Python versions, there is no rush to remove older intepreter support. Removing support for older versions should be well justified in that the maintenance burden has become too high.
  • Python 2. Python 2 is supported by version 2.x of this library. While active development of the version 2.x series has ended, notable issues being corrected may be backported to 2.x. Use "idna<3" in your requirements file if you need this library for a Python 2 application.
  • Testing. The library has a test suite based on each rule of the IDNA specification, as well as tests that are provided as part of the Unicode Technical Standard 46, Unicode IDNA Compatibility Processing.
  • Emoji. It is an occasional request to support emoji domains in this library. Encoding of symbols like emoji is expressly prohibited by the technical standard IDNA 2008 and emoji domains are broadly phased out across the domain industry due to associated security risks. For now, applications that wish need to support these non-compliant labels may wish to consider trying the encode/decode operation in this library first, and then falling back to using encodings.idna. See the Github project for more discussion.
Owner
Kim Davies
Kim Davies
Steal Files on a Windows Machine

File-Stealer Steal Files on a Windows Machine About This Script will steal certain Files on a Windows Machine and sends them to a FTP Server. Preview

Marcel 5 Nov 17, 2022
'Our Drowsinessdetector detects drivers eyes if they are closed for more than 2 seconds and alerts driver'

Data analysis Document here the project: DriverDrowsinessDetector Description: Project Description Data Source: Type of analysis: Please document the

3 Jul 03, 2022
Hikvision 流媒体管理服务器敏感信息泄漏

Hikvisioninformation Hikvision 流媒体管理服务器敏感信息泄漏 Options optional arguments: -h, --help show this help message and exit -u url, --url url

Henry4E36 13 Nov 09, 2022
Script to calculate Active Directory Kerberos keys (AES256 and AES128) for an account, using its plaintext password

Script to calculate Active Directory Kerberos keys (AES256 and AES128) for an account, using its plaintext password

Matt Creel 27 Dec 20, 2022
Client script for the fisherman phishing tool

Client script for the fisherman phishing tool

Pushkar Raj 1 Feb 23, 2022
Web Scraping com Python - Raspando Vagas para Programadores

Web Scraping com Python - Raspando Vagas para Programadores Sobre o Projeto Web

Kayo Libarino 3 Dec 30, 2021
Everything I needed to understand what was going on with "Spring4Shell" - translated source materials, exploit, links to demo apps, and more.

springcore-0day-en These are all my notes from the alleged confirmed! 0day dropped on 2022-03-29. This vulnerability is commonly referred to as "Sprin

Chris Partridge 105 Nov 26, 2022
DNS hijacking via dead records automation tool

DeadDNS Multi-threaded DNS hijacking via dead records automation tool How it works 1) Dig provided subdomains file for dead DNS records. 2) Dig the fo

45 Dec 20, 2022
CVE-2022-22965 : about spring core rce

CVE-2022-22965: Spring-Core-Rce EXP 特性: 漏洞探测(不写入 webshell,简单字符串输出) 自定义写入 webshell 文件名称及路径 不会追加写入到同一文件中,每次检测写入到不同名称 webshell 文件 支持写入 冰蝎 webshell 代理支持,可

东方有鱼名为咸 53 Nov 09, 2022
NEW FACEBOOK CLONER WITH NEW PASSWORD, TERMUX FB CLONE, FB CLONING COMMAND. M

NEW FACEBOOK CLONER WITH NEW PASSWORD, TERMUX FB CLONE, FB CLONING COMMAND. M

Mr. Error 81 Jan 08, 2023
A Telegram Bot to force users to join a specific channel before sending messages in a group.

Promoter A Telegram Bot to force users to join a specific channel before sending messages in a group. Introduction A Telegram Bot to force users to jo

Mr. Dynamic 1 Jan 27, 2022
🐝 ℹ️ Honeybee extension for export to IES-VE gem file format

honeybee-ies Honeybee extension for export a HBJSON file to IES-VE GEM file format Installation pip install honeybee-ies QuickStart import pathlib fro

Ladybug Tools 4 Jul 12, 2022
Virus-Builder - This tool will generate a virus that can only destroy Windows computer

Virus-Builder - This tool will generate a virus that can only destroy Windows computer. You can also configure to auto run in usb drive

Saad 16 Dec 30, 2022
Log4j-Scanner with Bind-Receipt and custom hostnames

Hrafna - Log4j-Scanner for the masses Features Scanning-system designed to check your own infra for vulnerable log4j-installations start and stop scan

18 Jan 23, 2022
A simple python code for hacking profile views

This code for hacking profile views. Not recommended to adding profile views in profile. This code is not illegal code. This code is for beginners.

Fayas Noushad 3 Nov 28, 2021
Tinyman exploit finder - Tinyman exploit finder for python

tinyman_exploit_finder There was a big tinyman exploit. You can read about it he

fish.exe 9 Dec 27, 2022
A fast sub domain brute tool for pentesters

subDomainsBrute 1.4 A fast sub domain brute tool for pentesters. It works with P

Oliver 2 Oct 18, 2022
Yuyu Scanner is a Web Reconnaissance & Web Analysis Scanner to find assets and information about targets.

Yuyu Scanner Yuyu Scanner is a Web Reconnaissance & Web Analysis Scanner to find assets and information about targets. installation ! run as root

Justakazh 20 Nov 24, 2022
AutoScan 有多个目标时,调用xray+rad进行自动扫描

Usage: 在高级版Xray和rad同目录下运行 python3 X-AutoXray.py xxxx.txt 写的蛮人性化的哦,os,linux,windows通用 生成的xray报告会在当前目录的/result下面 Ctrl+c 打断脚本运行时还可以结算扫描进度,生成已扫描和未扫描的进度文件,

斯文 73 Jan 01, 2023
Signatures and IoCs from public Volexity blog posts.

threat-intel This repository contains IoCs related to Volexity public threat intelligence blog posts. They are organised by year, and within each year

Volexity 130 Dec 29, 2022