Bringing sanity to the world of messed-up data

Last update: Oct 26, 2021

Sanitize

sanitize is a Python module for making sure various things (e.g. HTML) are safe to use. It was originally written by Mark Pilgrim and is distributed under the BSD license.

Usage

>>> from sanitize import HTML
>>> HTML('<b>hello')
'<b>hello</b>'
>>> HTML('<img>')
'<img />'
>>> HTML('<b><b><b>hello')
'<b><b><b>hello</b></b></b>'
>>> HTML('<img src="foo"/')
''
>>> HTML('<input type="checkbox" checked>')
'<input type="checkbox" checked="checked" />'
>>> # dangerous tags (a small sample)
... 
>>> HTML('safe<applet code="foo.class" codebase="http://example.com/"></applet> <b>description</b>')
'safe <b>description</b>'
>>> HTML('safe<frameset rows="*"><frame src="http://example.com/"></frameset> <b>description</b>')
'safe <b>description</b>'
>>> # bad protocols (a small sample)
>>> HTML('<a href="java' + chr(1) + 'script:foo">bar</a>')
'<a href="#foo">bar</a>'
>>> HTML('<a href="vbscript:foo">bar</a>')
'<a href="#foo">bar</a>'

For more usage examples, see tests/test_sanitize_html.py.
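The Usage examples show three behaviours: unclosed tags get closed, bare attributes and void elements are normalized, and links with unsafe protocols are rewritten to a harmless fragment. The sketch below illustrates how those behaviours could be approximated with the standard library only. It is not the implementation used by sanitize: the names `sketch_sanitize`, `neutralize_url`, `VOID_TAGS`, `SAFE_SCHEMES` and `URL_ATTRS` are hypothetical, the allow-lists are assumptions, and the removal of dangerous tags such as `applet` or `frameset` is not covered.

```python
import re
from html import escape
from html.parser import HTMLParser

# Illustrative assumptions, not the lists used by the sanitize module.
VOID_TAGS = {"br", "hr", "img", "input"}
SAFE_SCHEMES = {"http", "https", "mailto", "ftp"}
URL_ATTRS = {"href", "src"}

_CONTROL_OR_SPACE = re.compile(r"[\x00-\x20\x7f]")
_SCHEME = re.compile(r"([A-Za-z][A-Za-z0-9+.\-]*):(.*)", re.S)


def neutralize_url(url):
    """Return url unchanged unless it names a scheme outside SAFE_SCHEMES."""
    # Strip control characters and whitespace first, so that a scheme
    # disguised as 'java\x01script:' is still recognized.
    match = _SCHEME.match(_CONTROL_OR_SPACE.sub("", url))
    if match is None or match.group(1).lower() in SAFE_SCHEMES:
        return url
    return "#" + match.group(2)


class _Balancer(HTMLParser):
    """Re-serialize HTML while tracking which elements are still open."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        rendered = []
        for name, value in attrs:
            if value is None:  # bare attribute such as `checked`
                value = name
            elif name in URL_ATTRS:
                value = neutralize_url(value)
            rendered.append(' %s="%s"' % (name, escape(value)))
        if tag in VOID_TAGS:
            self.parts.append("<%s%s />" % (tag, "".join(rendered)))
        else:
            self.parts.append("<%s%s>" % (tag, "".join(rendered)))
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Emit a closing tag only if it matches the innermost open element.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        self.parts.append(escape(data, quote=False))


def sketch_sanitize(html_text):
    """Close dangling tags and defuse unsafe URL schemes (sketch only)."""
    parser = _Balancer()
    parser.feed(html_text)
    parser.close()
    while parser.open_tags:  # close leftovers, innermost first
        parser.parts.append("</%s>" % parser.open_tags.pop())
    return "".join(parser.parts)
```

Calling `sketch_sanitize('<b>hello')` yields `'<b>hello</b>'`, and `sketch_sanitize('<a href="vbscript:foo">bar</a>')` yields `'<a href="#foo">bar</a>'`, mirroring the corresponding Usage examples. Malformed input such as `<img src="foo"/` is left to `html.parser` here and may not match what sanitize returns.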

Installation

python-sanitize is available on PyPI:

http://pypi.python.org/pypi/sanitize

Install it with pip:

$ pip install sanitize

Or with easy_install:

$ easy_install sanitize

Alternatively, clone python-sanitize's git repository:

$ git clone git://github.com/Alir3z4/python-sanitize.git

Then install it by running:

$ python setup.py install

Tests

To run unit tests:

$ python setup.py test

License

Sanitize is distributed under the BSD license.

Releases

Version 2014.10.7 - 2014-10-07

Feature: Add ChangeLog.rst file.

Feature: Add AUTHORS.rst file.

Feature: Add setup.cfg for wheel support.

Feature #2: Add travis-ci testing.

Feature #4: Using unittest for testing.

Feature #7: Add coveralls support.

Feature #8: Add MANIFEST.in file.

Feature #5: Better Readme and documentation.

Feature #1: Python packaging done right.

Feature #9: Change version numbering.


Owner

Alireza Savand
