A different spin on dataclasses.

Overview

dataklasses

Dataklasses is a library that allows you to quickly define data classes using Python type hints. Here's an example of how you use it:

from dataklasses import dataklass

@dataklass
class Coordinates:
    x: int
    y: int

The resulting class works in a well civilised way, providing the usual __init__(), __repr__(), and __eq__() methods that you'd normally have to type out by hand:

>>> a = Coordinates(2, 3)
>>> a
Coordinates(2, 3)
>>> a.x
2
>>> a.y
3
>>> b = Coordinates(2, 3)
>>> a == b
True
>>>

It's easy! Almost too easy.

Wait, doesn't this already exist?

No, it doesn't. Yes, certain naysayers will be quick to point out the existence of @dataclass from the standard library. Ok, sure, THAT exists. However, it's slow and complicated. Dataklasses are neither of those things. The entire dataklasses module is less than 100 lines. The resulting classes import 15-20 times faster than dataclasses. See the perf.py file for a benchmark.

Theory of Operation

While out walking with his puppy, Dave had a certain insight about the nature of Python byte-code. Coming back to the house, he had to try it out:

>>> def __init1__(self, x, y):
...     self.x = x
...     self.y = y
...
>>> def __init2__(self, foo, bar):
...     self.foo = foo
...     self.bar = bar
...
>>> __init1__.__code__.co_code == __init2__.__code__.co_code
True
>>>

How intriguing! The underlying byte-code is exactly the same even though the functions are using different argument and attribute names. Aha! Now, we're onto something interesting.

The dataclasses module in the standard library works by collecting type hints, generating code strings, and executing them using the exec() function. This happens for every single class definition where it's used. If it sounds slow, that's because it is. In fact, it defeats any benefit of module caching in Python's import system.

Dataklasses are different. They start out in the same manner--code is first generated by collecting type hints and using exec(). However, the underlying byte-code is cached and reused in subsequent class definitions whenever possible.

A Short Story

Once upon a time, there was this programming language that I'll refer to as "Lava." Anyways, anytime you started a program written in Lava, you could just tell by the awkward silence and inactivity of your machine before the fans kicked in. "Ah shit, this is written in Lava" you'd exclaim.

Questions and Answers

Q: What methods does dataklass generate?

A: By default __init__(), __repr__(), and __eq__() methods are generated. __match_args__ is also defined to assist with pattern matching.

Q: Does dataklass enforce the specified types?

A: No. The types are merely clues about what the value might be and the Python language does not provide any enforcement on its own.

Q: Are there any additional features?

A: No. You can either have features or you can have performance. Pick one.

Q: Does dataklass use any advanced magic such as metaclasses?

A: No.

Q: How do I install dataklasses?

A: There is no setup.py file, installer, or an official release. You install it by copying the code into your own project. dataklasses.py is small. You are encouraged to modify it to your own purposes.

Q: But what if new features get added?

A: What new features? The best new features are no new features.

Q: Who maintains dataklasses?

A: If you're using it, you do. You maintain dataklasses.

Q: Who wrote this?

A: dataklasses is the work of David Beazley. http://www.dabeaz.com.

Owner
David Beazley
Author of the Python Essential Reference (Addison-Wesley), Python Cookbook (O'Reilly), and former computer science professor. Come take a class!
David Beazley
RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into tables through jointly extracting intervention, outcome and outcome measure entities and their relations.

Randomised controlled trial abstract result tabulator RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into

2 Sep 16, 2022
Progressive Image Deraining Networks: A Better and Simpler Baseline

Progressive Image Deraining Networks: A Better and Simpler Baseline [arxiv] [pdf] [supp] Introduction This paper provides a better and simpler baselin

190 Dec 01, 2022
Code for the ICCV 2021 paper "Pixel Difference Networks for Efficient Edge Detection" (Oral).

Microsoft365_devicePhish Abusing Microsoft 365 OAuth Authorization Flow for Phishing Attack This is a simple proof-of-concept script that allows an at

Alex 236 Dec 21, 2022
CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation (CVPR 2021, oral presentation) CoCosNet v2: Full-Resolution Correspondence

Microsoft 308 Dec 07, 2022
My course projects for the 2021 Spring Machine Learning course at the National Taiwan University (NTU)

ML2021Spring There are my projects for the 2021 Spring Machine Learning course at the National Taiwan University (NTU) Course Web : https://speech.ee.

Ding-Li Chen 15 Aug 29, 2022
A Fast Sequence Transducer Implementation with PyTorch Bindings

transducer A Fast Sequence Transducer Implementation with PyTorch Bindings. The corresponding publication is Sequence Transduction with Recurrent Neur

Awni Hannun 184 Dec 18, 2022
Final project for Intro to CS class.

Financial Analysis Web App https://share.streamlit.io/mayurk1/fin-web-app-final-project/webApp.py 1. Project Description This project is a technical a

Mayur Khanna 1 Dec 10, 2021
【Arxiv】Exploring Separable Attention for Multi-Contrast MR Image Super-Resolution

SANet Exploring Separable Attention for Multi-Contrast MR Image Super-Resolution Dependencies numpy==1.18.5 scikit_image==0.16.2 torchvision==0.8.1 to

36 Jan 05, 2023
PyTorch code for our ECCV 2020 paper "Single Image Super-Resolution via a Holistic Attention Network"

HAN PyTorch code for our ECCV 2020 paper "Single Image Super-Resolution via a Holistic Attention Network" This repository is for HAN introduced in the

五维空间 140 Nov 23, 2022
Code for SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations

The Second Situated Interactive MultiModal Conversations (SIMMC 2.0) Challenge 2021 Welcome to the Second Situated Interactive Multimodal Conversation

Facebook Research 81 Nov 22, 2022
Python Library for learning (Structure and Parameter) and inference (Statistical and Causal) in Bayesian Networks.

pgmpy pgmpy is a python library for working with Probabilistic Graphical Models. Documentation and list of algorithms supported is at our official sit

pgmpy 2.2k Jan 03, 2023
CaFM-pytorch ICCV ACCEPT Introduction of dataset VSD4K

CaFM-pytorch ICCV ACCEPT Introduction of dataset VSD4K Our dataset VSD4K includes 6 popular categories: game, sport, dance, vlog, interview and city.

96 Jul 05, 2022
Code for "FPS-Net: A convolutional fusion network for large-scale LiDAR point cloud segmentation".

FPS-Net Code for "FPS-Net: A convolutional fusion network for large-scale LiDAR point cloud segmentation", accepted by ISPRS journal of Photogrammetry

15 Nov 30, 2022
Unofficial PyTorch implementation of TokenLearner by Google AI

tokenlearner-pytorch Unofficial PyTorch implementation of TokenLearner by Ryoo et al. from Google AI (abs, pdf) Installation You can install TokenLear

Rishabh Anand 46 Dec 20, 2022
Python Environment for Bayesian Learning

Pebl is a python library and command line application for learning the structure of a Bayesian network given prior knowledge and observations. Pebl in

Abhik Shah 103 Jul 14, 2022
A python program to hack instagram

hackinsta a program to hack instagram Yokoback_(instahack) is the file to open, you need libraries write on import. You run that file in the same fold

2 Jan 22, 2022
Powerful unsupervised domain adaptation method for dense retrieval.

Powerful unsupervised domain adaptation method for dense retrieval

Ubiquitous Knowledge Processing Lab 191 Dec 28, 2022
This is the official implementation for "Do Transformers Really Perform Bad for Graph Representation?".

Graphormer By Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng*, Guolin Ke, Di He*, Yanming Shen and Tie-Yan Liu. This repo is the official impl

Microsoft 1.3k Dec 29, 2022
A python software that can help blind people find things like laptops, phones, etc the same way a guide dog guides a blind person in finding his way.

GuidEye A python software that can help blind people find things like laptops, phones, etc the same way a guide dog guides a blind person in finding h

Munal Jain 0 Aug 09, 2022
PCACE: A Statistical Approach to Ranking Neurons for CNN Interpretability

PCACE: A Statistical Approach to Ranking Neurons for CNN Interpretability PCACE is a new algorithm for ranking neurons in a CNN architecture in order

4 Jan 04, 2022