This repository contains a set of benchmarks of different implementations of Parquet (storage format) <-> Arrow (in-memory format).

Overview

Parquet benchmarks

This repository contains a set of benchmarks of different implementations of Parquet (storage format) <-> Arrow (in-memory format).

The results on Azure's Standard D4s v3 (4 vcpus, 16 GiB memory) are available here.

Read uncompressed

read uncompressed i64

read uncompressed bool

read uncompressed utf8

(Note: neither pyarrow nor arrow validate utf8, which can result in undefined behavior.)

read uncompressed dict utf8

(Note: neither pyarrow nor arrow validate utf8, which can result in undefined behavior.)

Read compressed (snappy)

read compressed i64

read compressed bool

read compressed utf8

(Note: neither pyarrow nor arrow validate utf8, which can result in undefined behavior.)

Write uncompressed

write uncompressed i64

write uncompressed bool

write uncompressed utf8

Write compressed (snappy)

write compressed i64

write compressed bool

write compressed utf8

(Note: neither pyarrow nor arrow validate utf8, which can result in undefined behavior.)

Run benchmarks

To reproduce, use

python3 -m venv venv
venv/bin/pip install -U pip
venv/bin/pip install pyarrow

# create files
venv/bin/python write_parquet.py

# run benchmarks
venv/bin/python run.py

# print results to stdout as csv
venv/bin/python summarize.py

Details

The benchmark reads a single column from a file pre-loaded into memory, decompresses and deserializes the column to an arrow array.

The benchmark includes different configurations:

  • dictionary-encoded vs plain encoding
  • single page vs multiple pages
  • compressed vs uncompressed
  • different types:
    • i64
    • bool
    • utf8
Python Rest Testing

pyresttest Table of Contents What Is It? Status Installation Sample Test Examples Installation How Do I Use It? Running A Simple Test Using JSON Valid

Sam Van Oort 1.1k Dec 28, 2022
Voip Open Linear Testing Suite

VOLTS Voip Open Linear Tester Suite Functional tests for VoIP systems based on voip_patrol and docker 10'000 ft. view System is designed to run simple

Igor Olhovskiy 17 Dec 30, 2022
Testing Calculations in Python, using OOP (Object-Oriented Programming)

Testing Calculations in Python, using OOP (Object-Oriented Programming) Create environment with venv python3 -m venv venv Activate environment . venv

William Koller 1 Nov 11, 2021
Scalable user load testing tool written in Python

Locust Locust is an easy to use, scriptable and scalable performance testing tool. You define the behaviour of your users in regular Python code, inst

Locust.io 20.4k Jan 04, 2023
Test for generating stylized circuit traces from images

I test of an image processing idea to take an image and make neat circuit board art automatically. Inspired by this twitter post by @JackRhysider

Miller Hooks 3 Dec 12, 2022
Automating the process of sorting files in my downloads folder by file type.

downloads-folder-automation Automating the process of sorting files in a user's downloads folder on Windows by file type. This script iterates through

Eric Mahasi 27 Jan 07, 2023
Compiles python selenium script to be a Window's executable

Problem Statement Setting up a Python project can be frustrating for non-developers. From downloading the right version of python, setting up virtual

Jerry Ng 8 Jan 09, 2023
Language-agnostic HTTP API Testing Tool

Dredd — HTTP API Testing Framework Dredd is a language-agnostic command-line tool for validating API description document against backend implementati

Apiary 4k Jan 05, 2023
Pytest support for asyncio.

pytest-asyncio: pytest support for asyncio pytest-asyncio is an Apache2 licensed library, written in Python, for testing asyncio code with pytest. asy

pytest-dev 1.1k Jan 02, 2023
Python tools for penetration testing

pyTools_PT python tools for penetration testing Please don't use these tool for illegal purposes. These tools is meant for penetration testing for leg

Gourab 1 Dec 01, 2021
pytest plugin to test mypy static type analysis

pytest-mypy-testing — Plugin to test mypy output with pytest pytest-mypy-testing provides a pytest plugin to test that mypy produces a given output. A

David Fritzsche 21 Dec 21, 2022
PacketPy is an open-source solution for stress testing network devices using different testing methods

PacketPy About PacketPy is an open-source solution for stress testing network devices using different testing methods. Currently, there are only two c

4 Sep 22, 2022
A mocking library for requests

httmock A mocking library for requests for Python 2.7 and 3.4+. Installation pip install httmock Or, if you are a Gentoo user: emerge dev-python/httm

Patryk Zawadzki 452 Dec 28, 2022
A Simple Unit Test Matcher Library for Python 3

pychoir - Python Test Matchers for humans Super duper low cognitive overhead matching for Python developers reading or writing tests. Implemented in p

Antti Kajander 15 Sep 14, 2022
This file will contain a series of Python functions that use the Selenium library to search for elements in a web page while logging everything into a file

element_search with Selenium (Now With docstrings 😎 ) Just to mention, I'm a beginner to all this, so it it's very possible to make some mistakes The

2 Aug 12, 2021
WEB PENETRATION TESTING TOOL 💥

N-WEB ADVANCE WEB PENETRATION TESTING TOOL Features 🎭 Admin Panel Finder Admin Scanner Dork Generator Advance Dork Finder Extract Links No Redirect H

56 Dec 23, 2022
pytest plugin providing a function to check if pytest is running.

pytest-is-running pytest plugin providing a function to check if pytest is running. Installation Install with: python -m pip install pytest-is-running

Adam Johnson 21 Nov 01, 2022
Screenplay pattern base for Python automated UI test suites.

ScreenPy TITLE CARD: "ScreenPy" TITLE DISAPPEARS.

Perry Goy 39 Nov 15, 2022
catsim - Computerized Adaptive Testing Simulator

catsim - Computerized Adaptive Testing Simulator Quick start catsim is a computerized adaptive testing simulator written in Python 3.4 (with modificat

Nguyễn Văn Anh Tuấn 1 Nov 29, 2021
Selenium-python but lighter: Helium is the best Python library for web automation.

Selenium-python but lighter: Helium Selenium-python is great for web automation. Helium makes it easier to use. For example: Under the hood, Helium fo

Michael Herrmann 3.2k Dec 31, 2022