Python function to extract all the rows from a SQLite database file while iterating over its bytes, such as while downloading it

Related tags

Databasestream-sqlite
Overview

stream-sqlite CircleCI Test Coverage

Python function to extract all the rows from a SQLite database file concurrently with iterating over its bytes, without needing random access to the file.

Note that the SQLite file format is not designed to be streamed; the data is arranged in pages of a fixed number of bytes, and the information to identify a page often comes after the page in the stream (sometimes a great deal after). Therefore, pages are buffered in memory until they can be identified.

Installation

pip install stream-sqlite

Usage

from stream_sqlite import stream_sqlite
import httpx

# Iterable that yields the bytes of a sqlite file
def sqlite_bytes():
    with httpx.stream('GET', 'http://www.parlgov.org/static/stable/2020/parlgov-stable.db') as r:
        yield from r.iter_bytes(chunk_size=65_536)

# If there is a single table in the file, there will be exactly one iteration of the outer loop.
# If there are multiple tables, each can appear multiple times.
for table_name, pragma_table_info, rows in stream_sqlite(sqlite_bytes(), max_buffer_size=1_048_576):
    for row in rows:
        print(row)

Recommendations

If you have control over the SQLite file, VACUUM; should be run on it before streaming. In addition to minimising the size of the file, VACUUM; arranges the pages in a way that often reduces the buffering required when streaming. This is especially true if it was the target of intermingled INSERTs and/or DELETEs over multiple tables.

Also, indexes are not used for extracting the rows while streaming. If streaming is the only use case of the SQLite file, and you have control over it, indexes should be removed, and VACUUM; then run.

Some tests suggest that if the file is written in autovacuum mode, i.e. PRAGMA auto_vacuum = FULL;, then the pages are arranged in a way that reduces the buffering required when streaming. Your mileage may vary.

Owner
Department for International Trade
Department for International Trade
EmployeeDB - Advanced Redis search functionalities on Python applied on an Employee management backend app

EmployeeDB - Advanced Redis search functionalities on Python applied on an Employee management backend app

Ahmad Bazzi 58 Oct 10, 2022
A very simple document database

DockieDb A simple in-memory document database. Installation Build the Wheel Fork or clone this repository and run python setup.py bdist_wheel in the r

1 Jan 16, 2022
pickleDB is an open source key-value store using Python's json module.

pickleDB pickleDB is lightweight, fast, and simple database based on the json module. And it's BSD licensed! pickleDB is Fun import pickledb

Harrison Erd 738 Jan 04, 2023
Mongita is to MongoDB as SQLite is to SQL

Mongita is a lightweight embedded document database that implements a commonly-used subset of the MongoDB/PyMongo interface. Mongita differs from MongoDB in that instead of being a server, Mongita is

Scott Rogowski 809 Jan 07, 2023
This project is related to a No-SQL database, whose data are referred to autoctone botanic species

This project is related to a No-SQL database, whose data are referred to autoctone botanic species. The final goal is creating a function that performs the estimation of the ornamental value, given t

Amatofrancesco99 2 Mar 08, 2022
A Python wrapper API for operating and working with the Neo4j Graph Data Science (GDS) library

gdsclient This repo hosts the sources for gdsclient, a Python wrapper API for operating and working with the Neo4j Graph Data Science (GDS) library. g

Neo Technology 101 Jan 05, 2023
A Persistent Embedded Graph Database for Python

Cog - Embedded Graph Database for Python cogdb.io New release: 2.0.5! Installing Cog pip install cogdb Cog is a persistent embedded graph database im

Arun Mahendra 214 Dec 30, 2022
A simple GUI that interacts with a database to keep track of a collection of US coins.

CoinCollectorGUI A simple gui designed to interact with a database. The goal of the database is to make keeping track of collected coins simple. The G

Builder212 1 Nov 09, 2021
Postgres full text search options (tsearch, trigram) examples

postgres-full-text-search Postgres full text search options (tsearch, trigram) examples. Create DB CREATE DATABASE ftdb; To feed db with an example

Jarosław Orzeł 97 Dec 30, 2022
Python function to extract all the rows from a SQLite database file while iterating over its bytes, such as while downloading it

Python function to extract all the rows from a SQLite database file while iterating over its bytes, such as while downloading it

Department for International Trade 16 Nov 09, 2022
Given a metadata file with relevant schema, an SQL Engine can be run for a subset of SQL queries.

Mini-SQL-Engine Given a metadata file with relevant schema, an SQL Engine can be run for a subset of SQL queries. The query engine supports Project, A

Prashant Raj 1 Dec 03, 2021
PathfinderMonsterDatabase - A database of all monsters in Pathfinder 1e, created by parsing aonprd.com

PathfinderMonsterDatabase A database of all monsters in Pathfinder 1e, created by parsing aonprd.com Setup Run the following line to install all requi

Yoni Lerner 11 Jun 12, 2022
A Painless Simple Way To Create Schema and Do Database Operations Quickly In Python

PainlessDB - Taking Your Pain away to the moon 🚀 Contribute · Community · Documentation 🎫 Introduction : PainlessDB is a Python-based free and open-

Aiden Ellis 3 Jul 15, 2022
Metrics-advisor - Analyze reshaped metrics from TiDB cluster Prometheus and give some advice about anomalies and correlation.

metrics-advisor Analyze reshaped metrics from TiDB cluster Prometheus and give some advice about anomalies and correlation. Team freedeaths mashenjun

3 Jan 07, 2022
Tiny local JSON database for Python.

Pylowdb Simple to use local JSON database 🦉 # This is pure python, not specific to pylowdb ;) db.data['posts'] = ({ 'id': 1, 'title': 'pylowdb is awe

Hussein Sarea 3 Jan 26, 2022
Migrate data from SQL to NoSQL easily

Migrate data from SQL to NoSQL easily Installation 💯 pip install sql2nosql --upgrade Dependencies 📢 For the package to work, it first needs "clients

Facundo Padilla 43 Mar 26, 2022
This repo contains the backend of the KMK project

KMK Backend This repository contains the backend part of the KMK project Demo Watch it on Youtube Getting started Pre-commit hooks After you cloned th

21 Nov 26, 2022
LightDB is a lightweight JSON Database for Python

LightDB What is this? LightDB is a lightweight JSON Database for Python that allows you to quickly and easily write data to a file Installing pip3 ins

Stanislaw 14 Oct 01, 2022
Monty, Mongo tinified. MongoDB implemented in Python !

Monty, Mongo tinified. MongoDB implemented in Python ! Was inspired by TinyDB and it's extension TinyMongo

David Lai 523 Jan 02, 2023
securedb is a fast and lightweight Python framework to easily interact with JSON-based encrypted databases.

securedb securedb is a Python framework that lets you work with encrypted JSON databases. Features: newkey() to generate an encryption key write(key,

Filippo Romani 2 Nov 23, 2022