Python function to extract all the rows from a SQLite database file while iterating over its bytes, such as while downloading it

Related tags

Databasestream-sqlite
Overview

stream-sqlite CircleCI Test Coverage

Python function to extract all the rows from a SQLite database file concurrently with iterating over its bytes, without needing random access to the file.

Note that the SQLite file format is not designed to be streamed; the data is arranged in pages of a fixed number of bytes, and the information to identify a page often comes after the page in the stream (sometimes a great deal after). Therefore, pages are buffered in memory until they can be identified.

Installation

pip install stream-sqlite

Usage

from stream_sqlite import stream_sqlite
import httpx

# Iterable that yields the bytes of a sqlite file
def sqlite_bytes():
    with httpx.stream('GET', 'http://www.parlgov.org/static/stable/2020/parlgov-stable.db') as r:
        yield from r.iter_bytes(chunk_size=65_536)

# If there is a single table in the file, there will be exactly one iteration of the outer loop.
# If there are multiple tables, each can appear multiple times.
for table_name, pragma_table_info, rows in stream_sqlite(sqlite_bytes(), max_buffer_size=1_048_576):
    for row in rows:
        print(row)

Recommendations

If you have control over the SQLite file, VACUUM; should be run on it before streaming. In addition to minimising the size of the file, VACUUM; arranges the pages in a way that often reduces the buffering required when streaming. This is especially true if it was the target of intermingled INSERTs and/or DELETEs over multiple tables.

Also, indexes are not used for extracting the rows while streaming. If streaming is the only use case of the SQLite file, and you have control over it, indexes should be removed, and VACUUM; then run.

Some tests suggest that if the file is written in autovacuum mode, i.e. PRAGMA auto_vacuum = FULL;, then the pages are arranged in a way that reduces the buffering required when streaming. Your mileage may vary.

Owner
Department for International Trade
Department for International Trade
Simpledb-py: Simple JSON database

Simpledb-py: Simple JSON database

тейлс 2 Feb 09, 2022
Migrate data from SQL to NoSQL easily

Migrate data from SQL to NoSQL easily Installation 💯 pip install sql2nosql --upgrade Dependencies 📢 For the package to work, it first needs "clients

Facundo Padilla 43 Mar 26, 2022
This project is related to a No-SQL database, whose data are referred to autoctone botanic species

This project is related to a No-SQL database, whose data are referred to autoctone botanic species. The final goal is creating a function that performs the estimation of the ornamental value, given t

Amatofrancesco99 2 Mar 08, 2022
Python object-oriented database

ZODB, a Python object-oriented database ZODB provides an object-oriented database for Python that provides a high-degree of transparency. ZODB runs on

Zope 574 Dec 31, 2022
Metrics-advisor - Analyze reshaped metrics from TiDB cluster Prometheus and give some advice about anomalies and correlation.

metrics-advisor Analyze reshaped metrics from TiDB cluster Prometheus and give some advice about anomalies and correlation. Team freedeaths mashenjun

3 Jan 07, 2022
Лабораторные работы по Postgresql за 5 семестр

Практикум по Postgresql ERD для заданий 2.x: ERD для заданий 3.x: Их делал вот тут Ниже есть 2 инструкции — по установке postgresql на manjaro и по пе

Danila 10 Oct 31, 2022
Monty, Mongo tinified. MongoDB implemented in Python !

Monty, Mongo tinified. MongoDB implemented in Python ! Was inspired by TinyDB and it's extension TinyMongo

David Lai 523 Jan 02, 2023
A super easy, but really really bad DBMS

Dumb DB Are you looking for a reliable database management system? Then you've come to the wrong place. This is a very small database management syste

Elias Amha 5 Dec 28, 2022
Enfilade: Tool to Detect Infections in MongoDB Instances

Enfilade: Tool to Detect Infections in MongoDB Instances

Aditya K Sood 7 Feb 21, 2022
ClutterDB - Extremely simple JSON database made for infrequent changes which behaves like a dict

extremely simple JSON database made for infrequent changes which behaves like a dict this was made for ClutterBot

Clutter Development 1 Jan 12, 2022
securedb is a fast and lightweight Python framework to easily interact with JSON-based encrypted databases.

securedb securedb is a Python framework that lets you work with encrypted JSON databases. Features: newkey() to generate an encryption key write(key,

Filippo Romani 2 Nov 23, 2022
A Python wrapper API for operating and working with the Neo4j Graph Data Science (GDS) library

gdsclient This repo hosts the sources for gdsclient, a Python wrapper API for operating and working with the Neo4j Graph Data Science (GDS) library. g

Neo Technology 101 Jan 05, 2023
A NoSQL database made in python.

CookieDB A NoSQL database made in python.

cookie 1 Nov 30, 2022
MyReplitDB - the most simplistic and easiest wrapper to use for replit's database system.

MyReplitDB is the most simplistic and easiest wrapper to use for replit's database system. Installing You can install it from the PyPI Or y

kayle 4 Jul 03, 2022
Mongita is to MongoDB as SQLite is to SQL

Mongita is a lightweight embedded document database that implements a commonly-used subset of the MongoDB/PyMongo interface. Mongita differs from MongoDB in that instead of being a server, Mongita is

Scott Rogowski 809 Jan 07, 2023
Given a metadata file with relevant schema, an SQL Engine can be run for a subset of SQL queries.

Mini-SQL-Engine Given a metadata file with relevant schema, an SQL Engine can be run for a subset of SQL queries. The query engine supports Project, A

Prashant Raj 1 Dec 03, 2021
A Modular MWDB Utility to Collect Fresh Malware Samples

MWDB Feeds A Modular MWDB Utility to Collect Fresh Malware Samples This project is FREE as in FREE 🍺 , use it commercially, privately or however you

c3rb3ru5 32 Jul 07, 2022
Postgres full text search options (tsearch, trigram) examples

postgres-full-text-search Postgres full text search options (tsearch, trigram) examples. Create DB CREATE DATABASE ftdb; To feed db with an example

Jarosław Orzeł 97 Dec 30, 2022
Oh-My-PickleDB is an open source key-value store using Python's json module.

OH-MY-PICKLEDB oh-my-pickleDB is a lightweight, fast, and intuitive data manager written in python 📝 Table of Contents About Getting Started Deployme

Adrián Toral 6 Feb 20, 2022
LightDB is a lightweight JSON Database for Python

LightDB What is this? LightDB is a lightweight JSON Database for Python that allows you to quickly and easily write data to a file Installing pip3 ins

Stanislaw 14 Oct 01, 2022